Similar literature
 20 similar articles found (search time: 31 ms)
1.
2.
Ensemble-based approaches to RNA secondary structure prediction have become increasingly appreciated in recent years. Here, we utilize sampling and clustering of the Boltzmann ensemble of RNA secondary structures to investigate whether biological sequences exhibit ensemble features that are distinct from their random shuffles. Representative messenger RNAs (mRNAs), structural RNAs, and precursor microRNAs (miRNAs) are analyzed for nine ensemble features. These include structure clustering features, the energy gap between the minimum free energy (MFE) and the ensemble, the numbers of high-frequency base pairs in the ensemble and in clusters, the average base-pair distance between the MFE structure and the ensemble, and between-cluster and within-cluster sums of squares. For each of the features, we observe a lack of significant distinction between mRNAs and their random shuffles. For five features, significant differences are found between structural RNAs and random counterparts. For seven features including the five for structural RNAs, much greater differences are observed between precursor miRNAs and random shuffles. These findings reveal differences in the Boltzmann structure ensemble among different types of functional RNAs. In addition, for two ensemble features, we observe distinctive, non-overlapping distributions for precursor miRNAs and random shuffles. A distributional separation can be particularly useful for the prediction of miRNA genes.
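
One of the ensemble features named above, the average base-pair distance between the MFE structure and the sampled ensemble, can be illustrated with a minimal sketch. It assumes structures are given as pseudoknot-free dot-bracket strings and takes the base-pair distance to be the number of pairs present in exactly one of the two structures; the function names are hypothetical and are not taken from the study.

```python
def parse_pairs(dot_bracket):
    """Return the set of base pairs (i, j) encoded by a dot-bracket string."""
    stack = []
    pairs = set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)          # remember the opening position
        elif ch == ")":
            pairs.add((stack.pop(), i))  # close the most recent open position
    return pairs


def base_pair_distance(a, b):
    """Number of base pairs found in exactly one of the two structures."""
    return len(parse_pairs(a) ^ parse_pairs(b))


def mean_distance_to_ensemble(reference, sample):
    """Average base-pair distance between a reference structure and a sample."""
    return sum(base_pair_distance(reference, s) for s in sample) / len(sample)
```

Calling `mean_distance_to_ensemble(mfe_structure, sampled_structures)` with a list of sampled structures would then give a single number per sequence that could be compared against the same number computed for its shuffles.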

3.
Structure clustering features on the Sfold Web server (total citations: 2; self-citations: 0; citations by others: 2)
SUMMARY: The energy landscape of RNA secondary structures is often complex, and the Boltzmann-weighted ensemble usually contains distinct clusters. Furthermore, the minimum free energy structure often lies outside of the cluster containing the structure determined by comparative sequence analysis. We have developed procedures to characterize and visualize the Boltzmann-weighted ensemble, and have made them available on the Sfold Web server. The new features on the Web server include clustering statistics, ensemble and cluster centroids, multi-dimensional scaling display and energy landscape representation of the Boltzmann-weighted ensemble. AVAILABILITY: http://sfold.wadsworth.org; http://www.bioinfo.rpi.edu/applications/sfold CONTACT: chanc@wadsworth.org.

4.
A novel method is presented for predicting the common secondary structures and alignment of two homologous RNA sequences by sampling the ‘structural alignment’ space, i.e. the joint space of their alignments and common secondary structures. The structural alignment space is sampled according to a pseudo-Boltzmann distribution based on a pseudo-free energy change that combines base pairing probabilities from a thermodynamic model and alignment probabilities from a hidden Markov model. By virtue of the implicit comparative analysis between the two sequences, the method offers an improvement over single sequence sampling of the Boltzmann ensemble. A cluster analysis shows that the samples obtained from joint sampling of the structural alignment space cluster more closely than samples generated by the single sequence method. On average, the representative (centroid) structure and alignment of the most populated cluster in the sample of structures and alignments generated by joint sampling are more accurate than single sequence sampling and alignment based on sequence alone, respectively. The ‘best’ centroid structure that is closest to the known structure among all the centroids is, on average, more accurate than structure predictions of other methods. Additionally, cluster analysis identifies, on average, a few clusters, whose centroids can be presented as alternative candidates. The source code for the proposed method can be downloaded at http://rna.urmc.rochester.edu.

5.
Free energy minimization has been the most popular method for RNA secondary structure prediction for decades. It is based on a set of empirical free energy change parameters derived from experiments using a nearest-neighbor model. In this study, a program, MaxExpect, that predicts RNA secondary structure by maximizing the expected base-pair accuracy, is reported. This approach was pioneered in the program CONTRAfold, using pair probabilities predicted with a statistical learning method. Here, a partition function calculation that utilizes the free energy change nearest-neighbor parameters is used to predict base-pair probabilities as well as probabilities of nucleotides being single-stranded. MaxExpect predicts both the optimal structure (having highest expected pair accuracy) and suboptimal structures to serve as alternative hypotheses for the structure. Tested on a large database of different types of RNA, the maximum expected accuracy structures are, on average, of higher accuracy than minimum free energy structures. Accuracy is measured by sensitivity, the percentage of known base pairs correctly predicted, and positive predictive value (PPV), the percentage of predicted pairs that are in the known structure. By favoring double-strandedness or single-strandedness, a higher sensitivity or PPV of prediction can be favored, respectively. Using MaxExpect, the average PPV of the optimal structure is improved from 66% to 68% at the same sensitivity level (73%) compared with free energy minimization.
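
The two accuracy measures defined in this abstract can be written down directly. The sketch below is illustrative only: it assumes base pairs are represented as (i, j) index tuples and that a predicted pair counts as correct only on an exact match, which may be stricter than the scoring used in the study. The function name is hypothetical.

```python
def sensitivity_and_ppv(known_pairs, predicted_pairs):
    """Return (sensitivity, PPV) as fractions between 0 and 1.

    sensitivity = correctly predicted pairs / known pairs
    PPV         = correctly predicted pairs / predicted pairs
    """
    known = set(known_pairs)
    predicted = set(predicted_pairs)
    correct = len(known & predicted)  # pairs present in both structures
    sensitivity = correct / len(known) if known else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv
```

Predicting fewer, more confident pairs tends to raise PPV at the cost of sensitivity, which is the trade-off the abstract describes when favoring single-strandedness or double-strandedness.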

6.
We have developed a new combined approach for ab initio protein structure prediction. The protein conformation is described as a lattice chain connecting Cα atoms, with attached Cβ atoms and side-chain centers of mass. The model force field includes various short-range and long-range knowledge-based potentials derived from a statistical analysis of the regularities of protein structures. The combination of these energy terms is optimized through the maximization of correlation for 30 × 60,000 decoys between the root mean square deviation (RMSD) to native and energies, as well as the energy gap between native and the decoy ensemble. To accelerate the conformational search, a newly developed parallel hyperbolic sampling algorithm with a composite movement set is used in the Monte Carlo simulation processes. We exploit this strategy to successfully fold 41/100 small proteins (36-120 residues) with predicted structures having an RMSD from native below 6.5 Å in the top five cluster centroids. To fold larger proteins as well as to improve the folding yield of small proteins, we incorporate into the basic force field side-chain contact predictions from our threading program PROSPECTOR, where homologous proteins were excluded from the database. With these threading-based restraints, the program can fold 83/125 test proteins (36-174 residues) with structures having an RMSD to native below 6.5 Å in the top five cluster centroids. This shows the significant improvement of folding by using predicted tertiary restraints, especially when the accuracy of side-chain contact prediction is >20%. For native fold selection, we introduce quantities dependent on the cluster density and the combination of energy and free energy, which show a higher discriminative power to select the native structure than the previously used cluster energy or cluster size, and which can be used in native structure identification in blind simulations. These procedures are readily automated and are being implemented on a genomic scale.

7.
Flexible docking between a protein (lysozyme) and an inhibitor (tri-N-acetyl-D-glucosamine, tri-NAG) was carried out by an enhanced conformational sampling method, multicanonical molecular dynamics simulation. We used a flexible all-atom model to express lysozyme, tri-NAG, and water molecules surrounding the two bio-molecules. The advantages of this sampling method are as follows: the conformation of the system is widely sampled without trapping at energy minima, a thermally equilibrated conformational ensemble at an arbitrary temperature can be reconstructed from the simulation trajectory, and the thermodynamic weight can be assigned to each sampled conformation. During the simulation, exchanges between the binding and free (i.e., unbinding) states of the protein and the inhibitor were repeatedly observed. The conformational ensemble reconstructed at 300 K involved various conformational clusters. The main outcome of the current study is that the most populated conformational cluster (i.e., the cluster of the lowest free energy) was assigned to the native complex structure (i.e., the X-ray complex structure). The simulation also produced non-native complex structures, where the protein and the inhibitor bound with different modes from that of the native complex structure, as well as the unbinding structures. A free-energy barrier (i.e., activation free energy) was clearly detected between the native complex structures and the other structures. The thermal fluctuations of tri-NAG in the lowest free-energy complex correlated well with the X-ray B-factors of tri-NAG in the X-ray complex structure. The existence of the free-energy barrier ensures that the lowest free-energy structure can be discriminated naturally from the other structures. In other words, the multicanonical molecular dynamics simulation can predict the native complex structure without any empirical objective function. The current study also demonstrated that the flexible all-atom model and the physico-chemically defined atomic-level force field can reproduce the native complex structure. A drawback of the current method is that it requires time-consuming computation due to the exhaustive conformational sampling. We discuss the possibility of combining the current method with conventional docking methods.

8.
The hypothesis that RNA coaxial stacking can be predicted by free energy minimization using nearest-neighbor parameters is tested. The results show 58.2% positive predictive value (PPV) and 65.7% sensitivity for accuracy of the lowest free energy configuration compared with crystal structures. The probability of each stacking configuration can be predicted using a partition function calculation. Based on the dependence of accuracy on the calculated probability of the stacks, a probability threshold of 0.7 was chosen for predicting coaxial stacks. When scoring these likely stacks, the PPV was 66.7% at a sensitivity of 51.9%. It is observed that the coaxial stacks of helices that are not separated by unpaired nucleotides can be predicted with a significantly higher accuracy (74.0% PPV, 66.1% sensitivity) than the coaxial stacks mediated by noncanonical base pairs (55.9% PPV, 36.5% sensitivity). It is also shown that the prediction accuracy does not show any obvious trend with multibranch loop complexity as measured by three different parameters.

9.
The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size.

10.
The monomeric Alzheimer's beta amyloid peptide, Abeta, is known to adopt a disordered state in water at room temperature, and a circular dichroism (CD) spectroscopy experiment has provided the secondary-structure contents for the disordered state: 70% random, 25% beta-structural, and 5% helical. We performed an enhanced conformational sampling (multicanonical molecular dynamics simulation) of a 25-residue segment (residues 12-36) of Abeta in explicit water and obtained the conformational ensemble over a wide temperature range. The secondary-structure contents calculated from the conformational ensemble at 300 K reproduced the experimental secondary-structure contents. The constructed free-energy landscape at 300 K was not plain but rugged with five clearly distinguishable clusters, and each cluster had its own characteristic tertiary structure: a helix-structural cluster, two beta-structural clusters, and two random-structural clusters. This indicates that the contribution from the five individual clusters determines the secondary-structure contents experimentally measured. The helical cluster was similar to a stable helical structure for monomeric Abeta in 2,2,2-trifluoroethanol (TFE)/water determined by an NMR experiment: the positions of helices in the helical cluster were the same as those in the NMR structure, and the residue-residue contact patterns were also similar to those of the NMR structure. The cluster-cluster separation in the conformational space indicates that free-energy barriers separate the clusters at 300 K. The two beta-structural clusters were characterized by different strand-strand hydrogen-bond (H-bond) patterns, suggesting that the free-energy barrier between the two clusters is due to the H-bond rearrangements.

11.
The protein folding process is described by a cluster model based on the assumption that local structures or clusters are formed at an early stage in different regions of the polypeptide chain. Possible local structural elements in a globular protein are helices, bends, and hydrophobic cores whose formation is presumably determined by the interaction with the environment. Thus the tendency of local structure formation is expressed by a surface free energy of the cluster, which is assigned to the interface between the cluster and its environment. The probability of finding the chain of N residues with k clusters and m residues in the cluster is represented by a cluster distribution map. The cluster model exhibits a distinct two-state-like equilibrium transition, which can be seen on this map as well-separated native and denatured populations at the midpoint of the transition. The native population is localized at k ≈ 1 and m ≈ N, while the position of the denatured population can vary significantly depending on the surface free energy of the cluster. If the surface free energy is strong, the denatured population is localized near k = 0 and m = 0. On the other hand, if the surface free energy is weak, the denatured population is localized at high k and m values. The dynamics of the cluster model are treated as a stochastic process involving the transition from a state (k,m) to one of its six neighbors. The transition probability for each transition is determined by the free energy difference between two states; thus no activation process is assumed. However, the conversion of the two macrostates, native and denatured populations, involves the free energy activation due to the cooperative interaction of the macrosystem. The dynamics are analyzed by following the time evolution of the population profile on the cluster distribution map. Kinetic schemes are proposed to describe the multistep mechanism of protein folding and unfolding.

12.
An enhanced conformational sampling method, multicanonical molecular dynamics (McMD), was applied to the ab initio folding of the 57-residue first repeat of human glutamyl-prolyl-tRNA synthetase (EPRS-R1) in explicit solvent. The simulation started from a fully extended structure of EPRS-R1 and did not utilize prior structural knowledge. A canonical ensemble, which is a conformational ensemble thermodynamically probable at an arbitrary temperature, was constructed by reweighting the sampled structures. Conformational clusters were obtained from the canonical ensemble at 300 K, and the largest cluster (i.e., the lowest free-energy cluster), which contained 34% of the structures in the ensemble, was characterized by the highest similarity to the NMR structure relative to all alternative clusters. This lowest free-energy cluster included native-like structures composed of two anti-parallel α-helices. The canonical ensemble at 300 K also showed that a short Gly-containing segment, which adopts an α-helix in the native structure, has a tendency to be structurally disordered. Atomic-level analyses demonstrated clearly that inter-residue hydrophobic interactions drive the helix formation of the Gly-containing segment, and that increasing the hydrophobic contacts accompanies exclusion of water molecules from the vicinity of this segment. This study has shown, for the first time, that the free-energy landscape of a structurally well-ordered protein of about 60 residues is obtainable with an all-atom model in explicit water without prior structural knowledge.
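
The identification of the largest cluster with the lowest free-energy cluster follows from the standard relation between population and free energy in a canonical ensemble, ΔF_i = -k_B T ln(p_i / p_max). The sketch below illustrates that relation only; it is an assumption about how cluster populations would be converted, not the analysis used in the study, and the function name is hypothetical. Apart from the 34% figure quoted above, any populations passed in would be illustrative.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def relative_free_energies(populations, temperature):
    """Free energy of each cluster relative to the most populated one.

    populations: fractions of the canonical ensemble in each cluster (all > 0)
    temperature: temperature in kelvin
    Returns values in kcal/mol; the most populated cluster gets 0.
    """
    kT = K_B * temperature
    p_max = max(populations)
    return [-kT * math.log(p / p_max) for p in populations]
```

For example, a cluster holding half the population of the largest one would sit k_B T ln 2 above it, roughly 0.41 kcal/mol at 300 K.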

13.
14.
Single-molecule fluorescence (Förster) resonance energy transfer (FRET) experiments were performed on surface-immobilized RNase H molecules as a function of the concentration of the chemical denaturant guanidinium chloride (GdmCl). For comparison, we measured ensemble FRET on RNase H solutions. The single-molecule approach allowed us to study FRET distributions of the subpopulation of unfolded molecules without interference from the folded population. The unfolded ensemble experienced a continuous shift of the FRET efficiency distribution with increasing concentration of GdmCl, indicating a heterogeneous population of expanding, unfolded polypeptide chains. We have analyzed the behavior of the unfolded state quantitatively with a model in which the unfolded state is described by a continuum of substates, with the free energy of each substate linearly coupled to its m-value, the proportionality coefficient between free energy and denaturant activity. By fitting this model to the data, we have derived energetic and structural parameters that describe the unfolded state ensemble. Specifically, we have found that the average size of the unfolded state increases from 23 to 38 Å between 0 and 6 M denaturant. Excellent agreement was achieved between the fitted model and our FRET measurements, and with previously published nuclear magnetic resonance and small-angle X-ray scattering data.

15.

Background

Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings.

Results

Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation. This allows on-the-fly calculation of the normal distribution for any candidate sequence composition.

Conclusion

The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this particular property alone will not be sufficient to discriminate miRNAs from other sequences, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification.
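
Given a normal distribution of MFE values for randomized sequences of the same composition, an MFE-based P-value reduces to a lower-tail probability. The following is a minimal sketch under that assumption; the mean and standard deviation would come from the pre-calculated, interpolated distributions described above, and the function name and any example numbers are hypothetical.

```python
import math


def mfe_p_value(mfe, random_mean, random_std):
    """Return (z, p) for a candidate MFE against a normal null distribution.

    z is the standard score of the candidate MFE; p is the probability that
    a randomized sequence folds with an MFE at least as low as the candidate.
    """
    z = (mfe - random_mean) / random_std
    # Normal CDF expressed through the complementary error function.
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return z, p
```

A candidate whose MFE lies two standard deviations below the mean of its randomized counterparts would receive a P-value of about 0.023 under this model.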

16.
With the help of the crystal structure of rhodopsin an ab initio method has been developed to calculate the three-dimensional structure of the loops that connect the transmembrane helices (TMHs). The goal of this procedure is to calculate the loop structures in other G-protein coupled receptors (GPCRs) for which only model coordinates of the TMHs are available. To mimic this situation a construct of rhodopsin was used that only includes the experimental coordinates of the TMHs while the rest of the structure, including the terminal domains, has been removed. To calculate the structure of the loops a method was designed based on Monte Carlo (MC) simulations which use a temperature annealing protocol, and a scaled collective variables (SCV) technique with proper structural constraints. Because only part of the protein is used in the calculations the usual approach of modeling loops, which consists of finding a single, lowest energy conformation of the system, is abandoned because such a single structure may not be a representative member of the native ensemble. Instead, the method was designed to generate structural ensembles from which the single lowest free energy ensemble is identified as representative of the native folding of the loop. To find the native ensemble a successive series of SCV-MC simulations are carried out to allow the loops to undergo structural changes in a controlled manner. To increase the chances of finding the native funnel for the loop, some of the SCV-MC simulations are carried out at elevated temperatures. The native ensemble can be identified by an MC search starting from any conformation already in the native funnel. The hypothesis is that native structures are trapped in the conformational space because of the high-energy barriers that surround the native funnel. 
The existence of such ensembles is demonstrated by generating multiple copies of the loops from their crystal structures in rhodopsin and carrying out an extended SCV-MC search. For the extracellular loops e1 and e3, and the intracellular loop i1 that were used in this work, the procedure resulted in dense clusters of structures with a Cα-RMSD of approximately 0.5 Å. To test the predictive power of the method, the crystal structure of each loop was replaced by its extended conformations. For e1 and i1 the procedure identifies native clusters with a Cα-RMSD of approximately 0.5 Å and good structural overlap of the side chains; for e3, two clusters were found with a Cα-RMSD of approximately 1.1 Å each, but with poor overlap of the side chains. Further searching led to a single cluster with lower Cα-RMSD but higher energy than the two previous clusters. This discrepancy was found to be due to the missing elements in the constructs available from experiment for use in the calculations. Because this problem will likely appear whenever parts of the structural information are missing, possible solutions are discussed.

17.
Recent advances in RNA structure determination include using data from high-throughput probing experiments to improve thermodynamic prediction accuracy. We evaluate the extent and nature of improvements in data-directed predictions for a diverse set of 16S/18S ribosomal sequences using a stochastic model of experimental SHAPE data. The average accuracy for 1000 data-directed predictions always improves over the original minimum free energy (MFE) structure. However, the amount of improvement varies with the sequence, exhibiting a correlation with MFE accuracy. Further analysis of this correlation shows that accurate MFE base pairs are typically preserved in a data-directed prediction, whereas inaccurate ones are not. Thus, the positive predictive value of common base pairs is consistently higher than the directed prediction accuracy. Finally, we confirm sequence dependencies in the directability of thermodynamic predictions and investigate the potential for greater accuracy improvements in the worst performing test sequence.

18.
We evaluate the grand potential of a cluster of two molecular species, equivalent to its free energy of formation from a binary vapour phase, using a non-equilibrium molecular dynamics technique where guide particles, each tethered to a molecule by a harmonic force, move apart to disassemble a cluster into its components. The mechanical work performed in an ensemble of trajectories is analysed using the Jarzynski equality to obtain a free energy of disassembly, a contribution to the cluster grand potential. We study clusters of sulphuric acid and water at 300 K, using a classical interaction scheme, and contrast two modes of guided disassembly. In one, the cluster is broken apart through simple pulling by the guide particles, but we find the trajectories tend to be mechanically irreversible. In the second approach, the guide motion and strength of tethering are modified in a way that prises the cluster apart, a procedure that seems more reversible. We construct a surface representing the cluster grand potential, and identify a critical cluster for droplet nucleation under given vapour conditions. We compare the equilibrium populations of clusters with calculations reported by Henschel et al. [J. Phys. Chem. A 2014;118:2599] based on optimised quantum chemical structures.
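
The Jarzynski equality mentioned above relates the work W done along non-equilibrium trajectories to an equilibrium free energy difference: exp(-ΔF / kT) = ⟨exp(-W / kT)⟩. The sketch below shows only this estimator, assuming a plain list of work values in the same units as kT; it is not the analysis pipeline of the study, and the function name is hypothetical.

```python
import math


def jarzynski_free_energy(work_values, kT):
    """Estimate the free energy difference from non-equilibrium work values.

    Implements dF = -kT * ln( mean( exp(-W / kT) ) ), using a log-sum-exp
    shift so that large |W / kT| does not overflow or underflow.
    """
    exponents = [-w / kT for w in work_values]
    shift = max(exponents)
    mean_exp = sum(math.exp(x - shift) for x in exponents) / len(exponents)
    return -kT * (shift + math.log(mean_exp))
```

Because the exponential average is dominated by rare low-work trajectories, the estimate is never larger than the mean work, and the gap between the two is one indication of how irreversible the pulling protocol is.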

19.
Density functional theory (DFT) calculations are carried out to study the mechanistic details and the ensemble effect of methanol dehydrogenation over Pt₃ and PtAu₂ clusters, which represent the smallest models of pure Pt clusters and bimetallic PtAu clusters. The energy diagrams are drawn along both the initial C-H and O-H bond scission pathways via four sequential dehydrogenation processes, i.e., CH₃OH → CH₂OH → CH₂O → CHO → CO and CH₃OH → CH₃O → CH₂O → CHO → CO, respectively. It is revealed that the reaction kinetics over PtAu₂ is significantly different from that over Pt₃. For the Pt₃-mediated reaction, the C-H bond scission pathway, where an ensemble composed of two Pt atoms is required to complete methanol dehydrogenation, is energetically more favorable than the O-H bond scission pathway, and the maximum barrier along this pathway is calculated to be 12.99 kcal mol⁻¹. In contrast, the PtAu₂ cluster facilitates the reaction starting from the O-H bond scission, where the Pt atom acts as the active center throughout each elementary step of methanol dehydrogenation, and the initial O-H bond scission with a barrier of 21.42 kcal mol⁻¹ is the bottleneck step of methanol decomposition. Importantly, it is shown that the complete dehydrogenation product of methanol, CO, can more easily dissociate from the PtAu₂ cluster than from the Pt₃ cluster. The calculated results over the model clusters provide some assistance in understanding the improved catalytic activity of bimetallic PtAu catalysts toward methanol oxidation in comparison with pure Pt catalysts.

20.
If, contrary to conventional models of muscle, it is assumed that molecular forces equilibrate among rather than within molecular motors, an equation of state and an expression for energy output can be obtained for a near-equilibrium, coworking ensemble of molecular motors. These equations predict clear, testable relationships between motor structure, motor biochemistry, and ensemble motor function, and we discuss these relationships in the context of various experimental studies. In this model, net work by molecular motors is performed with the relaxation of a near-equilibrium intermediate step in a motor-catalyzed reaction. The free energy available for work is localized to this step, and the rate at which this free energy is transferred to work is accelerated by the free energy of a motor-catalyzed reaction. This thermodynamic model implicitly deals with a motile cell system as a dynamic network (not a rigid lattice) of molecular motors within which the mechanochemistry of one motor influences and is influenced by the mechanochemistry of other motors in the ensemble.
