Similar Articles
Found 20 similar articles (search time: 31 ms)
1.
A program for predicting significant RNA secondary structures
We describe a program for the analysis of RNA secondary structure. There are two new features in this program. (i) To get vector speeds on a vector pipeline machine (such as the Cray X-MP/24) we have vectorized the secondary structure dynamic algorithm. (ii) The statistical significance of a locally ‘optimal’ secondary structure is assessed by a Monte Carlo method. The results can be depicted graphically, including profiles of the stability of local secondary structures and the distribution of the potentially significant secondary structures in the RNA molecules. Interesting regions where both the potentially significant secondary structures and ‘open’ structures (single-stranded coils) occur can be identified by the plots mentioned above. Furthermore, the speed of the vectorized code allows repeated Monte Carlo simulations with different overlapping window sizes. Thus, the optimal size of the significant secondary structure occurring in the interesting region can be assessed by repeating the Monte Carlo simulation. The power of the program is demonstrated in the analysis of local secondary structures of human T-cell lymphotrophic virus type III (HIV). Received on August 17, 1987; accepted on January 5, 1988

2.
Prediction of RNA secondary structure based on helical regions distribution
MOTIVATION: RNAs play an important role in many biological processes and knowing their structure is important in understanding their function. Due to difficulties in the experimental determination of RNA secondary structure, methods of theoretical prediction for known sequences are often used. Although many different algorithms for such predictions have been developed, this problem has not yet been solved. It is thus necessary to develop new methods for predicting RNA secondary structure. The most widely used at present is Zuker's algorithm, which determines the minimum-free-energy secondary structure. However, many experimentally verified RNA secondary structures are not consistent with the minimum-free-energy structures. To address this, Zuker developed a method in 1989 to search for a group of secondary structures whose free energies are close to the global minimum. When considering a group of secondary structures, without experimental data we cannot tell which one is better than the others; the same problem occurs with combinatorial and heuristic methods. These two kinds of methods have several weaknesses. Here we show how the central limit theorem can be used to solve these problems. RESULTS: An algorithm for predicting RNA secondary structure based on helical regions distribution is presented, which can be used to find the most probable secondary structure for a given RNA sequence. It consists of three steps. First, list all possible helical regions. Second, according to the central limit theorem, estimate the occurrence probability of every helical region by Monte Carlo simulation. Third, add the helical region with the highest probability to the current structure and eliminate the helical regions incompatible with it. This process is repeated until no more helical regions can be added, and the current structure is then taken as the final RNA secondary structure.
To demonstrate the reliability of the program, it was tested on three RNA sequences: tRNAPhe, pre-tRNATyr, and the Tetrahymena ribosomal RNA intervening sequence. AVAILABILITY: The program is written in Turbo Pascal 7.0. The source code is available upon request. CONTACT: Wujj@nic.bmi.ac.cn or Liwj@mail.bmi.ac.cn
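The three-step procedure described above can be sketched as a greedy selection loop. This is an illustrative reconstruction in Python, not the authors' Turbo Pascal source: `helix_compatible` uses a simple base-overlap test as the compatibility criterion, and the per-helix probabilities are assumed to come from the Monte Carlo step.

```python
def helix_compatible(h1, h2):
    # Two helices are compatible if their paired regions do not overlap.
    # A helix (i, j, n) pairs bases i..i+n-1 with bases j-n+1..j.
    (i1, j1, n1), (i2, j2, n2) = h1, h2
    r1 = set(range(i1, i1 + n1)) | set(range(j1 - n1 + 1, j1 + 1))
    r2 = set(range(i2, i2 + n2)) | set(range(j2 - n2 + 1, j2 + 1))
    return not (r1 & r2)

def greedy_structure(helices, probability):
    """helices: list of (i, j, length); probability: dict mapping each
    helix to its estimated occurrence probability (step 2 of the paper,
    via Monte Carlo). Repeatedly add the most probable helix and drop
    the helices incompatible with it (step 3)."""
    candidates = list(helices)
    structure = []
    while candidates:
        best = max(candidates, key=lambda h: probability[h])
        structure.append(best)
        candidates = [h for h in candidates
                      if h is not best and helix_compatible(h, best)]
    return structure
```

The loop terminates because each iteration removes at least the selected helix from the candidate list.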

3.
4.
5.
We report a set of atomistic folding/unfolding simulations for the hairpin ribozyme using a Monte Carlo algorithm. The hairpin ribozyme folds in solution and catalyzes self-cleavage or ligation via a specific two-domain structure. The minimal active ribozyme has been studied extensively, showing stabilization of the active structure by cations and dynamic motion of the active structure. Here, we introduce a simple model of tertiary-structure formation that leads to a phase diagram for the RNA as a function of temperature and tertiary-structure strength. We then employ this model to capture many folding/unfolding events and to examine the transition-state ensemble (TSE) of the RNA during folding to its active “docked” conformation. The TSE is compact but with few tertiary interactions formed, in agreement with single-molecule dynamics experiments. To compare with experimental kinetic parameters, we introduce a novel method to benchmark Monte Carlo kinetic parameters to docking/undocking rates collected over many single molecular trajectories. We find that topology alone, as encoded in a biased potential that discriminates between secondary and tertiary interactions, is sufficient to predict the thermodynamic behavior and kinetic folding pathway of the hairpin ribozyme. This method should be useful in predicting folding transition states for many natural or man-made RNA tertiary structures.
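The benchmarking of Monte Carlo kinetics against docking/undocking rates can be illustrated with a dwell-time estimate from a binary state trajectory. This is a hedged sketch of the general idea (the paper's actual benchmarking procedure is more involved): the rate out of a state is the number of exits divided by the total time spent in that state.

```python
def rates_from_trajectory(states, dt=1.0):
    """Estimate docking/undocking rates from a binary trajectory
    (1 = docked, 0 = undocked) sampled every dt time units:
    rate = transitions out of a state / total time spent in it."""
    t_in = {0: 0.0, 1: 0.0}   # time spent in each state
    out = {0: 0, 1: 0}        # transitions leaving each state
    for a, b in zip(states, states[1:]):
        t_in[a] += dt
        if b != a:
            out[a] += 1
    k_dock = out[0] / t_in[0] if t_in[0] else float("nan")
    k_undock = out[1] / t_in[1] if t_in[1] else float("nan")
    return k_dock, k_undock
```

With Monte Carlo trajectories, dt would be an effective time per sweep calibrated against the single-molecule rates.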

6.
MOTIVATION: Several results in the literature suggest that biologically interesting RNAs have secondary structures that are more stable than expected by chance. Based on these observations, we developed a scanning algorithm for detecting noncoding RNA genes in genome sequences, using a fully probabilistic version of the Zuker minimum-energy folding algorithm. RESULTS: Preliminary results were encouraging, but certain anomalies led us to do a carefully controlled investigation of this class of methods. Ultimately, our results argue that for the probabilistic model there is indeed a statistical effect, but it comes mostly from local base-composition bias and not from RNA secondary structure. For the thermodynamic implementation (which evaluates statistical significance by doing Monte Carlo shuffling in fixed-length sequence windows, thus eliminating the base-composition effect) the signals for noncoding RNAs are still usually indistinguishable from noise, especially when certain statistical artifacts resulting from local base-composition inhomogeneity are taken into account. We conclude that although a distinct, stable secondary structure is undoubtedly important in most noncoding RNAs, the stability of most noncoding RNA secondary structures is not sufficiently different from the predicted stability of a random sequence to be useful as a general genefinding approach.
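The shuffling test described above can be sketched as a z-score of the observed folding energy against mononucleotide shuffles of the same window, which preserve length and base composition. `energy_fn` is a placeholder for a real folding-energy calculator (the paper uses a Zuker-style fold); the toy stand-in in the test below merely counts GC stacking opportunities.

```python
import random

def shuffle_z_score(seq, energy_fn, n_shuffles=200, rng=None):
    """Z-score of a sequence's folding energy against mononucleotide
    shuffles of the same window. Negative z means the native order is
    more stable (lower energy) than its shuffled versions."""
    rng = rng or random.Random(0)
    e_obs = energy_fn(seq)
    chars = list(seq)
    shuffled = []
    for _ in range(n_shuffles):
        rng.shuffle(chars)          # uniform permutation each pass
        shuffled.append(energy_fn("".join(chars)))
    mean = sum(shuffled) / n_shuffles
    var = sum((e - mean) ** 2 for e in shuffled) / (n_shuffles - 1)
    sd = var ** 0.5 or 1e-12        # guard against degenerate windows
    return (e_obs - mean) / sd
```

Note that shuffling destroys dinucleotide correlations as well, which is exactly the base-composition artifact the abstract discusses.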

7.
One of the major challenges facing genome-scan studies to discover disease genes is the assessment of the genomewide significance. The assessment becomes particularly challenging if the scan involves a large number of markers collected from a relatively small number of meioses. Typically, this assessment has two objectives: to assess genomewide significance under the null hypothesis of no linkage and to evaluate true-positive and false-positive prediction error rates under alternative hypotheses. The distinction between these goals allows one to formulate the problem in the well-established paradigm of statistical hypothesis testing. Within this paradigm, we evaluate the traditional criterion of LOD score 3.0 and a recent suggestion of LOD score 3.6, using the Monte Carlo simulation method. The Monte Carlo experiments show that the type I error varies with the chromosome length, with the number of markers, and also with sample sizes. For a typical setup with 50 informative meioses on 50 markers uniformly distributed on a chromosome of average length (i.e., 150 cM), the use of LOD score 3.0 entails an estimated chromosomewide type I error rate of .00574, leading to a genomewide significance level >.05. In contrast, the corresponding type I error for LOD score 3.6 is .00191, giving a genomewide significance level of slightly <.05. However, with a larger sample size and a shorter chromosome, a LOD score between 3.0 and 3.6 may be preferred, on the basis of proximity to the targeted type I error. In terms of reliability, these two LOD-score criteria appear not to have appreciable differences. These simulation experiments also identified factors that influence power and reliability, shedding light on the design of genome-scan studies.
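The Monte Carlo estimation of a chromosomewide type I error can be illustrated with a toy simulation. This sketch treats markers as independent single-df tests, which is a simplifying assumption (real linked markers are correlated), so it illustrates the method rather than reproducing the paper's numbers.

```python
import math
import random

def chromosomewide_type1(n_markers=50, lod_threshold=3.0,
                         n_sims=20000, rng=None):
    """Toy Monte Carlo estimate of the chromosomewide type I error:
    the probability that ANY of n_markers null LOD scores exceeds the
    threshold. Under the null, a single-df LOD is approximately
    Z^2 / (2 ln 10), so LOD > c corresponds to Z^2 > 2 ln(10) * c."""
    rng = rng or random.Random(1)
    cutoff = 2.0 * math.log(10.0) * lod_threshold  # chi-square(1) cutoff
    hits = 0
    for _ in range(n_sims):
        if any(rng.gauss(0.0, 1.0) ** 2 > cutoff for _ in range(n_markers)):
            hits += 1
    return hits / n_sims
```

A stricter threshold should yield a smaller type I error, which the test below checks.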

8.
Cheon S, Liang F. Biosystems 2008;91(1):94-107
Monte Carlo methods have received much attention recently in the literature of phylogenetic tree construction. However, they often suffer from two difficulties: the curse of dimensionality and the local-trap problem. The former arises because the number of possible phylogenetic trees increases at a super-exponential rate as the number of taxa increases; the latter because the phylogenetic tree often has a rugged energy landscape. In this paper, we propose a new phylogenetic tree construction method, which attempts to alleviate these two difficulties simultaneously by making use of the sequential structure of phylogenetic trees in conjunction with stochastic approximation Monte Carlo (SAMC) simulations. The use of the sequential structure of the problem substantially reduces the curse of dimensionality in simulations, and SAMC effectively prevents the system from getting trapped in local energy minima. The new method is compared with a variety of existing Bayesian and non-Bayesian methods on simulated and real datasets. Numerical results favor the new method in terms of the quality of the resulting phylogenetic trees.

9.
Monte Carlo methods have received much attention in the recent literature of phylogeny analysis. However, conventional Markov chain Monte Carlo algorithms, such as the Metropolis–Hastings algorithm, tend to get trapped in a local mode when simulating from the posterior distribution of phylogenetic trees, rendering the inference ineffective. In this paper, we apply an advanced Monte Carlo algorithm, the stochastic approximation Monte Carlo (SAMC) algorithm, to Bayesian phylogeny analysis. Our method is compared with two popular Bayesian phylogeny software packages, BAMBE and MrBayes, on simulated and real datasets. The numerical results indicate that our method outperforms BAMBE and MrBayes: among the three methods, SAMC produces the consensus trees with the highest similarity to the true trees and the model parameter estimates with the smallest mean square errors, while costing the least CPU time.
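The SAMC idea — adaptively reweighting energy subregions so the sampler visits all of them and escapes local modes — can be sketched on a toy continuous double-well rather than tree space. This is a minimal illustration of the weight update, not the paper's tree sampler; the energy partition `edges` and gain schedule are arbitrary choices.

```python
import math
import random

def samc(energy, propose, x0, edges, n_iter=5000, t0=100, rng=None):
    """Minimal stochastic approximation Monte Carlo (SAMC) sketch.
    `edges` partitions the energy range into len(edges)+1 subregions;
    theta holds adaptive log-weights that flatten sampling across
    subregions, which is what lets SAMC cross energy barriers."""
    rng = rng or random.Random(0)
    m = len(edges) + 1
    theta = [0.0] * m
    pi = 1.0 / m                       # desired visiting frequency
    def region(e):
        return sum(e > b for b in edges)
    x, e = x0, energy(x0)
    visits = [0] * m
    for t in range(1, n_iter + 1):
        y = propose(x, rng)
        ey = energy(y)
        # Metropolis ratio for target exp(-E(x)) / exp(theta[region])
        log_r = (-ey + theta[region(e)]) - (-e + theta[region(ey)])
        if math.log(1.0 - rng.random() + 1e-300) < log_r:
            x, e = y, ey
        gamma = t0 / max(t0, t)        # decreasing gain factor
        j = region(e)
        for i in range(m):
            theta[i] += gamma * ((1.0 if i == j else 0.0) - pi)
        visits[j] += 1
    return theta, visits
```

Unvisited regions accumulate increasingly negative weights, which raises the acceptance probability of moves into them.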

10.
The stability of potential RNA stem-loop structures in human immunodeficiency virus isolates, HTLV-III and ARV, has been calculated, and the relevance to the local significant secondary structures in the sequence has been tested statistically using a Monte Carlo simulation method. Potentially significant structures exist in the 5' non-coding region, the boundary regions between the protein coding frames, and the 3' non-coding region. The locally optimal secondary structure occurring in the 5' terminal region has been assessed using different overlapping segment sizes and the Monte Carlo method. The results show that the most favorable structure for the 5' mRNA leader sequence of HIV has two stem-loops folded at nucleotides 5-104 in the R region (stem-loop I, 5-54 and stem-loop II, 58-104). A large fluctuation of segment score of the local optimal secondary structure also occurs in the boundary between the exterior glycosylated protein or outer membrane protein and transmembrane protein coding region. This finding is surprising since no RNA signals or RNA processing are expected to occur at this site. In addition, regions of the genome predicted to have significantly more open structure at the RNA level correlate closely with hypervariable sites found in these viral genomes. The possible importance of local secondary structure to the biological function of the human immunodeficiency virus genome is discussed.

11.
12.
Understanding protein folding: small proteins in silico
Recent improvements in methodology and increased computer power now allow atomistic computer simulations of protein folding. We briefly review several advanced Monte Carlo algorithms that have contributed to this development. Details of folding simulations of three designed mini proteins are shown. Adding global translations and rotations has allowed us to handle multiple chains and to simulate the aggregation of six beta-amyloid fragments. In a different line of research we have developed several algorithms to predict local features from sequence. In an outlook we sketch how such biasing could extend the application spectrum of Monte Carlo simulations to structure prediction of larger proteins.

13.
We predict the metastable secondary structure for the CAR (cis anti-repressor sequence) which is active in the regulation of the interaction with the Rev protein of HIV-1. We prove that the active structure sustained between nts 7364 and 7559 of the env RNA is the most probable metastable structure whose formation is kinetically governed and not thermodynamically determined. The structure is obtained by means of a Monte Carlo simulation which computes refolding events which occur as the CAR portion of the viral RNA is being assembled. Thus, the regulatory role of the secondary structure is determined as soon as the new RNA has been synthesized and it is preserved for further recognition by the Rev protein. In analogy with previous work by the author, it is shown that the destabilization by site-directed mutagenesis of the secondary structure for a non-chain-specific recognition site enhances the Rev response.

14.
We have developed a method for detecting more stable and significant folding regions relative to others in the sequence. The algorithm is based on the calculation of the lowest free energy of RNA secondary structures and Monte Carlo simulation. For any given RNA segment, the stability and statistical significance of RNA folding are assessed by two measures: the stability score and the significance score. The stability score measures the degree of thermodynamic stability of the segment between all possible biological segments in the RNA sequence. The significance score characterizes the specific arrangement of the nucleotides in the segment that could imply a structural role for the sequence information. Using these two measures, we are able to detect a series of distinct folding regions where highly stable and statistically significant secondary structures occur in human immunodeficiency virus (HIV) and simian immunodeficiency virus (SIV) sequences. Received on April 4, 1990; accepted on October 2, 1990

15.
We have studied the use of a new Monte Carlo (MC) chain generation algorithm, introduced by T. Garel and H. Orland [(1990) Journal of Physics A, Vol. 23, pp. L621–L626], for examining the thermodynamics of protein folding transitions and for generating candidate Cα backbone structures as starting points for a de novo protein structure paradigm. This algorithm, termed the guided replication Monte Carlo method, allows a rational approach to the introduction of known “native” folded characteristics as constraints in the chain generation process. We have shown this algorithm to be computationally very efficient in generating large ensembles of candidate Cα chains on the face-centered cubic lattice, and illustrate its use by calculating a number of thermodynamic quantities related to protein folding characteristics. In particular, we have used this static MC algorithm to compute such temperature-dependent quantities as the ensemble mean energy, ensemble mean free energy, the heat capacity, and the mean-square radius of gyration. We also demonstrate the use of several simple “guide fields” for introducing protein-specific constraints into the ensemble generation process. Several extensions to our current model are suggested, and applications of the method to other folding related problems are discussed. © 1995 John Wiley & Sons, Inc.  相似文献
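Computing temperature-dependent quantities from a static ensemble can be sketched as Boltzmann reweighting of sampled conformation energies. This sketch assumes uniformly weighted samples (chain-growth methods usually attach importance weights, which would multiply the Boltzmann factors) and uses units with k_B = 1.

```python
import math

def thermal_averages(energies, temperature):
    """Given an ensemble of conformation energies, compute the
    Boltzmann-weighted mean energy and the heat capacity
    C = (<E^2> - <E>^2) / T^2 at the given temperature."""
    beta = 1.0 / temperature
    e0 = min(energies)                  # shift energies for stability
    w = [math.exp(-beta * (e - e0)) for e in energies]
    z = sum(w)                          # (shifted) partition function
    mean_e = sum(wi * e for wi, e in zip(w, energies)) / z
    mean_e2 = sum(wi * e * e for wi, e in zip(w, energies)) / z
    heat_capacity = (mean_e2 - mean_e ** 2) / temperature ** 2
    return mean_e, heat_capacity
```

The test below checks the sketch against the exact two-level system with energies 0 and 1 at T = 1, where the mean energy is 1/(1+e).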

16.
MOTIVATION: Implementation and development of statistical methods for high-dimensional data often require high-dimensional Monte Carlo simulations. Simulations are used to assess performance, evaluate robustness, and in some cases for implementation of algorithms. But simulation in high dimensions is often very complex, cumbersome and slow. As a result, performance evaluations are often limited, robustness minimally investigated and dissemination impeded by implementation challenges. This article presents a method for converting complex, slow high-dimensional Monte Carlo simulations into simpler, faster lower dimensional simulations. RESULTS: We implement the method by converting a previous Monte Carlo algorithm into this novel Monte Carlo, which we call AROHIL Monte Carlo. AROHIL Monte Carlo is shown to exactly or closely match pure Monte Carlo results in a number of examples. It is shown that computing time can be reduced by several orders of magnitude. The confidence bound method implemented using AROHIL outperforms the pure Monte Carlo method. Finally, the utility of the method is shown by application to a number of real microarray datasets.

17.
Ecosystem nutrient budgets often report values for pools and fluxes without any indication of uncertainty, which makes it difficult to evaluate the significance of findings or make comparisons across systems. We present an example, implemented in Excel, of a Monte Carlo approach to estimating error in calculating the N content of vegetation at the Hubbard Brook Experimental Forest in New Hampshire. The total N content of trees was estimated at 847 kg ha−1 with an uncertainty of 8%, expressed as the standard deviation divided by the mean (the coefficient of variation). The individual sources of uncertainty were as follows: uncertainty in allometric equations (5%), uncertainty in tissue N concentrations (3%), uncertainty due to plot variability (6%, based on a sample of 15 plots of 0.05 ha), and uncertainty due to tree diameter measurement error (0.02%). In addition to allowing estimation of uncertainty in budget estimates, this approach can be used to assess which measurements should be improved to reduce uncertainty in the calculated values. This exercise was possible because the uncertainty in the parameters and equations that we used was made available by previous researchers. It is important to provide the error statistics with regression results if they are to be used in later calculations; archiving the data makes resampling analyses possible for future researchers. When conducted using a Monte Carlo framework, the analysis of uncertainty in complex calculations does not have to be difficult and should be standard practice when constructing ecosystem budgets.
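The Monte Carlo error-propagation approach (implemented in Excel in the paper) can be sketched in a few lines: draw each input from a distribution reflecting its relative uncertainty, recompute the budget, and report the coefficient of variation of the results. The numbers below are illustrative placeholders, not the Hubbard Brook data.

```python
import random
import statistics

def propagate_uncertainty(n_draws=20000, rng=None):
    """Monte Carlo error propagation for a toy vegetation-N budget:
    total N = biomass x tissue N concentration x plot factor, with each
    input perturbed by a hypothetical normally distributed relative
    error. Returns (mean total N in kg/ha, coefficient of variation)."""
    rng = rng or random.Random(42)
    totals = []
    for _ in range(n_draws):
        biomass = 200.0 * (1 + rng.gauss(0, 0.05))   # Mg/ha, 5% allometric
        conc = 4.0 * (1 + rng.gauss(0, 0.03))        # g N/kg, 3% tissue N
        plot_factor = 1 + rng.gauss(0, 0.06)         # 6% plot variability
        # Mg/ha -> kg/ha (x1000), g/kg -> kg/kg (/1000)
        totals.append(biomass * 1000 * conc / 1000 * plot_factor)
    mean = statistics.fmean(totals)
    cv = statistics.stdev(totals) / mean
    return mean, cv
```

For independent relative errors the CV should come out near sqrt(0.05² + 0.03² + 0.06²) ≈ 8.4%, mirroring the paper's 8% figure in structure, not in value.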

18.
In epidemics of infectious diseases such as influenza, an individual may have one of four possible final states: prior immune, escaped from infection, infected with symptoms, and infected asymptomatically. The exact state is often not observed. In addition, the unobserved transmission times of asymptomatic infections further complicate analysis. Under the assumption of missing at random, data‐augmentation techniques can be used to integrate out such uncertainties. We adapt an importance‐sampling‐based Monte Carlo Expectation‐Maximization (MCEM) algorithm to the setting of an infectious disease transmitted in close contact groups. Assuming the independence between close contact groups, we propose a hybrid EM‐MCEM algorithm that applies the MCEM or the traditional EM algorithms to each close contact group depending on the dimension of missing data in that group, and discuss the variance estimation for this practice. In addition, we propose a bootstrap approach to assess the total Monte Carlo error and factor that error into the variance estimation. The proposed methods are evaluated using simulation studies. We use the hybrid EM‐MCEM algorithm to analyze two influenza epidemics in the late 1970s to assess the effects of age and preseason antibody levels on the transmissibility and pathogenicity of the viruses.
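The MCEM idea — replace an intractable E-step with Monte Carlo imputation of the missing data, then maximize as usual — can be illustrated on a much simpler missing-data model than the epidemic setting: exponential lifetimes with right censoring. This is a toy stand-in, not the paper's transmission model.

```python
import random

def mcem_exponential(observed, censored_at, n_iter=30, m=2000, rng=None):
    """Toy Monte Carlo EM for an exponential rate with right-censored
    data. E-step: impute each censored value by Monte Carlo draws from
    x | x > c under the current rate (sampled via the memoryless
    property: c plus a fresh exponential). M-step: the exponential MLE
    given the imputed sufficient statistic."""
    rng = rng or random.Random(7)
    lam = 1.0                                  # initial rate guess
    n = len(observed) + len(censored_at)
    for _ in range(n_iter):
        total = sum(observed)
        for c in censored_at:
            draws = [c + rng.expovariate(lam) for _ in range(m)]
            total += sum(draws) / m            # Monte Carlo E[x | x > c]
        lam = n / total                        # M-step
    return lam
```

For this model the fixed point can be computed exactly (n divided by the sum of observed times, censoring times, and one mean residual life per censored case), which makes it a convenient check on the Monte Carlo version.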

19.
Site specific incorporation of molecular probes such as fluorescent- and nitroxide spin-labels into biomolecules, and subsequent analysis by Förster resonance energy transfer (FRET) and double electron-electron resonance (DEER) can elucidate the distance and distance-changes between the probes. However, the probes have an intrinsic conformational flexibility due to the linker by which they are conjugated to the biomolecule. This property minimizes the influence of the label side chain on the structure of the target molecule, but complicates the direct correlation of the experimental inter-label distances with the macromolecular structure or changes thereof. Simulation methods that account for the conformational flexibility and orientation of the probe(s) can be helpful in overcoming this problem. We performed distance measurements using FRET and DEER and explored different simulation techniques to predict inter-label distances using the Rpo4/7 stalk module of the M. jannaschii RNA polymerase. This is a suitable model system because it is rigid and a high-resolution X-ray structure is available. The conformations of the fluorescent labels and nitroxide spin labels on Rpo4/7 were modeled using in vacuo molecular dynamics simulations (MD) and a stochastic Monte Carlo sampling approach. For the nitroxide probes we also performed MD simulations with explicit water and carried out a rotamer library analysis. Our results show that the Monte Carlo simulations are in better agreement with experiments than the MD simulations and the rotamer library approach results in plausible distance predictions. Because the latter is the least computationally demanding of the methods we have explored, and is readily available to many researchers, it prevails as the method of choice for the interpretation of DEER distance distributions.
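The core of any of these label-simulation approaches is the same: sample plausible probe-tip positions around each attachment site and accumulate a tip-to-tip distance distribution. The sketch below uses a uniform ball of radius `linker` as a crude stand-in for a real rotamer or MD ensemble, just to show the distance-distribution bookkeeping.

```python
import math
import random

def label_distance_stats(p1, p2, linker=1.0, n=20000, rng=None):
    """Monte Carlo sketch of inter-label distance prediction: each probe
    tip is the attachment point plus a displacement drawn uniformly from
    a ball of radius `linker` (hypothetical flexibility model). Returns
    the mean and standard deviation of the tip-tip distances."""
    rng = rng or random.Random(5)
    def tip(p):
        u = 2 * rng.random() - 1             # uniform cos(theta)
        phi = 2 * math.pi * rng.random()
        r = linker * rng.random() ** (1 / 3)  # uniform-in-ball radius
        s = math.sqrt(1 - u * u)
        return (p[0] + r * s * math.cos(phi),
                p[1] + r * s * math.sin(phi),
                p[2] + r * u)
    ds = [math.dist(tip(p1), tip(p2)) for _ in range(n)]
    mean = sum(ds) / n
    sd = (sum((d - mean) ** 2 for d in ds) / n) ** 0.5
    return mean, sd
```

Note the predicted mean distance is slightly larger than the attachment-point distance because transverse label excursions can only lengthen the separation, one reason inter-label and backbone distances must not be conflated.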

20.

Background

Monte Carlo simulations of light propagation in fully segmented three-dimensional MRI-based anatomical models of the human head have been reported in many articles. To our knowledge, however, there is no patient-oriented simulation for individualized calibration of NIRS measurements. We therefore offer an approach to brain modeling based on an image segmentation process applied to an in vivo three-dimensional MRI T1 image, to investigate individualized calibration for NIRS measurement with Monte Carlo simulation.

Methods

In this study, an individualized brain is modeled from an in vivo 3D MRI image as a five-layer structure. The behavior of photon migration in this individualized brain model was studied using a three-dimensional time-resolved Monte Carlo algorithm. During the Monte Carlo iterations, all photon paths were traced for various source-detector separations to characterize the brain structure and provide helpful information for the individualized design of a NIRS system.

Results

Our results indicate that the patient-oriented simulation can identify the optimal choice of source-detector separation, within 3.3 cm for the individualized design in this case. Significant distortions were observed around the cerebral cortex folding. The spatial sensitivity profile penetrated deeper into the brain in the case of expanded CSF. This finding suggests that the optical method may provide not only a functional signal from brain activation but also structural information on brain atrophy with an expanded CSF layer. The proposed modeling method also provides multi-wavelength NIRS simulation to approach practical NIRS measurement.

Conclusions

In this study, the three-dimensional time-resolved brain modeling method closely approximates the real human brain and provides useful information for NIRS system design and calibration in individualized cases with prior MRI data.
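The photon-migration Monte Carlo underlying this kind of study can be sketched in its simplest form: exponential free paths, isotropic scattering, and per-interaction absorption in a semi-infinite homogeneous medium. This toy stands in for the five-layer head model (layer-dependent optical properties, anisotropic scattering, and time resolution are all omitted); the optical coefficients are illustrative.

```python
import math
import random

def photon_max_depths(n_photons=2000, mu_s=10.0, mu_a=0.1, rng=None):
    """Minimal photon-migration Monte Carlo in a semi-infinite medium
    below z = 0. Each photon takes exponential steps (mean 1/mu_t),
    scatters isotropically, and is absorbed with probability mu_a/mu_t
    per interaction. Returns the maximum depth reached by each photon
    that re-emerges at the surface, i.e. those a detector could see.
    Lateral coordinates are omitted since only depth is tracked."""
    rng = rng or random.Random(3)
    mu_t = mu_s + mu_a
    depths = []
    for _ in range(n_photons):
        z, max_z = 0.0, 0.0
        uz = 1.0                            # launched straight down
        while True:
            step = -math.log(1.0 - rng.random()) / mu_t
            z += uz * step
            if z < 0:                       # escaped: "detected"
                depths.append(max_z)
                break
            max_z = max(max_z, z)
            if rng.random() < mu_a / mu_t:  # absorbed: never detected
                break
            uz = 2.0 * rng.random() - 1.0   # isotropic: uniform cos(theta)
    return depths
```

In the full model, binning detected photons by lateral exit position (and arrival time) yields the spatial sensitivity profiles discussed above.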

