Similar articles
 20 similar articles found (search time: 8 ms)
1.
We demonstrate that the recently proposed pruned-enriched Rosenbluth method (PERM) (Grassberger, Phys. Rev. E 56:3682, 1997) leads to extremely efficient algorithms for the folding of simple model proteins. We test it on several models for lattice heteropolymers, and compare it to published Monte Carlo studies of the properties of particular sequences. In all cases our method is faster than the previous ones, and in several cases we find new minimal energy states. In addition to producing more reliable candidates for ground states, our method gives detailed information about the thermal spectrum and thus allows one to analyze thermodynamic aspects of the folding behavior of arbitrary sequences. Proteins 32:52–66, 1998. © 1998 Wiley-Liss, Inc.
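
As a rough illustration of the chain-growth step that PERM builds on, the sketch below grows a self-avoiding walk on a 2D square lattice and accumulates its Rosenbluth weight. It is plain Rosenbluth sampling only: the pruning and enrichment steps that define PERM, and any sequence-dependent energy, are omitted. The function name `rosenbluth_walk` and the choice of lattice are assumptions made for this example, not details taken from the abstract.

```python
import random

def rosenbluth_walk(n_steps, rng):
    """Grow one self-avoiding walk; return (path, Rosenbluth weight).

    The weight is the product of the number of free neighbours at each
    growth step. A trapped walk gets weight 0.0.
    """
    path = [(0, 0)]
    occupied = {(0, 0)}
    weight = 1.0
    for _ in range(n_steps):
        x, y = path[-1]
        free = [p for p in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
                if p not in occupied]
        if not free:
            return path, 0.0  # dead end: the walk cannot be extended
        weight *= len(free)
        step = rng.choice(free)
        path.append(step)
        occupied.add(step)
    return path, weight

path, weight = rosenbluth_walk(3, random.Random(0))
```

Averaging an observable over many such walks, each weighted by its Rosenbluth weight, corrects for the bias that the growth rule introduces.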

2.
A Monte Carlo computer simulation program is designed in order to describe the spatial and time evolution of a population of living individuals under preassigned environmental conditions of energy. The simulation is inspired by previous techniques developed in physics, in particular in molecular dynamics and simulations of liquids, and it already provides some new insights regarding macroscopic deterministic models in ecology and concerning eventual control of artificial biomass production plants. Received on July 15, 1986; accepted on October 9, 1986

3.
Zhang H. Proteins 1999, 34(4):464-471
A new Hybrid Monte Carlo (HMC) algorithm has been developed to test protein potential functions and, ultimately, refine protein structures. In each cycle, a new trial conformation is generated by carrying out a short period of molecular dynamics (MD) iterations with a set of random parameters (including the MD time step, the number of MD steps, the MD temperature, and the seed for initial MD velocity assignment); the new conformation is then accepted or rejected on the basis of the Metropolis criterion. The novelty in this paper is that the potential in MD iterations is different from that in the MC step. The former uses a molecular mechanics potential; the latter uses a knowledge-based potential (KBP). Directed by the KBP, the MD iteration is used to search conformational space for realistic conformations with low KBP energy. It circumvents the difficulty in using KBP functions directly in MD simulation, as KBP functions are typically incomplete, and do not always have continuous derivatives required for the calculation of the forces. The new algorithm has been tested in explorations of conformational space. In these test calculations the KBP energy was found to drop below the value for the native conformation, and the correlation between the root mean square deviation (RMSD) and the KBP energy was shown to be different from the test results in other references. At the present time, the algorithm is useful for testing new KBP functions. Furthermore, if a KBP function can be found for which the native conformation has the lowest energy and the energy/RMSD correlation is good, then this new algorithm also will be a tool for refinement of the theory-based structural models.
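
The accept/reject step mentioned in this abstract can be sketched as a standard Metropolis test. The sketch assumes the two energies are KBP values for the old and trial conformations and that `temperature` is expressed in the same energy units (that is, it stands for kT); the function name and signature are hypothetical, and the MD proposal step is not shown.

```python
import math
import random

def metropolis_accept(e_old, e_new, temperature, rng=random):
    """Return True if a move from energy e_old to e_new is accepted."""
    if e_new <= e_old:
        return True  # downhill (or equal-energy) moves are always accepted
    # Uphill moves are accepted with probability exp(-dE / kT).
    return rng.random() < math.exp(-(e_new - e_old) / temperature)
```

A driver loop would call a trial-conformation generator, evaluate both energies, and keep the trial conformation only when this function returns True.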

4.

Background  

The ab initio protein folding problem consists of predicting protein tertiary structure from a given amino acid sequence by minimizing an energy function; it is one of the most important and challenging problems in biochemistry, molecular biology and biophysics. The ab initio protein folding problem is computationally challenging and has been shown to be NP-hard even when conformations are restricted to a lattice. In this work, we implement and evaluate the replica exchange Monte Carlo (REMC) method, which has already been applied very successfully to more complex protein models and other optimization problems with complex energy landscapes, in combination with the highly effective pull move neighbourhood in two widely studied Hydrophobic Polar (HP) lattice models.
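
To make the two ingredients named here concrete, the sketch below shows an energy function for the 2D square-lattice HP model (one unit of energy gained per non-bonded H-H contact) and the usual replica exchange swap probability. Both function names are assumptions for this example, temperatures are taken to be in energy units, and the pull move neighbourhood itself is not implemented.

```python
import math

def hp_energy(sequence, coords):
    """Energy of a 2D HP conformation: -1 per non-bonded H-H lattice contact."""
    index_at = {pos: i for i, pos in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy

def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Probability of exchanging the conformations of replicas i and j."""
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    if delta >= 0:
        return 1.0  # favourable swaps are always accepted
    return math.exp(delta)

# A four-residue chain folded into a unit square has one H-H contact.
square_energy = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
```

In a full REMC run, each replica would perform ordinary Monte Carlo moves at its own temperature, with swaps between neighbouring temperatures attempted periodically.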

5.
6.
A new and efficient Monte Carlo algorithm for sampling protein configurations in the continuous space is presented; the efficiency of this algorithm, named Local Moves for Proteins (LMProt), was compared to other alternative algorithms. For this purpose, we used an intrachain interaction energy function that is proportional to the root mean square deviation (rmsd) with respect to alpha-carbons from native structures of real proteins. For phantom chains, the LMProt method is approximately 10^4 and 20 times faster than the algorithms Thrashing (no local moves) and Sevenfold Way (local moves), respectively. Additionally, the LMProt was tested for real chains (excluded-volume all-atoms model); proteins 5NLL (138 residues) and 1BFF (129 residues) were used to determine the folding success ξ as a function of the number η of residues involved in the chain movements, and as a function of the maximum amplitude of atomic displacement δr_max. Our results indicate that multiple local moves associated with relative chain flexibility, controlled by appropriate adjustments for η and δr_max, are essential for configurational search efficiency.

7.
MOTIVATION: We consider the problem of identifying low-complexity regions (LCRs) in a protein sequence. LCRs are regions of biased composition, normally consisting of different kinds of repeats. RESULTS: We define new complexity measures to compute the complexity of a sequence based on a given scoring matrix, such as BLOSUM 62. Our complexity measures also consider the order of amino acids in the sequence and the sequence length. We develop a novel graph-based algorithm called GBA to identify LCRs in a protein sequence. In the graph constructed for the sequence, each vertex corresponds to a pair of similar amino acids. Each edge connects two pairs of amino acids that can be grouped together to form a longer repeat. GBA finds short subsequences as LCR candidates by traversing this graph. It then extends them to find longer subsequences that may contain full repeats with low complexities. Extended subsequences are then post-processed to refine repeats to LCRs. Our experiments on real data show that GBA has significantly higher recall compared to existing algorithms, including 0j.py, CARD, and SEG. AVAILABILITY: The program is available on request.

8.
An evolutionary Monte Carlo algorithm for predicting DNA hybridization   Total citations: 1, self-citations: 0, citations by others: 1
Kim JS, Lee JW, Noh YK, Park JY, Lee DY, Yang KA, Chai YG, Kim JC, Zhang BT. Bio Systems 2008, 91(1):69-75
Many DNA-based technologies, such as DNA computing, DNA nanoassembly and DNA biochips, rely on DNA hybridization reactions. Previous hybridization models have focused on macroscopic reactions between two DNA strands at the sequence level. Here, we propose a novel population-based Monte Carlo algorithm that simulates a microscopic model of reacting DNA molecules. The algorithm uses two essential thermodynamic quantities of DNA molecules: the binding energy of bound DNA strands and the entropy of unbound strands. Using this evolutionary Monte Carlo method, we obtain a minimum free energy configuration in the equilibrium state. We applied this method to a logical reasoning problem and compared the simulation results with the experimental results of the wet-lab DNA experiments performed subsequently. Our simulation predicted the experimental results quantitatively.

9.
Over the past three decades, a number of powerful simulation algorithms have been introduced to the protein folding problem. For many years, the emphasis has been placed on how to both overcome the multiple minima problem and find the conformation with the global minimum potential energy. Since the new view of the protein folding mechanism (based on the free energy landscape of the protein system) arose in the past few years, however, it is now of interest to obtain a global knowledge of the phase space, including the intermediate and denatured states of proteins. Monte Carlo methods have proved especially valuable for these purposes. As well as new, powerful optimization techniques, novel algorithms that can sample a much wider phase space than conventional methods have been established.

10.
Mao Y, Xu S. Heredity 2005, 94(3):305-315
Identity-By-Descent (IBD) is a general measurement of the relationship between two groups of genes. If the two groups consist of two homologous genes, one from each individual, the IBD is called the coancestry between the two individuals. Coancestry is an important concept in both population and quantitative genetics. It is the probability that both genes are copies of the same gene in the genealogy. The average coancestry value at a random locus in a population reflects the level of population diversity, effective population size, the level of inbreeding and other attributes. Coancestry is also the building block for the covariance structure used to estimate the additive genetic variance component for a quantitative trait. There are many other types of IBD matrices, depending on the natures of the genes included in each group, and these IBD matrices vary from locus to locus. Molecular markers distributed along the genome provide information that can be used to infer these locus-specific IBD matrices. As a result, we can estimate and test the variance components of a quantitative trait contributed by these loci using the inferred IBD matrices. In this study, we develop the concept of locus-specific epistatic IBD matrices and a Monte Carlo method to infer these IBD matrices. The method is suitable for large pedigrees with arbitrary complexity and various levels of missing marker information. With these locus-specific IBD matrices, we are ready to search for quantitative trait loci along the genome in complicated pedigrees.

11.
12.
Amino acid sequences have already been examined in some detail in order to relate them to structural aspects, homology and gene duplication. This report introduces the concept of internal uniqueness of tripeptides within protein sequences and uses the Monte Carlo method to study this property. Some idea of internal uniqueness may be obtained from such an analysis using only a single sequence if the probability of the random occurrence is about 0.001 or less. This method of analysis is similar to that used in quantitative evaluations of homology. When the probability of the random occurrence is larger than 0.001 a homologous group of sequences is required and the random probabilities may be compared with the real occurrences within the group. From such an examination insulin and cytochrome c are identified as protein sequences with high internal uniqueness. A comparison of data from internal uniqueness and gene duplication analyses shows that these two properties need not be related. Results of the analysis point to internal uniqueness as an additional parameter for inclusion in speculations on why twenty amino acids are coded in protein structure.
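
The abstract does not spell out its procedure, so the following is only one plausible reading of a Monte Carlo test for tripeptide uniqueness: count the tripeptides that recur within a sequence, then estimate how often a random shuffle of the same residues shows no more recurrence than that. The function names, the shuffle-based null model, and the choice of statistic are all assumptions made for this example.

```python
import random
from collections import Counter

def repeated_tripeptides(sequence):
    """Count distinct tripeptides that occur more than once in the sequence."""
    counts = Counter(sequence[i:i + 3] for i in range(len(sequence) - 2))
    return sum(1 for c in counts.values() if c > 1)

def shuffle_p_value(sequence, trials=1000, seed=0):
    """Fraction of shuffled sequences with at most as many repeated tripeptides."""
    rng = random.Random(seed)
    observed = repeated_tripeptides(sequence)
    residues = list(sequence)
    hits = 0
    for _ in range(trials):
        rng.shuffle(residues)  # keeps composition, destroys residue order
        if repeated_tripeptides("".join(residues)) <= observed:
            hits += 1
    return hits / trials
```

Shuffling preserves amino acid composition, so the estimate reflects residue order rather than which residues happen to be abundant.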

13.
A Monte Carlo simulation procedure was used to estimate the exact level of the standardized X^2 test statistic (X_s^2) for randomness in the FSM methodology for the identification of fragile sites from chromosomal breakage data for single individuals. A random-number generator was used to simulate 10 000 chromosomal breakage data sets, each corresponding to the null hypothesis of no fragile sites for numbers of chromosomal breaks (n) from 1 to 2000 and at three levels of chromosomal band resolution (k). The reliability of the test was assessed by comparisons of the empirical and nominal α levels for each of the corresponding values of n and k. These analyses indicate that the sparse and discrete nature of chromosomal breakage data results in large and unpredictable discrepancies between the empirical and nominal α levels when fragile site identifications are based on small numbers of breaks (n < 0.5k). With n ≥ 0.5k, the distribution of X_s^2 appears to be stable and non-significant differences in the empirical and nominal α levels are generally obtained. These results are inherent to the nature of the data and are, therefore, relevant to any statistical model for the identification of fragile sites from chromosomal breakage data. For FSM identification of fragile sites at α = 0.05, we suggest that n ≥ 0.5k is the minimum reliable number of mapped chromosomal breaks per individual. Received: 28 April 1997 / Accepted: 1 July 1997
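
The comparison of empirical and nominal α levels can be sketched as follows. This example uses the ordinary Pearson X^2 statistic with equally likely bands, not the standardized X_s^2 statistic of the FSM methodology, whose definition is not given in the abstract. The function names and the caller-supplied `critical_value` are assumptions for this example.

```python
import random

def pearson_x2(counts, expected):
    """Pearson goodness-of-fit statistic for observed vs expected counts."""
    return sum((c - e) ** 2 / e for c, e in zip(counts, expected))

def empirical_alpha(n_breaks, k_bands, critical_value, n_sets=10000, seed=0):
    """Fraction of null data sets whose statistic exceeds the critical value."""
    rng = random.Random(seed)
    expected = [n_breaks / k_bands] * k_bands
    rejections = 0
    for _ in range(n_sets):
        counts = [0] * k_bands
        for _ in range(n_breaks):
            counts[rng.randrange(k_bands)] += 1  # breaks fall at random
        if pearson_x2(counts, expected) > critical_value:
            rejections += 1
    return rejections / n_sets

# 7.815 is the upper 5% point of the chi-square distribution with 3 df.
alpha_hat = empirical_alpha(200, 4, 7.815, n_sets=500)
```

When n is small relative to k, the statistic takes only a few discrete values, which is one way the empirical level can drift away from the nominal 0.05.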

14.
Mayewski S. Proteins 2005, 59(2):152-169
A new multibody, whole-residue potential for protein tertiary structure is described. The potential is based on the local environment surrounding each main-chain alpha carbon (CA), defined as the set of all residues whose CA coordinates lie within a spherical volume of set radius in 3-dimensional (3D) space surrounding that position. It is shown that the relative positions of the CAs in these local environments belong to a set of preferred templates. The templates are derived by cluster analysis of the presently available database of over 3000 protein chains (750,000 residues) having not more than 30% sequence similarity. For each template, a set of residue propensities is also derived for each topological position in the template. Using lookup tables of these derived templates, it is then possible to calculate an energy for any conformation of a given protein sequence. The application of the potential to ab initio protein tertiary structure prediction is evaluated by performing Monte Carlo simulated annealing on test protein sequences.

15.
16.
Effective probabilistic modeling approaches have been developed to find motifs of biological function in DNA sequences. However, the problem of automated model choice remains largely open and becomes more essential as the number of sequences to be analyzed is constantly increasing. Here we propose a reversible jump Markov chain Monte Carlo algorithm for estimating both parameters and model dimension of a Bayesian hidden semi-Markov model dedicated to bacterial promoter motif discovery. Bacterial promoters are complex motifs composed of two boxes separated by a spacer of variable but constrained length and occurring close to the protein translation start site. The algorithm allows simultaneous estimations of the width of the boxes, of the support size of the spacer length distribution, and of the order of the Markovian model used for the "background" nucleotide composition. The application of this method on three sequence sets points out the good behavior of the algorithm and the biological relevance of the estimated promoter motifs.

17.
18.
A new interactive graphics program is described that provides a quick and simple procedure for identifying, displaying, and manipulating the indentations, cavities, or holes in a known protein structure. These regions are defined as, e.g., the x0, y0, z0 values at which a test sphere of radius r can be placed without touching the centers of any protein atoms, subject to the condition that there is some x < x0 and some x > x0 where the sphere does touch the protein atoms. The surfaces of these pockets are modeled using a modification of the marching cubes algorithm. This modification provides identification of each closed surface so that by “clicking” on any line of the surface, the entire surface can be selected. The surface can be displayed either as a line grid or as a solid surface. After the desired “pocket” has been selected, the amino acid residues and atoms that surround this pocket can be selected and displayed. The protein database that is input can have more than one protein “segment,” allowing identification of the pockets at the interface between proteins. The use of the program is illustrated with several specific examples. The program is written in C and requires Silicon Graphics graphics routines.
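
The pocket definition in this abstract can be read as a simple geometric test, sketched below in Python rather than the C of the original program. It assumes that the sphere "touches" an atom when the atom centre lies within distance r of the sphere centre, and it checks only the x direction named in the abstract. The function name `is_pocket_point` and the atom representation (a list of centre coordinates) are hypothetical.

```python
import math

def is_pocket_point(point, atoms, r):
    """True if a probe of radius r is free at point but blocked on both sides in x."""
    x0, y0, z0 = point
    # The probe must not touch any atom centre at the candidate point.
    if any(math.dist(point, atom) <= r for atom in atoms):
        return False
    blocked_left = blocked_right = False
    for ax, ay, az in atoms:
        # An atom farther than r from the line (y0, z0) is never touched
        # by a probe sliding along x.
        if math.hypot(ay - y0, az - z0) > r:
            continue
        if ax < x0:
            blocked_left = True
        elif ax > x0:
            blocked_right = True
    return blocked_left and blocked_right

atoms = [(-3.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
inside = is_pocket_point((0.0, 0.0, 0.0), atoms, 1.0)
```

Evaluating this test on a regular grid of points would give the scalar field on which a surface-extraction step such as marching cubes could operate.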

19.
Biological membranes contain a high density of protein molecules, many of which associate into two-dimensional microdomains with important physiological functions. We have used Monte Carlo simulations to examine the self-association of idealized protein species in two dimensions. The proteins have defined bond strengths and bond angles, allowing us to estimate the size and composition of the aggregates they produce at equilibrium. With a single species of protein, the extent of cluster formation and the sizes of individual clusters both increase in non-linear fashion, showing a phase change with protein concentration and bond strength. With multiple co-aggregating proteins, we find that the extent of cluster formation also depends on the relative proportions of participating species. For some lattice geometries, a stoichiometric excess of particular species depresses cluster formation and moreover distorts the composition of clusters that do form. Our results suggest that the self-assembly of microdomains might require a critical level of subunits and that for optimal co-aggregation, proteins should be present in the membrane in the correct stoichiometric ratios.

20.
A popular way to represent clustered binary, count, or other data is via the generalized linear mixed model framework, which accommodates correlation through incorporation of random effects. A standard assumption is that the random effects follow a parametric family such as the normal distribution; however, this may be unrealistic or too restrictive to represent the data. We relax this assumption and require only that the distribution of random effects belong to a class of 'smooth' densities and approximate the density by the seminonparametric (SNP) approach of Gallant and Nychka (1987). This representation allows the density to be skewed, multi-modal, fat- or thin-tailed relative to the normal and includes the normal as a special case. Because an efficient algorithm to sample from an SNP density is available, we propose a Monte Carlo EM algorithm using a rejection sampling scheme to estimate the fixed parameters of the linear predictor, variance components and the SNP density. The approach is illustrated by application to a data set and via simulation.
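
Rejection sampling, the building block named in this abstract, can be sketched generically as below. The example does not reproduce the paper's scheme for SNP densities; it draws from a simple triangular density f(x) = 2x on [0, 1] using a uniform proposal and the envelope constant m = 2. The function name and argument layout are assumptions made for this illustration.

```python
import random

def rejection_sample(target_pdf, propose, proposal_pdf, m, rng):
    """Draw one value from target_pdf, given target_pdf(x) <= m * proposal_pdf(x)."""
    while True:
        x = propose(rng)
        # Accept x with probability target_pdf(x) / (m * proposal_pdf(x)).
        if rng.random() * m * proposal_pdf(x) <= target_pdf(x):
            return x

rng = random.Random(0)
draws = [
    rejection_sample(lambda x: 2.0 * x,      # target density on [0, 1]
                     lambda r: r.random(),   # uniform proposal sampler
                     lambda x: 1.0,          # uniform proposal density
                     2.0, rng)
    for _ in range(5000)
]
sample_mean = sum(draws) / len(draws)
```

The expected acceptance rate is 1/m, so a tight envelope matters when the sampler sits inside every E-step of a Monte Carlo EM loop.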


Copyright © Beijing Qinyun Technology Development Co., Ltd.  ICP registration no. 京ICP备09084417号