Similar literature
Found 20 similar articles (search time: 15 ms)
1.
ABSTRACT: BACKGROUND: Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3-10 amino acids in length, that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of proteins based on simple regular expressions. Here, we extend these approaches to profile-based methods, which provide a richer motif representation. RESULTS: The profile motif discovery method MEME performed relatively poorly for motifs in disordered regions of proteins. However, when we applied evolutionary weighting to account for redundancy amongst homologous proteins, and masked out poorly conserved regions of disordered proteins, the performance of MEME was equivalent to that of regular expression methods. Even so, the two approaches returned different subsets within both a benchmark dataset and a more realistic discovery dataset. CONCLUSIONS: Profile-based motif discovery methods complement regular expression based methods. Whilst profile-based methods are computationally more intensive, they are likely to discover motifs currently overlooked by regular expression methods.
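As a rough illustration of the regular-expression motif representation this abstract refers to, the sketch below counts how many sequences in a set contain a given pattern. The pattern `P..P`, the toy sequences, and the function name are invented for illustration and are not taken from the paper; an actual discovery method would also score over-representation against a background model.

```python
import re


def count_motif_hits(pattern, sequences):
    """Return how many sequences contain at least one match of the pattern."""
    regex = re.compile(pattern)
    return sum(1 for seq in sequences if regex.search(seq))


# Hypothetical motif: proline, two arbitrary residues, proline.
toy_sequences = ["MKPAAPLS", "GGGSGGGS", "APPLPQRPT"]
hits = count_motif_hits(r"P..P", toy_sequences)
print(f"{hits} of {len(toy_sequences)} sequences contain the motif")
```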

2.
Pierri CL  De Grassi A  Turi A 《Proteins》2008,73(2):351-361
In the study of the protein folding problem with ab initio methods, the protein backbone can be built on some periodic lattices. Any vertex of these lattices can be occupied by a "ball," which can represent the mass center of an amino acid in a simplified coarse-grained model of the protein. The backbone, at a coarse-grained level, can be constituted of a No Reverse Self Avoiding Walk, which cannot intersect itself and cannot go back on itself. There is still much debate between those who use lattices to simplify the study of the protein folding problem and those preferring to work by using an off-lattice approach. Lattices can help to identify the protein tertiary structure in a computationally less expensive way than off-lattice approaches, which have to consider a potentially infinite number of possible structures. However, the use of a lattice constituted of insufficiently accurate direction vectors constrains the predictive ability of the model. The aim of this study is to perform a systematic screening of 7 known classic and 11 newly proposed lattices in terms of predictive power. The crystal structures of 42 different proteins (14 mainly alpha helical, 14 mainly beta sheet and 14 mixed structure proteins) were compared to the most accurate simulated models for each lattice. This strategy defines a scale of fitness for all the analyzed lattices and demonstrates that an increase in the coordination number and in the degrees of freedom is necessary but not sufficient to reach the best result. Instead, the introduction of a good set of direction vectors, as developed and tested in this study, strongly increases the lattice performance.
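The "no reverse, self-avoiding" constraint described in this abstract can be sketched as a simple validity check. The example assumes the simple cubic lattice with unit steps, chosen only for brevity; the lattices screened in the paper use other sets of direction vectors, and the function name is hypothetical.

```python
def is_valid_walk(steps):
    """Check a lattice walk given as a list of integer step vectors.

    The walk is valid if no step immediately reverses the previous one
    and no lattice vertex is visited twice.
    """
    position = (0, 0, 0)
    visited = {position}
    previous = None
    for step in steps:
        # Reject a step that is the exact negative of the previous step.
        if previous is not None and all(a == -b for a, b in zip(step, previous)):
            return False
        position = tuple(p + s for p, s in zip(position, step))
        # Reject a step that lands on an already occupied vertex.
        if position in visited:
            return False
        visited.add(position)
        previous = step
    return True


print(is_valid_walk([(1, 0, 0), (0, 1, 0), (-1, 0, 0)]))  # open three-step path
print(is_valid_walk([(1, 0, 0), (-1, 0, 0)]))             # immediate reversal
```

An immediate reversal always revisits a vertex, so the first check is redundant in strict terms; it is kept separate so that the two constraints named in the abstract stay visible.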

3.
4.
Prospects for ab initio protein structural genomics   Total citations: 2, self-citations: 0, citations by others: 2
We present the results of a large-scale testing of the ROSETTA method for ab initio protein structure prediction. Models were generated for two independently generated lists of small proteins (up to 150 amino acid residues), and the results were evaluated using traditional rmsd based measures and a novel measure based on the structure-based comparison of the models to the structures in the PDB using DALI. For 111 of 136 all alpha and alpha/beta proteins 50 to 150 residues in length, the method produced at least one model within 7 Å rmsd of the native structure in 1000 attempts. For 60 of these proteins, the closest structure match in the PDB to at least one of the ten most frequently generated conformations was found to be structurally related (four standard deviations above background) to the native protein. These results suggest that ab initio structure prediction approaches may soon be useful for generating low resolution models and identifying distantly related proteins with similar structures and perhaps functions for these classes of proteins on the genome scale.

5.
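Ab initio protein structure prediction obtains the native structure from amino acid sequence information alone, without relying on templates. Its key lies in correctly defining the energy function and in choosing an accurate computational search algorithm to find the energy minimum. On this basis, this paper systematically introduces energy functions and conformational search strategies and lists several relatively successful ab initio prediction methods. The comparison leads to the conclusion that knowledge-based (statistical) energy functions have been the main direction of development in ab initio prediction in recent years, and that existing ab initio conformational searches all make use of the Monte Carlo method. This indicates that, as research on protein structure prediction has deepened, breakthrough progress has been made on key problems such as the construction of energy functions, the choice of conformational search methods, and the ab initio prediction of large protein structures.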

6.
Phylomat: an automated protein motif analysis tool for phylogenomics   Total citations: 2, self-citations: 0, citations by others: 2
Recent progress in genomics, proteomics, and bioinformatics enables unprecedented opportunities to examine the evolutionary history of molecular, cellular, and developmental pathways through phylogenomics. Accordingly, we have developed a motif analysis tool for phylogenomics (Phylomat, http://alg.ncsa.uiuc.edu/pmat) that scans predicted proteome sets for proteins containing highly conserved amino acid motifs or domains for in silico analysis of the evolutionary history of these motifs/domains. Phylomat enables the user to download results as full protein or extracted motif/domain sequences from each protein. Tables containing the percent distribution of a motif/domain in organisms normalized to proteome size are displayed. Phylomat can also align the set of full protein or extracted motif/domain sequences and predict a neighbor-joining tree from relative sequence similarity. Together, Phylomat serves as a user-friendly data-mining tool for the phylogenomic analysis of conserved sequence motifs/domains in annotated proteomes from the three domains of life.

7.
Ab initio quantum chemical calculations of molecular properties such as torsional potential energies require massive computational effort even for moderately sized molecules, if basis sets of reasonable quality are employed. Using ab initio data on conformational properties of the cofactor (6R,1′R,2′S)-5,6,7,8-tetrahydrobiopterin, we demonstrate that error backpropagation networks can be established that efficiently approximate complicated functional relationships such as torsional potential energy surfaces of a flexible molecule. Our pilot simulations suggest that properly trained neural networks might provide an extremely compact storage medium for quantum chemically obtained information. Moreover, they are remarkably convenient tools when it comes to making use of the stored information. One possible application is demonstrated, namely, computation of relaxed torsional energy surfaces.

8.
Contact order and ab initio protein structure prediction   Total citations: 1, self-citations: 0, citations by others: 1
Although much of the motivation for experimental studies of protein folding is to obtain insights for improving protein structure prediction, there has been relatively little connection between experimental protein folding studies and computational structural prediction work in recent years. In the present study, we show that the relationship between protein folding rates and the contact order (CO) of the native structure has implications for ab initio protein structure prediction. Rosetta ab initio folding simulations produce a dearth of high CO structures and an excess of low CO structures, as expected if the computer simulations mimic to some extent the actual folding process. Consistent with this, the majority of failures in ab initio prediction in the CASP4 (critical assessment of structure prediction) experiment involved high CO structures likely to fold much more slowly than the lower CO structures for which reasonable predictions were made. This bias against high CO structures can be partially alleviated by performing large numbers of additional simulations, selecting out the higher CO structures, and eliminating the very low CO structures; this leads to a modest improvement in prediction quality. More significant improvements in predictions for proteins with complex topologies may be possible following significant increases in high-performance computing power, which will be required for thoroughly sampling high CO conformations (high CO proteins can take six orders of magnitude longer to fold than low CO proteins). Importantly for such a strategy, simulations performed for high CO structures converge much less strongly than those for low CO structures, and hence, lack of simulation convergence can indicate the need for improved sampling of high CO conformations. 
The parallels between Rosetta simulations and folding in vivo may extend to misfolding: The very low CO structures that accumulate in Rosetta simulations consist primarily of local up-down beta-sheets that may resemble precursors to amyloid formation.
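The abstract does not spell out how contact order is computed. The sketch below assumes the commonly used definition of relative contact order: the mean sequence separation of contacting residue pairs, divided by the chain length. The contact list, the chain length, and the function name are made up for illustration.

```python
def relative_contact_order(contacts, length):
    """Mean sequence separation of contacting pairs, normalized by chain length.

    contacts: list of (i, j) residue index pairs that are in contact.
    length:   total number of residues in the chain.
    """
    if not contacts:
        return 0.0
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (length * len(contacts))


# Toy example: two contacts in a 10-residue chain.
print(relative_contact_order([(1, 5), (2, 10)], 10))
```

Under this definition, structures dominated by local contacts (short sequence separations) score low, which matches the abstract's description of low CO structures as consisting mainly of local up-down beta-sheets.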

9.
This work shows that indigo's high stability can be attributed both to the large π conjugation inside the molecule and to intra- and intermolecular hydrogen bonds. The theoretical investigation of indigo's electronic structure has been performed using high-level methods. To understand the interactions in the solid state, calculations of the dimer system with both molecules in the same plane were carried out. In the monomer, two intramolecular hydrogen bridges between amino and carbonyl groups occupy positions that would otherwise be the most reactive ones for nucleophilic and electrophilic attacks. In the dimer, amino and carbonyl groups on different monomers form intermolecular multicentred non-linear hydrogen bonds in six-member rings, again protecting the same reactive centres and explaining the limited solubility of indigo. As a first step, the addition of the free radical OH breaks the central C=C double bond, the conjugation and the hydrogen bridges. The Gibbs energy calculation favours the addition of the OH radical over C1.

10.
NMR residual dipolar couplings (RDCs), in the form of the projection angles between the respective internuclear bond vectors, are used as structural restraints in the ab initio structure prediction of a test set of six proteins. The restraints are applied using a recently developed SICHO (SIde-CHain-Only) lattice protein model that employs a replica exchange Monte Carlo (MC) algorithm to search conformational space. Using a small number of RDC restraints, the quality of the predicted structures is improved as reflected by lower RMSD/dRMSD (root mean square deviation/distance root mean square deviation) values from the corresponding native structures and by the higher correlation of the most cooperative mode of motion of each predicted structure with that of the native structure. The latter, in particular, has possible implications for the structure-based functional analysis of predicted structures.
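For readers unfamiliar with the dRMSD measure named in this abstract, the sketch below computes it under the usual definition: the root mean square difference between corresponding intra-structure pairwise distances. The coordinates and the function name are illustrative assumptions, not data or code from the paper. Unlike RMSD, this measure needs no prior superposition of the two structures.

```python
import math
from itertools import combinations


def drmsd(coords_a, coords_b):
    """Distance RMSD between two equally sized lists of 3D points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of points")
    pairs = list(combinations(range(len(coords_a)), 2))
    if not pairs:
        raise ValueError("at least two points are required")
    total = 0.0
    for i, j in pairs:
        d_a = math.dist(coords_a[i], coords_a[j])  # distance within structure A
        d_b = math.dist(coords_b[i], coords_b[j])  # same pair within structure B
        total += (d_a - d_b) ** 2
    return math.sqrt(total / len(pairs))


# Toy example: a translated copy has identical internal distances.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
shifted = [(x + 5.0, y, z) for x, y, z in native]
print(drmsd(native, shifted))
```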

11.
The number of natural proteins, although large, is significantly smaller than the theoretical number of proteins that can be obtained by combining the 20 natural amino acids; the sequences that do not occur in nature are the so-called “never born proteins” (NBPs). The study of the structure and properties of these proteins makes it possible to investigate why natural proteins might have unique characteristics or special properties. However, the structural study of NBPs can also serve as an ideal test for evaluating the efficiency of software packages for ab initio protein structure prediction. In this research, 10,000 three-dimensional structures of proteins of completely random sequence, generated according to the ROSETTA and FOD models, were compared. The results show the limits of these software packages, but at the same time indicate that in many cases there is significant agreement between the predictions obtained.

12.
13.
Ab initio protein structure prediction methods have improved dramatically in the past several years. Because these methods require only the sequence of the protein of interest, they are potentially applicable to the open reading frames in the many organisms whose sequences have been and will be determined. Ab initio methods cannot currently produce models of high enough resolution for use in rational drug design, but there is an exciting potential for using the methods for functional annotation of protein sequences on a genomic scale. Here we illustrate how functional insights can be obtained from low-resolution predicted structures using examples from blind ab initio structure predictions from the third and fourth critical assessment of structure prediction (CASP3, CASP4) experiments.

14.
The routine prediction of three-dimensional protein structure from sequence remains a challenge in computational biochemistry. It has been intuited that calculated energies from physics-based scoring functions are able to distinguish native from nonnative folds based on previous performance with small proteins and that conformational sampling is the fundamental bottleneck to successful folding. We demonstrate that as protein size increases, errors in the computed energies become a significant problem. We show, by using error probability density functions, that physics-based scores contain significant systematic and random errors relative to accurate reference energies. These errors propagate throughout an entire protein and distort its energy landscape to such an extent that modern scoring functions should have little chance of success in finding the free energy minima of large proteins. Nonetheless, by understanding errors in physics-based score functions, they can be reduced in a post-hoc manner, improving accuracy in energy computation and fold discrimination.

15.
Dong Xu  Yang Zhang 《Proteins》2013,81(2):229-239
Fragment assembly using structural motifs excised from other solved proteins has been shown to be an efficient method for ab initio protein‐structure prediction. However, how to construct accurate fragments, how to derive optimal restraints from fragments, and what the best fragment length is are basic issues yet to be systematically examined. In this work, we developed a gapless‐threading method to generate position‐specific structure fragments. Distance profiles and torsion angle pairs are then derived from the fragments by statistical consistency analysis, which achieved accuracy comparable to that of machine‐learning‐based methods although the fragments were taken from unrelated proteins. When measured by the accuracies of both the derived distance profiles and the torsion angle pairs, we come to a consistent conclusion that the optimal fragment length for structural assembly is around 10, and at least 100 fragments at each location are needed to achieve optimal structure assembly. The distance profiles and torsion angle pairs derived from the fragments have been successfully used in QUARK for ab initio protein structure assembly and are provided by the QUARK online server at http://zhanglab.ccmb.med.umich.edu/QUARK/. Proteins 2013. © 2012 Wiley Periodicals, Inc.

16.

Background  

Disordered regions are segments of the protein chain which do not adopt stable structures. Such segments are often of interest because they have a close relationship with protein expression and functionality. As such, protein disorder prediction is important for protein structure prediction, structure determination and function annotation.

17.
De Santis L  Carloni P 《Proteins》1999,37(4):611-618
In serine proteases (SPs), the H-bond between His57 and Asp102 and that between Gly193 and the transition state intermediate play a crucial role in enzymatic function. To shed light on the nature of these interactions, we have carried out ab initio molecular dynamics simulations on complexes representing adducts between the reaction intermediate and elastase (one protein belonging to the SP family). Our calculations indicate the presence of a low-barrier H-bond between His57 and Asp102, in complete agreement with NMR experiments on enzyme-transition state analogue complexes. Comparison with an ab initio molecular dynamics simulation on a model of the substrate-enzyme adduct indicates that the Gly193-induced strong stabilization of the intermediate is accomplished by charge/dipole interactions and not by H-bonding as previously suggested. Inclusion of the protein electric field in the calculations does not affect significantly the charge distribution.

18.
Five ab initio programs (FGENESH, GeneMark.hmm, GENSCAN, GlimmerR and Grail) were evaluated for their accuracy in predicting maize genes. Two of these programs, GeneMark.hmm and GENSCAN, had been trained for maize; FGENESH had been trained for monocots (including maize), and the others had been trained for rice or Arabidopsis. Initial evaluations were conducted using eight maize genes (gl8a, pdc2, pdc3, rf2c, rf2d, rf2e1, rth1, and rth3) whose sequences had not been released to the public prior to this evaluation. The significant advantage of this data set is that these genes could not have been included in the training sets of the prediction programs. FGENESH yielded the most accurate and GeneMark.hmm the second most accurate predictions. The five programs were used in conjunction with RT-PCR to identify and establish the structures of two new genes in the a1-sh2 interval of the maize genome. FGENESH, GeneMark.hmm and GENSCAN were tested on a larger data set consisting of maize assembled genomic islands (MAGIs) that had been aligned to ESTs. FGENESH, GeneMark.hmm and GENSCAN correctly predicted gene models in 773, 625, and 371 MAGIs, respectively, out of the 1353 MAGIs that comprise data set 2.

19.
20.
A constant-pressure ab initio MD technique and density functional theory with a generalized gradient approximation (GGA) were used to study the pressure-induced phase transition in zinc-blende CdTe. We found that CdTe undergoes a first-order structural phase transition to the $I\bar{4}m2$ (binary β-tin) tetragonal structure in the constant-pressure molecular dynamics simulation at 20 GPa. When the pressure was increased to 50 GPa, the tetragonal phase converted to a new Imm2 orthorhombic structure. These phase transformations were also examined using enthalpy calculations. The transition phases, lattice parameters and bulk properties we obtained are comparable with experimental and theoretical data.
