19 similar articles found.
1.
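The sequence of a protein determines its structure, and its structure determines its function. A new generation of accurate protein structure prediction tools has brought new opportunities and challenges to structural biology, structural bioinformatics, drug discovery, the life sciences, and many other fields, and the accuracy of single-chain protein structure prediction has reached a level comparable to that of experimental methods. This review outlines the theoretical foundations, history, and latest advances of protein structure prediction, discusses how the large number of predicted protein structures and artificial-intelligence-based methods are affecting experimental structural biology, and finally analyzes the problems that remain unsolved in the field and directions for future research.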
2.
3.
4.
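Based on 16 classification models of amino acids, derived sequences are generated from each protein sequence; weighted pseudo-entropy and LZ complexity are then combined to construct a 34-dimensional feature vector that represents the protein sequence. Using a Bayesian classifier, protein structural class prediction on the 640 dataset, in which homology does not exceed 25%, reaches an accuracy of 71.28%.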
5.
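Protein secondary structure prediction is an important part of protein structure research. As large numbers of new prediction methods are proposed, new protein secondary structure prediction servers also keep appearing. This study selected seven currently popular protein secondary structure prediction servers (PSRSM, SPOT-1D, MUFOLD, Spider3, RaptorX, Psipred, and Jpred4), introduced how to use them, and evaluated their prediction performance. A test set of 180 proteins released by the PDB between August and November 2018 was selected at random. Seven evaluation measures were used: Q3, Sov, boundary recognition rate, internal recognition rate, turn/coil (C) recognition rate, strand (E) recognition rate, and helix (H) recognition rate. The Q3 results of these servers on the 180 test proteins were 89.96%, 88.18%, 86.74%, 85.77%, 83.61%, 79.72%, and 78.29%, respectively, showing that PSRSM gave the best predictions. When the 180-protein test set was grouped by homology at 30%, 40%, and 70%, the Q3 results of PSRSM were 89.49%, 90.53%, and 89.87%, respectively, all better than those of the other servers. The results suggest that protein secondary structure prediction can be studied further by combining multiple deep learning methods and by training models on large datasets.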
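The Q3 measure used in the abstract above is the percentage of residues whose three-state label (H, E, or C) is predicted correctly. A minimal sketch follows, assuming the predicted and observed structures are given as equal-length strings; the function name `q3_accuracy` is hypothetical and is not taken from any of the servers listed.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the 3-state labels agree.

    Both arguments are strings over the alphabet H (helix), E (strand),
    and C (coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count residue positions with identical labels.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)
```

For example, a prediction that agrees with the observed labels at 7 of 8 residues scores a Q3 of 87.5%.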
6.
7.
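Advances in protein structure prediction  Cited by: 1 (self-citations: 0; citations by others: 1)
Protein structure prediction is one of the main current challenges in bioinformatics. According to how heavily they depend on information in the PDB database, prediction approaches can be divided into two classes: template-dependent modeling and ab initio prediction. Template-dependent modeling can in turn be divided into homology modeling and threading. This paper introduces the main steps of each prediction method and analyzes the bottlenecks that constrain each method, together with research progress. Homology modeling achieves relatively high structural accuracy but depends strongly on templates; threading for low-homology cases is an important research direction within template-dependent modeling; in ab initio prediction, the combined use of statistical and physical functions has achieved good results, but fragments of more than 150 residues remain a great challenge.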
8.
9.
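Current status and prospects of protein structure prediction  Cited by: 6 (self-citations: 0; citations by others: 6)
A protein molecule is a linear polypeptide chain formed from 20 different amino acids linked by covalent bonds. In aqueous solution, however, a native globular protein is not a loose peptide chain following a random course; every protein has its own specific spatial structure under native conditions. The flow of genetic information from DNA to RNA to protein is central to molecular biology research and is commonly called...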
10.
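Application of genetic algorithms in protein structure prediction  Cited by: 2 (self-citations: 0; citations by others: 2)
The genetic algorithm (GA) is an adaptive, heuristic, probabilistic, iterative global search algorithm. It is independent of the problem model, globally optimizing, implicitly parallel, efficient, and robust across different nonlinear problems, and it has been widely applied in design fields such as automatic control, robotics, computer science, pattern recognition, fuzzy artificial neural networks, and engineering optimization. This paper first introduces the basic principle of GA, that is, the basic search process; it then summarizes the advantages of GA over traditional algorithms; the third part reviews the models, design, and execution strategies mainly used by GA in protein structure prediction, as well as research progress on combining GA with other algorithms to predict protein structure; finally, it presents the authors' views on existing problems in GA research and the outlook for future work.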
11.
Tsang HH Wiese KC 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2010,7(4):727-740
Ribonucleic acid (RNA), a single-stranded linear molecule, is essential to all biological systems. Different regions of the same RNA strand will fold together via base pair interactions to make intricate secondary and tertiary structures that guide crucial homeostatic processes in living organisms. Since the structure of RNA molecules is the key to their function, algorithms for the prediction of RNA structure are of great value. In this article, we demonstrate the usefulness of SARNA-Predict, an RNA secondary structure prediction algorithm based on Simulated Annealing (SA). A performance evaluation of SARNA-Predict in terms of prediction accuracy is made via comparison with eight state-of-the-art RNA prediction algorithms: mfold, Pseudoknot (pknotsRE), NUPACK, pknotsRG-mfe, Sfold, HotKnots, ILM, and STAR. These algorithms are from three different classes: heuristic, dynamic programming, and statistical sampling techniques. An evaluation for the performance of SARNA-Predict in terms of prediction accuracy was verified with native structures. Experiments on 33 individual known structures from eleven RNA classes (tRNA, viral RNA, antigenomic HDV, telomerase RNA, tmRNA, rRNA, RNaseP, 5S rRNA, Group I intron 23S rRNA, Group I intron 16S rRNA, and 16S rRNA) were performed. The results presented in this paper demonstrate that SARNA-Predict can out-perform other state-of-the-art algorithms in terms of prediction accuracy. Furthermore, there is substantial improvement of prediction accuracy by incorporating a more sophisticated thermodynamic model (efn2).
12.
MOTIVATION: As more non-coding RNAs are discovered, the importance of methods for RNA analysis increases. Since the structure of ncRNA is intimately tied to the function of the molecule, programs for RNA structure prediction are necessary tools in this growing field of research. Furthermore, it is known that RNA structure is often evolutionarily more conserved than sequence. However, few existing methods are capable of simultaneously considering multiple sequence alignment and structure prediction. RESULT: We present a novel solution to the problem of simultaneous structure prediction and multiple alignment of RNA sequences. Using Markov chain Monte Carlo in a simulated annealing framework, the algorithm MASTR (Multiple Alignment of STructural RNAs) iteratively improves both sequence alignment and structure prediction for a set of RNA sequences. This is done by minimizing a combined cost function that considers sequence conservation, covariation and basepairing probabilities. The results show that the method is very competitive to similar programs available today, both in terms of accuracy and computational efficiency. AVAILABILITY: Source code available from http://mastr.binf.ku.dk/
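Several of the abstracts in this list minimize a cost function by simulated annealing. The sketch below shows the generic technique with a Metropolis acceptance rule and geometric cooling. It is not the MASTR or SARNA-Predict implementation: the function names, the cooling schedule, and all default parameter values are assumptions chosen for illustration, and the cost function and move generator are supplied by the caller.

```python
import math
import random


def acceptance_probability(delta_cost: float, temperature: float) -> float:
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    if delta_cost <= 0:
        return 1.0
    return math.exp(-delta_cost / temperature)


def simulated_annealing(cost, propose, state, t_start=1.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=50, seed=0):
    """Minimize cost(state) and return (best_state, best_cost).

    cost(state) returns a float; propose(state, rng) returns a neighbouring state.
    The geometric cooling schedule and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    current, current_cost = state, cost(state)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(steps_per_temp):
            candidate = propose(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            if rng.random() < acceptance_probability(delta, temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= cooling  # lower the temperature geometrically
    return best, best_cost
```

At high temperature the search accepts many uphill moves and explores broadly; as the temperature falls, it behaves increasingly like a greedy descent. In an alignment-plus-structure setting, `state` would hold the alignment and base pairs, and `cost` would combine the terms named in the abstract.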
13.
V. Collura J. Higo J. Garnier 《Protein science : a publication of the Protein Society》1993,2(9):1502-1510
A method is presented to model loops of protein to be used in homology modeling of proteins. This method employs the ESAP program of Higo et al. (Higo, J., Collura, V., & Garnier, J., 1992, Biopolymers 32, 33-43) and is based on a fast Monte Carlo simulation and a simulated annealing algorithm. The method is tested on different loops or peptide segments from immunoglobulin, bovine pancreatic trypsin inhibitor, and bovine trypsin. The predicted structure is obtained from the ensemble average of the coordinates of the Monte Carlo simulation at 300 K, which exhibits the lowest internal energy. The starting conformation of the loop prior to modeling is chosen to be completely extended, and a closing harmonic potential is applied to N, CA, C, and O atoms of the terminal residues. A rigid geometry potential of Robson and Platt (1986, J. Mol. Biol. 188, 259-281) with a united atom representation is used. This we demonstrate to yield a loop structure with good hydrogen bonding and torsion angles in the allowed regions of the Ramachandran map. The average accuracy of the modeling evaluated on the eight modeled loops is 1 Å root mean square deviation (rmsd) for the backbone atoms and 2.3 Å rmsd for all heavy atoms.
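The root mean square deviation (rmsd) quoted above measures the average distance between corresponding atoms in two structures. A minimal sketch follows, assuming both structures are already superposed in the same coordinate frame and list the same atoms in the same order; the abstract does not state how superposition was done, and the function name `rmsd` is chosen here only for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z).

    Assumes the two structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    # Sum squared Euclidean distances between matched atoms.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Restricting the input to backbone atoms or to all heavy atoms gives the two kinds of figure reported in the abstract.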
14.
Single-chain monellin (SCM), which is an engineered 94-residue polypeptide, has proven to be as sweet as native two-chain monellin. SCM is more stable than the native monellin for both heat and acidic environments. Data from gel filtration HPLC and NMR indicate that the SCM exists as a monomer in aqueous solution. The solution structure of SCM has been determined by nuclear magnetic resonance (NMR) spectroscopy and dynamical simulated annealing calculations. A stable alpha-helix spanning residues Phe11-Ile26 and an antiparallel beta-sheet formed by residues 2-5, 36-38, 41-47, 54-64, 69-75, and 83-88 have been identified. The sheet was well defined by backbone-backbone NOEs, and the corresponding beta-strands were further confirmed by hydrogen bond networks based on amide hydrogen exchange data. Strands beta2 and beta3 are connected by a small bulge comprising residues Ile38-Cys41. A total of 993 distance and 56 dihedral angle restraints were used for simulated annealing calculations. The final simulated annealing structures (k) converged well with a root-mean-square deviation (rmsd) between backbone atoms of 0.49 Å for secondary structural regions and 0.70 Å for backbone atoms excluding two loop regions. The average restraint energy-minimized (REM) structure exhibited root-mean-square deviations of 1.19 Å for backbone atoms and 0.85 Å for backbone atoms excluding two loop regions with respect to 20 k structures. The solution structure of SCM revealed that the long alpha-helix was folded into the concave side of a six-stranded antiparallel beta-sheet. The side chains of Tyr63 and Asp66 which are common to all sweet peptides showed an opposite orientation relative to H1 helix, and they were all solvent-exposed. Residues at the proposed dimeric interface in the X-ray structure were observed to be mostly solvent-exposed and demonstrated high degrees of flexibility.
15.
De novo protein structure prediction by dynamic fragment assembly and conformational space annealing
Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested novel way of fragment assembly, dynamic fragment assembly (DFA) and conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model by CHARMM with the addition of the empirical potential of DFIRE. The relative contributions between various energy terms are optimized using linear programming. The conformational sampling was carried out with CSA algorithm, which can find low energy conformations more efficiently than simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented into CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods.
16.
Evaluation and improvement of multiple sequence methods for protein secondary structure prediction  Cited by: 38 (self-citations: 0; citations by others: 38)
A new dataset of 396 protein domains is developed and used to evaluate the performance of the protein secondary structure prediction algorithms DSC, PHD, NNSSP, and PREDATOR. The maximum theoretical Q3 accuracy for combination of these methods is shown to be 78%. A simple consensus prediction on the 396 domains, with automatically generated multiple sequence alignments gives an average Q3 prediction accuracy of 72.9%. This is a 1% improvement over PHD, which was the best single method evaluated. Segment Overlap Accuracy (SOV) is 75.4% for the consensus method on the 396-protein set. The secondary structure definition method DSSP defines 8 states, but these are reduced by most authors to 3 for prediction. Application of the different published 8- to 3-state reduction methods shows variation of over 3% on apparent prediction accuracy. This suggests that care should be taken to compare methods by the same reduction method. Two new sequence datasets (CB513 and CB251) are derived which are suitable for cross-validation of secondary structure prediction methods without artifacts due to internal homology. A fully automatic World Wide Web service that predicts protein secondary structure by a combination of methods is available via http://barton.ebi.ac.uk/.
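The point about 8- to 3-state reduction can be made concrete with a small sketch. The two mappings below are commonly used conventions and are shown only as illustrations; they are not claimed to be the specific reductions compared in the abstract. The names `REDUCTION_BROAD`, `REDUCTION_STRICT`, and `reduce_states` are hypothetical.

```python
# DSSP assigns H, G, I (helices), E, B (strand/bridge), T, S (turn/bend),
# and a blank or "-" for everything else.

# Broad convention: all helix types -> H, strand and bridge -> E, rest -> C.
REDUCTION_BROAD = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}

# Strict convention: only alpha-helix -> H and extended strand -> E, rest -> C.
REDUCTION_STRICT = {"H": "H", "E": "E"}


def reduce_states(dssp: str, mapping: dict) -> str:
    """Map an 8-state DSSP string to 3 states; unmapped symbols become coil (C)."""
    return "".join(mapping.get(symbol, "C") for symbol in dssp)
```

Applied to the same DSSP string, the two mappings can label residues such as G and B differently, so the "observed" three-state string changes and a Q3 score computed against it changes too. This is why methods should be compared under a single reduction.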
17.
A theoretical and computational approach to ab initio structure prediction for polypeptides in water is described and applied to selected amino acid sequences for testing and preliminary validation. The method builds systematically on the extensive efforts applied to parameterization of molecular dynamics (MD) force fields, employs an empirically well-validated continuum dielectric model for solvation, and an eminently parallelizable approach to conformational search. The effective free energy of polypeptide chains is estimated from AMBER united atom potential functions, with internal degrees of freedom for both backbone and amino acid side chains explicitly treated. The hydration free energy of each structure is determined using the Generalized Born/Solvent Accessibility (GBSA) method, modified and reparameterized to include atom types consistent with the AMBER force field. The conformational search procedure employs a multiple copy, Monte Carlo simulated annealing (MCSA) protocol in full torsion angle space, applied iteratively on sets of structures of progressively lower free energy until a prediction of a structure with lowest effective free energy is obtained. Calibration tests for the effective energy function and search algorithm are performed on the alanine dipeptide, selected protein crystal structures, and united atom decoys on barnase, crambin, and six examples from the Rosetta set. Specific demonstration cases of the method are provided for the 8-mer sequence of Ala residues, a 12-residue peptide with longer side chains QLLKKLLQQLKQ, a de novo designed 16 residue peptide of sequence (AAQAA)3Y, a 15-residue sequence with a beta sheet motif, GEWTWDATKTFTVTE, and a 36 residue small protein, Villin headpiece. The Ala 8-mer readily formed an alpha-helix. An alpha-helix structure was predicted for the 16-mer, consistent with observed results from IR and CD spectroscopy and with the pattern in psi/phi angles of known protein structures. The predicted structure for the 12-mer, composed of a mix of helix and less regular elements of secondary structure, lies 2.65 Å RMS from the observed crystal structure. Structure prediction for the 8-mer beta-motif resulted in form 4.50 Å RMS from the crystal geometry. For Villin, the predicted native form is very close to the crystal structure, RMS values of 3.5 Å (including sidechains), and 1.01 Å (main chain only). The methodology permits a detailed analysis of the molecular forces which dominate various segments of the predicted folding trajectory. Analysis of the results in terms of internal torsional, electrostatic and van der Waals and the electrostatic and non-electrostatic contributions to hydration, including the hydrophobic effect, is presented.
18.
Decoys 'R' Us: a database of incorrect conformations to improve protein structure prediction
The development of an energy or scoring function for protein structure prediction is greatly enhanced by testing the function on a set of computer-generated conformations (decoys) to determine whether it can readily distinguish native-like conformations from nonnative ones. We have created "Decoys 'R' Us," a database containing many such sets of conformations, to provide a resource that allows scoring functions to be improved.
19.
Murzin AG 《Nature structural biology》2001,8(2):110-112