Found 20 similar articles; search took 0 ms
1.
Bujnicki JM, Elofsson A, Fischer D, Rychlewski L 《Protein science : a publication of the Protein Society》2001,10(2):352-361
We present a novel, continuous approach aimed at the large-scale assessment of the performance of available fold-recognition servers. Six popular servers were investigated: PDB-Blast, FFAS, T98-lib, GenTHREADER, 3D-PSSM, and INBGU. The assessment was conducted using as prediction targets a large number of selected protein structures released from October 1999 to April 2000. A target was selected if its sequence showed no significant similarity to any of the proteins previously available in the structural database. Overall, the servers were able to produce structurally similar models for one-half of the targets, but significantly accurate sequence-structure alignments were produced for only one-third of the targets. We further classified the targets into two sets: easy and hard. We found that all servers were able to find the correct answer for the vast majority of the easy targets if a structurally similar fold was present in the server's fold libraries. However, among the hard targets (where standard methods such as PSI-BLAST fail), the most sensitive fold-recognition servers were able to produce similar models for only 40% of the cases, half of which had a significantly accurate sequence-structure alignment. Among the hard targets, the presence of updated libraries appeared to be less critical for the ranking. An "ideally combined consensus" prediction, where the results of all servers are considered, would increase the percentage of correct assignments by 50%. Each server had a number of cases with a correct assignment, where the assignments of all the other servers were wrong. This emphasizes the benefits of considering more than one server in difficult prediction tasks. The LiveBench program (http://BioInfo.PL/LiveBench) is being continued, and all interested developers are cordially invited to join.
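The "ideally combined consensus" figure in the abstract above can be illustrated with a minimal sketch. It assumes an oracle that counts a target as solved whenever at least one server is correct, and it measures the gain relative to the best single server; the abstract does not state which baseline was used, so that choice is an assumption. The function name, server names, and target identifiers are hypothetical and not taken from LiveBench.

```python
def oracle_consensus_gain(correct_by_server):
    """Compare the best single server with an ideal (oracle) consensus.

    correct_by_server maps a server name to the set of targets it
    predicted correctly. Returns (best_single, oracle, relative_gain).
    """
    if not correct_by_server:
        raise ValueError("at least one server is required")
    # Best single server: the largest number of correct targets.
    best_single = max(len(targets) for targets in correct_by_server.values())
    # Oracle consensus: a target counts if any server got it right.
    oracle = len(set().union(*correct_by_server.values()))
    if best_single == 0:
        raise ValueError("no server produced a correct prediction")
    relative_gain = (oracle - best_single) / best_single
    return best_single, oracle, relative_gain


# Hypothetical data for illustration only.
example = {
    "server_a": {"t1", "t2", "t3", "t4"},
    "server_b": {"t1", "t2", "t5"},
    "server_c": {"t2", "t6"},
}
best, oracle, gain = oracle_consensus_gain(example)
print(f"best single: {best}, oracle: {oracle}, gain: {gain:.0%}")
```

In this toy data each server solves at least one target that the others miss, which is the situation the abstract describes as the reason to consult more than one server.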
2.
Fischer D 《Proteins》2003,51(3):434-441
To gain a better understanding of the biological role of proteins encoded in genome sequences, knowledge of their three-dimensional (3D) structure and function is required. The computational assignment of folds is becoming an increasingly important complement to experimental structure determination. In particular, fold-recognition methods aim to predict approximate 3D models for proteins bearing no sequence similarity to any protein of known structure. However, fully automated structure-prediction methods can currently produce reliable models for only a fraction of these sequences. Using a number of semiautomated procedures, human expert predictors are often able to produce more and better predictions than automated methods. We describe a novel, fully automatic, fold-recognition meta-predictor, named 3D-SHOTGUN, which incorporates some of the strategies human predictors have successfully applied. This new method is reminiscent of the so-called cooperative algorithms of Computer Vision. The inputs to 3D-SHOTGUN are the top models predicted by a number of independent fold-recognition servers. The meta-predictor consists of three steps: (i) assembly of hybrid models, (ii) confidence assignment, and (iii) selection. We have applied 3D-SHOTGUN to an unbiased test set of 77 newly released protein structures sharing no sequence similarity to proteins previously released. Forty-six correct rank-1 predictions were obtained, 30 of which had scores higher than that of the first incorrect prediction, a significant improvement over the performance of all individual servers. Furthermore, the predicted hybrid models were, on average, more similar to their corresponding native structures than those produced by the individual servers. This opens the possibility of generating more accurate, full-atom homology models for proteins with no sequence similarity to proteins of known structure. These improvements represent a step forward toward the wider applicability of fully automated structure-prediction methods at genome scales.
3.
Lundström J, Rychlewski L, Bujnicki J, Elofsson A 《Protein science : a publication of the Protein Society》2001,10(11):2354-2362
During recent years many protein fold recognition methods have been developed, based on different algorithms and using various kinds of information. To examine the performance of these methods several evaluation experiments have been conducted. These include blind tests in CASP/CAFASP, large scale benchmarks, and long-term, continuous assessment with newly solved protein structures. These studies confirm the expectation that for different targets different methods produce the best predictions, and the final prediction accuracy could be improved if the available methods were combined in a perfect manner. In this article a neural-network-based consensus predictor, Pcons, is presented that attempts this task. Pcons attempts to select the best model out of those produced by six prediction servers, each using different methods. Pcons translates the confidence scores reported by each server into uniformly scaled values corresponding to the expected accuracy of each model. The translated scores as well as the similarity between models produced by different servers are used in the final selection. According to the analysis based on two unrelated sets of newly solved proteins, Pcons outperforms any single server by generating approximately 8%-10% more correct predictions. Furthermore, the specificity of Pcons is significantly higher than for any individual server. From analyzing different input data to Pcons it can be shown that the improvement is mainly attributable to measurement of the similarity between the different models. Pcons is freely accessible for the academic community through the protein structure-prediction metaserver at http://bioinfo.pl/meta/.
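The selection idea in the abstract above (rescaled server scores combined with the similarity between models from different servers) can be sketched as follows. This is not the Pcons implementation: Pcons uses a neural network, whereas this sketch swaps in a plain weighted average to keep the example short. The function name, the `weight` parameter, the toy similarity function, and the example data are all hypothetical.

```python
def select_consensus_model(models, similarity, weight=0.5):
    """Pick the model with the best blend of own score and cross-server support.

    models: list of dicts with keys "server", "fold", and "score", where
            "score" is assumed to be already rescaled to the range 0..1.
    similarity: function (model_a, model_b) -> float in 0..1.
    weight: share of the combined score given to cross-server support
            (an assumed linear blend, not a trained network).
    """
    best_model, best_value = None, float("-inf")
    for model in models:
        # Only models from other servers count as independent support.
        others = [m for m in models if m["server"] != model["server"]]
        support = (
            sum(similarity(model, m) for m in others) / len(others)
            if others else 0.0
        )
        combined = (1 - weight) * model["score"] + weight * support
        if combined > best_value:
            best_model, best_value = model, combined
    return best_model


def same_fold(a, b):
    """Toy similarity: 1.0 if both models propose the same fold, else 0.0."""
    return 1.0 if a["fold"] == b["fold"] else 0.0


# Hypothetical models for illustration only.
models = [
    {"server": "A", "fold": "x", "score": 0.6},
    {"server": "B", "fold": "y", "score": 0.7},
    {"server": "C", "fold": "x", "score": 0.5},
]
print(select_consensus_model(models, same_fold)["server"])
```

In this toy case the model from server B has the highest individual score, but the model from server A is selected because a second server independently proposes the same fold, which mirrors the abstract's finding that the improvement comes mainly from the similarity between models.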
4.
Dietlind L. Gerloff, Fred E. Cohen, Chantal Korostensky, Marcel Turcotte, Gaston H. Gonnet, Steven A. Benner 《Proteins》1997,27(3):450-458
A secondary structure has been predicted for the heat shock protein HSP90 family from an aligned set of homologous protein sequences by using a transparent method, in both manual and automated implementations, that extracts conformational information from patterns of variation and conservation within the family. No statistically significant sequence similarity relates this family to any protein with known crystal structure. However, the secondary structure prediction, together with the assignment of active site positions and possible biochemical properties, suggests that the fold is similar to that seen in the N-terminal domain of DNA gyrase B (the ATPase fragment). Proteins 27:450–458, 1997. © 1997 Wiley-Liss, Inc.
5.
6.
Meruelo AD, Samish I, Bowie JU 《Protein science : a publication of the Protein Society》2011,20(7):1256-1264
A hallmark of membrane protein structure is the large number of distorted transmembrane helices. Because of the prevalence of bends, it is important to not only understand how they are generated but also to learn how to predict their occurrence. Here, we find that there are local sequence preferences in kinked helices, most notably a higher abundance of proline, which can be exploited to identify bends from local sequence information. A neural network predictor identifies over two-thirds of all bends (sensitivity 0.70) with high reliability (specificity 0.89). It is likely that more structural data will allow for better helix distortion predictors with increased coverage in the future. The kink predictor, TMKink, is available at http://tmkinkpredictor.mbi.ucla.edu/.
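The two figures quoted in the abstract above can be related to a confusion matrix with a short sketch. It assumes the common definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the abstract does not spell out its definitions, and some structure-prediction papers use "specificity" for TP / (TP + FP) instead. The counts below are hypothetical and were chosen only so that the output reproduces 0.70 and 0.89.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of real kinks that are found.
    specificity = TN / (TN + FP): fraction of non-kinks correctly rejected.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes need at least one example")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, not taken from the TMKink paper.
sens, spec = sensitivity_specificity(tp=70, fn=30, tn=89, fp=11)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```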
7.
In recent years, the protein-folding problem has attracted the attention of molecular biologists. Efforts have focused on developing heuristic and energy-based algorithms to predict the three-dimensional structure of a protein from its amino acid sequence. We have applied a series of heuristic algorithms to the sequence of human growth hormone. A family of five structures, which are generically right-handed fourfold alpha-helical bundles, is found from an investigation of approximately 10^8 structures. A plausible receptor binding site is suggested. Independent crystallographic analysis confirms some aspects of these predictions. These methods only deal with the "core" structure, and conformations of many residues are not defined. Further work is required to identify a unique set of coordinates and to clarify the topological alternatives available to alpha-helical proteins.
8.
The protein universe can be organized in families that group proteins sharing common ancestry. Such families display variable levels of structural and functional divergence, from homogeneous families, where all members have the same function and very similar structure, to very divergent families, where large variations in function and structure are observed. For practical purposes of structure and function prediction, it would be beneficial to identify sub-groups of proteins with highly similar structures (iso-structural) and/or functions (iso-functional) within divergent protein families. We compared three algorithms in their ability to cluster large protein families and discuss whether any of these methods could reliably identify such iso-structural or iso-functional groups. We show that clustering using profile-sequence and profile-profile comparison methods closely reproduces clusters based on similarities between 3D structures or clusters of proteins with similar biological functions. In contrast, the still commonly used sequence-based methods with fixed thresholds result in vast overestimates of structural and functional diversity in protein families. As a result, these methods also overestimate the number of protein structures that have to be determined to fully characterize the structural space of such families. The fact that one can build reliable models based on apparently distantly related templates is crucial for extracting the maximal amount of information from new sequencing projects.
9.
Arguably, 2020 was the year of high-accuracy protein structure predictions, with AlphaFold 2.0 achieving previously unseen accuracy in the Critical Assessment of Protein Structure Prediction (CASP). In 2021, DeepMind and EMBL-EBI developed the AlphaFold Protein Structure Database to make an unprecedented number of reliable protein structure predictions easily accessible to the broad scientific community. We provide a brief overview and describe the latest developments in the AlphaFold database. We highlight how the fields of data services, bioinformatics, structural biology, and drug discovery are directly affected by the influx of protein structure data. We also show examples of cutting-edge research that took advantage of the AlphaFold database. It is apparent that connections between various fields through protein structures are now possible, but the amount of data poses new challenges. Finally, we give an outlook regarding the future direction of the database, both in terms of data sets and new functionalities.
10.
Substantial progress in protein structure prediction has been made by utilizing deep learning and residue-residue distance prediction since CASP13. Inspired by these advances, we improve our CASP14 MULTICOM protein structure prediction system by incorporating three new components: (a) a new deep learning-based protein inter-residue distance predictor to improve template-free (ab initio) tertiary structure prediction, (b) an enhanced template-based tertiary structure prediction method, and (c) distance-based model quality assessment methods empowered by deep learning. In the 2020 CASP14 experiment, the MULTICOM predictor was ranked seventh out of 146 predictors in tertiary structure prediction and third out of 136 predictors in inter-domain structure prediction. The results demonstrate that template-free modeling based on deep learning and residue-residue distance prediction can predict the correct topology for almost all template-based modeling targets and a majority of hard targets (template-free targets or targets whose templates cannot be recognized), which is a significant improvement over the CASP13 MULTICOM predictor. Moreover, template-free modeling performs better than template-based modeling not only on hard targets but also on targets that have homologous templates. The performance of template-free modeling largely depends on the accuracy of distance prediction, which is closely related to the quality of multiple sequence alignments. The structural model quality assessment works well on targets for which enough good models can be predicted, but it may perform poorly when only a few good models are predicted for a hard target and the distribution of model quality scores is highly skewed. MULTICOM is available at https://github.com/jianlin-cheng/MULTICOM_Human_CASP14/tree/CASP14_DeepRank3 and https://github.com/multicom-toolbox/multicom/tree/multicom_v2.0 .
11.
12.
13.
We present heuristic-based predictions of the secondary and tertiary structures of the cyclins A, B, and D, representatives of the cyclin superfamily. The list of suggested constraints for tertiary structure assembly was left unrefined in order to submit this report before an announced crystal structure for cyclin A becomes available. To predict these constraints, a master sequence alignment over 270 positions of cyclin types A, B, and D was adjusted based on individual secondary structure predictions for each type. We used new heuristics for predicting aromatic residues at protein-protein interfaces and for identifying sequentially distinct regions in the protein chain that cluster in the folded structure. The boundaries of two conjectured domains in the cyclin fold were predicted based on experimental data in the literature. The domain that is important for interaction of the cyclins with cyclin-dependent kinases (CDKs) is predicted to contain six helices; the second domain in the consensus model contains both helices and a β-sheet that is formed by sequentially distant regions in the protein chain. A plausible phosphorylation site is identified. This work represents a blinded test of the method for prediction of secondary and, to a lesser extent, tertiary structure from a set of homologous protein sequences. Evaluation of our predictions will become possible with the publication of the announced crystal structure.
14.
A bona fide consensus prediction for the secondary and supersecondary structure of the serine–threonine specific protein phosphatases is presented. The prediction includes assignments of active site segments, an internal helix, and a region of possible 3₁₀ helical structure. An experimental structure for a member of this family of proteins should appear shortly, allowing this prediction to be evaluated. © 1995 Wiley-Liss, Inc.
15.
16.
17.
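Application of neural networks in protein secondary structure prediction (total citations: 3; self-citations: 0; citations by others: 3)
This article introduces the significance of research on protein secondary structure prediction, discusses neural network design issues for protein secondary structure prediction, reviews in some detail the main progress made in recent years in applying neural network methods to this problem, and looks ahead to the prospects for protein structure prediction.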
18.
The results of a protein structure prediction contest are reviewed. Twelve different groups entered predictions on 14 proteins of known sequence whose structures had been determined but not yet disseminated to the scientific community. Thus, these represent true tests of the current state of structure prediction methodologies. From this work, it is clear that accurate tertiary structure prediction is not yet possible. However, protein fold and motif prediction are possible when the motif is recognizably similar to another known structure. Internal symmetry and the information inherent in an aligned family of homologous sequences facilitate predictive efforts. Novel folds remain a major challenge for prediction efforts. © 1995 Wiley-Liss, Inc.
19.
The substrate specificity of the facilitated hexose transporter (GLUT) family (gene SLC2A) is highly varied. Some members appear to be able to translocate both glucose and fructose, while the ability to handle 2-deoxyglucose and galactose does not necessarily correlate with the other two hexoses. It has become generally accepted that a central substrate binding/translocation site determines which hexoses can be transported. However, a recent study showed that a single point mutation of a hydrophobic residue in GLUTs 2, 5 & 7 removed their ability to transport fructose without affecting the kinetics of glucose permeation. This residue is in the 7th transmembrane helix, faces the aqueous pore, and lies close to the opening of the exofacial vestibule. This study expands these observations to include the other class II GLUTs (9 & 11) and shows that a three amino acid motif (NXI/NXV) appears to be critical in determining if fructose can access the translocation mechanism. GLUT11 can also transport fructose, but it has the motif DSV at the same position, which appears to function in the same manner as NXI, and when all three residues are replaced with NAV, fructose transport is lost. These results are discussed in relation to possible roles for hydrophobic residues lining the aqueous pore at the opening of the exofacial vestibule. Finally, the possibility that the translocation binding site may not be the sole determinant of substrate specificity for these proteins is examined.
20.
The ability to separate correct models of protein structures from less correct models is of the greatest importance for protein structure prediction methods. Several studies have examined the ability of different types of energy function to detect the native, or native-like, protein structure from a large set of decoys. In contrast to earlier studies, we examine here the ability to detect models that only show limited structural similarity to the native structure. These correct models are defined by the existence of a fragment that shows significant similarity between this model and the native structure. It has been shown that the existence of such fragments is useful for comparing the performance between different fold recognition methods and that this performance correlates well with performance in fold recognition. We have developed ProQ, a neural-network-based method to predict the quality of a protein model that extracts structural features, such as frequency of atom-atom contacts, and predicts the quality of a model, as measured either by LGscore or MaxSub. We show that ProQ performs at least as well as other measures when identifying the native structure and is better at the detection of correct models. This performance is maintained over several different test sets. ProQ can also be combined with the Pcons fold recognition predictor (Pmodeller) to increase its performance, with the main advantage being the elimination of a few high-scoring incorrect models. Pmodeller was successful in CASP5, and results from the latest LiveBench, LiveBench-6, indicate that Pmodeller has a higher specificity than Pcons alone.