Found 20 similar articles (search time: 171 ms)
AlphaFold2 is a promising new tool for researchers to predict protein structures and generate high-quality models, with low backbone and global root-mean-square deviation (RMSD) when compared with experimental structures. However, it is unclear whether the structures predicted by AlphaFold2 will be valuable targets for docking. To address this question, we redocked ligands in the PDBbind datasets against the experimental co-crystallized receptor structures and against the AlphaFold2 structures using AutoDock-GPU. We find that the quality measure provided during structure prediction is not a good predictor of docking performance, despite accurately reflecting the quality of the alpha carbon alignment with experimental structures. Removing low-confidence regions of the predicted structure and making side chains flexible improves the docking outcomes. Overall, despite high-quality prediction of backbone conformation, fine structural details limit the naive application of AlphaFold2 models as docking targets.

The protein structure field is experiencing a revolution. It ranges from the increased throughput of techniques to determine experimental structures, to developments such as cryo-EM that allow us to find the structures of large protein complexes, to, more recently, artificial intelligence tools such as AlphaFold that can predict with high accuracy the folding of proteins for which the availability of homology templates is limited. Here we quantify the effect of the recently released AlphaFold database of protein structural models on our knowledge of human proteins. Our results indicate that our current baseline for structural coverage of 48%, considering experimentally derived or template-based homology models, rises to 76% when including AlphaFold predictions. At the same time, the fraction of dark proteome is reduced from 26% to just 10% when AlphaFold models are considered. Furthermore, although the coverage of disease-associated genes and mutations was near complete before the AlphaFold release (69% of ClinVar pathogenic mutations and 88% of oncogenic mutations), AlphaFold models still provide an additional coverage of 3% to 13% of these critically important sets of biomedical genes and mutations. Finally, we show that the contribution of AlphaFold models to the structural coverage of non-human organisms, including important pathogenic bacteria, is significantly larger than that for the human proteome. Overall, our results show that the sequence-structure gap of human proteins has almost disappeared, an outstanding success with direct consequences for our knowledge of the human genome and the derived medical applications.

Significant advances have been achieved in protein structure prediction, especially with the recent development of the AlphaFold2 and RoseTTAFold systems. This article reviews the progress in deep learning-based protein structure prediction methods in the past two years. First, we divide the representative methods into two categories: the two-step approach and the end-to-end approach. Then, we show that the two-step approach can achieve accuracy similar to that of the state-of-the-art end-to-end approach, AlphaFold2. Compared to the end-to-end approach, the two-step approach requires fewer computing resources. We conclude that it is valuable to keep developing both approaches. Finally, a few outstanding challenges in function-oriented protein structure prediction are pointed out for future development.

Arguably, 2020 was the year of high-accuracy protein structure predictions, with AlphaFold 2.0 achieving previously unseen accuracy in the Critical Assessment of Protein Structure Prediction (CASP). In 2021, DeepMind and EMBL-EBI developed the AlphaFold Protein Structure Database to make an unprecedented number of reliable protein structure predictions easily accessible to the broad scientific community. We provide a brief overview and describe the latest developments in the AlphaFold database. We highlight how the fields of data services, bioinformatics, structural biology, and drug discovery are directly affected by the influx of protein structure data. We also show examples of cutting-edge research that took advantage of the AlphaFold database. It is apparent that connections between various fields through protein structures are now possible, but the amount of data poses new challenges. Finally, we give an outlook regarding the future direction of the database, both in terms of data sets and new functionalities.

Protein aggregation is a widespread phenomenon with important implications in many scientific areas. Although amyloid formation is typically considered detrimental, functional amyloids that perform physiological roles have been identified in all kingdoms of life. Despite their functional and pathological relevance, the structural details of the majority of molecular species involved in the amyloidogenic process remain elusive. Here, we explore the application of AlphaFold, a highly accurate protein structure predictor, in the field of protein aggregation. While we envision a straightforward application of AlphaFold in assisting the design of globular proteins with improved solubility for biomedical and industrial purposes, the use of this algorithm for predicting the structure of aggregated species seems far from trivial. First, in amyloid diseases, the presence of multiple amyloid polymorphs and the heterogeneity of aggregation intermediates challenge the “one sequence, one structure” paradigm inherent to sequence-based predictions. Second, aberrant aggregation is not the subject of positive selective pressure, precluding the use of evolutionary-based approaches, which are the core of the AlphaFold pipeline. Instead, amyloid polymorphism seems to be constrained by the need for a defined structure-activity relationship in functional amyloids. They may thus provide a starting point for the application of AlphaFold in the amyloid landscape.

DeepMind’s AlphaFold2 software has ushered in a revolution in high-quality 3D protein structure prediction. In very recent work by the DeepMind team, structure predictions have been made for the entire proteomes of twenty-one organisms, with >360,000 structures made available for download. Here we show that thousands of novel binding sites for iron-sulfur (Fe-S) clusters and zinc (Zn) ions can be identified within these predicted structures by exhaustive enumeration of all potential ligand-binding orientations. We demonstrate that AlphaFold2 routinely makes highly specific predictions of ligand binding sites: for example, binding sites that are composed exclusively of four cysteine sidechains fall into three clusters, representing binding sites for 4Fe-4S clusters, 2Fe-2S clusters, or individual Zn ions. We show further: (a) that the majority of known Fe-S cluster and Zn binding sites documented in UniProt are recovered by the AlphaFold2 structures, (b) that there are occasional disputes between AlphaFold2 and UniProt, with AlphaFold2 predicting highly plausible alternative binding sites, (c) that the Fe-S cluster binding sites that we identify in E. coli agree well with previous bioinformatics predictions, (d) that cysteines predicted here to be part of ligand binding sites show little overlap with those shown via chemoproteomics techniques to be highly reactive, and (e) that AlphaFold2 occasionally appears to build erroneous disulfide bonds between cysteines that should instead coordinate a ligand. These results suggest that AlphaFold2 could be an important tool for the functional annotation of proteomes, and the methodology presented here is likely to be useful for predicting other ligand-binding sites.

High‐resolution experimental structural determination of protein–protein interactions has led to valuable mechanistic insights, yet due to the massive number of interactions and experimental limitations there is a need for computational methods that can accurately model their structures. Here we explore the use of the recently developed deep learning method, AlphaFold, to predict structures of protein complexes from sequence. With a benchmark of 152 diverse heterodimeric protein complexes, multiple implementations and parameters of AlphaFold were tested for accuracy. Remarkably, many cases (43%) had near‐native models (medium or high critical assessment of predicted interactions accuracy) generated as top‐ranked predictions by AlphaFold, greatly surpassing the performance of unbound protein–protein docking (9% success rate for near‐native top‐ranked models); however, AlphaFold modeling of antibody–antigen complexes within our set was unsuccessful. We identified sequence and structural features associated with lack of AlphaFold success, and we also investigated the impact of multiple sequence alignment input. Benchmarking of a multimer‐optimized version of AlphaFold (AlphaFold‐Multimer) with a set of recently released antibody–antigen structures confirmed a low rate of success for antibody–antigen complexes (11% success), and we found that T cell receptor–antigen complexes are likewise not accurately modeled by that algorithm, showing that adaptive immune recognition poses a challenge for the current AlphaFold algorithm and model. Overall, our study demonstrates that end‐to‐end deep learning can accurately model many transient protein complexes, and highlights areas of improvement for future developments to reliably model any protein–protein interaction of interest.

Many proteins exert their function by switching among different structures. Knowing the conformational ensembles affiliated with these states is critical to elucidate key mechanistic aspects that govern protein function. While experimental determination efforts are still bottlenecked by cost, time, and technical challenges, the machine-learning technology AlphaFold showed near experimental accuracy in predicting the three-dimensional structure of monomeric proteins. However, an AlphaFold ensemble of models usually represents a single conformational state with minimal structural heterogeneity. Consequently, several pipelines have been proposed to either expand the structural breadth of an ensemble or bias the prediction toward a desired conformational state. Here, we analyze how those pipelines work, what they can and cannot predict, and future directions.

The announcement of the outstanding performance of AlphaFold 2 in the CASP 14 protein structure prediction competition came at the end of a long year defined by the COVID-19 pandemic. With an infectious organism dominating the world stage, the developers of AlphaFold 2 were keen to play their part, accurately predicting novel structures of two proteins from SARS-CoV-2. In their blog post of December 2020, they highlighted this contribution, writing “we’ve also seen signs that protein structure prediction could be useful in future pandemic response efforts”. So, what role does structural biology play in guiding vaccine immunogen design and what might be the contribution of AlphaFold 2?

It has been a landmark year for artificial intelligence (AI) and biotechnology. Perhaps the most noteworthy of these advances was Google DeepMind’s AlphaFold2 algorithm, which smashed records in protein structure prediction (Jumper et al., 2021, Nature, 596, 583), complemented by progress made by other research groups around the globe (Baek et al., 2021, Science, 373, 871; Zheng et al., 2021, Proteins). For the first time in history, AI achieved protein structure models rivalling the accuracy of experimentally determined structures. The power of accurate protein structure prediction at our fingertips has countless implications for drug discovery, de novo protein design and fundamental research in chemical biology. While acknowledging the significance of these breakthroughs, this perspective aims to cut through the hype and examine some key limitations, using AlphaFold2 as a lens to consider the broader implications of AI for microbial biotechnology for the next 15 years and beyond.

The past century has witnessed an exponential increase in our atomic-level understanding of molecular and cellular mechanisms from a structural perspective, with multiple landmark achievements contributing to the field. This, coupled with recent and continuing breakthroughs in artificial intelligence methods such as AlphaFold2, and enhanced computational power, is enabling our understanding of protein structure and function at unprecedented levels of accuracy and predictivity. Here, we describe some of the major recent advances across these fields, and describe, as these technologies coalesce, the potential to utilise our enhanced knowledge of intricate cellular and molecular systems to discover novel therapeutics to alleviate human suffering.

Cryogenic electron microscopy (cryo-EM) has become, over the past 10 years, one of the major tools for the structure determination of proteins. Nowadays, the structure prediction field is experiencing the same revolution and, using AlphaFold2, it is possible to obtain high-confidence atomic models for virtually any polypeptide chain smaller than 4000 amino acids with a single click. Even in a scenario where the folds of all polypeptide chains were known, cryo-EM retains specific characteristics that make it a unique tool for the structure determination of macromolecular complexes. Using cryo-EM, it is possible to obtain near-atomic structures of large and flexible mega-complexes, describe conformational panoramas, and potentially develop a structural proteomic approach from fully ex vivo specimens.

Lim Heo, Michael Feig. Proteins, 2020, 88(5): 637-642
Protein structure prediction has long been available as an alternative to experimental structure determination, especially via homology modeling based on templates from related sequences. Recently, models based on distance restraints from coevolutionary analysis via machine learning have significantly expanded the ability to predict structures for sequences without templates. One such method, AlphaFold, also performs well on sequences where templates are available but without using such information directly. Here we show that combining machine-learning based models from AlphaFold with state-of-the-art physics-based refinement via molecular dynamics simulations further improves predictions to outperform any other prediction method tested during the latest round of CASP. The resulting models have highly accurate global and local structures, including high accuracy at functionally important interface residues, and they are highly suitable as initial models for crystal structure determination via molecular replacement.

I outline how over my career as a protein scientist Machine Learning has impacted my area of science and one of my pastimes, chess, where there are some interesting parallels. In 1968, modelling of three-dimensional structures was initiated based on a known structure as a template, the problem of the pathway of protein folding was posed and bets were taken in the emerging field of Machine Learning on whether computers could outplay humans at chess. Half a century later, Machine Learning has progressed from using computational power combined with human knowledge in solving problems to playing chess without human knowledge being used, where it has produced novel strategies. Protein structures are being solved by Machine Learning based on human-derived knowledge but without templates. There is much promise that programs like AlphaFold based on Machine Learning will be powerful tools for designing entirely novel protein folds and new activities. But, will they produce novel ideas on protein folding pathways and provide new insights into the principles that govern folds?

The prediction of the three‐dimensional (3D) structure of proteins from the amino acid sequence has made a stunning breakthrough, reaching atomic accuracy. Using the neural network‐based method AlphaFold2, 3D structures of almost the entire human proteome have been predicted and made available (https://www.alphafold.ebi.ac.uk). To gain insight into how well AlphaFold2 structures represent the conformation of proteins in solution, I here compare the AlphaFold2 structures of selected small proteins with their 3D structures that were determined by nuclear magnetic resonance (NMR) spectroscopy. Proteins were selected for which the 3D solution structures were determined on the basis of a very large number of distance restraints and residual dipolar couplings and are thus some of the best‐resolved solution structures of proteins to date. The quality of the backbone conformation of the AlphaFold2 structures is assessed by fitting a large set of experimental residual dipolar couplings (RDCs). The analysis shows that experimental RDCs fit extremely well to the AlphaFold2 structures predicted for GB3, DinI, and ubiquitin. In the case of GB3, the accuracy of the AlphaFold2 structure even surpasses that of a 1.1 Å crystal structure. Fitting of experimental RDCs furthermore allows identification of AlphaFold2 structures that are best representative of the protein's conformation in solution, as seen for the EF hands of the N‐terminal domain of Ca2+‐ligated calmodulin. Taken together, the analysis shows that structures predicted by AlphaFold2 can be highly representative of the solution conformation of proteins. The combination of AlphaFold2 structures with RDCs promises to be a powerful approach to study structural changes in proteins.
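To make "how well RDCs fit a structure" concrete, the sketch below computes one commonly used agreement measure, the Q-factor: the root-mean-square difference between observed and back-calculated couplings divided by the root-mean-square of the observed couplings. The abstract does not say which metric the author used, so this is an assumption. The back-calculated values are taken as given here, and the alignment-tensor fit that would produce them is not shown. The function name `rdc_q_factor` is hypothetical.

```python
import math
from typing import Sequence


def rdc_q_factor(observed: Sequence[float], calculated: Sequence[float]) -> float:
    """Q = rms(D_obs - D_calc) / rms(D_obs); lower means better agreement.

    `calculated` is assumed to hold couplings back-calculated from the
    structure after an alignment-tensor fit performed elsewhere.
    """
    if len(observed) != len(calculated) or len(observed) == 0:
        raise ValueError("need two non-empty lists of equal length")
    squared_diff = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    squared_obs = sum(o ** 2 for o in observed)
    if squared_obs == 0:
        raise ValueError("observed couplings are all zero")
    # The 1/N factors of the two rms terms cancel in the ratio.
    return math.sqrt(squared_diff / squared_obs)


if __name__ == "__main__":
    d_obs = [10.2, -4.1, 7.7, -12.5]   # made-up couplings in Hz
    d_calc = [9.8, -3.6, 8.1, -12.9]   # made-up back-calculated values
    print(f"Q = {rdc_q_factor(d_obs, d_calc):.3f}")
```

A Q close to zero indicates that the structure reproduces the measured couplings well. Comparing Q across candidate models is one way to pick the model most representative of the solution state.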

The protein folding problem was apparently solved recently by the advent of a deep learning method for protein structure prediction called AlphaFold. However, this program is not able to make predictions about protein folding pathways. Moreover, it only treats about half of the human proteome, as the remaining proteins are intrinsically disordered or contain disordered regions. By definition, these proteins differ from natively folded proteins and do not adopt a properly folded structure in solution. However, these intrinsically disordered proteins (IDPs) also systematically differ in amino acid composition and, uniquely, often become folded upon binding to an interaction partner. These factors preclude solving IDP structures by current machine-learning methods like AlphaFold, which also cannot solve the protein aggregation problem, since this meta-folding process can give rise to different aggregate sizes and structures. An alternative computational method is provided by molecular dynamics simulations, which have already successfully explored the energy landscapes of IDP conformational switching and protein aggregation in multiple cases. These energy landscapes are very different from those of ‘simple’ protein folding, where one energy funnel leads to a unique protein structure. Instead, the energy landscapes of IDP conformational switching and protein aggregation feature a number of minima for different competing low-energy structures. In this review, I discuss the characteristics of these multifunneled energy landscapes in detail, illustrated by molecular dynamics simulations that elucidated the underlying conformational transitions and aggregation processes.

Bolstered by recent methodological and hardware advances, deep learning has increasingly been applied to biological problems and structural proteomics. Such approaches have achieved remarkable improvements over traditional machine learning methods in tasks ranging from protein contact map prediction to protein folding, prediction of protein–protein interaction interfaces, and characterization of protein–drug binding pockets. In particular, the emergence of ab initio protein structure prediction methods, including AlphaFold2, has revolutionized protein structural modeling. From a protein function perspective, numerous deep learning methods have facilitated deconvolution of the exact amino acid residues and protein surface regions responsible for binding other proteins or small molecule drugs. In this review, we provide a comprehensive overview of recent deep learning methods applied in structural proteomics.

Biophysical Journal, 2022, 121(15): 2840-2848
The recent revolution in cryo-electron microscopy (cryo-EM) has made it possible to determine macromolecular structures directly from cell extracts. However, identifying the correct protein from the cryo-EM map is still challenging and often needs additional sequence information from other techniques, such as tandem mass spectrometry and/or bioinformatics. Here, we present DeepTracer-ID, a server-based approach to identify the candidate protein in a user-provided organism de novo from a cryo-EM map, without the need for additional information. Our method first uses DeepTracer to generate a protein backbone model that best represents the cryo-EM map, and this model is then searched against the library of AlphaFold2 predictions for all proteins in the given organism. This method is highly accurate and robust for high-resolution cryo-EM maps: in all 13 experimental maps tested blindly, DeepTracer-ID identified the correct proteins as the top candidates. Eight of the maps were of known structures, while the other five unpublished maps were validated by prior protein annotation and careful inspection of the model refined into the map. The program also showed promising results for both homomeric and heteromeric protein complexes. This platform is possible because of the recent breakthroughs in large-scale three-dimensional protein structure prediction.
