Similar articles
 Found 20 similar articles; search took 15 ms
1.
Rapid progress in structural modeling of proteins and their interactions is powered by advances in knowledge-based methodologies along with better understanding of physical principles of protein structure and function. The pool of structural data for modeling of proteins and protein–protein complexes is constantly increasing due to the rapid growth of protein interaction databases and Protein Data Bank. The GWYRE (Genome Wide PhYRE) project capitalizes on these developments by advancing and applying new powerful modeling methodologies to structural modeling of protein–protein interactions and genetic variation. The methods integrate knowledge-based tertiary structure prediction using Phyre2 and quaternary structure prediction using template-based docking by a full-structure alignment protocol to generate models for binary complexes. The predictions are incorporated in a comprehensive public resource for structural characterization of the human interactome and the location of human genetic variants. The GWYRE resource facilitates better understanding of principles of protein interaction and structure/function relationships. The resource is available at http://www.gwyre.org.

2.
Protein structure prediction   Total citations: 2, self-citations: 0, citations by others: 2
The prediction of protein structure, based primarily on sequence and structure homology, has become an increasingly important activity. Homology models have become more accurate and their range of applicability has increased. Progress has come, in part, from the flood of sequence and structure information that has appeared over the past few years, and also from improvements in analysis tools. These include profile methods for sequence searches, the use of three-dimensional structure information in sequence alignment and new homology modeling tools, specifically in the prediction of loop and side-chain conformations. There have also been important advances in understanding the physical chemical basis of protein stability and the corresponding use of physical chemical potential functions to distinguish correctly folded from incorrectly folded protein conformations.

3.
The objective of science is to understand the natural world; we argue that prediction is the only way to demonstrate scientific understanding, implying that prediction should be a fundamental aspect of all scientific disciplines. Reproducibility is an essential requirement of good science and arises from the ability to develop models that make accurate predictions on new data. Ecology, however, with a few exceptions, has abandoned prediction as a central focus and faces its own crisis of reproducibility. Models are where ecological understanding is stored and they are the source of all predictions – no prediction is possible without a model of the world. Models can be improved in three ways: in the choice of model variables, in the functional relationships among dependent and independent variables, and in parameter estimates. Ecologists rarely test to assess whether new models have made advances by identifying new and important variables, elucidating functional relationships, or improving parameter estimates. Without these tests it is difficult to know if we understand more today than we did yesterday. A new commitment to prediction in ecology would lead to, among other things, more mature (i.e. quantitative) hypotheses, prioritization of modeling techniques that are more appropriate for prediction (e.g. using continuous independent variables rather than categorical) and, ultimately, advancement towards a more general understanding of the natural world.
Synthesis: Ecology, with a few exceptions, has abandoned prediction and therefore the ability to demonstrate understanding. Here we address how this has inhibited progress in ecology and explore how a renewed focus on prediction would benefit ecologists. The lack of emphasis on prediction has resulted in a discipline that tests qualitative, imprecise hypotheses with little concern for whether the results are generalizable beyond where and when the data were collected. A renewed commitment to prediction would allow ecologists to address critical questions about the generalizability of our results and the progress we are making towards understanding the natural world.

4.
5.
Computational prediction of enzyme mechanism and protein function requires accurate physics-based models and suitable sampling. We discuss recent advances in large-scale quantum mechanical (QM) modeling of biochemical systems that have reduced the cost of high-accuracy models. Tradeoffs between sampling and accuracy have motivated modeling with molecular mechanics (MM) in a multiscale QM/MM or iterative approach. Limitations to both conventional density-functional theory and classical MM force fields remain for describing noncovalent interactions in comparison to experiment or wavefunction theory. Because predictions of enzyme action (i.e. electrostatics), free energy barriers, and mechanisms are sensitive to the protocol and embedding method in QM/MM, convergence tests and systematic methods for quantifying QM-level interactions are a needed, active area of development.

6.
Prediction of protein structure from sequence has been intensely studied for many decades, owing to the problem's importance and its uniquely well-defined physical and computational bases. While progress has historically ebbed and flowed, the past two years saw dramatic advances driven by the increasing “neuralization” of structure prediction pipelines, whereby computations previously based on energy models and sampling procedures are replaced by neural networks. The extraction of physical contacts from the evolutionary record; the distillation of sequence–structure patterns from known structures; the incorporation of templates from homologs in the Protein Databank; and the refinement of coarsely predicted structures into finely resolved ones have all been reformulated using neural networks. Cumulatively, this transformation has resulted in algorithms that can now predict single protein domains with a median accuracy of 2.1 Å, setting the stage for a foundational reconfiguration of the role of biomolecular modeling within the life sciences.

7.
The combination of the wide availability of protein backbone and side-chain NMR chemical shifts with advances in understanding of their relationship to protein structure makes these parameters useful for the assessment of structural-dynamic protein models. A new chemical shift predictor (PPM) is introduced, which is solely based on physical–chemical contributions to the chemical shifts for both the protein backbone and methyl-bearing amino-acid side chains. To explicitly account for the effects of protein dynamics on chemical shifts, PPM was directly refined against 100 ns long molecular dynamics (MD) simulations of 35 proteins with known experimental NMR chemical shifts. It is found that the prediction of methyl-proton chemical shifts by PPM from MD ensembles is improved over other methods, while backbone Cα, Cβ, C′, N, and HN chemical shifts are predicted at an accuracy comparable to the latest generation of chemical shift prediction programs. PPM is particularly suitable for the rapid evaluation of large protein conformational ensembles on their consistency with experimental NMR data and the possible improvement of protein force fields from chemical shifts.

8.
9.

Background  

A number of sequence-based methods exist for protein secondary structure prediction. Protein secondary structures can also be determined experimentally from circular dichroism and infrared spectroscopic data using empirical analysis methods. It has been proposed that comparable accuracy can be obtained from sequence-based predictions as from these biophysical measurements. Here we have compared the secondary structure determination accuracies of sequence prediction methods with the empirically determined values from the spectroscopic data on datasets of proteins for which both crystal structures and spectroscopic data are available.

10.
Applied ecology is based on an assumption that a management action will result in a predicted outcome. Testing the prediction accuracy of ecological models is the most powerful way of evaluating the knowledge implicit in this cause-effect relationship; however, predictive modeling and prediction testing are spreading only slowly in ecology. The challenge of prediction testing is particularly acute for small-scale studies, because withholding data for prediction testing (e.g., via k-fold cross validation) can reduce model precision. However, by necessity small-scale studies are common. We use one such study that explored small mammal abundance along an elevational gradient to test prediction accuracy of models with varying degrees of information content. For each of three small mammal species, we conducted 5000 iterations of the following process: (1) randomly selected 75 % of the data to develop generalized linear models of species abundance that used detailed site measurements as covariates, (2) used an information theoretic approach to compare the top model with detailed covariates to habitat type-only and null models constructed with the same data, (3) tested those models’ ability to predict the 25 % of the randomly withheld data, and (4) evaluated prediction accuracy with a quadratic loss function. Detailed models fit the model-evaluation data best but had greater expected prediction error when predicting out-of-sample data relative to the habitat type models. Relationships between species and detailed site variables may be evident only within the framework of explicitly hierarchical analyses. We show that even with a small but relatively typical dataset (n = 28 sampling locations across 125 km over two years), researchers can effectively compare models with different information content and measure models’ predictive power, thus evaluating their own ecological understanding and defining the limits of their inferences. Identifying the appropriate scope of inference through prediction testing is ecologically valuable and is attainable even with small datasets.
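The repeated hold-out procedure in the abstract above (random 75 % / 25 % split, fit on the larger part, score the withheld part with a quadratic loss, repeat many times) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the generalized linear models and information-theoretic model comparison are replaced here by two deliberately simple stand-ins (an overall-mean "null" model and a per-habitat-mean model), and every function name and the data layout (a list of `(habitat, abundance)` pairs) are assumptions made for this example.

```python
import random
from statistics import mean


def quadratic_loss(observed, predicted):
    """Mean squared difference between observed and predicted values."""
    return mean((o - p) ** 2 for o, p in zip(observed, predicted))


def fit_null(train):
    """Hypothetical null model: predict the overall training mean everywhere."""
    overall = mean(y for _, y in train)
    return lambda habitat: overall


def fit_habitat(train):
    """Hypothetical habitat-only model: predict each habitat's training mean,
    falling back to the overall mean for a habitat absent from training."""
    overall = mean(y for _, y in train)
    by_habitat = {}
    for habitat, y in train:
        by_habitat.setdefault(habitat, []).append(y)
    means = {h: mean(ys) for h, ys in by_habitat.items()}
    return lambda habitat: means.get(habitat, overall)


def repeated_holdout(data, fitters, iterations=5000, train_fraction=0.75, seed=0):
    """Average out-of-sample quadratic loss per model over random splits.

    data    -- list of (habitat, abundance) pairs (assumed layout)
    fitters -- dict mapping a model name to a function train -> predictor
    """
    rng = random.Random(seed)
    n_train = round(train_fraction * len(data))
    losses = {name: [] for name in fitters}
    for _ in range(iterations):
        shuffled = data[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        observed = [y for _, y in test]
        for name, fit in fitters.items():
            model = fit(train)
            predicted = [model(habitat) for habitat, _ in test]
            losses[name].append(quadratic_loss(observed, predicted))
    return {name: mean(values) for name, values in losses.items()}
```

Calling `repeated_holdout(data, {"null": fit_null, "habitat": fit_habitat})` returns one average out-of-sample loss per model, so models with different information content can be ranked by predictive error rather than by in-sample fit.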

11.

Background  

The delivery of DNA into human cells has been the basis of advances in the understanding of gene function and the development of genetic therapies. Numerous chemical and physical approaches have been used to deliver the DNA, but their efficacy has been variable and is highly dependent on the cell type to be transfected.

12.

Background  

Recent advances in high-throughput technologies have produced a vast number of protein sequences, while the number of high-resolution structures has seen a limited increase. This has prompted the development of many strategies to build protein structures from their sequences, generating a considerable number of alternative models. The selection of the model closest to the native conformation has thus become crucial for structure prediction. Several methods have been developed to score protein models by energies, knowledge-based potentials, and combinations of both.

13.

Background  

Proteochemometrics is a new methodology that allows prediction of protein function directly from real interaction measurement data without the need for 3D structure information. Several reported proteochemometric models of ligand-receptor interactions have already yielded significant insights into various forms of bio-molecular interactions. The proteochemometric models are multivariate regression models that predict binding affinity for a particular combination of features of the ligand and protein. Although proteochemometric models have already offered interesting results in various studies, no detailed statistical evaluation of their average predictive power has been performed. In particular, variable subset selection performed to date has always relied on using all available examples, a situation also encountered in microarray gene expression data analysis.

14.
The mechanism of the complex enzyme nitrogenase has long been one of the most challenging problems in bioinorganic chemistry. The complexity of the metal centers of nitrogenase has stretched the boundaries of biochemical, physical and computational tools for providing insights into its structure and chemical function. Recently, there have been several key advances in crystallography and spectroscopy that have impacted the way the nitrogenase mechanism is approached. These advances have opened new frontiers in nitrogenase research, which has started to reveal novel details about the molecular structure, substrate binding and reduction. Here, we discuss these recent advances and their implications for the future of nitrogenase research.

15.
In the cellular context, proteins participate in communities to perform their function. The detection and identification of these communities as well as in-community interactions has long been the subject of investigation, mainly through proteomics analysis with mass spectrometry. With the advent of cryogenic electron microscopy and the “resolution revolution,” their visualization has recently been made possible, even in complex, native samples. The advances in both fields have resulted in the generation of large amounts of data, whose analysis requires advanced computation, often employing machine learning approaches to reach the desired outcome. In this work, we first performed a robust proteomics analysis of mass spectrometry (MS) data derived from a yeast native cell extract and used this information to identify protein communities and inter-protein interactions. Cryo-EM analysis of the cell extract provided a reconstruction of a biomolecule at medium resolution (∼8 Å (FSC = 0.143)). Utilizing MS-derived proteomics data and systematic fitting of AlphaFold-predicted atomic models, this density was assigned to the 2.6 MDa complex of yeast fatty acid synthase. Our proposed workflow identifies protein complexes in native cell extracts from Saccharomyces cerevisiae by combining proteomics, cryo-EM, and AI-guided protein structure prediction.

16.
Predictive understanding of cell signaling network operation based on general prior knowledge but consistent with empirical data in a specific environmental context is a current challenge in computational biology. Recent work has demonstrated that Boolean logic can be used to create context-specific network models by training proteomic pathway maps to dedicated biochemical data; however, the Boolean formalism is restricted to characterizing protein species as either fully active or inactive. To advance beyond this limitation, we propose a novel form of fuzzy logic sufficiently flexible to model quantitative data but also sufficiently simple to efficiently construct models by training pathway maps on dedicated experimental measurements. Our new approach, termed constrained fuzzy logic (cFL), converts a prior knowledge network (obtained from literature or interactome databases) into a computable model that describes graded values of protein activation across multiple pathways. We train a cFL-converted network to experimental data describing hepatocytic protein activation by inflammatory cytokines and demonstrate the application of the resultant trained models for three important purposes: (a) generating experimentally testable biological hypotheses concerning pathway crosstalk, (b) establishing capability for quantitative prediction of protein activity, and (c) prediction and understanding of the cytokine release phenotypic response. Our methodology systematically and quantitatively trains a protein pathway map summarizing curated literature to context-specific biochemical data. This process generates a computable model yielding successful prediction of new test data and offering biological insight into complex datasets that are difficult to fully analyze by intuition alone.

17.
Protein structure prediction has great potential for understanding the function of proteins at the molecular level and for designing novel protein functions. Here, we report a rapid and accurate structure prediction system that runs in an automated manner. Since fold recognition of the target protein to be modeled is the starting point of the template-guided model building process, various approaches – such as profile analysis, threading, and SCOP fold classification – have been applied to generate the template library and to select the best template structure. After the best template was determined, fold consistency within the template candidates was considered using TM-score and the SCOP database to select additional template structures from the template library. To generate a total of 100 decoy sets, MODELLER was used with the selected template structure. The predicted decoys were clustered with an RMSD criterion of 3 Å to obtain centroids from each cluster. Finally, the selected centroids were subjected to side-chain rearrangement using the SCWRL module. Our fully automated structure prediction system was examined with sample test sets consisting of 80 recently released PDB chains. Judged by the TM-score (≥0.4), we concluded that 60 cases (75%) showed similar structures of statistical significance. This prediction system provides users with simple and reliable models within hours of query submission, so it can readily be used for high-throughput enzyme screening.

18.
It has been a landmark year for artificial intelligence (AI) and biotechnology. Perhaps the most noteworthy of these advances was Google DeepMind’s AlphaFold2 algorithm which smashed records in protein structure prediction (Jumper et al., 2021, Nature, 596, 583) complemented by progress made by other research groups around the globe (Baek et al., 2021, Science, 373, 871; Zheng et al., 2021, Proteins). For the first time in history, AI achieved protein structure models rivalling the accuracy of experimentally determined structures. The power of accurate protein structure prediction at our fingertips has countless implications for drug discovery, de novo protein design and fundamental research in chemical biology. While acknowledging the significance of these breakthroughs, this perspective aims to cut through the hype and examine some key limitations using AlphaFold2 as a lens to consider the broader implications of AI for microbial biotechnology for the next 15 years and beyond.

19.
A family of structurally related intrinsic membrane proteins (facilitative glucose transporters) catalyzes the movement of glucose across the plasma membrane of animal cells. Evidence indicates that these proteins show a common structural motif where approximately 50% of the mass is embedded in the lipid bilayer (transmembrane domain) in 12 alpha-helices (transmembrane helices; TMHs) and accommodates a water-filled channel for substrate passage (glucose channel) whose tertiary structure is currently unknown. Using recent advances in protein structure prediction algorithms, we propose here two three-dimensional structural models for the transmembrane glucose channel of the GLUT1 glucose transporter. Our models emphasize the physical dimension and water accessibility of the channel, loop lengths between TMHs, the macrodipole orientation in the four-helix bundle motif, and helix packing energy. Our models predict that five TMHs, either TMHs 3, 4, 7, 8, 11 (Model 1) or TMHs 2, 5, 11, 8, 7 (Model 2), line the channel, and the remaining TMHs surround these channel-lining TMHs. We discuss how our models are compatible with the experimental data obtained with this protein, and how they can be used in designing new biochemical and molecular biological experiments to elucidate the structural basis of this important protein function.

20.
Although most statistical methods for the analysis of longitudinal data have focused on retrospective models of association, new advances in mobile health data have presented opportunities for predicting future health status by leveraging an individual's behavioral history alongside data from similar patients. Methods that incorporate both individual-level and sample-level effects are critical to using these data to their full predictive capacity. Neural networks are powerful tools for prediction, but many assume input observations are independent even when they are clustered or correlated in some way, such as in longitudinal data. Generalized linear mixed models (GLMM) provide a flexible framework for modeling longitudinal data but have poor predictive power, particularly when the data are highly nonlinear. We propose a generalized neural network mixed model that replaces the linear fixed effect in a GLMM with the output of a feed-forward neural network. The model simultaneously accounts for the correlation structure and complex nonlinear relationship between input variables and outcomes, and it utilizes the predictive power of neural networks. We apply this approach to predict depression and anxiety levels of schizophrenic patients using longitudinal data collected from passive smartphone sensors.
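The substitution described in the abstract above can be written out schematically. The notation here is assumed for illustration and is not taken from the paper: $y_{ij}$ is observation $j$ of subject $i$, $g$ is a link function, $x_{ij}$ and $z_{ij}$ are fixed-effect and random-effect covariates, $b_i$ is the subject-level random effect, and $f_{\theta}$ is a feed-forward network with weights $\theta$.

```latex
\begin{align*}
% Standard GLMM: linear fixed effect plus subject-level random effect
g\bigl(\mathbb{E}[y_{ij} \mid b_i]\bigr) &= x_{ij}^{\top}\beta + z_{ij}^{\top} b_i,
  & b_i &\sim \mathcal{N}(0, \Sigma) \\
% Neural network mixed model: the linear term is replaced by a network output
g\bigl(\mathbb{E}[y_{ij} \mid b_i]\bigr) &= f_{\theta}(x_{ij}) + z_{ij}^{\top} b_i,
  & b_i &\sim \mathcal{N}(0, \Sigma)
\end{align*}
```

Under this reading, the random-effect term $z_{ij}^{\top} b_i$ still carries the within-subject correlation, while $f_{\theta}$ captures the nonlinear dependence on the inputs; the Gaussian form of $b_i$ is the usual GLMM assumption and is assumed here as well.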

Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  ICP license: 京ICP备09084417号