Similar literature
1.
The Rosetta software suite for macromolecular modeling is a powerful computational toolbox for protein design, structure prediction, and protein structure analysis. The development of novel Rosetta‐based scientific tools requires two orthogonal skill sets: deep domain‐specific expertise in protein biochemistry and technical expertise in development, deployment, and analysis of molecular simulations. Furthermore, the computational demands of molecular simulation necessitate large‐scale cluster‐based or distributed solutions for nearly all scientifically relevant tasks. To reduce the technical barriers to entry for new development, we integrated Rosetta with modern, widely adopted computational infrastructure. This allows simplified deployment in large‐scale cluster and cloud computing environments, and effective reuse of common libraries for simulation execution and data analysis. To achieve this, we integrated Rosetta with the Conda package manager; this simplifies installation into existing computational environments and packaging as Docker images for cloud deployment. Then, we developed programming interfaces to integrate Rosetta with the PyData stack for analysis and distributed computing, including the popular tools Jupyter, Pandas, and Dask. We demonstrate the utility of these components by generating a library of a thousand de novo disulfide‐rich miniproteins in a hybrid simulation that included cluster‐based design and interactive notebook‐based analyses. Our new tools enable users, who would otherwise not have access to the necessary computational infrastructure, to perform state‐of‐the‐art molecular simulation and design with Rosetta.

2.
Temperature-sensitive (ts) mutations are mutations that exhibit a mutant phenotype at high or low temperatures and a wild-type phenotype at normal temperature. Temperature-sensitive mutants are valuable tools for geneticists, particularly in the study of essential genes. However, finding ts mutations typically relies on generating and screening many thousands of mutations, which is an expensive and labor-intensive process. Here we describe an in silico method that uses Rosetta and machine learning techniques to predict a highly accurate "top 5" list of ts mutations given the structure of a protein of interest. Rosetta is a protein structure prediction and design code, used here to model and score how proteins accommodate point mutations with side-chain and backbone movements. We show that integrating Rosetta relax-derived features with sequence-based features results in accurate temperature-sensitive mutation predictions.

3.
Symmetric protein assemblies play important roles in many biochemical processes. However, the large size of such systems is challenging for traditional structure modeling methods. This paper describes the implementation of a general framework for modeling arbitrary symmetric systems in Rosetta3. We describe the various types of symmetries relevant to the study of protein structure that may be modeled using Rosetta's symmetric framework. We then describe how this symmetric framework is efficiently implemented within Rosetta, which restricts the conformational search space by sampling only symmetric degrees of freedom, and explicitly simulates only a subset of the interacting monomers. Finally, we describe structure prediction and design applications that utilize the Rosetta3 symmetric modeling capabilities, and provide a guide to running simulations on symmetric systems.

4.

Background  

Understanding protein function from its structure is a challenging problem. Sequence-based approaches for finding homology have broad use for annotation of both structure and function. 3D structural information on protein domains and their interactions provides a view of structure-function relationships that is complementary to sequence information. We have developed a web site and an API of web services that enable users to submit protein structures and identify statistically significant neighbors, and the underlying structural environments that produce each match, using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer based superfamily predictions to give a unique integrated view of the prediction of SCOP superfamilies, EC number, and GO term, as well as identification of the protein structural environments that are associated with that prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest.

5.
To obtain a dynamic molecular model from a static one, a molecular dynamics (MD) simulation must be performed. Software packages such as GROMACS are used for that purpose, but they lack a GUI. The Dynamics PyMOL plugin allows researchers to perform MD simulations directly from PyMOL through a GUI-based interface to the GROMACS tools. This paper describes the improvements introduced in Dynamics PyMOL plugin 2.0, including integration with the ProDy library, the option to use implicit solvents, the ability to interpret MD simulations, and implementation of additional GROMACS functionality.

6.
We recently developed the Rosetta algorithm for ab initio protein structure prediction, which generates protein structures from fragment libraries using simulated annealing. The scoring function in this algorithm favors the assembly of strands into sheets. However, it does not discriminate between different sheet motifs. After generating many structures using Rosetta, we found that the folding algorithm predominantly generates very local structures. We surveyed the distribution of beta-sheet motifs with two edge strands (open sheets) in a large set of non-homologous proteins. We investigated how much of that distribution can be accounted for by rules previously published in the literature, and developed a filter and a scoring method that enables us to improve protein structure prediction for beta-sheet proteins. Proteins 2002;48:85-97.

7.
Fujitsuka Y, Chikenji G, Takada S. Proteins 2006, 62(2):381-398
Predicting protein tertiary structures by in silico folding is still very difficult for proteins that have new folds. Here, we developed a coarse-grained energy function, SimFold, for de novo structure prediction, performed a benchmark test of prediction with fragment assembly simulations for 38 test proteins, and proposed consensus prediction with Rosetta. The SimFold energy consists of many terms that take into account solvent-induced effects on the basis of physicochemical considerations. In the benchmark test, SimFold succeeded in predicting native structures within 6.5 Å for 12 of 38 proteins; this success rate was the same as that of the publicly available version of Rosetta (ab initio version 1.2) run with default parameters. We investigated which energy terms in SimFold contribute to structure prediction performance, finding that the hydrophobic interaction is the most crucial for the prediction, whereas other sequence-specific terms have weak but positive roles. In the benchmark, the proteins well predicted by SimFold and by Rosetta were not the same for 5 of 12 proteins, which led us to introduce consensus prediction. With combined decoys, we succeeded in prediction for 16 proteins, four more than SimFold or Rosetta separately. For each of the 38 proteins, structural ensembles generated by SimFold and by Rosetta were qualitatively compared by mapping the sampled structural space onto two dimensions. For proteins on which one of the two methods succeeded and the other failed in prediction, the former had a less scattered ensemble located around the native structure. For proteins on which both methods succeeded in prediction, the two ensembles were often intermixed.

8.
Bowman GR, Pande VS. Proteins 2009, 74(3):777-788
Rosetta is a structure prediction package that has been employed successfully in numerous protein design and other applications [1]. Previous reports have attributed the current limitations of the Rosetta de novo structure prediction algorithm to inadequate sampling, particularly during the low-resolution phase [2-5]. Here, we implement the Simulated Tempering (ST) sampling algorithm [6,7] in Rosetta to address this issue. ST is intended to yield canonical sampling by inducing a random walk in temperature space such that broad sampling is achieved at high temperatures and detailed exploration of local free energy minima is achieved at low temperatures. ST should therefore visit basins in accordance with their free energies rather than their energies and achieve more global sampling than the localized scheme currently implemented in Rosetta. However, we find that ST does not improve structure prediction with Rosetta. To understand why, we carried out a detailed analysis of the low-resolution scoring functions and find that they do not provide a strong bias towards the native state. In addition, we find that both ST and standard Rosetta runs started from the native state are biased away from the native state. Although the low-resolution scoring functions could be improved, we propose that working entirely at full-atom resolution is now possible and may be a better option due to superior native-state discrimination at full-atom resolution. Such an approach will require more attention to the kinetics of convergence, however, as functions capable of native state discrimination are not necessarily capable of rapidly guiding non-native conformations to the native state.
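
The random walk in temperature space described in this abstract can be illustrated with a minimal Python sketch. It is not the Rosetta implementation: the double-well `energy` function, the temperature ladder, the per-temperature `weights`, and the move size are all assumptions chosen for illustration. The sketch alternates a Metropolis move of the coordinate at the current temperature with a proposed hop to a neighbouring temperature, accepted with the standard simulated tempering criterion min(1, exp(-(β_new - β_old)·E + g_new - g_old)).

```python
import math
import random


def energy(x):
    """Toy double-well energy (an assumption; not a Rosetta score function)."""
    return (x * x - 1.0) ** 2


def simulated_tempering(temperatures, weights, n_steps, seed=0):
    """Run a toy simulated tempering walk.

    temperatures: ladder of temperatures (hypothetical values).
    weights: one log-weight g_k per temperature, used to balance visits.
    Returns the final coordinate and the visit count per temperature.
    """
    rng = random.Random(seed)
    x = 1.0  # start in one of the two wells
    k = 0    # index of the current temperature
    visits = [0] * len(temperatures)
    for _ in range(n_steps):
        temp = temperatures[k]
        # 1) Metropolis move of the coordinate at the current temperature.
        x_new = x + rng.uniform(-0.5, 0.5)
        delta = energy(x_new) - energy(x)
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x = x_new
        # 2) Propose a hop to a neighbouring temperature.
        k_new = k + rng.choice((-1, 1))
        if 0 <= k_new < len(temperatures):
            beta_old = 1.0 / temp
            beta_new = 1.0 / temperatures[k_new]
            log_acc = -(beta_new - beta_old) * energy(x) + weights[k_new] - weights[k]
            if log_acc >= 0 or rng.random() < math.exp(log_acc):
                k = k_new
        visits[k] += 1
    return x, visits
```

Calling `simulated_tempering([0.5, 1.0, 2.0], [0.0, 0.0, 0.0], 1000)` returns the final coordinate and how often each temperature was visited. In practice the weights would be tuned so that all temperatures are visited roughly equally; zero weights are used here only to keep the sketch short.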

9.
Ordog R. Bioinformation 2008, 2(8):346-347
The fast-growing Protein Data Bank (PDB) contains a vast amount of 3-dimensional data on protein and nucleic-acid structures obtained by X-ray crystallography and Nuclear Magnetic Resonance (NMR) spectroscopy. PyDeT is a plug-in for PyMOL (a molecular visualization software system) that visualizes tessellations derived from the protein structure along with the source protein. PyDeT is released under a GNU General Public License (GPL) and is available from the authors.

10.
Incorporation of effective backbone sampling into protein simulation and design is an important step in increasing the accuracy of computational protein modeling. Recent analysis of high-resolution crystal structures has suggested a new model, termed backrub, to describe localized, hinge-like alternative backbone and side-chain conformations observed in the crystal lattice. The model involves internal backbone rotations about axes between C-alpha atoms. Based on this observation, we have implemented a backrub-inspired sampling method in the Rosetta structure prediction and design program. We evaluate this model of backbone flexibility using three different tests. First, we show that Rosetta backrub simulations recapitulate the correlation between backbone and side-chain conformations in the high-resolution crystal structures upon which the model was based. As a second test of backrub sampling, we show that backbone flexibility improves the accuracy of predicting point-mutant side-chain conformations over fixed backbone rotameric sampling alone. Finally, we show that backrub sampling of triosephosphate isomerase loop 6 can capture the millisecond/microsecond oscillation between the open and closed states observed in solution. Our results suggest that backrub sampling captures a sizable fraction of localized conformational changes that occur in natural proteins. Application of this simple model of backbone motions may significantly improve both protein design and atomistic simulations of localized protein flexibility.
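
The core geometric operation named in this abstract, a rotation about an axis between two C-alpha atoms, can be sketched as a rigid rotation using Rodrigues' formula. This is a geometric illustration only, assuming atoms are plain (x, y, z) tuples; the function name `rotate_about_axis` is hypothetical and is not part of Rosetta, and the sketch omits everything else a real backrub move involves (choosing segments, angle ranges, and acceptance criteria).

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle_deg):
    """Rotate `point` by `angle_deg` about the line through axis_start and axis_end.

    In a backrub-style move the two axis points would be C-alpha atoms and
    `point` any atom between them (an assumption of this sketch).
    """
    axis = [e - s for s, e in zip(axis_start, axis_end)]
    norm = math.sqrt(sum(c * c for c in axis))
    if norm == 0.0:
        raise ValueError("axis_start and axis_end must differ")
    k = [c / norm for c in axis]                      # unit rotation axis
    v = [p - s for p, s in zip(point, axis_start)]    # point relative to the axis origin
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    # Rodrigues' rotation formula: v cos(t) + (k x v) sin(t) + k (k . v)(1 - cos(t))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return tuple(r + s for r, s in zip(rotated, axis_start))
```

Atoms that lie on the axis itself are left in place by this rotation, so the two flanking C-alpha atoms stay fixed while the atoms between them move.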

11.
Proper visualization of scientific data is important for understanding spatial relationships. Particularly in the field of structural biology, where researchers seek to gain an understanding of the structure and function of biological macromolecules, it is important to have access to visualization programs which are fast, flexible, and customizable. We present KiNG, a Java program for visualizing scientific data, with a focus on macromolecular visualization. KiNG uses the kinemage graphics format, which is tuned for macromolecular structures, but is also ideal for many other kinds of spatially embedded information. KiNG is written in cross‐platform, open‐source Java code, and can be extended by end users through simple or elaborate “plug‐in” modules. Here, we present three such applications of KiNG to problems in structural biology (protein backbone rebuilding), bioinformatics of high‐dimensional data (e.g., protein sidechain chi angles), and classroom education (molecular illustration). KiNG is a mature platform for rapidly creating and capitalizing on scientific visualizations. As a research tool, it is invaluable as a test bed for new methods of visualizing scientific data and information. It is also a powerful presentation tool, whether for structure browsing, teaching, direct 3D display on the web, or as a method for creating pictures and videos for publications. KiNG is freely available for download at http://kinemage.biochem.duke.edu.

12.
The Rosetta Molecular Modeling suite is a command-line-only collection of applications that enable high-resolution modeling and design of proteins and other molecules. Although extremely useful, Rosetta can be difficult to learn for scientists with little computational or programming experience. To that end, we have created a Graphical User Interface (GUI) for Rosetta, called the PyRosetta Toolkit, for creating and running protocols in Rosetta for common molecular modeling and protein design tasks and for analyzing the results of Rosetta calculations. The program is highly extensible so that developers can add new protocols and analysis tools to the PyRosetta Toolkit GUI.

13.
Antifreeze proteins (AFPs) are polypeptides produced by certain plants, animals, fungi, and bacteria that help them survive at sub-zero temperatures. The current study examined seven antifreeze proteins of the fish ocean pout (Zoarces americanus); their amino acid sequences were collected from the National Center for Biotechnology Information and characterized using several in silico tools. These biocomputational techniques were applied to determine the physicochemical, functional, and conformational characteristics of the targeted AFPs. Multiple physicochemical properties, including isoelectric point, extinction coefficient, instability index, aliphatic index, and grand average of hydropathy, were calculated and analysed with the ExPASy-ProtParam prediction web server. The EMBOSS pepwheel online tool was used to represent the protein sequences in helical form. The primary structure analysis shows that most of the AFPs are hydrophobic in nature due to their high content of non-polar residues. The secondary structure of these proteins was calculated using the SOPMA tool. The SOSUI server and the CYS_REC program were also run to predict the transmembrane helices and disulfide bridges, respectively, of the proteins under study. The 3D structures of the seven AFPs were modelled with the homology modelling programmes SWISS MODEL and ProSA web server. UCSF Chimera, Antheprot 3D, PyMOL, and RAMPAGE were used to visualize and analyse the structural variation of the predicted protein models. MEGA7.0.9 software was used to determine the phylogenetic relationships among these AFPs. These models offer excellent and reliable baseline information for functional characterization of the experimentally derived protein domain composition using advanced tools and techniques of computational biology.

14.
Membrane proteins are critical functional molecules in the human body, constituting more than 30% of open reading frames in the human genome. Unfortunately, a myriad of difficulties in overexpression and reconstitution into membrane mimetics severely limit our ability to determine their structures. Computational tools are therefore instrumental to membrane protein structure prediction, consequently increasing our understanding of membrane protein function and their role in disease. Here, we describe a general framework facilitating membrane protein modeling and design that combines the scientific principles for membrane protein modeling with the flexible software architecture of Rosetta3. This new framework, called RosettaMP, provides a general membrane representation that interfaces with scoring, conformational sampling, and mutation routines that can be easily combined to create new protocols. To demonstrate the capabilities of this implementation, we developed four proof-of-concept applications for (1) prediction of free energy changes upon mutation; (2) high-resolution structural refinement; (3) protein-protein docking; and (4) assembly of symmetric protein complexes, all in the membrane environment. Preliminary data show that these algorithms can produce meaningful scores and structures. The data also suggest needed improvements to both sampling routines and score functions. Importantly, the applications collectively demonstrate the potential of combining the flexible nature of RosettaMP with the power of Rosetta algorithms to facilitate membrane protein modeling and design.

15.
When a protein sequence does not share any significant sequence similarity with a protein of known structure, homology modeling cannot be applied. However, many novel and interesting methods, such as secondary structure prediction, fold recognition, and prediction of long-range interactions, are being developed and have been shown to be reasonably successful in predicting protein structures from sequence data and evolutionary information. The a priori evaluation of the correctness of a prediction obtained by one of these methods is however often problematic. Consequently, it is important to use all available information provided by as many different methods as possible and all the available experimental data about the protein of interest, since the consistency of the results is indicative of the reliability of the prediction. Hence the need has arisen for suitable tools able to compare results provided by different methods and evaluate their consistency. We have therefore constructed GLASS, a general platform to read, visualize, compare, and evaluate prediction results from many different sources and to project these prediction results into three dimensions. In addition, GLASS allows the comparison of selected parameters calculated for a model with the distribution observed in real protein structures, thus providing an easy way to test new methods for evaluating the likelihood of different structural models. GLASS can be considered as a “workbench” for structural predictions useful to both experimentalists and theoreticians. Proteins 30:339–351, 1998. © 1998 Wiley-Liss, Inc.

16.
This study explores the use of multiple sequence alignment (MSA) information and global measures of hydrophobic core formation for improving the Rosetta ab initio protein structure prediction method. The most effective use of the MSA information is achieved by carrying out independent folding simulations for a subset of the homologous sequences in the MSA and then identifying the free energy minima common to all folded sequences via simultaneous clustering of the independent folding runs. Global measures of hydrophobic core formation, using ellipsoidal rather than spherical representations of the hydrophobic core, are found to be useful in removing non-native conformations before cluster analysis. Through this combination of MSA information and global measures of protein core formation, we significantly increase the performance of Rosetta on a challenging test set. Proteins 2001;43:1-11.

17.
18.
Gene duplication and loss are major driving forces in evolution. While many important genomic resources provide information on gene presence, there is a lack of tools giving equal importance to presence and absence information as well as web platforms enabling easy visual comparison of multiple domain‐based protein occurrences at once. Here, we present Aquerium, a platform for visualizing genomic presence and absence of biomolecules with a focus on protein domain architectures. The web server offers advanced domain organization querying against the database of pre‐computed domains for ~26,000 organisms and it can be utilized for identification of evolutionary events, such as fusion, disassociation, duplication, and shuffling of protein domains. The tool also allows alternative inputs of custom entries or BLASTP results for visualization. Aquerium will be a useful tool for biologists who perform comparative genomic and evolutionary analyses. The web server is freely accessible at http://aquerium.utk.edu. Proteins 2016; 85:72–77. © 2016 Wiley Periodicals, Inc.

19.
The dramatic increase in heterogeneous types of biological data, in particular the abundance of new protein sequences, requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity (GPCRs and kinases from humans, and the crotonase superfamily of enzymes), we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.

20.
Chemical crosslinking‐mass spectrometry (XL‐MS) is a valuable technique for gaining insights into protein structure and the organization of macromolecular complexes. XL‐MS data yield inter‐residue restraints that can be compared with high‐resolution structural data. Distances greater than the crosslinker spacer‐arm can reveal lowly populated “excited” states of proteins/protein assemblies, or crosslinks can be used as restraints to generate structural models in the absence of structural data. Despite increasing uptake of XL‐MS, there are few tools to enable rapid and facile mapping of XL‐MS data onto high‐resolution structures or structural models. PyXlinkViewer is a user‐friendly plugin for PyMOL v2 that maps intra‐protein, inter‐protein, and dead‐end crosslinks onto protein structures/models and automates the calculation of inter‐residue distances for the detected crosslinks. This enables rapid visualization of XL‐MS data, assessment of whether a set of detected crosslinks is congruent with structural data, and easy production of high‐quality images for publication.
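
The distance check described in this abstract, comparing inter‐residue distances for detected crosslinks against a maximum span, can be sketched in a few lines of Python. The sketch does not use the PyXlinkViewer or PyMOL APIs: the function name `crosslink_distances`, the (chain, residue number) keys, the coordinates, and the cutoff value passed by the caller are all hypothetical and serve only as an illustration.

```python
import math


def crosslink_distances(coords, crosslinks, max_distance):
    """Measure each crosslink and flag whether it fits within max_distance.

    coords: dict mapping a residue key, e.g. ("A", 10), to an (x, y, z) tuple
            (in this sketch, one representative atom per residue).
    crosslinks: iterable of (residue_key_a, residue_key_b) pairs.
    max_distance: cutoff in the same units as the coordinates (an assumption
                  supplied by the caller, not a recommended value).
    Returns a list of (residue_a, residue_b, distance, satisfied) tuples.
    """
    results = []
    for res_a, res_b in crosslinks:
        distance = math.dist(coords[res_a], coords[res_b])
        results.append((res_a, res_b, distance, distance <= max_distance))
    return results
```

Crosslinks flagged as unsatisfied are the candidates the abstract describes as possibly reporting on lowly populated states rather than on the modelled structure.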
