Similar literature
 20 similar articles found
1.
We report here a novel method for predicting melting temperatures of DNA sequences based on a molecular-level hypothesis on the phenomena underlying the thermal denaturation of DNA. The model presented here attempts to quantify the energetic components stabilizing the structure of DNA such as base pairing, stacking, and ionic environment which are partially disrupted during the process of thermal denaturation. The model gives a Pearson product-moment correlation coefficient (r) of ∼0.98 between experimental and predicted melting temperatures for over 300 sequences of varying lengths ranging from 15-mers to genomic level and at different salt concentrations. The approach is implemented as a web tool (www.scfbio-iitd.res.in/chemgenome/Tm_predictor.jsp) for the prediction of melting temperatures of DNA sequences.  相似文献   
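The agreement statistic quoted in this abstract, the Pearson product-moment correlation coefficient (r), can be computed as in the minimal sketch below. The function name `pearson_r` and the melting temperatures are hypothetical, illustrative values only; they are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical melting temperatures in degrees Celsius (illustrative only).
experimental_tm = [55.0, 62.5, 70.1, 81.3]
predicted_tm = [56.2, 61.0, 71.5, 80.0]
print(round(pearson_r(experimental_tm, predicted_tm), 3))
```

A value near 1 indicates that predicted and experimental temperatures rise and fall together; it does not by itself show that the absolute errors are small.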

2.
Arriving at the native conformation of a polypeptide chain characterized by minimum most free energy is a problem of long standing interest in protein structure prediction endeavors. Owing to the computational requirements in developing free energy estimates, scoring functions—energy based or statistical—have received considerable renewed attention in recent years for distinguishing native structures of proteins from non-native like structures. Several cleverly designed decoy sets, CASP (Critical Assessment of Techniques for Protein Structure Prediction) structures and homology based internet accessible three dimensional model builders are now available for validating the scoring functions. We describe here an all-atom energy based empirical scoring function and examine its performance on a wide series of publicly available decoys. Barring two protein sequences where native structure is ranked second and seventh, native is identified as the lowest energy structure in 67 protein sequences from among 61,659 decoys belonging to 12 different decoy sets. We further illustrate a potential application of the scoring function in bracketing native-like structures of two small mixed alpha/beta globular proteins starting from sequence and secondary structural information. The scoring function has been web enabled at www.scfbio-iitd.res.in/utility/proteomics/energy.jsp  相似文献   

3.
Arriving at the native conformation of a polypeptide chain characterized by minimum most free energy is a problem of long standing interest in protein structure prediction endeavors. Owing to the computational requirements in developing free energy estimates, scoring functions--energy based or statistical--have received considerable renewed attention in recent years for distinguishing native structures of proteins from non-native like structures. Several cleverly designed decoy sets, CASP (Critical Assessment of Techniques for Protein Structure Prediction) structures and homology based internet accessible three dimensional model builders are now available for validating the scoring functions. We describe here an all-atom energy based empirical scoring function and examine its performance on a wide series of publicly available decoys. Barring two protein sequences where native structure is ranked second and seventh, native is identified as the lowest energy structure in 67 protein sequences from among 61,659 decoys belonging to 12 different decoy sets. We further illustrate a potential application of the scoring function in bracketing native-like structures of two small mixed alpha/beta globular proteins starting from sequence and secondary structural information. The scoring function has been web enabled at www.scfbio-iitd.res.in/utility/proteomics/energy.jsp.  相似文献   

4.
Specification of the three dimensional structure of a protein from its amino acid sequence, also called a “Grand Challenge” problem, has eluded a solution for over six decades. A modestly successful strategy has evolved over the last couple of decades based on development of scoring functions (e.g. mimicking free energy) that can capture native or native-like structures from an ensemble of decoys generated as plausible candidates for the native structure. A scoring function must be fast enough in discriminating the native from unfolded/misfolded structures, and requires validation on large data sets to generate sufficient confidence in the score. Here we develop a scoring function called pcSM that detects the true native structure in the top 5 with 93% accuracy from an ensemble of candidate structures. If we eliminate the native from the ensemble of decoys, then pcSM is able to capture a near-native structure (RMSD ≤ 5 Å) in the top 10 with 86% accuracy. The parameters considered in pcSM are a C-alpha Euclidean metric, secondary structural propensity, surface areas and an intramolecular energy function. pcSM has been tested on 415 systems consisting of 142,698 decoys (public and CASP; the largest reported hitherto in the literature). The average rank for the native is 2.38, a significant improvement over that existing in the literature. In-silico protein structure prediction requires robust scoring techniques, and pcSM is easily amenable to integration into a successful protein structure prediction strategy. The tool is freely available at http://www.scfbio-iitd.res.in/software/pcsm.jsp.

5.
Song J  Tan H  Wang M  Webb GI  Akutsu T 《PloS one》2012,7(2):e30361
Protein backbone torsion angles (Phi) and (Psi) involve two rotation angles rotating around the C(α)-N bond (Phi) and the C(α)-C bond (Psi). Due to the planarity of the linked rigid peptide bonds, these two angles can essentially determine the backbone geometry of proteins. Accordingly, the accurate prediction of protein backbone torsion angle from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict the protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including the evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered region as well as other global sequence features. When evaluated based on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle prediction are 27.8° and 44.6°, respectively, which are 1% and 3% respectively lower than that using one of the state-of-the-art prediction tools ANGLOR. Moreover, the prediction of TANGLE is significantly better than a random predictor that was built on the amino acid-specific basis, with the p-value<1.46e-147 and 7.97e-150, respectively by the Wilcoxon signed rank test. As a complementary approach to the current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and assisting protein fold recognition by applying the predicted torsion angles as useful restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/.  相似文献   
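The mean absolute error (MAE) reported for torsion angles can be sketched as follows. This assumes the usual convention that angle differences are taken the short way around the circle, so that 170° and -170° are 20° apart; the abstract does not state which convention TANGLE uses, and the function names are hypothetical.

```python
def angular_error(pred_deg, true_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


def angular_mae(predicted, observed):
    """Mean absolute angular error over paired lists of angles in degrees."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(angular_error(p, t) for p, t in zip(predicted, observed))
    return total / len(predicted)


# Illustrative values only: the first pair straddles the +/-180 degree boundary.
print(angular_mae([170.0, -60.0], [-170.0, -50.0]))
```

Without the wrap-around step, the first pair would contribute an error of 340° instead of 20°, which would inflate the MAE for residues near the periodic boundary.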

6.
Protein folding is the process by which a protein proceeds from its denatured state to its specific biologically active conformation. Understanding the relationship between sequences and the folding rates of proteins remains an important challenge. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. In this study, the long‐range and short‐range contacts in proteins were used to derive an extended version of the pseudo amino acid composition based on a sliding window method. This method is capable of predicting protein folding rates from the amino acid sequence alone, without the aid of any structural class information. We systematically studied the contributions of individual features to folding rate prediction. The optimal features were selected by combining forward feature selection and sequential backward selection. The method was evaluated on a large dataset using the jackknife cross validation test. The predictor was built on numerous physicochemical and statistical features of proteins using a nonlinear support vector machine (SVM) regression model, and it obtained excellent agreement between predicted and experimentally observed folding rates of proteins. The correlation coefficient is 0.9313 and the standard error is 2.2692. The prediction server is freely available at http://www.jci‐bioinfo.cn/swfrate/input.jsp . Proteins 2013. © 2012 Wiley Periodicals, Inc.

7.
We describe here an energy based computer software suite for narrowing down the search space of tertiary structures of small globular proteins. The protocol comprises eight different computational modules that form an automated pipeline. It combines physics based potentials with biophysical filters to arrive at 10 plausible candidate structures starting from sequence and secondary structure information. The methodology has been validated here on 50 small globular proteins consisting of 2-3 helices and strands with known tertiary structures. For each of these proteins, a structure within 3-6 Å RMSD (root mean square deviation) of the native has been obtained in the 10 lowest energy structures. The protocol has been web enabled and is accessible at http://www.scfbio-iitd.res.in/bhageerath.

8.
Root-mean-square deviation (RMSD) of computationally derived protein structures from experimentally determined structures is a critical index for assessing protein structure prediction algorithms (PSPAs). The development of PSPAs to obtain 0 Å RMSD from native structures is considered central to computational biology. However, to date it has been quite challenging to measure how far a predicted protein structure is from its native in the absence of a known experimental/native structure. In this work, we report the development of a metric, “D2N” (distance to the native), that predicts the RMSD of any structure without actually knowing the native structure. By combining physico-chemical properties and known universalities in the spatial organization of soluble proteins to develop D2N, we demonstrate the ability to predict the distance of a proposed structure to within ± 1.5 Å error with a remarkable average accuracy of 93.6% for structures below 5 Å from the native. We believe that this work opens up a completely new avenue towards assigning reliable structures to whole proteomes even in the absence of experimentally determined native structures. The D2N tool is freely available at http://www.scfbio-iitd.res.in/software/d2n.jsp.

9.
Among secondary structure elements, beta-turns are a ubiquitous and major feature of bioactive peptides. We analyzed 77 biologically active peptides with lengths varying from 9 to 20 residues. Out of 77 peptides, 58 were found to contain at least one beta-turn. Further, at the residue level, 34.9% of total peptide residues were found to be in beta-turns, higher than the percentage of helical (32.3%) and beta-sheet residues (6.9%). We therefore utilized the predicted beta-turn information to develop an improved method for predicting the three-dimensional (3D) structure of small peptides. We built four different structural models for each peptide. The first, 'model I', was built by assigning all the peptide residues an extended conformation (phi = psi = 180 degrees). The second, 'model II', was built using the information of regular secondary structures (helices, beta-strands and coil) predicted from PSIPRED. In the third, 'model III', secondary structure information including beta-turn types predicted by the BetaTurns method was used. The fourth, 'model IV', had the main-chain phi, psi angles of model III and side chain angles assigned using the standard Dunbrack backbone-dependent rotamer library. These models were further refined using the AMBER package and the resultant C(alpha) rmsd values were calculated. It was found that adding the beta-turns to the regular secondary structures greatly reduces the rmsd values both before and after energy minimization. Hence, the results indicate that regular and irregular secondary structures, particularly beta-turns, can provide valuable information for the tertiary structure prediction of small bioactive peptides. Based on the above study, a web server, PEPstr (http://www.imtech.res.in/raghava/pepstr/), was developed for predicting the tertiary structure of small bioactive peptides.

10.
Kuhn M  Meiler J  Baker D 《Proteins》2004,54(2):282-288
Beta-sheet proteins have been particularly challenging for de novo structure prediction methods, which tend to pair adjacent beta-strands into beta-hairpins and produce overly local topologies. To remedy this problem and facilitate de novo prediction of beta-sheet protein structures, we have developed a neural network that classifies strand-loop-strand motifs by local hairpins and nonlocal diverging turns by using the amino acid sequence as input. The neural network is trained with a representative subset of the Protein Data Bank and achieves a prediction accuracy of 75.9 +/- 4.4% compared to a baseline prediction rate of 59.1%. Hairpins are predicted with an accuracy of 77.3 +/- 6.1%, diverging turns with an accuracy of 73.9 +/- 6.0%. Incorporation of the beta-hairpin/diverging turn classification into the ROSETTA de novo structure prediction method led to higher contact order models and somewhat improved tertiary structure predictions for a test set of 11 all-beta-proteins and 3 alphabeta-proteins. The beta-hairpin/diverging turn classification from amino acid sequences is available online for academic use (Meiler and Kuhn, 2003; www.jens-meiler.de/turnpred.html).  相似文献   

11.
MOTIVATION: Most secondary structure prediction programs target only alpha helix and beta sheet structures and summarize all other structures in the random coil pseudo class. However, such an assignment often ignores existing local ordering in so-called random coil regions. Signatures for such ordering are distinct dihedral angle patterns. For this reason, we propose as an alternative approach to predict dihedral regions directly for each residue, as this leads to a higher amount of structural information. RESULTS: We propose a multi-step support vector machine (SVM) procedure, dihedral prediction (DHPRED), to predict the dihedral angle state of residues from sequence. Trained on 20,000 residues, our approach leads to dihedral region predictions whose accuracy, in regions without alpha helices or beta sheets, is higher than that of secondary structure prediction programs. AVAILABILITY: DHPRED has been implemented as a web service, which academic researchers can access from our webpage http://www.fz-juelich.de/nic/cbb

12.
Chao Fang  Yi Shang  Dong Xu 《Proteins》2018,86(5):592-598
Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception‐inside‐inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD‐SS. The input to MUFOLD‐SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acid, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physio‐chemical properties of amino acids, PSI‐BLAST profile, and HHBlits profile. MUFOLD‐SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD‐SS enables effective processing of local and global interactions between amino acids in making accurate prediction. In extensive experiments on multiple datasets, MUFOLD‐SS outperformed the best existing methods and other deep neural networks significantly. MUFold‐SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html .  相似文献   

13.
The loops which connect or flank helices/sheets in protein structures are known to be functionally important. However, ironically they also belong to the part of protein whose structure is least accurately predicted. Here, a new method to isolate and analyze loop regions in protein structure is proposed using the spatial coordinates of the solved three‐dimensional structure. The extent of dispersion among points of successive amino acid residues in the Ramachandran map of protein region is utilized to calculate the Mean Separation between these points in the Ramachandran Plot (MSRP). Based on analysis of 2935 protein secondary structure regions obtained using DSSP software, spanning a range from 2 to 64 residues, taken from a set of 170 proteins, it is shown that helices (MSRP < 17) and strands (MSRP < 64) stand effectively demarcated from the loop regions (MSRP > 130). Analysis of 43 DNA binding and 98 ligand binding proteins revealed several loop regions with clear change in MSRP subsequent to binding. The population of such loops correlated with the magnitude of backbone displacement in the protein subsequent to binding. Can changes in MSRP quantify the temporal oscillations in dihedral angles among structured/unstructured regions in proteins? Molecular dynamics simulations (10 ns) revealed that deviations in MSRP among different snapshots in the trajectory were at least twofold higher for unstructured proteins in comparison with ordered proteins. The above results validate the use of MSRP parameter as a tool to identify and investigate functionally active loops and unstructured regions in protein structures. Proteins 2010. © 2009 Wiley‐Liss, Inc.  相似文献   
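The MSRP quantity described in this abstract can be illustrated with the minimal sketch below. It assumes that MSRP is the mean Euclidean distance, in degrees, between the (phi, psi) points of successive residues in a region, and it does not wrap angles across the ±180° boundary; the abstract does not give the exact formula, so both choices are assumptions, and the function name `msrp` is hypothetical.

```python
import math


def msrp(phi_psi):
    """Mean separation between successive (phi, psi) points, in degrees.

    phi_psi is a list of (phi, psi) tuples for consecutive residues of one
    region. Plain Euclidean distance is used, with no periodic wrapping.
    """
    if len(phi_psi) < 2:
        raise ValueError("need at least two residues")
    separations = [
        math.hypot(phi2 - phi1, psi2 - psi1)
        for (phi1, psi1), (phi2, psi2) in zip(phi_psi, phi_psi[1:])
    ]
    return sum(separations) / len(separations)


# Illustrative helix-like region: successive points cluster tightly,
# so the mean separation is small.
helix_like = [(-60.0, -45.0), (-60.0, -45.0), (-57.0, -41.0)]
print(msrp(helix_like))
```

Under this reading, regular elements give small values because consecutive residues occupy nearly the same point of the Ramachandran map, while loops scatter across it and give large values, which is consistent with the thresholds quoted in the abstract.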

14.
The ff94 force field that is commonly associated with the Amber simulation package is one of the most widely used parameter sets for biomolecular simulation. After a decade of extensive use and testing, limitations in this force field, such as over-stabilization of alpha-helices, were reported by us and other researchers. This led to a number of attempts to improve these parameters, resulting in a variety of "Amber" force fields and significant difficulty in determining which should be used for a particular application. We show that several of these continue to suffer from inadequate balance between different secondary structure elements. In addition, the approach used in most of these studies neglected to account for the existence in Amber of two sets of backbone phi/psi dihedral terms. This led to parameter sets that provide unreasonable conformational preferences for glycine. We report here an effort to improve the phi/psi dihedral terms in the ff99 energy function. Dihedral term parameters are based on fitting the energies of multiple conformations of glycine and alanine tetrapeptides from high level ab initio quantum mechanical calculations. The new parameters for backbone dihedrals replace those in the existing ff99 force field. This parameter set, which we denote ff99SB, achieves a better balance of secondary structure elements as judged by improved distribution of backbone dihedrals for glycine and alanine with respect to PDB survey data. It also accomplishes improved agreement with published experimental data for conformational preferences of short alanine peptides and better accord with experimental NMR relaxation data of test protein systems.  相似文献   

15.
Tertiary structure prediction of a protein from its amino acid sequence is one of the major challenges in the field of bioinformatics. The hierarchical approach is one of the persuasive techniques used for predicting protein tertiary structure, especially in the absence of homologous protein structures. In the hierarchical approach, intermediate states such as secondary structure, dihedral angles and Cα-Cα distance bounds are predicted. These intermediate states are used to restrain the protein backbone and assist its correct folding. In recent years, several methods have been developed for predicting the dihedral angles of a protein, but it is difficult to conclude which method is better than the others. In this study, we benchmarked the performance of the dihedral prediction methods ANGLOR and SPINE X on various datasets, including independent datasets. The TANGLE dihedral prediction method was not benchmarked (its standalone version being unavailable) and was compared with SPINE X and ANGLOR only on the ANGLOR dataset, on which TANGLE has reported its results. It was observed that SPINE X performed better than ANGLOR and TANGLE, especially in the prediction of dihedral angles of glycine and proline residues. The analysis suggested that angle shifting was the foremost reason for the better performance of SPINE X. We further evaluated the performance of the methods on the independent ccPDB30 dataset and observed that SPINE X performed better than ANGLOR.

16.
For naturally occurring proteins, similar sequence implies similar structure. Consequently, multiple sequence alignments (MSAs) often are used in template‐based modeling of protein structure and have been incorporated into fragment‐based assembly methods. Our previous homology‐free structure prediction study introduced an algorithm that mimics the folding pathway by coupling the formation of secondary and tertiary structure. Moves in the Monte Carlo procedure involve only a change in a single pair of φ,ψ backbone dihedral angles that are obtained from a Protein Data Bank‐based distribution appropriate for each amino acid, conditional on the type and conformation of the flanking residues. We improve this method by using MSAs to enrich the sampling distribution, but in a manner that does not require structural knowledge of any protein sequence (i.e., not homologous fragment insertion). In combination with other tools, including clustering and refinement, the accuracies of the predicted secondary and tertiary structures are substantially improved and a global and position‐resolved measure of confidence is introduced for the accuracy of the predictions. Performance of the method in the Critical Assessment of Structure Prediction (CASP8) is discussed.

17.
ProPred: prediction of HLA-DR binding sites.   Total citations: 22, self-citations: 0, citations by others: 22
ProPred is a graphical web tool for predicting MHC class II binding regions in antigenic protein sequences. The server implements a matrix-based prediction algorithm, employing an amino-acid/position coefficient table deduced from the literature. The predicted binders can be visualized either as peaks in a graphical interface or as colored residues in an HTML interface. This server might be a useful tool in locating the promiscuous binding regions that can bind to several HLA-DR alleles. AVAILABILITY: The server is available at http://www.imtech.res.in/raghava/propred/ CONTACT: raghava@imtech.res.in SUPPLEMENTARY INFORMATION: http://www.imtech.res.in/raghava/propred/page3.html
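A matrix-based prediction of the kind this abstract mentions can be sketched as a sliding-window sum over a position-by-residue coefficient table. Everything concrete below is an assumption made for illustration: the window width of 3, the coefficients, the threshold and the function names are hypothetical and are not ProPred's actual matrices or interface.

```python
def score_windows(sequence, matrix, default=0.0):
    """Score every window of len(matrix) residues against a coefficient table.

    matrix is a list with one dict per window position, mapping a residue
    letter to its coefficient; residues missing from a dict score `default`.
    Returns a list of (start_index, window, score) tuples.
    """
    width = len(matrix)
    results = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[pos].get(res, default) for pos, res in enumerate(window))
        results.append((start, window, score))
    return results


def predicted_binders(sequence, matrix, threshold):
    """Keep only the windows whose summed score reaches the threshold."""
    return [hit for hit in score_windows(sequence, matrix) if hit[2] >= threshold]


# Hypothetical 3-position coefficient table (illustrative values only).
toy_matrix = [
    {"L": 1.0, "A": 0.5},
    {"K": 0.8},
    {"V": 1.2, "A": -0.5},
]
for start, window, score in predicted_binders("ALKVA", toy_matrix, threshold=2.0):
    print(start, window, score)
```

Running the same sequence against one table per allele and intersecting the hits is one plausible way to look for regions that score well across several alleles, although the abstract does not describe how the server does this.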

18.
The model describing the structure and conformational preferences of the HIV-Haiti V3 loop in the geometric spaces of Cartesian coordinates and dihedral angles was generated in terms of NMR spectroscopy data published in literature. To this end, the following successive steps were put into effect: (i) the NMR-based 3D structure for the HIV-Haiti V3 loop in water was built by computer modeling methods; (ii) the conformations of its irregular segments were analyzed and the secondary structure elements identified; and (iii) to reveal a common structural motifs in the HIV-Haiti V3 loop regardless of its environment variability, the simulated structure was collated with the one deciphered previously for the HIV-Haiti V3 loop in a water/trifluoroethanol (TFE) mixed solvent. As a result, the HIV-Haiti V3 loop was found to offer the highly variable fragment of gp120 sensitive to its environment whose changes trigger the large-scale structural rearrangements, bringing in substantial altering the secondary and tertiary structures of this functionally important site of the virus envelope. In spite of this fact, over half of amino acid residues that reside, for the most part, in the functionally important regions of the gp120 protein and may present promising targets for AIDS drug researches, were shown to preserve their conformational states in the structures under review. In particular, the register of these amino acids holds Asn-25 that is critical for the virus binding with primary cell receptor CD4 as well as Arg-3 that is critical for utilization of CCR5 co-receptor and heparan sulfate proteoglycans. The conservative structural motif embracing one of the potential sites of the gp120 N-linked glycosylation was detected, which seems to be a promising target for the HIV-1 drug design. The implications are discussed in conjunction with the literature data on the biological activity of the individual amino acids for the HIV-1 gp120 V3 loop.  相似文献   

19.
The model describing the structure and conformational preferences of the HIV-Haiti V3 loop in the geometric spaces of Cartesian coordinates and dihedral angles was generated in terms of NMR spectroscopy data published in literature. To this end, the following successive steps were put into effect: (i) the NMR-based 3D structure for the HIV-Haiti V3 loop in water was built by computer modeling methods; (ii) the conformations of its irregular segments were analyzed and the secondary structure elements identified; and (iii) to reveal a common structural motifs in the HIV-Haiti V3 loop regardless of its environment variability, the simulated structure was collated with the one deciphered previously for the HIV-Haiti V3 loop in a water/trifluoroethanol (TFE) mixed solvent.

As a result, the HIV-Haiti V3 loop was found to offer the highly variable fragment of gp120 sensitive to its environment whose changes trigger the large-scale structural rearrangements, bringing in substantial altering the secondary and tertiary structures of this functionally important site of the virus envelope. In spite of this fact, over half of amino acid residues that reside, for the most part, in the functionally important regions of the gp120 protein and may present promising targets for AIDS drug researches, were shown to preserve their conformational states in the structures under review. In particular, the register of these amino acids holds Asn-25 that is critical for the virus binding with primary cell receptor CD4 as well as Arg-3 that is critical for utilization of CCR5 co-receptor and heparan sulfate proteoglycans. The conservative structural motif embracing one of the potential sites of the gp120 N-linked glycosylation was detected, which seems to be a promising target for the HIV-1 drug design.

The implications are discussed in conjunction with the literature data on the biological activity of the individual amino acids for the HIV-1 gp120 V3 loop.  相似文献   

20.
In protein structure prediction, it is often the case that a protein segment must be adjusted to connect two fixed segments. This occurs during loop structure prediction in homology modeling as well as in ab initio structure prediction. Several algorithms for this purpose are based on the inverse Jacobian of the distance constraints with respect to dihedral angle degrees of freedom. These algorithms are sometimes unstable and fail to converge. We present an algorithm developed originally for inverse kinematics applications in robotics. In robotics, an end effector in the form of a robot hand must reach for an object in space by altering adjustable joint angles and arm lengths. In loop prediction, dihedral angles must be adjusted to move the C-terminal residue of a segment to superimpose on a fixed anchor residue in the protein structure. The algorithm, referred to as cyclic coordinate descent or CCD, involves adjusting one dihedral angle at a time to minimize the sum of the squared distances between three backbone atoms of the moving C-terminal anchor and the corresponding atoms in the fixed C-terminal anchor. The result is an equation in one variable for the proposed change in each dihedral. The algorithm proceeds iteratively through all of the adjustable dihedral angles from the N-terminal to the C-terminal end of the loop. CCD is suitable as a component of loop prediction methods that generate large numbers of trial structures. It succeeds in closing loops in a large test set 99.79% of the time, and fails occasionally only for short, highly extended loops. It is very fast, closing loops of length 8 in 0.037 sec on average.  相似文献   
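The "equation in one variable" for each dihedral can be illustrated with a minimal sketch of a single CCD step. For a rotation by angle α about a bond axis, the sum of squared distances between the moving atoms and their fixed targets has the form const − 2(a cos α + b sin α), so the minimizing angle is atan2(b, a). The helper names, the tuple-based vectors and the test geometry are assumptions for illustration; this is a sketch of the standard derivation, not the authors' implementation.

```python
import math


def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def add(a, b): return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def scale(a, s): return (a[0] * s, a[1] * s, a[2] * s)
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])
def unit(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))


def rotate(point, origin, axis, angle):
    """Rotate point by angle (radians) about the line through origin along axis."""
    u, v = unit(axis), sub(point, origin)
    c, s = math.cos(angle), math.sin(angle)
    # Rodrigues' rotation formula.
    rotated = add(add(scale(v, c), scale(cross(u, v), s)), scale(u, dot(u, v) * (1.0 - c)))
    return add(origin, rotated)


def optimal_angle(origin, axis, moving, targets):
    """Angle about the axis that minimizes the summed squared distance
    between the rotated moving atoms and their fixed target atoms."""
    u = unit(axis)
    a = b = 0.0
    for m, f in zip(moving, targets):
        foot = add(origin, scale(u, dot(sub(m, origin), u)))  # nearest axis point to m
        r = sub(m, foot)
        r_len = math.sqrt(dot(r, r))
        if r_len < 1e-12:
            continue  # an atom on the axis is not moved by the rotation
        r_hat = scale(r, 1.0 / r_len)
        s_hat = cross(u, r_hat)  # direction the atom moves in at angle zero
        fv = sub(f, foot)
        a += r_len * dot(fv, r_hat)
        b += r_len * dot(fv, s_hat)
    return math.atan2(b, a)


# Illustrative check: targets are the moving atoms rotated by a known angle.
origin, axis = (1.0, 2.0, 3.0), (1.0, 1.0, 0.0)
moving = [(2.0, 0.0, 3.0), (0.0, 2.0, 4.0), (1.0, 1.0, -1.0)]
targets = [rotate(p, origin, axis, 0.7) for p in moving]
print(optimal_angle(origin, axis, moving, targets))
```

A full loop-closure routine would apply this step to each adjustable dihedral in turn, from the N-terminal to the C-terminal end of the loop, and repeat the sweep until the three anchor atoms are close enough to their targets.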
