Similar articles
 20 similar articles found (search time: 31 ms)
1.
MOTIVATION: There are two main areas of difficulty in homology modelling that are particularly important when sequence identity between target and template falls below 50%: sequence alignment and loop building. These problems are magnified in automatic modelling processes, as there is no human input to correct mistakes. We have therefore benchmarked several stand-alone strategies that could be implemented in a workflow for automated high-throughput homology modelling. These include three new sequence-structure alignment programs: 3D-Coffee, Staccato and SAlign, plus five homology modelling programs and their respective loop building methods: Builder, Nest, Modeller, SegMod/ENCAD and Swiss-Model. The SABmark database provided 123 targets with at least five templates from the same SCOP family and sequence identities…

2.
When researchers build high-quality models of protein structure from sequence homology, it is today common to use several alternative target-template alignments. Several methods can, at least in theory, utilize information from multiple templates, and many examples of improved model quality have been reported. However, to our knowledge, thus far no study has shown that automatic inclusion of multiple alignments is guaranteed to improve models without artifacts. Here, we have carried out a systematic investigation of the potential of multiple templates to improve homology model quality. We have used test sets consisting of targets from both recent CASP experiments and a larger reference set. In addition to Modeller and Nest, a new method (Pfrag) for multiple template-based modeling is used, based on the segment-matching algorithm from Levitt's SegMod program. Our results show that all programs can produce multi-template models better than any of the single-template models, but a large part of the improvement is simply due to extension of the models. Most of the remaining improved cases were produced by Modeller. The most important factor is the existence of high-quality single-sequence input alignments. Because of the existence of models that are worse than any of the top single-template models, the average model quality does not improve significantly. However, by ranking models with a model quality assessment program such as ProQ, the average quality is improved by approximately 5% in the CASP7 test set.

3.
SCWRL and MolIDE are software applications for prediction of protein structures. SCWRL is designed specifically for the task of prediction of side-chain conformations given a fixed backbone usually obtained from an experimental structure determined by X-ray crystallography or NMR. SCWRL is a command-line program that typically runs in a few seconds. MolIDE provides a graphical interface for basic comparative (homology) modeling using SCWRL and other programs. MolIDE takes an input target sequence and uses PSI-BLAST to identify and align templates for comparative modeling of the target. The sequence alignment to any template can be manually modified within a graphical window of the target-template alignment and visualization of the alignment on the template structure. MolIDE builds the model of the target structure on the basis of the template backbone, predicted side-chain conformations with SCWRL and a loop-modeling program for insertion-deletion regions with user-selected sequence segments. SCWRL and MolIDE can be obtained at (http://dunbrack.fccc.edu/Software.php).

4.
In this study, we investigate the extent to which techniques for homology modeling that were developed for water-soluble proteins are appropriate for membrane proteins as well. To this end we present an assessment of current strategies for homology modeling of membrane proteins and introduce a benchmark data set of homologous membrane protein structures, called HOMEP. First, we use HOMEP to reveal the relationship between sequence identity and structural similarity in membrane proteins. This analysis indicates that homology modeling is at least as applicable to membrane proteins as it is to water-soluble proteins and that acceptable models (with Cα-RMSD values to the native of 2 Å or less in the transmembrane regions) may be obtained for template sequence identities of 30% or higher if an accurate alignment of the sequences is used. Second, we show that secondary-structure prediction algorithms that were developed for water-soluble proteins perform approximately as well for membrane proteins. Third, we provide a comparison of a set of commonly used sequence alignment algorithms as applied to membrane proteins. We find that high-accuracy alignments of membrane protein sequences can be obtained using state-of-the-art profile-to-profile methods that were developed for water-soluble proteins. Improvements are observed when weights derived from the secondary structure of the query and the template are used in the scoring of the alignment, a result which relies on the accuracy of the secondary-structure prediction of the query sequence. The most accurate alignments were obtained using template profiles constructed with the aid of structural alignments. In contrast, a simple sequence-to-sequence alignment algorithm, using a membrane protein-specific substitution matrix, shows no improvement in alignment accuracy. We suggest that profile-to-profile alignment methods should be adopted to maximize the accuracy of homology models of membrane proteins.
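
The Cα-RMSD criterion quoted in this abstract (2 Å or less to the native) can be illustrated with a minimal sketch. It assumes the two structures are already superposed and that atoms are listed in matching order; the function names `ca_rmsd` and `is_acceptable` are hypothetical and are not taken from the paper.

```python
import math


def ca_rmsd(model, native):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumption: the structures are already superposed, and the i-th atom
    of `model` corresponds to the i-th atom of `native`.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(model, native):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(model))


def is_acceptable(model, native, cutoff=2.0):
    """True if the C-alpha RMSD is at or below the cutoff (in angstroms)."""
    return ca_rmsd(model, native) <= cutoff
```

In practice the superposition step (for example a least-squares fit) comes first; it is left out here to keep the sketch short.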

5.
ESyPred3D: Prediction of proteins 3D structures   Cited by: 1 (self-citations: 0; citations by others: 1)
MOTIVATION: Homology or comparative modeling is currently the most accurate method to predict the three-dimensional structure of proteins. It generally consists of four steps: (1) databank searching to identify a structural homolog, (2) target-template alignment, (3) model building and optimization, and (4) model evaluation. The target-template alignment step is generally accepted as the most critical step in homology modeling. RESULTS: We present here ESyPred3D, a new automated homology modeling program. The method benefits from the improved alignment performance of a new alignment strategy. Alignments are obtained by combining, weighting and screening the results of several multiple alignment programs. The final three-dimensional structure is built using the modeling package MODELLER. ESyPred3D was tested on 13 targets in the CASP4 experiment (Critical Assessment of Techniques for Protein Structure Prediction). Our alignment strategy obtains better results than PSI-BLAST alignments, and ESyPred3D alignments are among the most accurate compared with those of participants who used the same template. AVAILABILITY: ESyPred3D is available through its web site at http://www.fundp.ac.be/urbm/bioinfo/esypred/ CONTACT: christophe.lambert@fundp.ac.be; http://www.fundp.ac.be/~lambertc

6.
Even when there is agreement on what measure a protein multiple structure alignment should be optimizing, finding the optimal alignment is computationally prohibitive. One approach used by many previous methods is aligned fragment pair chaining, where short structural fragments from all the proteins are aligned against each other optimally, and the final alignment chains these together in geometrically consistent ways. Ye and Godzik have recently suggested that adding geometric flexibility may help better model protein structures in a variety of contexts. We introduce the program Matt (Multiple Alignment with Translations and Twists), an aligned fragment pair chaining algorithm that, in intermediate steps, allows local flexibility between fragments: small translations and rotations are temporarily allowed to bring sets of aligned fragments closer, even if they are physically impossible under rigid body transformations. After a dynamic programming assembly guided by these “bent” alignments, geometric consistency is restored in the final step before the alignment is output. Matt is tested against other recent multiple protein structure alignment programs on the popular Homstrad and SABmark benchmark datasets. Matt's global performance is competitive with the other programs on Homstrad, but outperforms the other programs on SABmark, a benchmark of multiple structure alignments of proteins with more distant homology. On both datasets, Matt demonstrates an ability to better align the ends of α-helices and β-strands, an important characteristic of any structure alignment program intended to help construct a structural template library for threading approaches to the inverse protein-folding problem. The related question of whether Matt alignments can be used to distinguish distantly homologous structure pairs from pairs of proteins that are not homologous is also considered. 
For this purpose, a p-value score based on the length of the common core and average root mean squared deviation (RMSD) of Matt alignments is shown to largely separate decoys from homologous protein structures in the SABmark benchmark dataset. We postulate that Matt's strong performance comes from its ability to model proteins in different conformational states and, perhaps even more important, its ability to model backbone distortions in more distantly related proteins.

7.
Knowledge of the three-dimensional structures of protein targets from genomic data has the potential to accelerate research in drug discovery. The human β2 adrenergic receptor is a G-protein-coupled receptor with seven transmembrane helices, and is an important pharmaceutical target in pulmonary and cardiovascular diseases. It has been found to play a very important role in the pathogenesis of high altitude pulmonary edema (HAPE). In the present study, a high-quality 3D protein structure was predicted for the human β2 adrenergic receptor sequence with primary accession number P07550. A homologous template protein sequence with known 3D structure was identified, and the template-query sequence pairing was validated by multiple sequence alignment. The homology model was built with Modeller and depended on the quality of the BLAST sequence alignment, the template structure and the consolidated result from the GeneSilico meta-server. The generated model was evaluated with PROCHECK, which showed the structure modeled through Modeller to be of good quality, with 84.1% of residues in the most favored region. Docking studies were carried out after modeling with two well-known ligands, Salmeterol and Nifedipine, and Salmeterol had a higher fitness score than Nifedipine. Estimation of binding affinity by X-Score gave −10.40 for Salmeterol and −9.62 for Nifedipine. From the present study, it can be concluded that the generated model of the human β2 adrenergic receptor can be used for further studies related to this receptor, and that Salmeterol has a high binding affinity for the human β2 adrenergic receptor.

8.
MOTIVATION: Even the best sequence alignment methods frequently fail to correctly identify the framework regions for which backbones can be copied from the template into the target structure. Since the underprediction and, more significantly, the overprediction of these regions reduces the quality of the final model, it is of prime importance to attain as much as possible of the true structural alignment between target and template. RESULTS: We have developed an algorithm called Consensus that consistently provides a high quality alignment for comparative modeling. The method follows from a benchmark analysis of the 3D models generated by ten alignment techniques for a set of 79 homologous protein structure pairs. For 20 to 40% of the targets, these methods yield models with at least 6 Å root mean square deviation (RMSD) from the native structure. We have selected the top five performing methods, and developed a consensus algorithm to generate an improved alignment. By building on the individual strength of each method, a set of criteria was implemented to remove the alignment segments that are likely to correspond to structurally dissimilar regions. The automated algorithm was validated on a different set of 48 protein pairs, resulting in 2.2 Å average RMSD for the predicted models, and only four cases in which the RMSD exceeded 3 Å. The average length of the alignments was about 75% of that found by standard structural superposition methods. The performance of Consensus was consistent from 2 to 32% target-template sequence identity, and hence it can be used for accurate prediction of framework regions in homology modeling.
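
The abstract does not give the criteria Consensus uses to discard alignment segments. As a rough illustration of the general idea only, the sketch below applies a plain majority vote: an aligned residue pair is kept when enough of the input alignments agree on it. The representation of an alignment as a list of (target_index, template_index) pairs and the name `consensus_pairs` are assumptions made for this example, not the published algorithm.

```python
from collections import Counter


def consensus_pairs(alignments, min_votes):
    """Return aligned residue pairs supported by at least `min_votes` alignments.

    Each alignment is an iterable of (target_index, template_index) pairs.
    This is a simple voting scheme used for illustration; it is not the
    set of criteria described for the Consensus program.
    """
    votes = Counter()
    for alignment in alignments:
        # A set guards against counting a pair twice within one alignment.
        votes.update(set(alignment))
    return sorted(pair for pair, count in votes.items() if count >= min_votes)
```

With five input methods, `min_votes=3` would keep only pairs that a majority agrees on, so the result is typically shorter than any single alignment, which is in line with the reduced alignment length reported above.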

9.
The F420-dependent NADP oxidoreductase enzyme from Methanobrevibacter smithii catalyzes an important electron transfer step during methanogenesis. It may therefore be a potential target for blocking methane formation. Its protein sequence is available in GenBank (accession number: ABQ86254.1); however, no report of its 3D protein structure has been found. In this work, we present for the first time a 3D model of the F420-dependent NADP oxidoreductase enzyme from Methanobrevibacter smithii, built by comparative homology modeling. Swiss-Model and ESyPred3D (via Modeller 6v2) generated the 3D models using 1JAX (chain A) as the template, with sequence identities of 34.272% and 35.40%. PROCHECK with Ramachandran plot and ProSA analysis showed that Swiss-Model produced a better model than Modeller 6v2, with 98.90% of residues in favored and additional allowed regions (Ramachandran plot) and a ProSA Z-score of -7.26. In addition, we found that the substrate F420 bound in the cavity of the model. An inhibitor prediction study then showed that Lovastatin (-22.07 kcal/mol) and Compactin (Mevastatin) (-21.91 kcal/mol) had higher affinity for the model structure of NADP oxidoreductase than F420 (-14.40 kcal/mol). This indicates that Lovastatin and Compactin (Mevastatin), as negative regulators, may act as potential inhibitors of the F420-dependent NADP oxidoreductase protein.

10.
Homology models of amidase-03 from Bacillus anthracis were constructed using Modeller (9v2). Modeller constructs protein models using an automated approach for comparative protein structure modeling by the satisfaction of spatial restraints. A template structure of Listeria monocytogenes bacteriophage PSA endolysin PlyPSA (PDB ID: 1XOV) was selected from the Protein Data Bank (PDB) using BLASTp with the BLOSUM62 sequence alignment scoring matrix. We generated five models using the Modeller default routine in which initial coordinates are randomized and evaluated by pseudo-energy parameters. The protein models were validated using PROCHECK and energy minimized using the steepest descent method in GROMACS 3.2 (flexible SPC water model in cubic box of size 1 Å instead of rigid SPC model). We used the G43a1 force field in GROMACS for energy calculations, and the generated structure was subsequently analyzed using the VMD software for stereochemistry, atomic clashes and misfolding. A detailed analysis of the amidase-03 model structure from Bacillus anthracis will provide insight into the molecular design of suitable inhibitors as drug candidates.

11.
SUMMARY: NdPASA is a web server specifically designed to optimize sequence alignment between distantly related proteins. The program integrates structure information of the template sequence into a global alignment algorithm by employing neighbor-dependent propensities of amino acids as a unique parameter for alignment. NdPASA optimizes alignment by evaluating the likelihood of a residue pair in the query sequence matching against a corresponding residue pair adopting a particular secondary structure in the template sequence. NdPASA is most effective in aligning homologous proteins sharing a low percentage of sequence identity. The server is designed to aid homologous protein structure modeling. A PSI-BLAST search engine was implemented to help users identify template candidates that are most appropriate for modeling the query sequences.

12.
Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.

13.
SUMMARY: Sequence-structure alignments are a common means for protein structure prediction in the fields of fold recognition and homology modeling, and there is a broad variety of programs that provide such alignments based on sequence similarity, secondary structure or contact potentials. Nevertheless, finding the best sequence-structure alignment in a pool of alignments remains a difficult problem. QUASAR (quality of sequence-structure alignments ranking) provides a unifying framework for scoring sequence-structure alignments that aids in finding well-performing combinations of well-known and custom-made scoring schemes. Those scoring functions can be benchmarked against widely accepted quality scores like MaxSub, TMScore, Touch and APDB, thus enabling users to test their own alignment scores against 'standard-of-truth' structure-based scores. Furthermore, individual score combinations can be optimized with respect to benchmark sets based on known structural relationships using QUASAR's in-built optimization routines.

14.
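[Objective] To study the structure and function of the xylitol dehydrogenase gene of Aspergillus oryzae. [Methods] The xylitol dehydrogenase (XDH) gene from Aspergillus oryzae was cloned and sequenced. The tertiary structure of XDH was modeled with Swiss-MODEL and Modeller, and the four resulting target models were evaluated with PROCHECK and Prosa2003 to select the best one. On the basis of the homology model, the cofactors were docked using the molecular docking software Molsoft ICM-Pro, and the residues involved in the interaction of XDH with NAD+ and Zn2+ were predicted. Possible active pockets for binding of the substrate xylitol to XDH were sought, the docking of xylitol to XDH was simulated with Molsoft, and the key amino acid residues involved in the enzyme-substrate interaction were predicted. [Results] Structural analysis showed that Aspergillus oryzae XDH contains the zinc signature of the alcohol dehydrogenase family and a coenzyme-binding domain with the typical alcohol dehydrogenase Rossmann fold, and belongs to the medium-chain dehydrogenase (MDR) family. The docking study predicted that the amino acids forming hydrogen bonds with NAD+ are Asp206, Arg211, Ser255, Ser301 and Arg303, located in the binding domain; those forming hydrogen bonds with Zn2+ are His72 and Glu73, located in the catalytic domain; and those forming hydrogen bonds with the natural substrate xylitol are Ile46, Ile349, Lys350 and Thr351, located in the catalytic domain. [Conclusion] The information obtained is of great significance for the directed modification of the XDH molecule and for expanding the industrial applications of Aspergillus oryzae.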

15.
Measuring the accuracy of protein three-dimensional structures is one of the most important problems in protein structure prediction. For structure-based drug design, the accuracy of the binding site is far more important than the accuracy of any other region of the protein. We have developed an automated method for assessing the quality of a protein model by focusing on the set of residues in the small molecule binding site. Small molecule binding sites typically involve multiple regions of the protein coming together in space, and their accuracy has been observed to be sensitive to even small alignment errors. In addition, ligand binding sites contain the critical information required for drug design, making their accuracy particularly important. We analyzed the accuracy of the binding sites on two sets of protein models: the predictions submitted by the top-performing CASP7 groups, and the models generated by four widely used homology modeling packages. The results of our CASP7 analysis significantly differ from the previous findings, implying that the binding site measure does not correlate with the traditional model quality measures used in the structure prediction benchmarks. For the modeling programs, the resolution of binding sites is extremely sensitive to the degree of sequence homology between the query and the template, even when the most accurate alignments are used in the homology modeling process.

16.
John B, Sali A. Nucleic Acids Research, 2003, 31(14): 3982-3992
Comparative or homology protein structure modeling is severely limited by errors in the alignment of a modeled sequence with related proteins of known three-dimensional structure. To ameliorate this problem, we have developed an automated method that optimizes both the alignment and the model implied by it. This task is achieved by a genetic algorithm protocol that starts with a set of initial alignments and then iterates through re-alignment, model building and model assessment to optimize a model assessment score. During this iterative process: (i) new alignments are constructed by application of a number of operators, such as alignment mutations and cross-overs; (ii) comparative models corresponding to these alignments are built by satisfaction of spatial restraints, as implemented in our program MODELLER; (iii) the models are assessed by a variety of criteria, partly depending on an atomic statistical potential. When testing the procedure on a very difficult set of 19 modeling targets sharing only 4–27% sequence identity with their template structures, the average final alignment accuracy increased from 37 to 45% relative to the initial alignment (the alignment accuracy was measured as the percentage of positions in the tested alignment that were identical to the reference structure-based alignment). Correspondingly, the average model accuracy increased from 43 to 54% (the model accuracy was measured as the percentage of the Cα atoms of the model that were within 5 Å of the corresponding Cα atoms in the superposed native structure). The present method also compares favorably with two of the most successful previously described methods, PSI-BLAST and SAM. The accuracy of the final models would be increased further if a better method for ranking of the models were available.
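
The two accuracy measures defined in parentheses in this abstract can be written down directly. The sketch below is an illustration under stated assumptions: alignments are represented as sets of (target_index, template_index) pairs, coordinates are already superposed and listed in matching order, and the function names are hypothetical rather than part of MODELLER.

```python
import math


def alignment_accuracy(tested_pairs, reference_pairs):
    """Percentage of positions in the tested alignment that are identical
    to the reference structure-based alignment."""
    tested = set(tested_pairs)
    if not tested:
        raise ValueError("tested alignment is empty")
    return 100.0 * len(tested & set(reference_pairs)) / len(tested)


def model_accuracy(model_ca, native_ca, cutoff=5.0):
    """Percentage of model C-alpha atoms within `cutoff` angstroms of the
    corresponding native C-alpha atoms (structures assumed superposed)."""
    if len(model_ca) != len(native_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    close = sum(1 for m, n in zip(model_ca, native_ca) if math.dist(m, n) <= cutoff)
    return 100.0 * close / len(model_ca)
```

`math.dist` requires Python 3.8 or later.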

17.

Background  

For successful protein structure prediction by comparative modeling, in addition to identifying a good template protein with known structure, obtaining an accurate sequence alignment between a query protein and a template protein is critical. It has been known that the alignment accuracy can vary significantly depending on our choice of various alignment parameters such as gap opening penalty and gap extension penalty. Because the accuracy of sequence alignment is typically measured by comparing it with its corresponding structure alignment, there is no good way of evaluating alignment accuracy without knowing the structure of a query protein, which is obviously not available at the time of structure prediction. Moreover, there is no universal alignment parameter option that would always yield the optimal alignment.
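
To make the dependence on gap parameters concrete, the following sketch scores a fixed gapped alignment under affine gap penalties. It is an illustration only: the match and mismatch values, the convention that the first gap position costs `gap_open` and each further position costs `gap_extend`, and the function name `score_alignment` are all assumptions made for this example.

```python
def score_alignment(row_a, row_b, match=1.0, mismatch=-1.0,
                    gap_open=-10.0, gap_extend=-0.5):
    """Score two equal-length gapped rows ('-' marks a gap).

    Convention assumed here: the first position of a gap costs `gap_open`,
    and every following position of the same gap costs `gap_extend`.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    score = 0.0
    in_gap_a = in_gap_b = False  # currently inside a gap in row a / row b
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gaps")
        if a == "-":
            score += gap_extend if in_gap_a else gap_open
            in_gap_a, in_gap_b = True, False
        elif b == "-":
            score += gap_extend if in_gap_b else gap_open
            in_gap_a, in_gap_b = False, True
        else:
            score += match if a == b else mismatch
            in_gap_a = in_gap_b = False
    return score
```

For the sequences ACDCEF and ADEF, the alignment with two single-residue gaps (`A-D-EF`) scores higher than the one with a single two-residue gap (`AD--EF`) when `gap_open=-1.0`, while the ranking reverses with `gap_open=-10.0`. Which alignment counts as "optimal" therefore depends on the chosen parameters.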

18.
Homology modeling is a powerful technique that greatly increases the value of experimental structure determination by using the structural information of one protein to predict the structures of homologous proteins. We have previously described a method of homology modeling by satisfaction of spatial restraints (Li et al., Protein Sci 1997;6:956-970). The Homology Modeling Automatically (HOMA) web site is a new tool that uses this method to predict the 3D structure of a target protein based on the sequence alignment of the target protein to a template protein and the structure coordinates of the template. The user is presented with the resulting models, together with an extensive structure validation report providing critical assessments of the quality of the resulting homology models. The homology modeling method employed by HOMA was assessed and validated using twenty-four groups of homologous proteins. Using HOMA, homology models were generated for 510 proteins, including 264 proteins modeled with correct folds and 246 modeled with incorrect folds. Accuracies of these models were assessed by superimposition on the corresponding experimentally determined structures. A subset of these results was compared with parallel studies of modeling accuracy using several other automated homology modeling approaches. Overall, HOMA provides prediction accuracies similar to other state-of-the-art homology modeling methods. We also provide an evaluation of several structure quality validation tools in assessing the accuracy of homology models generated with HOMA. This study demonstrates that Verify3D (Luthy et al., Nature 1992;356:83-85) and ProsaII (Sippl, Proteins 1993;17:355-362) are most sensitive in distinguishing between homology models with correct or incorrect folds. 
For homology models that have the correct fold, the steric conformational energy (including primarily the Van der Waals energy), MolProbity clashscore (Word et al., Protein Sci 2000;9:2251-2259), and the PROCHECK G-factors (Laskowski et al., J Biomol NMR 1996;8:477-486) provide sensitive and consistent methods for assessing accuracy and can distinguish between homology models of higher and lower accuracy. As demonstrated in the accompanying paper (Bhattacharya et al., accompanying paper), combinations of these scores for models generated with HOMA provide a basis for distinguishing low from high accuracy models.

19.
Protein structure alignment methods are essential for many different challenges in protein science, such as the determination of relations between proteins in the fold space or the analysis and prediction of their biological function. A number of different pairwise and multiple structure alignment (MStA) programs have been developed and provided to the community. Prior knowledge of the expected alignment accuracy is desirable for the user of such tools. To retrieve an estimate of the performance of current structure alignment methods, we compiled a test suite taken from literature and the SISYPHUS database consisting of proteins that are difficult to align. Subsequently, different MStA programs were evaluated regarding alignment correctness and general limitations. The analysis shows that there are large differences in the success between the methods in terms of applicability and correctness. The latter ranges from 44 to 75% correct core positions. Taking only the best method result per test case, this number increases to 84%. We conclude that the methods available are applicable to difficult cases, but also that there is still room for improvement in both practicability and alignment correctness. An approach that combines the currently available methods supported by a proper score would be useful. Until then, a user should not rely on just a single program.

20.
MOFOID is a new server developed mainly for automated modeling of protein structures by their homology to the structures deposited in the PDB database. Selection of a template and calculation of the alignment is performed with the Smith-Waterman or Needleman-Wunsch algorithms implemented in the EMBOSS package. The final model is built and optimised with programs from the JACKAL package. The wide spectrum of options in the web-based interface and the possibility of uploading a user's own alignment make MOFOID a suitable platform for testing new approaches in alignment building. The server is available at https://valis.ibb.waw.pl/mofoid/.
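
For readers unfamiliar with the Needleman-Wunsch algorithm named above, a minimal global alignment sketch follows. It is deliberately simplified and is not the EMBOSS implementation: it assumes a flat match/mismatch score and a linear gap penalty, whereas production tools normally use substitution matrices and affine gap penalties. The function name and default parameter values are chosen for this example only.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Smith-Waterman differs mainly in clamping cell scores at zero and starting the traceback from the highest-scoring cell, which yields a local rather than a global alignment.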
