Similar articles
 Found 20 similar articles; search took 15 ms
1.
MolIDE: a homology modeling framework you can click with   Total citations: 1 (self-citations: 0, citations by others: 1)
SUMMARY: Molecular Integrated Development Environment (MolIDE) is an integrated application designed to provide homology modeling tools and protocols under a uniform, user-friendly graphical interface. Its main purpose is to combine the most frequent modeling steps in a semi-automatic, interactive way, guiding the user from the target protein sequence to the final three-dimensional protein structure. The typical basic homology modeling process is composed of building sequence profiles of the target sequence family, secondary structure prediction, sequence alignment with PDB structures, assisted alignment editing, side-chain prediction and loop building. All of these steps are available through a graphical user interface. MolIDE's user-friendly and streamlined interactive modeling protocol allows the user to focus on the important modeling questions, hiding the raw data generation and conversion steps from the user. MolIDE was designed from the ground up as an open-source, cross-platform, extensible framework. This allows developers to integrate additional third-party programs into MolIDE. AVAILABILITY: http://dunbrack.fccc.edu/molide/molide.php CONTACT: rl_dunbrack@fccc.edu.

2.
SUMMARY: The DBAli database includes approximately 35,000 alignments of pairs of protein structures from SCOP (Lo Conte et al., Nucleic Acids Res., 28, 257-259, 2000) and CE (Shindyalov and Bourne, Protein Eng., 11, 739-747, 1998). DBAli is linked to several resources, including Compare3D (Shindyalov and Bourne, http://www.sdsc.edu/pb/software.htm, 1999) and ModView (Ilyin and Sali, http://guitar.rockefeller.edu/ModView/, 2001) for visualizing sequence alignments and structure superpositions. A flexible search of DBAli by protein sequence and structure properties allows construction of subsets of alignments suitable for a number of applications, such as benchmarking of sequence-sequence and sequence-structure alignment methods under a variety of conditions. AVAILABILITY: http://guitar.rockefeller.edu/DBAli/

3.
Yang JM, Tung CH. Nucleic Acids Research, 2006, 34(13):3646-3659
As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium 4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].

4.
SCWRL and MolIDE are software applications for prediction of protein structures. SCWRL is designed specifically for the task of prediction of side-chain conformations given a fixed backbone usually obtained from an experimental structure determined by X-ray crystallography or NMR. SCWRL is a command-line program that typically runs in a few seconds. MolIDE provides a graphical interface for basic comparative (homology) modeling using SCWRL and other programs. MolIDE takes an input target sequence and uses PSI-BLAST to identify and align templates for comparative modeling of the target. The sequence alignment to any template can be manually modified within a graphical window of the target-template alignment and visualization of the alignment on the template structure. MolIDE builds the model of the target structure on the basis of the template backbone, predicted side-chain conformations with SCWRL and a loop-modeling program for insertion-deletion regions with user-selected sequence segments. SCWRL and MolIDE can be obtained at http://dunbrack.fccc.edu/Software.php.

5.
An open question in protein homology modeling is, how well do current modeling packages satisfy the dual criteria of quality of results and practical ease of use? To address this question objectively, we examined homology‐built models of a variety of therapeutically relevant proteins. The sequence identities across these proteins range from 19% to 76%. A novel metric, the difference alignment index (DAI), is developed to aid in quantifying the quality of local sequence alignments. The DAI is also used to construct the relative sequence alignment (RSA), a new representation of global sequence alignment that facilitates comparison of sequence alignments from different methods. Comparisons of the sequence alignments in terms of the RSA and alignment methodologies are made to better understand the advantages and caveats of each method. All sequence alignments and corresponding 3D models are compared to their respective structure‐based alignments and crystal structures. A variety of protein modeling software was used. We find that at sequence identities >40%, all packages give similar (and satisfactory) results; at lower sequence identities (<25%), the sequence alignments generated by Profit and Prime, which incorporate structural information in their sequence alignment, stand out from the rest. Moreover, the model generated by Prime in this low sequence identity region is noted to be superior to the rest. Additionally, we note that DSModeler and MOE, which generate reasonable models for sequence identities >25%, are significantly more functional and easier to use when compared with the other structure‐building software.
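The identity thresholds quoted above (19% to 76%, >40%, <25%) depend on how percent identity is counted. The following is a minimal sketch of one common convention, which counts identical residues over columns where neither sequence has a gap. The abstract does not state which convention the authors used, so the function name `percent_identity` and the choice of denominator are assumptions made for illustration only.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned (gap-free) columns of a pairwise alignment.

    Assumption: the denominator is the number of columns in which both
    sequences have a residue. Other conventions (shorter sequence length,
    full alignment length) give different values.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences contribute a residue.
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


if __name__ == "__main__":
    # Toy alignment: 5 gap-free columns, 4 of them identical.
    print(percent_identity("MKV-LAG", "MRVQL-G"))
```

Because the denominator changes the result, two packages reporting "25% identity" for the same pair may not be describing the same alignment quality.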

6.
MOTIVATION: Accurate multiple sequence alignments are essential in protein structure modeling, functional prediction and efficient planning of experiments. Although the alignment problem has attracted considerable attention, preparation of high-quality alignments for distantly related sequences remains a difficult task. RESULTS: We developed PROMALS, a multiple alignment method that shows promising results for protein homologs with sequence identity below 10%, aligning close to half of the amino acid residues correctly on average. This is about three times more accurate than traditional pairwise sequence alignment methods. The PROMALS algorithm derives its strength from several sources: (i) sequence database searches to retrieve additional homologs; (ii) accurate secondary structure prediction; (iii) a hidden Markov model that uses a novel combined scoring of amino acids and secondary structures; (iv) probabilistic consistency-based scoring applied to progressive alignment of profiles. Compared to the best alignment methods that do not use secondary structure prediction and database searches (e.g. MUMMALS, ProbCons and MAFFT), PROMALS is up to 30% more accurate, with improvement being most prominent for highly divergent homologs. Compared to SPEM and HHalign, which also employ database searches and secondary structure prediction, PROMALS shows an accuracy improvement of several percent. AVAILABILITY: The PROMALS web server is available at: http://prodata.swmed.edu/promals/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

7.
Constructing a model of a query protein based on its alignment to a homolog with experimentally determined spatial structure (the template) is still the most reliable approach to structure prediction. Alignment errors are the main bottleneck for homology modeling when the query is distantly related to the template. Alignment methods often misalign secondary structural elements by a few residues. Therefore, better alignment solutions can be found within a limited set of local shifts of secondary structures. We present a refinement method to improve pairwise sequence alignments by evaluating alignment variants generated by local shifts of template‐defined secondary structures. Our method SFESA is based on a novel scoring function that combines the profile‐based sequence score and the structure score derived from residue contacts in a template. Such a combined score frequently selects a better alignment variant among a set of candidate alignments generated by local shifts and leads to overall increase in alignment accuracy. Evaluation of several benchmarks shows that our refinement method significantly improves alignments made by automatic methods such as PROMALS, HHpred and CNFpred. The web server is available at http://prodata.swmed.edu/sfesa. Proteins 2015; 83:411–427. © 2014 Wiley Periodicals, Inc.
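The idea of choosing among local shifts of a template-defined block can be illustrated with a toy sketch. This is not the SFESA implementation: the function `best_local_shift`, the ungapped representation, the linear weighting and both scoring callables are hypothetical stand-ins for the profile-based sequence score and the contact-derived structure score named in the abstract.

```python
from typing import Callable, Tuple


def best_local_shift(
    template: str,
    query: str,
    start: int,
    end: int,
    seq_score: Callable[[str, str], float],
    struct_score: Callable[[int, str], float],
    max_shift: int = 2,
    weight: float = 0.5,
) -> Tuple[int, float]:
    """Pick the shift of one template block that maximizes a combined score.

    Template positions [start, end) form one secondary structure block.
    A shift s pairs template position i with query position i + s.
    Assumption: the combined score is a simple weighted sum of the two terms.
    """
    best = None
    for shift in range(-max_shift, max_shift + 1):
        lo, hi = start + shift, end + shift
        if lo < 0 or hi > len(query):
            continue  # shifted block would run off the query
        pairs = list(zip(range(start, end), query[lo:hi]))
        s_seq = sum(seq_score(template[i], q) for i, q in pairs)
        s_str = sum(struct_score(i, q) for i, q in pairs)
        total = weight * s_seq + (1.0 - weight) * s_str
        if best is None or total > best[1]:
            best = (shift, total)
    if best is None:
        raise ValueError("block does not fit inside the query at any shift")
    return best


if __name__ == "__main__":
    identity = lambda t, q: 1.0 if t == q else 0.0  # toy sequence score
    no_structure = lambda i, q: 0.0                 # toy structure score
    print(best_local_shift("AAKLVEEAA", "AKLVEEAAA", 2, 6, identity, no_structure))
```

In this toy case the block `KLVE` is misaligned by one residue, and the search recovers a shift of -1. A real refinement would also have to adjust the gaps flanking the block, which the sketch ignores.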

8.
Källberg M, Wang H, Wang S, Peng J, Wang Z, Lu H, Xu J. Nature Protocols, 2012, 7(8):1511-1522
A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world.

9.
A novel method is presented for predicting the common secondary structures and alignment of two homologous RNA sequences by sampling the ‘structural alignment’ space, i.e. the joint space of their alignments and common secondary structures. The structural alignment space is sampled according to a pseudo-Boltzmann distribution based on a pseudo-free energy change that combines base pairing probabilities from a thermodynamic model and alignment probabilities from a hidden Markov model. By virtue of the implicit comparative analysis between the two sequences, the method offers an improvement over single sequence sampling of the Boltzmann ensemble. A cluster analysis shows that the samples obtained from joint sampling of the structural alignment space cluster more closely than samples generated by the single sequence method. On average, the representative (centroid) structure and alignment of the most populated cluster in the sample of structures and alignments generated by joint sampling are more accurate than single sequence sampling and alignment based on sequence alone, respectively. The ‘best’ centroid structure that is closest to the known structure among all the centroids is, on average, more accurate than structure predictions of other methods. Additionally, cluster analysis identifies, on average, a few clusters, whose centroids can be presented as alternative candidates. The source code for the proposed method can be downloaded at http://rna.urmc.rochester.edu.
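Sampling from a Boltzmann-like distribution can be sketched in a few lines. The example below assumes a small, explicitly enumerated set of candidates with given pseudo-free energies, which is a simplification: the method described above samples a far larger joint space and derives its energies from base pairing and alignment probabilities. The names `boltzmann_weights`, `sample_candidates` and the constant `RT` are illustrative assumptions.

```python
import math
import random
from typing import List, Sequence

RT = 0.616  # kcal/mol, roughly R*T at 310 K (assumed temperature)


def boltzmann_weights(energies: Sequence[float], rt: float = RT) -> List[float]:
    """Normalized probabilities proportional to exp(-dG / RT)."""
    lowest = min(energies)
    # Subtracting the minimum keeps the exponentials numerically stable
    # without changing the normalized probabilities.
    raw = [math.exp(-(e - lowest) / rt) for e in energies]
    total = sum(raw)
    return [w / total for w in raw]


def sample_candidates(candidates: Sequence[str], energies: Sequence[float],
                      n: int, rng: random.Random, rt: float = RT) -> List[str]:
    """Draw n candidates with replacement according to their Boltzmann weights."""
    return rng.choices(list(candidates), weights=boltzmann_weights(energies, rt), k=n)


if __name__ == "__main__":
    structures = ["((..))", "(....)", "......"]  # toy dot-bracket structures
    energies = [-3.0, -1.0, 0.0]                 # toy pseudo-free energies
    rng = random.Random(0)
    print(boltzmann_weights(energies))
    print(sample_candidates(structures, energies, 5, rng))
```

Low-energy candidates dominate the sample, but higher-energy ones still appear occasionally, which is what makes a subsequent cluster analysis of the sample informative.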

10.
Many noncoding RNAs (ncRNAs) function through both their sequences and secondary structures. Thus, secondary structure derivation is an important issue in today's RNA research. The state-of-the-art structure annotation tools are based on comparative analysis, which derives the consensus structure of homologous ncRNAs. Despite promising results from existing ncRNA aligning and consensus structure derivation tools, there is a need for more efficient and accurate ncRNA secondary structure modeling and alignment methods. In this work, we introduce a consensus structure derivation approach based on grammar string, a novel ncRNA secondary structure representation that encodes an ncRNA's sequence and secondary structure in the parameter space of a context-free grammar (CFG) and a full RNA grammar including pseudoknots. Being a string defined on a special alphabet constructed from a grammar, grammar string converts ncRNA alignment into sequence alignment. We derive consensus secondary structures from hundreds of ncRNA families from BraliBase 2.1 and 25 families containing pseudoknots using grammar string alignment. Our experiments have shown that grammar string-based structure derivation competes favorably in consensus structure quality with Murlet and RNASampler. Source code and experimental data are available at http://www.cse.msu.edu/~yannisun/grammar-string.

11.
As of September 1998, a total of 43 sequences are contained within the tmRNA database (tmRDB). The tmRNA sequences are arranged alphabetically and ordered phylogenetically. The alignment of the tmRNAs emphasizes the basepairs that are supported by comparative sequence analysis and establishes minimal secondary structures for the known tmRNAs. A corresponding alignment of the predicted tmRNA-encoded tag peptides is presented. The tmRDB also offers a small number of RNA secondary structure diagrams and PDB-formatted three-dimensional models generated with the program ERNA-3D. The data are available freely at the URL http://psyche.uthct.edu/dbs/tmRDB/tmRDB.html

12.
This study involves the development of a rapid comparative modeling tool for homologous sequences by extension of the TASSER methodology, developed for tertiary structure prediction. This comparative modeling procedure was validated on a representative benchmark set of proteins in the Protein Data Bank composed of 901 single domain proteins (41-200 residues) having sequence identities between 35% and 90% with respect to the template. Using a Monte Carlo search scheme with the length of runs optimized for weakly/nonhomologous proteins, TASSER often provides appreciable improvement in structure quality over the initial template. However, on average, this requires approximately 29 h of CPU time per sequence. Since homologous proteins are unlikely to require the extent of conformational search as weakly/nonhomologous proteins, TASSER's parameters were optimized to reduce the required CPU time to approximately 17 min, while retaining TASSER's ability to improve structure quality. Using this optimized TASSER (TASSER-Lite), we find an average improvement in the aligned region of approximately 10% in root mean-square deviation from native over the initial template. Comparison of TASSER-Lite with the widely used comparative modeling tool MODELLER showed that TASSER-Lite yields final models that are closer to the native. TASSER-Lite is provided on the web at http://cssb.biology.gatech.edu/skolnick/webservice/tasserlite/index.html.
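The root mean-square deviation (RMSD) used above as the quality measure has a simple definition once two structures are superimposed. The sketch below computes it for paired coordinates and assumes the superposition has already been done; it does not perform the optimal rigid-body fit that a structure comparison program would apply first. The function name `rmsd` and the tuple-based coordinate format are illustrative choices.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD between two equally long lists of already-superimposed coordinates.

    Assumption: atom i in coords_a corresponds to atom i in coords_b
    (for example, aligned C-alpha atoms of model and native structure).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    print(rmsd(native, model))
```

Under this definition, a 10% improvement means the model's RMSD to the native structure is about 10% smaller than the template's RMSD over the same aligned residues.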

13.
This first release of the tmRNA database (tmRDB) contains 19 tmRNA sequences, a tmRNA sequence alignment with emphasis on base pairs that are supported by comparative sequence analysis, and a tabulation of tmRNA-encoded tag peptides. The tmRDB also offers an RNA secondary structure diagram of the Escherichia coli tmRNA, as well as PDB-formatted coordinates for three-dimensional modeling. The data are available on the World Wide Web at http://www.uthct.edu/tmRDB/tmRDB.html

14.
Enzyme systems that attack the plant cell wall contain noncatalytic carbohydrate-binding modules (CBMs) that mediate attachment to this composite structure and play a pivotal role in maximizing the hydrolytic process. Although xyloglucan, which includes a backbone of beta-1,4-glucan decorated primarily with xylose residues, is a key component of the plant cell wall, CBMs that bind to this polymer have not been identified. Here we showed that the C-terminal domain of the modular Clostridium thermocellum enzyme CtCel9D-Cel44A (formerly known as CelJ) comprises a novel CBM (designated CBM44) that binds with equal affinity to cellulose and xyloglucan. We also showed that accommodation of xyloglucan side chains is a general feature of CBMs that bind to single cellulose chains. The crystal structures of CBM44 and the other CBM (CBM30) in CtCel9D-Cel44A display a beta-sandwich fold. The concave face of both CBMs contains a hydrophobic platform comprising three tryptophan residues that can accommodate up to five glucose residues. The orientation of these aromatic residues is such that the bound ligand would adopt the twisted conformation displayed by cello-oligosaccharides in solution. Mutagenesis studies confirmed that the hydrophobic platform located on the concave face of both CBMs mediates ligand recognition. In contrast to other CBMs that bind to single polysaccharide chains, the polar residues in the binding cleft of CBM44 play only a minor role in ligand recognition. The mechanism by which these proteins are able to recognize linear and decorated beta-1,4-glucans is discussed based on the structures of CBM44 and the other CBMs that bind single cellulose chains.

15.
16.
MOTIVATION: We present a structural alignment database that is specifically targeted for use in derivation and optimization of sequence-structure alignment algorithms for homology modeling. We have taken care to ensure that fold-space is properly sampled, that the structures involved in alignments are of significant resolution (better than 2.5 Å) and that the alignments are accurate and reliable. RESULTS: Alignments have been taken from the HOMSTRAD, BAliBASE and SCOP-based Gerstein databases along with alignments generated by a global structural alignment method described here. In order to discriminate between equivalent alignments from these different sources, we have developed a novel scoring function, Contact Alignment Quality score, which evaluates trial alignments by their statistical significance combined with their ability to reproduce conserved three-dimensional residue contacts. The resulting non-redundant, unbiased database contains 1927 alignments from across fold-space with high-resolution structures and a wide range of sequence identities. AVAILABILITY: The database can be interactively queried either over the web at http://abagyan.scripps.edu/lab/web/sad/show.cgi or by using MySQL, and is also available to download over the web.
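The notion of an alignment "reproducing conserved residue contacts" can be made concrete with a toy measure. The sketch below is not the Contact Alignment Quality score, whose definition is not given in the abstract; it only computes the fraction of contacts in one structure that map, through the alignment, onto contacts in the other. The function name `conserved_contact_fraction` and the data layout are assumptions.

```python
from typing import Dict, Iterable, Tuple

Contact = Tuple[int, int]


def conserved_contact_fraction(
    contacts_a: Iterable[Contact],
    contacts_b: Iterable[Contact],
    mapping: Dict[int, int],
) -> float:
    """Fraction of mappable contacts in structure A that are also contacts in B.

    `mapping` sends residue indices of A to their aligned residues in B.
    Contacts whose residues are unaligned (absent from `mapping`) are skipped.
    """
    # Store B contacts as unordered pairs so (i, j) and (j, i) are equivalent.
    b_pairs = {frozenset(pair) for pair in contacts_b}
    mappable = 0
    conserved = 0
    for i, j in contacts_a:
        if i in mapping and j in mapping:
            mappable += 1
            if frozenset((mapping[i], mapping[j])) in b_pairs:
                conserved += 1
    return conserved / mappable if mappable else 0.0


if __name__ == "__main__":
    contacts_a = [(1, 5), (2, 6), (3, 9)]
    contacts_b = [(11, 15), (16, 12)]
    alignment = {1: 11, 2: 12, 3: 13, 5: 15, 6: 16, 9: 19}
    print(conserved_contact_fraction(contacts_a, contacts_b, alignment))
```

A trial alignment that shifts residues out of register would map contacts in A onto non-contacting pairs in B and lower this fraction, which is the intuition behind using contacts to rank competing alignments.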

17.
Biomolecule sequences and structures of land, air and water species are being determined rapidly, but the data entries are unevenly distributed across organisms. As a result, BLAST homology searches frequently return undesirable entries from organisms living in different environments. To reduce irrelevant search results, a separate database for comparative genomics is urgently required. A comprehensive bioinformatics tool set and an integrated database, named Bioinformatics tools for Marine and Freshwater Genomics (BiMFG), are constructed for comparative analyses among model species and underwater species. Novel matching techniques based on conserved motifs and/or secondary structure elements are designed for efficiently and effectively retrieving and aligning remote sequences through cross-species comparisons. This is especially helpful when the sequences under analysis have low similarity and unresolved structural information. In addition, the system provides core techniques of multiple sequence alignment, multiple secondary structure profile alignment and iteratively refined multiple structural alignments for biodiversity analysis and verification in marine and freshwater biology. The BiMFG web server is freely available for use at http://bimfg.cs.ntou.edu.tw/.

18.
MOTIVATION: As protein structure databases expand, protein loop modeling remains an important and yet challenging problem. Knowledge-based protein loop prediction methods face two challenges in methodology development: (1) loop boundaries in protein structures are frequently problematic in constructing length-dependent loop databases for protein loop predictions; (2) knowledge-based modeling of loops of unknown structure requires both aligning a query loop sequence to loop templates and ranking the loop sequence-template matches. RESULTS: We developed a knowledge-based loop prediction method that circumvents the need to construct hierarchically clustered length-dependent loop libraries. The method first predicts local structural fragments of a query loop sequence and then structurally aligns the predicted structural fragments to a set of non-redundant loop structural templates regardless of the loop length. The sequence-template alignments are then quantitatively evaluated with an artificial neural network model trained on a set of predictions with known outcomes. Prediction accuracy benchmarks indicated that the novel procedure provided an alternative approach overcoming the challenges of knowledge-based loop prediction. AVAILABILITY: http://cmb.genomics.sinica.edu.tw

19.
The functions of RNAs, like those of proteins, are determined by their structures, which, in turn, are determined by their sequences. Comparison/alignment of RNA molecules provides an effective means to predict their functions and understand their evolutionary relationships. For RNA sequence alignment, most methods developed for protein and DNA sequence alignment can be directly applied. RNA 3-dimensional structure alignment, on the other hand, tends to be more difficult than protein structure alignment due to the lack of regular secondary structures as observed in proteins. Most of the existing RNA 3D structure alignment methods use only the backbone geometry and ignore the sequence information. Using both the sequence and backbone geometry information in RNA alignment may not only produce more accurate classification, but also deepen our understanding of the sequence–structure–function relationship of RNA molecules. In this study, we developed a new RNA alignment method based on elastic shape analysis (ESA). ESA treats RNA structures as three dimensional curves with sequence information encoded on additional dimensions so that the alignment can be performed in the joint sequence–structure space. The similarity between two RNA molecules is quantified by a formal distance, the geodesic distance. Based on ESA, a rigorous mathematical framework can be built for RNA structure comparison. Means and covariances of full structures can be defined and computed, and probability distributions on spaces of such structures can be constructed for a group of RNAs. Our method was further applied to predict functions of RNA molecules and showed superior performance compared with previous methods when tested on benchmark datasets. The programs are available at http://stat.fsu.edu/~jinfeng/ESA.html.

20.

Background

Protein sequence alignment is essential for a variety of tasks such as homology modeling and active site prediction. Alignment errors remain the main cause of low-quality structure models. A bioinformatics tool to refine alignments is needed to make protein alignments more accurate.

Results

We developed the SFESA web server to refine pairwise protein sequence alignments. Compared to the previous version of SFESA, which required a set of 3D coordinates for a protein, the new server searches a sequence database for the closest homolog with an available 3D structure to be used as a template. For each alignment block defined by secondary structure elements in the template, SFESA evaluates alignment variants generated by local shifts and selects the best-scoring alignment variant. A scoring function that combines the sequence score of profile-profile comparison and the structure score of template-derived contact energy is used for evaluation of alignments. PROMALS pairwise alignments refined by SFESA are more accurate than those produced by current advanced alignment methods such as HHpred and CNFpred. SFESA also improves alignments generated by other software.

Conclusions

SFESA is a web-based tool for alignment refinement, designed for researchers to compute, refine, and evaluate pairwise alignments with a combined sequence and structure scoring of alignment blocks. To our knowledge, the SFESA web server is the only tool that refines alignments by evaluating local shifts of secondary structure elements. The SFESA web server is available at http://prodata.swmed.edu/sfesa.
