Recently, advances have been made in methods and applications that integrate electron microscopy density maps and comparative modeling to produce atomic structures of macromolecular assemblies. Electron microscopy can benefit from comparative modeling through the fitting of comparative models into electron microscopy density maps. Also, comparative modeling can benefit from electron microscopy through the use of intermediate-resolution density maps in fold recognition, template selection and sequence-structure alignment.

We describe a database of protein structure alignments as well as methods and tools that use this database to improve comparative protein modeling. The current version of the database contains 105 alignments of similar proteins or protein segments. The database comprises 416 entries, 78,495 residues, 1,233 equivalent entry pairs, and 230,396 pairs of equivalent alignment positions. At present, the main application of the database is to improve comparative modeling by satisfaction of spatial restraints implemented in the program MODELLER (Šali A, Blundell TL, 1993, J Mol Biol 234:779–815). To illustrate the usefulness of the database, the restraints on the conformation of a disulfide bridge provided by an equivalent disulfide bridge in a related structure are derived from the alignments; the prediction success of the disulfide dihedral angle classes is increased to approximately 80%, compared to approximately 55% for modeling that relies on the stereochemistry of disulfide bridges alone. The second example of the use of the database is the derivation of the probability density function for comparative modeling of the cis/trans isomerism of the proline residues; the prediction success is increased from 0% to 82.9% for cis-proline and from 93.3% to 96.2% for trans-proline. The database is available via electronic mail.

With the progression of structural genomics projects, comparative modeling is an increasingly important method of choice. It helps to bridge the gap between the available sequence and structure information by providing reliable and accurate protein models. Comparative modeling based on more than 30% sequence identity is now approaching its natural template-based limits, and further improvements require the development of effective refinement techniques capable of driving models toward the native structure. For difficult targets, for which the most significant progress in recent years has been observed, optimal template selection and alignment accuracy are still the major problems.

S-adenosylhomocysteine hydrolase (AdoHcyHD) is a ubiquitous enzyme that catalyzes the breakdown of S-adenosylhomocysteine, a powerful inhibitor of most transmethylation reactions, to adenosine and L-homocysteine. AdoHcyHD from the hyperthermophilic archaeon Pyrococcus furiosus (PfAdoHcyHD) was cloned, expressed in Escherichia coli, and purified. The enzyme is thermoactive, with an optimum temperature of 95 °C, and thermostable, retaining 100% residual activity after 1 h at 90 °C and showing an apparent melting temperature of 98 °C. The enzyme is a homotetramer of 190 kDa and contains four cysteine residues per subunit. Thiol groups are not involved in the catalytic process, whereas disulfide bond(s) could be present, since incubation with 0.8 M dithiothreitol reduces enzyme activity. Multiple sequence alignment of hyperthermophilic AdoHcyHDs reveals the presence of two cysteine residues in the N-terminus of the enzyme conserved only in members of Pyrococcus species, and shows that hyperthermophilic AdoHcyHDs lack eight C-terminal residues thought to be important for structural and functional properties of the eukaryotic enzyme. The homology-modeled structure of PfAdoHcyHD shows that Trp220, Tyr181, Tyr184, and Leu185 of each subunit and Ile244 from a different subunit form a network of hydrophobic and aromatic interactions in the central channel formed at the subunit interface. These contacts partially replace the interactions of the C-terminal tail of the eukaryotic enzyme required for tetramer stability. Moreover, Cys221 and Lys245 substitute for Thr430 and Lys426, respectively, of the human enzyme in NAD-binding. Interestingly, all these residues are fairly well conserved in hyperthermophilic AdoHcyHDs but not in mesophilic ones, thus suggesting a common adaptation mechanism at high temperatures.

This study involves the development of a rapid comparative modeling tool for homologous sequences by extension of the TASSER methodology, developed for tertiary structure prediction. This comparative modeling procedure was validated on a representative benchmark set of proteins in the Protein Data Bank composed of 901 single domain proteins (41-200 residues) having sequence identities between 35% and 90% with respect to the template. Using a Monte Carlo search scheme with the length of runs optimized for weakly/nonhomologous proteins, TASSER often provides appreciable improvement in structure quality over the initial template. However, on average, this requires approximately 29 h of CPU time per sequence. Since homologous proteins are unlikely to require the extent of conformational search as weakly/nonhomologous proteins, TASSER's parameters were optimized to reduce the required CPU time to approximately 17 min, while retaining TASSER's ability to improve structure quality. Using this optimized TASSER (TASSER-Lite), we find an average improvement in the aligned region of approximately 10% in root mean-square deviation from native over the initial template. Comparison of TASSER-Lite with the widely used comparative modeling tool MODELLER showed that TASSER-Lite yields final models that are closer to the native. TASSER-Lite is provided on the web at http://cssb.biology.gatech.edu/skolnick/webservice/tasserlite/index.html.

The performance of the self-consistent mean field theory (SCMFT) method for side-chain modeling, employing rotamer energies calculated with the flexible rotamer model (FRM), is evaluated in the context of comparative modeling of protein structure. Predictions were carried out on a test set of 56 model backbones of varying accuracy, to allow side-chain prediction accuracy to be analyzed as a function of backbone accuracy. A progressive decrease in the accuracy of prediction was observed as backbone accuracy decreased. However, even for very low backbone accuracy, prediction accuracy was substantially higher than random, indicating that the FRM can, in part, compensate for the errors in the modeled tertiary environment. It was also investigated whether the introduction in the FRM-SCMFT method of knowledge-based biases, derived from a backbone-dependent rotamer library, could enhance its performance. A bias derived from the backbone-dependent rotamer conformations alone did not improve prediction accuracy. However, a bias derived from the backbone-dependent rotamer probabilities improved prediction accuracy considerably. This bias was incorporated through two different strategies. In one (the indirect strategy), rotamer probabilities were used to reject unlikely rotamers a priori, thus restricting prediction by FRM-SCMFT to a subset containing only the most probable rotamers in the library. In the other (the direct strategy), rotamer energies were transformed into pseudo-energies that were added to the average potential energies of the respective rotamers, thereby creating hybrid energy-based/knowledge-based average rotamer energies, which were used by the FRM-SCMFT method for prediction. For all degrees of backbone accuracy, an optimal strength of the knowledge-based bias existed for both strategies for which predictions were more accurate than pure energy-based predictions, and also than pure knowledge-based predictions. Hybrid knowledge-based/energy-based methods were obtained from both strategies and compared with the SCWRL method, a hybrid method based on the same backbone-dependent rotamer library. The accuracy of the indirect method was approximately the same as that of the SCWRL method, but that of the direct method was significantly higher.
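
The two ways of applying a knowledge-based bias described above can be illustrated with a toy sketch. It is not the FRM-SCMFT implementation: the function names, the probability cutoff, the bias weight, and in particular the choice of a negative log-probability as the pseudo-energy are all assumptions made for illustration, since the abstract does not give the functional form.

```python
import math
from typing import List

def indirect_filter(probabilities: List[float], cutoff: float) -> List[int]:
    """Indirect strategy (sketch): keep only rotamers whose library
    probability reaches a hypothetical cutoff, before any energy ranking."""
    return [i for i, p in enumerate(probabilities) if p >= cutoff]

def hybrid_energies(avg_energies: List[float],
                    probabilities: List[float],
                    weight: float) -> List[float]:
    """Direct strategy (sketch): add a pseudo-energy to each average
    rotamer energy. ASSUMPTION: pseudo-energy = -weight * ln(p)."""
    return [e - weight * math.log(p)
            for e, p in zip(avg_energies, probabilities)]

def best_rotamer(energies: List[float]) -> int:
    """Index of the lowest-energy rotamer."""
    return min(range(len(energies)), key=energies.__getitem__)

# Toy data: rotamer 1 has the lower physical energy but is rare in the library.
avg_energies = [1.0, 0.5]
probabilities = [0.9, 0.1]

pure_energy_choice = best_rotamer(hybrid_energies(avg_energies, probabilities, 0.0))
hybrid_choice = best_rotamer(hybrid_energies(avg_energies, probabilities, 1.0))
kept = indirect_filter(probabilities, cutoff=0.2)
```

With a weight of zero the choice is purely energy-based. Raising the weight moves the choice toward the library's most probable rotamer, which mirrors the abstract's observation that an intermediate bias strength gave the best predictions.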

This study focuses on the ultraviolet stress (UVS) gene product, a UV stress-induced protein from the cyanobacterium Synechocystis PCC 6803. Three-dimensional structural modeling of the target UVS protein was carried out by homology modeling. The PDB entry 3F2I from Nostoc sp. PCC 7120 was selected as a suitable template structure. Active binding regions were then detected to characterize functional sites in the modeled UV-B stress protein. The top five probable ligand binding sites were predicted, and the binding residues common to the target and template proteins were analyzed. The modeled UVS protein structure from Synechocystis PCC 6803 was shown, for the first time, to be structurally and functionally similar to the well-characterized UVS protein of another cyanobacterial species, Nostoc sp. PCC 7120, sharing the same structural motif and fold with similar protein topology and function. The investigations suggest that the UVS protein from Synechocystis sp. might play a significant role in ultraviolet resistance. Thus, it could be a potential biological source for remediation of UV-induced stress.

We evaluate 3D models of human nucleoside diphosphate kinase, mouse cellular retinoic acid binding protein I, and human eosinophil neurotoxin that were calculated by MODELLER, a program for comparative protein modeling by satisfaction of spatial restraints. The models have good stereochemistry and are at least as similar to the crystallographic structures as the closest template structures. The largest errors occur in the regions that were not aligned correctly or where the template structures are not similar to the correct structure. These regions correspond predominantly to exposed loops, insertions of any length, and non-conserved side chains. When a template structure with more than 40% sequence identity to the target protein is available, the model is likely to have about 90% of the mainchain atoms modeled with an rms deviation from the X-ray structure of ≈ 1 Å, in large part because the templates are likely to be that similar to the X-ray structure of the target. This rms deviation is comparable to the overall differences between refined NMR and X-ray crystallography structures of the same protein. © 1995 Wiley-Liss, Inc.

Mirkovic N, Li Z, Parnassa A, Murray D. Proteins 2007, 66(4):766-777
The technological breakthroughs in structural genomics were designed to facilitate the solution of a sufficient number of structures, so that as many protein sequences as possible can be structurally characterized with the aid of comparative modeling. The leverage of a solved structure is the number and quality of the models that can be produced using the structure as a template for modeling and may be viewed as the "currency" with which the success of a structural genomics endeavor can be measured. Moreover, the models obtained in this way should be valuable to all biologists. To this end, at the Northeast Structural Genomics Consortium (NESG), a modular computational pipeline for automated high-throughput leverage analysis was devised and used to assess the leverage of the 186 unique NESG structures solved during the first phase of the Protein Structure Initiative (January 2000 to July 2005). Here, the results of this analysis are presented. The number of sequences in the nonredundant protein sequence database covered by quality models produced by the pipeline is approximately 39,000, so that the average leverage is approximately 210 models per structure. Interestingly, only 7900 of these models fulfill the stringent modeling criterion of being at least 30% sequence-identical to the corresponding NESG structures. This study shows how high-throughput modeling increases the efficiency of structure determination efforts by providing enhanced coverage of protein structure space. In addition, the approach is useful in refining the boundaries of structural domains within larger protein sequences, subclassifying sequence diverse protein families, and defining structure-based strategies specific to a particular family.
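
The leverage figures quoted above follow from simple arithmetic, reproduced in the short check below. The input numbers are the approximate values stated in the abstract; the variable and function names are illustrative only.

```python
# Approximate figures quoted in the abstract.
n_structures = 186          # unique NESG structures
n_quality_models = 39_000   # sequences covered by quality models
n_stringent = 7_900         # models at >= 30% sequence identity

def average_leverage(models: int, structures: int) -> float:
    """Average number of quality models produced per solved structure."""
    return models / structures

leverage = average_leverage(n_quality_models, n_structures)
stringent_fraction = n_stringent / n_quality_models

print(f"average leverage: {leverage:.0f} models per structure")
print(f"fraction at >= 30% identity: {stringent_fraction:.0%}")
```

On these figures, only about one in five of the quality models meets the 30% identity criterion, so most of the reported leverage comes from more distant template-target pairs.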

An improved generalized comparative modeling method, GENECOMP, for the refinement of threading models is developed and validated on the Fischer database of 68 probe-template pairs, a standard benchmark used to evaluate threading approaches. The basic idea is to perform ab initio folding using a lattice protein model, SICHO, near the template provided by the new threading algorithm PROSPECTOR. PROSPECTOR also provides predicted contacts and secondary structure for the template-aligned regions, and possibly for the unaligned regions by garnering additional information from other top-scoring threaded structures. Since the lowest-energy structure generated by the simulations is not necessarily the best structure, we employed two structure-selection protocols: distance geometry and clustering. In general, clustering is found to generate somewhat better quality structures in 38 of 68 cases. When applied to the Fischer database, the protocol does no harm and in a significant number of cases improves upon the initial threading model, sometimes dramatically. The procedure is readily automated and can be implemented on a genomic scale.

Added-value is the additional information that a model carries with respect to the template structure used for model building. Thousands of single-template models, corresponding to proteins of known structure, were analyzed. The accuracy of structure-derived properties, such as residue accessibility, surface area, electrostatic potential, and others, was determined as a function of template:target sequence identity by comparing the models with their corresponding experimental structures. Added-value was determined by comparing the accuracy in models with that from templates. Geometry-dependent properties such as neighborhood of buried residues and accessible surface area showed low added-value. Properties that also depend on the protein sequence, such as presence of polar areas and electrostatic potential, showed high added-value. In general added-value increases when template:target sequence identity decreases, but it is also affected by alignment errors. This study justifies the use of models instead of the use of templates to estimate structure-derived properties of a target protein.

The nucleotide sequence was established for the full-length Flavobacterium aquatile operon coding for the FauI restriction-modification system. The operon is unusual in structure: its gene order is control protein gene, DNA methyltransferase A gene, restriction endonuclease gene, DNA methyltransferase B gene, which differs from that of the known analogs. The genes are similarly oriented and overlap. On evidence of sequence analysis, both methyltransferases are C5 enzymes, the control protein is similar to that of other restriction-modification systems, and the restriction endonuclease shows low homology to other enzymes that cleave the DNA upper strand at position 4 or 5 relative to the recognition site.

Pathogens have evolved numerous strategies to infect their hosts, while hosts have evolved immune responses and other defenses to these foreign challenges. The vast majority of host-pathogen interactions involve protein-protein recognition, yet our current understanding of these interactions is limited. Here, we present and apply a computational whole-genome protocol that generates testable predictions of host-pathogen protein interactions. The protocol first scans the host and pathogen genomes for proteins with similarity to known protein complexes, then assesses these putative interactions, using structure if available, and, finally, filters the remaining interactions using biological context, such as the stage-specific expression of pathogen proteins and tissue expression of host proteins. The technique was applied to 10 pathogens, including species of Mycobacterium, apicomplexa, and kinetoplastida, responsible for "neglected" human diseases. The method was assessed by (1) comparison to a set of known host-pathogen interactions, (2) comparison to gene expression and essentiality data describing host and pathogen genes involved in infection, and (3) analysis of the functional properties of the human proteins predicted to interact with pathogen proteins, demonstrating an enrichment for functionally relevant host-pathogen interactions. We present several specific predictions that warrant experimental follow-up, including interactions from previously characterized mechanisms, such as cytoadhesion and protease inhibition, as well as suspected interactions in hypothesized networks, such as apoptotic pathways. Our computational method provides a means to mine whole-genome data and is complementary to experimental efforts in elucidating networks of host-pathogen protein interactions.

The number of software packages for kinetic modeling of biochemical networks continues to grow. Although most packages share a common core of functionality, the specific capabilities and user interfaces of different packages mean that choosing the best package for a given task is not trivial. We compare 12 software packages with respect to their functionality, reliability, efficiency, user-friendliness and compatibility. Although most programs performed reliably in all numerical tasks tested, SBML compatibility and the set-up of multicompartmentalization are problematic in many packages. For simple models, GEPASI seems the best choice for non-expert users. For large-scale models, environments such as Jarnac/JDesigner are preferable, because they allow modular implementation of models. Virtual Cell is the most versatile program and provides the simplest and clearest functionality for setting up multicompartmentalization.

Three national patent offices have consulted on patents that cover protein three-dimensional structural data and pharmacophores, with significant implications for the biotechnology industry.

Rai BK, Fiser A. Proteins 2006, 63(3):644-661
A major bottleneck in comparative protein structure modeling is the quality of input alignment between the target sequence and the template structure. A number of alignment methods are available, but none of these techniques produce consistently good solutions for all cases. Alignments produced by alternative methods may be superior in certain segments but inferior in others when compared to each other; therefore, an accurate solution often requires an optimal combination of them. To address this problem, we have developed a new approach, Multiple Mapping Method (MMM). The algorithm first identifies the alternatively aligned regions from a set of input alignments. These alternatively aligned segments are scored using a composite scoring function, which determines their fitness within the structural environment of the template. The best scoring regions from a set of alternative segments are combined with the core part of the alignments to produce the final MMM alignment. The algorithm was tested on a dataset of 1400 protein pairs using 11 combinations of two to four alignment methods. In all cases MMM showed statistically significant improvement by reducing alignment errors in the range of 3 to 17%. MMM also compared favorably over two alignment meta-servers. The algorithm is computationally efficient; therefore, it is a suitable tool for genome scale modeling studies.
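
The three steps named in the abstract (find alternatively aligned regions, score each alternative, splice the best ones into the shared core) can be sketched roughly as follows. This is a toy illustration and not the MMM code: the dictionary representation of an alignment, the function names, and the placeholder score (a count of aligned positions) are assumptions. The published method uses a composite scoring function based on the template's structural environment, which is not reproduced here.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical representation: target position -> template position (None = gap).
Alignment = Dict[int, Optional[int]]

def variable_regions(alignments: List[Alignment], length: int) -> List[Tuple[int, int]]:
    """Half-open (start, end) runs of target positions where the inputs disagree."""
    regions: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for pos in range(length):
        agree = len({a.get(pos) for a in alignments}) == 1
        if not agree and start is None:
            start = pos
        elif agree and start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:
        regions.append((start, length))
    return regions

def combine(alignments: List[Alignment], length: int,
            score: Callable[[Alignment], float]) -> Alignment:
    """Keep the shared core and splice in the best-scoring alternative per region."""
    final: Alignment = {pos: alignments[0].get(pos) for pos in range(length)}
    for start, end in variable_regions(alignments, length):
        candidates = [{p: a.get(p) for p in range(start, end)} for a in alignments]
        final.update(max(candidates, key=score))
    return final

def toy_score(segment: Alignment) -> float:
    """Placeholder score: number of aligned (non-gap) positions in the segment."""
    return float(sum(v is not None for v in segment.values()))

a1: Alignment = {0: 0, 1: 1, 2: None, 3: None, 4: 4}
a2: Alignment = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
merged = combine([a1, a2], length=5, score=toy_score)
```

Splicing segments independently, as done here, does not guarantee that the merged alignment stays consistent at region boundaries. A real implementation would need to check that, and the abstract does not say how MMM handles it.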
