Similar literature
 20 similar documents found; search time 31 ms
1.
Abstract

Knowledge-based homology modelling together with site-directed mutagenesis, epitope and conformational mapping is an approach for predicting the structures of proteins and for the rational design of new drugs. In this study we present how this procedure has been applied to model the structure of herpes simplex virus type 1 thymidine kinase (HSV1 TK, HSV1 ATP-thymidine-5′-phosphotransferase, EC 2.7.1.21). We have used, and evaluated, several secondary structure prediction methods, such as the classical one based on the Chou and Fasman algorithm, neural networks using the Kabsch and Sander classification, and the PRISM method. We have validated the algorithms by applying them to the porcine adenylate kinase (ADK), whose three-dimensional structure is known and which has also been used for the alignment of the TKs. The resulting first model of HSV1-TK consisted of the first β-strand connected to the phosphate binding loop and its subsequent α-helix, the fourth β-strand connected to the conserved FDRH sequence and two α-helices with basic amino acids. The 3D structure was built using the X-ray structure of ADK as template and following the general procedure for homology modelling. We extended the model by means of COMPOSER, an automatic process for protein modelling. Site-directed mutagenesis was used to experimentally verify the predicted active-site model of HSV1-TK. The data measured in our lab and by others support the suggestion that the FDRH motif is part of the active site and plays an important role in the phosphorylation of substrates. The structure of HSV1 TK, recently solved in collaboration with Prof. G. Schulz at 2.7 Å resolution, includes 284 of 343 residues of the N-terminal truncated TK. The secondary structures could be clearly assigned and fitted to the density.
The comparison between the crystallographically determined structure and the model shows that nearly 70% of the HSV1 TK structure has been correctly modelled by the described integrated approach to knowledge-based ligand-protein complex structure prediction. This indicates that computer-assisted methods, combined with “manual” correction of both the alignment and the 3D construction, are useful and can be successful.

2.
Sequence comparisons of highly related archaeal adenylate kinases (AKs) from the mesophilic Methanococcus voltae, the moderate thermophile Methanococcus thermolithotrophicus, and two extreme thermophiles Methanococcus igneus and Methanococcus jannaschii, allow identification of interactions responsible for the large variation in temperatures for optimal catalytic activity and thermostabilities observed for these proteins. The tertiary structures of the methanococcal AKs have been predicted by using homology modeling to further investigate the potential role of specific interactions on thermal stability and activity. The alignments for the methanococcal AKs have been generated by using an energy-based sequence–structure threading procedure against high-resolution crystal structures of eukaryotic, eubacterial, and mitochondrial adenylate and uridylate (UK) kinases. From these alignments, full atomic model structures have been produced using the program MODELLER. The final structures allow identification of potential active site interactions and place a polyproline region near the active site, both of which are unique to the archaeal AKs. Based on these model structures, the additional polar residues present in the thermophiles could contribute four additional salt bridges and a higher negative surface charge. Since only one of these possible salt bridges is interior, they do not appear to contribute significantly to the thermal stability. Instead, our model structures indicate that a larger and more hydrophobic core, due to a specific increase in aliphatic amino acid content and aliphatic side chain volume, in the thermophilic AKs is responsible for increased thermal stability. © 1997 Wiley-Liss Inc.

3.
To investigate the functional sites on a protein and to predict binding sites (residues) in proteins, it is often necessary to identify the binding-site residues at different distance thresholds from protein three-dimensional (3D) structures. To study a particular protein chain and its interaction with the ligand in complex form, researchers have to parse the output of different available tools or databases to find binding-site residues. Here we have developed a tool for calculating amino acid contact distances in proteins at different distance thresholds from the 3D structure of the protein. For an input protein 3D structure, ContPro can quickly find all binding-site residues in the protein by calculating distances, and it also allows researchers to select the distance threshold, protein chain and ligand of interest. Additionally, it can parse the protein model (in the case of a multi-model protein coordinate file) and the sequence of the selected protein chain in FASTA format from the input 3D structure. The developed tool will be useful for the identification and analysis of binding sites of proteins from 3D structures at different distance thresholds. Availability: it can be accessed at http://procarb.org/contpro/
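The distance-threshold idea described in this abstract can be illustrated with a minimal sketch. This is not ContPro's code: the function name `binding_site_residues`, the tuple layout for atoms, and the toy coordinates are all assumptions made for illustration. A residue is counted as a binding-site residue if any of its atoms lies within the chosen cutoff of any ligand atom.

```python
import math

def binding_site_residues(protein_atoms, ligand_atoms, threshold=4.0):
    """Return sorted (chain, residue number, residue name) tuples for residues
    that have at least one atom within `threshold` angstroms of a ligand atom.

    protein_atoms: iterable of (chain, resnum, resname, (x, y, z))  # assumed layout
    ligand_atoms:  iterable of (x, y, z)
    """
    hits = set()
    for chain, resnum, resname, xyz in protein_atoms:
        for lig_xyz in ligand_atoms:
            if math.dist(xyz, lig_xyz) <= threshold:
                hits.add((chain, resnum, resname))
                break  # one contact is enough for this atom
    return sorted(hits)

# Hypothetical toy coordinates, not taken from any real structure.
protein_atoms = [
    ("A", 10, "ASP", (0.0, 0.0, 0.0)),
    ("A", 11, "GLY", (3.0, 0.0, 0.0)),
    ("A", 12, "LYS", (9.0, 0.0, 0.0)),
]
ligand_atoms = [(0.0, 3.5, 0.0)]

# Scanning several cutoffs shows how the binding site grows with the threshold.
for cutoff in (4.0, 5.0, 10.0):
    print(cutoff, binding_site_residues(protein_atoms, ligand_atoms, cutoff))
```

In practice the atom tuples would come from parsing the ATOM and HETATM records of a coordinate file for the selected chain and ligand; that parsing step is left out here.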

4.
Cystic fibrosis transmembrane conductance regulator (CFTR) is an anion channel in the ATP-binding cassette (ABC) transporter protein family. In the presence of ATP and physiologically relevant concentrations of AMP, CFTR exhibits adenylate kinase activity (ATP + AMP ⇆ 2 ADP). Previous studies suggested that the interaction of nucleotide triphosphate with CFTR at ATP-binding site 2 is required for this activity. Two other ABC proteins, Rad50 and a structural maintenance of chromosome protein, also have adenylate kinase activity. All three ABC adenylate kinases bind and hydrolyze ATP in the absence of other nucleotides. However, little is known about how an ABC adenylate kinase interacts with ATP and AMP when both are present. Based on data from non-ABC adenylate kinases, we hypothesized that ATP and AMP mutually influence their interaction with CFTR at separate binding sites. We further hypothesized that only one of the two CFTR ATP-binding sites is involved in the adenylate kinase reaction. We found that 8-azidoadenosine 5′-triphosphate (8-N3-ATP) and 8-azidoadenosine 5′-monophosphate (8-N3-AMP) photolabeled separate sites in CFTR. Labeling of the AMP-binding site with 8-N3-AMP required the presence of ATP. Conversely, AMP enhanced photolabeling with 8-N3-ATP at ATP-binding site 2. The adenylate kinase active center probe P1,P5-di(adenosine-5′) pentaphosphate interacted simultaneously with an AMP-binding site and ATP-binding site 2. These results show that ATP and AMP interact with separate binding sites but mutually influence their interaction with the ABC adenylate kinase CFTR. They further indicate that the active center of the adenylate kinase comprises ATP-binding site 2.

5.
Abstract

A comparative model building process has been utilized to predict the three-dimensional structure of the bacteriophage 434 Cro protein. Amino acid sequence similarities between the 434 Cro protein and other bacteriophage repressor and Cro proteins have been used, in conjunction with secondary structure prediction and the known structures of other base sequence specific DNA binding proteins, to derive the model. From this model the interactions between the 434 Cro protein and its operator DNA have been deduced. These proposed interactions are consistent with the known properties of the bacteriophage 434 Cro protein.

6.
Background

In eukaryotes, ubiquitin conjugation is an important mechanism underlying proteasome-mediated degradation of proteins, and as such, plays an essential role in the regulation of many cellular processes. In the ubiquitin-proteasome pathway, E3 ligases play important roles by recognizing a specific protein substrate and catalyzing the attachment of ubiquitin to a lysine (K) residue. As more and more experimental data on ubiquitin conjugation sites become available, it becomes possible to develop prediction models that can be scaled to big data. However, no previous work has focused on investigating the substrate specificities of ubiquitination. Herein, we present an approach that exploits an iterative statistical method to identify ubiquitin conjugation sites with substrate site specificities.

Results

In this investigation, a total of 6259 experimentally validated ubiquitinated proteins were obtained from dbPTM. After filtering out homologous fragments with 40% sequence identity, the training data set contained 2658 ubiquitination sites (positive data) and 5532 non-ubiquitinated sites (negative data). Because of the difficulty in characterizing the substrate site specificities of E3 ligases by conventional sequence logo analysis, a recursive statistical method was applied to obtain significant conserved motifs. The profile hidden Markov model (profile HMM) was adopted to construct the predictive models learned from the identified substrate motifs. A five-fold cross-validation was then used to evaluate the predictive model, achieving sensitivity, specificity, and accuracy of 73.07%, 65.46%, and 67.93%, respectively. Additionally, an independent testing set, completely blind to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (76.13%) and outperform other ubiquitination site prediction tools.
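The three reported cross-validation figures are linked by the class sizes: accuracy is the average of sensitivity and specificity weighted by the numbers of positive and negative sites. The short check below is only a sketch, assuming the reported percentages are pooled over the whole training set (2658 positives, 5532 negatives); the function name `overall_accuracy` is made up for this illustration.

```python
def overall_accuracy(sensitivity, specificity, n_pos, n_neg):
    """Accuracy implied by sensitivity and specificity on a set with
    n_pos positive and n_neg negative examples."""
    true_pos = sensitivity * n_pos   # correctly predicted ubiquitination sites
    true_neg = specificity * n_neg   # correctly predicted non-ubiquitinated sites
    return (true_pos + true_neg) / (n_pos + n_neg)

# Values quoted in the abstract.
acc = overall_accuracy(0.7307, 0.6546, n_pos=2658, n_neg=5532)
print(f"{acc:.2%}")  # prints 67.93%
```

Under that assumption the reported accuracy of 67.93% is consistent with the reported sensitivity and specificity. Because the negative class is about twice as large as the positive class, accuracy sits closer to the specificity.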

Conclusion

A case study demonstrated the effectiveness of the characterized substrate motifs for identifying ubiquitination sites. The proposed method presents a practical means of preliminary analysis and greatly diminishes the total number of potential targets required for further experimental confirmation. This method may help unravel their mechanisms and roles in E3 recognition and ubiquitin-mediated protein degradation.


7.
Reversible protein phosphorylation is one of the most important post-translational modifications and regulates various biological cellular processes. Identification of kinase-specific phosphorylation sites is helpful for understanding the phosphorylation mechanism and regulation processes. Although a number of computational approaches have been developed, few studies currently consider the hierarchical structure of kinases, and most of the existing tools use only local sequence information to construct predictive models. In this work, we conduct a systematic and hierarchy-specific investigation of protein phosphorylation site prediction in which protein kinases are clustered into hierarchical structures with four levels: kinase, subfamily, family and group. To enhance phosphorylation site prediction at all hierarchical levels, functional information of proteins, including gene ontology (GO) and protein–protein interaction (PPI), is adopted in addition to primary sequence to construct prediction models based on random forest. Analysis of selected GO and PPI features shows that functional information is critical in determining protein phosphorylation sites at every hierarchical level. Furthermore, the prediction results on Phospho.ELM and an additional testing dataset demonstrate that the proposed method remarkably outperforms existing phosphorylation prediction methods at all hierarchical levels. The proposed method is freely available at http://bioinformatics.ustc.edu.cn/phos_pred/.

8.
Abstract

The synthesis and X-ray crystal structures of a series of 5-substituted-6-aza-2′-deoxyuridines are reported. These nucleoside analogues inhibit the phosphorylation of thymidine by HSV-1 TK but have no effect on the corresponding human enzyme. Detailed examination of one analogue proves it to be a competitive inhibitor of thymidine with a Ki of 0.34 μM and a very poor substrate. The analogues are not substrates for the enzyme and also do not inhibit the degradation of thymidine by thymidine phosphorylase. Molecular modelling showed that the inhibitors fit well in the active site of HSV-1 TK, provided the conformation of the sugar moiety is the same as for thymidine in the complex.

9.
Abstract

The gene encoding the amylolytic enzyme Amo45, originating from a metagenomic project, was retrieved by a consensus primer-based approach for glycoside hydrolase (GH) family 57 enzymes. Family 57 contains mainly uncharacterized proteins similar to archaeal thermoactive amylopullulanases. For characterization of these family members, soluble, active enzymes have to be produced in sufficient amounts. Heterologous expression of amo45 in E. coli resulted in low yields of protein, most of which was found in inclusion bodies. To improve protein production and to increase the amount of soluble protein, two different modifications of the gene were applied. The first was fusion to an N-terminal His-tag sequence, which increased the yield of protein but still resulted in high amounts of inclusion bodies. Co-expression with chaperones enhanced the amount of soluble protein 4-fold. An alternative modification was the attachment of a peptide consisting of the amino acid sequence of the mobile loop of the co-chaperonin GroES of E. coli. This sequence improved the soluble protein production 5-fold compared to His6-Amo45, and additional expression of chaperones was unnecessary.

10.
Abstract

It is widely believed that the prediction of the three-dimensional structures of proteins from first principles is impossible. This view is based on the fact that the number of possible structures for each protein is astronomically large. The question is then why a protein folds into its native structure with the proper biological functions on a time scale of milliseconds to minutes; this is called Levinthal's paradox. In this article I will discuss our strategy for attacking the protein folding problem. Our approach consists of two elements: the inclusion of accurate solvent effects and the development of powerful simulation algorithms that can avoid getting trapped in local energy minima. For the former, we discuss several models varying in nature from crude (distance-dependent dielectric function) to rigorous (reference interaction site model). For the latter, we show the effectiveness of Monte Carlo simulated annealing and generalized-ensemble algorithms.
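To make concrete how Monte Carlo simulated annealing avoids getting trapped in local minima, here is a minimal sketch on a toy one-dimensional energy function. It is not the author's protein simulation: the energy function, the geometric cooling schedule, and all parameter values are assumptions chosen only for illustration. Uphill moves are accepted with the Metropolis probability exp(-ΔE/T), so the walker can cross barriers while the temperature is high and settles as it is lowered.

```python
import math
import random

def energy(x):
    # Hypothetical rugged landscape: a shallow parabola plus a sine ripple,
    # giving several local minima and one global minimum near x = -0.5.
    return 0.1 * x * x + math.sin(3.0 * x)

def simulated_annealing(x0, t_start=5.0, t_end=0.01, steps=20000,
                        step_size=0.5, seed=0):
    """Metropolis Monte Carlo with a geometrically decreasing temperature.
    Returns the lowest-energy point visited and its energy."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-dE / T) (Metropolis criterion).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

# Start in the basin of a high-lying local minimum, far from the global one.
best_x, best_e = simulated_annealing(x0=4.0)
print(best_x, best_e)
```

A plain downhill search started at x = 4.0 would stop in the nearest local minimum; the temperature schedule is what lets the search leave it. Generalized-ensemble algorithms, also mentioned in the abstract, address the same trapping problem differently and are not shown here.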

11.
Here we identify the determinants of the nucleotide-binding ability associated with the P-loop-containing proteins, inferring their functional importance from their structural convergence to a unique three-dimensional (3D) motif. (1) A new surface 3D pattern is identified for the P-loop nucleotide-binding region, which is more selective than the corresponding sequence pattern; (2) the signature displays one residue that we propose is the determinant for the guanine-binding ability (the residues aligned to ras D119; this residue is known to be important only in the G-proteins; we extend the prediction to all the other P-loop-containing proteins); and (3) two cases of convergent evolution at the molecular level are highlighted in the analysis of the active site: the positive charge aligned to ras K117 and the arginine residues aligned to the GAP arginine finger. The analysis of the residues conserved on protein surfaces allows one to identify new functional or evolutionary relationships among protein structures that would not be detectable by conventional sequence or structure comparison methods.

12.

Background

Conformational flexibility creates errors in the comparison of protein structures. Even small changes in backbone or sidechain conformation can radically alter the shape of ligand binding cavities. These changes can cause structure comparison programs to overlook functionally related proteins with remote evolutionary similarities, and cause others to incorrectly conclude that closely related proteins have different binding preferences, when their specificities are actually similar. Towards the latter effort, this paper applies protein structure prediction algorithms to enhance the classification of homologous proteins according to their binding preferences, despite radical conformational differences.

Methods

Specifically, structure prediction algorithms can be used to "remodel" existing structures against the same template. This process can return proteins in very different conformations to similar, objectively comparable states. Operating on close homologs exploits the accuracy of structure predictions on closely related proteins, but structure prediction is often a nondeterministic process. Identical inputs can generate subtly different models with very different binding cavities that make structure comparison difficult. We present a first method to mitigate such errors, called "medial remodeling", that examines a large number of predicted structures to eliminate extreme models of the same binding cavity.
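The abstract does not give the details of medial remodeling, so the following is only a sketch of one plausible reading of "eliminating extreme models": among many predicted models of the same cavity, keep the one that is most central. Each model is summarized here by a hypothetical numeric descriptor (cavity volume and depth are invented for the example), and the central model is taken to be the medoid, the model with the smallest total distance to all the others. None of this is claimed to be the authors' algorithm.

```python
def medial_model(descriptors):
    """Return the index of the medoid: the model whose descriptor has the
    smallest summed Euclidean distance to every other model's descriptor."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    totals = [sum(dist(a, b) for b in descriptors) for a in descriptors]
    return min(range(len(descriptors)), key=totals.__getitem__)

# Hypothetical descriptors for five remodeled structures of one protein:
# (cavity volume in cubic angstroms, cavity depth in angstroms).
models = [
    (410.0, 7.9),
    (405.0, 8.1),
    (640.0, 3.2),   # unusually large, shallow cavity
    (415.0, 8.0),
    (180.0, 12.5),  # unusually small, deep cavity
]

central = medial_model(models)
print(central, models[central])
```

Unlike an average, a medoid is always one of the actual models, and the two outlying models above cannot be selected because their total distance to the rest is large.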

Results

Our results, on the enolase and tyrosine kinase superfamilies, demonstrate that remodeling can enable proteins in very different conformations to be returned to states that can be objectively compared. Structures that would have been erroneously classified as having different binding preferences were often correctly classified after remodeling, while structures that would have been correctly classified as having different binding preferences almost always remained distinct. The enolase superfamily, which exhibited less sequential diversity than the tyrosine kinase superfamily, was classified more accurately after remodeling than the tyrosine kinases. Medial remodeling reduced errors from models with unusual perturbations that distort the shape of the binding site, enhancing classification accuracy.

Conclusions

This paper demonstrates that protein structure prediction can compensate for conformational variety in the comparison of protein-ligand binding sites. While protein structure prediction introduces new uncertainties into the structure comparison problem, our results indicate that unusual models can be ignored through an analysis of many models, using techniques like medial remodeling. These results point to applications of protein structure comparison that extend beyond existing crystal structures.

13.
Abstract

A set of software tools designed to study protein structure and kinetics has been developed. The core of these tools is a program called Folding Machine (FM), which is able to generate low-resolution folding pathways using modest computational resources. The FM is based on a coarse-grained kinetic ab initio Monte Carlo sampler that can optionally use information extracted from secondary structure prediction servers or from fragment libraries of local structure. The model underpinning this algorithm contains two novel elements: (a) the conformational space is discretized using the Ramachandran basins defined in the local φ-ψ energy maps; and (b) the solvent is treated implicitly by rescaling the pairwise terms of the non-bonded energy function according to the local solvent environments. The purpose of this hybrid ab initio/knowledge-based approach is threefold: to cover the long time scales of folding, to generate useful three-dimensional models of protein structures, and to gain insight into protein folding kinetics. Even though the algorithm is not yet fully developed, it has been used in a recent blind test of protein structure prediction (CASP5). The FM generated models within 6 Å backbone rmsd for fragments of about 60–70 residues of α-helical proteins. For a CASP5 target that turned out to be natively unfolded, the trajectory obtained for this sequence uniquely failed to converge. Also, a new measure to evaluate structure predictions is presented and used alongside the standard CASP assessment methods. Finally, recent improvements in the prediction of β-sheet structures are briefly described.

14.
Genomics has posed the challenge of determination of protein function from sequence and/or 3-D structure. Functional assignment from sequence relationships can be misleading, and structural similarity does not necessarily imply functional similarity. Proteins in the DJ-1 family, many of which are of unknown function, are examples of proteins with both sequence and fold similarity that span multiple functional classes. THEMATICS (theoretical microscopic titration curves), an electrostatics-based computational approach to functional site prediction, is used to sort proteins in the DJ-1 family into different functional classes. Active site residues are predicted for the eight distinct DJ-1 proteins with available 3-D structures. Placement of the predicted residues onto a structural alignment for six of these proteins reveals three distinct types of active sites. Each type overlaps only partially with the others, with only one residue in common across all six sets of predicted residues. Human DJ-1 and YajL from Escherichia coli have very similar predicted active sites and belong to the same probable functional group. Protease I, a known cysteine protease from Pyrococcus horikoshii, and PfpI/YhbO from E. coli, a hypothetical protein of unknown function, belong to a separate class. THEMATICS predicts a set of residues that is typical of a cysteine protease for Protease I; the prediction for PfpI/YhbO bears some similarity. YDR533Cp from Saccharomyces cerevisiae, of unknown function, and the known chaperone Hsp31 from E. coli constitute a third group with nearly identical predicted active sites. While the first four proteins have predicted active sites at dimer interfaces, YDR533Cp and Hsp31 both have predicted sites contained within each subunit. Although YDR533Cp and Hsp31 form different dimers with different orientations between the subunits, the predicted active sites are superimposable within the monomer structures. 
Thus, the three predicted functional classes form four different types of quaternary structures. The computational prediction of the functional sites for protein structures of unknown function provides valuable clues for functional classification.

15.
Abstract

A new approach using a 3-D Cartesian coordinate system to represent protein sequences has been derived. Using this 3-D graphical representation, we compare sequences belonging to nine different proteins.

16.
Abstract

Escherichia coli is a common host that is widely used for producing recombinant proteins. Although it offers a simple approach for the production of heterologous proteins, the major drawbacks of using this organism include incorrect protein folding and the formation of disordered aggregated proteins as inclusion bodies. Co-expression of target proteins with certain molecular chaperones is a rational approach to this problem. Aequorin is a calcium-activated photoprotein that is often prone to form insoluble inclusion bodies when overexpressed in E. coli cells, resulting in low active yields. Therefore, in the present research, our main aim is to increase the soluble yield of aequorin as a model protein and minimize its inclusion body content in the bacterial cells. We have applied the chaperone-assisted protein folding strategy to enhance the yield of properly folded protein with the assistance of artemin as an efficient molecular chaperone. The results indicated that the content of the soluble form of aequorin was increased when it was co-expressed with artemin. Moreover, in the co-expressing cells, the bioluminescence activity was higher than in the control sample. We presume that this method might be a potential tool to promote the solubility of other aggregation-prone proteins in bacterial cells.

17.
Abstract

A topological comparison of the two helix destabilizing proteins, pancreatic ribonuclease A and the gene 5 DNA binding protein (G5BP) of bacteriophage fd, has been completed utilizing the available high resolution tertiary structures of each protein. The results indicate that these two proteins are structurally if not also evolutionarily related. Regions of closest topological equivalence occur between beta loops that are directly involved in nucleotide binding or are required for the maintenance of their respective oligonucleotide binding channels. In addition, there is a similar placement of critical amino acid side chains about the binding site. Further evidence for this structural relationship is obtained by comparison of structural data for the mode of complexation of polynucleotides to each protein. The results of the topological comparison suggest that the essential property shared by helix destabilizing proteins, whether specialized DNA binding proteins such as G5BP or proteins with other primary functional roles, like ribonuclease A, is the presence of an elongated oligonucleotide binding channel. Although ribonuclease A and G5BP are structurally related, it seems likely that any protein with this structural feature will exhibit a helix destabilizing capacity. This conclusion is supported by the diversity of molecular characteristics shown by other proteins having this activity.

18.
19.
Using an approach for protein comparison by computer analysis based on signal treatment methods without prior alignment of the sequences, we have analysed the structure/function relationship of related proteins. The aim was to demonstrate that from a few members of related proteins, specific parameters can be obtained and used for the characterisation of newly sequenced proteins obtained by molecular biology techniques. The analysis was performed on protein kinases, which comprise the largest known family of proteins and therefore allow valid estimations to be made. We show that using only a dozen defined proteins, the specific parameters extracted from their sequences classified the protein kinase family into two sub-groups: the protein serine/threonine kinases (PSKs) and the protein tyrosine kinases (PTKs). The analysis, largely involving computation, appears applicable to large-scale data-bank analysis and prediction of protein functions.

20.
The publication of the crystallographic structure of the calmodulin protein has offered an example leading us to believe that it is possible for many protein sequence segments to exhibit multiple 3D structures; such segments are referred to as multi-structural segments. To this end, this paper presents a statistical analysis of the uniqueness of the 3D structure of all possible protein sequence segments stored in the Protein Data Bank (PDB, Jan. 2003, release 103) that occur at least twice and whose lengths are greater than 10 amino acids (AAs). We refined the set of segments by choosing only those that are not parts of longer segments, which resulted in 9297 segments called a sponge set. By adding 8197 signature segments, which occur uniquely in the PDB, to the sponge set we have generated a benchmark set. Statistical analysis of the sponge set demonstrates that the rotating, missing and disarranging operations described in the text result in the segments becoming multi-structural. It turns out that missing segments do not exhibit a change of shape in the 3D structure of a multi-structural segment. We use the root mean square distance for unit vector sequence (URMSD) as an improved measure to describe the characteristics of hinge rotations, missing, and disarranging segments. We estimated the occurrence of rotating and disarranging segments in the sponge set relative to the number of sequences in the benchmark set, and found the rate to be less than 0.85%. Since two of the structure-changing operations concern a negligible number of segments and the third is found to have no impact on the structure, we conclude that the 3D structure of proteins is conserved statistically for more than 98% of the segments.
At the same time, the remaining 2% of the sequences may pose problems for sequence-alignment-based structure prediction methods.
* Jishou Ruan's research was supported by the Liuhui Center for Applied Mathematics, the China-Canada exchange program administered by MITACS, and NSFC (10271061).
# Ke Chen and Lukasz A. Kurgan's research was partially supported by NSERC Canada. Jack A. Tuszynski's research has been supported by MITACS, NSERC Canada and the Allard Foundation.
