Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Lu Z  Dunaway-Mariano D  Allen KN 《Biochemistry》2005,44(24):8684-8696
The BT4131 gene from the bacterium Bacteroides thetaiotaomicron VPI-5482 has been cloned and overexpressed in Escherichia coli. The protein, a member of the haloalkanoate dehalogenase superfamily (subfamily IIB), was purified to homogeneity, and its X-ray crystal structure was determined to 1.9 Å resolution using the molecular replacement phasing method. BT4131 was shown by an extensive substrate screen to be a broad-range sugar phosphate phosphatase. On the basis of substrate specificity and gene context, the physiological function of BT4131 in chitin metabolism has been tentatively assigned. Comparison of the BT4131 α/β cap domain structure with those of other type IIB enzymes (phosphoglycolate phosphatase, trehalose-6-phosphate phosphatase, and proteins of unknown function known as PDB entries , , and ) identified two conserved loops (BT4131 residues 172-182 and 118-130) in the αββ(αβαβ)αββ type caps and one conserved loop in the αββαββ type caps, which contribute residues for contact with the substrate leaving group. In BT4131, the two loops contribute one polar and two nonpolar residues to encase the displaced sugar. This finding is consistent with the lax specificity BT4131 has for the ring size and stereochemistry of the sugar phosphate. In contrast, substrate docking showed that the high-specificity phosphoglycolate phosphatase (PDB entry ) uses a single substrate specificity loop to position three polar residues for interaction with the glycolate leaving group. We show how active site "solvent cages" derived from analysis of the structures of the type IIB HAD phosphatases could be used, in conjunction with the identity of the residues stationed along the cap domain substrate specificity loops, as a means of substrate identification.

2.
The "cytochrome b5 fold": structure of a novel protein superfamily   Total citations: 6 (self-citations: 0, citations by others: 6)
Selective proteolysis allows the isolation of a heme-binding fragment spectrally similar to microsomal cytochrome b5 from both baker's yeast flavocytochrome b2 (a flavohemoprotein) and liver sulfite oxidase (a molybdoprotein). The amino acid sequences of these two fragments have been published separately (Guiard & Lederer, 1976, 1979). We present in this paper an alignment of those sequences with that of microsomal cytochrome b5. The structural consequences of the similarity between the three primary structures are discussed in the light of the cytochrome b5 three-dimensional model (Mathews et al., 1971, 1972, 1975; Mathews & Czerwinski, 1976). It is concluded that the three heme-binding proteins are in all probability the products of a divergent evolution from a common ancestor and that they must present a basically similar backbone with some surface alterations. We propose to name this backbone the “cytochrome b5 fold”. The comparison of the three proteins suggests hypotheses concerning the molecular surface areas involved in the recognition of cytochrome c (the common acceptor) and of the respective reductase (flavo- or molybdoprotein). In addition, our results suggest that at some point in evolution, several copies of an initial hemoprotein gene were formed in the cellular genome. Subsequently, one copy was fused with the gene for another function: a flavoreductase in yeast cells or a molybdoreductase in hepatic cells.

3.

Background

An organism's ability to adapt to its particular environmental niche is of fundamental importance to its survival and proliferation. In the largest study of its kind, we sought to identify and exploit the amino-acid signatures that make species-specific protein adaptation possible across 100 complete genomes.

Results

Correspondence analysis of the amino acid composition of over 360,000 predicted open reading frames (ORFs) from 17 archaeal, 76 bacterial and 7 eukaryotic complete genomes showed that environmental niche is a significant factor in compositional variability. Additionally, amino acid composition clustering revealed clusters of phylogenetically unrelated archaea and bacteria that share similar environments. Composition analyses of conservative, domain-based homology modeling suggested an enrichment of the small hydrophobic residues Ala, Gly and Val and the charged residues Asp, Glu, His and Arg across all genomes. However, the larger aromatic residues Phe, Trp and Tyr are reduced in folds, and these results were not affected by low-complexity biases. We derived two simple log-odds scoring functions, from ORFs (CG) and from folds (CF), for each of the complete genomes. CF achieved an average cross-validation success rate of 85 ± 8%, whereas CG detected 73 ± 9% of species-specific sequences when competing against all other non-redundant CG. Continuously updated results are available at http://genome.mshri.on.ca.

Conclusion

Our analysis of amino acid compositions from the complete genomes provides stronger evidence for species-specific and environmental residue preferences in genomic sequences as well as in folds. Scoring functions derived from this work will be useful in future protein engineering experiments and possibly in identifying horizontal transfer events.
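
A minimal sketch of how a composition-based log-odds score of the kind described in this abstract could be computed. The abstract does not give the exact form of the CG and CF functions, so the pseudocount, the choice of background set, and all function names below are illustrative assumptions rather than the authors' implementation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences, pseudocount=1.0):
    """Amino-acid frequencies over a set of sequences (pseudocount is an assumption)."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def log_odds_table(genome_seqs, background_seqs):
    """Per-residue log-odds of one genome's composition against a background."""
    fg = composition(genome_seqs)
    bg = composition(background_seqs)
    return {a: math.log(fg[a] / bg[a]) for a in AMINO_ACIDS}


def score(seq, table):
    """Sum of per-residue log-odds; non-standard residues contribute zero."""
    return sum(table.get(ch, 0.0) for ch in seq.upper())


# Toy data, for illustration only.
table = log_odds_table(["AAAAAG"], ["AGAGAGWWWW"])
print(score("AAA", table) > score("WWW", table))
```

A sequence would then be assigned to whichever genome's table gives it the highest score, which is one plausible reading of "competing against all other" scoring functions.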

4.
Fifty-two 3D structures of Ig-like domains covering the immunoglobulin fold family (IgFF) were compared and classified according to the conservation of their secondary structures. Members of the IgFF are distantly related proteins or evolutionarily unrelated proteins with a similar fold, the Ig fold. In this paper, a multiple structural alignment of the conserved common core is described and the correlation between corresponding sequences is discussed. While the members of the IgFF exhibit wide heterogeneity in terms of tissue and species distribution or functional implications, the 3D structures of these domains are far more conserved than their sequences. We define topologically equivalent residues in the Ig-like domains, describe the hydrophobic common cores and discuss the presence of additional strands. The disulfide bridges, not necessary for the stability of the Ig fold, may have an effect on the compactness of the domains. Based upon sequence and structure analysis, we propose the introduction of two new subtypes (C3 and C4) to the previous classifications, in addition to a new global structural classification. The very low mean sequence identity between subgroups of the IgFF suggests the occurrence of both divergent and convergent evolutionary processes, explaining the wide diversity of the superfamily. Finally, this review suggests that the hydrophobic residues constituting the common hydrophobic cores are important clues to how highly divergent sequences can adopt a similar fold.

5.
6.
A new method to analyze the similarity between multiply aligned protein motifs (blocks) was developed. It identifies sets of consistently aligned blocks. These are found to be protein regions of similar function and structure that appear in different contexts. For example, the Rossmann fold ligand-binding region is found to be similar to TIM barrel and methylase regions, various protein families are predicted to have a TIM-barrel fold, and the structural relation between the ClpP protease and crotonase folds is identified from their sequences. Besides identifying local structure features, sequence similarity across short sequence regions (fewer than 20 amino acids) also predicts structure similarity of whole domains (folds) a few hundred amino acid residues long. Most of these relations could not be identified by other advanced sequence-to-sequence or sequence-to-multiple-alignment comparisons. We describe the method (termed CYRCA), present examples of our findings, and discuss their implications.

7.
We present a protein fold recognition method, MANIFOLD, which uses the similarity between target and template proteins in predicted secondary structure, sequence and enzyme code to predict the fold of the target protein. We developed a non-linear ranking scheme in order to combine the scores of the three different similarity measures used. For a difficult test set of proteins with very little sequence similarity, the program predicts the fold class correctly in 34% of cases. This is an over twofold increase in accuracy compared with sequence-based methods such as PSI-BLAST or GenTHREADER, which score 13-14% correct first hits for the same test set. The functional similarity term increases the prediction accuracy by up to 3% compared with using the combination of secondary structure similarity and PSI-BLAST alone. We argue that using functional and secondary structure information can increase the fold recognition beyond sequence similarity.

8.
PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationships among proteins in the PAS domain superfamily in view of the sequence‐structure‐dynamics‐function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for PAS domain superfamily members belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence‐conserved residues and build a phylogenetic tree. Three‐dimensional structure alignment was also applied to obtain structure‐conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have lower fluctuations than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggests a direct connection in which protein sequences are related to various functions through structural dynamics. This is a new attempt to delineate the functional evolution of proteins using the integrated information of sequence, structure, and dynamics.

9.
The bacterial porin superfamily: sequence alignment and structure prediction   Total citations: 48 (self-citations: 0, citations by others: 48)
The porins of Gram-negative bacteria are responsible for the 'molecular sieve' properties of the outer membrane. They form large water-filled channels which allow the diffusion of hydrophilic molecules into the periplasmic space. Owing to the strong hydrophilicity of their amino acid sequence and the nature of their secondary structure (beta strands), conventional hydropathy methods for predicting membrane topology are useless for this class of protein. The large number of available porin amino acid sequences was exploited to improve the accuracy of the prediction in combination with tools detecting amphipathicity of secondary structure. Using the constraints of beta-sheet structure these porins are predicted to contain 16 membrane-spanning strands, 14 of which are common to the two (enteric and the neisserial) porin subfamilies.

10.
A broad range of peroxides generated in subcellular compartments, including chloroplasts, are detoxified by peroxidases called peroxiredoxins (Prx). Prx are ubiquitously distributed in all organisms, including bacteria, fungi, animals, cyanobacteria and plants. Recently, Prx have emerged as new molecules in antioxidant defense in plants. Here, the members of the Prx gene family in Arabidopsis and rice are identified. Overall, the Prx members constitute a small family with 10 and 11 genes in Arabidopsis and rice, respectively. The prx genes from rice are assigned to their functional groups based on a homology search against the Arabidopsis protein database. Deciphering the Prx functions in rice will add novel information on the mechanism of antioxidant defense in plants. Furthermore, Prx also form part of the redox signaling cascade. Here, the Prx gene family is described for rice.

Key words: antioxidant defense, chloroplast, gene family, oxidative stress, reactive oxygen species

The formation of free radicals and reactive oxygen species (ROS) occurs in several enzymatic and non-enzymatic reactions during cellular metabolism. The accumulation of these reactive and deleterious intermediates is suppressed by an antioxidant defense mechanism composed of low-molecular-weight antioxidants and enzymes. In photosynthetic organisms, the defense against damage from free radicals and oxidative stress is crucial. For instance, ROS production occurs in photosystem II with the generation of singlet oxygen (1O2) and hydrogen peroxide (H2O2) [1,2], in photosystem I from superoxide anion radicals (O2•−) [3], and during photorespiration with the generation of H2O2 [4]. ROS production may become excessive under environmental stress conditions such as excess light, low temperature and drought [5].

The antioxidant defense mechanism is activated by antioxidant metabolites and enzymes which detoxify ROS and lipid peroxides. The detoxification of ROS can occur in various cellular compartments such as chloroplasts, mitochondria, peroxisomes and the cytosol [6]. Ascorbate peroxidase, catalase, glutathione peroxidase and superoxide dismutase are prominent antioxidant enzymes [6]. The peroxiredoxins (Prx) emerged as new components in the antioxidant defense network of barley [7,8]. Later, Prx were studied in other plants [9–14].

Prx can be classified into four different functional groups: PrxQ, 1-Cys Prx, 2-Cys Prx and Type-2 Prx [15,16]. They are members of the thioredoxin fold superfamily [17,18]. In this study, the prx genes found in the Arabidopsis and rice genomes are identified. The Arabidopsis genome encodes 10 prx genes classified into four functional categories: 1-Cys Prx, 2-Cys Prx, PrxQ and Type-2 Prx [13]. Of these, one each of 1-Cys Prx and PrxQ, two of 2-Cys Prx (2-Cys PrxA and 2-Cys PrxB) and six Type-2 Prx (PrxA–F) are identified [13] (Table 1).

Table 1

Locus | Annotation | Synonym | A* | B* | C*
AT1G48130 | 1-Cysteine peroxiredoxin 1 (ATPER1) | 1-Cys Prx | 216 | 24081.3 | 6.603
AT1G60740 | Peroxiredoxin type 2 | Type-2 PrxD | 162 | 17471.9 | 5.2297
AT1G65970 | Thioredoxin-dependent peroxidase 2 (TPX2) | Type-2 PrxC | 162 | 17413.9 | 5.2297
AT1G65980 | Thioredoxin-dependent peroxidase 1 (TPX1) | Type-2 PrxB | 162 | 17427.8 | 4.9977
AT1G65990 | Type 2 peroxiredoxin-related | Type-2 PrxA | 553 | 62653.6 | 6.4368
AT3G06050 | Peroxiredoxin IIF (PRXIIF) | Type-2 PrxF | 201 | 21445.2 | 9.3905
AT3G11630 | 2-Cys Peroxiredoxin A (2CPA, 2-Cys PrxA) | 2-Cys PrxA | 266 | 29091.7 | 7.5686
AT3G26060 | ATPRX Q, peroxiredoxin Q | PrxQ | 216 | 23677.8 | 10.0565
AT3G52960 | Peroxiredoxin type 2 | Type-2 PrxE | 234 | 24684.0 | 9.572
AT5G06290 | 2-Cysteine Peroxiredoxin B (2CPB, 2-Cys PrxB) | 2-Cys PrxB | 273 | 29779.5 | 5.414
*A, amino acids; B, molecular weight; C, isoelectric point.

In rice (rice.plantbiology.msu.edu/), there are 11 genomic loci which encode Prx proteins (Tables 2 and 3). Interestingly, a new prx gene (LOC_Os07g15670) annotated as “peroxiredoxin, putative, expressed” is identified, bringing the tally of prx genes to eleven in rice as compared to ten in Arabidopsis (Tables 1 and 2). A BLAST search identified its counterpart in Arabidopsis, which is annotated as “antioxidant/oxidoreductase” (AT1G21350) in the TAIR database (www.arabidopsis.org). The rice LOC_Os07g15670 and Arabidopsis AT1G21350 proteins share 68/78% identity/similarity over 236 amino acids (Table 3).

Table 2

Chromosome | Locus Id | Putative function/Annotation | A* | B* | C*
1 | LOC_Os01g16152 | peroxiredoxin, putative, expressed | 199 | 20873.6 | 8.2209
1 | LOC_Os01g24740 | peroxiredoxin-2E-1, chloroplast precursor, putative | 107 | 11591.5 | 6.7906
1 | LOC_Os01g48420 | peroxiredoxin, putative, expressed | 163 | 17290.8 | 5.6828
2 | LOC_Os02g09940 | peroxiredoxin, putative, expressed | 226 | 23179.5 | 6.535
2 | LOC_Os02g33450 | peroxiredoxin, putative, expressed | 262 | 28096.9 | 5.7709
4 | LOC_Os04g33970 | 2-Cys peroxiredoxin BAS1, chloroplast precursor, putative, expressed | 122 | 13410.2 | 4.3705
6 | LOC_Os06g09610 | peroxiredoxin, putative, expressed | 266 | 28926 | 10.5097
6 | LOC_Os06g42000 | peroxiredoxin, putative, expressed | 233 | 23688.3 | 9.2059
7 | LOC_Os07g15670 | peroxiredoxin, putative, expressed | 253 | 27684.6 | 9.8545
7 | LOC_Os07g44440 | peroxiredoxin, putative, expressed | 221 | 24232.6 | 5.3618
7 | LOC_Os07g44430 | peroxiredoxin, putative | 256 | 27785.3 | 6.8544

*A, amino acids; B, molecular weight; C, isoelectric point.

Table 3

Identification of rice homologs of peroxiredoxins in A. thaliana
Locus Id (Os*) | Homolog (At*) | Nomenclature | Identity/Similarity (%) | No. of aa* compared
LOC_Os01g16152 | AT3G06050 | Type-2 PrxF | 73/84 | 201
LOC_Os01g24740 | AT1G65980 | Type-2 PrxB | 42/59 | 77
LOC_Os01g48420 | AT1G65970 | Type-2 PrxC | 74/86 | 162
LOC_Os02g09940 | AT1G60740 | Type-2 PrxD | 56/72 | 166
LOC_Os02g33450 | AT5G06290 | 2-Cys Prx B | 74/82 | 272
LOC_Os04g33970 | AT3G11630 | 2-Cys PrxA | 92/96 | 88
LOC_Os06g09610 | AT3G26060 | PrxQ | 78/89 | 159
LOC_Os06g42000 | AT3G52960 | Type-2 PrxE | 61/74 | 240
LOC_Os07g15670 | AT1G21350 | Antioxidant | 68/78 | 236
LOC_Os07g44440 | AT1G65990 | Type-2 PrxA | 27/44 | 83
LOC_Os07g44430 | AT1G48130 | 1-Cys Prx | 69/83 | 221

*Os, Oryza sativa L.; At, Arabidopsis thaliana L.; aa, amino acids.

The protein alignment of the Prx members in rice with the canonical Prx2-B and Prx2-E of Arabidopsis is shown in Figure 1. The Type-2 Prx proteins are characterized by the presence of catalytic cysteine (Cys) residues (Fig. 1). The alignment of rice Prx proteins shows that the Cys residue is well conserved in LOC_Os02g09940 (Type-2 PrxD), LOC_Os06g42000 (Type-2 PrxE), LOC_Os01g48420 (Type-2 PrxC), LOC_Os01g16152 (Type-2 PrxF), LOC_Os02g33450 (2-Cys PrxB), LOC_Os07g44440 (Type-2 PrxA), LOC_Os07g44430 (1-Cys Prx) and LOC_Os06g09610 (PrxQ) (Fig. 1). However, LOC_Os01g24740 (Type-2 PrxB) and LOC_Os04g33970 (2-Cys PrxA), which are annotated as chloroplast precursors, do not have the catalytic Cys residues (Fig. 1). The newly identified LOC_Os07g15670 and AT1G21350, annotated as “peroxiredoxin, putative, expressed” and “antioxidant/oxidoreductase” respectively, also lack the catalytic Cys residues (Fig. 1).

Figure 1
Amino acid alignment of peroxiredoxins (Prx) in rice. The rice proteins are aligned with the canonical Arabidopsis Prx2-B and Prx2-E. The conserved cysteine residues are indicated by arrows on top of the alignment. Note the sequence conservation between the newly identified LOC_Os07g15670 and AT1G21350. The rice locus Ids are given on the left and amino acid positions on the right. The alignment was made with ClustalX.

Taken together, the results demonstrate that, as in Arabidopsis, the Prx constitute a small gene family in rice. However, the functional role of Prx in rice is not clearly understood.

11.
Sequence analysis,structure and binding site prediction of Sigma 1 receptor protein by in silico method     
Narayanasamy Lokeswaran Latha  Garimella Gyananath  Pudukulathan Kader Zubaidha 《Bioinformation》2013,9(19):944-951
Sigma 1 Receptor is a subtype of opioid receptor that participates in membrane remodeling and cellular differentiation in the nervous system. Sigma 1 Receptor protein with amino acid length ranging from 229 is widely distributed in the liver and moderately in the intestine, kidney, white pulp of the spleen, adrenal gland, brain, placenta and the lung. In this study, the three-dimensional structure of the sigma 1 receptor protein has been developed by in silico analysis based on evolutionary trace analysis of 37 sigma proteins from different sources. The present work focuses on the identification of functionally important residues and their interaction with antipsychotic drugs reported in the literature.

12.
The X-ray crystallographic structure and activity analysis of a Pseudomonas-specific subfamily of the HAD enzyme superfamily evidences a novel biochemical function     
Peisach E  Wang L  Burroughs AM  Aravind L  Dunaway-Mariano D  Allen KN 《Proteins》2008,70(1):197-207
The haloacid dehalogenase (HAD) superfamily is a large family of proteins dominated by phosphotransferases. Thirty-three sequence families within the HAD superfamily (HADSF) have been identified to assist in function assignment. One such family includes the enzyme phosphoacetaldehyde hydrolase (phosphonatase). Phosphonatase possesses the conserved Rossmannoid core domain and a C1-type cap domain. Other members of this family do not possess a cap domain and, because the cap domain of phosphonatase plays an important role in active site desolvation and catalysis, the function of the capless family members must be unique. A representative of the capless subfamily, PSPTO_2114, from the plant pathogen Pseudomonas syringae, was targeted for catalytic activity and structure analyses. The X-ray structure of PSPTO_2114 reveals a capless homodimer that conserves some but not all of the intersubunit contacts contributed by the core domains of the phosphonatase homodimer. The region of PSPTO_2114 that corresponds to the catalytic scaffold of phosphonatase (and other HAD phosphotransferases) positions amino acid residues that are ill suited for Mg2+ cofactor binding and mediation of phosphoryl group transfer between donor and acceptor substrates. The absence of phosphotransferase activity in PSPTO_2114 was confirmed by kinetic assays. To explore PSPTO_2114 function, the conservation of sequence motifs extending outside of the HADSF catalytic scaffold was examined. The stringently conserved residues among PSPTO_2114 homologs were mapped onto the PSPTO_2114 three-dimensional structure to identify a surface region unique to the family members that do not possess a cap domain. The hypothesis that this region is used in protein-protein recognition is explored to define, for the first time, HADSF proteins which have acquired a function other than that of a catalyst.

13.
DPANN: improved sequence to structure alignments following fold recognition     
Reinhardt A  Eisenberg D 《Proteins》2004,56(3):528-538
In fold recognition (FR) a protein sequence of unknown structure is assigned to the closest known three-dimensional (3D) fold. Although FR programs can often identify among all possible folds the one a sequence adopts, they frequently fail to align the sequence to the equivalent residue positions in that fold. Such failures frustrate the next step in structure prediction, protein model building. Hence it is desirable to improve the quality of the alignments between the sequence and the identified structure. We have used artificial neural networks (ANN) to derive a substitution matrix to create alignments between a protein sequence and a protein structure through dynamic programming (DPANN: Dynamic Programming meets Artificial Neural Networks). The matrix is based on the amino acid type and the secondary structure state of each residue. In a database of protein pairs that have the same fold but lack sequence similarity, DPANN aligns over 30% of all sequences to the paired structure, resembling closely the structural superposition of the pair. In over half of these cases the DPANN alignment is close to the structural superposition, although the initial alignment from the step of fold recognition is not close. Conversely, the alignment created during fold recognition outperforms DPANN in only 10% of all cases. Thus application of DPANN after fold recognition leads to substantial improvements in alignment accuracy, which in turn provides more useful templates for the modeling of protein structures. In the artificial case of using actual instead of predicted secondary structures for the probe protein, over 50% of the alignments are successful.
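
The dynamic-programming step described in this abstract can be sketched as a global (Needleman-Wunsch style) alignment in which each position is scored from both its amino acid type and its secondary structure state. The scoring values and the linear gap penalty below are toy assumptions standing in for the ANN-derived substitution matrix, which the abstract does not reproduce; `pair_score` and `align` are hypothetical names.

```python
def pair_score(aa_a, ss_a, aa_b, ss_b):
    """Toy score from residue type and secondary structure state (assumed values)."""
    s = 2 if aa_a == aa_b else -1
    s += 1 if ss_a == ss_b else -1
    return s


def align(seq, seq_ss, tmpl, tmpl_ss, gap=-2):
    """Global alignment of a probe sequence to a template; returns (score, row_a, row_b)."""
    n, m = len(seq), len(tmpl)
    # dp[i][j] holds the best score for aligning seq[:i] with tmpl[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = dp[i - 1][j - 1] + pair_score(
                seq[i - 1], seq_ss[i - 1], tmpl[j - 1], tmpl_ss[j - 1])
            dp[i][j] = max(diag, dp[i - 1][j] + gap, dp[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + pair_score(
                seq[i - 1], seq_ss[i - 1], tmpl[j - 1], tmpl_ss[j - 1]):
            row_a.append(seq[i - 1])
            row_b.append(tmpl[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            row_a.append(seq[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(tmpl[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


# Secondary structure strings use H (helix), E (strand), C (coil); toy input only.
print(align("ACD", "HHH", "AD", "HH"))
```

In a real pipeline the probe's secondary structure would come from a predictor and the template's from its known structure, which is why the abstract distinguishes predicted from actual secondary structure.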

14.
IWS: integrated web server for protein sequence and structure analysis     
Shameer K  Sowdhamini R 《Bioinformation》2007,2(3):86-90
The rapid increase in protein sequence information from genome sequencing projects demands the intervention of bioinformatics tools to recognize interesting gene products and their associated functions. Often, multiple algorithms need to be employed to improve the accuracy of predictions, and several structure prediction algorithms are in the public domain. Here, we report the availability of an Integrated Web Server (IWS), a bioinformatics online package dedicated to in silico analysis of protein sequence and structure data. IWS provides a web interface to both in-house and widely accepted programs from major bioinformatics groups, organized as 10 different modules. IWS also provides interactive images for the Analysis Work Flow, which give the user transparency in carrying out analyses by moving across modules seamlessly and performing predictions rapidly. Availability: IWS is available from the URL http://caps.ncbs.res.in/iws.

15.
Evaluation of short-range interactions as secondary structure energies for protein fold and sequence recognition.     
S Miyazawa  R L Jernigan 《Proteins》1999,36(3):347-356
Short-range interactions for secondary structures of proteins are evaluated as potentials of mean force from the observed frequencies of secondary structures in known protein structures, which are assumed to have an equilibrium distribution with the Boltzmann factor of secondary structure energies. A secondary conformation at each residue position in a protein is described by a tripeptide, including one nearest neighbor on each side. The secondary structure potentials are approximated as additive contributions from neighboring residues along the sequence. These are part of an empirical potential to provide a crude estimate of protein conformational energy at a residue level. Unlike previous works, interactions are decoupled into intrinsic potentials of residues, potentials of backbone-backbone interactions, and potentials of side chain-backbone interactions. Also, interactions are decoupled into one-body, two-body, and higher order interactions between peptide backbone and side chain and between backbones. These decouplings are essential to correctly evaluate the total secondary structure energy of a protein structure without overcounting interactions. Each interaction potential is evaluated separately by taking account of the correlation in the amino acid order of protein sequences. Interactions among side chains are neglected because of the relatively limited number of protein structures. Proteins 1999;36:347-356. Published 1999 Wiley-Liss, Inc.
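
The general idea of deriving a potential of mean force from observed frequencies can be illustrated with a simple Boltzmann inversion, E(s) = -kT ln(f_obs(s) / f_ref(s)). This is a generic sketch under stated assumptions: the reference distribution, the optional pseudocount, and the function name are illustrative, and the decoupling into one-body, two-body and higher order terms described in the abstract is not attempted here.

```python
import math


def potential_of_mean_force(observed, reference, kT=1.0, pseudocount=0.0):
    """Boltzmann inversion: E(s) = -kT * ln(f_obs(s) / f_ref(s)) for each state s.

    `observed` and `reference` map state labels to counts. The pseudocount
    (an assumption) can be raised above zero to avoid taking the log of zero.
    """
    states = sorted(set(observed) | set(reference))
    obs_total = sum(observed.get(s, 0) + pseudocount for s in states)
    ref_total = sum(reference.get(s, 0) + pseudocount for s in states)
    energies = {}
    for s in states:
        f_obs = (observed.get(s, 0) + pseudocount) / obs_total
        f_ref = (reference.get(s, 0) + pseudocount) / ref_total
        if f_obs <= 0 or f_ref <= 0:
            raise ValueError(f"state {s!r} has zero frequency; use a pseudocount")
        energies[s] = -kT * math.log(f_obs / f_ref)
    return energies


# Toy counts, for illustration only: states seen more often than the
# reference expects come out with negative (favorable) energy.
counts = {"helix": 60, "strand": 30, "coil": 10}
uniform = {"helix": 1, "strand": 1, "coil": 1}
print(potential_of_mean_force(counts, uniform))
```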

16.
Investigation of metal ion binding in phosphonoacetaldehyde hydrolase identifies sequence markers for metal-activated enzymes of the HAD enzyme superfamily     
Zhang G  Morais MC  Dai J  Zhang W  Dunaway-Mariano D  Allen KN 《Biochemistry》2004,43(17):4990-4997
The 2-haloalkanoic acid dehalogenase (HAD) family, which contains both carbon and phosphoryl transferases, is one of the largest known enzyme superfamilies. HAD members conserve an alpha,beta-core domain that frames the four-loop active-site platform. Each loop contributes one or more catalytic groups, which function in mediating the core chemistry (i.e., group transfer). In this paper, we provide evidence that the number of carboxylate residues on loop 4 and their positions (stations) on the loop are determinants, and therefore reliable sequence markers, for metal ion activation among HAD family members. Using this predictor, we conclude that the vast majority of the HAD members utilize a metal cofactor. Analysis of the minimum requirements for metal cofactor binding was carried out using Mg(II)-activated Bacillus cereus phosphonoacetaldehyde hydrolase (phosphonatase) as an experimental model for metal-activated HAD members. Mg(II) binding occurs via ligation to the loop 1 Asp12 carboxylate and Thr14 backbone carbonyl and to the loop 4 Asp186 carboxylate. The loop 4 Asp190 forms a hydrogen bond to the Mg(II) water ligand. X-ray structure determination of the D12A mutant in the presence of the substrate phosphonoacetaldehyde showed that replacement of the loop 1 Asp, common to all HAD family members, with Ala shifts the position of Mg(II), thereby allowing innersphere coordination to Asp190 and causing a shift in the position of the substrate. Kinetic analysis of the loop 4 mutants showed that Asp186 is essential to cofactor binding while Asp190 simply enhances it. Within the phosphonatase subfamily, Asp186 is stringently conserved, while either position 185 or position 190 is used to position the second loop 4 Asp residue. 
Retention of a high level of catalytic activity in the G185D/D190G phosphonatase mutant demonstrated the plasticity of the metal binding loop, reflected in the variety of combinations in positioning of two or three Asp residues along the seven-residue motif of the 2700 potential HAD sequences that were examined.

17.
Trends in protein evolution inferred from sequence and structure analysis     
Aravind L  Mazumder R  Vasudevan S  Koonin EV 《Current opinion in structural biology》2002,12(3):392-399
Complementary developments in comparative genomics, protein structure determination and in-depth comparison of protein sequences and structures have provided a better understanding of the prevailing trends in the emergence and diversification of protein domains. The investigation of deep relationships among different classes of proteins involved in key cellular functions, such as nucleic acid polymerases and other nucleotide-dependent enzymes, indicates that a substantial set of diverse protein domains evolved within the primordial, ribozyme-dominated RNA world.

18.
Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases     
Wallqvist A  Fukunishi Y  Murphy LR  Fadel A  Levy RM 《Bioinformatics (Oxford, England)》2000,16(11):988-1002
MOTIVATION: Sequence alignment techniques have been developed into extremely powerful tools for identifying the folding families and function of proteins in newly sequenced genomes. For a sufficiently low sequence identity it is necessary to incorporate additional structural information to positively detect homologous proteins. We have carried out an extensive analysis of the effectiveness of incorporating secondary structure information directly into the alignments for fold recognition and identification of distant protein homologs. A secondary structure similarity matrix based on a database of three-dimensionally aligned proteins was first constructed. An iterative application of dynamic programming was used which incorporates linear combinations of amino acid and secondary structure sequence similarity scores. Initially, only primary sequence information is used. Subsequently, contributions from secondary structure are phased in and new homologous proteins are positively identified if their scores are consistent with the predetermined error rate. RESULTS: We used the SCOP40 database, where only PDB sequences that have 40% homology or less are included, to calibrate homology detection by the combined amino acid and secondary structure sequence alignments. Combining predicted secondary structure with sequence information results in an 8-15% increase in homology detection within SCOP40 relative to the pairwise alignments using only amino acid sequence data at an error rate of 0.01 errors per query; a 35% increase is observed when the actual secondary structure sequences are used. Incorporating predicted secondary structure information in the analysis of six small genomes yields an improvement in the homology detection of approximately 20% over SSEARCH pairwise alignments, but no improvement in the total number of homologs detected over PSI-BLAST, at an error rate of 0.01 errors per query. 
However, because the pairwise alignments based on combinations of amino acid and secondary structure similarity are different from those produced by PSI-BLAST and the error rates can be calibrated, it is possible to combine the results of both searches. An additional 25% relative improvement in the number of genes identified at an error rate of 0.01 is observed when the data are pooled in this way. Similarly for the SCOP40 dataset, PSI-BLAST detected 15% of all possible homologs, whereas the pooled results increased the total number of homologs detected to 19%. These results are compared with recent reports of homology detection using sequence profiling methods. AVAILABILITY: Secondary structure alignment homepage at http://lutece.rutgers.edu/ssas CONTACT: anders@rutchem.rutgers.edu; ronlevy@lutece.rutgers.edu Supplementary Information: Genome sequence/structure alignment results at http://lutece.rutgers.edu/ss_fold_predictions.
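
One way to picture the "linear combinations of amino acid and secondary structure sequence similarity scores" with secondary structure "phased in" over iterations is sketched below. The abstract gives neither the weights nor the schedule, so the convex-combination form, the equal-step schedule, the maximum weight of 0.5, and both function names are assumptions made purely for illustration.

```python
def combined_score(aa_score, ss_score, ss_weight):
    """Linear combination of amino acid and secondary structure similarity scores."""
    if not 0.0 <= ss_weight <= 1.0:
        raise ValueError("ss_weight must lie in [0, 1]")
    return (1.0 - ss_weight) * aa_score + ss_weight * ss_score


def phase_in_schedule(n_iterations, max_weight=0.5):
    """Secondary structure weights per iteration, starting from sequence only."""
    if n_iterations < 1:
        raise ValueError("need at least one iteration")
    if n_iterations == 1:
        return [0.0]
    step = max_weight / (n_iterations - 1)
    return [i * step for i in range(n_iterations)]


# The first pass uses only the amino acid score; later passes mix in more
# of the secondary structure score.
for w in phase_in_schedule(3):
    print(w, combined_score(4.0, 2.0, w))
```

The per-position combined score would be plugged into the dynamic-programming recurrence at each iteration, with newly accepted homologs gated by the calibrated error rate mentioned in the abstract.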

19.
The ASTRAL compendium for protein structure and sequence analysis   Total citations: 10, self-citations: 1, citations by others: 9
Brenner SE  Koehl P  Levitt M 《Nucleic acids research》2000,28(1):254-256
The ASTRAL compendium provides several databases and tools to aid in the analysis of protein structures, particularly through the use of their sequences. The SPACI scores included in the system summarize the overall characteristics of a protein structure. A structural alignments database indicates residue equivalencies in superimposed protein domain structures. The PDB sequence-map files provide a linkage between the amino acid sequence of the molecule studied (SEQRES records in a database entry) and the sequence of the atoms experimentally observed in the structure (ATOM records). These maps are combined with information in the SCOP database to provide sequences of protein domains. Selected subsets of the domain database, with varying degrees of similarity measured in several different ways, are also available. ASTRAL may be accessed at http://astral.stanford.edu/

20.
Evolutionary genomics of the HAD superfamily: understanding the structural adaptations and catalytic diversity in a superfamily of phosphoesterases and allied enzymes     
Burroughs AM  Allen KN  Dunaway-Mariano D  Aravind L 《Journal of molecular biology》2006,361(5):1003-1034
The HAD (haloacid dehalogenase) superfamily includes phosphoesterases, ATPases, phosphonatases, dehalogenases, and sugar phosphomutases acting on a remarkably diverse set of substrates. The availability of numerous crystal structures of representatives belonging to diverse branches of the HAD superfamily provides us with a unique opportunity to reconstruct their evolutionary history and uncover the principal determinants that led to their diversification of structure and function. To this end we present a comprehensive analysis of the HAD superfamily that identifies their unique structural features and provides a detailed classification of the entire superfamily. We show that at the highest level the HAD superfamily is unified with several other superfamilies, namely the DHH, receiver (CheY-like), von Willebrand A, TOPRIM, classical histone deacetylases and PIN/FLAP nuclease domains, all of which contain a specific form of the Rossmannoid fold. These Rossmannoid folds are distinguished from others by the presence of equivalently placed acidic catalytic residues, including one at the end of the first core beta-strand of the central sheet. The HAD domain is distinguished from these related Rossmannoid folds by two key structural signatures, a "squiggle" (a single helical turn) and a "flap" (a beta hairpin motif) located immediately downstream of the first beta-strand of their core Rossmannoid fold. The squiggle and the flap motifs are predicted to provide the necessary mobility to these enzymes for them to alternate between the "open" and "closed" conformations. In addition, most members of the HAD superfamily contain inserts, termed caps, occurring at either of two positions in the core Rossmannoid fold. We show that the cap modules have been independently inserted into these two stereotypic positions on multiple occasions in evolution and display extensive evolutionary diversification independent of the core catalytic domain. 
The first group of caps, the C1 caps, is directly inserted into the flap motif and regulates access of reactants to the active site. The second group, the C2 caps, forms a roof over the active site, and access to their internal cavities might be in part regulated by the movement of the flap. The diversification of the cap module was a major factor in the exploration of a vast substrate space in the course of the evolution of this superfamily. We show that the HAD superfamily contains 33 major families distributed across the three superkingdoms of life. Analysis of the phyletic patterns suggests that at least five distinct HAD proteins are traceable to the last universal common ancestor (LUCA) of all extant organisms. While these prototypes diverged prior to the emergence of the LUCA, the major diversification in terms of both substrate specificity and reaction types occurred after the radiation of the three superkingdoms of life, primarily in bacteria. Most major diversification events appear to correlate with the acquisition of new metabolic capabilities, especially related to the elaboration of carbohydrate metabolism in the bacteria. The newly identified relationships and functional predictions provided here are likely to aid the future exploration of the numerous poorly understood members of this large superfamily of enzymes.
