Similar literature
 20 similar articles found
1.
Fifty-two 3D structures of Ig-like domains covering the immunoglobulin fold family (IgFF) were compared and classified according to the conservation of their secondary structures. Members of the IgFF are distantly related proteins or evolutionarily unrelated proteins with a similar fold, the Ig fold. In this paper, a multiple structural alignment of the conserved common core is described and the correlation between corresponding sequences is discussed. While the members of the IgFF exhibit wide heterogeneity in terms of tissue and species distribution or functional implications, the 3D structures of these domains are far more conserved than their sequences. We define topologically equivalent residues in the Ig-like domains, describe the hydrophobic common cores and discuss the presence of additional strands. The disulfide bridges, not necessary for the stability of the Ig fold, may have an effect on the compactness of the domains. Based upon sequence and structure analysis, we propose the introduction of two new subtypes (C3 and C4) to the previous classifications, in addition to a new global structural classification. The very low mean sequence identity between subgroups of the IgFF suggests the occurrence of both divergent and convergent evolutionary processes, explaining the wide diversity of the superfamily. Finally, this review suggests that hydrophobic residues constituting the common hydrophobic cores are important clues to explain how highly divergent sequences can adopt a similar fold.

2.
C Sander  R Schneider 《Proteins》1991,9(1):56-68
The database of known protein three-dimensional structures can be significantly increased by the use of sequence homology, based on the following observations. (1) The database of known sequences, currently at more than 12,000 proteins, is two orders of magnitude larger than the database of known structures. (2) The currently most powerful method of predicting protein structures is model building by homology. (3) Structural homology can be inferred from the level of sequence similarity. (4) The threshold of sequence similarity sufficient for structural homology depends strongly on the length of the alignment. Here, we first quantify the relation between sequence similarity, structure similarity, and alignment length by an exhaustive survey of alignments between proteins of known structure and report a homology threshold curve as a function of alignment length. We then produce a database of homology-derived secondary structure of proteins (HSSP) by aligning to each protein of known structure all sequences deemed homologous on the basis of the threshold curve. For each known protein structure, the derived database contains the aligned sequences, secondary structure, sequence variability, and sequence profile. Tertiary structures of the aligned sequences are implied, but not modeled explicitly. The database effectively increases the number of known protein structures by a factor of five to more than 1800. The results may be useful in assessing the structural significance of matches in sequence database searches, in deriving preferences and patterns for structure prediction, in elucidating the structural role of conserved residues, and in modeling three-dimensional detail by homology.  相似文献   

3.
The wealth of biological information provided by structural and genomic projects opens new prospects of understanding life and evolution at the molecular level. In this work, it is shown how computational approaches can be exploited to pinpoint protein structural features that remain invariant over long evolutionary periods in the fold-type I, PLP-dependent enzymes. A nonredundant set of 23 superposed crystallographic structures belonging to this superfamily was built. Members of this family typically display high structural conservation despite low sequence identity. For each structure, a multiple-sequence alignment of orthologous sequences was obtained, and the 23 alignments were merged using the structural information to obtain a comprehensive multiple alignment of 921 sequences of fold-type I enzymes. The structurally conserved regions (SCRs), the evolutionarily conserved residues, and the conserved hydrophobic contacts (CHCs) were extracted from this data set, using both sequence and structural information. The results of this study identified a structural pattern of hydrophobic contacts shared by all of the superfamily members of fold-type I enzymes and involved in native interactions. This profile highlights the presence of a nucleus for this fold, in which residues participating in the most conserved native interactions exhibit preferential evolutionary conservation that correlates significantly (r = 0.70) with the mean hydrophobic contact value of their apolar fraction.

4.
O Poch  H L'Hôte  V Dallery  F Debeaux  R Fleer  R Sodoyer 《Gene》1992,118(1):55-63
The LAC4 gene encoding the beta-galactosidase (beta Gal) of the yeast, Kluyveromyces lactis, was cloned on a 7.2-kb fragment by complementation of a lacZ-deficient Escherichia coli strain. The nucleotide sequence of the structural gene, with 42 bp and 583 bp of the 5'- and 3'-flanking sequences, respectively, was determined. The deduced amino acid (aa) sequence of the K. lactis beta Gal predicts a 1025-aa polypeptide with a calculated M(r) of 117618 and reveals extended sequence homologies with all the published prokaryotic beta Gal sequences. This suggests that the eukaryotic beta Gal is closely related, evolutionarily and structurally, to the prokaryotic beta Gal's. In addition, sequence similarities were observed between the highly conserved N-terminal two-thirds of the beta Gal and the entire length of the beta-glucuronidase (beta Glu) polypeptides, which suggests that beta Glu is clearly related, structurally and evolutionarily, to the N-terminal two-thirds of the beta Gal. The structural analysis of the beta Gal alignment, performed by mean secondary structure prediction, revealed that most of the invariant residues are located in turn or loop structures. The location of the invariant residues is discussed with respect to their accessibility and their possible involvement in the catalytic process.  相似文献   

5.
U Sreenivasan  P H Axelsen 《Biochemistry》1992,31(51):12785-12791
Buried water molecules in the structurally homologous family of eukaryotic serine proteases were examined to determine whether buried waters and their protein environments are conserved in these proteins. We found 16 equivalent water sites conserved in trypsin/ogen, chymotrypsin/ogen, elastase, kallikrein, thrombin, rat tonin and rat mast cell protease, and 5 additional water sites in enzymes which share the primary specificity of trypsin. Based on an alignment of 30 serine protease sequences, it appears that the protein environments of these 21 conserved buried waters are highly conserved. The protein environments of buried waters are comprised primarily of atoms from highly conserved residues or main chain atoms from nonconserved residues. In one instance, the protein environment of a water is conserved even in the presence of an unlikely Pro/Ala substitution. We also note 3 instances in which a histidine side chain substitutes for water, suggesting that the structural role of water at these sites is satisfied by the presence of an alternative hydrogen bonding partner. Buried waters appear to be integral structural components of these proteins and should be incorporated into protein structures predicted on the basis of sequence homology to this family, including the catalytic domains of coagulation proteases.  相似文献   

6.
The SH3 domain, comprised of approximately 60 residues, is found within a wide variety of proteins, and is a mediator of protein-protein interactions. Due to the large number of SH3 domain sequences and structures in the databases, this domain provides one of the best available systems for the examination of sequence and structural conservation within a protein family. In this study, a large and diverse alignment of SH3 domain sequences was constructed, and the pattern of conservation within this alignment was compared to conserved structural features, as deduced from analysis of eighteen different SH3 domain structures. Seventeen SH3 domain structures solved in the presence of bound peptide were also examined to identify positions that are consistently most important in mediating the peptide-binding function of this domain. Although residues at the two most conserved positions in the alignment are directly involved in peptide binding, residues at most other conserved positions play structural roles, such as stabilizing turns or comprising the hydrophobic core. Surprisingly, several highly conserved side-chain to main-chain hydrogen bonds were observed in the functionally crucial RT-Src loop between residues with little direct involvement in peptide binding. These hydrogen bonds may be important for maintaining this region in the precise conformation necessary for specific peptide recognition. In addition, a previously unrecognized yet highly conserved beta-bulge was identified in the second beta-strand of the domain, which appears to provide a necessary kink in this strand, allowing it to hydrogen bond to both sheets comprising the fold.  相似文献   

7.
Amino acid substitution tables are calculated for residues in membrane proteins where the side chain is accessible to the lipid. The analysis is based upon the knowledge of the three-dimensional structures of two homologous bacterial photosynthetic reaction centers and alignments of their sequences with the sequences of related proteins. The patterns of residue substitutions show that the lipid-accessible residues are less conserved and have distinctly different substitution patterns from the inaccessible residues in water-soluble proteins. The observed substitutions obtained from sequence alignments of transmembrane regions (identified from, e.g., hydrophobicity analysis) can be compared with the patterns derived from the substitution tables to predict the accessibility of residues to the lipid. A Fourier transform method, similar to that used for the calculation of a hydrophobic moment, is used to detect periodicity in the predicted accessibility that is compatible with the presence of an alpha-helix. If the putative transmembrane region is identified as helical, then the buried and exposed faces can be discriminated. The presence of charged residues on the lipid-exposed face can help to identify the regions that are in contact with the polar environment on the borders of the bilayer, and the construction of a meaningful three-dimensional model is then possible. This method is tested on an alignment of bacteriorhodopsin and two related sequences for which there are structural data at near atomic resolution.  相似文献   
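
The periodicity test described in this abstract can be illustrated with a small sketch. It is not the authors' program: the function names, the 5-degree scan grid, and the synthetic accessibility profile are assumptions made for illustration. The only idea taken from the abstract is that a Fourier transform of a per-residue profile, as in a hydrophobic moment calculation, peaks near 100 degrees per residue when the segment is an alpha-helix.

```python
import math

def periodicity_power(values, angle_deg):
    """Fourier power of a per-residue profile at a given angular step per residue."""
    mean = sum(values) / len(values)
    step = math.radians(angle_deg)
    # Subtract the mean so a constant offset does not contribute power.
    re = sum((v - mean) * math.cos(k * step) for k, v in enumerate(values))
    im = sum((v - mean) * math.sin(k * step) for k, v in enumerate(values))
    return re * re + im * im

def best_angle(values, angles=range(0, 181, 5)):
    """Angle (degrees per residue) on a hypothetical scan grid with maximal power."""
    return max(angles, key=lambda a: periodicity_power(values, a))

# Synthetic predicted-accessibility profile with ideal helical periodicity
# (100 degrees per residue over 18 residues); real profiles are noisier.
profile = [math.cos(math.radians(100 * k)) for k in range(18)]
print(best_angle(profile))
```

A peak near 100 degrees is compatible with a helix; the phase of the transform at that angle (not computed here) would indicate which face is exposed.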

8.
The information required to generate a protein structure is contained in its amino acid sequence, but how three-dimensional information is mapped onto a linear sequence is still incompletely understood. Multiple structure alignments of similar protein structures have been used to investigate conserved sequence features but contradictory results have been obtained, due, in large part, to the absence of objective criteria to be used in the construction of sequence profiles and in the quantitative comparison of alignment results. Here, we report a new procedure for multiple structure alignment and use it to construct structure-based sequence profiles for similar proteins. The definition of "similar" is based on the structural alignment procedure and on the protein structural distance (PSD) described in paper I of this series, which offers an objective measure for protein structure relationships. Our approach is tested in two well-studied groups of proteins: serine proteases and Ig-like proteins. It is demonstrated that the quality of a sequence profile generated by a multiple structure alignment is quite sensitive to the PSD used as a threshold for the inclusion of proteins in the alignment. Specifically, if the proteins included in the aligned set are too distant in structure from one another, there will be a dilution of information and patterns that are relevant to a subset of the proteins are likely to be lost.
In order to understand better how the same three-dimensional information can be encoded in seemingly unrelated sequences, structure-based sequence profiles are constructed for subsets of proteins belonging to nine superfolds. We identify patterns of relatively conserved residues in each subset of proteins. It is demonstrated that the most conserved residues are generally located in the regions where tertiary interactions occur and that are relatively conserved in structure. Nevertheless, the conservation patterns are relatively weak in all cases studied, indicating that structure-determining factors that do not require a particular sequential arrangement of amino acids, such as secondary structure propensities and hydrophobic interactions, are important in encoding protein fold information. In general, we find that similar structures can fold without having a set of highly conserved residue clusters or a well-conserved sequence profile; indeed, in some cases there is no apparent conservation pattern common to structures with the same fold. Thus, when a group of proteins exhibits a common and well-defined sequence pattern, it is more likely that these sequences have a close evolutionary relationship rather than the similarities having arisen from the structural requirements of a given fold.

9.
A general protein sequence alignment methodology for detecting a priori unknown common structural and functional regions is described. The method proposed in this paper is based on two basic requirements for a meaningful alignment. First, each sequence or segment of a sequence is characterized by a multivariate physicochemical profile. Second, the alignment is performed by considering all the sequences simultaneously, and the algorithm detects those regions that form a set of similar profiles. In order to test the structural meaning of the alignment obtained from the sequences, quantitative comparisons are performed with structurally conserved regions (SCR) determined from the X-ray structures of three serine proteases. Results suggest that the limits of the SCR may be predicted from the similarities between the physicochemical profiles of the sequences. The procedures are not completely automated. The final step requires a visual screening of alternative pathways in order to determine an optimal alignment.  相似文献   

10.
It has been shown previously that some membrane proteins have a conserved core of amino acid residues. This idea not only serves to orient helices during model building exercises but may also provide insight into the structural role of residues mediating helix-helix interactions. Using experimentally determined high-resolution structures of alpha-helical transmembrane proteins we show that, of the residues within the hydrophobic transmembrane spans, the residues at lipid and subunit interfaces are more evolutionarily variable than those within the lipid-inaccessible core of a polypeptide's transmembrane domain. This supports the idea that helix-helix interactions within the same polypeptide chain and those at the interface between different polypeptide chains may arise in distinct ways. To show this, we use a new method to estimate the substitution rate of an amino acid residue given an alignment and phylogenetic tree of closely related proteins. This method gives better sensitivity in the otherwise-conserved transmembrane domains than a conventional similarity analysis and is relatively insensitive to the sequences used.  相似文献   

11.
The genes coding for bacterioopsin, haloopsin, and sensory opsin I of a halobacterial isolate from the Red Sea called Halobacterium sp. strain SG1 have been cloned and sequenced. The deduced protein sequences were aligned to the previously known halobacterial retinal proteins. The addition of these new sequences lowered the number of conserved residues to only 23 amino acids, or 8% of the alignment. Data base searches with two highly conserved peptides as well as with an alignment profile yielded no significant similarity to any other protein, so the halobacterial retinal proteins should be regarded as a distinct protein family. The protein alignment was used to make predictions about the structure of the retinal proteins as well as about the amino acids in contact with retinal proteins. These results were in excellent agreement with the structural model of bacteriorhodopsin of Halobacterium halobium as well as with mutant studies, indicating that (i) structure predictions based on the sequences of a membrane protein family can be quite accurate; (ii) halorhodopsin and sensory rhodopsin I have tertiary structures similar to that of bacteriorhodopsin; (iii) conserved amino acids do not take part in reactions specific for one group of proteins, e.g., proton translocation for bacteriorhodopsins, but have a crucial role in determining the conformation and reactions of the chromophore; and (iv) the general mode of action (light-induced chromophore and protein movements) is the same for all halobacterial retinal proteins, ion pumps as well as sensors.  相似文献   

12.
The three-dimensional structures of globins are known, from crystallographic analyses, to be very similar. Their amino acid sequences, however, differ greatly. Only two residues are absolutely conserved in all sequences, and the residue identities of some pairs of sequences are only 16%. We have determined the nature and exact extent of the sequence variations and the extent to which the conserved features of the globin sequences are unique to this family. The 226 globin sequences now known were aligned and analysed. Because distantly related protein sequences cannot be aligned correctly without the use of structural data, we developed a method that incorporated structural information into the alignment procedure. Analysis of the aligned sequences shows that: (1) Although individual chains vary in size between 132 and 157 residues, deletions and insertions result in there being only 102 residue sites common to all globins. These sites form six separate regions. Insertions and deletions between these regions mean that their separations can vary in different sequences. (2) Within the conserved regions there are 32 sites that almost always contain hydrophobic residues. In the known structures, these sites are in the protein interior. We measured the variations in the size of the residues that occur in the 226 sequences at these sites. At six sites the residues differ in size by less than 40 Å³, at 11 sites they differ by 40 to 100 Å³, and at 15 sites they differ by more than 100 Å³. There are two other conserved buried sites: one contains the His linked to the haem iron and the other usually contains a His involved with the haem ligand. (3) Within the conserved regions there are another 32 sites that are almost always occupied by charged, polar or small non-polar (Gly or Ala) residues. In the known structures, these sites are on the protein surface. To determine the extent to which the conserved features found for the globin sequences are unique to that protein family, the following procedure was used. The six conserved regions, and the residue restrictions that occur at the 66 sites within these regions, were encoded into two "templates". One was based only on the sequences so far determined; the other was extended to include as yet unobserved substitutions that seemed plausible on the basis of size, hydrophobicity and polarity. Each of the 3286 non-globin sequences in the data bank was then examined by a computer program to see how closely it could be matched to these templates. (ABSTRACT TRUNCATED AT 400 WORDS)

13.
A comparative analysis is presented of 24 known amino acid sequences of RNA-dependent RNA polymerases of positive strand RNA viruses infecting animals, plants and bacteria. Using a newly proposed methodology of group alignment for weakly similar sequences, evolutionarily conserved fragments of all these proteins were unambiguously aligned. A unique pattern (consensus) of 7 invariant amino acid residues was revealed which is absent from the sequences of other RNA and DNA polymerases and is thought to unequivocally identify the RNA-dependent RNA polymerases of positive strand RNA viruses. Based on the obtained alignment, a tentative phylogenetic tree of viral RNA polymerases was constructed for the first time. The RNA-dependent RNA polymerases of positive strand RNA viruses are concluded to comprise a distinct family of evolutionarily related proteins.

14.
Sequence alignment is a common method for finding structurally conserved or similar regions in proteins. However, sequence alignment is often not accurate if the identity between the sequences to be aligned is less than 30%. This is because, for these sequences, different residues may play similar structural roles and are incorrectly aligned when the alignment uses a substitution matrix covering all 20 residue types. Based on the similarity of physicochemical features, residues can be clustered into a few groups. Using such simplified alphabets, the complexity of protein sequences is reduced while the key information encoded in the sequences remains. As a result, the accuracy of sequence alignment might be improved if the residues are properly clustered. Here, by using a database of aligned protein structures (DAPS), a new clustering method based on the substitution scores is proposed for the grouping of residues, and substitution matrices of residues at different levels of simplification are constructed. The validity of the reduced alphabets is confirmed by relative entropy analysis. The reduced alphabets are applied to the recognition of structurally conserved or similar regions by sequence alignment. The results indicate that the accuracy or efficiency of sequence alignment can be improved with the optimal reduced alphabet with N around 9.
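
To make the idea of a reduced alphabet concrete, here is a minimal sketch. The nine groups below are a hypothetical physicochemical clustering chosen only for illustration; they are not the grouping derived from DAPS in the paper, and the two toy sequences are assumed to be already aligned without gaps.

```python
# Hypothetical 9-group clustering of the 20 residues (illustration only,
# not the alphabet derived in the paper).
GROUPS = ["C", "G", "P", "AST", "DE", "NQ", "HKR", "ILMV", "FWY"]

def build_map(groups):
    """Map every residue to the first letter of its group."""
    return {aa: group[0] for group in groups for aa in group}

def reduce_seq(seq, mapping):
    """Rewrite a sequence in the reduced alphabet."""
    return "".join(mapping[aa] for aa in seq)

def identity(a, b):
    """Fraction of identical positions between two pre-aligned, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

mapping = build_map(GROUPS)
seq_a, seq_b = "ILKDEFS", "VMRENYT"  # toy sequences, no position identical
red_a, red_b = reduce_seq(seq_a, mapping), reduce_seq(seq_b, mapping)
print(identity(seq_a, seq_b), red_a, red_b, identity(red_a, red_b))
```

In this toy case the 20-letter identity is zero while most positions agree in the reduced alphabet, which is the kind of signal a reduced-alphabet substitution matrix is meant to expose.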

15.
An approach is described for modelling the three-dimensional structure of a protein from the tertiary structures of several homologous proteins that have been determined by X-ray analysis. A method is developed for the simultaneous superposition of several protein molecules and for the calculation of an 'average structure' or 'framework'. Investigation of the convergence properties of this method, in the case of both weighted and unweighted least squares, demonstrates that both give a unique answer and the latter is robust for a homologous family of proteins. Multi-dimensional scaling is used to subgroup the proteins with respect to structural homology. The framework calculated on the basis of the family of homologous proteins, or of an appropriate subgroup, is used to align fragments of the known protein structures of high sequence homology with the unknown. This alignment provides a basis for model building the tertiary structure. Different techniques for using the framework to model the mainchain of various globins and an immunoglobulin domain in the structurally conserved regions are investigated.

16.
The chaperonin HSP60 (GroEL) proteins are essential in eubacterial genomes and in eukaryotic organelles. Functional regions inferred from mutation studies and the Escherichia coli GroEL 3D crystal complexes are evaluated in a multiple alignment across 43 diverse HSP60 sequences, centering on ATP/ADP and Mg2+ binding sites, on residues interacting with substrate, on GroES contact positions, on interface regions between monomers and domains, and on residues important in allosteric conformational changes. The most evolutionarily conserved residues relate to the ATP/ADP and Mg2+ binding sites. Hydrophobic residues that contribute to substrate binding are also significantly conserved. A large number of charged residues line the central cavity of the GroEL-GroES complex in the substrate-releasing conformation. These span statistically significant intra- and inter-monomer three-dimensional (3D) charge clusters that are highly conserved among sequences and presumably play an important role in interacting with the substrate. Unaligned short segments between blocks of alignment are generally exposed at the outside wall of the Anfinsen cage complex. The multiple alignment reveals regions of divergence common to specific evolutionary groups. For example, rickettsial sequences diverge in the ATP/ADP binding domain and gram-positive sequences diverge in the allosteric transition domain. The evolutionary information of the multiple alignment proffers attractive sites for mutational studies.

17.
We present a multiple alignment of the amino acid sequences of eight class A beta-lactamases and use it to propose a phylogeny, based on the nucleotide sequences of their corresponding genes. We have also used the alignment, together with the alpha-carbon co-ordinates of the Staphylococcus aureus protein, to search systematically for neighbouring residues that share the same pattern of conservation among the different members of the protein family. The invariant residues and the groups of residues with co-ordinate changes map, predominantly, to the region of the active site and to interfaces between structural elements, respectively. We have also contrasted the distribution of conserved residues with the positions which are known to differ in mutants and variants of class A beta-lactamases.

18.
For applications such as comparative modelling one major issue is the reliability of sequence alignments. Reliable regions in alignments can be predicted using sub-optimal alignments of the same pair of sequences. Here we show that reliable regions in alignments can also be predicted from multiple sequence profile information alone. Alignments were created for a set of remotely related pairs of proteins using five different test methods. Structural alignments were used to assess the quality of the alignments and the aligned positions were scored using information from the observed frequencies of amino acid residues in sequence profiles pre-generated for each template structure. High-scoring regions of these profile-derived alignment scores were a good predictor of reliably aligned regions. These profile-derived alignment scores are easy to obtain and are applicable to any alignment method. They can be used to detect those regions of alignments that are reliably aligned and to help predict the quality of an alignment. For those residues within secondary structure elements, the regions predicted as reliably aligned agreed with the structural alignments for between 92% and 97.4% of the residues. In loop regions just under 92% of the residues predicted to be reliable agreed with the structural alignments. The percentage of residues predicted as reliable ranged from 32.1% for helix residues to 52.8% for strand residues. This information could also be used to help predict conserved binding sites from sequence alignments. Residues in the template that were identified as binding sites, that aligned to an identical amino acid residue and where the sequence alignment agreed with the structural alignment were in highly conserved, high scoring regions over 80% of the time. This suggests that many binding sites that are present in both target and template sequences are in sequence-conserved regions and that there is the possibility of translating reliability to binding site prediction.
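
One simple way to picture a profile-derived alignment score is sketched below. The abstract does not give the scoring formula, so everything here is an assumption: each aligned position is scored by the observed frequency of the target residue in the template's profile column, and runs of positions at or above a hypothetical threshold are reported as candidate reliable regions.

```python
def position_scores(profile, aligned_target):
    """Score each template position by the profile frequency of the aligned target residue.

    profile: one dict per template position mapping residue -> observed frequency.
    aligned_target: target residues aligned to those positions, '-' for a gap.
    """
    return [0.0 if res == "-" else col.get(res, 0.0)
            for col, res in zip(profile, aligned_target)]

def reliable_regions(scores, threshold):
    """Return (start, end) index pairs of runs scoring at or above the threshold."""
    regions, start = [], None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(scores) - 1))
    return regions

# Toy template profile (hypothetical frequencies) and an aligned target.
toy_profile = [
    {"A": 0.9, "G": 0.1},
    {"L": 0.8, "I": 0.2},
    {"K": 0.7, "R": 0.3},
    {"S": 0.6, "D": 0.2, "E": 0.2},
    {"P": 0.7, "G": 0.3},
    {"W": 1.0},
]
toy_scores = position_scores(toy_profile, "ALKT-W")
print(toy_scores, reliable_regions(toy_scores, 0.5))
```

A practical version would probably smooth the scores over a window and calibrate the threshold against structural alignments, as the study does when assessing agreement.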

19.
Differences in salt bridges are believed to be a structural hallmark of homologous enzymes from differently temperature-adapted organisms. Nevertheless, the role of salt bridges in structural stability is still controversial. While it is clear that most buried salt bridges can have a functional or structural role, the same cannot be firmly stated for ion pairs that are exposed on the protein surface. Salt bridges found in X-ray structures may not be stably formed in solution as a result of high flexibility or a high desolvation penalty. More studies are thus needed to clarify the picture on salt bridges and temperature adaptation. We contribute here to this scenario by combining atomistic simulations and experimental mutagenesis of eight mutant variants of aqualysin I, a thermophilic subtilisin-like proteinase, in which the residues involved in salt bridges and not conserved in a psychrophilic homolog were systematically mutated. We evaluated the effects of those mutations on thermal stability and on the kinetic parameters. Overall, we show here that only a few key charged residues involved in salt bridges really contribute to the enzyme thermal stability. This is especially true when they are organized in networks, as here attested by the D17N mutation, which has the most remarkable effect on stability. Other mutations had smaller effects on the properties of the enzyme, indicating that most of the isolated salt bridges are not a distinctive trait related to the enhanced thermal stability of the thermophilic subtilase.

20.
In recent years, there has been an increased number of sequenced RNAs leading to the development of new RNA databases. Thus, predicting RNA structure from multiple alignments is an important issue to understand its function. Since RNA secondary structures are often conserved in evolution, developing methods to identify covariate sites in an alignment can be essential for discovering structural elements. Structure Logo is a technique established on the basis of entropy and mutual information measured to analyze RNA sequences from an alignment. We proposed an efficient Structure Logo approach to analyze conservations and correlations in a set of Cardioviral RNA sequences. The entropy and mutual information content were measured to examine the conservations and correlations, respectively. The conserved secondary structure motifs were predicted on the basis of the conservation and correlation analyses. Our predictive motifs were similar to the ones observed in the viral RNA structure database, and the correlations between bases also corresponded to the secondary structure in the database.  相似文献   
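
The two quantities named in this abstract can be sketched directly. This is a minimal illustration, not the Structure Logo implementation: the function names and the four-sequence toy alignment are assumptions. Column entropy measures conservation, and mutual information between two columns measures covariation of the kind expected for base-paired positions.

```python
import math
from collections import Counter

def column(alignment, i):
    """Symbols found at position i of every aligned sequence."""
    return [seq[i] for seq in alignment]

def entropy(symbols):
    """Shannon entropy in bits of the observed symbol frequencies."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def mutual_information(col_i, col_j):
    """I(i; j) = H(i) + H(j) - H(i, j), using joint symbols of the two columns."""
    return entropy(col_i) + entropy(col_j) - entropy(list(zip(col_i, col_j)))

# Toy RNA alignment (hypothetical): columns 0 and 2 covary as Watson-Crick
# pairs, columns 1 and 3 are fully conserved.
toy_alignment = ["GACC", "CAGC", "AAUC", "UAAC"]
c0, c1, c2 = (column(toy_alignment, i) for i in range(3))
print(entropy(c0), entropy(c1), mutual_information(c0, c2), mutual_information(c0, c1))
```

For a four-letter alphabet the per-column information content shown in a logo is commonly taken as 2 minus the entropy in bits; small-sample corrections are omitted here.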
