Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Structural comparisons of the two GTPase activating proteins (GAPs) p120 and p50 in complex with Ras and Rho, respectively, allowed us to decipher the functional role of specific structural features, such as helix alpha8c of p120 and helix A1 of p50, necessary for small GTPase recognition. We identified important residues that may be critical for stabilization of the GAP/GTPase binary complexes. Detection of topohydrophobic positions (positions which are most often occupied by hydrophobic amino acids within a family of protein domains) conserved between the two GAP families led to the characterization of a common flexible four-helix bundle. Altogether, these data are consistent with a rearrangement of several helices around a common core, which strongly supports the assumption that p50 and p120 GAPs derive from a unique fold. Considered as a whole, the remarkable plasticity of GAPs appears to be a means used by nature to accurately confer functional specificity.
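The definition of topohydrophobic positions given in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the hydrophobic alphabet `HYDROPHOBIC`, the `min_fraction` threshold, and the function name are assumptions chosen for illustration, and gaps are simply counted as non-hydrophobic.

```python
# Assumed hydrophobic alphabet; the abstract does not list the residues used.
HYDROPHOBIC = set("VILFMYW")


def topohydrophobic_positions(alignment, min_fraction=0.75):
    """Return 0-based alignment columns mostly occupied by hydrophobic residues.

    `alignment` is a list of equal-length aligned sequences. The 0.75 cutoff
    is a hypothetical stand-in for "most often occupied".
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for col in range(length):
        residues = [seq[col] for seq in alignment]
        n_hydrophobic = sum(r in HYDROPHOBIC for r in residues)
        if n_hydrophobic / len(residues) >= min_fraction:
            hits.append(col)
    return hits


toy_family = ["LKVEA", "IRLDG", "VKIES", "LSFKA"]
print(topohydrophobic_positions(toy_family))  # prints [0, 2]
```

Comparing the column sets returned for two families, after mapping them onto a shared structural alignment, would be one way to look for positions conserved between them.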

2.
The sequences of four-alpha-helical bundle proteins are characterized by a pattern of hydrophilic and hydrophobic amino acids which is repeated every seven residues. At each position of the heptad repeat there are specific constraints on the amino acid properties which result from the topology of the tertiary motif. These constraints give rise to patterns of amino acid distribution which are distinct from those of other proteins. The distributions in each of the heptad positions have been determined by a statistical analysis of structural and sequence data derived from seven families of aligned protein sequences. The constitution of each position is dominated by a very small number of different amino acids, with the core positions consisting overwhelmingly of Leu and Ala. The positional preferences of the individual amino acids can be generally interpreted in terms of residue properties and topological constraints. The potential for four-alpha-helix bundle folding is reflected primarily in the pattern of residue occurrence in the heptad and not in the overall amino acid composition of the protein. Possible applications of this analysis in structure predictions, sequence alignments and in the rational design and engineering of four-alpha-helical bundle proteins are discussed.
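As a rough illustration of the per-position statistics described in this abstract, the following Python sketch tallies amino acid counts at each heptad position. It assumes the heptad register of the first residue of each helix is already known; the input format and the function name are hypothetical, not taken from the paper.

```python
from collections import Counter

HEPTAD = "abcdefg"


def heptad_composition(helices):
    """Tally residues at each heptad position.

    `helices` is an iterable of (sequence, register) pairs, where `register`
    is the heptad letter assigned to the first residue of that helix.
    """
    counts = {pos: Counter() for pos in HEPTAD}
    for seq, start in helices:
        offset = HEPTAD.index(start)
        for i, residue in enumerate(seq):
            counts[HEPTAD[(offset + i) % 7]][residue] += 1
    return counts


toy_helices = [("LKELAEK", "a"), ("AEELLKK", "d")]
composition = heptad_composition(toy_helices)
print(composition["a"])  # Counter({'L': 2})
```

Dividing each counter by its total would give the positional distributions that the abstract compares against overall composition.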

3.
It is known that larger globular proteins are built from domains, relatively independent structural units. Domain size seems to be limited: a single domain consists of a few tens to a couple of hundred amino acids. Based on Monte Carlo simulations of a reduced protein model restricted to the face centered simple cubic lattice, with a minimal set of short-range and long-range interactions, we have shown that some model sequences spontaneously divide into separate domains upon the folding transition. The observed domain sizes closely correspond to the sizes of real protein domains. Short chains with a proper sequence pattern of hydrophobic and polar residues undergo a two-state folding transition to the structurally ordered globular state, while similar longer sequences follow a multistate transition. Homopolymeric (uniformly hydrophobic) chains and random heteropolymers undergo a continuous collapse transition into a single globule, and the globular state is much less ordered. Thus, the factors responsible for the multidomain structure of proteins are a sufficiently long polypeptide chain and characteristic, protein-like sequence patterns. These findings provide some hints for the analysis of real sequences aimed at prediction of the domain structure of large proteins.

4.
Many protein regions have been shown to be intrinsically disordered, lacking unique structure under physiological conditions. These intrinsically disordered regions are not only very common in proteomes, but also crucial to the function of many proteins, especially those involved in signaling, recognition, and regulation. The goal of this work was to identify the prevalence, characteristics, and functions of conserved disordered regions within protein domains and families. A database was created to store the amino acid sequences of nearly one million proteins and their domain matches from the InterPro database, a resource integrating eight different protein family and domain databases. Disorder prediction was performed on these protein sequences. Regions of sequence corresponding to domains were aligned using a multiple sequence alignment tool. From this initial information, regions of conserved predicted disorder were found within the domains. The methodology for this search consisted of finding regions of consecutive positions in the multiple sequence alignments in which 90% or more of the sequences were predicted to be disordered. This procedure was constrained to find such regions of conserved disorder prediction that were at least 20 amino acids in length. The results of this work included 3,653 regions of conserved disorder prediction, found within 2,898 distinct InterPro entries. Most regions of conserved predicted disorder detected were short, with less than 10% of those found exceeding 30 residues in length.
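The search rule stated in this abstract (runs of consecutive alignment columns in which at least 90% of sequences are predicted disordered, with a minimum run length of 20) can be sketched in Python as below. The input encoding (`D` for predicted disorder, any other character for order or gap) and the choice to compute the fraction over all sequences, gaps included, are assumptions made for this example; the abstract does not say how gaps were treated.

```python
def conserved_disorder_regions(pred_rows, min_fraction=0.9, min_length=20):
    """Find runs of columns where predicted disorder is conserved.

    `pred_rows` holds one string per aligned sequence, with 'D' marking a
    position predicted to be disordered. Returns half-open (start, end)
    column ranges.
    """
    if not pred_rows:
        return []
    n_cols = len(pred_rows[0])
    if any(len(row) != n_cols for row in pred_rows):
        raise ValueError("all rows must have the same length")
    n_seqs = len(pred_rows)
    regions = []
    run_start = None
    # Iterate one column past the end so a run touching the last column closes.
    for col in range(n_cols + 1):
        passes = (
            col < n_cols
            and sum(row[col] == "D" for row in pred_rows) / n_seqs >= min_fraction
        )
        if passes:
            if run_start is None:
                run_start = col
        else:
            if run_start is not None and col - run_start >= min_length:
                regions.append((run_start, col))
            run_start = None
    return regions


# Nine of ten toy sequences are predicted disordered in columns 5 to 29.
rows = ["O" * 5 + "D" * 25 + "O" * 5] * 9 + ["O" * 35]
print(conserved_disorder_regions(rows))  # prints [(5, 30)]
```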

5.
6.
Choulier L, Lafont V, Hugo N, Altschuh D. Proteins 2000, 41(4):475-484
A nonrestrictive method for identifying covariance in protein families is described and applied to human and mouse germline Vkappa and VH sequence alignments. Amino acids that occur at each position in a sequence alignment are divided into two sets, called a word, by generating all possible combinations of alternative amino acids. Each word is associated with a pattern of changes. Words with identical patterns identify covariant positions. In antibody variable domains, the number of words generated ranged between 1103 and 2195 depending on the alignment, of which 4 to 12% occurred in covariant pairs. Despite the nonrestrictive character of pattern generation, covariant residues did not reflect a random selection with respect to the nature of amino acid changes and/or their spatial proximity in a reference crystallographic structure. This approach allowed the identification of a covariance signal for positions with high variability, mostly located in the outer part of the common structural framework of antibody variable domains. Covariance in these regions may reflect the existence of alternative and mutually exclusive atomic arrangements that are compatible with antibody function. The method may be of general applicability to rationalize residue variability in protein families.
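One possible reading of the word-and-pattern idea in this abstract is sketched below in Python. It is an interpretation, not the published algorithm: each split of the residues observed in a column into two non-empty sets is treated as a word, its pattern is the per-sequence membership vector, and a set and its complement are assumed to describe the same split. The function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def covariant_positions(alignment):
    """Group alignment columns whose two-set splits yield identical patterns.

    Returns a dict mapping each shared pattern (tuple of booleans, one per
    sequence) to the sorted list of columns that produce it.
    """
    patterns = defaultdict(set)
    n_cols = len(alignment[0])
    for col in range(n_cols):
        column = [seq[col] for seq in alignment]
        residues = sorted(set(column))
        # Enumerate every split of the observed residues into two non-empty sets.
        for size in range(1, len(residues)):
            for subset in combinations(residues, size):
                chosen = set(subset)
                pattern = tuple(r in chosen for r in column)
                # Assumption: a set and its complement are the same word,
                # so normalise the pattern to start with False.
                if pattern[0]:
                    pattern = tuple(not p for p in pattern)
                patterns[pattern].add(col)
    return {p: sorted(cols) for p, cols in patterns.items() if len(cols) > 1}


toy = ["AKL", "AKL", "GEL", "GEV"]
print(covariant_positions(toy))
# prints {(False, False, True, True): [0, 1]}
```

The enumeration grows exponentially with the number of distinct residues in a column, so this sketch is only practical for small or low-variability alignments.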

7.
The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeat 4 and Glu in repeat 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried salt bridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V.
The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.

8.
A set of 298 protein families from psychrophilic Vibrio salmonicida was compiled to identify genotypic characteristics that distinguish it from orthologous sequences from the mesophilic Vibrio/Photobacterium branch of the gamma-Proteobacteria (Vibrionaceae family). In our comparative exploration we employed alignment-based bioinformatic and statistical methods. Interesting information was found in the substitution matrices and in the pattern of asymmetries in the amino acid substitution process. Together with the compositional difference, they identified the amino acids Ile, Asn, Ala and Gln as those having the most psychrophilic involvement. Ile and Asn are enhanced whereas Gln and Ala are suppressed. The inflexible Pro residue is also suppressed in loop regions, as expected in a flexible structure. The dataset was also classified and analysed according to the predicted subcellular location, and we made an additional study of 183 intracellular and 65 membrane proteins. Our results revealed that the psychrophilic proteins have similar hydrophobic and charge contributions in the core of the protein as mesophilic proteins, while the solvent-exposed surface area is significantly more hydrophobic. In addition, the psychrophilic intracellular (but not the membrane) proteins are significantly more negatively charged at the surface. Our analysis supports the hypothesis of preference for more flexible amino acids at the molecular surface. Life in a cold climate seems to be achieved through many minor structural modifications rather than through particular amino acid substitutions.

9.
10.
To investigate the relationships between sequence conservation, protein stability, and protein function, we have measured the thermodynamic stability, folding kinetics, and in vitro peptide-binding activity of a large number of single-site substitutions in the hydrophobic core of the Fyn SH3 domain. Comparison of these data with those derived from an analysis of a large alignment of SH3 domain sequences revealed a very good correlation between the distinct pattern of conservation observed at each core position and the thermodynamic stability of mutants. Conservation was also found to correlate well with the unfolding rates of mutants, but not with the folding rates, suggesting that evolution selects more strongly for optimal native state packing interactions than for maximal folding rates. Structural analysis suggests that residue-residue core packing interactions are very similar in all SH3 domains, which provides an explanation for the correlation between conservation and mutant stability effects studied in a single SH3 domain. We also demonstrate a correlation between stability and the in vivo activity of mutants, and between conservation and activity. However, the relationship between conservation and activity was very strong only for the three most conserved hydrophobic core positions. The weaker correlation between activity and conservation seen at the other seven core positions indicates that maintenance of protein stability is the dominant selective pressure at these positions. In general, the pattern of conservation at hydrophobic core positions appears to arise from conserved packing constraints, and can be effectively utilized to predict the destabilizing effects of amino acid substitutions.

11.
Presecretory signal peptides of 39 proteins from diverse prokaryotic and eukaryotic sources have been compared. Although varying in length and amino acid composition, the labile peptides share a hydrophobic core of approximately 12 amino acids. A positively charged residue (Lys or Arg) usually precedes the hydrophobic core. Core termination is defined by the occurrence of a charged residue, a sequence of residues which may induce a beta-turn in a polypeptide, or an interruption in potential alpha-helix or beta-extended strand structure. The hydrophobic cores contain, by weight average, 37% Leu: 15% Ala: 10% Val: 10% Phe: 7% Ile plus 21% other hydrophobic amino acids arranged in a non-random sequence. Following the hydrophobic cores (aligned by their last residue) a highly non-random and localized distribution of Ala is apparent within the initial eight positions following the core: (formula; see text) Coincident with this observation, Ala-X-Ala is the most frequent sequence preceding signal peptidase cleavage. We propose the existence of a signal peptidase recognition sequence A-X-B with the preferred cleavage site located after the sixth amino acid following the core sequence. Twenty-two of the above 27 underlined Ala residues would participate as A or B in peptidase cleavage. Position A includes the larger aliphatic amino acids, Leu, Val and Ile, as well as the residues already found at B (principally Ala, Gly and Ser). Since a preferred cleavage site can be discerned from carboxyl and not amino terminal alignment of the hydrophobic cores it is proposed that the carboxyl ends are oriented inward toward the lumen of the endoplasmic reticulum where cleavage is thought to occur. This orientation coupled with the predicted beta-turn typically found between the core and the cleavage site implies reverse hairpin insertion of the signal sequence. 
The structural features which we describe should help identify signal peptides and cleavage sites in presumptive amino acid sequences derived from DNA sequences.

12.
13.
MOTIVATION: A large, high-quality database of homologous sequence alignments with good estimates of their corresponding phylogenetic trees will be a valuable resource to those studying phylogenetics. It will allow researchers to compare current and new models of sequence evolution across a large variety of sequences. The large quantity of data may provide inspiration for new models and methodology to study sequence evolution and may allow general statements about the relative effect of different molecular processes on evolution. RESULTS: The Pandit 7.6 database contains 4341 families of sequences derived from the seed alignments of the Pfam database of amino acid alignments of families of homologous protein domains (Bateman et al., 2002). Each family in Pandit includes an alignment of amino acid sequences that matches the corresponding Pfam family seed alignment, an alignment of DNA sequences that contain the coding sequence of the Pfam alignment when they can be recovered (overall, 82.9% of sequences taken from Pfam) and the alignment of amino acid sequences restricted to only those sequences for which a DNA sequence could be recovered. Each of the alignments has an estimate of the phylogenetic tree associated with it. The tree topologies were obtained using the neighbor joining method based on maximum likelihood estimates of the evolutionary distances, with branch lengths then calculated using a standard maximum likelihood approach.

14.
Patterns of hydrophobic and hydrophilic residues play a major role in protein folding and function. Long, predominantly hydrophobic strings of 20-22 amino acids each are associated with transmembrane helices and have been used to identify such sequences. Much less attention has been paid to hydrophobic sequences within globular proteins. In prior work on computer simulations of the competition between on-pathway folding and off-pathway aggregate formation, we found that long sequences of consecutive hydrophobic residues promoted aggregation within the model, even controlling for overall hydrophobic content. We report here on an analysis of the frequencies of different lengths of contiguous blocks of hydrophobic residues in a database of amino acid sequences of proteins of known structure. Sequences of three or more consecutive hydrophobic residues are found to be significantly less common in actual globular proteins than would be predicted if residues were selected independently. The result may reflect selection against long blocks of hydrophobic residues within globular proteins relative to what would be expected if residue hydrophobicities were independent of those of nearby residues in the sequence.
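The comparison described in this abstract, observed block lengths against what independence would predict, can be sketched in Python as follows. The hydrophobic alphabet is an assumption (the abstract does not list it), and the null model here is a plain independent Bernoulli model with hydrophobic probability `p`, which may differ from the statistical treatment the authors used. Under that model, the expected number of maximal hydrophobic runs of length exactly `k` in a sequence of length `n` is `(n - k - 1) * q**2 * p**k + 2 * q * p**k` for `k < n` and `p**n` for `k == n`, where `q = 1 - p`.

```python
from collections import Counter
from itertools import groupby

# Assumed hydrophobic alphabet, for illustration only.
HYDROPHOBIC = set("AVILMFWC")


def observed_run_counts(seq):
    """Count maximal runs of consecutive hydrophobic residues by length."""
    counts = Counter()
    for is_hydrophobic, group in groupby(seq, key=lambda r: r in HYDROPHOBIC):
        if is_hydrophobic:
            counts[len(list(group))] += 1
    return counts


def expected_run_count(n, k, p):
    """Expected number of maximal runs of length k if residues are independent."""
    if k > n:
        return 0.0
    if k == n:
        return p ** n
    q = 1.0 - p
    # Interior runs need a non-hydrophobic residue on both sides;
    # the two runs touching a sequence end need only one.
    return (n - k - 1) * q * q * p ** k + 2 * q * p ** k


print(observed_run_counts("KLLVEKAIGD"))  # Counter({3: 1, 2: 1})
print(expected_run_count(2, 1, 0.5))      # 0.5
```

Summing observed and expected counts over a whole database and comparing them for `k >= 3` would mirror the kind of test the abstract reports.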

15.
In order to study structural aspects of sequence conservation in families of homologous proteins, we have analyzed structurally aligned sequences of 585 proteins grouped into 128 homologous families. The conservation of a residue in a family is defined as the average residue similarity in a given position of aligned sequences. The residue similarities were expressed in the form of log-odds substitution tables that take into account the environments of amino acids in three-dimensional structures. The protein core is defined as those residues that have less than 7% solvent accessibility. The density of a protein core is described in terms of atom packing, which is investigated as a criterion for residue substitution and conservation. Although there is no significant correlation between sequence conservation and average atom packing around nonpolar residues such as leucine, valine and isoleucine, a significant correlation is observed for polar residues in the protein core. This may be explained by the hydrogen bonds in which polar residues are involved; the better their protection from water access the more stable should be the structure in that position. Proteins 33:358–366, 1998. © 1998 Wiley-Liss, Inc.
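The two definitions in this abstract, conservation as an average residue similarity at an aligned position and the core as residues with less than 7% solvent accessibility, can be illustrated with a short Python sketch. Averaging over all residue pairs in the column is an assumption about how the average is taken, and `TOY_SIMILARITY` contains made-up scores, not the environment-specific log-odds tables used in the paper.

```python
from itertools import combinations

# Hypothetical similarity scores for illustration only.
TOY_SIMILARITY = {
    ("L", "L"): 4, ("I", "L"): 2, ("I", "I"): 4,
    ("D", "L"): -4, ("D", "I"): -3, ("D", "D"): 6,
}


def similarity(a, b, table=TOY_SIMILARITY):
    """Look up a symmetric similarity score for two residues."""
    return table[tuple(sorted((a, b)))]


def position_conservation(column, table=TOY_SIMILARITY):
    """Average pairwise similarity of the residues at one aligned position."""
    pairs = list(combinations(column, 2))
    if not pairs:
        raise ValueError("need at least two residues to compare")
    return sum(similarity(a, b, table) for a, b in pairs) / len(pairs)


def is_core(relative_accessibility_percent, cutoff=7.0):
    """Core residues have less than 7% solvent accessibility."""
    return relative_accessibility_percent < cutoff


print(position_conservation("LLL"))  # 4.0
print(position_conservation("LI"))   # 2.0
print(is_core(3.5))                  # True
```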

16.
We identified key residues from the structural alignment of families of protein domains from SCOP which we represented in the form of sparse protein signatures. A signature-generating algorithm (SigGen) was developed and used to automatically identify key residues based on several structural and sequence-based criteria. The capacity of the signatures to detect related sequences from the SWISSPROT database was assessed by receiver operating characteristic (ROC) analysis and jack-knife testing. Test signatures for families from each of the main SCOP classes are described in relation to the quality of the structural alignments, the SigGen parameters used, and their diagnostic performance. We show that automatically generated signatures are potently diagnostic for their family (ROC50 scores typically >0.8), consistently outperform random signatures, and can identify sequence relationships in the "twilight zone" of protein sequence similarity (<40%). Signatures based on 15%-30% of alignment positions occurred most frequently among the best-performing signatures. When alignment quality is poor, sparser signatures perform better, whereas signatures generated from higher-quality alignments of fewer structures require more positions to be diagnostic. Our validation of signatures from the Globin family shows that when sequences from the structural alignment are removed and new signatures generated, the omitted sequences are still detected. The positions highlighted by the signature often correspond (alignment specificity >0.7) to the key positions in the original (non-jack-knifed) alignment. We discuss potential applications of sparse signatures in sequence annotation and homology modeling.

17.
18.
S K Holland, K Harlos, C C Blake. The EMBO Journal 1987, 6(7):1875-1880
The proposed homology between the fibronectin type II domain and the Kringle domains of blood clotting and fibrinolytic proteins has been examined in three dimensions by substituting the type II sequence into the bovine prothrombin Kringle 1 tertiary structure, determined by X-ray crystallographic methods at 3.8 Å. Structural substitution of aligned amino acids of the type II domains and the Kringle produces a compact chain fold, and deletions and insertions in the type II sequence are accommodated within the modelled structure. This confirms the structural homology between the two domains and verifies the sequence alignment and common evolution of the type II and Kringle units. The two structures contain homologous hydrophobic cores, centered around the two disulphide bridges which link conserved beta-type strands. Gross differences between the two domains occur in exterior loops, and potential functional sites in these regions of the type II structures as found in fibronectin, Factor XII and seminal fluid protein PDC-109 are proposed. We suggest that the domains evolved from a common ancestral protein comprising the hydrophobic core and disulphide arrangement which later diverged to bind different macromolecules through adaptation of the external loops.

19.
Current methods for comparative analyses of protein sequences are 1D-alignments of amino acid sequences based on the maximization of amino acid identity (homology) and the prediction of secondary structure elements. This approach has a major drawback once the amino acid identity drops below 20–25%, since maximization of a homology score does not take into account any structural information. A new technique called Hydrophobic Cluster Analysis (HCA) has been developed by Lemesle-Varloot et al. (Biochimie 72, 555–574, 1990). This consists of comparing several sequences simultaneously and combining homology detection with secondary structure analysis.

HCA is primarily based on the detection and comparison of structural segments constituting the hydrophobic core of globular protein domains, with or without transmembrane domains. We have applied HCA to the analysis of different families of G-protein coupled receptors, such as catecholamine receptors as well as peptide hormone receptors. Utilizing HCA the thrombin receptor, a new and as yet unique member of the family of G-protein coupled receptors, can be clearly classified as being closely related to the family of neuropeptide receptors rather than to the catecholamine receptors for which the shape of the hydrophobic clusters and the length of their third cytoplasmic loop are very different. Furthermore, the potential of HCA to predict relationships between new putative and already characterized members of this family of receptors will be presented.

20.
Src homology 2 (SH2) regions are short (approximately 100 amino acids), non-catalytic domains conserved among a wide variety of proteins involved in cytoplasmic signaling induced by growth factors. It is thought that SH2 domains play an important role in the intracellular response to growth factor stimulation by binding to phosphotyrosine containing proteins. In this paper we apply the techniques of multiple sequence alignment, secondary structure prediction and conservation analysis to 67 SH2 domain amino acid sequences. This combined approach predicts seven core secondary structure regions with the pattern beta-alpha-beta-beta-beta-beta-alpha, identifies those residues most likely to be buried in the hydrophobic core of the native SH2 domain, and highlights patterns of conservation indicative of secondary structural elements. Residues likely to be involved in phosphotyrosine binding are shown and orientations of the predicted secondary structures suggested which could enable such residues to cooperate in phosphate binding. We propose a consensus pattern that encapsulates the principal conserved features of the SH2 domains. Comparison of the proposed SH2 domain of akt to this pattern shows only 12/40 matches, suggesting that this domain may not exhibit SH2-like properties.
