Similar literature
Found 20 similar documents (search time: 15 ms)
1.
Helical membrane proteins are more tightly packed and the packing interactions are more diverse than those found in helical soluble proteins. Based on a linear correlation between amino acid packing values and interhelical propensity, we propose the concept of a helix packing moment to predict the orientation of helices in helical membrane proteins and membrane protein complexes. We show that the helix packing moment correlates with the helix interfaces of helix dimers of single pass membrane proteins of known structure. Helix packing moments are also shown to help identify the packing interfaces in membrane proteins with multiple transmembrane helices, where a single helix can have multiple contact surfaces. Analyses are described on class A G protein-coupled receptors (GPCRs) with seven transmembrane helices. We show that the helix packing moments are conserved across the class A family of GPCRs and correspond to key structural contacts in rhodopsin. These contacts are distinct from the highly conserved signature motifs of GPCRs and have not previously been recognized. The specific amino acid types involved in these contacts, however, are not necessarily conserved between subfamilies of GPCRs, indicating that the same protein architecture can be supported by a diverse set of interactions. In GPCRs, as well as membrane channels and transporters, amino acid residues with small side-chains (Gly, Ala, Ser, Cys) allow tight helix packing by mediating strong van der Waals interactions between helices. Closely packed helices, in turn, facilitate interhelical hydrogen bonding of both weakly polar (Ser, Thr, Cys) and strongly polar (Asn, Gln, Glu, Asp, His, Arg, Lys) amino acid residues. We propose the use of the helix packing moment as a complementary tool to the helical hydrophobic moment in the analysis of transmembrane sequences.
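
The abstract above presents the helix packing moment as a counterpart to the helical hydrophobic moment. A minimal Python sketch of a moment of this kind follows. It assumes the usual helical-wheel construction (a vector sum of per-residue values, with successive residues rotated by about 100 degrees); the abstract does not give the authors' exact formula. The values in `PACKING` are hypothetical placeholders, not the published packing scale, and the function name `helical_moment` is introduced here only for illustration.

```python
import cmath
import math

# Hypothetical per-residue packing values (placeholders for illustration only;
# these are NOT the packing values used in the paper).
PACKING = {"G": 1.0, "A": 0.9, "S": 0.8, "C": 0.7, "T": 0.6,
           "V": 0.3, "L": 0.2, "I": 0.2, "F": 0.1}

def helical_moment(seq, scale, angle_deg=100.0, default=0.0):
    """Vector-sum per-residue values around a helical wheel.

    Residue n is placed at angle n * angle_deg. Returns (magnitude,
    direction in degrees in [0, 360)). Residues missing from `scale`
    contribute `default`.
    """
    total = 0j
    for n, aa in enumerate(seq):
        total += scale.get(aa, default) * cmath.exp(1j * math.radians(angle_deg * n))
    return abs(total), math.degrees(cmath.phase(total)) % 360.0

# Toy transmembrane-like segment (made up for the example).
magnitude, direction = helical_moment("LGALLGSLLAGLLV", PACKING)
print(round(magnitude, 3), round(direction, 1))
```

A large magnitude means the high-scoring residues cluster on one face of the helix, and the direction indicates which face. Swapping in a hydrophobicity scale for `scale` gives the familiar hydrophobic moment from the same function.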

2.
Multiple sequence alignments become biologically meaningful only if conserved, functionally important residues and preserved secondary structural elements can be identified at equivalent positions. This is particularly important for transmembrane proteins like G-protein coupled receptors (GPCRs) with seven transmembrane helices. TM-MOTIF is a software package and an effective alignment viewer to identify and display conserved motifs and amino acid substitutions (AAS) at each position of the aligned set of homologous sequences of GPCRs. The key feature of the package is to display the predicted membrane topology for seven transmembrane helices in seven colours (VIBGYOR colouring scheme) and to map the identified motifs onto their respective helix/loop regions. It is an interactive package that lets the user submit a query or a pre-aligned set of GPCR sequences to align with a reference sequence, like rhodopsin, whose structure has been solved experimentally. It can also identify the nearest homologue, with an already known receptor type, from the inbuilt GPCR or Olfactory Receptor cluster dataset. AVAILABILITY: The database is available for free at mini@ncbs.res.in.

3.
The three-dimensional structures of homologous proteins are usually conserved during evolution, as are critical residues in a few short sequence motifs that often constitute the active site in enzymes. The precise spatial organization of such sites depends on the lengths and positions of the secondary structural elements connecting the motifs. We show how members of protein superfamilies, such as kinesins, myosins, and G(alpha) subunits of trimeric G proteins, are identified and classed by simply counting the number of amino acid residues between important sequence motifs in their nucleotide triphosphate-hydrolyzing domains. Subfamily-specific landmark patterns (motif to motif scores) are principally due to inserts and gaps in surface loops. Unusual protein sequences and possible sequence prediction errors are detected.
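
The counting idea in the abstract above can be sketched in a few lines of Python: locate each motif in order and report the number of residues between consecutive motifs. The two patterns used here (a Walker A style P-loop, `G.{4}GK[ST]`, and a `D..G` motif) and the toy sequence are assumptions chosen for illustration; the abstract does not list the motifs the authors actually used, and `motif_spacing` is a hypothetical helper name.

```python
import re

def motif_spacing(seq, motifs):
    """Count residues between consecutive motifs, searching left to right.

    `motifs` is an ordered list of regular expressions. Returns a list of
    gap lengths (one per adjacent motif pair), or None if a motif is missing.
    """
    spans = []
    pos = 0
    for pattern in motifs:
        match = re.compile(pattern).search(seq, pos)
        if match is None:
            return None
        spans.append(match.span())
        pos = match.end()  # the next motif must start after this one ends
    return [spans[i + 1][0] - spans[i][1] for i in range(len(spans) - 1)]

# Illustrative motifs and a made-up sequence (not taken from the paper).
MOTIFS = [r"G.{4}GK[ST]", r"D..G"]
toy = "MM" + "GAAAAGKS" + "LLLLL" + "DAAG" + "KK"
print(motif_spacing(toy, MOTIFS))  # [5]
```

Comparing such gap vectors across family members is one simple way to expose the loop insertions and deletions that the abstract describes as subfamily-specific landmarks.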

4.

Background

Diacylglycerol acyltransferase families (DGATs) catalyze the final and rate-limiting step of triacylglycerol (TAG) biosynthesis in eukaryotic organisms. Understanding the roles of DGATs will help to create transgenic plants with value-added properties and provide clues for therapeutic intervention for obesity and related diseases. The objective of this analysis was to identify conserved sequence motifs and amino acid residues for better understanding of the structure-function relationship of these important enzymes.

Results

A total of 117 DGAT sequences from 70 organisms, including plants, animals, fungi and human, were obtained from a database search using tung tree DGATs. Phylogenetic analysis separates these proteins into DGAT1 and DGAT2 subfamilies. These DGATs are integral membrane proteins with more than 40% of the total amino acid residues being hydrophobic. They have similar properties and amino acid composition except that DGAT1s are approximately 20 kDa larger than DGAT2s. DGAT1s and DGAT2s have 41 and 16 completely conserved amino acid residues, respectively, although only two of them are shared by all DGATs. These residues are distributed in 7 and 6 sequence blocks for DGAT1s and DGAT2s, respectively, and located at the carboxyl termini, suggesting the location of the catalytic domains. These conserved sequence blocks do not contain the putative neutral lipid-binding domain, mitochondrial targeting signal, or ER retrieval motif. The importance of conserved residues has been demonstrated by site-directed and natural mutants.

Conclusions

This study has identified conserved sequence motifs and amino acid residues in all 117 DGATs and the two subfamilies. None of the completely conserved residues in DGAT1s and DGAT2s is present in recently reported isoforms in the multiple sequence alignment, raising the important question of how proteins with completely different amino acid sequences could perform the same biochemical reaction. The sequence analysis should facilitate studying the structure-function relationship of DGATs with the ultimate goal of identifying critical amino acid residues for engineering superb enzymes in metabolic engineering and selecting enzyme inhibitors in therapeutic application for obesity and related diseases.

5.
Based on the now available crystallographic data of the G-protein-coupled receptor (GPCR) prototype rhodopsin, many studies have been undertaken to build or verify models of other GPCRs. Here, we mined evolution as an additional source of structural information that may guide GPCR model generation as well as mutagenesis studies. The sequence information of 61 cloned orthologs of a P2Y-like receptor (GPR34) enabled us to identify motifs and residues that are important for maintaining the receptor function. The sequence data were compared with available sequences of 77 rhodopsin orthologs. Under a negative selection mode, only 17% of amino acid residues were preserved during 450 million years of GPR34 evolution. On the contrary, in rhodopsin evolution approximately 43% of residues were absolutely conserved between fish and mammals. Despite major differences in their structural conservation, a comparison of structural data suggests that the global arrangement of the transmembrane core of GPR34 orthologs is similar to rhodopsin. The evolutionary approach was further applied to functionally analyze the relevance of common scaffold residues and motifs found in most of the rhodopsin-like GPCRs. Our analysis indicates that, in contrast to other GPCRs, maintaining the unique function of rhodopsin requires a more stringent network of relevant intramolecular constraints.

6.
G-protein coupled receptors (GPCRs) are the largest class of molecules involved in signal transduction across membranes, and represent major targets in the development of novel drug candidates in all clinical areas. Membrane cholesterol has been reported to have an important role in the function of a number of GPCRs. Several structural features of proteins, believed to result in preferential association with cholesterol, have been recognized. Cholesterol recognition/interaction amino acid consensus (CRAC) sequence represents such a motif. Many proteins that interact with cholesterol have been shown to contain the CRAC motif in their sequence. We report here the presence of CRAC motifs in three representative GPCRs, namely, rhodopsin, the β2-adrenergic receptor, and the serotonin1A receptor. Interestingly, the function of these GPCRs has been previously shown to be dependent on membrane cholesterol. The presence of CRAC motifs in GPCRs indicates that interaction of cholesterol with GPCRs could be specific in nature. Further analysis shows that CRAC motifs are inherent characteristic features of the serotonin1A receptor and are conserved over natural evolution. These results constitute the first report of the presence of CRAC motifs in GPCRs and provide novel insight in the molecular nature of GPCR-cholesterol interaction.
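
A scan for CRAC motifs of the kind reported above can be sketched with a regular expression. The abstract does not spell out the pattern, so this sketch assumes the commonly cited CRAC definition, (L/V)-X(1-5)-Y-X(1-5)-(R/K); the function name `find_crac` and the toy sequence are illustrative only.

```python
import re

# Assumed CRAC definition: (L/V)-X(1-5)-Y-X(1-5)-(R/K).
# The lookahead lets overlapping candidates with different start
# positions all be reported.
CRAC = re.compile(r"(?=([LV].{1,5}Y.{1,5}[RK]))")

def find_crac(seq):
    """Return (start_index, matched_segment) for each CRAC-like hit in seq.

    Indices are 0-based. One segment is reported per start position.
    """
    return [(m.start(), m.group(1)) for m in CRAC.finditer(seq)]

# Made-up sequence for illustration, not a real receptor fragment.
print(find_crac("AAALWAYGGKAA"))  # [(3, 'LWAYGGK')]
```

Because the pattern is short and permissive, hits can occur by chance, so conservation across orthologs (as the abstract examines) is a useful additional filter.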

7.
The unusually stable and multifunctional, thin aggregative fimbriae common to all Salmonella spp. are principally polymers of the fimbrin subunit, AgfA. AgfA of Salmonella enteritidis consists of two domains: a protease-sensitive, 22 amino acid residue N-terminal region and a protease-resistant, 109 residue C-terminal core. The unusual amino acid sequence of the AgfA core region comprises two-, five- and tenfold internal sequence homology patterns reflected in five conserved, 18-residue tandem repeats. These repeats have the consensus sequence, Sx5QxGx2NxAx3Q and are linked together by four or five residues, (x)xAx2. The predicted secondary structure for this unusual arrangement of tandem repeats in AgfA indicates mainly extended conformation with the beta strands linked by four to six residues. Candidate proteins of known structure with motifs of alternating beta strands and short loops were selected from folds described in SCOP as a source of coordinates for AgfA model construction. Three all-beta class motifs selected from the Serratia marcescens metalloprotease, myelin P2 protein or vitelline membrane outer protein I were used for initial AgfA homology build-up procedures ultimately resulting in three structural models; beta barrel, beta prism and parallel beta helix. The beta barrel model is a compact, albeit irregular structure, with the beta strands arranged in two antiparallel beta sheet faces. The beta prism model does not reflect the 5 or 10-fold symmetry of the AgfA primary sequence. However, the favored, parallel beta helix model is a compact coil of ten helically arranged beta strands forming two parallel beta sheet faces. This arrangement predicts a regular, potentially stable, C-terminal core region consistent with the observed tandem repeat sequences, protease-resistance and strong tendency of this fimbrin to oligomerize and aggregate. 
Positional conservation of amino acid residues in AgfA and the Escherichia coli AgfA homologue, CsgA, provides strong support for this model. The parallel beta helix model of AgfA offers an interesting solution to a multifunctional fimbrin molecular surface having solvent exposed areas, regions for major and minor subunit interactions as well as fiber-fiber interactions common to many bacterial fimbriae.
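
The repeat consensus quoted in this abstract, Sx5QxGx2NxAx3Q, can be turned into a searchable pattern. The sketch below assumes that a number after `x` means that many arbitrary residues (the subscripts appear to have been flattened in this listing); under that reading the consensus spans 18 residues, matching the 18-residue repeats described. The helper names are hypothetical.

```python
import re

def consensus_to_regex(consensus):
    """Convert shorthand like 'Sx5QxGx2NxAx3Q' into a regular expression.

    'x' means any residue; 'xN' means N arbitrary residues (assumed reading).
    Digits after any other letter are ignored.
    """
    parts = []
    for letter, count in re.findall(r"([A-Za-z])(\d*)", consensus):
        if letter == "x":
            parts.append(".{%s}" % (count or "1"))
        else:
            parts.append(letter)
    return "".join(parts)

def consensus_length(consensus):
    """Number of residues spanned by the shorthand consensus."""
    return sum(int(count or 1) if letter == "x" else 1
               for letter, count in re.findall(r"([A-Za-z])(\d*)", consensus))

pattern = consensus_to_regex("Sx5QxGx2NxAx3Q")
print(pattern)                              # S.{5}Q.{1}G.{2}N.{1}A.{3}Q
print(consensus_length("Sx5QxGx2NxAx3Q"))   # 18
```

The resulting pattern can be passed to `re.finditer` to locate candidate repeats in a fimbrin sequence.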

8.

Background  

Remote homology detection is a hard computational problem. Most approaches have trained computational models by using either full protein sequences or multiple sequence alignments (MSA), including all positions. However, when we deal with proteins in the "twilight zone" we can observe that only some segments of sequences (motifs) are conserved. We introduce a novel logical representation that allows us to represent physico-chemical properties of sequences, conserved amino acid positions and conserved physico-chemical positions in the MSA. From this, Inductive Logic Programming (ILP) finds the most frequent patterns (motifs) and uses them to train propositional models, such as decision trees and support vector machines (SVM).

9.
We characterized a full-length gene encoding wild silkmoth Antheraea pernyi fibroin (Ap-fibroin) to clarify the conformation of repetitive sequences. The gene consisted of a first exon encoding 14 amino acid residues, a short intron (120 bp), and a long second exon encoding 2,625 amino acid residues. Three amino acids, alanine, glycine, and serine, amounted to 81% of the Ap-fibroin sequence. The Ap-fibroin, except for 155 residues of the amino terminus, was composed of 80 tandemly arranged polyalanine-containing units (motifs). A motif was a doublet of a polyalanine block (PAB) and a nonpolyalanine block (NPAB). Seventy-eight of the 80 motifs were classified into four types based on differences in the NPAB sequences. Although respective motifs were significantly conserved, many rearrangements were observed within the second exon, i.e., the triplication of a 558-bp-long sequence and other duplication events of shorter sequences. Chi-like sequences, GCTGGAG, might contribute to the rearrangement within the gene as described in human minisatellite loci, because they were found at specific sites of NPAB-encoding sequences in three of four types of motifs. The present results support the idea that the Ap-fibroin gene is unstable like minisatellite sequences and that the evolution of this gene is strongly associated with its instability. Received: 18 February 2000 / Accepted: 30 June 2000

10.
Computational methods such as sequence alignment and motif construction are useful in grouping related proteins into families, as well as helping to annotate new proteins of unknown function. These methods identify conserved amino acids in protein sequences, but cannot determine the specific functional or structural roles of conserved amino acids without additional study. In this work, we present 3MATRIX (http://3matrix.stanford.edu) and 3MOTIF (http://3motif.stanford.edu), a web-based sequence motif visualization system that displays sequence motif information in its appropriate three-dimensional (3D) context. This system is flexible in that users can enter sequences, keywords, structures or sequence motifs to generate visualizations. In 3MOTIF, users can search using discrete sequence motifs such as PROSITE patterns, eMOTIFs, or any other regular expression-like motif. Similarly, 3MATRIX accepts an eMATRIX position-specific scoring matrix, or will convert a multiple sequence alignment block into an eMATRIX for visualization. Each query motif is used to search the protein structure database for matches, in which the motif is then visually highlighted in three dimensions. Important properties of motifs such as sequence conservation and solvent accessible surface area are also displayed in the visualizations, using carefully chosen color shading schemes.

11.
The amino acid sequences of the α subunits of tryptophan synthase from ten different microorganisms were aligned by standard procedures. The alpha helices, beta strands and turns of each sequence were predicted separately by two standard prediction algorithms and averaged at homologous sequence positions. Additional evidence for conserved secondary structure was derived from profiles of average hydropathy and chain flexibility values, leading to a joint prediction. There is good agreement between (1) predicted beta strands, maximal hydropathy and minimal flexibility, and (2) predicted loops, great chain flexibility, and protein segments that accept insertions of various lengths in individual sequences. The α subunit is predicted to have eight repeated beta-loop-alpha-loop motifs with an extra N-terminal alpha helix and an intercalated segment of highly conserved residues. This pattern suggests that the tertiary structure of the α subunit is an eightfold alpha/beta barrel. The distribution of conserved amino acid residues and published data on limited proteolysis, chemical modification, and mutagenesis are consistent with the alpha/beta barrel structure. Both the active site of the α subunit and the combining site for the β2 subunit are at the end of the barrel formed by the carboxyl-termini of the beta strands.
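
Averaging a property at homologous positions of an alignment, as this abstract describes for hydropathy, can be sketched as a per-column mean. The sketch assumes the Kyte-Doolittle hydropathy scale and simple gap skipping; the abstract does not say which scale or smoothing the authors used, and `column_hydropathy` is a hypothetical helper name.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the abstract does not
# name the one actually used).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
      "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
      "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
      "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}

def column_hydropathy(alignment):
    """Mean hydropathy per alignment column, ignoring gaps.

    `alignment` is a list of equal-length aligned sequences using '-' for
    gaps. Columns with no scorable residue yield None.
    """
    profile = []
    for column in zip(*alignment):
        values = [KD[aa] for aa in column if aa in KD]
        profile.append(sum(values) / len(values) if values else None)
    return profile

# Tiny made-up alignment for illustration.
print(column_hydropathy(["IL-", "IVK"]))
```

Maxima in such a profile would be candidate buried beta strands, and the same function can average any other per-residue scale (for example a flexibility index) by replacing the lookup table.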

12.
The amino acid sequence of the first domain (positions 1-175) of Panulirus interruptus hemocyanin subunit a has been determined. The sequence of residues 1-158 (18-kDa fragment obtained by limited proteolysis) was derived from peptides obtained by digestion of this fragment with CNBr and trypsin and by subdigestion of these peptides with other enzymes. The peptides were sequenced automatically or manually. The amino acid sequence has been fitted into the electron-density map at 0.32-nm resolution. The residues of domain 1 are folded into a large, mainly helical, globular part, containing one disulfide bridge, and a smaller part near the molecular twofold axis. The latter part consists of an alpha helix and a beta strand which contains a covalently attached carbohydrate moiety. The sites susceptible to limited proteolytic cleavage of the subunit are discussed. Comparison of the N-terminal sequence with those of other arthropod hemocyanins revealed, besides an N-terminal extension of five residues, the presence of a 21-residue loop (positions 22-42) in the crustacean sequences. This loop contains helix 1.2, a less defined region in the electron-density map. It is absent in chelicerate sequences. Strong evidence is presented that: (a) the structure of the first 21 residues (including helix 1.1) is the same in all arthropod hemocyanins with known amino acid sequence; (b) a stretch containing about 15 residues (including part of helix 1.3) following the 21-residue loop has a different structure in crustaceans and chelicerates; (c) the rest of domain 1 has the same structure again. It is shown that all conserved residues are in the contact region with the other two domains.

13.
J M Baldwin. The EMBO Journal, 1993, 12(4): 1693-1703
G protein-coupled receptors form a large family of integral membrane proteins whose amino acid sequences have seven hydrophobic segments containing distinctive sequence patterns. Rhodopsin, a member of the family, is known to have transmembrane alpha-helices. The probable arrangement of the seven helices, in all receptors, was deduced from structural information extracted from a detailed analysis of the sequences. Constraints established include: (1) each helix must be positioned next to its neighbours in the sequence; (2) helices I, IV and V must be most exposed to the lipid surrounding the receptor and helix III least exposed. (1) is established from the lengths of the shortest loops. (2) is determined by considering: (i) sites of the most conserved residues; (ii) other sites where variability is restricted; (iii) sites that accommodate polar residues; (iv) sites of differences in sequence between pairs or within groups of closely related receptors. Most sites in the last category should be in unimportant positions and are most useful in determining the position and extent of lipid-facing surface in each helix. The structural constraints for the receptors are used to allocate particular helices to the peaks in the recently published projection map of rhodopsin and to propose a tentative three-dimensional arrangement of the helices in G protein-coupled receptors.

14.
15.
Studies of the dimerization of transmembrane (TM) helices have been ongoing for many years now, and have provided clues to the fundamental principles behind membrane protein (MP) folding. Our understanding of TM helix dimerization has been dominated by the idea that sequence motifs, simple recognizable amino acid sequences that drive lateral interaction, can be used to explain and predict the lateral interactions between TM helices in membrane proteins. But as more and more unique interacting helices are characterized, it is becoming clear that the sequence motif paradigm is incomplete. Experimental evidence suggests that the search for sequence motifs, as mediators of TM helix dimerization, cannot solve the membrane protein folding problem alone. Here we review the current understanding in the field, as it has evolved from the paradigm of sequence motifs into a view in which the interactions between TM helices are much more complex. This article is part of a Special Issue entitled: Membrane protein structure and function.

16.
Class A G-protein-coupled receptors (GPCRs) constitute the largest family of transmembrane receptors in the human genome. Understanding the mechanisms which drove the evolution of such a large family would help understand the specificity of each GPCR sub-family with applications to drug design. To gain evolutionary information on class A GPCRs, we explored their sequence space by metric multidimensional scaling analysis (MDS). Three-dimensional mapping of human sequences shows a non-uniform distribution of GPCRs, organized in clusters that lie along four privileged directions. To interpret these directions, we projected supplementary sequences from different species onto the human space used as a reference. With this technique, we can easily monitor the evolutionary drift of several GPCR sub-families from cnidarians to humans. Results support a model of radiative evolution of class A GPCRs from a central node formed by peptide receptors. The privileged directions obtained from the MDS analysis are interpretable in terms of three main evolutionary pathways related to specific sequence determinants. The first pathway was initiated by a deletion in transmembrane helix 2 (TM2) and led to three sub-families by divergent evolution. The second pathway corresponds to the differentiation of the amine receptors. The third pathway corresponds to parallel evolution of several sub-families in relation to a covarion process involving proline residues in TM2 and TM5. As exemplified with GPCRs, the MDS projection technique is an important tool to compare orthologous sequence sets and to help decipher the mutational events that drove the evolution of protein families.

17.
By using amino acid sequence patterns (motifs) diagnostic of conserved regions within the catalytic domains of protein kinases, homologous open reading frames of three herpesviruses were identified as protein kinase-related genes. The three sequences, herpes simplex virus gene UL13, varicella-zoster virus gene 47, and Epstein-Barr virus gene BGLF4, resemble serine/threonine kinases rather than tyrosine kinases.

18.
Jiang W, Puch S, Guo X, Bhavanandan VP. IUBMB Life, 1999, 48(6): 601-605
Galectins are a distinct family of animal lectins that have a cation-independent affinity for beta-galactoside sugars and share characteristic amino acid sequences. The cDNA encoding rabbit bladder galectin-4 has been cloned and sequenced (GenBank accession no. AF091738). The deduced 328 amino acid sequence predicts a multidomain structure consisting of an N-terminal peptide (19 residues) and two carbohydrate recognition domains (130 residues each) connected by a linker region (49 residues). Comparison of rabbit galectin-4 with related proteins reveals that two peptide motifs, M-A-F/Y-V-P-A-P-G-Y-Q-P-T-Y-N-P-T-L-P-Y in the N terminus and A-F-H-F-N-P-R-F-D-G-W-D-K-V-V-F in the first carbohydrate recognition domain are highly conserved in human, pig, rat, and mouse galectin-4 as well as in mouse galectin-6. The two peptide motifs are proposed here as the signature sequences to identify new members of the galectin-4 subfamily.

19.
Eighty-two amino acid sequences of the catalytic domains of mature endoxylanases belonging to family 11 have been aligned using the programs MATCHBOX and CLUSTAL. The sequences range in length from 175 to 233 residues. The two glutamates acting as catalytic residues are conserved in all sequences. A very good correlation is found between the presence (at position 100) of an asparagine in the so-called 'alkaline' xylanases, or an aspartic acid in those with a more acidic pH optimum. Four boxes defining segments of highest similarity were detected; they correspond to regions of defined secondary structure: B5, B6, B8 and the carboxyl end of the alpha helix, respectively. Cysteine residues are not common in these sequences (0.7% of all residues), and disulfide bridges are not important in explaining the stability of several thermophilic xylanases. The alignment allows the classification of the enzymes in groups according to sequence similarity. Fungal and bacterial enzymes were found to form mostly separate clusters of higher similarity.

20.
Leucine and isoleucine are two amino acids that differ only by the positioning of one methyl group. This small difference can have important consequences in α-helices, as the β-branching of Ile results in helix destabilization. We set out to investigate whether there are general trends for the occurrences of Leu and Ile residues in the structures and sequences of class A GPCRs (G protein-coupled receptors). GPCRs are integral membrane proteins in which α-helices span the plasma membrane seven times and which play a crucial role in signal transmission. We found that Leu side chains are generally more exposed at the protein surface than Ile side chains. We explored whether this difference might be attributed to different functions of the two amino acids and tested if Leu tunes the hydrophobicity of the transmembrane domain based on the Wimley-White whole-residue hydrophobicity scales. Leu content decreases the variation in hydropathy between receptors and correlates with the non-Leu receptor hydropathy. Both measures indicate that hydropathy is tuned by Leu. To test this idea further, we generated protein sequences with random amino acid compositions using a simple numerical model, in which hydropathy was tuned by adjusting the number of Leu residues. The model was able to replicate the observations made with class A GPCR sequences. We speculate that the hydropathy of transmembrane domains of class A GPCRs is tuned by Leu (and to some lesser degree by Lys and Val) to facilitate correct insertion into membranes and/or to stably anchor the receptors within membranes.
