Similar literature
 20 similar articles found
1.
Despite significant methodological advances in protein structure determination, high-resolution structures of membrane proteins are still rare, leaving sequence-based predictions as the only option for exploring the structural variability of membrane proteins at large scale. Here, a new structural classification approach for α-helical membrane proteins is introduced based on the similarity of predicted helix interaction patterns. Its application to proteins with known 3D structure showed that it is able to reliably detect structurally similar proteins even in the absence of any sequence similarity, reproducing the SCOP and CATH classifications with a sensitivity of 65% at a specificity of 90%. We applied the new approach to enhance our comprehensive structural classification of α-helical membrane proteins (CAMPS), which is primarily based on sequence and topology similarity, in order to find protein clusters that describe the same fold in the absence of sequence similarity. A total of 151 helix architectures were delineated for proteins with more than four transmembrane segments. Interestingly, we observed that proteins with 8 or more transmembrane helices correspond to fewer different architectures than proteins with up to 7 helices, suggesting that in large membrane proteins the evolutionary tendency to re-use already available folds is more pronounced.

2.
Recent progress in structure determination techniques has led to a significant growth in the number of known membrane protein structures, and the first structural genomics projects focusing on membrane proteins have been initiated, warranting an investigation of appropriate bioinformatics strategies for optimal structural target selection for these molecules. What determines a membrane protein fold? How many membrane structures need to be solved to provide sufficient structural coverage of the membrane protein sequence space? We present the CAMPS database (Computational Analysis of the Membrane Protein Space) containing almost 45,000 proteins with three or more predicted transmembrane helices (TMH) from 120 bacterial species. This large set of membrane proteins was subjected to single‐linkage clustering using only sequence alignments covering at least 40% of the TMH present in a given family. This process yielded 266 sequence clusters with at least 15 members, roughly corresponding to membrane structural folds, sufficiently structurally homogeneous in terms of the variation of TMH number between individual sequences. These clusters were further subdivided into functionally homogeneous subclusters according to the COG (Clusters of Orthologous Groups) system as well as more stringently defined families sharing at least 30% identity. The CAMPS sequence clusters are thus designed to reflect three main levels of interest for structural genomics: fold, function, and modeling distance. We present a library of Hidden Markov Models (HMM) derived from sequence alignments of TMH at these three levels of sequence similarity. Given that 24 out of 266 clusters corresponding to membrane folds already have associated known structures, we estimate that 242 additional new structures, one for each remaining cluster, would provide structural coverage at the fold level of roughly 70% of prokaryotic membrane proteins belonging to the currently most populated families. Proteins 2006. 
© 2006 Wiley‐Liss, Inc.
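The single-linkage clustering step mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the pairwise links have already been decided (for example, by an alignment-coverage criterion such as the 40% TMH coverage quoted above); the function name `single_linkage_clusters` and the toy protein identifiers are hypothetical and are not taken from CAMPS.

```python
from collections import defaultdict

def single_linkage_clusters(items, links):
    """Group items into single-linkage clusters (connected components).

    items: iterable of hashable identifiers.
    links: iterable of (a, b) pairs judged similar enough to be linked.
    """
    parent = {x: x for x in items}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters

    groups = defaultdict(set)
    for x in parent:
        groups[find(x)].add(x)
    # Largest clusters first; ties broken by the smallest member name.
    return sorted(groups.values(), key=lambda g: (-len(g), min(g)))

# Toy input: P1-P2 and P2-P3 are linked, so P1 and P3 share a cluster
# even though they were never linked directly.
clusters = single_linkage_clusters(
    ["P1", "P2", "P3", "P4", "P5"],
    [("P1", "P2"), ("P2", "P3"), ("P4", "P5")],
)
```

A single link is enough to merge two groups, which is why single-linkage clusters can chain together distant sequences; the abstract's coverage requirement presumably serves to limit such chaining.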

3.
Suwa M, Yudate HT, Masuho Y, Mitaku S. Proteins 2000, 41(4):504-517
A new theoretical method has been developed for recognition and classification of membrane proteins. The method is based on computation of a polar energy surface that can reveal characteristic interaction patterns for individual helices even if crystal or NMR structure coordinates are not available. A protein with N transmembrane helices is described as a set of N vectors that are derived from a Fourier analysis of this polar energy surface computed for each helix. We then derive a polarity difference score (PDS) for any two proteins computed as the root mean square deviation between the respective vector coordinate sets. The score was found to correlate with the degree of structural similarity between the following three protein families for which tertiary structures have been determined: bacteriorhodopsin, rhodopsin, and the cytochrome c oxidase III subunit.
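The root mean square deviation underlying the PDS can be sketched in Python as follows. This is only an illustration under stated assumptions: both proteins have the same number of helices, helices are paired by their order, the vectors are compared as given with no prior superposition, and the function name `polarity_difference_score` is hypothetical. The abstract does not specify these details.

```python
import math

def polarity_difference_score(vectors_a, vectors_b):
    """RMSD between two equally sized sets of per-helix vectors.

    vectors_a, vectors_b: lists of coordinate tuples, one per helix,
    assumed to be paired by helix order (an assumption of this sketch).
    """
    if len(vectors_a) != len(vectors_b) or not vectors_a:
        raise ValueError("both proteins need the same, non-zero number of helices")
    squared = 0.0
    for va, vb in zip(vectors_a, vectors_b):
        # Squared Euclidean distance between the paired helix vectors.
        squared += sum((x - y) ** 2 for x, y in zip(va, vb))
    return math.sqrt(squared / len(vectors_a))
```

Identical vector sets give a score of zero, and larger scores indicate more dissimilar polarity patterns.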

4.
Structural classification of membrane proteins is still in its infancy due to the relative paucity of available three‐dimensional structures compared with soluble proteins. However, recent technological advances in protein structure determination have led to a significant increase in experimentally known membrane protein folds, warranting exploration of the structural universe of membrane proteins. Here, a new and completely membrane protein specific structural classification system is introduced that classifies α‐helical membrane proteins according to common helix architectures. Each membrane protein is represented by a helix interaction graph depicting transmembrane helices with their pairwise interactions resulting from individual residue contacts. Subsequently, proteins are clustered according to similarities among these helix interaction graphs using a newly developed structural similarity score called HISS. As HISS scores explicitly disregard structural properties of loop regions, they are more suitable to capture conserved transmembrane helix bundle architectures than other structural similarity scores. Importantly, we are able to show that a classification approach based on helix interaction similarity closely resembles conventional structural classification databases such as SCOP and CATH implying that helix interactions are one of the major determinants of α‐helical membrane protein folds. Furthermore, the classification of all currently available membrane protein structures into 20 recurrent helix architectures and 15 singleton proteins demonstrates not only an impressive variability of membrane helix bundles but also the conservation of common helix interaction patterns among proteins with distinctly different sequences. Proteins 2010. © 2010 Wiley‐Liss, Inc.
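A helix interaction graph of the kind described here can be sketched in Python. The code below only derives helix-helix edges from residue contacts; it does not implement the HISS score, whose definition is not given in the abstract. The function name, the input layout, and the `min_contacts` cutoff are assumptions made for illustration.

```python
from collections import Counter

def helix_interaction_graph(helices, residue_contacts, min_contacts=1):
    """Return {(i, j): n_contacts} for interacting helix pairs with i < j.

    helices: list of (start, end) residue ranges, inclusive.
    residue_contacts: iterable of (res_a, res_b) residue-number pairs.
    min_contacts: hypothetical cutoff for calling two helices interacting.
    """
    def helix_of(res):
        for idx, (start, end) in enumerate(helices):
            if start <= res <= end:
                return idx
        return None  # residue lies outside all helices (loop region)

    counts = Counter()
    for a, b in residue_contacts:
        ha, hb = helix_of(a), helix_of(b)
        # Skip loop residues and contacts within a single helix.
        if ha is None or hb is None or ha == hb:
            continue
        counts[(min(ha, hb), max(ha, hb))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_contacts}

# Toy example: three helices and five residue contacts.
graph = helix_interaction_graph(
    helices=[(10, 30), (40, 60), (70, 90)],
    residue_contacts=[(12, 58), (15, 55), (45, 75), (5, 80), (20, 25)],
    min_contacts=2,
)
```

Ignoring residues outside the helix ranges mirrors the abstract's point that loop regions are deliberately disregarded.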

5.
For over 2 decades, continuous efforts to organize the jungle of available protein structures have been underway. Although a number of discrepancies between different classification approaches for soluble proteins have been reported, the classification of membrane proteins has so far not been comparatively studied because of the limited amount of available structural data. Here, we present an analysis of α‐helical membrane protein classification in the SCOP and CATH databases. In the current set of 63 α‐helical membrane protein chains having between 1 and 13 transmembrane helices, we observed a number of differently classified proteins both regarding their domain and fold assignment. The majority of all discrepancies affect single transmembrane helix, two helix hairpin, and four helix bundle domains, while domains with more than five helices are mostly classified consistently between SCOP and CATH. It thus appears that the structural constraints imposed by the lipid bilayer complicate the classification of membrane proteins with only a few membrane‐spanning regions. This problem seems to be specific for membrane proteins as soluble four helix bundles, not restrained by the membrane, are more consistently classified by SCOP and CATH. Our findings indicate that the structural space of small membrane helix bundles is highly continuous such that even minor differences in individual classification procedures may lead to a significantly different classification. Membrane proteins with few helices and limited structural diversity only seem to be reasonably classifiable if the definition of a fold is adapted to include more fine‐grained structural features such as helix–helix interactions and reentrant regions. Proteins 2010. © 2010 Wiley‐Liss, Inc.

6.
Transmembrane proteins make up at least one-fifth of the genome of most organisms and are critical components of key pathways for cell survival and interactions with the environment. The function of helices found at the membrane surface in transmembrane proteins has not been greatly explored, but it is likely that they play an ancillary role to membrane spanning helices and are analogous to the surface active helices of peripheral membrane proteins, being involved in lipid association, membrane perturbation, transmembrane signal transduction and regulation, and transmembrane helical bundle formation. Due to the difficulties in obtaining high-resolution structural data for this class of proteins, structure-from-sequence predictive methods continue to be developed as a means to obtain structural models for these largely intractable systems. A simple but effective variant of the hydrophobic moment analysis of amino acid sequences is described here as part of a protocol for distinguishing helical sequences that are parallel to or 'horizontal' at the membrane bilayer/aqueous phase interface from helices that are membrane-embedded or located in extra-membranous domains. This protocol, when tested on transmembrane spanning protein amino acid sequences not used in its development, was found to be 84-91% accurate when the results were compared to the partition locations in the corresponding structures determined by X-ray crystallography, and 72% accurate in determining which helices lie horizontal or near horizontal at the lipid interface.
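The abstract does not describe its variant of hydrophobic moment analysis, so the Python sketch below shows only the classical form of the calculation as a point of reference: per-residue hydrophobicity values are treated as vectors spaced at a fixed angle around the helix axis and summed. The 100 degree step for an α-helix is the conventional choice; the function name and the normalisation by window length are assumptions of this sketch.

```python
import math

def hydrophobic_moment(hydrophobicities, angle_deg=100.0):
    """Classical hydrophobic moment of a window, divided by its length.

    hydrophobicities: per-residue values from any scale (caller's choice).
    angle_deg: assumed angular step per residue (100 degrees for an alpha-helix).
    """
    if not hydrophobicities:
        raise ValueError("the window must contain at least one residue")
    delta = math.radians(angle_deg)
    # Project each residue's hydrophobicity onto two perpendicular axes.
    sin_sum = sum(h * math.sin(n * delta) for n, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(n * delta) for n, h in enumerate(hydrophobicities))
    return math.hypot(sin_sum, cos_sum) / len(hydrophobicities)
```

A window with uniform hydrophobicity over whole helical turns gives a moment near zero, whereas an amphipathic helix, with hydrophobic residues concentrated on one face, gives a large value. That contrast is what makes the measure useful for spotting helices lying along the membrane interface.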

7.
A total of 160 transmembrane helices from 15 non-homologous high-resolution X-ray protein structures have been analyzed with respect to their structural features. The dihedral angles and hydrogen bonds of the helical sections that span the hydrophobic interior of the lipid bilayer have been investigated. The Ramachandran plot of protein channels and solute transporters exhibits a significant shift in the mean phi and psi angles (+4.5° and -5.4°, respectively) compared to a reference group of 151 α-helices of the same average length derived from water-soluble globular proteins. At the C-termini of transmembrane helices, structural motifs equivalent to the Gly-caps of helices in globular proteins have been found, with two thirds of the transmembrane Gly-caps taking up a primary structure that is typically not found at helix termini exposed to a polar solvent. The structural particularities reported here are relevant for the three-dimensional modelling of membrane protein structures.

8.
Fuchs A, Kirschner A, Frishman D. Proteins 2009, 74(4):857-871
Despite rapidly increasing numbers of available 3D structures, membrane proteins still account for less than 1% of all structures in the Protein Data Bank. Recent high-resolution structures indicate a clearly broader structural diversity of membrane proteins than initially anticipated, motivating the development of reliable structure prediction methods specifically tailored for this class of molecules. One important prediction target capturing all major aspects of a protein's 3D structure is its contact map. Our analysis shows that computational methods trained to predict residue contacts in globular proteins perform poorly when applied to membrane proteins. We have recently published a method to identify interacting alpha-helices in membrane proteins based on the analysis of coevolving residues in predicted transmembrane regions. Here, we present a substantially improved algorithm for the same problem, which uses a newly developed neural network approach to predict helix-helix contacts. In addition to the input features commonly used for contact prediction of soluble proteins, such as windowed residue profiles and residue distance in the sequence, our network also incorporates features that apply to membrane proteins only, such as residue position within the transmembrane segment and its orientation toward the lipophilic environment. The obtained neural network can predict contacts between residues in transmembrane segments with nearly 26% accuracy. It is therefore the first published contact predictor developed specifically for membrane proteins performing with equal accuracy to state-of-the-art contact predictors available for soluble proteins. The predicted helix-helix contacts were employed in a second step to identify interacting helices. For our dataset consisting of 62 membrane proteins of solved structure, we gained an accuracy of 78.1%. Because the reliable prediction of helix interaction patterns is an important step in the classification and prediction of membrane protein folds, our method will be a helpful tool in compiling a structural census of membrane proteins.

9.
More than 30 organisms have been sequenced entirely. Here, we applied a variety of simple bioinformatics tools to analyze 29 proteomes for representatives from all three kingdoms: eukaryotes, prokaryotes, and archaebacteria. We confirmed that eukaryotes have relatively more long proteins than prokaryotes and archaea, and that the overall amino acid composition is similar among the three. We predicted that approximately 15%-30% of all proteins contained transmembrane helices. We could not find a correlation between the content of membrane proteins and the complexity of the organism. In particular, we did not find significantly higher percentages of helical membrane proteins in eukaryotes than in prokaryotes or archaea. However, we found more proteins with seven transmembrane helices in eukaryotes and more with six and 12 transmembrane helices in prokaryotes. We found twice as many coiled-coil proteins in eukaryotes (10%) as in prokaryotes and archaea (4%-5%), and we predicted approximately 15%-25% of all proteins to be secreted by most eukaryotes and prokaryotes. Every tenth protein had no known homolog in current databases, and 30%-40% of the proteins fell into structural families with >100 members. A classification by cellular function verified that eukaryotes have a higher proportion of proteins for communication with the environment. Finally, we found at least one homolog of experimentally known structure for approximately 20%-45% of all proteins; the regions with structural homology covered 20%-30% of all residues. These numbers may or may not suggest that there are 1200-2600 folds in the universe of protein structures. All predictions are available at http://cubic.bioc.columbia.edu/genomes.

10.
The problem of rational target selection for protein structure determination in structural genomics projects on microbes is addressed. A flexible computational procedure is described that directly incorporates the whole body of annotation available in the PEDANT genome database into the sequence clustering and selection process in order to identify proteins that are likely to possess currently unknown structural domains. Filtering out gene products based on predicted structural features, such as known three-dimensional structures and transmembrane regions, allows one to reduce the complexity of neighbor relationships between sequences and all but eliminates the need for further partitioning of single-linkage clusters into disjoint protein groups corresponding to homologous families. The results of a large-scale computation experiment in which exemplary target selection for 32 prokaryotic genomes was conducted are presented.

11.
Bacteriorhodopsin is one of very few transmembrane proteins for which high resolution structures have been solved. The structure shows a bundle of seven helices connected by six turns. Some turns in proteins are stabilized by short range interactions and can behave as small domains. These observations suggest that peptides containing the sequence of the turns in a membrane protein such as bacteriorhodopsin may form stable turn structures in solution. To test this hypothesis, we determined the solution structure of three peptides each containing the sequence of one of the turns in bacteriorhodopsin. The solution structures of the peptides closely resemble the structures of the corresponding turns in the high resolution structures of the intact protein.

12.
The Profiles-3D application, an inverse-folding methodology appropriate for water-soluble proteins, has been modified to allow the determination of structural properties of integral-membrane proteins (IMPs) and for testing the validity of solved and model structures of IMPs. The modification, known as reverse-environment prediction of integral membrane protein structure (REPIMPS), takes into account the fact that exposed areas of side chains for many residues in IMPs are in contact with lipid and not the aqueous phase. This (1) allows lipid-exposed residues to be classified into the correct physicochemical environment class, (2) significantly improves compatibility scores for IMPs whose structures have been solved, and (3) reduces the possibility of rejecting a three-dimensional structure for an IMP because the presence of lipid was not included. Validation tests of REPIMPS showed that it (1) can locate the transmembrane domain of IMPs with single transmembrane helices more frequently than a range of other methodologies, (2) can rotationally orient transmembrane helices with respect to the lipid environment and surrounding helices in IMPs with multiple transmembrane helices, and (3) has the potential to accurately locate transmembrane domains in IMPs with multiple transmembrane helices. We conclude that correcting for the presence of the lipid environment surrounding the transmembrane segments of IMPs is an essential step for reasonable modeling and verification of the three-dimensional structures of these proteins.

13.
Adamian L, Liang J. Proteins 2006, 63(1):1-5
Analysis of a database of structures of membrane proteins shows that membrane proteins composed of 10 or more transmembrane (TM) helices often contain buried helices that are inaccessible to phospholipids. We introduce a method for identifying TM helices that are least phospholipid accessible and for prediction of fully buried TM helices in membrane proteins from sequence information alone. Our method is based on the calculation of residue lipophilicity and evolutionary conservation. Given that the number of buried helices in a membrane protein is known, our method achieves an accuracy of 78% and a Matthews correlation coefficient of 0.68. A server for this tool (RANTS) is available online at http://gila.bioengr.uic.edu/lab/.
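For readers unfamiliar with the two performance measures quoted above, the following Python sketch computes accuracy and the Matthews correlation coefficient from a binary confusion matrix (here, buried versus non-buried helices). The function names are hypothetical and the counts in the usage comment are invented; they are not the data behind the reported 78% and 0.68.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Usage with made-up counts: mcc(tp=8, tn=30, fp=2, fn=3)
```

Unlike accuracy, the coefficient stays informative when one class is much rarer than the other, which is likely the case for buried helices.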

14.
A hypothesis is tested that individual peptides corresponding to the transmembrane helices of the membrane protein, rhodopsin, would form helices in solution similar to those in the native protein. Peptides containing the sequences of helices 1, 4 and 5 of rhodopsin were synthesized. Two peptides, with overlapping sequences at their termini, were synthesized to cover each of the helices. The peptides from helix 1 and helix 4 were helical throughout most of their length. The N- and C-termini of all the peptides were disordered and proline caused opening of the helical structure in both helix 1 and helix 4. The peptides from helix 5 were helical in the middle segment of each peptide, with larger disordered regions in the N- and C-termini than for helices 1 and 4. These observations show that there is a strong helical propensity in the amino acid sequences corresponding to the transmembrane domain of this G-protein coupled receptor. In the case of the peptides from helix 4, it was possible to superimpose the structures of the overlapping sequences to produce a construct covering the whole of the sequence of helix 4 of rhodopsin. A similar superposition for the peptides from helix 1 also produced a construct, but somewhat less successfully because of the disordering in the region of sequence overlap. This latter problem was more severe for helix 5 and therefore a single peptide was synthesized for the entire sequence of this helix, and its structure determined. It proved to be helical throughout. Comparison of all these structures with the recent crystal structure of rhodopsin revealed that the peptide structures mimicked the structures seen in the whole protein. Thus similar studies of peptides may provide useful information on the secondary structure of other transmembrane proteins built around helical bundles.

15.
Liu Y, Engelman DM, Gerstein M. Genome Biology 2002, 3(10):research0054.1-research0054.12

Background

Polytopic membrane proteins can be related to each other on the basis of the number of transmembrane helices and sequence similarities. Building on the Pfam classification of protein domain families, and using transmembrane-helix prediction and sequence-similarity searching, we identified a total of 526 well-characterized membrane protein families in 26 recently sequenced genomes. To this we added a clustering of a number of predicted but unclassified membrane proteins, resulting in a total of 637 membrane protein families.

Results

Analysis of the occurrence and composition of these families revealed several interesting trends. The number of assigned membrane protein domains has an approximately linear relationship to the total number of open reading frames (ORFs) in 26 genomes studied. Caenorhabditis elegans is an apparent outlier, because of its high representation of seven-span transmembrane (7-TM) chemoreceptor families. In all genomes, including that of C. elegans, the number of distinct membrane protein families has a logarithmic relation to the number of ORFs. Glycine, proline, and tyrosine locations tend to be conserved in transmembrane regions within families, whereas isoleucine, valine, and methionine locations are relatively mutable. Analysis of motifs in putative transmembrane helices reveals that GxxxG and GxxxxxxG (which can be written GG4 and GG7, respectively; see Materials and methods) are among the most prevalent. This was noted in earlier studies; we now find these motifs are particularly well conserved in families, however, especially those corresponding to transporters, symporters, and channels.

Conclusions

We carried out a genome-wide analysis on patterns of the classified polytopic membrane protein families and analyzed the distribution of conserved amino acids and motifs in the transmembrane helix regions in these families.
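The GG4 (GxxxG) and GG7 (GxxxxxxG) motifs discussed in the Results can be located in a transmembrane segment with a short Python sketch. The regular expressions, the helper name `motif_positions`, and the example sequence are all made up for illustration; the study's own motif analysis is not described in enough detail here to reproduce.

```python
import re

# Lookahead patterns, so that overlapping occurrences are all reported.
GG4 = re.compile(r"(?=G.{3}G)")  # glycines at positions i and i+4
GG7 = re.compile(r"(?=G.{6}G)")  # glycines at positions i and i+7

def motif_positions(segment, pattern):
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [m.start() for m in pattern.finditer(segment)]

helix = "LIGAVLGVVGALLGALA"  # made-up transmembrane-like sequence

gg4_hits = motif_positions(helix, GG4)
gg7_hits = motif_positions(helix, GG7)
```

Counting such hits across aligned family members would be one way to ask how well a motif is conserved within a family.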

16.
We have determined the optimal placement of individual transmembrane helices in the Pyrococcus horikoshii GltPh glutamate transporter homolog in the membrane. The results are in close agreement with theoretical predictions based on hydrophobicity, but do not, in general, match the known three-dimensional structure, suggesting that transmembrane helices can be repositioned relative to the membrane during folding and oligomerization. Theoretical analysis of a database of membrane protein structures provides additional support for this idea. These observations raise new challenges for the structure prediction of membrane proteins and suggest that the classical two-stage model often used to describe membrane protein folding needs to be modified.

17.
Shen H, Chou JJ. PloS one 2008, 3(6):e2399
Prediction of transmembrane helices (TMH) in alpha helical membrane proteins provides valuable information about the protein topology when the high resolution structures are not available. Many predictors have been developed based on either amino acid hydrophobicity scales or purely statistical approaches. While these predictors perform reasonably well in identifying the number of TMHs in a protein, they are generally inaccurate in predicting the ends of TMHs, or TMHs of unusual length. To improve the accuracy of TMH detection, we developed a machine-learning based predictor, MemBrain, which integrates a number of modern bioinformatics approaches including sequence representation by multiple sequence alignment matrix, the optimized evidence-theoretic K-nearest neighbor prediction algorithm, fusion of multiple prediction window sizes, and classification by dynamic threshold. MemBrain demonstrates an overall improvement of about 20% in prediction accuracy, particularly in predicting the ends of TMHs and TMHs that are shorter than 15 residues. It also has the capability to detect N-terminal signal peptides. The MemBrain predictor is a useful sequence-based analysis tool for functional and structural characterization of helical membrane proteins; it is freely available at http://chou.med.harvard.edu/bioinf/MemBrain/.

18.
Structure prediction of membrane proteins could be constrained and thereby improved by introducing data on the observed molecular shape. We studied a coarse-grained molecular model that relied on residue-based dummy atoms to fold the transmembrane helices of a protein into the observed molecular shape. Based on the inter-residue potential, the α-helices were folded to contact each other in a simulated annealing protocol to search for an optimized conformation. Fitting the model into a three-dimensional volume was tested for proteins with known structures and resulted in a fairly reasonable arrangement of helices. In addition, constraining the packing of transmembrane helices to a two-dimensional region was tested and found to work as a very similar folding guide. The obtained models nicely represented α-helices with the desired slight bend. Our structure prediction method for membrane proteins produced reasonable folding results using a low-resolution structural constraint of the kind provided by recent cell-surface imaging techniques.

19.
We review recent computational advances in the study of membrane proteins, focusing on those that have at least one transmembrane helix. Transmembrane protein regions are, in many respects, easier to investigate computationally than experimentally, due to the uniformity of their structure and interactions (e.g. consisting predominantly of nearly parallel helices packed together) on the one hand and the challenges of solubility they present on the other. We present the progress made on identifying and classifying membrane proteins into families, predicting their structure from amino-acid sequence patterns (using many different methods), and analyzing their interactions and packing. The total result of this work allows us for the first time to begin to think about the membrane protein interactome, the set of all interactions between distinct transmembrane helices in the lipid bilayer.

20.

Background

Few high-resolution structures of integral membrane proteins are available, as crystallization of such proteins still has to overcome many technical limitations. Nevertheless, prediction of their transmembrane (TM) structure by bioinformatics tools provides interesting insights into the topology of these proteins.

Methods

We describe here how to extract new information from the analysis of hydrophobicity variations or hydrophobic pulses (HPulses) in the sequence of integral membrane proteins using the Hydrophobic Pulse Predictor, a new tool we developed for this purpose. To analyze the primary sequence of 70 integral membrane proteins we defined two levels of analysis: G1-HPulses for sliding windows of n = 2 to 6 and G2-HPulses for sliding windows of n = 12 to 16.

Results

The G2-HPulse analysis of 541 transmembrane helices allowed the definition of the new concept of transmembrane unit (TMU) that groups together transmembrane helices and segments with potential adjacent structures. In addition, the G1-HPulse analysis identified helix irregularities that corresponded to kinks, partial helices or unannotated structural events. These irregularities could represent key dynamic elements that are alternatively activated depending on the channel status as illustrated by the crystal structures of the lactose permease in different conformations.

Conclusions

Our results open a new way of understanding transmembrane secondary structures: hydrophobicity, expressed through hydrophobic pulses, strongly affects such embedded structures and is not confined to defining the transmembrane status of amino acids.
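The two window ranges named in the Methods (n = 2 to 6 and n = 12 to 16) can be illustrated with a plain sliding-window hydropathy profile in Python. This is not the Hydrophobic Pulse Predictor, whose definition of a pulse is not given in the abstract; it is a generic sketch that assumes the Kyte-Doolittle hydropathy scale and a simple window mean, with a hypothetical function name and a made-up sequence.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def window_means(sequence, n):
    """Mean hydropathy of every window of n consecutive residues."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    # One mean per window start; empty if the sequence is shorter than n.
    return [sum(values[i:i + n]) / n for i in range(len(values) - n + 1)]

sequence = "MKTLLIAVGLLALVAGSRDE"  # made-up example sequence

# Short windows (fine-grained) and long windows (helix-scale), using the
# ranges quoted in the Methods above.
short_profiles = {n: window_means(sequence, n) for n in range(2, 7)}
long_profiles = {n: window_means(sequence, n) for n in range(12, 17)}
```

Short windows respond to local irregularities, while long windows approach the length of a membrane-spanning helix. That difference in scale is presumably why the study separates the two levels.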
