Similar articles
 20 similar articles found (search time: 15 ms)
1.
We report a comprehensive analysis of the numbers, lengths and amino acid compositions of transmembrane helices in 235 high-resolution structures of integral membrane proteins. The properties of 1551 transmembrane helices in the structures were compared with those obtained by analysis of the same amino acid sequences using topology prediction tools. Explanations for the 81 (5.2%) missing or additional transmembrane helices in the prediction results were identified. The main reasons for missing transmembrane helices were mis-identification of N-terminal signal peptides, breaks in α-helix conformation or charged residues in the middle of transmembrane helices, and transmembrane helices with unusual amino acid composition. The main reason for additional transmembrane helices was mis-identification of amphipathic helices, extramembrane helices or hairpin re-entrant loops. Transmembrane helix length had an overall median of 24 residues and an average of 24.9 ± 7.0 residues, and the most common length was 23 residues. The overall content of residues in transmembrane helices as a percentage of the full proteins had a median of 56.8% and an average of 55.7 ± 16.0%. Amino acid composition was analysed for the full proteins, transmembrane helices and extramembrane regions. Individual proteins or types of proteins with transmembrane helices containing extremes in contents of individual amino acids or combinations of amino acids with similar physicochemical properties were identified and linked to structure and/or function. In addition to overall median and average values, all results were analysed for proteins originating from different types of organism (prokaryotic, eukaryotic, viral) and for subgroups of receptors, channels, transporters and others.

2.
The amino acid compositions of proteins from halophilic archaea were compared with those from non-halophilic mesophiles and thermophiles, in terms of the protein surface and interior, on a genome-wide scale. As we previously reported for proteins from thermophiles, a biased amino acid composition also exists in halophiles, in which an abundance of acidic residues was found on the protein surface as compared to the interior. This general feature did not seem to depend on the individual protein structures, but was applicable to all proteins encoded within the entire genome. Unique protein surface compositions are common in both halophiles and thermophiles. Statistical tests have shown that significant surface compositional differences exist among halophiles, non-halophiles, and thermophiles, while the interior composition within each of the three types of organisms does not significantly differ. Although thermophilic proteins have an almost equal abundance of both acidic and basic residues, a large excess of acidic residues in halophilic proteins seems to be compensated by fewer basic residues. Aspartic acid, lysine, asparagine, alanine, and threonine significantly contributed to the compositional differences of halophiles from meso- and thermophiles. Among them, however, only aspartic acid deviated largely from the expected amount estimated from the dinucleotide composition of the genomic DNA sequence of the halophile, which has an extremely high G+C content (68%). Thus, the other residues with large deviations (Lys, Ala, etc.) from their non-halophilic frequencies could have arisen merely as "dragging effects" caused by the compositional shift of the DNA, which would have changed to increase principally the fraction of aspartic acid alone.

3.
We have performed a comparative analysis of amino acid distributions in predicted integral membrane proteins from a total of 107 genomes. A procedure for identification of membrane spanning helices was optimized on a homology-reduced data set of 170 multi-spanning membrane proteins with experimentally determined topologies. The optimized method was then used for extraction of highly reliable partial topologies from all predicted membrane proteins in each genome, and the average biases in amino acid distributions between loops on opposite sides of the membrane were calculated. The results strongly support the notion that a biased distribution of Lys and Arg residues between cytoplasmic and extra-cytoplasmic segments (the positive-inside rule) is present in most if not all organisms.
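
The Lys/Arg bias described in this abstract can be illustrated with a minimal Python sketch. It assumes the loops have already been assigned to the cytoplasmic or extra-cytoplasmic side; the function names and the toy loop sequences are hypothetical and are not taken from the study.

```python
def kr_fraction(segments):
    """Fraction of Lys (K) and Arg (R) residues across a list of loop sequences."""
    total = sum(len(s) for s in segments)
    positives = sum(s.count("K") + s.count("R") for s in segments)
    return positives / total if total else 0.0


def positive_inside_bias(inside_loops, outside_loops):
    """Difference in K+R fraction between cytoplasmic and extra-cytoplasmic loops.

    A positive value means Lys/Arg are enriched on the cytoplasmic side.
    """
    return kr_fraction(inside_loops) - kr_fraction(outside_loops)


# Hypothetical loop sequences, invented for illustration only.
inside = ["MKRLKS", "GRKKAT"]
outside = ["SDNGTE", "QPLGSA"]
bias = positive_inside_bias(inside, outside)
print(f"K+R bias (inside - outside): {bias:.2f}")
```

A genome-wide analysis would average such a bias over many reliably predicted topologies rather than over two loops per side.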

4.
The evolution of protein folds is under strong constraints from their surrounding environment. Although folding in water‐soluble proteins is driven primarily by hydrophobic forces, the nature of the forces that determine the folding and stability of transmembrane proteins is still not fully understood. Furthermore, the chemically heterogeneous lipid bilayer has a non‐uniform effect on protein structure. In this article, we attempt to gain insight into the nature of this effect by examining the impact of various types of local structure environment on amino acid substitution, based on alignments of high‐resolution structures of polytopic helical transmembrane proteins combined with sequences of close homologs. Compared to globular proteins, burying amino acid sidechains, especially hydrophilic ones, led to a lower increase in conservation in both the lipid‐water interface region and the hydrocarbon core region. This observation is due to surface residues in HTM proteins, especially in the HC region, being relatively highly conserved, suggesting higher evolutionary constraints from their specific interactions with the surrounding lipid molecules. Polar and small residues, particularly Pro and Gly, show a noticeable increase in conservation as they are positioned more towards the centre of the membrane, which is consistent with their recognized key roles in structural stability. In addition, the examination of hydrogen bonds in the membrane environment identified some exposed hydrophilic residues being better conserved when not hydrogen‐bonded to other residues, supporting the importance of lipid‐protein sidechain interactions. The conclusions presented in this study highlight the distinct features of substitution matrices that take into account the membrane environment, and their potential role in improving sequence‐structure alignments of transmembrane proteins. Proteins 2010; © 2010 Wiley‐Liss, Inc.

5.
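Xu Jia (许嘉) 《生物信息学》 (Chinese Journal of Bioinformatics) 2013,11(4):297-299
Antifreeze proteins are a class of proteins that enhance an organism's resistance to freezing. They bind specifically to ice crystals and thereby prevent the formation and growth of ice nuclei in body fluids. Bioinformatics research on antifreeze proteins is therefore important for advancing bioengineering and for improving the freezing tolerance of crops. In this work, a dataset of 400 antifreeze protein sequences and 400 non-antifreeze protein sequences was used, with pseudo amino acid composition as the feature set and a support vector machine as the classifier, to predict antifreeze proteins. The prediction accuracy reached 91.3% on the training set and 78.8% on the test set. These results show that pseudo amino acid composition reflects the characteristics of antifreeze proteins well and can be used to predict them.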

6.
The outer membrane proteins (OMPs) are β-barrel membrane proteins that perform many biological functions. Discriminating OMPs from non-OMPs is a very important task for understanding some biochemical processes. In this study, a method that combines increment of diversity with a modified Mahalanobis discriminant, called IDQD, is presented to predict 208 OMPs, 206 transmembrane helical proteins (TMHPs) and 673 globular proteins (GPs) by using Chou's pseudo amino acid compositions as parameters. The overall accuracy of jackknife cross-validation is 93.2% for the three datasets (OMPs, TMHPs and GPs) and 96.1% for the two datasets (OMPs and non-OMPs). These results suggest that the method can be effectively applied to discriminate OMPs, TMHPs and GPs. They also indicate that the pseudo amino acid composition reflects the core features of membrane proteins better than the classical amino acid composition.
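
To make the contrast between classical and pseudo amino acid composition concrete, here is a simplified Python sketch. It follows the general shape of Chou's construction (20 composition terms plus a few sequence-order correlation terms, normalized together), but as an assumption it uses a single standardized property, the Kyte-Doolittle hydropathy, rather than the several properties of the published formulation. The function names, the weight of 0.05 and the example sequence are illustrative choices, not values from the study.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here as the only physicochemical
# property (an assumption made to keep the sketch short).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Standardize the property to zero mean and unit standard deviation
# over the 20 amino acids.
_mu, _sd = mean(KD.values()), pstdev(KD.values())
KD_STD = {aa: (v - _mu) / _sd for aa, v in KD.items()}


def amino_acid_composition(seq):
    """Classical composition: 20 occurrence frequencies that sum to 1."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def pseudo_aac(seq, lam=3, weight=0.05):
    """Simplified pseudo amino acid composition with 20 + lam components.

    Requires len(seq) > lam and only the 20 standard residues.
    """
    freqs = amino_acid_composition(seq)
    thetas = []
    for lag in range(1, lam + 1):
        # Mean squared property difference between residues `lag` apart.
        diffs = [(KD_STD[seq[i]] - KD_STD[seq[i + lag]]) ** 2
                 for i in range(len(seq) - lag)]
        thetas.append(mean(diffs))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


# Hypothetical sequence, for illustration only.
features = pseudo_aac("MKTLLVAGAVLLSACSSAPQ", lam=3)
print(len(features))
```

The extra `lam` components carry sequence-order information that the 20 plain frequencies discard; the resulting vector could then be passed to any classifier.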

7.
With the aim of studying the relationship between protein sequences and their native structures, we adopted vectorial representations for both sequence and structure. The structural representation was based on the principal eigenvector of the fold's contact matrix (PE). As has been recently shown, the latter encodes sufficient information for reconstructing the whole contact matrix. The sequence was represented through a hydrophobicity profile (HP), using a generalized hydrophobicity scale that we obtained from the principal eigenvector of a residue-residue interaction matrix, and denoted as interactivity scale. Using this novel scale, we defined the optimal HP of a protein fold and, by means of stability arguments, predicted it to be strongly correlated with the PE of the fold's contact matrix. This prediction was confirmed through an evolutionary analysis, which showed that the PE correlates with the HP of each individual sequence adopting the same fold and, even more strongly, with the average HP of this set of sequences. Thus, protein sequences evolve in such a way that their average HP is close to the optimal one, implying that neutral evolution can be viewed as a kind of motion in sequence space around the optimal HP. Our results indicate that the correlation coefficient between N-dimensional vectors constitutes a natural metric in the vectorial space in which we represent both protein sequences and protein structures, which we call vectorial protein space. In this way, we define a unified framework for sequence-to-sequence, sequence-to-structure and structure-to-structure alignments. We show that the interactivity scale is nearly optimal both for the comparison of sequences to sequences and sequences to structures.
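
The comparison between a contact matrix's principal eigenvector (PE) and a hydrophobicity profile (HP) can be sketched in plain Python as follows. The 6-residue contact matrix and the profile values are invented for illustration, and power iteration is simply one convenient way to obtain the PE; none of this is taken from the study's own code or data.

```python
import math


def principal_eigenvector(matrix, iterations=500):
    """Principal eigenvector of a symmetric, non-negative contact matrix.

    Uses power iteration on (A + I); the identity shift prevents the
    iteration from oscillating and leaves the eigenvectors unchanged.
    """
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical contact matrix: residue 4 touches every other residue.
contacts = [
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
]
# Hypothetical per-residue hydrophobicity profile.
profile = [0.2, 0.6, 0.5, 0.3, 0.9, 0.1]

pe = principal_eigenvector(contacts)
r = pearson(pe, profile)
print(f"correlation between PE and HP: {r:.3f}")
```

In this toy case the most connected residue receives the largest PE component, which is the sense in which the PE summarizes how buried each position is.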

8.
Proteins fold by either a two‐state or a multistate kinetic mechanism. We observe that amino acids play different roles in the different mechanisms. Many residues that readily form regular secondary structures (α helices, β sheets and turns) can promote the two‐state folding reactions of small proteins. Most hydrophilic residues can speed up the multistate folding reactions of large proteins. Folding rates of large proteins are equally responsive to the flexibility of partial amino acids. Other properties of amino acids (including volume, polarity, accessible surface, exposure degree, isoelectric point, and phase transfer energy) contribute little to the folding kinetics of the proteins. Cysteine is a special residue: it triggers the two‐state folding reaction but inhibits the multistate folding reaction. These findings not only provide new insight into protein structure prediction, but could also be used to direct point mutations that change the folding rate. Proteins 2014; 82:2375–2382. © 2014 Wiley Periodicals, Inc.

9.
Hydrophobicity of amino acid subgroups in proteins (cited 14 times: 0 self-citations, 14 citations by others)
Protein folding studies often utilize areas and volumes to assess the hydrophobic contribution to conformational free energy (Richards, F.M. Annu. Rev. Biophys. Bioeng. 6:151-176, 1977). We have calculated the mean area buried upon folding for every chemical group in each residue within a set of X-ray elucidated proteins. These measurements, together with a standard state cavity size for each group, are documented in a table. It is observed that, on average, each type of group buries a constant fraction of its standard state area. The mean area buried by most, though not all, groups can be closely approximated by summing contributions from three characteristic parameters corresponding to three atom types: (1) carbon or sulfur, which turn out to be 86% buried, on average; (2) neutral oxygen or nitrogen, which are 40% buried, on average; and (3) charged oxygen or nitrogen, which are 32% buried, on average.
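
The three-parameter approximation stated in this abstract amounts to a weighted sum, sketched here in Python. The buried fractions (86%, 40%, 32%) come from the abstract; the dictionary keys, the function name and the standard-state areas in the example are hypothetical, since the abstract's table is not reproduced here.

```python
# Average buried fractions per atom type, as quoted in the abstract.
BURIED_FRACTION = {
    "carbon_or_sulfur": 0.86,
    "neutral_O_or_N": 0.40,
    "charged_O_or_N": 0.32,
}


def estimated_buried_area(standard_state_area):
    """Approximate mean buried area of a chemical group.

    `standard_state_area` maps each atom type to the group's standard-state
    area contributed by atoms of that type (same units in and out).
    """
    return sum(BURIED_FRACTION[atom_type] * area
               for atom_type, area in standard_state_area.items())


# Hypothetical group: 50 area units from carbon, 20 from a neutral oxygen.
group = {"carbon_or_sulfur": 50.0, "neutral_O_or_N": 20.0}
print(estimated_buried_area(group))  # 0.86 * 50 + 0.40 * 20
```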

10.
Adamian L  Nanda V  DeGrado WF  Liang J 《Proteins》2005,59(3):496-509
Characterizing the interactions between amino acid residues and lipid molecules is important for understanding the assembly of transmembrane helices and for studying membrane protein folding. In this study we develop TMLIP (TransMembrane helix-LIPid), an empirically derived propensity of individual residue types to face lipid membrane based on statistical analysis of high-resolution structures of membrane proteins. Lipid accessibilities of amino acid residues within the transmembrane (TM) region of 29 structures of helical membrane proteins are studied with a spherical probe of radius of 1.9 Å. Our results show that there are characteristic preferences for residues to face the headgroup region and the hydrocarbon core region of lipid membrane. Amino acid residues Lys, Arg, Trp, Phe, and Leu are often found exposed at the headgroup regions of the membrane, where they have high propensity to face phospholipid headgroups and glycerol backbones. In the hydrocarbon core region, the strongest preference for interacting with lipids is observed for Ile, Leu, Phe and Val. Small and polar amino acid residues are usually buried inside helical bundles and are strongly lipophobic. There is a strong correlation between various hydrophobicity scales and the propensity of a given residue to face the lipids in the hydrocarbon region of the bilayer. Our data suggest a possibly significant contribution of the lipophobic effect to the folding of membrane proteins. This study shows that membrane proteins have exceedingly apolar exteriors rather than highly polar interiors. Prediction of lipid-facing surfaces of boundary helices using TMLIP1 results in a 54% accuracy, which is significantly better than random (25% accuracy). We also compare performance of TMLIP with another lipid propensity scale, kPROT, and with several hydrophobicity scales using hydrophobic moment analysis.
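
Hydrophobic moment analysis, mentioned at the end of this abstract, can be sketched as follows. The code uses the standard Eisenberg-style definition with a 100-degree angle per residue for an ideal α-helix and takes per-residue scale values as input, so any propensity or hydrophobicity scale could be plugged in. The function name and the example values are hypothetical; the TMLIP values themselves are not reproduced here.

```python
import math


def hydrophobic_moment(values, delta_deg=100.0):
    """Unnormalized hydrophobic moment of a helical segment.

    `values` holds one scale value per residue; `delta_deg` is the angle
    between successive side chains viewed down the helix axis
    (about 100 degrees for an ideal alpha-helix).
    """
    delta = math.radians(delta_deg)
    sin_sum = sum(h * math.sin(n * delta) for n, h in enumerate(values))
    cos_sum = sum(h * math.cos(n * delta) for n, h in enumerate(values))
    return math.hypot(sin_sum, cos_sum)


# A uniform helix of 18 residues (five full turns) has no preferred face.
uniform = [1.0] * 18

# A hypothetical amphipathic pattern: high values on one face, low on the other.
delta = math.radians(100.0)
amphipathic = [1.0 if math.cos(n * delta) > 0 else -1.0 for n in range(18)]

print(hydrophobic_moment(uniform) < hydrophobic_moment(amphipathic))
```

The direction of the moment (not returned in this sketch) is what would indicate which face of a boundary helix is predicted to point toward the lipid.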

11.
In contrast to water-soluble proteins, membrane proteins reside in a heterogeneous environment, and their surfaces must interact with both polar and apolar membrane regions. As a consequence, the composition of membrane proteins' residues varies substantially between the membrane core and the interfacial regions. The amino acid compositions of helical membrane proteins are also known to be different on the cytoplasmic and extracellular sides of the membrane. Here we report that in the 16 transmembrane beta-barrel structures, the amino acid compositions of lipid-facing residues are different near the N and C termini of the individual strands. Polar amino acids are more prevalent near the C termini than near the N termini, and hydrophobic amino acids show the opposite trend. We suggest that this difference arises because it is easier for polar atoms to escape from the apolar regions of the bilayer at the C terminus of a beta-strand. This new characteristic of beta-barrel membrane proteins enhances our understanding of how a sequence encodes a membrane protein structure and should prove useful in identifying and predicting the structures of transmembrane beta-barrels.

12.
Hydrophobicity analyses applied to databases of soluble and transmembrane (TM) proteins of known structure were used to resolve total genomic hydrophobicity profiles into (helical) TM sequences and mainly "subhydrophobic" soluble components. This information was used to define a refined "hydrophobicity"-type TM sequence prediction scale that should approach the theoretical limit of accuracy. The refinement procedure involved adjusting scale values to eliminate differences between the average amino acid composition of populations of TM and soluble sequences of equal hydrophobicity, a required property of a scale having maximum accuracy. Application of this procedure to different hydrophobicity scales caused them to collapse to essentially a single TM tendency scale. As expected, when different scales were compared, the TM tendency scale was the most accurate at predicting TM sequences. It was especially highly correlated (r = 0.95) to the biological hydrophobicity scale, derived experimentally from the percent TM conformation formed by artificial sequences passing through the translocon. It was also found that resolution of total genomic sequence data into TM and soluble components could be used to define the percent probability that a sequence with a specific hydrophobicity value forms a TM segment. Application of the TM tendency scale to whole genomic data revealed an overlap of TM and soluble sequences in the "semihydrophobic" range. This raises the possibility that a significant number of proteins have sequences that can switch between TM and non-TM states. Such proteins may exist in moonlighting forms having properties very different from those of the predominant conformation.
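
Scale-based TM prediction of the kind discussed in this abstract typically scores a sequence with a sliding window. The sketch below shows that mechanism only. As a stand-in it uses the Kyte-Doolittle hydropathy scale, because the TM tendency scale values are not given in the abstract; the window length of 19 and the threshold of 1.6 are conventional Kyte-Doolittle choices, and the sequences are invented.

```python
# Kyte-Doolittle hydropathy values (a stand-in for the TM tendency scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_averages(seq, window=19):
    """Mean scale value of every full-length window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Start indices of windows whose mean score reaches the threshold."""
    return [i for i, avg in enumerate(window_averages(seq, window))
            if avg >= threshold]


# Hypothetical sequences: a poly-Leu core flanked by charged residues,
# and a purely polar repeat.
tm_like = "DDKK" + "L" * 19 + "KKDD"
polar = "DK" * 15

print(candidate_tm_windows(tm_like))
print(candidate_tm_windows(polar))
```

Swapping in another scale only requires replacing the `KD` dictionary and choosing a threshold appropriate to that scale.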

13.
Ribulose-1,5-bisphosphate (RuBP) carboxylases from 38 grass species (26 genera), isolated via affinity chromatography, compare well in amino acid compositions with the enzyme extracted and purified from the same and other species (46 species, 38 genera) by an alternative procedure. Taxonomic differences in the amino acid composition of the enzyme exist between the major grass groups. Those of pooids are distinguishable from those of chloridoids, eu-panicoids and andropogonoids, while those of a bamboo, Oryza, Microlaena and Stipeae bear closer resemblance to those of pooids. The amino acid composition of RuBP carboxylases does not resemble that of total leaf proteins of the grasses. Variations detected in the amino acid compositions of RuBP carboxylases are independent of observed differences in kinetics between C3 and C4 versions of the enzyme.

14.
Integral membrane proteins are central to many cellular processes and constitute approximately 50% of potential targets for novel drugs. However, the number of outer membrane proteins (OMPs) present in the public structure database is very limited due to the difficulties in determining structure with experimental methods. Therefore, discriminating OMPs from non-OMPs with computational methods is of medical importance as well as a necessity for genome sequencing projects. In this study, some sequence-derived structural and physicochemical features of proteins were combined with amino acid composition to discriminate OMPs from non-OMPs using support vector machines. The discrimination performance of the proposed method is evaluated on a benchmark dataset of 208 OMPs, 673 globular proteins, and 206 α-helical membrane proteins. A high overall accuracy of 97.8% was observed in the 5-fold cross-validation test. In addition, the current method distinguished OMPs from globular proteins and α-helical membrane proteins with overall accuracies of 98.2 and 96.4%, respectively. The prediction performance is superior to the state-of-the-art methods in the literature. It is anticipated that the current method might be a powerful tool for the discrimination of OMPs.

15.
16.
A method is described for quantitatively hydrolyzing proteins in 45 min and for analyzing the hydrolysates by high-performance liquid chromatography in an additional 52 min. The α-amino acids were detected by the fluorescence of their o-phthaldialdehyde derivatives. Ten picomoles of each of the commonly occurring α-amino acids could be reliably determined. The method described yielded OPA-ethanethiol amino acid derivatives that were stable for 1h h, and the HPLC method produced a better separation than previously published methods.

17.
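Membrane proteins are important drug targets, and studying membrane protein types aids successful drug design; correct prediction of membrane protein type is therefore essential for drug development. This work uses a dataset of 274 mycobacterial membrane protein sequences with sequence identity below 40%, takes optimized pseudo amino acid composition as the feature set, and predicts mycobacterial membrane protein types with a support vector machine classifier. Under the jackknife test, an overall accuracy of 85.4% and an average accuracy of 72.2% were obtained. The results indicate that the method can be used to identify mycobacterial membrane protein types and will help the development of anti-mycobacterial drugs.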

18.
Ma BG  Guo JX  Zhang HY 《Proteins》2006,65(2):362-372
Discovering the mechanism of protein folding is a great challenge in molecular biology. A key step to this end is to find factors that correlate with protein folding rates. Over the past few years, many empirical parameters, such as contact order, long-range order, total contact distance and secondary structure contents, have been developed to reflect the correlation between folding rates and protein tertiary or secondary structures. However, the correlation between proteins' folding rates and their amino acid compositions has not been explored. In the present work, we examined systematically the correlation between proteins' folding rates and their amino acid compositions for two-state and multistate folders and found that different amino acids contributed differently to the folding progress. The relation between the amino acids' molecular weight and degeneracy and the folding rates was examined, and the role of hydrophobicity in the protein folding process was also inspected. As a consequence, a new indicator called composition index was derived, which takes no structure factors into account and is determined solely by the amino acid composition of a protein. Such an indicator is found to be highly correlated with the protein's folding rate (r > 0.7). Three conclusions follow from this work. (1) Two-state folders and multistate folders have different rate-determining amino acids. (2) The main determining information of a protein's folding rate is largely reflected in its amino acid composition. (3) Composition index may be the best predictor for an ab initio protein folding rate prediction directly from protein sequence from the standpoint of practical application.

19.
The hydrophobicity of myelinic, synaptosomal and mitochondrial surfaces in the rat brain was measured using the nonionic surfactant, C18H37O(CH2CH2O)13H. This method is based on the adsorption of the hydrophobic alkyl group of the surfactant by the hydrophobic sites on the surfaces. Each preparation was mixed with an excess of the surfactant, and the surfactant remaining in the supernatants was determined spectrophotometrically by measuring the absorbance of tetrabromophenolphthalein ethylester at 690 nm. The greatest amount was adsorbed by myelin, followed by synaptosomes and mitochondria. The hydrophobicity is shown to be a reflection of the surface lipids. This method showed good reproducibility and was useful for the quantitative determination of hydrophobicity.

20.
Rykunov D  Fiser A 《Proteins》2007,67(3):559-568
Statistical distance dependent pair potentials are frequently used in a variety of folding, threading, and modeling studies of proteins. The applicability of these types of potentials is tightly connected to the reliability of statistical observations. We explored the possible origin and extent of false positive signals in statistical potentials by analyzing their distance dependence in a variety of randomized protein-like models. While on average potentials derived from such models are expected to equal zero at any distance, we demonstrate that systematic and significant distortions exist. These distortions originate from the limited statistical counts in local environments of proteins and from the limited size of protein structures at large distances. We suggest that these systematic errors in statistical potentials are connected to the dependence of amino acid composition on protein size and to variation in protein sizes. Additionally, atom-based potentials are dominated by a false positive signal that is due to correlation among distances measured from atoms of one residue to atoms of another residue. The significance of residue-based pairwise potentials at various spatial pair separations was assessed in this study and it was found that as few as approximately 50% of potential values were statistically significant at distances below 4 Å, and only at most approximately 80% of them were significant at larger pair separations. A new definition for reference state, free of the observed systematic errors, is suggested. It has been demonstrated to generate statistical potentials that compare favorably to other publicly available ones.
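
Statistical pair potentials are commonly obtained from observed and expected pair counts through the inverse Boltzmann relation, E = -kT ln(N_obs / N_exp). The short Python sketch below illustrates that general relation and why sparse counts are a concern; it is not the reference state proposed in this study. The function name, the `min_count` cutoff and the example counts are assumptions made for illustration.

```python
import math


def pair_potential(observed, expected, min_count=0):
    """Pair potential in units of kT from observed and expected counts.

    Returns None when either count is non-positive or when the observed
    count falls below `min_count`, since sparse bins give unreliable values.
    """
    if observed <= 0 or expected <= 0 or observed < min_count:
        return None
    return -math.log(observed / expected)


# Observed equal to expected gives zero, as expected for a randomized model.
print(pair_potential(100, 100))

# Twice as many contacts as expected gives a favorable (negative) value.
print(pair_potential(200, 100))

# A sparsely populated bin is rejected instead of producing a spurious signal.
print(pair_potential(3, 10, min_count=5))
```

With small counts the ratio fluctuates strongly from sampling noise alone, which is one route to the false positive signals the abstract describes.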
