Similar articles
20 similar articles found.
1.

Background

Protein destabilization is a common mechanism by which amino acid substitutions cause human diseases. Although several machine learning methods have been reported for predicting protein stability changes upon amino acid substitutions, the previous studies did not utilize relevant sequence features representing biological knowledge for classifier construction.

Results

In this study, a new machine learning method has been developed for sequence feature-based prediction of protein stability changes upon amino acid substitutions. Support vector machines were trained with data from experimental studies on the free energy change of protein stability upon mutations. To construct accurate classifiers, twenty sequence features were examined for input vector encoding. It was shown that classifier performance varied significantly by using different sequence features. The most accurate classifier in this study was constructed using a combination of six sequence features. This classifier achieved an overall accuracy of 84.59% with 70.29% sensitivity and 90.98% specificity.

Conclusions

Relevant sequence features can be used to accurately predict protein stability changes upon amino acid substitutions. Predictive results at this level of accuracy may provide useful information to distinguish between deleterious and tolerant alterations in disease candidate genes. To make the classifier accessible to the genetics research community, we have developed a new web server, called MuStab (http://bioinfo.ggc.org/mustab/).
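The three figures quoted in the Results (accuracy, sensitivity, specificity) are all derived from the same confusion matrix. The following is a minimal sketch of that relationship, not the authors' code: the function names and the toy counts are hypothetical, and which class counts as "positive" is an assumption, since the abstract does not say. Because accuracy is the class-weighted mean of sensitivity and specificity, the reported percentages also imply the approximate share of positive examples in the evaluation data.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # fraction of positives recovered
        "specificity": tn / (tn + fp),   # fraction of negatives recovered
    }


def implied_positive_fraction(accuracy: float, sensitivity: float, specificity: float) -> float:
    """Solve accuracy = p * sensitivity + (1 - p) * specificity for p."""
    return (specificity - accuracy) / (specificity - sensitivity)


# Hypothetical counts, chosen only for illustration (not data from the study).
toy = classification_metrics(tp=70, fn=30, tn=180, fp=20)
print(toy)

# Percentages reported in the abstract above.
p = implied_positive_fraction(84.59, 70.29, 90.98)
print(round(p, 3))
```

Under these assumptions the reported figures are mutually consistent if a little under a third of the evaluated substitutions belong to the positive class.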

2.

3.
To determine which amino acids in TEM-1 beta-lactamase are important for its structure and function, random libraries were previously constructed which systematically randomized the 263 codons of the mature enzyme. A comprehensive screening of these libraries identified several TEM-1 beta-lactamase core positions, including F66 and L76, which are strictly required for wild-type levels of hydrolytic activity. An examination of positions 66 and 76 in the class A beta-lactamase gene family shows that a phenylalanine at position 66 is strongly conserved while position 76 varies considerably among other beta-lactamases. It is possible that position 76 varies in the gene family because beta-lactamase mutants with non-conservative substitutions at position 76 retain partial function. In contrast, position 66 may remain unchanged in the gene family because non-conservative substitutions at this location are detrimental to enzyme structure and function. By determining the beta-lactam resistance levels of the 38 possible mutants at positions 66 and 76 in the TEM-1 enzyme, it was confirmed that position 76 is indeed more tolerant of non-conservative substitutions. An analysis of the Protein Data Bank files for three class A beta-lactamases indicates that volume constraints at position 66 are at least partly responsible for the low tolerance of substitutions at this position.
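The figure of 38 possible mutants presumably counts the 19 non-wild-type residues at each of the two positions (19 × 2 = 38). A short sketch makes the enumeration explicit; the helper name `single_substitutions` is hypothetical and the mutant labels simply follow the F66/L76 notation used in the abstract.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def single_substitutions(wild_type: str, position: int) -> list[str]:
    """All single-residue replacements of `wild_type` at `position`."""
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


# Wild-type residues F66 and L76 are taken from the abstract above.
mutants = single_substitutions("F", 66) + single_substitutions("L", 76)
print(len(mutants))
print(mutants[:3])
```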

4.
The calcium-dependent homophilic cell adhesion molecule E-cadherin typically connects epithelial cells. The extracellular portion of the mature transmembrane protein consists of five homologous domains. The four sequences linking these domains contain the structural amino acid motif DXXD that is thought to be involved in direct calcium binding. In gastric cancer patients, mutations affecting this motif between the second and third domain are frequently seen. In order to determine the functional significance of similar sequence alterations with regard to their location, we analyzed single amino acid substitutions changing the DXXD motif to DXXA in each linker region according to a mutation found in gastric cancer (D370A). The cDNA sequences coding for DQND, DVLD and DVND were changed (D257A, D479A, D590A, respectively) and stably expressed in E-cadherin negative MDA-MB-435S mammary carcinoma cells. We found that the D257A and D370A mutations result in abnormal protein localization, changes in the actin cytoskeleton, markedly reduced homophilic cell adhesion, and altered cell morphology. Unexpectedly, the tumor-associated D370A mutation but not the D257A mutation induced increased cell motility. The D479A mutation had only slight functional consequences, whereas cells expressing the D590A mutant did not differ from cells expressing the wild-type molecule. Although the putative calcium binding motif DXXD is located at repetitive positions in the extracellular portion of E-cadherin, our results indicate that it has different functions depending on the location. Remarkably, tumor cells select for mutations in the most critical domains resulting both in loss of function (decreased cell adhesion) and in gain of function (increased cell motility). Since multiple DXXD motifs are typically seen in other cadherins, our structure-function study is relevant for this gene family in general.

5.
A morphological or physiological trait may appear multiple times in evolution. At the molecular level, similar protein functions may emerge independently in different lineages. Whether these parallel functional changes are due to parallel amino acid substitutions has been a subject of debate. Here, I address this question using digestive ribonucleases (RNases) of two groups of foregut-fermenting mammals: ruminant artiodactyls and colobine monkeys. The RNase1 gene was duplicated twice in ancestral ruminants at least 40 MYA, and it was also duplicated in the douc langur, an Asian colobine, approximately 4 MYA. After duplication, similar functional changes occurred in the ruminant and monkey enzymes. Interestingly, five amino acid substitutions in ruminant RNases that are known to affect their catalytic activity against double-stranded (ds) RNA did not occur in the monkey enzyme. Rather, a similar functional change in the monkey was caused by a different set of nine substitutions. Site-directed mutagenesis was used to make three of the five ruminant-specific substitutions in the monkey enzyme. Functional assays of these mutants showed that one of the three substitutions has a similar effect in monkeys, the second has a stronger effect, and the third has an opposite effect. These results suggest that (1) an evolutionary problem can have multiple solutions, (2) the same amino acid substitution may have opposite functional effects in homologous proteins, (3) the stochastic processes of mutation and drift play an important role even at functionally important sites, and (4) protein sequences may diverge even when their functions converge.

6.
Comparative genomics usually involves managing the functional aspects of genomes by simply comparing gene-by-gene functions. Following this approach, Mushegian and Koonin proposed a hypothetical minimal genome, the Minimal Gene Set (MGS), aiming for a possible oldest ancestor genome. They obtained the MGS by comparing the genomes of two simple bacteria and eliminating duplicated or functionally identical genes. The authors raised the fundamental question of whether a hypothetical organism possessing the MGS is able to live or not. We attacked this viability problem by specifying in silico the metabolic pathways of the MGS-based prokaryote. We then performed a dynamic simulation of cellular metabolic activities in order to check whether the MGS-prokaryote reaches some equilibrium state and produces the necessary biomass. We assumed these two conditions to be necessary for a living organism. Our simulations clearly show that the MGS does not express an organism that is able to live. We then iteratively proceeded with functional replacements in order to obtain a genome composition that gives rise to equilibrium. We ruled out 76 of the original 254 genes in the MGS because they were functionally redundant. We also added seven genes not present in the MGS. These genes encode enzymes involved in critical nodes of the metabolic network. These modifications led to a genome composed of 187 elements expressing a virtually living organism, Virtual Cell (ViCe), that exhibits homeostatic capabilities and produces biomass. Moreover, the resulting steady-state distribution of the concentrations of virtual metabolites was similar to that experimentally measured in bacteria. We conclude that ViCe is able to “live in silico.”

7.
The creatine transporter (CRT) is a member of a large family of sodium-dependent neurotransmitter and amino acid transporters. The CRT is closely related to the gamma-aminobutyric acid (GABA) transporter, GAT-1, yet GABA is not an effective substrate for the CRT. The high resolution structure of a prokaryotic homologue, LeuT, has revealed precise details of the substrate binding site for leucine (Yamashita, A., Singh, S. K., Kawate, T., Jin, Y., and Gouaux, E. (2005) Nature 437, 215-223). We have now designed mutations based on sequence comparisons of the CRT with GABA transporters and the LeuT structural template in an attempt to alter the substrate specificity of the CRT. Combinations of two or three amino acid substitutions at four selected positions resulted in the loss of creatine transport activity and gain of a specific GABA transport function. GABA transport by the "gain of function" mutants was sensitive to nipecotic acid, a competitive inhibitor of GABA transporters. Our results show LeuT to be a good structural model to identify amino acid residues involved in the substrate and inhibitor selectivity of eukaryotic sodium-dependent neurotransmitter and amino acid transporters. However, modification of the binding site alone appears to be insufficient for efficient substrate translocation. Additional residues must mediate the conformational changes required for the diffusion of substrate from the binding site to the cytoplasm.

8.
Amino acid sequences of peptides are often inferred from their amino acid compositions by comparison with homologous peptides of known sequence. This study considers the probability that such an approach introduces errors because of balanced double changes, i.e. reciprocal substitutions, between two homologous peptides of identical composition. Formulae are derived for calculating these probabilities as a function of peptide length and evolutionary distance. Because such calculations require too much computer time, the probabilities of reciprocal substitutions are instead estimated by simulating evolutionary changes in peptides. The resulting data indicate that, for many purposes, the possible errors in amino acid sequences partially inferred from amino acid compositions are acceptably small.
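The kind of simulation described above can be sketched as a small Monte Carlo experiment. This is only an illustration under stated assumptions, not the authors' procedure: the substitution model (uniform choice of position and of replacement residue, which may occasionally re-insert the same residue), the function names and the parameter values are all hypothetical. The estimate is the fraction of composition-identical peptide pairs whose sequences nevertheless differ, which is the case where inferring the sequence from the composition would be wrong.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def evolve(peptide: str, n_substitutions: int, rng: random.Random) -> str:
    """Apply point substitutions at uniformly random positions (assumed model)."""
    residues = list(peptide)
    for _ in range(n_substitutions):
        pos = rng.randrange(len(residues))
        residues[pos] = rng.choice(AMINO_ACIDS)  # may redraw the same residue
    return "".join(residues)


def reciprocal_error_rate(length: int, n_substitutions: int,
                          trials: int = 20000, seed: int = 0) -> float:
    """Among homolog pairs with identical composition, the fraction whose
    sequences differ (balanced, i.e. reciprocal, changes)."""
    rng = random.Random(seed)
    same_composition = 0
    hidden_difference = 0
    for _ in range(trials):
        known = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        unknown = evolve(known, n_substitutions, rng)
        if Counter(known) == Counter(unknown):
            same_composition += 1
            if known != unknown:
                hidden_difference += 1
    return hidden_difference / same_composition if same_composition else 0.0


# Hypothetical settings: a decapeptide that has undergone four substitution events.
rate = reciprocal_error_rate(length=10, n_substitutions=4, trials=5000)
print(rate)
```

With zero or one substitution event the estimate is exactly zero in this model, because a single real change always alters the composition; errors only become possible from two events onward.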

9.
The Escherichia coli regulatory protein AraC regulates expression of ara genes in response to L-arabinose. In efforts to develop genetically encoded molecular reporters, we previously engineered an AraC variant that responds to the compound triacetic acid lactone (TAL). This variant (named “AraC-TAL1”) was isolated by screening a library of AraC variants in which five amino acid positions in the ligand-binding pocket were simultaneously randomized. Screening was carried out through multiple rounds of alternating positive and negative fluorescence-activated cell sorting. Here we show that changing the screening protocol results in the identification of different TAL-responsive variants (nine new variants). Individual substituted residues within these variants were found to act primarily cooperatively toward the gene expression response. Finally, X-ray diffraction was used to solve the crystal structure of the apo AraC-TAL1 ligand-binding domain. The resolved crystal structure confirms that this variant takes on a structure nearly identical to the apo wild-type AraC ligand-binding domain (root-mean-square deviation 0.93 Å), suggesting that AraC-TAL1 behaves similarly to wild-type with regard to ligand recognition and gene regulation. Our results provide amino acid sequence–function data sets for training and validating AraC modeling studies, and contribute to our understanding of how to design new biosensors based on AraC.

10.
A set of aligned homologous protein sequences is divided into two groups consisting of the m and n most closely related sequences. The position variability for homologous protein sequences is defined as the number of mismatches in the intergroup comparison of all possible m*n pairs of amino acid residues at that position, divided by m*n. The position variability plotted against the sequence position number, with a window of 10 positions, gives the intergroup local variability profile. The area S enclosed between the local variability profile and the straight line corresponding to the mean local variability is compared with the average area Sr for 1000 random homologous protein families. If S exceeds Sr by more than 2 standard deviation units (sigma r), the local variability profile is assumed to contain peaks and hollows corresponding to significantly variable and conserved regions of the sequences. The profile extrema containing the area surplus delta S = S - (Sr + 2 sigma r) are cut off by two straight lines to locate the significant regions. The difference (S - Sr), expressed in standard deviation units sigma r, is taken as the overall irregularity of amino acid substitution along the homologous protein sequences: OI = (S - Sr)/sigma r. The significantly conserved and variable regions of six homologous sequence families (phospholipase A2, cytochromes b, alpha-subunits of Na,K-ATPase, L- and M-subunits of the photosynthetic bacterial photoreaction centre, and human rhodopsins) were identified. It was shown that, for artificial homologous protein sequences derived by k-fold lengthening of natural protein sequences, the OI value rises as the square root of k. To compare the degree of substitution irregularity in homologous protein sequence families of different lengths L, a standard overall substitution irregularity for L = 250 is proposed.
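The definitions above translate fairly directly into code. The sketch below is one possible reading, not the authors' implementation: the function names and toy sequences are hypothetical, the area is approximated by a sum of absolute deviations at unit spacing, and, because the abstract does not say how the random families are generated, shuffling the per-position values is used here purely as a stand-in null model.

```python
import random
import statistics


def position_variability(group_a, group_b):
    """Fraction of mismatching intergroup residue pairs at each aligned position."""
    pairs = len(group_a) * len(group_b)
    return [
        sum(a[pos] != b[pos] for a in group_a for b in group_b) / pairs
        for pos in range(len(group_a[0]))
    ]


def local_profile(values, window=10):
    """Sliding-window mean of the per-position variability."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def area_about_mean(profile):
    """Area between the profile and the horizontal line at its mean value."""
    mean = sum(profile) / len(profile)
    return sum(abs(v - mean) for v in profile)


def overall_irregularity(values, window=10, n_random=1000, seed=0):
    """OI = (S - Sr) / sigma_r, with a shuffled-position null model (assumption)."""
    rng = random.Random(seed)
    s = area_about_mean(local_profile(values, window))
    random_areas = []
    for _ in range(n_random):
        shuffled = values[:]
        rng.shuffle(shuffled)
        random_areas.append(area_about_mean(local_profile(shuffled, window)))
    sigma_r = statistics.stdev(random_areas)
    if sigma_r == 0:
        raise ValueError("null distribution has zero spread; OI is undefined")
    return (s - statistics.mean(random_areas)) / sigma_r


# Toy alignment (hypothetical): m = 2 sequences in one group, n = 3 in the other.
group_a = ["ACDEFGHIKL", "ACDEFGHIKM"]
group_b = ["ACDEYGHIKL", "ACDQYGHIKL", "ACDEYGHIRL"]
variability = position_variability(group_a, group_b)
print(variability)
print(overall_irregularity(variability, window=3, n_random=200))
```

A window of 3 is used only because the toy alignment is ten residues long; the abstract specifies a window of 10 positions and 1000 random families.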

11.
Two types of amino acid substitutions in protein evolution
Summary: The frequency of amino acid substitutions, relative to the frequency expected by chance, decreases linearly with the increase in physico-chemical differences between amino acid pairs involved in a substitution. This correlation does not apply to abnormal human hemoglobins. Since abnormal hemoglobins mostly reflect the process of mutation rather than selection, the correlation manifest during protein evolution between substitution frequency and physico-chemical difference in amino acids can be attributed to natural selection. Outside of abnormal proteins, the correlation also does not apply to certain regions of proteins characterized by rapid rates of substitution. In these cases again, except for the largest physico-chemical differences between amino acid pairs, the substitution frequencies seem to be independent of the physico-chemical parameters. The elimination of the substituents involving the largest physico-chemical differences can once more be attributed to natural selection. For smaller physico-chemical differences, natural selection, if it is operating in the polypeptide regions, must be based on parameters other than those examined.

12.
Red blood cells can withstand the harsh mechanical conditions in the vasculature only because the bending rigidity of their plasma membrane is complemented by the shear elasticity of the underlying spectrin-actin network. During an infection by the malaria parasite Plasmodium falciparum, the parasite mines host actin from the junctional complexes and establishes a system of adhesive knobs, whose main structural component is the knob-associated histidine rich protein (KAHRP) secreted by the parasite. Here we aim at a mechanistic understanding of this dramatic transformation process. We have developed a particle-based computational model for the cytoskeleton of red blood cells and simulated it with Brownian dynamics to predict the mechanical changes resulting from actin mining and KAHRP-clustering. Our simulations include the three-dimensional conformations of the semi-flexible spectrin chains, the capping of the actin protofilaments and several established binding sites for KAHRP. For the healthy red blood cell, we find that incorporation of actin protofilaments leads to two regimes in the shear response. Actin mining decreases the shear modulus, but knob formation increases it. We show that dynamical changes in KAHRP binding affinities can explain the experimentally observed relocalization of KAHRP from ankyrin to actin complexes and demonstrate good qualitative agreement with experiments by measuring pair cross-correlations both in the computer simulations and in super-resolution imaging experiments.

13.
Summary: A simple method for the evolutionary analysis of amino acid sequence data is presented and used to examine whether the number of variable sites (NVS) of a protein is constant during its evolution. The NVSs for hemoglobin and for mitochondrial cytochrome c are each found to be almost constant, and the ratio between the NVSs is close to the ratio between the unit evolutionary periods. This indicates that the substitution rate per variable site is almost uniform for these proteins, as the neutral theory claims. An advantage of the present analysis is that it can be done without knowledge of paleontological divergence times and can be extended to bacterial proteins such as bacterial c-type cytochromes. It is suggested that the NVS of cytochrome c has been almost constant even over the long period (ca. 3.0 billion years) of bacterial evolution but that at least two different substitution rates are necessary to describe the accumulated changes in the sequence. This two-clock interpretation is consistent with fossil evidence for the appearance times of photosynthetic bacteria and eukaryotes.

14.
15.
M Pieber, J Tohá. Origins of Life, 1983, 13(2):139-146
The frequency of amino acid replacements in families of typical proteins has been elegantly analyzed by Argyle (1980) showing that the most frequent replacements involve a conservation of the amino acid chemical properties. The cyclic arrangement of the twenty amino acids resulting from the most frequent replacements has been described as an amino acid chemical ring. In this work, a novel amino acid replacement frequency ring is proposed, for which a conservation of over 90% of the most general physico-chemical properties can be deduced. The amino acid chemical similarity ring is also analyzed in terms of the genetic code base probability changes, showing that the discrepancy that exists between the standard deviation value of the amino acid replacement frequency matrix and its respective ideal value is almost equal to that deduced from the corresponding base codon replacement probability matrices. These differences are finally evaluated and discussed in terms of the restrictions imposed by the structure of the genetic code and the physico-chemical dissimilarities between some codons of amino acids which are chemically similar.

16.
The frequency of amino acid replacements in families of typical proteins has been elegantly analyzed by Argyle (1980) showing that the most frequent replacements involve a conservation of the amino acid chemical properties. The cyclic arrangement of the twenty amino acids resulting from the most frequent replacements has been described as an amino acid chemical ring. In this work, a novel amino acid replacement frequency ring is proposed, for which a conservation of over 90% of the most general physico-chemical properties can be deduced. The amino acid chemical similarity ring is also analyzed in terms of the genetic code base probability changes, showing that the discrepancy that exists between the standard deviation value of the amino acid replacement frequency matrix and its respective ideal value is almost equal to that deduced from the corresponding base codon replacement probability matrices. These differences are finally evaluated and discussed in terms of the restrictions imposed by the structure of the genetic code and the physico-chemical dissimilarities between some codons of amino acids which are chemically similar. This work was partially supported by OEA and Departamento de Desarrollo de la Investigación.

17.
Functional and structural heterogeneity of HLA class I molecules was sought among five donors serologically identical for A2, A3, B7, and Bw44. Functional differences were identified by cell-mediated lympholysis (CML) after allogeneic mixed lymphocyte reaction among the five donors. Structural differences were characterized by high resolution two-dimensional electrophoretic maps of the class I HLA proteins synthesized by peripheral blood lymphocytes of these donors. Cells from three donors showed no CML-defined differences from one another; their HLA protein maps were identical. The cells of one donor recognized an A3-associated target antigen on the cells of all the other donors; her HLA map revealed a unique protein with altered isoelectric point. Another donor's cells differed by two CML-detected antigens: one was identified as a variant of Bw44 ("44.2") and the other was associated with Cw4. This donor's two-dimensional HLA map showed two novel charged proteins. By using these data, a two-dimensional map locating HLA-A2, -A3, -A3', -B7, -Bw44.1, -Bw44.2, and -Cw4 was prepared. Because each of three CML-detected antigens was correlated with a protein of distinctive charge, our results and the available published data raise the possibility that amino acid substitution producing charge variation may be a particularly important mechanism in the generation of CML-detectable HLA diversity.

18.
The transport of L-proline, L-lysine and L-glutamate in rat red blood cells has been studied. L-proline and L-lysine uptake were Na+-independent. When the concentration dependence was studied, both showed a non-saturable uptake consistent with a diffusion-like process, with high Kd values (0.718 and 0.191 min⁻¹ for L-proline and L-lysine, respectively). Rat red blood cells showed high impermeability to L-glutamate. No sodium dependence was observed and the Kd value was low (0.067 min⁻¹). Our results show, firstly, that rat red blood cells do not have amino acid transport systems for anionic and cationic amino acids and, secondly, that erythrocytes show no sodium-dependent L-proline transport and that these cells are very permeable to this amino acid.
Abbreviations: MeAIB, methyl aminoisobutyric acid

19.
All known nucleoside monophosphate kinases contain an invariant sequence Asp-Gly-Phe(Tyr)-Pro-Arg. In order to understand better the structural and functional role of individual amino acid residues belonging to the above sequence, three mutants of Escherichia coli adenylate kinase (D84H, G85V, and F86L) were produced by site-directed mutagenesis. Circular dichroism spectra revealed that the secondary structure of all three mutant proteins is very similar to that of the wild-type enzyme. However, each of the substitutions resulted in a decreased thermodynamic stability of the protein, as indicated by differential scanning calorimetry measurements and equilibrium unfolding experiments in guanidine HCl. The destabilizing effect was most pronounced for the G85V mutant, in which case the denaturation temperature was decreased by as much as 11 °C. The catalytic activity of the three mutants represented less than 1% of that of the wild-type enzyme. Furthermore, for the D84H-modified form of adenylate kinase, the impaired binding of nucleotide substrates was accompanied by a markedly decreased affinity for magnesium ion. These observations support the notion that Asp84 is directly involved in binding of nucleotide substrates and that this binding is mediated by interaction of the aspartic acid residue with divalent cation. The two remaining residues probed in this study, Gly85 and Phe86, belong to a beta-turn which appears to play a major role in stabilizing the three-dimensional structure of adenylate kinase.

20.