Similar literature
1.
In this study, we estimate the translation probabilities from RNA codons to amino acids. Using the 183 translation probabilities so determined and the amino-acid composition of eight highly mutated proteins, we construct the theoretical distributions of mutated amino acids in these proteins and compare them with the actual distributions produced by mutations. We then trace the pattern of translation probabilities from RNA codons to mutated amino acids across 1053 point missense mutations. Finally, we conclude statistically that the natural mutation trend follows the theoretical translation probabilities.
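
The abstract does not define how its 183 translation probabilities are computed. As a rough illustration only, the sketch below assumes that every single-nucleotide substitution in a codon is equally likely and that synonymous and stop outcomes are discarded; the function name `missense_probabilities` and both assumptions are hypothetical and may differ from the authors' method.

```python
from collections import Counter

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ..., GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def missense_probabilities(codon):
    """Share of each mutated amino acid among the missense neighbours of a codon.

    Assumption (not from the source): the nine single-nucleotide
    substitutions are equally likely; synonymous and stop results are dropped.
    """
    original = CODE[codon]
    outcomes = Counter()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a substitution
            mutant = codon[:pos] + base + codon[pos + 1:]
            amino_acid = CODE[mutant]
            if amino_acid != original and amino_acid != "*":
                outcomes[amino_acid] += 1
    total = sum(outcomes.values())
    return {aa: count / total for aa, count in outcomes.items()}


if __name__ == "__main__":
    # Glycine codon GGG: six missense neighbours, two of which encode arginine.
    print(missense_probabilities("GGG"))
```

Under these assumptions, a theoretical distribution of mutated amino acids for a whole protein could be built by summing such per-codon shares over its coding sequence.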

2.
This is the continuation of our studies using random approaches to analyse the p53 protein family. In this data-based theoretical analysis, we use the random approach to analyse the amino acid pairs in human p53 protein in order to determine which amino acid pairs are more sensitive to 190 human p53 mutations/variants. The rationale of this study is based on our hypothesis and findings that a harmful mutation is more likely to occur at randomly unpredictable amino acid pairs, and a harmless mutation is more likely to occur at randomly predictable amino acid pairs. This is because we argue that the randomly predictable amino acid pairs should not be deliberately evolved, whereas the randomly unpredictable amino acid pairs should be deliberately evolved in connection with protein function. The results show, for example, that 93.16% of 190 mutations/variants occur at randomly unpredictable amino acid pairs. Thus, the randomly unpredictable amino acid pairs are more sensitive to mutations/variants in human p53 protein. The results also suggest that the human p53 protein has a tendency towards mutations/variants.
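
To make the notion of a "randomly predictable" amino acid pair concrete, here is a minimal sketch. The expected count of an ordered adjacent pair under random shuffling of a fixed composition is a standard result; the rule that a pair is "predictable" when the rounded expectation equals the observed count is an assumption of this sketch, as is the function name `pair_predictability`.

```python
from collections import Counter


def pair_predictability(seq):
    """Compare observed adjacent-pair counts with their random expectation.

    For a sequence of length N with n_a copies of residue a, a random
    shuffle gives an expected count of n_a * n_b / N for the pair "ab"
    (a != b) and n_a * (n_a - 1) / N for the pair "aa".
    Returns {pair: (observed, expected, predictable)}.
    """
    n = len(seq)
    composition = Counter(seq)
    observed = Counter(seq[i:i + 2] for i in range(n - 1))
    result = {}
    for pair, obs in observed.items():
        a, b = pair
        if a == b:
            expected = composition[a] * (composition[a] - 1) / n
        else:
            expected = composition[a] * composition[b] / n
        # Assumed criterion: predictable when the rounded expectation matches.
        result[pair] = (obs, expected, round(expected) == obs)
    return result


if __name__ == "__main__":
    for pair, row in sorted(pair_predictability("AABAB").items()):
        print(pair, row)
```

With such a table, each reported mutation position could be checked against the predictable or unpredictable status of the pairs it belongs to.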

3.
In this study, we used the 183 translation probabilities between RNA codons and mutated amino acids to construct the theoretical distributions of mutated amino acids in hemagglutinins of influenza A virus. We then compared the actual distributions of mutated amino acids from 953 hemagglutinins with their theoretical ones. The results demonstrated that mutated amino acids generally follow the direction of the theoretical distributions governed by RNA codons. This, in turn, highlights the mutation trend of amino acids in hemagglutinins and provides a method for estimating possible mutations in a protein according to its theoretical distributions of mutated amino acids.

4.
The repeated amino-acid sequences in Citrobacter freundii beta-lactamase may be indispensable for its function, because such repetitions cannot simply be attributed to chance. To fully explore the functional units in Citrobacter freundii beta-lactamase, it may be necessary to analyse all the amino acid pairs, triplets, etc. along the protein from one terminus to the other, counting their frequencies and calculating their probabilities. The amino-acid sequence of Citrobacter freundii beta-lactamase was counted as two-, three- and four-amino-acid sequences. The counted frequencies and probabilities were compared with the predicted frequencies and probabilities. Amino acid sequences that appear in Citrobacter freundii beta-lactamase and can be predicted from its amino acid composition according to a purely random mechanism should not be deliberately evolved and conserved. By contrast, amino acid sequences that appear in Citrobacter freundii beta-lactamase but cannot be predicted from its amino acid composition according to a purely random mechanism should be deliberately evolved and conserved. Accordingly, 99 (26.053%) and 33 (8.684%) of 380 two-amino-acid sequences can be predicted by the frequency and probability, respectively, according to a purely random mechanism. Some kinds of amino acid sequences that are absent from Citrobacter freundii beta-lactamase and can be predicted from its amino acid composition according to a purely random mechanism should not be deliberately excluded from Citrobacter freundii beta-lactamase. By contrast, some kinds of amino acid sequences that are absent from Citrobacter freundii beta-lactamase and cannot be predicted from its amino acid composition according to a purely random mechanism should be deliberately excluded from Citrobacter freundii beta-lactamase. Accordingly, 89 (48.370%) and 41 (22.283%) of 184 kinds of absent two-amino-acid sequences can be predicted by the frequency and probability, respectively, according to a purely random mechanism, and 7236 (99.848%) of 7247 kinds of absent three-amino-acid sequences can be predicted by the frequency according to a purely random mechanism. The amino acids whose probabilities of following certain preceding amino acids can be predicted from the Citrobacter freundii beta-lactamase amino acid composition according to a purely random mechanism should not be deliberately evolved and conserved; accordingly, 2 (0.526%) of 380 counted first-order Markov transition probabilities for the second amino acid in two-amino-acid sequences match the predicted conditional probabilities.
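
The comparison between counted first-order Markov transition probabilities and predicted conditional probabilities can be sketched as follows. The predicted value used here, the chance of drawing residue b from the remaining N - 1 positions given that the preceding residue is a, is an assumption of this sketch, and `transition_vs_predicted` is a hypothetical name.

```python
from collections import Counter


def transition_vs_predicted(seq):
    """Observed P(b | a) for adjacent residues versus a composition-based prediction.

    Observed: count of the pair "ab" divided by the number of times "a"
    occurs as the first residue of a pair.
    Predicted (assumed): (n_b - [a == b]) / (N - 1), i.e. sampling the next
    residue without replacement from the overall composition.
    Returns {pair: (observed, predicted)}.
    """
    n = len(seq)
    composition = Counter(seq)
    first_counts = Counter(seq[:-1])
    pair_counts = Counter(zip(seq, seq[1:]))
    table = {}
    for (a, b), count in pair_counts.items():
        observed = count / first_counts[a]
        predicted = (composition[b] - (1 if a == b else 0)) / (n - 1)
        table[a + b] = (observed, predicted)
    return table


if __name__ == "__main__":
    for pair, (obs, pred) in sorted(transition_vs_predicted("AABAB").items()):
        print(f"{pair}: observed {obs:.3f}, predicted {pred:.3f}")
```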

5.
Wu G  Yan S 《Protein engineering》2003,16(3):195-199
In this data-based theoretical analysis, we use the random approach to analyse the amino acid pairs in human beta-glucocerebrosidase in order to determine which amino acid pairs are more sensitive to 109 variants from missense mutant human glucocerebrosidase. The rationale of this study is based on our hypothesis and findings that the harmful variants are more likely to occur at randomly unpredictable amino acid pairs and the non-harmful variants are more likely to occur at randomly predictable amino acid pairs. This is because we argue that the randomly predictable amino acid pairs should not be deliberately evolved, whereas the randomly unpredictable amino acid pairs should be deliberately evolved in connection with protein function. The results show, for example, that 93.58% of 109 variants occur at randomly unpredictable amino acid pairs, which account for 71.40% of amino acid pairs in glucocerebrosidase, and that the chance of a variant occurring is about 4.4 times higher in randomly unpredictable amino acid pairs than in predictable pairs. Hence the randomly unpredictable amino acid pairs are more sensitive to variants in human glucocerebrosidase. The results also suggest that human glucocerebrosidase has a natural tendency towards variants.

6.
It is well known that the evolutionary process leads to the majority of amino acids clustering in some regions rather than being homogeneously distributed along a protein. Among the numerous factors affecting the evolutionary process is chance, whose impact should therefore be present in a protein's primary structure. The issue of how to measure the random distribution of amino acids in a primary structure is of importance for the understanding of protein structure and function. In this study, we use the random principle as a tool to analyze and compare the distributions of amino acids in the primary structure of the p53 protein family. The results show, for example, that the amino acids are distributed more randomly in mouse p53 and less randomly in common tree shrew p53, and that the distribution ranks of amino acids are relatively lower in the functional regions (about 0.5 on average) than in the whole sequences (about 1.2 on average) except for mouse p53. From the probabilistic distribution view, the composition of human p53 is relatively stable in the functional regions rather than in the whole sequence, which may suggest one of the potential effects on the mutations inducing human cancers. In general, we can use the distribution probability to present quantitatively a type of distribution of amino acids in a protein, to compare quantitatively the magnitude of clusters between different proteins and to track the effect of chance on the evolutionary process.

7.
We report 31 point mutations in the factor IX gene and explore the relationship between the level of evolutionary conservation of an amino acid and the probability of a mutation causing hemophilia B. From our total sample of 125 hemophiliacs and from those reported by others, we identify 95 independent missense mutations, 94 of which occur at amino acids that are evolutionarily conserved in the available mammalian factor IX sequences. The likelihood of a missense mutation causing hemophilia B depends on whether the residue is also conserved in the factor IX-related proteases: factor VII, factor X, and protein C. Most of the possible missense mutations in generically conserved residues (i.e., those conserved in factor IX and in all the related proteases) should cause disease. In contrast, missense mutations in factor IX-specific residues (i.e., those conserved in human, cow, dog, and mouse factor IX but not in the related proteases) are sixfold less likely to cause disease. Missense mutations at nonconserved residues are 33-fold less likely to cause disease. At least three models are compatible with these observations. A comparison of sequence alignments from four and nine species of factor IX and an examination of the missense mutations occurring at CpG residues suggest a model in which most residues fall on opposite ends of a spectrum. In about 40% of residues, virtually any missense mutation in a minority of the residues will cause disease, while virtually no missense mutations will cause disease in most of the remaining residues. Thus, many of the residues in factor IX are spacers; that is, the main chains are presumably necessary to keep other amino acid interactions in register, but the nature of the side chain is unimportant.

8.
Wu G  Yan S 《Peptides》2002,23(12):2085-2090
In this data-based theoretical analysis, we use a random approach to estimate amino acid pairs in human phenylalanine 4-hydroxylase (PAH) protein in order to determine which amino acid pairs are more sensitive to 187 variants in human PAH protein. The rationale of this study is based on our hypothesis and previous findings that the harmful variants are more likely to occur at randomly unpredictable amino acid pairs rather than at randomly predictable pairs. This is reasonable to argue as randomly predictable amino acid pairs are less likely to be deliberately evolved, whereas randomly unpredictable amino acid pairs are probably deliberately evolved in connection with protein function. 94.12% of 187 variants occurred at randomly unpredictable amino acid pairs, which accounted for 71.84% of 451 amino acid pairs in human PAH protein. The chance of a variant occurring is five times higher in randomly unpredictable amino acid pairs than in predictable pairs. Thus, randomly unpredictable amino acid pairs are more sensitive to variance in human PAH protein. The results also suggest that the human PAH protein has a natural tendency towards variants.

9.
Wu G  Yan S 《Peptides》2003,24(3):347-352
In this data-based theoretical analysis, we use the random approach to analyze the amino acid pairs in human collagen α5(IV) chain precursor (CA54) in order to determine which amino acid pairs are more sensitive to 151 variants from missense mutant human CA54 protein. The rationale of this study is based on our hypothesis and previous findings that harmful variance is more likely to occur at randomly unpredictable amino acid pair positions rather than at randomly predictable positions. This is reasonable to argue as randomly predictable amino acid pairs are less likely to be deliberately evolved, whereas randomly unpredictable amino acid pairs are probably deliberately evolved in connection with protein function. The results show that all 151 variants occurred at randomly unpredictable amino acid pairs and the chance of a variant occurring is markedly higher in randomly unpredictable amino acid pairs than in predictable pairs. Thus, randomly unpredictable amino acid pairs are more sensitive to variance in human CA54. The results also suggest that the human CA54 protein has a natural tendency towards variants.

10.
The 1000 Genomes Project data provides a natural background dataset for amino acid germline mutations in humans. Since the direction of mutation is known, the amino acid exchange matrix generated from the observed nucleotide variants is asymmetric and the mutabilities of the different amino acids are very different. These differences predominantly reflect preferences for nucleotide mutations in the DNA (especially the high mutation rate of the CpG dinucleotide, which makes arginine mutability very much higher than other amino acids) rather than selection imposed by protein structure constraints, although there is evidence for the latter as well. The variants occur predominantly on the surface of proteins (82%), with a slight preference for sites which are more exposed and less well conserved than random. Mutations to functional residues occur about half as often as expected by chance. The disease-associated amino acid variant distributions in OMIM are radically different from those expected on the basis of the 1000 Genomes dataset. The disease-associated variants preferentially occur in more conserved sites, compared to 1000 Genomes mutations. Many of the amino acid exchange profiles appear to exhibit an anti-correlation, with common exchanges in one dataset being rare in the other. Disease-associated variants exhibit more extreme differences in amino acid size and hydrophobicity. More modelling of the mutational processes at the nucleotide level is needed, but these observations should contribute to an improved prediction of the effects of specific variants in humans.

11.
In this study, we use cross-impact analysis to define the relationship among impact, mutation, and outbreak of bird flu. We then use the distribution rank, which we have developed over the last several years, to quantify the mutations from the amino acid sequences of 134 hemagglutinins and 97 neuraminidases. With the help of the Bayesian equation, we calculate the probability of mutation occurring in H5, H6, and H9 hemagglutinins, and in N1 and N2 neuraminidases. Finally, we estimate the probability of mutation occurring under different intensities of an impact. Although we have no means to predict an impact severe enough to lead to the mutations in hemagglutinins and neuraminidases that result in an outbreak of bird flu, we can in principle monitor the changes in distribution rank over time and predict the trend of mutations, and even the degree of an outbreak of bird flu.

12.
In this study, we determine the mutation relation among 333 H5N1 hemagglutinins of influenza A viruses according to their amino acid and RNA codon sequences. Then, we calculate seven probabilistic numbers, which we have developed since 1999, for each amino acid in these hemagglutinins. With the seven numbers as independent variables and the probability of occurrence of mutation at each hemagglutinin position as the dependent variable, we use logistic regression to model 967 missense point mutations from 333 hemagglutinins to get the population estimates. Thereafter, we predict the future mutation positions in H5N1 hemagglutinin. Finally, we use the translation probabilities between RNA codons and mutated amino acids to predict the would-be-mutated amino acids in H5N1 hemagglutinin.

13.
We outline a method for estimating quantitatively the influence of point mutations and selection on the frequencies of codons and amino acids. We show how the mutation rate, i.e., the rate of amino acid replacement due to point mutation, can be affected by the codon usage as well as by the rates of the involved base exchanges. A comparison of the mutation rates calculated from reliable values of codon usage and base exchange probabilities with those that would be expected on the basis of chance reveals a notable suppression of replacements leading to tryptophan, glutamate, lysine, and methionine, and particularly of those leading to the termination codons. If selection constraints are neglected and only mutations are taken into account, the best agreement between expected and observed frequencies of both codons and amino acids is obtained for alpha = 1.13-1.15, where (Formula: see text). The "selection values" of codons and amino acids derived by our method show a pattern that partially deviates from others in the literature. For example, the selection pressure on methionine and cysteine turns out to be much more pronounced than expected if only the discrepancies between their observed and expected occurrences in proteins are considered. To estimate to what extent randomly occurring amino acid replacements are accepted by selection, we constructed an "acceptability matrix" from the well-established matrix of accepted point mutations. On the basis of this matrix "acceptability values" of the amino acids can be defined that correlate with their selection values. We also examine the significance of mutations and selection of amino acids with respect to their physicochemical properties and functions in proteins. 
The conservatism of amino acid replacements with respect to certain properties such as polarity can be brought about by the mutational process alone, whereas the conservatism with respect to other relevant properties--among them all measures of bulkiness--obviously is the result of additional selectional constraints on the evolution of protein structures.

14.
Summary: We examine in this paper one of the expected consequences of the hypothesis that modern proteins evolved from random heteropeptide sequences. Specifically, we investigate the lengthwise distributions of amino acids in a set of 1,789 protein sequences with little sequence identity using the run test statistic (r_o) of Mood (1940, Ann. Math. Stat. 11, 367–392). The probability density of r_o for a collection of random sequences has mean = 0 and variance = 1 [the N(0,1) distribution] and can be used to measure the tendency of amino acids of a given type to cluster together in a sequence relative to that of a random sequence. We implement the run test using binary representations of protein sequences in which the amino acids of interest are assigned a value of 1 and all others a value of 0. We consider individual amino acids and sets of various combinations of them based upon hydrophobicity (4 sets), charge (3 sets), volume (4 sets), and secondary structure propensity (3 sets). We find that any sequence chosen randomly has a 90% or greater chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. We regard this as strong support for the random-origin hypothesis. However, we do observe significant deviations from the random expectation, as might be expected after billions of years of evolution. Two important global trends are found: (1) Amino acids with a strong α-helix propensity show a strong tendency to cluster whereas those with β-sheet or reverse-turn propensity do not. (2) Clustered rather than evenly distributed patterns tend to be preferred by the individual amino acids and this is particularly so for methionine. Finally, we consider the problem of reconciling the random nature of protein sequences with structurally meaningful periodic "patterns" that can be detected by sliding-window, autocorrelation, and Fourier analyses. Two examples, rhodopsin and bacteriorhodopsin, show that such patterns are a natural feature of random sequences.
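
The binary run test described in this abstract can be sketched as follows. This version uses the standard Wald-Wolfowitz normal approximation for the number of runs, which may differ in detail from the statistic of Mood (1940) that the authors use; the function name `runs_z` and the example sequences are illustrative only.

```python
import math


def runs_z(seq, targets):
    """Standardised runs statistic for a binary encoding of a sequence.

    Residues in `targets` are coded 1 and all others 0. A negative value
    means fewer runs than expected at random (clustering); a positive
    value means more runs than expected (even spacing).
    """
    bits = [1 if residue in targets else 0 for residue in seq]
    n1 = sum(bits)
    n0 = len(bits) - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("both classes must occur at least once")
    n = n1 + n0
    runs = 1 + sum(1 for x, y in zip(bits, bits[1:]) if x != y)
    mean = 2 * n1 * n0 / n + 1
    variance = 2 * n1 * n0 * (2 * n1 * n0 - n) / (n * n * (n - 1))
    return (runs - mean) / math.sqrt(variance)


if __name__ == "__main__":
    print(runs_z("AAAABBBB", {"A"}))  # clustered: negative
    print(runs_z("ABABABAB", {"A"}))  # alternating: positive
```

Passing a set such as a hydrophobic group as `targets` gives the grouped analysis the abstract describes; the normal approximation is only reasonable for sequences much longer than these toy inputs.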

15.
Recent models of adaptation at the DNA sequence level assume that the fitness effects of new mutations show certain statistical properties. In particular, these models assume that the distribution of fitness effects among new mutations is in the domain of attraction of the so-called Gumbel-type extreme value distribution. This assumption has not, however, been justified on any biological or theoretical grounds. In this note, I study random mutation in one of the simplest models of mutation and adaptation: Fisher's geometric model. I show that random mutation in this model yields a distribution of mutational effects that belongs to the Gumbel type. I also show that the distribution of fitness effects among rare beneficial mutations in Fisher's model is asymptotically exponential. I confirm these analytic findings with exact computer simulations. These results provide some support for the use of Gumbel-type extreme value theory in studies of adaptation and point to a surprising connection between recent phenotypic- and sequence-based models of adaptation: in both, the distribution of fitness effects among rare beneficial mutations is approximately exponential.
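
Random mutation in Fisher's geometric model can be illustrated with a small simulation. The sketch below is not the author's simulation: the dimension, the Gaussian mutation scale, the starting phenotype at unit distance from the optimum, and the use of distance reduction as a stand-in for fitness gain are all assumptions, and `beneficial_fraction` is a hypothetical name.

```python
import math
import random


def beneficial_fraction(dim=10, sigma=0.2, trials=10000, seed=0):
    """Simulate random mutations of a phenotype at distance 1 from the optimum.

    Each mutation adds an independent Gaussian displacement (standard
    deviation `sigma`) to every trait. A mutation counts as beneficial
    when it moves the phenotype closer to the optimum at the origin.
    Returns the beneficial fraction and the list of distance reductions.
    """
    rng = random.Random(seed)
    start = [1.0] + [0.0] * (dim - 1)
    start_distance = 1.0
    gains = []
    for _ in range(trials):
        mutant = [trait + rng.gauss(0.0, sigma) for trait in start]
        distance = math.sqrt(sum(value * value for value in mutant))
        if distance < start_distance:
            gains.append(start_distance - distance)
    return len(gains) / trials, gains


if __name__ == "__main__":
    fraction, gains = beneficial_fraction()
    print(f"beneficial fraction: {fraction:.3f}")
    print(f"mean gain among beneficial mutations: {sum(gains) / len(gains):.4f}")
```

A histogram of `gains` for a well-adapted starting point is the quantity one would compare against an exponential distribution.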

16.

Background

Zipf's law states that the relationship between the frequency of a word in a text and its rank (the most frequent word has rank 1, the 2nd most frequent word has rank 2, …) is approximately linear when plotted on a double logarithmic scale. It has been argued that the law is not a relevant or useful property of language because simple random texts, constructed by concatenating random characters including blanks behaving as word delimiters, exhibit a Zipf's law-like word rank distribution.

Methodology/Principal Findings

In this article, we examine the flaws of such putative good fits of random texts. We demonstrate - by means of three different statistical tests - that ranks derived from random texts and ranks derived from real texts are statistically inconsistent with the parameters employed to argue for such a good fit, even when the parameters are inferred from the target real text. Our findings are valid for both the simplest random texts composed of equally likely characters as well as more elaborate and realistic versions where character probabilities are borrowed from a real text.

Conclusions/Significance

The good fit of random texts to real Zipf's law-like rank distributions has not yet been established. Therefore, we suggest that Zipf's law might in fact be a fundamental law in natural languages.
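
The random texts discussed in this abstract can be generated in a few lines. The sketch below builds the simplest kind, with equally likely characters and a blank as word delimiter, and computes its rank-frequency list; the alphabet, text length, and least-squares slope on log-log axes are assumptions of the sketch, and the slope is a crude summary rather than one of the three statistical tests the article applies.

```python
import math
import random
from collections import Counter


def random_text_rank_freq(n_chars, alphabet="abc ", seed=0):
    """Word frequencies, sorted by rank, of a text of equally likely characters."""
    rng = random.Random(seed)
    text = "".join(rng.choice(alphabet) for _ in range(n_chars))
    counts = Counter(text.split())  # blanks act as word delimiters
    return sorted(counts.values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    freqs = random_text_rank_freq(20000)
    print("distinct words:", len(freqs))
    print("log-log slope:", loglog_slope(freqs))
```

An exactly Zipfian list such as `[12, 6, 4, 3]` gives a slope of -1, which is a convenient sanity check for `loglog_slope`.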

17.
Although mutations drive the evolutionary process, the rates at which the mutations occur are themselves subject to evolutionary forces. Our purpose here is to understand the role of selection and random genetic drift in the evolution of mutation rates, and we address this question in asexual populations at mutation‐selection equilibrium neglecting selective sweeps. Using a multitype branching process, we calculate the fixation probability of a rare nonmutator in a large asexual population of mutators and find that a nonmutator is more likely to fix when the deleterious mutation rate of the mutator population is high. Compensatory mutations in the mutator population are found to decrease the fixation probability of a nonmutator when the selection coefficient is large. But, surprisingly, the fixation probability changes nonmonotonically with increasing compensatory mutation rate when the selection is mild. Using these results for the fixation probability and a drift‐barrier argument, we find a novel relationship between the mutation rates and the population size. We also discuss the time to fix the nonmutator in an adapted population of asexual mutators, and compare our results with experiments.

18.
It has been shown that malignant activation of ras proto-oncogenes is mediated by point mutations that result in single amino acid conversions at positions 12, 13 or 61 of the ras gene products (p21 proteins). By analyzing randomly mutated ras genes, it has been demonstrated that amino acid substitutions at residues 12, 13, 59 and 63 activate p21. Furthermore, it has been shown that residues 16, 116 and 119 in p21 play critical roles in guanine nucleotide binding and, consequently, in the ability of the protein to induce changes characteristic of cellular transformation. By using the protein conformational prediction method of Chou and Fasman, the present work predicts that these critical amino acids, except glutamic acid at position 63, are located within beta-turns. The major "hot spots" for ras activation are codons 12 and 61. The author has predicted in an earlier paper that the single amino acid conversions at positions 12 and 61 would occur at beta-turn conformations consisting of residues 10-13 and 58-61, respectively. In the present study, probabilities of beta-turn occurrence at residues 10-13 or 58-61 of the p21 proteins encoded by various ras genes are compared. The probability for the normal p21 containing glycine as residue 12 is greatest, and the cancer-associated variants show lower probabilities. The single amino acid substitutions at position 61 do not decrease the beta-turn probability at residues 58-61 to the same extent, except for the replacement by histidine. Histidine at position 61 is not predicted as occurring within a beta-turn. (ABSTRACT TRUNCATED AT 250 WORDS)

19.
Wu G  Yan S 《Peptides》2003,24(12):1837-1845
In this study, we analyzed the amino acid pairs affected by mutations in two spike proteins from human coronavirus strains 229E and OC43 by means of random analysis in order to gain some insight into the possible mutations in the spike protein from SARS-CoV. The results demonstrate that the randomly unpredictable amino acid pairs are more sensitive to the mutations. The larger the difference between actual and predicted frequencies, the higher the chance of a mutation occurring. The effect induced by mutations is to reduce the difference between actual and predicted frequencies. The amino acid pairs whose actual frequencies are larger than their predicted frequencies are more likely to be targeted by mutations, whereas the amino acid pairs whose actual frequencies are smaller than their predicted frequencies are more likely to be formed after mutations. These findings are consistent with those of our several recent studies, i.e. the mutations represent a process of degeneration inducing human diseases.

20.