Similar articles
1.
Summary: A formal mathematical analysis of Kimura's (1981) six-parameter model of nucleotide substitution for the case of unequal substitution rates among different pairs of nucleotides is conducted, and new formulae for estimating the number of nucleotide substitutions and its standard error are obtained. By using computer simulation, the validities and utilities of Jukes and Cantor's (1969) one-parameter formula, Takahata and Kimura's (1981) four-parameter formula, and our six-parameter formula for estimating the number of nucleotide substitutions are examined under three different schemes of nucleotide substitution. It is shown that the one-parameter and four-parameter formulae often give underestimates when the number of nucleotide substitutions is large, whereas the six-parameter formula generally gives a good estimate for all three substitution schemes examined. However, when the number of nucleotide substitutions is large, the six-parameter and four-parameter formulae are often inapplicable unless the number of nucleotides compared is extremely large. It is also shown that as long as the mean number of nucleotide substitutions is smaller than one per nucleotide site, the three formulae give more or less the same estimate regardless of the substitution scheme used. (Author note: on leave of absence from the Department of Biology, Faculty of Science, Kyushu University 33, Fukuoka 812, Japan.)
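
The "inapplicable" case mentioned in this abstract can be illustrated with the simplest of the three estimators. The sketch below assumes the standard form of the Jukes and Cantor one-parameter formula, d = -(3/4) ln(1 - 4p/3), which the abstract names but does not quote; the function name and the sample values of p are hypothetical. When sampling error pushes the observed difference p to 3/4 or beyond, the logarithm has no real value and no estimate can be computed.

```python
import math


def jukes_cantor_distance(p):
    """Estimate substitutions per site from p, the observed fraction of differing sites.

    Assumes the standard one-parameter (Jukes-Cantor) correction.
    Returns None when the formula is inapplicable (log argument <= 0).
    """
    arg = 1.0 - 4.0 * p / 3.0
    if arg <= 0.0:
        return None  # observed difference too large for the correction
    return -0.75 * math.log(arg)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for p in (0.05, 0.30, 0.60, 0.76):
        print(p, jukes_cantor_distance(p))
```

With short sequences the sampling variance of p is large, so values near or above the limit occur more often, which is consistent with the abstract's remark that applicability depends on the number of nucleotides compared.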

2.
Summary: Statistical properties of Goodman et al.'s (1974) method of compensating for undetected nucleotide substitutions in evolution are investigated by using computer simulation. It is found that the method tends to overcompensate when the stochastic error of the number of nucleotide substitutions is large. Furthermore, the estimate of the number of nucleotide substitutions obtained by this method has a large variance. However, in order to see whether this method gives overcompensation when applied together with the maximum parsimony method, a much larger scale of simulation seems to be necessary.

3.
Summary: Conducting computer simulations, Nei and Tateno (1978) have shown that Jukes and Holmquist's (1972) method of estimating the number of nucleotide substitutions tends to give an overestimate and the estimate obtained has a large variance. Holmquist and Conroy (1980) repeated some parts of our simulation and claim that the overestimation of nucleotide substitutions in our paper occurred mainly because we used selected data. Examination of Holmquist and Conroy's simulation indicates that their results are essentially the same as ours when the Jukes-Holmquist method is used, but since they used a different method of computation their estimates of nucleotide substitutions differed substantially from ours. Another problem in Holmquist and Conroy's Letter is that they confused the expected number of nucleotide substitutions with the number in a sample. This confusion has resulted in a number of unnecessary arguments. They also criticized our X² measure, but this criticism is apparently due to a misunderstanding of the assumptions of our method and a failure to use our method in the way we described. We believe that our earlier conclusions remain unchanged.

4.
Two types of amino acid substitutions in protein evolution (total citations: 35; self-citations: 0; citations by others: 35)
Summary: The frequency of amino acid substitutions, relative to the frequency expected by chance, decreases linearly with the increase in physico-chemical differences between amino acid pairs involved in a substitution. This correlation does not apply to abnormal human hemoglobins. Since abnormal hemoglobins mostly reflect the process of mutation rather than selection, the correlation manifest during protein evolution between substitution frequency and physico-chemical difference in amino acids can be attributed to natural selection. Outside of abnormal proteins, the correlation also does not apply to certain regions of proteins characterized by rapid rates of substitution. In these cases again, except for the largest physico-chemical differences between amino acid pairs, the substitution frequencies seem to be independent of the physico-chemical parameters. The elimination of the substitutions involving the largest physico-chemical differences can once more be attributed to natural selection. For smaller physico-chemical differences, natural selection, if it is operating in the polypeptide regions, must be based on parameters other than those examined.

5.
6.
Summary: A simple method for the evolutionary analysis of amino acid sequence data is presented and used to examine whether the number of variable sites (NVS) of a protein is constant during its evolution. The NVSs for hemoglobin and for mitochondrial cytochrome c are each found to be almost constant, and the ratio between the NVSs is close to the ratio between the unit evolutionary periods. This indicates that the substitution rate per variable site is almost uniform for these proteins, as the neutral theory claims. An advantage of the present analysis is that it can be done without knowledge of paleontological divergence times and can be extended to bacterial proteins such as bacterial c-type cytochromes. It is suggested that the NVS of cytochrome c has been almost constant even over the long period (ca. 3.0 billion years) of bacterial evolution but that at least two different substitution rates are necessary to describe the accumulated changes in the sequence. This two-clock interpretation is consistent with fossil evidence for the appearance times of photosynthetic bacteria and eukaryotes.

7.
A general model for estimating the number of amino acid substitutions per site (d) from the fraction of identical residues between two sequences (q) is proposed. The well-known Poisson-correction formula $q = e^{-d}$ corresponds to a site-independent and amino-acid-independent substitution rate. The equation $q = (1 - e^{-2d})/(2d)$, derived for the case of substitution rates that are site-independent but vary among amino acids, closely approximates the empirical method suggested by Dayhoff et al. (1978). The equation $q = 1/(1 + d)$ describes the case of substitution rates that are amino-acid-independent but vary among sites. Lastly, the equation $q = \ln(1 + 2d)/(2d)$ accounts for the general case where substitution rates can differ for both amino acids and sites.
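
In practice q is observed and d is wanted, so each of the four relations has to be inverted. The following is a minimal sketch of one way to do that, assuming only the four formulas quoted in the abstract; the function names and the bisection approach are illustrative choices, not the authors' procedure. Two of the formulas invert in closed form (d = -ln q and d = 1/q - 1), but bisection is used throughout because all four curves decrease monotonically from q = 1 at d = 0.

```python
import math


def q_poisson(d):
    """Rates equal across sites and amino acids: q = exp(-d)."""
    return math.exp(-d)


def q_amino_acid_variable(d):
    """Rates vary among amino acids only: q = (1 - exp(-2d)) / (2d)."""
    return -math.expm1(-2.0 * d) / (2.0 * d) if d > 0 else 1.0


def q_site_variable(d):
    """Rates vary among sites only: q = 1 / (1 + d)."""
    return 1.0 / (1.0 + d)


def q_general(d):
    """Rates vary among both amino acids and sites: q = ln(1 + 2d) / (2d)."""
    return math.log1p(2.0 * d) / (2.0 * d) if d > 0 else 1.0


def distance_from_identity(q, model, upper=1e6):
    """Solve model(d) = q for d by bisection; every model decreases in d."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    lo, hi = 0.0, upper
    if model(hi) > q:
        raise ValueError("q is too small for the chosen search range")
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if model(mid) > q:
            lo = mid  # identity still too high, so d must be larger
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    for model in (q_poisson, q_amino_acid_variable, q_site_variable, q_general):
        print(model.__name__, distance_from_identity(0.5, model))
```

For q = 0.5 the Poisson correction gives d = ln 2 and the site-variable formula gives d = 1, which makes a convenient sanity check on the solver.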

8.
Arndt PF. Gene 2007, 390(1-2): 75-83
Maximum likelihood phylogeny reconstruction methods are widely used in uncovering and assessing the evolutionary history and relationships of natural systems. However, several simplifying assumptions commonly made in this analysis limit the explanatory power of the results obtained. We present an algorithm that performs the phylogenetic analysis without making the common assumptions for sequence data from at least three leaf nodes in a star phylogeny. In particular, the underlying nucleotide substitution model does not have to be reversible and may include neighbor-dependent processes like the CpG methylation deamination process (CpG effect). The base composition of the sequences at the external nodes and that of the ancestral sequence may differ from each other, and they do not have to be stationary state distributions of the corresponding substitution model. The algorithm is able to reconstruct the ancestral base composition and accurately estimate substitution frequencies in the branches of the star phylogeny. Extensive tests on simulated data validate the very favorable performance of the algorithm. As an application we present the analysis of aligned genomic sequences from human, mouse, and dog. Different substitution patterns can be observed in the three lineages.

9.
Using an information theoretic formalism, we optimize classes of amino acid substitution to be maximally indicative of local protein structure. Our statistically-derived classes are loosely identifiable with the heuristic constructions found in previously published work. However, while these other methods provide a more rigid idealization of physicochemically constrained residue substitution, our classes provide substantially more structural information with many fewer parameters. Moreover, these substitution classes are consistent with the paradigmatic view of the sequence-to-structure relationship in globular proteins which holds that the three-dimensional architecture is predominantly determined by the arrangement of hydrophobic and polar side chains with weak constraints on the actual amino acid identities. More specific constraints are imposed on the placement of prolines, glycines, and the charged residues. These substitution classes have been used in highly accurate predictions of residue solvent accessibility. They could also be used in the identification of homologous proteins, the construction and refinement of multiple sequence alignments, and as a means of condensing and codifying the information in multiple sequence alignments for secondary structure prediction and tertiary fold recognition. © 1996 Wiley-Liss, Inc.

10.
11.
Evolution of the amino acid substitution in the mammalian myoglobin gene (total citations: 1; self-citations: 0; citations by others: 1)
Summary: Multivariate statistical analyses were applied to 16 physical and chemical properties of amino acids. Four of these properties (volume, polarity, isoelectric point (charge), and hydrophobicity) were found to explain adequately 96% of the total variance of amino acid attributes. Using these four quantitative measures of amino acid properties, a structural discriminate function in the form of a weighted difference sum of squares equation was developed. The discriminate function is weighted by the location of each particular residue within a given tertiary structure and yields a numerical discriminate or difference value for the replacement of these residues by different amino acids. This resulting discriminate value represents an expression of the perturbation in the local positional environment of a protein when an amino acid substitution occurs. With the use of this structural discriminate function, a residue-by-residue comparison of the known mammalian myoglobin sequences was carried out in an attempt to elucidate the positions of possible deviations from the known tertiary structure of sperm whale myoglobin. Only 11 of the 153 residue positions in myoglobin demonstrated possible structural deviations. From this analysis, indices of difference were calculated for all amino acid exchanges between the various myoglobins. All comparisons yielded indices of difference that were considerably lower than would be expected if mutations had been fixed at random, even if the organization of the genetic code is taken into consideration. On the basis of these results, it is inferred that some form of selection has acted in the evolution of mammalian myoglobins to favor amino acid substitutions that are compatible with the retention of the original conformation of the protein.

12.
Human immunodeficiency virus (HIV) exhibits immunological hypervariability, which has been an obstacle to successful production of effective anti-HIV vaccines. In this study, we estimated patterns of nucleotide and amino acid substitutions in the env gene of HIVs, with the aim of finding characteristics of the mechanism which generates the immunological diversity of the env protein of HIVs. We found that nucleotide changes between A and G are predominant compared to those between other nucleotides. Since this feature is consistent with the pattern of nucleotide substitutions of other retroviral genes but is quite different from those of most eukaryotic genes, a high rate of nucleotide substitution between A and G appears to be specific for retroviruses including HIVs. We discuss the biological relationship between this biased substitution and the mechanism generating hypervariability of epitopes on the env protein of HIVs.

13.
We have sequenced the entire exon (1,180 bp) encoding the zinc finger domain of the X-linked and Y-linked zinc finger genes (ZFX and ZFY, respectively) in the orangutan, the baboon, the squirrel monkey, and the rat; a total of 9,442 bp were sequenced. The ratio of the rates of synonymous substitution in the ZFY and ZFX genes is estimated to be 2.1 in primates. This is close to the ratio of 2.3 estimated from primate ZFY and ZFX intron sequences and supports the view that the male-to-female ratio of mutation rate in humans is considerably higher than 1 but not extremely large. The ratio of synonymous substitution rates in ZFY and ZFX is estimated to be 1.3 in the rat lineage but 4.2 in the mouse lineage. The former is close to the estimate (1.4) from introns. The much higher ratio in the mouse lineage (not statistically significant) might have arisen from relaxation of selective constraints. The synonymous divergence between mouse and rat ZFX is considerably lower than that between mouse and rat autosomal genes, agreeing with previous observations and providing some evidence for stronger selective constraints on synonymous changes in X-linked genes than in autosomal genes. At the protein level ZFX has been highly conserved in all placental mammals studied, while ZFY has been well conserved in primates and foxes but has evolved rapidly in mice and rats, possibly due to relaxation of functional constraints as a result of the development of X-inactivation of ZFX in rodents. The long persistence of the ZFY-ZFX gene pair in mammals provides some insight into the process of degeneration of Y-linked genes. Correspondence to: W.-H. Li
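
The step from a Y/X substitution-rate ratio to a male-to-female mutation-rate ratio is not spelled out in this abstract. A minimal sketch follows, under an assumption that is not taken from the abstract: that an X-linked sequence spends two-thirds of its time in females and one-third in males while a Y-linked sequence is always in males, which gives Y/X = 3a/(2 + a) for a male-to-female ratio a. The function name is hypothetical.

```python
def male_to_female_mutation_ratio(y_over_x):
    """Solve Y/X = 3a / (2 + a) for a, the male-to-female mutation-rate ratio.

    Assumes X-linked genes spend 2/3 of their time in females and
    Y-linked genes are carried only by males. Under this model Y/X
    must lie strictly between 0 and 3; otherwise None is returned.
    """
    if not 0.0 < y_over_x < 3.0:
        return None  # outside the range the simple model can produce
    return 2.0 * y_over_x / (3.0 - y_over_x)


if __name__ == "__main__":
    # Y/X ratios quoted in the abstract: primates (exon, intron), rat, mouse.
    for ratio in (2.1, 2.3, 1.3, 4.2):
        print(ratio, male_to_female_mutation_ratio(ratio))
```

Under this assumed relation the mouse-lineage ratio of 4.2 exceeds the model's upper bound of 3, so it has no solution, which fits the abstract's reading that factors other than the mutation-rate ratio are involved there.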

14.
Mitochondrial DNA (mtDNA) sequences are widely used for inferring the phylogenetic relationships among species. Clearly, the assumed model of nucleotide or amino acid substitution used should be as realistic as possible. Dependence among neighboring nucleotides in a codon complicates modeling of nucleotide substitutions in protein-encoding genes. It seems preferable to model amino acid substitution rather than nucleotide substitution. Therefore, we present a transition probability matrix of the general reversible Markov model of amino acid substitution for mtDNA-encoded proteins. The matrix is estimated by the maximum likelihood (ML) method from the complete sequence data of mtDNA from 20 vertebrate species. This matrix represents the substitution pattern of the mtDNA-encoded proteins and shows some differences from the matrix estimated from the nuclear-encoded proteins. The use of this matrix would be recommended in inferring trees from mtDNA-encoded protein sequences by the ML method. Received: 3 May 1995 / Accepted: 31 October 1995

15.
The amino acid composition of human alcohol dehydrogenase (ADH) was compared with alcohol dehydrogenases from different organisms and with other proteins. Similar amino acid sequences in human ADH (the template protein) and in other proteins were determined by means of an original computer program. Analysis of amino acid motifs reveals that the ADHs from evolutionarily closer organisms share more amino acid sequences. The quantitative measure of amino acid similarity was the number of similar motifs in the analyzed protein per protein length. This value was measured for ADHs and for different proteins. For ADHs, this quotient was higher than for proteins with different functions; for vertebrates it correlated with evolutionary closeness. A similar motif comparison was made with the MEME program suite. The analysis of ADHs revealed 4 motifs common to 6 of 10 tested organisms and no such motifs for proteins of different function. The conclusion is that overall amino acid composition is more important for protein function than amino acid order, and for enzymes of similar function it correlates better with evolutionary distance between organisms.

16.
Summary: Some simple formulae were obtained which enable us to estimate evolutionary distances in terms of the number of nucleotide substitutions (and also the evolutionary rates when the divergence times are known). In comparing a pair of nucleotide sequences, we distinguish two types of differences: if homologous sites are occupied by different nucleotide bases but both are purines or both pyrimidines, the difference is called type I (or transition type), while if one of the two is a purine and the other is a pyrimidine, the difference is called type II (or transversion type). Letting P and Q be respectively the fractions of nucleotide sites showing type I and type II differences between two sequences compared, the evolutionary distance per site is $K = -\tfrac{1}{2}\ln\{(1 - 2P - Q)\sqrt{1 - 2Q}\}$. The evolutionary rate per year is then given by $k = K/(2T)$, where T is the time since the divergence of the two sequences. If only the third codon positions are compared, the synonymous component of the evolutionary base substitutions per site is estimated by $K'_S = -\tfrac{1}{2}\ln(1 - 2P - Q)$. Also, formulae for standard errors were obtained. Some examples were worked out using reported globin sequences to show that synonymous substitutions occur at much higher rates than amino acid-altering substitutions in evolution. Contribution No. 1330 from the National Institute of Genetics, Mishima, 411 Japan
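
The procedure in this abstract (classify each differing site as a transition or a transversion, form P and Q, then apply the distance and rate formulas) can be sketched as follows. The sketch assumes the form in which this two-parameter distance is usually quoted, K = -(1/2) ln{(1 - 2P - Q) sqrt(1 - 2Q)}, together with k = K/(2T) from the abstract; the function names and the two toy sequences are hypothetical, and inputs are assumed to be aligned, equal-length, uppercase strings over A, C, G, T.

```python
import math

PURINES = frozenset("AG")  # the remaining bases, C and T, are pyrimidines


def classify_differences(seq1, seq2):
    """Return (P, Q): fractions of sites with type I (transition) and
    type II (transversion) differences between two aligned sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    n = len(seq1)
    return transitions / n, transversions / n


def kimura_distance(p, q):
    """K = -(1/2) ln{(1 - 2P - Q) sqrt(1 - 2Q)}; None if the log is undefined."""
    a = 1.0 - 2.0 * p - q
    b = 1.0 - 2.0 * q
    if a <= 0.0 or b <= 0.0:
        return None
    return -0.5 * math.log(a * math.sqrt(b))


def rate_per_site_per_year(k_distance, divergence_time_years):
    """k = K / (2T): the factor 2 counts both diverging lineages."""
    return k_distance / (2.0 * divergence_time_years)


if __name__ == "__main__":
    # Toy sequences: one transition (G->A) and one transversion (G->C).
    p, q = classify_differences("ACGTACGTAC", "ACATACCTAC")
    k = kimura_distance(p, q)
    print(p, q, k, rate_per_site_per_year(k, 1.0e8))
```

Returning None rather than raising keeps the inapplicable case (too many observed differences for the logarithm to be defined) easy to count when the function is run over many sequence pairs.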

17.
Summary: Several forms of maximum likelihood models are applied to aligned amino acid sequence data coded for in the mitochondrial DNA of six species (chicken, frog, human, bovine, mouse, and rat). These models range in form from relatively simple models of the type currently used for inferring phylogenetic tree structure to models more complex than those that have been used previously. No major discrepancies between the optimal trees inferred by any of these methods are found, but there are huge differences in adequacy of fit. A very significant finding is that the fit of any of these models is vastly improved by allowing a certain proportion of the amino acid sites to be invariant. An even more important, although disquieting, finding is that none of these models fits well, as judged by standard statistical criteria. The primary reason for this is that amino acid sites undergo substitution according to a process that is very heterogeneous. Because most phylogenetic inference is accomplished by choosing the optimal tree under the assumption that a homogeneous process is acting on the sites, the potential invalidity of some such conclusions is raised by this article's results. The seriousness of this problem depends upon the robustness of the phylogenetic inferential procedure to departures from the underlying model.

18.
Two hemoglobin components are recognized in erythrocytes of the adult Tinamou. We determined the amino acid sequences of Tinamou αD-, αA-, and β-globins from intact globin chains and several chemically cleaved fragments. A remarkable feature of Tinamou hemoglobin was a deletion in the αD-globin chain. This has not been reported in the literature, except in pigeon embryonic αD-globin. The amino acid sequences of Tinamou globin were highly similar to those of Ostrich and Rhea hemoglobin. Comparison between Tinamou, Ostrich, and Rhea suggested that the evolution speed of the globins, αD = αA > β, was related to the early appearance of these birds. The important residues in Tinamou hemoglobin, such as those in the heme contact and oxygen binding regions, were highly conserved in other species.

19.
Summary: We have isolated complementary DNA (cDNA) clones for apocytochrome c from the green alga Chlamydomonas reinhardtii and shown that they are encoded by a single nuclear gene termed cyc. Cyc mRNA levels are found to depend primarily on the presence of acetate as a reduced carbon source in the culture medium. The deduced amino acid sequence shows that, apart from the probable removal of the initiating methionine, C. reinhardtii apocytochrome c is synthesized in its mature form. Its structure is generally similar to that of cytochromes c from higher plants. Several punctual deviations from the general pattern of cytochrome c sequences that is found in other organisms have interesting structural and functional implications. These include, in particular, valines 19 and 39, asparagine 78, and alanine 83. A phylogenetic tree was constructed by the matrix method from cytochrome c data for a representative range of species. The results suggest that C. reinhardtii diverged from higher plants approximately 700–750 million years ago; they also are not easy to reconcile with the current attribution of Chlamydomonas reinhardtii and Enteromorpha intestinalis to a unique phylum, because these two species probably diverged from one another at about the same time as they diverged from the line leading to higher plants.

20.
A 3 kb DNA fragment containing the gene (mdh) encoding malate dehydrogenase (MDH) from the thermophile Thermus aquaticus B was cloned in Escherichia coli and its nucleotide sequence determined. Comparative analysis showed the nucleotide sequence to be very closely related to that determined for the Thermus flavus mdh gene and flanking regions, with no differences between the predicted amino acid sequences of the MDHs. A proximal open reading frame, identified as the sucD gene, and the mdh gene may be parts of the same operon in T. aquaticus B. Expression of the T. aquaticus B mdh gene in E. coli was found to be at a relatively low level. A simple method for purification of thermostable MDH from the E. coli clone containing the T. aquaticus B mdh gene is presented.
