Similar articles
20 similar articles found.
1.
2.
MOTIVATION: It is well known that neighbouring nucleotides in DNA sequences do not mutate independently of each other. In this paper, we introduce a context-dependent substitution model and derive an algorithm to calculate the likelihood of sequences evolving under this model. We use this algorithm to estimate neighbour-dependent substitution rates, as well as rates for dinucleotide substitutions, using a Bayesian sampling procedure. The model is irreversible, giving an arrow to time, and allowing the position of the root between a pair of sequences to be inferred without using out-groups. RESULTS: We applied the model to aligned human-mouse non-coding data. Clear neighbour dependencies were observed, including 17-18-fold increased CpG to TpG/CpA rates compared with other substitutions. Root inference positioned the root halfway between the mouse and human tips, suggesting an approximately clock-like behaviour of the irreversible part of the substitution process.

3.
A codon-based model of nucleotide substitution for protein-coding DNA sequences   Total citations: 11, self-citations: 23, citations by others: 11
A codon-based model for the evolution of protein-coding DNA sequences is presented for use in phylogenetic estimation. A Markov process is used to describe substitutions between codons. Transition/transversion rate bias and codon usage bias are allowed in the model, and selective restraints at the protein level are accommodated using physicochemical distances between the amino acids coded for by the codons. Analyses of two data sets suggest that the new codon-based model can provide a better fit to data than can nucleotide-based models and can produce more reliable estimates of certain biologically important measures such as the transition/transversion rate ratio and the synonymous/nonsynonymous substitution rate ratio.

4.
It is well known that, owing to the degeneracy of the genetic code, most silent substitutions appear at the third codon position, so the mutation frequency of the third codon position is much higher than that of the first two positions. However, it remains unknown whether the directionality of point mutation is similar at the three codon positions. In this paper, an analysis of 15 sets of orthologous genes reveals that most substitution types differ significantly between any two codon positions, especially between the 2nd and the 3rd phases. Furthermore, the average frequencies of each type of substitution calculated from the fifteen sets of orthologous genes are similar to those identified in single nucleotide polymorphisms (SNPs) of the human and mouse genomes. The present analyses suggest that nucleotide substitution in protein-coding sequences is not only context-dependent (the so-called neighboring-nucleotide effects) but also phase-dependent, which is significant for improving the prevalent nucleotide-evolution models.

5.
Variation in chloroplast rbcL sequences was studied in representative species of four different lineages: the tribe Rubieae (Rubiaceae), and the genera Drosera (Droseraceae), Nothofagus (Nothofagaceae) and Ilex (Aquifoliaceae). Each lineage has its particular non-overlapping set of rbcL polymorphic sites, indicating that common unconstrained rbcL sites are not shared. Large differences in the rate and pattern of nucleotide substitution are observed among the four lineages. The genus Ilex has the lowest rate of substitution, the lowest transition/transversion ratio, the lowest synonymous/replacement ratio and the lowest number of substitutions at the third codon position. An apparent relationship of these measures to the age of the lineages is observed. The A + T content and codon use among the four lineages are very similar and, apparently, cannot account for the observed differences in patterns of nucleotide substitution. However, the A + T content of the two bases immediately flanking the polymorphic sites is higher in Ilex than in the other lineages. This could be correlated with the transversion/transition bias observed in Ilex. The particularly low synonymous/replacement ratio found in Ilex could also be explained by the small population sizes of species in this genus.

6.
7.
Summary: A formal mathematical analysis of Kimura's (1981) six-parameter model of nucleotide substitution for the case of unequal substitution rates among different pairs of nucleotides is conducted, and new formulae for estimating the number of nucleotide substitutions and its standard error are obtained. By using computer simulation, the validities and utilities of Jukes and Cantor's (1969) one-parameter formula, Takahata and Kimura's (1981) four-parameter formula, and our six-parameter formula for estimating the number of nucleotide substitutions are examined under three different schemes of nucleotide substitution. It is shown that the one-parameter and four-parameter formulae often give underestimates when the number of nucleotide substitutions is large, whereas the six-parameter formula generally gives a good estimate for all the three substitution schemes examined. However, when the number of nucleotide substitutions is large, the six-parameter and four-parameter formulae are often inapplicable unless the number of nucleotides compared is extremely large. It is also shown that as long as the mean number of nucleotide substitutions is smaller than one per nucleotide site the three formulae give more or less the same estimate regardless of the substitution scheme used.
On leave of absence from the Department of Biology, Faculty of Science, Kyushu University 33, Fukuoka 812, Japan
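The one-parameter formula mentioned in this abstract is the standard Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The sketch below is only an illustration of that textbook formula and its large-sample variance; the function names are made up here, and the four- and six-parameter formulae are not reproduced because the abstract does not state them. It also shows one way an estimator of this kind becomes inapplicable: once p reaches 3/4 the logarithm has no valid argument.

```python
import math


def jukes_cantor_distance(p):
    """Substitutions per site estimated from the observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        # The argument of the logarithm is non-positive for p >= 3/4.
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jukes_cantor_variance(p, n_sites):
    """Large-sample variance of the estimate for n_sites compared nucleotides."""
    return p * (1.0 - p) / (n_sites * (1.0 - 4.0 * p / 3.0) ** 2)


if __name__ == "__main__":
    # Illustrative inputs only; they are not data from the abstract.
    for p in (0.05, 0.30, 0.60):
        d = jukes_cantor_distance(p)
        se = math.sqrt(jukes_cantor_variance(p, 1000))
        print(f"p={p:.2f}  d={d:.4f}  se={se:.4f}")
```

The standard error grows quickly as p approaches 3/4, which is consistent with the abstract's remark that large divergences require very long sequences.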

8.
A new approach to the identification of point mutations by allele-specific PCR was proposed. The mutation R408W of the human phenylalanine hydroxylase gene was used as a model. A high specificity of the approach was achieved by the use of primers partially complementary to the genomic DNA. Polyethylene glycol covalently attached to one of the allele-specific primers provides for the differential identification of the PCR products due to a change in electrophoretic mobility.

9.
MOTIVATION: Neighbor-dependent substitution processes generate specific patterns of dinucleotide frequencies in the genomes of most organisms. The CpG-methylation-deamination process is, for example, a prominent process in vertebrates (CpG effect). Such processes, often with unknown mechanistic origins, need to be incorporated into realistic models of nucleotide substitutions. RESULTS: Based on a general framework of nucleotide substitutions we developed a method that is able to identify the most relevant neighbor-dependent substitution processes, estimate their relative frequencies and judge whether they are important enough to be included in the model. Starting from a model for neighbor-independent nucleotide substitution we successively added neighbor-dependent substitution processes in the order of their ability to increase the likelihood of the model describing given data. The analysis of neighbor-dependent nucleotide substitutions based on repetitive elements found in the genomes of human, zebrafish and fruit fly is presented. AVAILABILITY: A web server to perform the presented analysis is freely available at: http://evogen.molgen.mpg.de/server/substitution-analysis

10.
11.
Markov models describing the evolution of the nucleotide substitution process, widely used in phylogeny reconstruction, usually assume the hypotheses of stationarity and time reversibility. Although these models give meaningful results when applied to biological data, it is not clear if the 2 assumptions mentioned above hold and, if not, how much sequence evolution processes deviate from them. To this aim, we introduce 2 sets of indices that can be calculated from the nucleotide distribution and the substitution rates. The stationarity indices (STIs) can be used to test the validity of the equilibrium assumption. The irreversibility indices (IRIs) are derived from the Kolmogorov cycle conditions for time reversibility and quantify the degree of nontime reversibility of a process. We have computed STIs and IRIs for the evolutionary process of 2 lineages, Drosophila simulans and Homo sapiens. In the latter case, we use a modified form of the indices that takes into account the CpG decay process. In both cases, we find statistically significant deviations from the ideal case of a process that has reached stationarity and is time reversible.

12.
13.
Holland BR, Schmid J. BMC Microbiology 2005, 5(1):1-11

Background

The sexually transmitted disease, gonorrhea, is a serious health problem in developed as well as in developing countries, for which treatment continues to be a challenge. The recent completion of the genome sequence of the causative agent, Neisseria gonorrhoeae, opens up an entirely new set of approaches for studying this organism and the diseases it causes. Here, we describe the initial phases of the construction of an expression-capable clone set representing the protein-coding ORFs of the gonococcal genome using a recombination-based cloning system.

Results

The clone set thus far includes 1672 of the 2250 predicted ORFs of the N. gonorrhoeae genome, of which 1393 (83%) are sequence-validated. Included in this set are 48 of the 61 ORFs of the gonococcal genetic island of strain MS11, not present in the sequenced genome of strain FA1090. L-arabinose-inducible glutathione-S-transferase (GST)-fusions were constructed from random clones and each was shown to express a fusion protein of the predicted size following induction, demonstrating the use of the recombination cloning system. PCR amplicons of each ORF used in the cloning reactions were spotted onto glass slides to produce DNA microarrays representing 2035 genes of the gonococcal genome. Pilot experiments indicate that these arrays are suitable for the analysis of global gene expression in gonococci.

Conclusion

This archived set of Gateway® entry clones will facilitate high-throughput genomic and proteomic studies of gonococcal genes using a variety of expression and analysis systems. In addition, the DNA arrays produced will allow us to generate gene expression profiles of gonococci grown in a wide variety of conditions. Together, the resources produced in this work will facilitate experiments to dissect the molecular mechanisms of gonococcal pathogenesis on a global scale, and ultimately lead to the determination of the functions of unknown genes in the genome.

14.

Background  

Microbiological research relies on the use of model organisms that act as representatives of their species or subspecies; these are frequently well-characterized laboratory strains. However, it has often become apparent that the model strain initially chosen does not represent important features of the species. For micro-organisms, the diversity of their genomes is such that even the best possible choice of initial strain for sequencing may not assure that the genome obtained adequately represents the species. To acquire information about a species' genome as efficiently as possible, we require a method to choose strains for analysis on the basis of how well they represent the species.

15.
Interspersed repeats have emerged as a valuable tool for studying neutral patterns of molecular evolution. Here we analyze variation in the rate and pattern of nucleotide substitution across all autosomes in the chicken genome by comparing the present-day CR1 repeat sequences with their ancestral copies and reconstructing nucleotide substitutions with a maximum likelihood model. The results shed light on the origin and evolution of the isochore structure, the large-scale heterogeneity in GC content found in the genomes of birds and mammals. In contrast to mammals, where GC content is becoming homogenized, heterogeneity in GC content is being reinforced in the chicken genome. This is also supported by patterns of substitution inferred from alignments of introns in chicken, turkey, and quail. Analysis of individual substitution frequencies is consistent with the biased gene conversion (BGC) model of isochore evolution, and it is likely that patterns of evolution in the chicken genome closely resemble those in the ancestral amniote genome, in which isochores are inferred to have originated. Microchromosomes and distal regions of macrochromosomes are found to have elevated substitution rates and a more GC-biased pattern of nucleotide substitution. This can largely be accounted for by a strong correlation between GC content and the rate and pattern of substitution. The results suggest that an interaction between increased mutability at CpG motifs and fixation biases due to BGC could explain increased levels of divergence in GC-rich regions.

16.
The application of different substitution models to each gene (a.k.a. a mixed model) should be considered in model-based phylogenetic analysis of multigene sequences. However, a single molecular evolution model is still usually applied. There are no computer programs able to conduct model selection for multiple loci at the same time, though several recently developed types of software for phylogenetic inference can handle mixed models. Here, I have developed computer software named 'kakusan' that solves the above problems. The major running steps are briefly described, and the results of an analysis with kakusan are compared with those obtained with other programs.

17.
Hughes AL, French JO. Gene 2007, 387(1-2):31-37
Patterns of nucleotide substitution at orthologous loci were examined between three genomes of Ehrlichia ruminantium, the causative agent of heartwater disease of ruminants. The most recent common ancestor of two genomes (Erwe and Erwo) belonging to the Welgevonden strain was estimated to have occurred 26,500-57,000 years ago, while the most recent common ancestor of these two genomes and the Erga genome (Gardel strain) was estimated to have occurred 2.1-4.7 million years ago. The search for genes showing extremely high values of the number of synonymous substitutions per site was used to identify genes involved in past homologous recombination. The most striking case involved the map1 gene, encoding major antigenic protein-1; evidence for homologous recombination is consistent with previous phylogenetic analysis of map1 alleles. At this and certain other loci, homologous recombination may have contributed to the evolution of host-pathogen interactions. In addition, comparison of the patterns of synonymous and nonsynonymous substitution provided evidence for positive selection favoring a high level of amino acid change between the Welgevonden and Gardel strains at a locus of unknown function (designated Erum4340 in the Erwo genome).

18.
Miyazawa S. PLoS ONE 2011, 6(3):e17244

Background

Empirical substitution matrices represent the average tendencies of substitutions over various protein families by sacrificing gene-level resolution. We develop a codon-based model, in which mutational tendencies of codons, a genetic code, and the strength of selective constraints against amino acid replacements can be tailored to a given gene. First, selective constraints averaged over proteins are estimated by maximizing the likelihood of each 1-PAM matrix of empirical amino acid (JTT, WAG, and LG) and codon (KHG) substitution matrices. Then, selective constraints specific to given proteins are approximated as a linear function of those estimated from the empirical substitution matrices.

Results

Akaike information criterion (AIC) values indicate that a model allowing multiple nucleotide changes fits the empirical substitution matrices significantly better. Also, the ML estimates of transition-transversion bias obtained from these empirical matrices are not as large as previously estimated. The selective constraints are characteristic of proteins rather than species. However, their relative strengths among amino acid pairs can be approximated as depending mainly on the amino acid pair rather than on the protein family, because the present model, in which selective constraints are approximated to be a linear function of those estimated from the JTT/WAG/LG/KHG matrices, can provide a good fit to other empirical substitution matrices including cpREV for chloroplast proteins and mtREV for vertebrate mitochondrial proteins.

Conclusions/Significance

The present codon-based model, with the ML estimates of selective constraints and with adjustable nucleotide mutation rates, would be useful as a simple substitution model in ML and Bayesian inferences of molecular phylogenetic trees, and enables us to obtain biologically meaningful information at both nucleotide and amino acid levels from codon and protein sequences.

19.
In this article, a new approach is presented for estimating the efficiencies of nucleotide substitution models in a four-taxon case, and this approach is then used to estimate the relative efficiencies of six substitution models under a wide variety of conditions. In this approach, efficiencies of the models are estimated by using simple probability distribution theory. To assess the accuracy of the new approach, efficiencies of the models are also estimated by using the direct estimation method. Simulation results from the direct estimation method confirmed that the new approach is highly accurate. The success of the new approach opens a unique opportunity to develop analytical methods for estimating the relative efficiencies of the substitution models in a straightforward way.

20.
Patterns of nucleotide substitution in pseudogenes and functional genes   Total citations: 26, self-citations: 0, citations by others: 26
Summary: The pattern of point mutations is inferred from nucleotide substitutions in pseudogenes. The pattern obtained suggests that transition mutations occur somewhat more frequently than transversion mutations and that mutations result more often in A or T than in G or C. Our results are discussed with respect to the predictions from Topal and Fresco's model for the molecular basis of point (substitution) mutations (Nature 263:285–289, 1976). The pattern of nucleotide substitution at the first and second positions of codons in functional genes is quite similar to that in pseudogenes, but the relative frequency of the transition C→T in the sense strand is drastically reduced and those of the transversions C→G and G→C are doubled. The differences between the two patterns can be explained by the observation that in the protein evolution amino acid substitutions occur mainly between amino acids with similar biochemical properties (Grantham, Science 185:862–864, 1974). Our results for the patterns of nucleotide substitutions in pseudogenes and in functional genes lead to the prediction that both the coding and non-coding regions of protein coding genes should have high frequencies of A and T. Available data show that the non-coding regions are indeed high in A and T but the coding regions are low in T, though high in A.
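The distinction this abstract draws between transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or the reverse) can be made concrete with a small tally over a pair of aligned sequences. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the example sequences are toy strings rather than data from the paper, and sites containing anything other than A, C, G or T are simply skipped.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a, b):
    """Return 'none', 'transition' or 'transversion' for a pair of bases."""
    a, b = a.upper(), b.upper()
    if a == b:
        return "none"
    pair = {a, b}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_substitutions(seq1, seq2):
    """Tally transitions and transversions between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    counts = {"transition": 0, "transversion": 0}
    valid = PURINES | PYRIMIDINES
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in valid or b not in valid:
            continue  # skip gaps and ambiguity codes
        kind = classify_substitution(a, b)
        if kind != "none":
            counts[kind] += 1
    return counts


if __name__ == "__main__":
    # Toy alignment for illustration only.
    print(count_substitutions("ACGT", "GCTT"))
```

In the toy alignment, A→G is counted as a transition and G→T as a transversion, while the two unchanged sites contribute nothing.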
