20 similar articles found; search time: 15 ms
1.
Longitudinal samples of DNA sequences, i.e., DNA sequences sampled from the same population at different time points, have increasingly been used in recent years to study the evolutionary process of fast-evolving organisms, e.g., RNA viruses. We propose in this article several methods for testing genetical isochronism or detecting significant genetical heterochronism in this type of sample. These methods can be used to determine the necessary sample size and sampling interval in experimental design or to combine genetically isochronic samples for better data analysis. Using simulation, we investigate the properties of these test statistics, including their power to detect heterochronism, under different evolutionary processes. The possible choices and usages of these test statistics are discussed.
2.
Characterizing the time dependency of human mitochondrial DNA mutation rate estimates (total citations: 2, self-citations: 0, citations by others: 2)
Previous research has established a discrepancy of nearly an order of magnitude between pedigree-based and phylogeny-based (human vs. chimpanzee) estimates of the mitochondrial DNA (mtDNA) control region mutation rate. We characterize the time dependency of the human mitochondrial hypervariable region one mutation rate by generating 14 new phylogeny-based mutation rate estimates using within-human comparisons and archaeological dates. Rate estimates based on population events between 15,000 and 50,000 years ago are at least 2-fold lower than pedigree-based estimates. These within-human estimates are also higher than estimates generated from phylogeny-based human–chimpanzee comparisons. Our new estimates establish a rapid decay in evolutionary mutation rate between approximately 2,500 and 50,000 years ago and a slow decay from 50,000 to 6 Ma. We then extend this analysis to the mtDNA-coding region. Our within-human coding region mutation rate estimates display a similar, though less rapid, time-dependent decay. We explore the possibility that multiple hits explain the discrepancy between pedigree-based and phylogeny-based mutation rates. We conclude that whereas nucleotide substitution models incorporating multiple hits do provide a possible explanation for the discrepancy between pedigree-based and human–chimpanzee mutation rate estimates, they do not explain the rapid decline of within-human rate estimates. We propose that demographic processes such as serial bottlenecks prior to the Holocene could explain the difference between rates estimated before and after 15,000 years ago. Our findings suggest that human mtDNA estimates of dates of population and phylogenetic events should be adjusted in light of this time dependency of the mutation rate estimates.
3.
Estimating effective population size and migration rates from genetic samples over space and time (total citations: 10, self-citations: 0, citations by others: 10)
In the past, moment and likelihood methods have been developed to estimate the effective population size (N(e)) on the basis of the observed changes of marker allele frequencies over time, and these have been applied to a large variety of species and populations. Such methods invariably make the critical assumption of a single isolated population receiving no immigrants over the study interval. For most populations in the real world, however, migration is not negligible and can substantially bias estimates of N(e) if it is not accounted for. Here we extend previous moment and maximum-likelihood methods to allow the joint estimation of N(e) and migration rate (m) using genetic samples over space and time. It is shown that, compared to genetic drift acting alone, migration results in changes in allele frequency that are greater in the short term and smaller in the long term, leading to under- and overestimation of N(e), respectively, if it is ignored. Extensive simulations are run to evaluate the newly developed moment and likelihood methods, which yield generally satisfactory estimates of both N(e) and m for populations with widely different effective sizes and migration rates and patterns, given a reasonably large sample size and number of markers.
4.
Microsatellites are short tandem repeats that are widely dispersed among eukaryotic genomes. Many of them are highly polymorphic, and they have been used widely in genetic studies. Statistical properties of all measures of genetic variation at microsatellites critically depend upon the composite parameter θ = 4Nμ, where N is the effective population size and μ is the mutation rate per locus per generation. Since mutation leads to expansion or contraction of a repeat number in a stepwise fashion, the stepwise mutation model has been widely used to study the dynamics of these loci. We developed an estimator of θ, θ̂_F, on the basis of sample homozygosity under the single-step stepwise mutation model. The estimator is unbiased and is much more efficient than the variance-based estimator under the single-step stepwise mutation model. It also has smaller bias and mean square error (MSE) than the variance-based estimator when the mutation follows the multistep generalized stepwise mutation model. Compared with the maximum-likelihood estimator θ̂_L, θ̂_F has less bias and smaller MSE in general. θ̂_L has a slight advantage when θ is small, but in such a situation the bias in θ̂_L may be more of a concern.
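As a rough illustration of how sample homozygosity relates to θ, the sketch below inverts the classical expectation for the single-step stepwise mutation model, F = 1/sqrt(1 + 2θ). This is only a naive moment inversion under that assumption, not the unbiased estimator developed in the abstract above; the function names are hypothetical.

```python
from collections import Counter


def sample_homozygosity(alleles):
    """Unbiased sample homozygosity from a list of allele labels.

    Computed as sum(c * (c - 1)) / (n * (n - 1)), i.e. the probability that
    two gene copies drawn without replacement carry the same allele.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two gene copies")
    counts = Counter(alleles)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def theta_from_homozygosity(f):
    """Naive moment inversion of F = 1 / sqrt(1 + 2*theta).

    Assumption: single-step stepwise mutation model at equilibrium.
    This is NOT the bias-corrected estimator described in the abstract.
    """
    if not 0 < f <= 1:
        raise ValueError("homozygosity must lie in (0, 1]")
    return (1.0 / f ** 2 - 1.0) / 2.0


if __name__ == "__main__":
    # Hypothetical repeat counts observed at one microsatellite locus.
    repeats = [10, 10, 11, 11, 11, 12, 10, 11]
    f_hat = sample_homozygosity(repeats)
    print(f_hat, theta_from_homozygosity(f_hat))
```

Because 1/F² is a convex function of F, plugging in a noisy homozygosity tends to overestimate θ, which is one reason a bias-corrected estimator is of interest.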
5.
Mammalian DNA replication: mutation biases and the mutation rate (total citations: 4, self-citations: 0, citations by others: 4)
K H Wolfe 《Journal of theoretical biology》1991,149(4):441-451
Experimental studies have shown that the fidelity of DNA replication can be affected by the concentrations of free deoxyribonucleotides present in the cell. Replication of mammalian chromosomes is achieved using pools of newly-synthesized deoxyribonucleotides which fluctuate during the cell cycle. Since regions of mammalian chromosomes are replicated sequentially, there is the potential for differences among mammalian loci in both the relative and absolute frequencies of the various transitional and transversional mutations which may occur. Where these mutations are effectively neutral, at silent sites in genes and in non-coding sequences, this may result in different rates of evolution and in different base compositions, as have been observed in data from mammalian genes. A simple model of the DNA replication process is developed to describe how the mutation rate could be affected by the G + C contents of the deoxyribonucleotide pools and of the replicating DNA. Mutation rates are predicted to vary from locus to locus; only in the particular case of identical G + C contents in the DNA locus and the deoxyribonucleotide pools, and no proofreading, will the mutation rate be uniform over all loci.
6.
Background
The population mutation rate (θ) remains one of the most fundamental parameters in genetics, ecology, and evolutionary biology. However, its accurate estimation can be seriously compromised when working with error-prone data such as expressed sequence tags, low-coverage draft sequences, and other such unfinished products. This study is premised on the simple idea that a random sequence error due to a chance accident during data collection or recording will be distributed within a population dataset as a singleton (i.e., as a polymorphic site where one sampled sequence exhibits a unique base relative to the common nucleotide of the others). Thus, one can avoid these random errors by ignoring the singletons within a dataset.
7.
Estimating divergence dates from molecular sequences (total citations: 12, self-citations: 13, citations by others: 12)
The ability to date the time of divergence between lineages using molecular
data provides the opportunity to answer many important questions in
evolutionary biology. However, molecular dating techniques have previously
been criticized for failing to adequately account for variation in the rate
of molecular evolution. We present a maximum-likelihood approach to
estimating divergence times that deals explicitly with the problem of rate
variation. This method has many advantages over previous approaches
including the following: (1) a rate constancy test excludes data for which
rate heterogeneity is detected; (2) date estimates are generated with
confidence intervals that allow the explicit testing of hypotheses
regarding divergence times; and (3) a range of sequences and fossil dates
are used, removing the reliance on a single calculated calibration rate. We
present tests of the accuracy of our method, which show it to be robust to
the effects of some modes of rate variation. In addition, we test the
effect of substitution model and length of sequence on the accuracy of the
dating technique. We believe that the method presented here offers
solutions to many of the problems facing molecular dating and provides a
platform for future improvements to such analyses.
8.
J Felsenstein 《Genetical research》1992,60(3):209-220
We would like to use maximum likelihood to estimate parameters such as the effective population size N(e) or, if we do not know mutation rates, the product 4N(e)μ of mutation rate per site and effective population size. To compute the likelihood for a sample of unrecombined nucleotide sequences taken from a random-mating population it is necessary to sum over all genealogies that could have led to the sequences, computing for each one the probability that it would have yielded the sequences, and weighting each one by its prior probability. The genealogies vary in tree topology and in branch lengths. Although the likelihood and the prior are straightforward to compute, the summation over all genealogies seems at first sight hopelessly difficult. This paper reports that it is possible to carry out a Monte Carlo integration to evaluate the likelihoods approximately. The method uses bootstrap sampling of sites to create data sets for each of which a maximum likelihood tree is estimated. The resulting trees are assumed to be sampled from a distribution whose height is proportional to the likelihood surface for the full data. That it will be so is dependent on a theorem which is not proven, but seems likely to be true if the sequences are not short. One can use the resulting estimated likelihood curve to make a maximum likelihood estimate of the parameter of interest, N(e) or 4N(e)μ. The method requires at least 100 times the computational effort required for estimation of a phylogeny by maximum likelihood, but is practical on today's workstations. The method does not at present have any way of dealing with recombination.
9.
Longitudinal samples of DNA sequences are DNA sequences sampled from the same population at different time points. For fast-evolving organisms, e.g., RNA viruses, these kinds of samples have increasingly been used to study the evolutionary process in action. Longitudinal samples provide some interesting new summary statistics of genetic variation, such as the frequency of mutations of size i in one sample and size j in another, the average number of mutations accumulated since the common ancestor of two sequences each from a different sample, and the number of private, shared and fixed mutations within samples. To make the results more applicable, we used in this study a general two-sample model, which assumes two longitudinal samples were taken from the same measurably evolving population. Inspired by the HIV study, we also studied a two-sample-two-stage model, which is a special case of the two-sample model and assumes that a treatment after the first sampling instantaneously changes the population size. We derived the formulas for calculating statistical properties, e.g., expectations, variances and covariances, of these new summary statistics under the two models. Potential applications of these results are discussed.
10.
Although mutation rates are a key determinant of the rate of evolution, they are difficult to measure precisely, and global mutation rates (mutations per genome per generation) are often extrapolated from the per-base-pair mutation rate assuming that the mutation rate is uniform across the genome. Using budding yeast, we describe an improved method for the accurate calculation of mutation rates based on the fluctuation assay. Our analysis suggests that the per-base-pair mutation rates at two genes differ significantly (3.80 × 10^-10 at URA3 and 6.44 × 10^-10 at CAN1), and we propose a definition for the effective target size of genes (the probability that a mutation inactivates the gene) that acknowledges that the mutation rate is nonuniform across the genome.
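For orientation, the simplest way to turn fluctuation-assay counts into a mutation rate is the classical Luria-Delbrück p0 method, sketched below. This is the textbook baseline, not the improved method the abstract describes; the function names and the example numbers are hypothetical.

```python
import math


def mutations_per_culture_p0(n_cultures, n_without_mutants):
    """Expected number of mutation events per culture via the p0 method.

    If mutation events per culture are Poisson with mean m, the fraction of
    cultures with no mutants is p0 = exp(-m), so m = -ln(p0).
    """
    if n_cultures <= 0:
        raise ValueError("n_cultures must be positive")
    p0 = n_without_mutants / n_cultures
    if not 0 < p0 < 1:
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    return -math.log(p0)


def mutation_rate_per_cell(m, final_cells_per_culture):
    """Common approximation: events per culture divided by final cell number."""
    return m / final_cells_per_culture


if __name__ == "__main__":
    # Hypothetical assay: 72 parallel cultures, 30 of them with no mutant colonies,
    # each grown to about 2e7 cells.
    m = mutations_per_culture_p0(72, 30)
    print(m, mutation_rate_per_cell(m, 2e7))
```

The p0 method discards the information in the actual mutant counts, which is one reason likelihood-based analyses of the full count distribution are generally preferred.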
11.
12.
13.
14.
High spontaneous mutation frequency in shuttle vector sequences recovered from mammalian cellular DNA. (total citations: 10, self-citations: 9, citations by others: 10)
The recombinant shuttle vector pSV2gpt was introduced into V79 Chinese hamster cells, and stable transformants expressing the Escherichia coli gpt gene were selected. Two transformants carrying tandem duplications of the plasmid at a single site were identified and fused to simian COS-1 cells. Plasmid DNA recovered from the heterokaryons was used to transform a Gpt- derivative of E. coli HB101, and the relative frequency of plasmids carrying a mutation in the gpt gene was determined. The high frequency of Gpt- plasmids (ca. 1%) was similar to that observed when plasmid was recovered from COS-1 cells which had been transfected with pSV2gpt. Most of the mutant plasmids had rearrangements in the region containing the gpt gene.
15.
J Felsenstein 《Genetical research》1992,59(2):139-147
It is known that under neutral mutation at a known mutation rate a sample of nucleotide sequences, within which there is assumed to be no recombination, allows estimation of the effective size of an isolated population. This paper investigates the case of very long sequences, where each pair of sequences allows a precise estimate of the divergence time of those two gene copies. The average divergence time of all pairs of copies estimates twice the effective population number and an estimate can also be derived from the number of segregating sites. One can alternatively estimate the genealogy of the copies. This paper shows how a maximum likelihood estimate of the effective population number can be derived from such a genealogical tree. The pairwise and the segregating sites estimates are shown to be much less efficient than this maximum likelihood estimate, and this is verified by computer simulation. The result implies that there is much to gain by explicitly taking the tree structure of these genealogies into account.
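The two summary-statistic estimators contrasted with maximum likelihood above correspond, in their standard form, to the mean number of pairwise differences and to Watterson's segregating-sites estimator of θ = 4N(e)μ. The sketch below computes both from a small alignment; it assumes equal-length, gap-free sequences, and the function names are hypothetical. It illustrates the standard statistics, not the paper's long-sequence derivation.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns with more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs):
    """Watterson's estimator: S / a_n, with a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy alignment of three gene copies.
    alignment = ["AAAA", "AAAT", "AATT"]
    print(watterson_theta(alignment), mean_pairwise_differences(alignment))
```

Both statistics estimate θ for the whole sequence; dividing by the sequence length gives a per-site value, and dividing further by 4μ (if μ is known) gives an estimate of N(e).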
16.
Lawrence C. Shimmin Benny Hung-Junn Chang David Hewett-Emmett Wen-Hsiung Li 《Journal of molecular evolution》1993,37(2):160-166
It is commonly believed that the rate of mutation is much higher in males than in females because the number of germ-cell divisions per generation is much larger in males than in females. However, the precise magnitude of the male-to-female mutation rate ratio (αm) remains unknown. Recently there have been efforts to estimate αm by using DNA sequence data from different species. We have studied the potential problems in such an approach. We found that the rate of synonymous substitution varies about fivefold among X-linked genes, as large as the variation among autosomal genes. This large variation makes the assumption of selective neutrality of synonymous changes dubious, so one should be cautious in using the synonymous rates in X-linked and autosomal genes to estimate αm. A similar difficulty was also observed in using nonhomologous intron sequences to estimate αm. Contrary to the expectation that X-linked sequences should evolve more slowly than autosomal sequences, the Alu repeat in the last intron of the X-linked zinc finger gene has evolved faster than the four autosomal Alu repeats used in this study. It appears that the best way to estimate αm is to use homologous sequences. However, such sequences may be involved in gene conversion events. In fact, we found evidence that the Y-linked and X-linked zinc finger genes have been involved in multiple conversion events during primate evolution. Thus, the possibility of gene conversion should be considered when using homologous sequences to estimate αm. Presented at the NATO Advanced Research Workshop on Genome Organization and Evolution, Spetsai, Greece, 16–22 September 1992.
17.
Lüsebrink J Schildgen V Tillmann RL Wittleben F Böhmer A Müller A Schildgen O 《PloS one》2011,6(5):e19457
Parvoviruses are single-stranded DNA viruses that replicate by a so-called "rolling-hairpin" mechanism, a variant of the rolling-circle replication known for bacteriophages like φX174. The replication intermediates of parvoviruses are thus concatemers with head-to-head or tail-to-tail structure. Surprisingly, in the case of the novel human bocavirus, neither head-to-head nor tail-to-tail DNA sequences were detected in clinical isolates; in contrast, head-to-tail DNA sequences were identified by PCR and sequencing. The head-to-tail sequences were linked by a novel sequence of 54 bp, of which 20 bp also occur as conserved structures of the palindromic ends of parvovirus MVC, which in turn is a close relative of human bocavirus.
18.
L Kádasi 《Human heredity》1989,39(2):67-74
Recombination between the marker locus and disease locus introduces a risk of diagnostic error that must be considered when performing indirect diagnosis of monogenic disorders by means of a linked DNA polymorphism or another marker. A method is presented which improves the hitherto used estimates of the magnitude of this error. Principally, it makes use of the fact that recombination between marker and disease locus need not necessarily increase the error rate; if it occurs twice or several times during the diagnostic process, the final diagnosis may be correct.
19.
Estimating the contribution of mutation, recombination and gene conversion in the generation of haplotypic diversity (total citations: 3, self-citations: 0, citations by others: 3)
Recombination occurs through both homologous crossing over and homologous gene conversion during meiosis. The contribution of recombination relative to mutation is expected to be dramatically reduced in inbreeding organisms. We report coalescent-based estimates of the recombination parameter (rho) relative to estimates of the mutation parameter (theta) for 18 genes from the highly self-fertilizing grass, wild barley, Hordeum vulgare ssp. spontaneum. Estimates of rho/theta are much greater than expected, with a mean rho/theta approximately 1.5, similar to estimates from outcrossing species. We also estimate rho with and without the contribution of gene conversion. Genotyping errors can mimic the effect of gene conversion, upwardly biasing estimates of the role of conversion. Thus we report a novel method for identifying genotyping errors in nucleotide sequence data sets. We show that there is evidence for gene conversion in many large nucleotide sequence data sets including our data that have been purged of all detectable sequencing errors and in data sets from Drosophila melanogaster, D. simulans, and Zea mays. In total, 13 of 27 loci show evidence of gene conversion. For these loci, gene conversion is estimated to contribute an average of twice as much as crossing over to total recombination.
20.
Curtis Strobeck 《Theoretical population biology》1983,24(2):160-172
The sampling theory for the infinite site model taking into account the phylogenetic relationship between the alleles is developed for those cases in which two or three alleles are observed in the sample. From this theory a maximum likelihood estimate of θ = 4Nμ can be obtained. Unlike the maximum likelihood estimate of θ based on the infinite allele model or the number of segregating sites, this estimate of θ is a function of the frequencies of the alleles. This method is used to estimate θ for mitochondrial DNA in Drosophila melanogaster and D. virilis from data obtained by Shah and Langley (1979. Nature (London) 281, 696–699) using restriction endonucleases.