Similar literature
 20 similar articles found (search time: 15 ms)
1.
We estimate parameters of a general isolation-with-migration model using resequence data from mitochondrial DNA (mtDNA), the Y chromosome, and two loci on the X chromosome in samples of 25-50 individuals from each of 10 human populations. Application of a coalescent-based Markov chain Monte Carlo technique allows simultaneous inference of divergence times, rates of gene flow, and changes in effective population size. Results from comparisons between sub-Saharan African and Eurasian populations estimate that 1500 individuals founded the ancestral Eurasian population approximately 40 thousand years ago (KYA). Furthermore, these small Eurasian founding populations appear to have grown much more dramatically than either African or Oceanian populations. Analyses of sub-Saharan African populations provide little evidence for a history of population bottlenecks and suggest that they began diverging from one another upward of 50 KYA. We surmise that ancestral African populations had already been geographically structured prior to the founding of ancestral Eurasian populations. African populations are shown to experience low levels of mitochondrial DNA gene flow, but high levels of Y chromosome gene flow. In particular, Y chromosome gene flow appears to be asymmetric, i.e., from the Bantu-speaking population into other African populations. Conversely, mitochondrial gene flow is more extensive between non-African populations, but appears to be absent between European and Asian populations.

2.
Rannala B, Yang Z. Genetics, 2003, 164(4): 1645-1656
The effective population sizes of ancestral as well as modern species are important parameters in models of population genetics and human evolution. The commonly used method for estimating ancestral population sizes, based on counting mismatches between the species tree and the inferred gene trees, is highly biased as it ignores uncertainties in gene tree reconstruction. In this article, we develop a Bayes method for simultaneous estimation of the species divergence times and current and ancestral population sizes. The method uses DNA sequence data from multiple loci and extracts information about conflicts among gene tree topologies and coalescent times to estimate ancestral population sizes. The topology of the species tree is assumed known. A Markov chain Monte Carlo algorithm is implemented to integrate over uncertain gene trees and branch lengths (or coalescence times) at each locus as well as species divergence times. The method can handle any species tree and allows different numbers of sequences at different loci. We apply the method to published noncoding DNA sequences from the human and the great apes. There are strong correlations between posterior estimates of speciation times and ancestral population sizes. With the use of an informative prior for the human-chimpanzee divergence date, the population size of the common ancestor of the two species is estimated to be approximately 20,000, with a 95% credibility interval (8000, 40,000). Our estimates, however, are affected by model assumptions as well as data quality. We suggest that reliable estimates must await more data and more realistic models.

3.
Yang Z. Genetics, 2002, 162(4): 1811-1823
Polymorphisms in an ancestral population can cause conflicts between gene trees and the species tree. Such conflicts can be used to estimate ancestral population sizes when data from multiple loci are available. In this article I extend previous work for estimating ancestral population sizes to analyze sequence data from three species under a finite-site nucleotide substitution model. Both maximum-likelihood (ML) and Bayes methods are implemented for joint estimation of the two speciation dates and the two population size parameters. Both methods account for uncertainties in the gene tree due to few informative sites at each locus and make efficient use of information in the data. The Bayes algorithm using Markov chain Monte Carlo (MCMC) enjoys a computational advantage over ML and also provides a framework for incorporating prior information about the parameters. The methods are applied to a data set of 53 nuclear noncoding contigs from human, chimpanzee, and gorilla published by Chen and Li. Estimates of the effective population size for the common ancestor of humans and chimpanzees by both ML and Bayes methods are approximately 12,000-21,000, comparable to estimates for modern humans, and do not support the notion of a dramatic size reduction in early human populations. Estimates published previously from the same data are several times larger and appear to be biased due to methodological deficiency. The divergence between humans and chimpanzees is dated at approximately 5.2 million years ago and the gorilla divergence 1.1-1.7 million years earlier. The analysis suggests that typical data sets contain useful information about the ancestral population sizes and that it is advantageous to analyze data of several species simultaneously.

4.
The purpose of this study was to test for evidence that savannah baboons (Papio cynocephalus) underwent a population expansion in concert with a hypothesized expansion of African human and chimpanzee populations during the late Pleistocene. The rationale is that any type of environmental event sufficient to cause simultaneous population expansions in African humans and chimpanzees would also be expected to affect other codistributed mammals. To test for genetic evidence of population expansion or contraction, we performed a coalescent analysis of multilocus microsatellite data using a hierarchical Bayesian model. Markov chain Monte Carlo (MCMC) simulations were used to estimate the posterior probability density of demographic and genealogical parameters. The model was designed to allow interlocus variation in mutational and demographic parameters, which made it possible to detect aberrant patterns of variation at individual loci that could result from heterogeneity in mutational dynamics or from the effects of selection at linked sites. Results of the MCMC simulations were consistent with zero variance in demographic parameters among loci, but there was evidence for a 10- to 20-fold difference in mutation rate between the most slowly and most rapidly evolving loci. Results of the model provided strong evidence that savannah baboons have undergone a long-term historical decline in population size. The mode of the highest posterior density for the joint distribution of current and ancestral population size indicated a roughly eightfold contraction over the past 1,000 to 250,000 years. These results indicate that savannah baboons apparently did not share a common demographic history with other codistributed primate species.

5.
Ewing G, Nicholls G, Rodrigo A. Genetics, 2004, 168(4): 2407-2420
We present a Bayesian statistical inference approach for simultaneously estimating mutation rate, population sizes, and migration rates in an island-structured population, using temporal and spatial sequence data. Markov chain Monte Carlo is used to collect samples from the posterior probability distribution. We demonstrate that this chain implementation successfully reaches equilibrium and recovers the true parameter values for simulated data. A real HIV DNA sequence data set with two demes, semen and blood, is used as an example to demonstrate the method by fitting asymmetric migration rates and different population sizes. This data set exhibits a bimodal joint posterior distribution, with modes favoring different preferred migration directions. This full data set was subsequently split temporally for further analysis. Qualitative behavior of one subset was similar to the bimodal distribution observed with the full data set. The temporally split data showed significant differences in the posterior distributions and estimates of parameter values over time.
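Markov chain Monte Carlo sampling from a posterior, the technique shared by most of the abstracts in this list, can be illustrated with a minimal random-walk Metropolis sketch. This is not the sampler of any listed paper: the toy target (a unit-variance normal centred at 2), the function names, and all tuning values are assumptions made for illustration.

```python
import math
import random


def toy_log_posterior(x):
    """Hypothetical target: unnormalised log density of Normal(mean=2, sd=1)."""
    return -0.5 * (x - 2.0) ** 2


def metropolis(log_post, start, step, n_iter, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional log-posterior."""
    rng = random.Random(seed)
    x = start
    lp = log_post(x)
    samples = []
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


def posterior_mean(samples, burn_in):
    """Average the chain after discarding the burn-in period."""
    kept = samples[burn_in:]
    return sum(kept) / len(kept)
```

Calling `posterior_mean(metropolis(toy_log_posterior, 0.0, 1.0, 20000), 2000)` should give a value close to 2; a real analysis would also check convergence, for example by comparing several chains started from different points.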

6.
Comparison of the performance and accuracy of different inference methods, such as maximum likelihood (ML) and Bayesian inference, is difficult because the inference methods are implemented in different programs, often written by different authors. Both methods were implemented in the program MIGRATE, which estimates population genetic parameters, such as population sizes and migration rates, using coalescence theory. Both inference methods use the same Markov chain Monte Carlo algorithm and differ from each other in only two aspects: parameter proposal distribution and maximization of the likelihood function. Using simulated datasets, the Bayesian method generally fares better than the ML approach in accuracy and coverage, although for some values the two approaches are equal in performance. MOTIVATION: The Markov chain Monte Carlo-based ML framework can fail on sparse data and can deliver non-conservative support intervals. A Bayesian framework with appropriate prior distribution is able to remedy some of these problems. RESULTS: The program MIGRATE was extended to allow not only for maximum likelihood (ML) estimation of population genetics parameters but also for using a Bayesian framework. Comparisons between the Bayesian approach and the ML approach are facilitated because both modes estimate the same parameters under the same population model and assumptions.

7.
A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models is examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
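The AIC and BIC comparison mentioned above can be sketched as follows, using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The log-likelihood values and the alignment length are invented for illustration and do not come from the article; the parameter counts are the usual free substitution parameters of each model, excluding branch lengths.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical fits: model name -> (maximised log-likelihood, free parameters).
candidate_fits = {
    "JC69": (-5200.0, 0),
    "HKY85": (-5100.0, 4),
    "GTR": (-5098.0, 8),
}
n_sites = 1000  # assumed alignment length


def best_model(fits, criterion):
    """Return the model name that minimises the given criterion."""
    return min(fits, key=lambda name: criterion(*fits[name]))


best_by_aic = best_model(candidate_fits, aic)
best_by_bic = best_model(candidate_fits, lambda ll, k: bic(ll, k, n_sites))
```

With these made-up numbers both criteria select HKY85: the small likelihood gain of GTR does not pay for its four extra parameters, which mirrors the observation that the best model need not be the most complicated one.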

8.
The analysis of ancient DNA in a population genetic or phylogeographical framework is an emerging field, as traditional analytical tools were largely developed for the purpose of analysing data sampled from a single time point. Markov chain Monte Carlo approaches have been successfully developed for the analysis of heterochronous sequence data from closed panmictic populations. However, attributing genetic differences between temporal samples to mutational events between time points requires the consideration of other factors that may also result in genetic differentiation. Geographical effects are an obvious factor for species exhibiting geographical structuring of genetic variation. The departure from a closed panmictic model requires researchers to either exploit software developed for the analysis of isochronous data, take advantage of simulation approaches using algorithms developed for heterochronous data, or explore approximate Bayesian computation. Here, we review statistical approaches employed and available software for the joint analysis of ancient and modern DNA, and where appropriate we suggest how these may be further developed.

9.
We expand a coalescent-based method that uses serially sampled genetic data from a subdivided population to incorporate changes to the number of demes and patterns of colonization. Often, when estimating population parameters or other parameters of interest from genetic data, the demographic structure and parameters are not constant over evolutionary time. In this paper, we develop a Bayesian Markov chain Monte Carlo method that allows for step changes in mutation, migration, and population sizes, as well as changing numbers of demes, where the times of these changes are also estimated. We show that in parameter ranges of interest, reliable estimates can often be obtained, including the historical times of parameter changes. However, posterior densities of migration rates can be quite diffuse and estimators somewhat biased, as reported by other authors.

10.
Natural populations of wild common buckwheat have been found growing adjacent to cultivated populations of common buckwheat. Gene flow between the cultivated and natural populations would be expected in such cases. To evaluate the amount of gene flow, two sets composed of a cultivated buckwheat population and an adjacent natural population of wild common buckwheat were chosen, one from Yanjing village in the Sanjiang area, which is presumed to be the original birthplace of common buckwheat, and one from Jinhe village, Yanyuan district of Sichuan province in China. The genotypes of 45 individuals from each population were examined at eight microsatellite marker loci to estimate the magnitude of gene flow between the cultivated and wild common buckwheat populations. The Bayesian method with a Markov chain Monte Carlo approach estimated the magnitude of gene flow between the populations in the Sanjiang area at 0.002-0.008, which was not significantly different from that found in Yanyuan district (0.002-0.008). The gene flow between cultivated populations was higher, usually at 0.002-0.044 (exceptionally high at 0.255 between cultivated populations of Yanjing and Jinhe), than that found between a cultivated population and a natural population (0.002-0.008) or between two natural populations (0.002-0.003). Therefore, the genetic similarity found between the cultivated populations and natural populations observed in the Sanjiang area (Konishi et al., 2005) was not due to recent gene flow between them. This in turn suggests that the close genetic relationship found in the Sanjiang area may be due to the common ancestry of the natural populations and cultivated common buckwheat.

11.
The presence of conspecific wild-type and cultivar populations has been a common landscape feature for centuries. As orchards generally continue to expand towards the natural forest, two important issues are raised: the potential reduction of cultivar genetic diversity compared to wild populations and the extent of gene flow between the two population types. These questions were addressed in a study of Prunus avium in northern Greece using nine simple sequence repeat loci to analyse genetic variation in 93 wild-type individuals and 21 cultivars representing the local cultivated germplasm. Results showed a significant reduction of genetic diversity parameters in the cultivated germplasm compared to natural populations. Bayesian, frequency-based and Markov chain Monte Carlo analyses revealed that the wild and cultivar groups are genetically divergent and that realized between-group gene flow is almost completely absent. This result was further verified by a principal component analysis showing a clear separation of the two groups in low multivariate space after a principal coordinate analysis. The significant disjunction in flowering time and a considerable geographic distance between the two groups could primarily account for the absence of substantial gene flow. These findings indicate that local wild cherry can provide a source of genetic variation for future breeding in the genetically restricted cultivar group.

12.
An improved Bayesian method is presented for estimating phylogenetic trees using DNA sequence data. The birth-death process with species sampling is used to specify the prior distribution of phylogenies and ancestral speciation times, and the posterior probabilities of phylogenies are used to estimate the maximum posterior probability (MAP) tree. Monte Carlo integration is used to integrate over the ancestral speciation times for particular trees. A Markov Chain Monte Carlo method is used to generate the set of trees with the highest posterior probabilities. Methods are described for an empirical Bayesian analysis, in which estimates of the speciation and extinction rates are used in calculating the posterior probabilities, and a hierarchical Bayesian analysis, in which these parameters are removed from the model by an additional integration. The Markov Chain Monte Carlo method avoids the requirement of our earlier method for calculating MAP trees to sum over all possible topologies (which limited the number of taxa in an analysis to about five). The methods are applied to analyze DNA sequences for nine species of primates, and the MAP tree, which is identical to a maximum-likelihood estimate of topology, has a probability of approximately 95%.

13.
Bayesian inference of recent migration rates using multilocus genotypes   Total citations: 25, self-citations: 0, citations by others: 25
Wilson GA, Rannala B. Genetics, 2003, 163(3): 1177-1191
A new Bayesian method that uses individual multilocus genotypes to estimate rates of recent immigration (over the last several generations) among populations is presented. The method also estimates the posterior probability distributions of individual immigrant ancestries, population allele frequencies, population inbreeding coefficients, and other parameters of potential interest. The method is implemented in a computer program that relies on Markov chain Monte Carlo techniques to carry out the estimation of posterior probabilities. The program can be used with allozyme, microsatellite, RFLP, SNP, and other kinds of genotype data. We relax several assumptions of early methods for detecting recent immigrants, using genotype data; most significantly, we allow genotype frequencies to deviate from Hardy-Weinberg equilibrium proportions within populations. The program is demonstrated by applying it to two recently published microsatellite data sets for populations of the plant species Centaurea corymbosa and the gray wolf species Canis lupus. A computer simulation study suggests that the program can provide highly accurate estimates of migration rates and individual migrant ancestries, given sufficient genetic differentiation among populations and sufficient numbers of marker loci.

14.
We present a Markov chain Monte Carlo coalescent genealogy sampler, LAMARC 2.0, which estimates population genetic parameters from genetic data. LAMARC can co-estimate subpopulation Θ = 4N_e μ, immigration rates, subpopulation exponential growth rates and overall recombination rate, or a user-specified subset of these parameters. It can perform either maximum-likelihood or Bayesian analysis, and accommodates nucleotide sequence, SNP, microsatellite or electrophoretic data, with resolved or unresolved haplotypes. It is available as portable source code and executables for all three major platforms. AVAILABILITY: LAMARC 2.0 is freely available at http://evolution.gs.washington.edu/lamarc
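The scaled parameter quoted above, Theta = 4 N_e mu for a diploid population, only yields an effective population size once a per-site mutation rate is assumed. The sketch below shows that conversion; the function names and the numerical values are hypothetical and are not part of LAMARC.

```python
def theta_from_size(effective_size, mutation_rate):
    """Theta = 4 * N_e * mu for a diploid population."""
    return 4.0 * effective_size * mutation_rate


def size_from_theta(theta, mutation_rate):
    """Invert Theta = 4 * N_e * mu to recover N_e for an assumed mu."""
    return theta / (4.0 * mutation_rate)


# Assumed values for illustration only.
example_theta = 0.004  # per-site Theta
example_mu = 1e-8      # mutations per site per generation
example_size = size_from_theta(example_theta, example_mu)  # about 100,000
```

Because Theta confounds N_e and mu, any uncertainty in the assumed mutation rate carries over proportionally to the estimate of N_e.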

15.
Detecting population expansion and decline using microsatellites   Total citations: 15, self-citations: 0, citations by others: 15
Beaumont MA. Genetics, 1999, 153(4): 2013-2029
This article considers a demographic model where a population varies in size either linearly or exponentially. The genealogical history of microsatellite data sampled from this population can be described using coalescent theory. A method is presented whereby the posterior probability distribution of the genealogical and demographic parameters can be estimated using Markov chain Monte Carlo simulations. The likelihood surface for the demographic parameters is complicated and its general features are described. The method is then applied to published microsatellite data from two populations. Data from the northern hairy-nosed wombat show strong evidence of decline. Data from European humans show weak evidence of expansion.
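A population whose size changes linearly or exponentially can be sketched as below. The parameterisation is an assumption made for illustration (current size `n0`, ancestral size `n1`, change starting `ta` generations ago, constant size before that) and is not taken from the abstract, which does not give the formulas.

```python
def size_linear(t, n0, n1, ta):
    """Population size t generations ago under linear change from n1 to n0."""
    if t >= ta:
        return float(n1)  # constant ancestral size before the change began
    return n0 + (n1 - n0) * t / ta


def size_exponential(t, n0, n1, ta):
    """Population size t generations ago under exponential change from n1 to n0."""
    if t >= ta:
        return float(n1)
    return n0 * (n1 / n0) ** (t / ta)
```

For a declining population (`n0 = 100`, `n1 = 10000`, `ta = 10`), the size halfway back in time is 5050 under the linear model and 1000 under the exponential one, so the two models imply quite different coalescence rates over the same interval.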

16.
Populations can be genetically isolated both by geographic distance and by differences in their ecology or environment that decrease the rate of successful migration. Empirical studies often seek to investigate the relationship between genetic differentiation and some ecological variable(s) while accounting for geographic distance, but common approaches to this problem (such as the partial Mantel test) have a number of drawbacks. In this article, we present a Bayesian method that enables users to quantify the relative contributions of geographic distance and ecological distance to genetic differentiation between sampled populations or individuals. We model the allele frequencies in a set of populations at a set of unlinked loci as spatially correlated Gaussian processes, in which the covariance structure is a decreasing function of both geographic and ecological distance. Parameters of the model are estimated using a Markov chain Monte Carlo algorithm. We call this method Bayesian Estimation of Differentiation in Alleles by Spatial Structure and Local Ecology (BEDASSLE), and have implemented it in a user-friendly format in the statistical platform R. We demonstrate its utility with a simulation study and empirical applications to human and teosinte data sets.
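A covariance that decreases with both geographic and ecological distance can be sketched as follows. The abstract only states that the covariance is a decreasing function of the two distances; the powered-exponential form and the parameter names (`a0`, `a_geo`, `a_eco`, `a2`) used here are assumptions for illustration, not the package's API.

```python
import math


def covariance(geo_dist, eco_dist, a0, a_geo, a_eco, a2):
    """Assumed form: (1/a0) * exp(-(a_geo*D + a_eco*|E|)**a2)."""
    scaled = a_geo * geo_dist + a_eco * abs(eco_dist)
    return (1.0 / a0) * math.exp(-(scaled ** a2))


def covariance_matrix(geo, eco, a0, a_geo, a_eco, a2):
    """Build the full matrix from pairwise distance matrices (lists of lists)."""
    n = len(geo)
    return [
        [covariance(geo[i][j], eco[i][j], a0, a_geo, a_eco, a2) for j in range(n)]
        for i in range(n)
    ]
```

Under this form the ratio `a_eco / a_geo` expresses how much geographic distance one unit of ecological distance is worth, which is one way to read the "relative contributions" described in the abstract.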

17.
DNA samples of the spectacled bear (Tremarctos ornatus) from five Andean countries, Venezuela, Colombia, Ecuador, Peru and Bolivia, were analyzed for nine microsatellite loci. Seven of them were polymorphic, which led us to investigate several population-genetic parameters. Private alleles and significant differences in gene frequencies were found among the populations studied, which demonstrated the extent of genetic differentiation among the spectacled bear populations. The levels of gene diversity measured with these microsatellites were rather modest in this species. Hardy-Weinberg disequilibrium was found especially for the overall and the Ecuadorian samples, and might be due to the Wahlund effect or consanguinity. Significant genetic heterogeneity was mainly observed among the Colombian and the Ecuadorian populations. Markov chain Monte Carlo simulations clearly showed that two different gene pools were present, one in the Venezuelan-Colombian bears and the other in the Ecuadorian ones.

18.
A Bayesian approach to DNA sequence segmentation   Total citations: 3, self-citations: 0, citations by others: 3
Boys RJ, Henderson DA. Biometrics, 2004, 60(3): 573-581
Many deoxyribonucleic acid (DNA) sequences display compositional heterogeneity in the form of segments of similar structure. This article describes a Bayesian method that identifies such segments by using a Markov chain governed by a hidden Markov model. Markov chain Monte Carlo (MCMC) techniques are employed to compute all posterior quantities of interest and, in particular, allow inferences to be made regarding the number of segment types and the order of Markov dependence in the DNA sequence. The method is applied to the segmentation of the bacteriophage lambda genome, a common benchmark sequence used for the comparison of statistical segmentation algorithms.

19.
A simple population genetic model is presented for a hermaphrodite annual species, allowing both selfing and outcrossing. Those male gametes (pollen) responsible for outcrossing are assumed to disperse much further than seeds. Under this model, the pedigree of a sample from a single locality is loop-free. A novel Markov chain Monte Carlo strategy is presented for sampling from the joint posterior distribution of the pedigree of such a sample and the parameters of the population genetic model (including the selfing rate) given the genotypes of the sampled individuals at unlinked marker loci. The computational costs of this Markov chain Monte Carlo strategy scale well with the number of individuals in the sample, and the number of marker loci, but increase exponentially with the age (time since colonisation from the source population) of the local population. Consequently, this strategy is particularly suited to situations where the sample has been collected from a population which is the result of a recent colonisation process.

20.
This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.
