Similar articles
Found 20 similar articles (search time: 46 ms)
1.
Smith NG, Fearnhead P. Genetics 2005, 171(4):2051-2062
We have performed simulations to assess the performance of three population genetics approximate-likelihood methods in estimating the population-scaled recombination rate from sequence data. We measured performance in two ways: accuracy when the sequence data were simulated according to the (simplistic) standard model underlying the methods and robustness to violations of many different aspects of the standard model. Although we found some differences between the methods, performance tended to be similar for all three methods. Despite the fact that the methods are not robust to violations of the underlying model, our simulations indicate that patterns of relative recombination rates should be inferred reasonably well even if the standard model does not hold. In addition, we assess various techniques for improving the performance of approximate-likelihood methods. In particular we find that the composite-likelihood method of Hudson (2001) can be improved by including log-likelihood contributions only for pairs of sites that are separated by some prespecified distance.

2.
Wright's FST and related statistics are often used to measure the extent of divergence among populations of the same species relative to the net genetic diversity within the species. This paper compares several definitions of FST which are relevant to DNA sequence data, and shows that these must be used with care when estimating migration parameters. It is also pointed out that FST is strongly influenced by the level of within-population diversity. In situations where factors such as selection on closely linked sites are expected to have stronger effects on within-population diversity at some loci than at others, differences among loci can result entirely from differences in within-population diversities. It is shown that several published cases of differences in FST among regions of high and low recombination in Drosophila may be caused in this way. For the purpose of comparisons of levels of between-population differences among loci or species which are subject to different intensities of forces that reduce variability within local populations, absolute measures of divergence between populations should be used in preference to relative measures such as FST.
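The point that a relative measure tracks within-population diversity can be illustrated with a small sketch. It assumes one common sequence-based definition, FST = 1 - pi_within / pi_between (the abstract does not say which definitions it compares), and uses net divergence, pi_between - pi_within, as one possible absolute-style contrast. The function names and the diversity values are hypothetical.

```python
def fst_relative(pi_within, pi_between):
    """Relative differentiation, assuming FST = 1 - pi_within / pi_between."""
    if pi_between <= 0:
        raise ValueError("pi_between must be positive")
    return 1.0 - pi_within / pi_between


def net_divergence(pi_within, pi_between):
    """Between-population diversity minus within-population diversity."""
    return pi_between - pi_within


# Hypothetical loci with the same net divergence but different
# within-population diversity (for example, high vs. low recombination).
locus_high_diversity = {"pi_within": 0.010, "pi_between": 0.015}
locus_low_diversity = {"pi_within": 0.002, "pi_between": 0.007}

fst_high = fst_relative(**locus_high_diversity)
fst_low = fst_relative(**locus_low_diversity)
net_high = net_divergence(**locus_high_diversity)
net_low = net_divergence(**locus_low_diversity)
```

In this toy case the locus with reduced within-population diversity has the larger FST even though the net divergence is the same at both loci, which is the pattern the abstract cautions about.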

3.
I derive the equilibrium values of sex-specific FST parameters, in an island model for a dioecious species with sex-biased dispersal and binomial distribution of family size before dispersal (as assumed in a Wright-Fisher population). I show that FST may take different values among males and among females whenever dispersal is a trait conditioned on gender. This has not always been recognized, because some models assumed that genes are sampled before dispersal. In particular, the ratios of sex-specific FST parameters evaluated after dispersal over FST evaluated before dispersal are simple functions of sex-specific dispersal rates. Therefore, a simple moment-based estimator of sex-specific dispersal rate is proposed. This method is based on the comparison of FST estimated before and after dispersal and assumes equilibrium between migration and drift. I evaluate this method through stochastic simulations for a range of sex-specific dispersal rates and sampling effort (sample size, number of loci scored).

4.
Understanding the processes and conditions under which populations diverge to give rise to distinct species is a central question in evolutionary biology. Since recently diverged populations have high levels of shared polymorphisms, it is challenging to distinguish between recent divergence with no (or very low) inter-population gene flow and older splitting events with subsequent gene flow. Recently published methods to infer speciation parameters under the isolation-migration framework are based on summarizing polymorphism data at multiple loci in two species using the joint site-frequency spectrum (JSFS). We have developed two improvements of these methods based on a more extensive use of the JSFS classes of polymorphisms for species with high intra-locus recombination rates. First, using a likelihood based method, we demonstrate that taking into account low-frequency polymorphisms shared between species significantly improves the joint estimation of the divergence time and gene flow between species. Second, we introduce a local linear regression algorithm that considerably reduces the computational time and allows for the estimation of unequal rates of gene flow between species. We also investigate which summary statistics from the JSFS allow the greatest estimation accuracy for divergence time and migration rates for low (around 10) and high (around 100) numbers of loci. Focusing on cases with low numbers of loci and high intra-locus recombination rates we show that our methods for the estimation of divergence time and migration rates are more precise than existing approaches.

5.
Wall JD. Genetics 2004, 167(3):1461-1473
We introduce a new method for jointly estimating crossing-over and gene conversion rates using sequence polymorphism data. The method calculates probabilities for subsets of the data consisting of three segregating sites and then forms a composite likelihood by multiplying together the probabilities of many subsets. Simulations show that this new method performs better than previously proposed methods for estimating gene conversion rates, but that all methods require large amounts of data to provide reliable estimates. While existing methods can easily estimate an "average" gene conversion rate over many loci, they cannot reliably estimate gene conversion rates for a single region of the genome.

6.
Many methods for fitting demographic models to data sets of aligned sequences rely upon an assumption that the data have a branching coalescent history without recombination within regions or loci. To mitigate the effects of the failure of this assumption, a common approach is to filter data and sample regions that pass the four-gamete criterion for recombination, an approach that allows data to run, but that is expected to detect only a minority of recombination events. A series of empirical tests of this approach were conducted using computer simulations with and without recombination for a variety of isolation-with-migration (IM) models for two and three populations. Only the IMa3 program was used, but the general results should apply to related genealogy-sampling-based methods for IM models or subsets of IM models. It was found that the details of sampling intervals that pass a four-gamete filter have a moderate effect, and that schemes that use the longest intervals, or that use overlapping intervals, gave poorer results. A simple approach of using a random nonoverlapping interval returned the smallest difference between results with and without recombination, with the mean difference between parameter estimates usually less than 20% of the true value (usually much less). However, the posterior probability distributions for migration rates were flatter with recombination, suggesting that filtering based on the four-gamete criterion, while necessary for methods like these, leads to reduced resolution on migration. A distinct, alternative approach, of using a finite sites mutation model and not filtering the data, performed quite poorly.
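The four-gamete criterion mentioned above can be sketched in a few lines. This is a minimal illustration, not the filtering procedure used with IMa3: it assumes biallelic sites coded as '0'/'1' in equal-length haplotype strings, and the function names and sample data are hypothetical. A pair of sites at which all four two-site haplotypes (00, 01, 10, 11) are observed is flagged as showing evidence of recombination under an infinite-sites assumption.

```python
from itertools import combinations


def has_four_gametes(haplotypes, i, j):
    """True if all four two-site haplotypes occur at sites i and j."""
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4


def incompatible_pairs(haplotypes):
    """List every pair of sites that fails the four-gamete criterion."""
    n_sites = len(haplotypes[0])
    return [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if has_four_gametes(haplotypes, i, j)
    ]


# Hypothetical sample of four haplotypes at three biallelic sites.
sample = ["000", "010", "100", "110"]
flagged = incompatible_pairs(sample)
```

An interval of sites would pass the filter when no pair of sites inside it is flagged; how such intervals are then chosen (longest, overlapping, random nonoverlapping) is the question the abstract examines.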

7.
The prevailing wisdom of the plant mitochondrial genome is that it has very low substitution rates, thus it is generally assumed that nucleotide diversity within species will also be low. However, recent evidence suggests plant mitochondrial genes may harbor variable and sometimes high levels of within-species polymorphism, a result attributed to variance in the influence of selection. However, insufficient attention has been paid to the effect of among-gene variation in mutation rate on varying levels of polymorphism across loci. We measured levels of polymorphism in seven mitochondrial gene regions across a geographically wide sample of the plant Silene vulgaris to investigate whether individual mitochondrial genes accumulate polymorphisms equally. We found that genes vary significantly in polymorphism. Tests based on coalescence theory show that the genes vary significantly in their scaled mutation rate, which, in the absence of differences among genes in effective population size, suggests these genes vary in their underlying mutation rate. Further evidence that among-gene variance in polymorphism is due to variation in the underlying mutation rate comes from a significant positive relationship between the number of segregating sites and silent site divergence from an outgroup. Contrary to recent studies, we found unconvincing evidence of recombination in the mitochondrial genome, and generally confirm the standard model of plant mitochondria characterized by low substitution rates and no recombination. We also show no evidence of significant variation in the strength or direction of selection among genes; this result may be expected if there is no recombination. The present study provides some of the most thorough data on plant mitochondrial polymorphism, and provides compelling evidence for mutation rate variation among genes. The study also demonstrates the difficulty in establishing a null model of mitochondrial genome polymorphism, and thus the difficulty, in the absence of a comparative approach, in testing the assumption that low substitution rates in plant mitochondria lead to low polymorphism.

8.
We propose two approximate methods (one based on parsimony and one on pairwise sequence comparison) for estimating the pattern of nucleotide substitution and a parsimony-based method for estimating the gamma parameter for variable substitution rates among sites. The matrix of substitution rates that represents the substitution pattern can be recovered through its relationship with the observable matrix of site pattern frequencies in pairwise sequence comparisons. In the parsimony approach, the ancestral sequences reconstructed by the parsimony algorithm were used, and the two sequences compared are those at the ends of a branch in the phylogenetic tree. The method for estimating the gamma parameter was based on a reinterpretation of the numbers of changes at sites inferred by parsimony. Three data sets were analyzed to examine the utility of the approximate methods compared with the more reliable likelihood methods. The new methods for estimating the substitution pattern were found to produce estimates quite similar to those obtained from the likelihood analyses. The new method for estimating the gamma parameter was effective in reducing the bias in conventional parsimony estimates, although it also overestimated the parameter. The approximate methods are computationally very fast and appear useful for analyzing large data sets, for which use of the likelihood method requires excessive computation.

9.
There has been considerable recent interest in understanding the way in which recombination rates vary over small physical distances, and the extent of recombination hotspots, in various genomes. Here we adapt, apply, and assess the power of recently developed coalescent-based approaches to estimating recombination rates from sequence polymorphism data. We apply full-likelihood estimation to study rate variation in and around a well-characterized recombination hotspot in humans, in the beta-globin gene cluster, and show that it provides similar estimates, consistent with those from sperm studies, from two populations deliberately chosen to have different demographic and selectional histories. We also demonstrate how approximate-likelihood methods can be used to detect local recombination hotspots from genomic-scale SNP data. In a simulation study based on 80 100-kb regions, these methods detect 43 out of 60 hotspots (ranging from 1 to 2 kb in size), with only two false positives out of 2000 subregions that were tested for the presence of a hotspot. Our study suggests that new computational tools for sophisticated analysis of population diversity data are valuable for hotspot detection and fine-scale mapping of local recombination rates.

10.
S. A. Karl, B. W. Bowen, J. C. Avise. Genetics 1992, 131(1):163-173
We introduce an approach for the analysis of Mendelian polymorphisms in nuclear DNA (nDNA), using restriction fragment patterns from anonymous single-copy regions amplified by the polymerase chain reaction, and apply this method to the elucidation of population structure and gene flow in the endangered green turtle, Chelonia mydas. Seven anonymous clones isolated from a total cell DNA library were sequenced to generate primers for the amplification of nDNA fragments. Nine individuals were screened for restriction site polymorphisms at these seven loci, using 40 endonucleases. Two loci were monomorphic, while the remainder exhibited a total of nine polymorphic restriction sites and three size variants (reflecting 600-base pair (bp) and 20-bp deletions and a 20-bp insertion). A total of 256 turtle specimens from 15 nesting populations worldwide were then scored for these polymorphisms. Genotypic proportions within populations were in accord with Hardy-Weinberg expectations. Strong linkage disequilibrium observed among polymorphic sites within loci enabled multisite haplotype assignments. Estimates of the standardized variance in haplotype frequency among global collections (FST = 0.17), within the Atlantic-Mediterranean (FST = 0.13), and within the Indian-Pacific (FST = 0.13), revealed a moderate degree of population substructure. Although a previous study concluded that nesting populations appear to be highly structured with respect to female (mitochondrial DNA) lineages, estimates of Nm based on nDNA data from this study indicate moderate rates of male-mediated gene flow. A positive relationship between genetic similarity and geographic proximity suggests historical connections and/or contemporary gene flow between particular rookery populations, likely via matings on overlapping feeding grounds, migration corridors or nonnatal rookeries.

11.
Estimating recombination rates from population genetic data. (Total citations: 21; self-citations: 0; citations by others: 21)
P Fearnhead, P Donnelly. Genetics 2001, 159(3):1299-1318
We introduce a new method for estimating recombination rates from population genetic data. The method uses a computationally intensive statistical procedure (importance sampling) to calculate the likelihood under a coalescent-based model. Detailed comparisons of the new algorithm with two existing methods (the importance sampling method of Griffiths and Marjoram and the MCMC method of Kuhner and colleagues) show it to be substantially more efficient. (The improvement over the existing importance sampling scheme is typically by four orders of magnitude.) The existing approaches not infrequently led to misleading results on the problems we investigated. We also performed a simulation study to look at the properties of the maximum-likelihood estimator of the recombination rate and its robustness to misspecification of the demographic model.

12.
Recently diverged taxa may continue to exchange genes. A number of models of speciation with gene flow propose that the frequency of gene exchange will be lower in genomic regions of low recombination and that these regions will therefore be more differentiated. However, several population-genetic models that focus on selection at linked sites also predict greater differentiation in regions of low recombination simply as a result of faster sorting of ancestral alleles even in the absence of gene flow. Moreover, identifying the actual amount of gene flow from patterns of genetic variation is tricky, because both ancestral polymorphism and migration lead to shared variation between recently diverged taxa. New analytic methods have been developed to help distinguish ancestral polymorphism from migration. Along with a growing number of datasets of multi-locus DNA sequence variation, these methods have spawned a renewed interest in speciation models with gene flow. Here, we review both speciation and population-genetic models that make explicit predictions about how the rate of recombination influences patterns of genetic variation within and between species. We then compare those predictions with empirical data of DNA sequence variation in rabbits and mice. We find strong support for the prediction that genomic regions experiencing low levels of recombination are more differentiated. In most cases, reduced gene flow appears to contribute to the pattern, although disentangling the relative contribution of reduced gene flow and selection at linked sites remains a challenge. We suggest fruitful areas of research that might help distinguish between different models.

13.
Gay J, Myers S, McVean G. Genetics 2007, 177(2):881-894
Gene conversion plays an important part in shaping genetic diversity in populations, yet estimating the rate at which it occurs is difficult because of the short lengths of DNA involved. We have developed a new statistical approach to estimating gene conversion rates from genetic variation, by extending an existing model for haplotype data in the presence of crossover events. We show, by simulation, that when the rate of gene conversion events is at least comparable to the rate of crossover events, the method provides a powerful approach to the detection of gene conversion and estimation of its rate. Application of the method to data from the telomeric X chromosome of Drosophila melanogaster, in which crossover activity is suppressed, indicates that gene conversion occurs approximately 400 times more often than crossover events. We also extend the method to estimating variable crossover and gene conversion rates and estimate the rate of gene conversion to be approximately 1.5 times higher than the crossover rate in a region of human chromosome 1 with known recombination hotspots.

14.
Li N, Stephens M. Genetics 2003, 165(4):2213-2233
We introduce a new statistical model for patterns of linkage disequilibrium (LD) among multiple SNPs in a population sample. The model overcomes limitations of existing approaches to understanding, summarizing, and interpreting LD by (i) relating patterns of LD directly to the underlying recombination process; (ii) considering all loci simultaneously, rather than pairwise; (iii) avoiding the assumption that LD necessarily has a "block-like" structure; and (iv) being computationally tractable for huge genomic regions (up to complete chromosomes). We examine in detail one natural application of the model: estimation of underlying recombination rates from population data. Using simulation, we show that in the case where recombination is assumed constant across the region of interest, recombination rate estimates based on our model are competitive with the very best of current available methods. More importantly, we demonstrate, on real and simulated data, the potential of the model to help identify and quantify fine-scale variation in recombination rate from population data. We also outline how the model could be useful in other contexts, such as in the development of more efficient haplotype-based methods for LD mapping.

15.
Ptak SE, Voelpel K, Przeworski M. Genetics 2004, 167(1):387-397
An ability to predict levels of linkage disequilibrium (LD) between linked markers would facilitate the design of association studies and help to distinguish between evolutionary models. Unfortunately, levels of LD depend crucially on the rate of recombination, a parameter that is difficult to measure. In humans, rates of genetic exchange between markers megabases apart can be estimated from a comparison of genetic and physical maps; these large-scale estimates can then be interpolated to predict LD at smaller ("local") scales. However, if there is extensive small-scale heterogeneity, as has been recently proposed, local rates of recombination could differ substantially from those averaged over much larger distances. We test this hypothesis by estimating local recombination rates indirectly from patterns of LD in 84 genomic regions surveyed by the SeattleSNPs project in a sample of individuals of European descent and of African-Americans. We find that LD-based estimates are significantly positively correlated with map-based estimates. This implies that large-scale, average rates are informative about local rates of recombination. Conversely, although LD-based estimates are based on a number of simplifying assumptions, it appears that they capture considerable information about the underlying recombination rate or at least about the ordering of regions by recombination rate. Using LD-based estimators, we also find evidence for homologous gene conversion in patterns of polymorphism. However, as we demonstrate by simulation, inferences about gene conversion are unreliable, even with extensive data from homogeneous regions of the genome, and are confounded by genotyping error.

16.
M. Slatkin, W. P. Maddison. Genetics 1989, 123(3):603-613
A method for estimating the average level of gene flow among populations is introduced. The method provides an estimate of Nm, where N is the size of each local population in an island model and m is the migration rate. This method depends on knowing the phylogeny of the nonrecombining segments of DNA that are sampled. Given the phylogeny, the geographic location from which each sample is drawn is treated as a multistate character with one state for each geographic location. A parsimony criterion applied to the evolution of this character on the phylogeny provides the minimum number of migration events consistent with the phylogeny. Extensive simulations show that the distribution of this minimum number is a simple function of Nm. Assuming the phylogeny is accurately estimated, this method provides an estimate of Nm that is nearly as accurate as estimates obtained using FST and other statistics when Nm is moderate. Two examples of the use of this method with mitochondrial DNA data are presented.
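The parsimony step described in this abstract, counting the minimum number of migration events when sampling location is treated as a character on a known phylogeny, can be sketched as follows. The sketch uses Fitch parsimony on a rooted binary tree written as nested 2-tuples; the choice of Fitch's algorithm, the tree encoding, the function name, and the example trees are assumptions made for illustration. The abstract specifies only "a parsimony criterion", and the subsequent mapping from this count to an estimate of Nm (obtained there by simulation) is not reproduced.

```python
def fitch_min_changes(tree):
    """Return (possible root states, minimum number of state changes).

    `tree` is either a leaf (a location label given as a string) or a
    2-tuple (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_changes = fitch_min_changes(left)
    right_states, right_changes = fitch_min_changes(right)
    shared = left_states & right_states
    if shared:
        # The two subtrees can agree on a state: no extra change needed.
        return shared, left_changes + right_changes
    # No common state: one change (migration event) on this node's branches.
    return left_states | right_states, left_changes + right_changes + 1


# Hypothetical genealogies of five and four sampled sequences,
# with leaves labelled by sampling location "A" or "B".
tree_one = ((("A", "A"), "B"), ("B", "B"))
tree_two = (("A", "B"), ("A", "B"))

_, events_one = fitch_min_changes(tree_one)
_, events_two = fitch_min_changes(tree_two)
```

Under these assumptions, the first tree requires a single location change and the second requires two; fewer inferred events for a given sample would correspond to lower gene flow.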

17.
Anisimova M, Nielsen R, Yang Z. Genetics 2003, 164(3):1229-1236
Maximum-likelihood methods based on models of codon substitution accounting for heterogeneous selective pressures across sites have proved to be powerful in detecting positive selection in protein-coding DNA sequences. Those methods are phylogeny based and do not account for the effects of recombination. When recombination occurs, such as in population data, no unique tree topology can describe the evolutionary history of the whole sequence. This violation of assumptions raises serious concerns about the likelihood method for detecting positive selection. Here we use computer simulation to evaluate the reliability of the likelihood-ratio test (LRT) for positive selection in the presence of recombination. We examine three tests based on different models of variable selective pressures among sites. Sequences are simulated using a coalescent model with recombination and analyzed using codon-based likelihood models ignoring recombination. We find that the LRT is robust to low levels of recombination (with fewer than three recombination events in the history of a sample of 10 sequences). However, at higher levels of recombination, the type I error rate can be as high as 90%, especially when the null model in the LRT is unrealistic, and the test often mistakes recombination as evidence for positive selection. The test that compares the more realistic models M7 (beta) against M8 (beta and omega) is more robust to recombination, where the null model M7 allows the positive selection pressure to vary between 0 and 1 (and so does not account for positive selection), and the alternative model M8 allows an additional discrete class with omega = d(N)/d(S) that could be estimated to be >1 (and thus accounts for positive selection). Identification of sites under positive selection by the empirical Bayes method appears to be less affected than the LRT by recombination.
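The mechanics of a likelihood-ratio test such as M7 versus M8 can be sketched briefly. The log-likelihood values below are hypothetical placeholders, not results from the study, and the chi-square reference distribution with 2 degrees of freedom is an assumption not stated in the abstract (it is a conventional choice because M8 adds two parameters to M7). For 2 degrees of freedom the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_df2_pvalue(statistic):
    """Upper-tail p-value of a chi-square variable with 2 degrees of freedom."""
    if statistic <= 0:
        return 1.0
    return math.exp(-statistic / 2.0)


# Hypothetical maximized log-likelihoods for the null (M7) and
# alternative (M8) models fitted to the same alignment.
lnl_m7 = -1000.0
lnl_m8 = -995.0

statistic = lrt_statistic(lnl_m7, lnl_m8)
p_value = chi2_df2_pvalue(statistic)
```

The abstract's finding is that, with substantial recombination, a small p-value from this kind of test may reflect recombination rather than positive selection, so the nominal p-value should be read with that caveat.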

18.
We propose a novel method for detecting sites of molecular recombination in multiple alignments. Our approach is a compromise between previous extremes of computationally prohibitive but mathematically rigorous methods and imprecise heuristic methods. Using a combined algorithm for estimating tree structure and hidden Markov model parameters, our program detects changes in phylogenetic tree topology over a multiple sequence alignment. We evaluate our method on benchmark datasets from previous studies on two recombinant pathogens, Neisseria and HIV-1, as well as simulated data. We show that we are not only able to detect recombinant regions of vastly different sizes but also the location of breakpoints with great accuracy. We show that our method does well inferring recombination breakpoints while at the same time maintaining practicality for larger datasets. In all cases, we confirm the breakpoint predictions of previous studies, and in many cases we offer novel predictions.

19.
MOTIVATION: Accurate detection of positive Darwinian selection can provide important insights to researchers investigating the evolution of pathogens. However, many pathogens (particularly viruses) undergo frequent recombination and the phylogenetic methods commonly applied to detect positive selection have been shown to give misleading results when applied to recombining sequences. We propose a method that makes maximum likelihood inference of positive selection robust to the presence of recombination. This is achieved by allowing tree topologies and branch lengths to change across detected recombination breakpoints. Further improvements are obtained by allowing synonymous substitution rates to vary across sites. RESULTS: Using simulation we show that, even for extreme cases where recombination causes standard methods to reach false positive rates >90%, the proposed method decreases the false positive rate to acceptable levels while retaining high power. We applied the method to two HIV-1 datasets for which we have previously found that inference of positive selection is invalid owing to high rates of recombination. In one of these (env gene) we still detected positive selection using the proposed method, while in the other (gag gene) we found no significant evidence of positive selection. AVAILABILITY: A HyPhy batch language implementation of the proposed methods and the HIV-1 datasets analysed are available at http://www.cbio.uct.ac.za/pub_support/bioinf06. The HyPhy package is available at http://www.hyphy.org, and it is planned that the proposed methods will be included in the next distribution. RDP2 is available at http://darwin.uvigo.es/rdp/rdp.html

20.
Follistatin (FST) and activin A, as gonadal proteins, exhibit opposite effects on follicle-stimulating hormone (FSH) release from the pituitary gland, and the activin A-FST system is involved in the regulation of decidualization in reproductive biology. However, the roles of FST and activin A in the migration of decidualized endometrial stromal cells are not well characterized. In this study, transwell chambers and microfluidic devices were used to assess the effects of FST and activin A on migration of decidualized mouse endometrial stromal cells (d-MESCs). We found that compared with activin A, FST exerted more significant effects on adhesion, wound healing and migration of d-MESCs. Similar results were also seen in the primary cultured decidual stromal cells (DSCs) from the uterus of pregnant mice. Simultaneously, the results revealed that FST increased calcium influx and upregulated the expression levels of the migration-related proteins MMP9 and Ezrin in d-MESCs. In addition, FST increased the level of phosphorylation of JNK in d-MESCs, and the JNK inhibitor AS601245 significantly attenuated the action of FST in inducing migration of d-MESCs. These data suggest that FST, not activin A, in the activin A-FST system is a crucial chemoattractant for migration of d-MESCs via JNK signalling, facilitating successful uterine decidualization and tissue remodelling during pregnancy.
