Similar literature
20 similar documents found (search time: 15 ms)
1.
The assumption that selection alters the genealogical tree of a sample of alleles from a population relative to the neutral expectation underlies several "tests of neutrality." Two recent papers have studied the effect of purifying selection; their suggestive but incomplete results indicate that, in the single-site case, the shape of a gene genealogy for a locus may differ only slightly from the neutral expectation. We verify this finding for weak selection using the "ancestral selection graph." We consider a wider range of models, including both a four-allele single-site model and an infinite-sites model. Our results confirm the previous claim for the symmetric-mutation single-site model. We emphasize, however, that a neutral-seeming genealogy is consistent with detectable effects of selection on the distribution of allele frequencies within the sample. With selection operating, the information about a sample cannot be reduced to the genealogy. As a result, a distinction needs to be made between the selected sites themselves, for which the genealogy offers insufficient information, and linked neutral variation. This distinction seems to have been overlooked in previous papers, yet it has significant implications for the interpretation of data on DNA sequence variation. In particular, it predicts that under purifying selection, the frequency spectrum of neutral mutations will not reflect the skew toward rare polymorphisms at replacement sites even if there is no recombination between them. We caution, however, that the effect of weak selection on the genealogy is specific to the model; a (more realistic) model of multiple linked sites could lead to a more distorted genealogy than is observed for a single site.

2.
The allele frequency data of Baird et al. were tested using Ewens-Watterson sampling theory for goodness of fit to the infinite-alleles model of neutral evolution. Although probes of both the HRAS-1 and D14S1 loci identify highly diverse restriction-fragment-length polymorphisms, the observed values of gene identity (F) and the common allele frequency (C) are not significantly different from the neutral expectation. Allele frequency distributions show a tendency toward a deficit in diversity for HRAS-1 and a slight excess of diversity for D14S1. The direction of these departures is consistent with potential selective effects of the Harvey-ras oncogene and hitchhiking of the D14S1 locus to closely linked immunoglobulin genes. Direct χ² tests of goodness of fit of the observed and expected allele frequency distributions reveal significant departures in the caucasoid and Hispanic HRAS-1 distributions but not in any of the other tests.
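The homozygosity comparison described in this abstract can be sketched in a few lines. This is a minimal illustration, not the procedure used in the paper: it simulates neutral infinite-alleles samples with a Hoppe urn, keeps only those with the observed number of alleles k (given k, the allele configuration does not depend on theta, so `theta` here is only a tuning knob for the rejection step), and reports where the observed F falls. The function names, the default `theta`, and the allele counts used in the checks are all assumptions made for the example.

```python
import random


def homozygosity(counts):
    """Sample homozygosity F: the sum of squared allele frequencies."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)


def hoppe_urn_sample(n, theta, rng):
    """Draw allele counts for n genes under the neutral infinite-alleles model."""
    labels = []  # allele index carried by each gene drawn so far
    counts = []  # number of copies of each allele
    for i in range(n):
        # With i genes already drawn, a new allele appears with
        # probability theta / (theta + i); this requires theta > 0.
        if rng.random() < theta / (theta + i):
            counts.append(1)
            labels.append(len(counts) - 1)
        else:
            allele = labels[rng.randrange(i)]  # copy a uniformly chosen earlier gene
            counts[allele] += 1
            labels.append(allele)
    return counts


def neutral_f_quantile(counts, theta=1.0, reps=1000, seed=0, max_draws=1_000_000):
    """Fraction of neutral samples with the same n and k whose F is <= the observed F."""
    rng = random.Random(seed)
    n, k = sum(counts), len(counts)
    f_obs = homozygosity(counts)
    kept = []
    for _ in range(max_draws):
        sim = hoppe_urn_sample(n, theta, rng)
        if len(sim) == k:  # condition on the observed number of alleles
            kept.append(homozygosity(sim))
            if len(kept) == reps:
                break
    if len(kept) < reps:
        raise RuntimeError("too few simulated samples matched k; try another theta")
    # Small tolerance guards against floating-point summation order.
    return sum(f <= f_obs + 1e-12 for f in kept) / reps
```

A quantile near 0 would point to an excess of diversity (F lower than expected) and a quantile near 1 to a deficit, which are the two directions of departure the abstract describes.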

3.
Sampling Hubbell's neutral theory of biodiversity (cited 7 times; self-citations: 0; citations by others: 7)
In the context of neutral theories of community ecology, a novel genealogy‐based framework has recently furnished an analytic extension of Ewens’ sampling multivariate abundance distribution, which also applies to a random sample from a local community. Here, instead of taking a multivariate approach, we further develop the sampling theory of Hubbell's neutral spatially implicit theory and derive simple abundance distributions for a random sample both from a local community and a metacommunity. Our result is given in terms of the average number of species with a given abundance in any randomly extracted sample. Contrary to what has been widely assumed, a random sample from a metacommunity is not fully described by the Fisher log‐series, but by a new distribution. This new sample distribution matches the log‐series expectation at high biodiversity values (θ > 1) but clearly departs from it for species‐poor metacommunities (θ < 1). Our theoretical framework should be helpful in the better assessment of diversity and testing of the neutral theory by using abundance data.
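For orientation, the Fisher log-series baseline that this abstract compares against gives the expected number of species with exactly n individuals as α·xⁿ/n. The sketch below covers only that baseline, not the new sample distribution derived in the paper; the function names and the parameter values in the checks are assumptions made for the example.

```python
import math


def logseries_expected_species(alpha, x, n_max):
    """Expected number of species with exactly n individuals, for n = 1..n_max."""
    return [alpha * x ** n / n for n in range(1, n_max + 1)]


def logseries_total_species(alpha, x):
    """Closed form for the expected total number of species: -alpha * ln(1 - x)."""
    return -alpha * math.log(1.0 - x)
```

Summing the per-abundance expectations over a long enough range should reproduce the closed-form total, which is a quick sanity check on any implementation.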

4.
Using computer simulations, we generated and analyzed genetic distances among selectively neutral haplotypes transmitted through gene genealogies with random-mating organismal pedigrees. Constraints and possible biases on haplotype distances due to correlated ancestry were evaluated by comparing observed distributions of distances to those predicted from an inbreeding theory that assumes independence among haplotype pairs. Results suggest that: 1) mean time to common ancestry of neutral haplotypes can be a reasonably good predictor of evolutionary effective population size; 2) the nonindependence of haplotype paths of descent within a given gene genealogy typically produces significant departures from the theoretical probability distributions of haplotype distances; 3) frequency distributions of distances between haplotypes drawn from “replicate” organismal pedigrees or from multiple unlinked loci within an organismal pedigree exhibit very close agreement with the theory for independent haplotypes. These results are relevant to interpretations of current molecular data on genetic distances among nonrecombining haplotypes at either nuclear or cytoplasmic loci.

5.
In this paper we consider the genealogy of two nested mutant alleles, assuming the constant-size neutral coalescent model with infinite sites mutation. We study the conditional genealogy and derive explicit formulas for the joint and marginal site frequency spectra for the double, single and zero mutant allele. In addition, we find the mean ages of the two mutations. We show that the age of the youngest mutation does not depend on the frequency of the single mutant allele and that the frequency spectra for the single mutant allele and the zero mutant allele are the same.

6.
Evidence from a variety of sources indicates that selection has influenced synonymous codon usage in Drosophila. It has generally been difficult, however, to distinguish selection that acted in the distant past from ongoing selection. However, under a neutral model, polymorphisms usually reflect more recent mutations than fixed differences between species and may, therefore, be useful for inferring recent selection. If the ancestral state is preferred, selection should shift the frequency distribution of derived states/site toward lower values; if the ancestral is unpreferred, selection should increase the number of derived states/site. Polymorphisms were classified as ancestrally preferred or unpreferred for several genes of D. simulans and D. melanogaster. A computer simulation of coalescence was employed to derive the expected frequency distributions of derived states/site under various modifications of the Wright–Fisher neutral model, and distributions of test statistics (t and Mann–Whitney U) were derived by appropriate sampling. One-tailed tests were applied to transformed frequency data to assess whether the two frequency distributions deviated from neutral expectations in the direction predicted by selection on codon usage. Several genes from D. simulans appear to be subject to recent selection on synonymous codons, including one gene with low codon bias, esterase-6. Selection may also be acting in D. melanogaster. Received: 15 April 1998 / Accepted: 13 May 1999

7.
Abstract. Many species exist as metapopulations in balance between local population extinction and recolonization. The effect of these processes on average population differentiation, within-deme diversity, and specieswide diversity has been considered previously. In this paper, coalescent simulations of Slatkin's propagule-pool and migrant-pool models are used to characterize the distribution of neutral genetic diversity within demes ($\pi_S$), diversity in the metapopulation as a whole ($\pi_T$), the ratio $F_{ST} = (\pi_T - \pi_S)/\pi_T$, Tajima's D statistic, and several ratios of gene-tree branch lengths. Using these distributions, power to detect differences in key metapopulation parameter values is determined under contrasting sampling regimes. The results indicate that it will be difficult to use sequence data from a single locus to detect a history of extinctions and recolonizations in a metapopulation because of high genealogical variance, the loss of diversity due to reductions in effective population size, and the fact that a genealogy of lineages from different demes under Slatkin's model differs from a neutral coalescent only in its time scale. Genetic indices of gene-tree shape that capture the effects of extinction/recolonization on both external branches and the length of the genealogy as a whole will provide the best indication of metapopulation dynamics if several lineages are sampled from several different demes.
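The ratio defined in this abstract, F_ST = (π_T − π_S)/π_T, is straightforward to compute from aligned sequences grouped by deme. The sketch below is one plain reading of that definition, assuming π_S is the unweighted mean of within-deme pairwise diversity and π_T is the pairwise diversity of all sequences pooled; the paper may weight demes differently, and the function names are invented for the example.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean number of differences per pair of equal-length sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs)


def fst_from_demes(demes):
    """F_ST = (pi_T - pi_S) / pi_T for a list of demes, each a list of sequences."""
    pi_s = sum(pairwise_diversity(d) for d in demes) / len(demes)
    pooled = [s for d in demes for s in d]
    pi_t = pairwise_diversity(pooled)
    if pi_t == 0:
        return float("nan")  # no variation at all, so the ratio is undefined
    return (pi_t - pi_s) / pi_t
```

Two demes fixed for different haplotypes give F_ST = 1, since all diversity is between demes. With very small samples this estimator can be negative, which is one reason published analyses usually apply sample-size corrections.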

8.
The evolutionary significance of Y-chromosomal ribosomal DNA sequence variation was tested by two different means. A single sample of males and females was collected from a peach orchard in central Pennsylvania. Wild-caught males and sons of wild-caught, wild-inseminated females were crossed to virgin females having an X-linked rDNA deficiency. Genomic DNA from male progeny of these crosses was extracted and digested with the single restriction endonuclease, DraI. Southern blots of these digestions, when probed with the complete rDNA probe, revealed 10 distinct patterns of restriction fragments. A chi-square test for the homogeneity of the frequency distributions of the sample of wild males and sons of wild females failed to reject a neutral null hypothesis. The allele frequency configuration was tested with the Ewens-Watterson test, and the departure from the infinite-alleles neutral model was not significant. Simulations were performed to test the sensitivity of the tests to misclassification and to quantify the power of the two tests.

9.
Heterosis or Neutrality? (cited 12 times; self-citations: 3; citations by others: 9)
G. A. Watterson. Genetics, 1977, 85(4): 789-814
Various statistics have been proposed on an ad hoc basis to test whether alleles at a locus are selectively neutral. By considering population models in which selection operates, this paper shows that the population homozygosity is a powerful test statistic for testing departures from neutrality, in the direction of heterozygote advantage or disadvantage. The sample homozygosity plays a similar role when only sample data are available. Some numerical examples are included, showing the application of the test. An analysis is also made of the effect of heterosis on such quantities as the expected number of alleles in the population or sample, the effective number of alleles, the expected homozygosity, and on the population and sample allele frequency distributions generally.

10.
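To explain the processes underlying community assembly and the formation of species abundance patterns in the temperate forests of Changbai Mountain, this study used data from monitoring plots of mixed conifer and broad-leaved forest at different successional stages and fitted species abundance distributions with a neutral theory model, a biostatistical model (the lognormal distribution model), and niche models (the Zipf model, the broken-stick model, and the niche preemption model). The best-fitting model was selected using the χ² test, the Kolmogorov-Smirnov (K-S) test, and the Akaike information criterion (AIC). The results show that the neutral model predicted the species abundance distributions of plant communities at different successional stages in the temperate forests of Changbai Mountain well. At the 10 m × 10 m scale, all five models were accepted by both the χ² test and the K-S test, but the neutral model fitted less well than the lognormal, Zipf, broken-stick, and niche preemption models, indicating that at small scales both neutral and niche processes can explain community species abundance distributions but niche processes have relatively greater explanatory power. At medium and large scales (30 m × 30 m, 60 m × 60 m, and 90 m × 90 m), the neutral model was the best-fitting model, and as the study scale increased the niche models and the biostatistical model were progressively rejected by the χ² test, indicating that the role of neutral processes in shaping species abundance distribution patterns in the mixed conifer and broad-leaved forests of Changbai Mountain grows with increasing scale. This study confirms that neutral processes play an important role in structuring temperate mixed conifer and broad-leaved forest communities on Changbai Mountain, but it does not rule out a contribution of niche mechanisms to community assembly. Neutral theory and niche theory are therefore not contradictory in temperate forest community assembly but complementary. Studies of species abundance distributions in forest communities should take account of sampling scale and successional stage and should fit multiple models.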
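Model selection by AIC, as used in this study, reduces to computing AIC = 2k − 2 ln L for each fitted model and preferring the smallest value. The sketch below shows only that final comparison step; the function names are invented for the example, and any log-likelihoods and parameter counts passed in would come from fitting routines that are not shown here.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(fits):
    """Pick the model with the lowest AIC.

    fits maps a model name to a (log_likelihood, n_params) pair.
    Returns the winning name and the full table of AIC scores.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    return min(scores, key=scores.get), scores
```

Because AIC penalizes each extra parameter by 2, a model with more parameters must improve the log-likelihood by more than one unit per added parameter to be preferred.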

11.
The sample frequency spectrum of a segregating site is the probability distribution of a sample of alleles from a genetic locus, conditional on observing the sample to be polymorphic. This distribution is widely used in population genetic inferences, including statistical tests of neutrality in which a skew in the observed frequency spectrum across independent sites is taken as a signature of departure from neutral evolution. Theoretical aspects of the frequency spectrum have been well studied and several interesting results are available, but they are usually under the assumption that a site has undergone at most one mutation event in the history of the sample. Here, we extend previous theoretical results by allowing for at most two mutation events per site, under a general finite allele model in which the mutation rate is independent of current allelic state but the transition matrix is otherwise completely arbitrary. Our results apply to both nested and nonnested mutations. Only the former has been addressed previously, whereas here we show it is the latter that is more likely to be observed except for very small sample sizes. Further, for any mutation transition matrix, we obtain the joint sample frequency spectrum of the two mutant alleles at a triallelic site, and derive a closed-form formula for the expected age of the younger of the two mutations given their frequencies in the population. Several large-scale resequencing projects for various species are presently under way and the resulting data will include some triallelic polymorphisms. The theoretical results described in this paper should prove useful in population genomic analyses of such data.
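To make the definition in this abstract concrete, the sketch below tabulates the sample frequency spectrum for the simpler one-mutation, biallelic case and compares it with the standard neutral infinite-sites expectation, in which a segregating site carries i derived copies with probability proportional to 1/i. It does not implement the two-mutation, triallelic results of the paper. The '0'/'1' encoding of ancestral and derived states and the function names are assumptions made for the example.

```python
def sample_frequency_spectrum(haplotypes):
    """Proportion of polymorphic sites with i derived copies, for i = 1..n-1.

    haplotypes is a list of equal-length strings of '0' (ancestral)
    and '1' (derived), one string per sampled chromosome.
    """
    n = len(haplotypes)
    spectrum = [0] * (n - 1)  # index i-1 counts sites with i derived copies
    for site in zip(*haplotypes):
        i = site.count("1")
        if 0 < i < n:  # keep polymorphic sites only
            spectrum[i - 1] += 1
    total = sum(spectrum)
    if total == 0:
        return [0.0] * (n - 1)
    return [c / total for c in spectrum]


def neutral_expected_spectrum(n):
    """Neutral expectation: probability proportional to 1/i for i = 1..n-1."""
    harmonic = sum(1.0 / i for i in range(1, n))
    return [(1.0 / i) / harmonic for i in range(1, n)]
```

A skew of the observed spectrum toward low i relative to the 1/i expectation is the kind of signature that the neutrality tests mentioned in the abstract look for.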

12.
Abstract. Fuzzy set ordination is employed to evaluate sites on the basis of their suitability for particular tree species. The technique orders sites along an axis defined by the presences and absences of a given species of interest. A rationale is given in terms of noise reduction; in many situations the overall vegetation of a site will reflect habitat conditions better than the presence, absence, or quantitative performance of any single species. A data set of tree presence/absence covering a large part of the southeastern United States was analyzed and habitat suitability scores were calculated for each species. Monte-Carlo tests were used to measure the statistical power of the data set with regard to habitat preferences; 38 of the 49 species have cumulative frequency distributions showing significant departures from random expectation. Most statistically significant habitat preferences seem to be related to geographic range limits located within the study area, but some species found throughout the area also show significant departures from random expectation. The method may find applications in autecological studies of species, selection of representative site conditions for simulation modeling, and the solution of certain technical problems in ordination.

13.
Summary. Mathematical procedures are given to estimate infestation totals and daily life stage arrivals, departures, and mortality of Dendroctonus frontalis Zimmermann for an infested tree in the field. These estimates are based on minimal sample data and are designed to utilize all available information. Daily arrival estimates for larvae, pupae, and callow adults are obtained by indirect analysis without direct observation of these stages. The procedures are applied to 147 infested trees, and the results are transformed to a common time basis to obtain daily expectations by life stage for an “average” tree. These expectations suggest optimal times for field sampling or relative times of sampling when optimal times are missed. Expected daily arrival distributions by life stage for a single egg and a single attacking adult are given. Procedures are given for utilizing collateral information to obtain an infestation total and daily arrival estimates for a boundary life stage. The results of this study are applicable to any D. frontalis field study, and the procedures given are applicable to any bark-inhabiting insect having similar habits.

14.
Keightley PD, Halligan DL. Genetics, 2011, 188(4): 931-940
Sequencing errors and random sampling of nucleotide types among sequencing reads at heterozygous sites present challenges for accurate, unbiased inference of single-nucleotide polymorphism genotypes from high-throughput sequence data. Here, we develop a maximum-likelihood approach to estimate the frequency distribution of the number of alleles in a sample of individuals (the site frequency spectrum), using high-throughput sequence data. Our method assumes binomial sampling of nucleotide types in heterozygotes and random sequencing error. By simulations, we show that close to unbiased estimates of the site frequency spectrum can be obtained if the error rate per base read does not exceed the population nucleotide diversity. We also show that these estimates are reasonably robust if errors are nonrandom. We then apply the method to infer site frequency spectra for zerofold degenerate, fourfold degenerate, and intronic sites of protein-coding genes using the low coverage human sequence data produced by the 1000 Genomes Project phase-one pilot. By fitting a model to the inferred site frequency spectra that estimates parameters of the distribution of fitness effects of new mutations, we find evidence for significant natural selection operating on fourfold sites. We also find that a model with variable effects of mutations at synonymous sites fits the data significantly better than a model with equal mutational effects. Under the variable effects model, we infer that 11% of synonymous mutations are subject to strong purifying selection.
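The two assumptions named in this abstract, binomial sampling of reads in heterozygotes and random sequencing error, can be illustrated with a per-site genotype likelihood. This is a deliberately simplified sketch and not the authors' likelihood: it assumes a biallelic site, a diploid individual, and a symmetric error that flips a read between the two alleles with probability `error_rate`. All names are invented for the example.

```python
from math import comb


def read_alt_probability(alt_copies, error_rate):
    """Chance that a single read shows the alternative allele.

    alt_copies is the diploid genotype (0, 1, or 2 alternative alleles).
    """
    frac = alt_copies / 2  # share of chromosomes carrying the alternative allele
    return frac * (1 - error_rate) + (1 - frac) * error_rate


def genotype_likelihood(alt_reads, total_reads, alt_copies, error_rate):
    """Binomial probability of the observed read counts given the genotype."""
    p = read_alt_probability(alt_copies, error_rate)
    return (
        comb(total_reads, alt_reads)
        * p ** alt_reads
        * (1 - p) ** (total_reads - alt_reads)
    )
```

With four error-free reads, a heterozygote shows only one allele with probability 2 × (1/2)⁴ = 0.125, which illustrates why low-coverage data make it risky to call genotypes directly before building a site frequency spectrum.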

15.
Estimation for an island model where mutation maintains a k-allele neutral polymorphism at a single locus on each island is considered. The likelihood of an observed sample type configuration is obtained by applying a computational algorithm analogous to Griffiths and Tavaré (Theor. Popul. Biol. 46 (1994), 131–159). This allows the computation of sampling distributions in an island model and investigation of their properties. Given a sample type configuration, the maximum likelihood estimate of the migration parameter is obtained by simulating independently the likelihood at a grid of points and, also, using a surface simulation method. The latter method generates the whole likelihood trajectory in a single application of the simulation program. An estimate of variance of the estimate of the migration parameter is obtained using the likelihood trajectory. A comparison of the maximum likelihood estimates of the gene flow between subpopulations is made with those obtained by using Wright's $F_{ST}$ statistic.

16.
The extent to which natural selection shapes diversity within populations is a key question for population genetics. Thus, there is considerable interest in quantifying the strength of selection. A full likelihood approach for inference about selection at a single site within an otherwise neutral fully linked sequence of sites is described here. A coalescent model of evolution is used to model the ancestry of a sample of DNA sequences which have the selected site segregating. The mutation model, for the selected and neutral sites, is the infinitely many-sites model where there is no back or parallel mutation at sites. A unique perfect phylogeny, a gene tree, can be constructed from the configuration of mutations on the sample sequences under this model of mutation. The approach is general and can be used for any bi-allelic selection scheme. Selection is incorporated through modelling the frequency of the selected and neutral allelic classes stochastically back in time, then using a subdivided population model considering the population frequencies through time as variable population sizes. An importance sampling algorithm is then used to explore over coalescent tree space consistent with the data. The method is applied to a simulated data set and the gene tree presented in Verrelli et al. (2002).

17.
Mitochondrial DNA is an important tool for inference of population history in animals. A variety of mitochondrial loci have been sampled for this purpose, but many studies focus on the non-coding D-loop or control region (CR), which in at least some species appears hypermutable. Unfortunately, analyses of this region are sometimes complicated by segmental duplications, as well as by difficulties in sequencing through repeat expansions, driving many researchers to favor single-copy protein-coding or ribosomal RNA genes. Without systematic comparison, it is unclear if, how much, and what sort of information might be lost by focusing on coding regions, or conversely whether such regions might offer significant advantages over the CR. In this study, we compare the information content, both in terms of genealogy and tests of neutral equilibrium, of the mitochondrial CR and protein-coding ND2 gene of the red-winged blackbird (Agelaius phoeniceus) and its close relative the tricolored blackbird (A. tricolor). Both gene regions violate the standard infinite sites assumption central to moment-based population genetic inference, as well as exhibiting considerable among-site rate heterogeneity, obscuring significant departures from neutral equilibrium. Given the ubiquity of rate heterogeneity in mtDNA, use of more sophisticated tests that account for this should be obligatory. The two regions yield quite similar genealogical reconstructions, as well as indicating similar departures from neutral equilibrium assumptions for A. phoeniceus. However, individual Sanger-read-length fragments (∼600 bases) of the CR have significantly higher information content than comparable fragments of ND2, suggesting that limited sampling of the mitochondrial genome should focus on the CR.

18.
Recent studies of mitochondrial DNA (mtDNA) variation in mammals and Drosophila have shown an excess of amino acid variation within species (replacement polymorphism) relative to the number of silent and replacement differences fixed between species. To examine further this pattern of nonneutral mtDNA evolution, we present sequence data for the ND3 and ND5 genes from 59 lines of Drosophila melanogaster and 29 lines of D. simulans. Of interest are the frequency spectra of silent and replacement polymorphisms, and potential variation among genes and taxa in the departures from neutral expectations. The Drosophila ND3 and ND5 data show no significant excess of replacement polymorphism using the McDonald-Kreitman test. These data are in contrast to significant departures from neutrality for the ND3 gene in mammals and other genes in Drosophila mtDNA (cytochrome b and ATPase 6). Pooled across genes, however, both Drosophila and human mtDNA show very significant excesses of amino acid polymorphism. Silent polymorphisms at ND5 show a significantly higher variance in frequency than replacement polymorphisms, and the latter show a significant skew toward low frequencies (Tajima's D = -1.954). These patterns are interpreted in light of the nearly neutral theory where mildly deleterious amino acid haplotypes are observed as ephemeral variants within species but do not contribute to divergence. The patterns of polymorphism and divergence at charge-altering amino acid sites are presented for the Drosophila ND5 gene to examine the evolution of functionally distinct mutations. Excess charge-altering polymorphism is observed at the carboxyl terminal and excess charge-altering divergence is detected at the amino terminal. While the mildly deleterious model fits as a net effect in the evolution of nonrecombining mitochondrial genomes, these data suggest that opposing evolutionary pressures may act on different regions of mitochondrial genes and genomes.
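Tajima's D, the statistic quoted in this abstract, compares the mean number of pairwise differences with the number of segregating sites scaled by a harmonic sum; a negative value indicates an excess of low-frequency variants. The sketch below follows the standard published formula and assumes an alignment of at least four equal-length sequences without gaps or missing data. The function names and the toy alignment used in the checks are invented for the example and are unrelated to the ND5 data.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Number of alignment columns with more than one state."""
    return sum(1 for site in zip(*seqs) if len(set(site)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs)


def tajimas_d(seqs):
    """Tajima's D for a list of aligned sequences (needs n >= 4 and S > 0)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, over its standard deviation.
    numerator = mean_pairwise_differences(seqs) - s / a1
    return numerator / sqrt(e1 * s + e2 * s * (s - 1))
```

An alignment in which every variable site is a singleton, as in the toy check, yields a negative D, the same direction as the skew toward low frequencies reported for replacement polymorphisms.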

19.
To quantify and assess the processes underlying community assembly and driving tree species abundance distributions (SADs) with spatial scale variation in two typical subtropical secondary forests in the Dashanchong state‐owned forest farm, two 1‐ha permanent study plots (100 m × 100 m) were established. We selected four diversity indices (species richness, Shannon–Wiener, Simpson, and Pielou) and relative importance values to quantify community assembly and biodiversity. Empirical cumulative distribution and species accumulation curves were used to describe the SADs of trees in the two forest communities. Three types of models, including statistical models (lognormal and log‐series), niche models (broken‐stick, niche preemption, and Zipf–Mandelbrot), and a neutral theory model, were fitted to the SADs. Goodness of fit was tested by Akaike's information criterion (AIC) and the Kolmogorov–Smirnov test. Fagaceae and Anacardiaceae were the dominant families in the evergreen broad‐leaved and deciduous mixed communities, respectively. According to original data and random sampling predictions, the SADs were hump‐shaped for intermediate abundance classes, peaking between 8 and 32 in the evergreen broad‐leaved community, but this maximum increased with total sampled area in the deciduous mixed community. All niche models could only explain SAD patterns at smaller spatial scales. However, both the neutral theory and purely statistical models were suitable for explaining the SADs for secondary forest communities when the sampling plot exceeded 40 m. The results showed that the SADs indicated a clear directional trend toward convergence and similar predominating ecological processes in the two typical subtropical secondary forests. The neutral process gradually replaced the niche process in importance and became the main mechanism determining SADs of forest trees as the sampling scale expanded. Thus, we can preliminarily conclude that neutral processes had a major effect on biodiversity patterns in these two subtropical secondary forests, but we cannot exclude possible contributions of other processes.

20.
Genotype imputation is an indispensable step in human genetic studies. Large reference panels with deeply sequenced genomes now allow interrogating variants with minor allele frequency < 1% without sequencing. Although it is critical to consider limits of this approach, imputation methods for rare variants have only done so empirically; the theoretical basis of their imputation accuracy has not been explored. To provide theoretical consideration of imputation accuracy under the current imputation framework, we develop a coalescent model of imputing rare variants, leveraging the joint genealogy of the sample to be imputed and reference individuals. We show that broadly used imputation algorithms include model misspecifications about this joint genealogy that limit the ability to correctly impute rare variants. We develop closed-form solutions for the probability distribution of this joint genealogy and quantify the inevitable error rate resulting from the model misspecification across a range of allele frequencies and reference sample sizes. We show that the probability of a falsely imputed minor allele decreases with reference sample size, but the proportion of falsely imputed minor alleles mostly depends on the allele count in the reference sample. We summarize the impact of this imputation error on association tests by calculating the r² between imputed and true genotype and show that even when modeling other sources of error, the model misspecification has a significant impact on the r² of rare variants. To evaluate these predictions in practice, we compare the imputation of the same dataset across imputation panels of different sizes. Although this empirical imputation accuracy is substantially lower than our theoretical prediction, modeling misspecification seems to further decrease imputation accuracy for variants with low allele counts in the reference. These results provide a framework for developing new imputation algorithms and for interpreting rare variant association analyses.
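The r² between imputed and true genotype used in this abstract is the squared Pearson correlation between the two vectors across individuals. The sketch below computes it for one variant, assuming genotypes are coded as alternative-allele counts (0, 1, 2) and imputed values may be fractional dosages. The function name is invented for the example, and returning 0.0 when either vector has no variance is a convention chosen here, not something stated in the paper.

```python
def squared_correlation(imputed, truth):
    """Squared Pearson correlation between imputed dosages and true genotypes."""
    n = len(truth)
    mean_x = sum(imputed) / n
    mean_y = sum(truth) / n
    sxx = sum((x - mean_x) ** 2 for x in imputed)
    syy = sum((y - mean_y) ** 2 for y in truth)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(imputed, truth))
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic vector carries no information
    return sxy * sxy / (sxx * syy)
```

For a rare variant, a single falsely imputed minor allele can shift r² substantially because so few individuals carry the allele, which is consistent with the sensitivity to low allele counts described above.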
