Similar literature
1.
A. M. Valdes  M. Slatkin    N. B. Freimer 《Genetics》1993,133(3):737-749
We summarize available data on the frequencies of alleles at microsatellite loci in human populations and compare observed distributions of allele frequencies to those generated by a simulation of the stepwise mutation model. We show that observed frequency distributions at 108 loci are consistent with the results of the model under the assumption that mutations cause an increase or decrease in repeat number by one and under the condition that the product Nu, where N is the effective population size and u is the mutation rate, is larger than one. We show that the variance of the distribution of allele sizes is a useful estimator of Nu and performs much better than previously suggested estimators for the stepwise mutation model. In the data, there is no correlation between the mean and variance in allele size at a locus or between the number of alleles and mean allele size, which suggests that the mutation rate at these loci is independent of allele size.
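The quantity this abstract builds on, the variance of allele size at a locus, can be computed directly from sampled repeat counts. The sketch below is illustrative only: the function name and the sample values are made up, and it stops at the variance because the abstract does not give the constant that relates it to Nu.

```python
from collections import Counter


def allele_size_variance(repeat_counts):
    """Unbiased sample variance of repeat number across sampled gene copies.

    repeat_counts maps a repeat number to the number of sampled copies
    carrying an allele of that size (hypothetical input format).
    """
    n = sum(repeat_counts.values())
    if n < 2:
        raise ValueError("need at least two sampled gene copies")
    mean = sum(size * k for size, k in repeat_counts.items()) / n
    sum_sq = sum(k * (size - mean) ** 2 for size, k in repeat_counts.items())
    return sum_sq / (n - 1)


# Made-up sample of nine gene copies at one microsatellite locus.
sample = Counter([14, 15, 15, 16, 16, 16, 17, 17, 18])
print(allele_size_variance(sample))
```

Turning this variance into an estimate of Nu requires the scaling derived in the paper itself, which is not reproduced here.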

2.
Li WH 《Genetics》1978,90(2):349-382
Formulae are developed for the distribution of allele frequencies (the frequency spectrum), the mean number of alleles in a sample, and the mean and variance of heterozygosity under mutation pressure and under either genic or recessive selection. Numerical computations are carried out by using these formulae and Watterson's (1977) formula for the distribution of allele frequencies under overdominant selection. The following properties are observed: (1) The effect of selection on the distribution of allele frequencies is slight when 4Ns

3.
Unbiased estimators of genotype and allele frequencies and their respective variances are obtained for loci identified by Mendelian segregation in haploid female gametophytes from individual trees. By a minimum sampling variance criterion, the allocation of experimental effort between the number of female gametophytes analysed per tree and the number of trees sampled per population is examined for a fixed total amount of experimental effort. For estimating heterozygosity, the optimum sampling design for many (generally most) cases is three female gametophytes per tree, but may be more than three depending upon the true genotype frequencies in the population. For estimating allele frequencies, the optimum sampling design is one female gametophyte per tree except in cases where a strong negative correlation exists between alleles within genotypes. Guidelines are discussed for determining a suitable number of female gametophytes to be analysed per tree in order to estimate heterozygosity.

4.
People living in endemic areas often harbour several malaria infections at once. High-resolution genotyping can distinguish between infections by detecting the presence of different alleles at a polymorphic locus. However, the number of infections may not be accurately counted, since parasites from multiple infections may carry the same allele. We use simulation to determine the circumstances under which the number of observed genotypes is likely to be substantially less than the number of infections present, and investigate the performance of two methods for estimating the numbers of infections from high-resolution genotyping data. The simulations suggest that the problem is not substantial in most datasets: the disparity between the mean numbers of infections and of observed genotypes was small when there were 20 or more alleles, 20 or more blood samples, a mean number of infections of 6 or less, and where the frequency of the most common allele was no greater than 20%. The issue of multiple infections carrying the same allele is unlikely to be a major component of the errors in PCR-based genotyping. Simulations also showed that, with heterogeneity in allele frequencies, the observed frequencies are not a good approximation of the true allele frequencies. The first method that we proposed to estimate the numbers of infections assumes that they are a good approximation and hence did poorly in the presence of heterogeneity. In contrast, the second method, by Li et al., estimates both the numbers of infections and the true allele frequencies simultaneously and produced accurate estimates of the mean number of infections.
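The undercounting described in this abstract can be illustrated with a small calculation rather than a full simulation. The sketch below makes an assumption that is not stated in the abstract: each infection carries a single allele drawn independently from the population allele frequencies. Under that assumption, the expected number of distinct alleles seen among n infections is the sum over alleles of 1 - (1 - p)^n. The function name is hypothetical.

```python
def expected_observed_genotypes(allele_freqs, n_infections):
    """Expected number of distinct alleles among n independent infections.

    Assumes one allele per infection, drawn independently from allele_freqs
    (an illustrative simplification, not the simulation used in the paper).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # An allele with frequency p is missed by all n infections
    # with probability (1 - p) ** n.
    return sum(1.0 - (1.0 - p) ** n_infections for p in allele_freqs)


# Twenty equally frequent alleles, six infections in one host.
uniform = [1.0 / 20] * 20
print(expected_observed_genotypes(uniform, 6))
```

With 20 equally frequent alleles the expected count stays close to the true number of infections, in line with the abstract's statement that the disparity is small in that regime.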

5.
Stadler T 《Genetics》2011,188(3):663-672
In this article, I develop a methodology for inferring the transmission rate and reproductive value of an epidemic on the basis of genotype data from a sample of infected hosts. The epidemic is modeled by a birth-death process describing the transmission dynamics in combination with an infinite-allele model describing the evolution of alleles. I provide a recursive formulation for the probability of the allele frequencies in a sample of hosts and a Bayesian framework for estimating transmission rates and reproductive values on the basis of observed allele frequencies. Using the Bayesian method, I reanalyze tuberculosis data from the United States. I estimate a net transmission rate of 0.19/year [0.13, 0.24] and a reproductive value of 1.02 [1.01, 1.04]. I demonstrate that the allele frequency probability under the birth-death model does not follow the well-known Ewens' sampling formula that holds under Kingman's coalescent.

6.
An estimator for pairwise relatedness using molecular markers   Total citations: 21 (self-citations: 0; citations by others: 21)
Wang J 《Genetics》2002,160(3):1203-1215
I propose a new estimator for jointly estimating two-gene and four-gene coefficients of relatedness between individuals from an outbreeding population with data on codominant genetic markers and compare it, by Monte Carlo simulations, to previous ones in precision and accuracy for different distributions of population allele frequencies, numbers of alleles per locus, actual relationships, sample sizes, and proportions of relatives included in samples. In contrast to several previous estimators, the new estimator is well behaved and applies to any number of alleles per locus and any allele frequency distribution. The estimates for two- and four-gene coefficients of relatedness from the new estimator are unbiased irrespective of the sample size and have sampling variances decreasing consistently with an increasing number of alleles per locus to the minimum asymptotic values determined by the variation in identity-by-descent among loci per se, regardless of the actual relationship. The new estimator is also robust for small sample sizes and for unknown relatives being included in samples for estimating allele frequencies. Compared to previous estimators, the new one is generally advantageous, especially for highly polymorphic loci and/or small sample sizes.

7.
An expression is derived and values tabulated for the expected allele frequencies and their variances, arranged in decreasing order in a population, from the finite and infinite alleles diffusion model in Watterson (1976). The neutral model and also a model with heterozygote selection are considered. Some observed ABO blood group allele frequencies are compared with the tabulated expected frequencies in the neutral three allele model. This extends the results of Watterson and Guess (1977), who tabulate the expected value of the most common allele. One test of neutrality previously advocated is to consider the distribution of F, the population homozygosity, conditional on G, the product of allele frequencies. However, it is shown here that for a large number of alleles F and G are asymptotically independent, so the test would not be a good one in this case. A limit theorem is derived for the distribution of allele frequencies in the neutral model when the mutation rate is large. In this case F is shown to be asymptotically normal. An inequality is derived for the probability that the oldest allele in a population is amongst the r most frequent types. An inequality is also found for the probability that a sample will only contain representatives of the r most frequent allele types in the population.

8.
A model is presented in which a large population in mutation/drift equilibrium undergoes a severe restriction in size and subsequently remains at the small size. The rate of loss of genetic variability has been studied. Allelic loss occurs more rapidly than loss of genic heterozygosity. Rare alleles are lost especially rapidly. The result is a transient deficiency in the total number of alleles observed in samples taken from the reduced population when compared with the number expected in a sample from a steady-state population having the same observed heterozygosity. Alternatively, the population can be considered to possess excess gene diversity if the number of alleles is used as the statistical estimator of mutation rate. The deficit in allele number arises principally from a lack of those alleles that are expected to appear only once or twice in the sample. The magnitude of the allelic deficiency is less, however, than the excess that an earlier study predicted to follow a rapid population expansion. This suggests that populations that have undergone a single bottleneck event, followed by rapid population growth, should have an apparent excess number of alleles, given the observed level of genic heterozygosity and provided that the bottleneck has not occurred very recently. Conversely, such populations will be deficient for observed heterozygosity if allele number is used as the sufficient statistic for the estimation of 4Nev. Populations that have undergone very recent restrictions in size should show the opposite tendencies.

9.
Tomoko Ohta  Motoo Kimura 《Genetics》1974,76(3):615-624
Using a new model of isoalleles, extensive Monte Carlo experiments were performed to examine the pattern of allelic distribution in a finite population. In this model it was assumed that the set of allelic states is represented by discrete points on a one-dimensional lattice and that change of state by mutation occurs in such a way that an allele moves either one step in the positive direction or one step in the negative direction on the lattice. Such a model was considered to be appropriate for estimating theoretically the number of electrophoretically detectable alleles within a population. The evenness of allelic distribution was measured by the ratio of the effective to the actual number of alleles (n_e/n_a). The results of the Monte Carlo experiments have shown that this ratio is generally larger under the new model of isoalleles than under the conventional Kimura-Crow model of neutral isoalleles. In other words, the distribution of allelic frequencies within a population is expected to be more uniform in the new model. By comparing the Monte Carlo results with actual observations, it was concluded that the observed deviation from what is predicted under the new model with selective neutrality is not in the direction of conforming to the overdominance hypothesis but is, in fact, in the opposite direction.
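The evenness measure used in this abstract, the ratio of the effective to the actual number of alleles, is straightforward to compute from allele frequencies. The sketch below uses the standard definition of the effective number of alleles as the reciprocal of the sum of squared frequencies; the function name and example frequencies are illustrative.

```python
def evenness_ratio(allele_freqs):
    """Ratio of the effective to the actual number of alleles.

    The effective number is 1 / sum(p_i ** 2); the actual number is the
    count of alleles with non-zero frequency.
    """
    present = [p for p in allele_freqs if p > 0]
    if not present:
        raise ValueError("at least one allele must be present")
    n_actual = len(present)
    n_effective = 1.0 / sum(p * p for p in present)
    return n_effective / n_actual


print(evenness_ratio([0.25, 0.25, 0.25, 0.25]))  # perfectly even
print(evenness_ratio([0.7, 0.1, 0.1, 0.1]))      # one dominant allele
```

The ratio equals 1 only when all alleles are equally frequent and falls toward 1/n_actual as one allele comes to dominate, which is why a larger ratio indicates a more uniform distribution.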

10.
Estimation of allele frequencies for VNTR loci   Total citations: 9 (self-citations: 4; citations by others: 5)
VNTR loci provide valuable information for a number of fields of study involving human genetics, ranging from forensics (DNA fingerprinting and paternity testing) to linkage analysis and population genetics. Alleles of a VNTR locus are simply fragments obtained from a particular portion of the DNA molecule and are defined in terms of their length. The essential element of a VNTR fragment is the repeat, which is a short sequence of basepairs. The core of the fragment is composed of a variable number of identical repeats that are linked in tandem. A sample of fragments from a population of individuals exhibits substantial variation in length because of variation in the number of repeats. Each distinct fragment length defines an allele, but any given fragment is measured with error. Therefore the observed distribution of fragment lengths is not discrete but is continuous, and determination of distinct allele classes is not straightforward. A mixture model is the natural statistical method for estimating the allele frequencies of VNTR loci. In this article we develop nonparametric methods for obtaining the distribution of allele sizes and estimates of their frequencies. Methods for obtaining maximum-likelihood estimates are developed. In addition, we suggest an empirical Bayes method to improve the maximum-likelihood estimates of the gene frequencies; the empirical Bayes procedure effects a local smoothing. The latter method works particularly well when measurement error is large relative to the repeat size, because the estimated distribution of allele frequencies when maximum likelihood is used is unreliable because of an alternating pattern of over- and underestimation. We define alleles and estimate the allele frequencies for two VNTR loci from the human genome (D17S79 and D2S44), from data obtained from Lifecodes, Inc.

11.
Nuclear SSRs are notorious for having relatively high frequencies of null alleles, i.e. alleles that fail to amplify and are thus recessive and undetected in heterozygotes. In this paper, we compare two kinds of approaches for estimating null allele frequencies at seven nuclear microsatellite markers in three French Fagus sylvatica populations: (1) maximum likelihood methods that compare observed and expected homozygote frequencies in the population under the assumption of Hardy-Weinberg equilibrium and (2) direct null allele frequency estimates from progeny where parent genotypes are known. We show that null allele frequencies are high in F. sylvatica (7.0% on average with the population method, 5.1% with the progeny method), and that estimates are consistent between the two approaches, especially when the number of sampled maternal half-sib progeny arrays is large. With null allele frequencies ranging between 5% and 8% on average across loci, population genetic parameters such as genetic differentiation (F_ST) may be mostly unbiased. However, using markers with such average prevalence of null alleles (up to 15% for some loci) can be seriously misleading in fine scale population studies and parentage analysis.
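The population-based approach in this abstract rests on a heterozygote deficit relative to Hardy-Weinberg expectations. The sketch below does not implement the maximum likelihood methods the paper compares. It swaps in two simpler, widely used moment-based estimators, usually attributed to Chakraborty et al. (1992) and Brookfield (1996), purely to show how such a deficit translates into a null allele frequency. The function name and the input values are hypothetical.

```python
def null_allele_frequency(h_exp, h_obs, method="chakraborty"):
    """Moment-based estimate of the null allele frequency at one locus.

    h_exp is the heterozygosity expected under Hardy-Weinberg equilibrium,
    h_obs is the observed heterozygosity. These are simple stand-ins for
    the maximum likelihood estimators discussed in the abstract.
    """
    deficit = h_exp - h_obs
    if method == "chakraborty":
        return deficit / (h_exp + h_obs)
    if method == "brookfield1":
        return deficit / (1.0 + h_exp)
    raise ValueError(f"unknown method: {method}")


# Made-up heterozygosities for a single microsatellite locus.
print(null_allele_frequency(0.80, 0.60, method="chakraborty"))
print(null_allele_frequency(0.80, 0.60, method="brookfield1"))
```

Both estimators attribute the whole heterozygote deficit to null alleles, so inbreeding or population substructure would inflate them. The progeny-based estimates described in the abstract do not rely on the Hardy-Weinberg assumption in the same way.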

12.
The sample frequency spectrum of a segregating site is the probability distribution of a sample of alleles from a genetic locus, conditional on observing the sample to be polymorphic. This distribution is widely used in population genetic inferences, including statistical tests of neutrality in which a skew in the observed frequency spectrum across independent sites is taken as a signature of departure from neutral evolution. Theoretical aspects of the frequency spectrum have been well studied and several interesting results are available, but they are usually under the assumption that a site has undergone at most one mutation event in the history of the sample. Here, we extend previous theoretical results by allowing for at most two mutation events per site, under a general finite allele model in which the mutation rate is independent of current allelic state but the transition matrix is otherwise completely arbitrary. Our results apply to both nested and nonnested mutations. Only the former has been addressed previously, whereas here we show it is the latter that is more likely to be observed except for very small sample sizes. Further, for any mutation transition matrix, we obtain the joint sample frequency spectrum of the two mutant alleles at a triallelic site, and derive a closed-form formula for the expected age of the younger of the two mutations given their frequencies in the population. Several large-scale resequencing projects for various species are presently under way and the resulting data will include some triallelic polymorphisms. The theoretical results described in this paper should prove useful in population genomic analyses of such data.

13.
Richard R. Hudson 《Genetics》1985,109(3):611-631
The sampling distributions of several statistics that measure the association of alleles on gametes (linkage disequilibrium) are estimated under a two-locus neutral infinite allele model using an efficient Monte Carlo method. An often used approximation for the mean squared linkage disequilibrium is shown to be inaccurate unless the proper statistical conditioning is used. The joint distribution of linkage disequilibrium and the allele frequencies in the sample is studied. This estimated joint distribution is sufficient for obtaining an approximate maximum likelihood estimate of C = 4Nc, where N is the population size and c is the recombination rate. It has been suggested that observations of high linkage disequilibrium might be a good basis for rejecting a neutral model in favor of a model in which natural selection maintains genetic variation. It is found that a single sample of chromosomes examined at two loci cannot provide sufficient information for such a test if C is less than 10, because with C this small, very high levels of linkage disequilibrium are not unexpected under the neutral model. In samples of size 50, it is found that, even when C is as large as 50, the distribution of linkage disequilibrium conditional on the allele frequencies is substantially different from the distribution when there is no linkage between the loci. When conditioned on the number of alleles at each locus in the sample, all of the sample statistics examined are nearly independent of θ = 4Nμ, where μ is the neutral mutation rate.

14.
Analytic expressions for the expectations and variance of the number of alleles with gene frequencies in a specified class are derived in the entire population as well as in a random sample of genes drawn from the population. The correlation of this quantity with heterozygosity at the locus is also obtained. The derivations are given in detail for a steady state population of finite size under the infinite allele model of selectively neutral alleles. The results are extended to include weak selection pressures and non-stationarity of the population. The relevance of the correlation of heterozygosity and the number of rare alleles in connection with the neutralist-selectionist controversy is also discussed.

15.
Genetic profile of cosmopolitan populations: effects of hidden subdivision   Total citations: 1 (self-citations: 0; citations by others: 1)
Natural populations of many organisms exhibit an excess of rare alleles in comparison with the predictions of the neutral mutation hypothesis. It has been shown before that either a population bottleneck or the presence of slightly deleterious mutations can explain this phenomenon. A third explanation is presented in this work, showing that hidden subdivision within a population can also lead to an excess of rare alleles in the total population when the expectations of the neutral model are based on the allele frequency profile of the entire population data. With two examples (mitochondrial DNA-morph distribution and isozyme allele frequency distributions), it is shown that most cosmopolitan human populations exhibit an excess of rare as well as total allele counts when these are compared with the expectations of the neutral mutation hypothesis. The mitochondrial data demonstrate that such excesses can be detected from genetic variation at a single locus as well, and this is not due to stochastic error of allele frequency distributions. Contrast of the present observations with the allele frequency profiles in agglomerated tribal populations from South and Central America shows that even when the neutral expectations hold for individual subpopulations, if all subpopulations are grouped into a single population, the pooled data exhibit an excess of total number of alleles that is mainly due to the excess of rare alleles. Therefore, a primary cause of the excess number of rare alleles could be the hidden subdivision, and the magnitude of the excess indicates the extent of substructuring. The two components of hidden subdivision are (1) the number of subpopulations and (2) the average genetic distance among them. The implications of this observation in estimating mutation rate are discussed, indicating the difficulties of comparing mutation rates from different population surveys.

16.
A method is developed for simulating the allele frequencies in an equilibrium or transient population under the effects of neutral mutation and random drift. The method is based on diffusion theory and is fast so that it can be used to study in detail the distribution of heterozygosity or any quantity that can be expressed as a function of allele frequencies. It has been applied to study the distribution of heterozygosity and the distributions of the frequencies of the first three most frequent alleles in a population. It also has been applied to study the distribution of the number of alleles shared by two populations that were derived from a common stock.

17.
Gong Y  Gu S  Woodruff RC 《Human heredity》2005,60(3):150-155
Based on the hypothesis that rare alleles are in mutation and random loss equilibrium, mutation rate can be indirectly estimated by measuring the number of rare variants and the average existing time of a mutant allele. This method can be applied to estimate the mutation rate in humans. However, this estimation of mutation rate is affected by the presence of premeiotic clusters of mutation. Mutation clusters change both the number of initial mutants and the average existing time of a mutant allele. As a result, the formula for indirectly estimating the mutation rate should be modified. The influence of premeiotic clusters is more obvious when the population size is small or the average cluster size is large. For example, if the population size is 3,000 and average cluster size is two, instead of one, the mutation rate is increased by about 9.4%.

18.
Twenty-six individuals of the sporophytic self-incompatible (SSI) weed, Senecio squalidus, were crossed in a full diallel to determine the number and frequency of S alleles in an Oxford population. Incompatibility phenotypes were determined by fruit-set results, and the mating patterns observed fitted an SSI model that allowed us to identify six S alleles. Standard population S allele number estimators were modified to deal with S allele data from a species with SSI. These modified estimators predicted a total number of approximately six S alleles for the entire Oxford population of S. squalidus. This estimate of S allele number is low compared to other estimates of S allele diversity in species with SSI. Low S allele diversity in S. squalidus is expected to have arisen as a consequence of a disturbed population history since its introduction and subsequent colonisation of the British Isles. Other features of the SSI system in S. squalidus were also investigated: (a) the strength of self-incompatibility response; (b) the nature of S allele dominance interactions; and (c) the relative frequencies of S phenotypes. These are discussed in view of the low S allele diversity estimates and the known population history of S. squalidus.

19.
Estimating the age of alleles by use of intraallelic variability.   Total citations: 9 (self-citations: 6; citations by others: 3)
A method is presented for estimating the age of an allele by use of its frequency and the extent of variation among different copies. The method uses the joint distribution of the number of copies in a population sample and the coalescence times of the intraallelic gene genealogy conditioned on the number of copies. The linear birth-death process is used to approximate the dynamics of a rare allele in a finite population. A maximum-likelihood estimate of the age of the allele is obtained by Monte Carlo integration over the coalescence times. The method is applied to two alleles at the cystic fibrosis (CFTR) locus, deltaF508 and G542X, for which intraallelic variability at three intronic microsatellite loci has been examined. Our results indicate that G542X is somewhat older than deltaF508. Although absolute estimates depend on the mutation rates at the microsatellite loci, our results support the hypothesis that deltaF508 arose < 500 generations (approximately 10,000 years) ago.

20.
Previous studies have found that at most human loci, ancestral alleles are "African," in the sense that they reach their highest frequency there. Conventional wisdom holds that this reflects a recent African origin of modern humans. This paper challenges that view by showing that the empirical pattern (of elevated allele frequencies within Africa) is not as pervasive as has been thought. We confirm this African bias in a set of mainly protein-coding loci, but find a smaller bias in Alu insertion polymorphisms, and an even smaller bias in noncoding loci. Thus, the strong bias that was originally observed must reflect some factor that varies among data sets--something other than population history. This factor may be the per-locus mutation rate: the African bias is most pronounced in loci where this rate is high. The distribution of ancestral alleles among populations has been studied using 2 methods. One of these involves comparing the fractions of loci that reach maximal frequency in each population. The other compares the average frequencies of ancestral alleles. The first of these methods reflects history in a manner that depends on the mutation rate. When that rate is high, ancestral alleles at most loci reach their highest frequency in the ancestral population. When that rate is low, the reverse is true. The other method--comparing averages--is unresponsive. Average ancestral allele frequencies are affected neither by mutation rate nor by the history of population size and migration. In the absence of selection and ascertainment bias, they should be the same everywhere. This is true of one data set, but not of 2 others. This also suggests the action of some factor, such as selection or ascertainment bias, that varies among data sets.
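The two summary statistics contrasted in this abstract can be made concrete with a short sketch. The data layout, function names, and frequencies below are invented for illustration. Ties for the maximum are credited to every tied population, which is an assumption rather than something the abstract specifies.

```python
def fraction_of_maxima(freqs_by_locus):
    """Fraction of loci at which each population has the highest
    ancestral allele frequency (ties credited to all tied populations)."""
    populations = next(iter(freqs_by_locus.values())).keys()
    wins = {pop: 0 for pop in populations}
    for locus_freqs in freqs_by_locus.values():
        top = max(locus_freqs.values())
        for pop, freq in locus_freqs.items():
            if freq == top:
                wins[pop] += 1
    n_loci = len(freqs_by_locus)
    return {pop: count / n_loci for pop, count in wins.items()}


def mean_ancestral_frequency(freqs_by_locus):
    """Average ancestral allele frequency per population across loci."""
    populations = next(iter(freqs_by_locus.values())).keys()
    n_loci = len(freqs_by_locus)
    return {
        pop: sum(locus[pop] for locus in freqs_by_locus.values()) / n_loci
        for pop in populations
    }


# Toy ancestral allele frequencies (made-up values, two loci).
toy = {
    "locus1": {"Africa": 0.9, "Europe": 0.6, "Asia": 0.5},
    "locus2": {"Africa": 0.7, "Europe": 0.8, "Asia": 0.4},
}
print(fraction_of_maxima(toy))
print(mean_ancestral_frequency(toy))
```

The first statistic records only which population ranks highest at each locus, while the second averages the frequencies themselves. This is the distinction the abstract draws between a measure that depends on the mutation rate and one that should be the same everywhere in the absence of selection and ascertainment bias.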
