Similar articles
20 similar articles found
1.
A general procedure is described for measuring and testing population differences in gametic frequencies. The total dispersion among populations is subdivided in hierarchical fashion. The multiple-locus treatment is simply the sum of the single-locus analyses, provided gametic equilibrium obtains among the loci. In the event that gametic equilibrium does not obtain, correlations among loci need to be dealt with. The analysis is then used to examine the genetic infrastructure of two Indian tribes from South America, the Ye'cuana (Makiritare) and the Yanomama. From historical evidence, we may identify several "clusters" of villages within each tribe. The demographic and cultural practices affecting village formation and the maintenance of peer integrity are rather different in these tribes, however, and lead us to postulate rather different patterns of genetic variation among villages. Analyses of five codominant two-allele loci, four dominant two-allele loci and two complex loci (with four codominant haplotypes each) demonstrate that Yanomama clusters are more disparate than Ye'cuana clusters, as would have been predicted on sociocultural grounds.

2.
Previous studies have found that at most human loci, ancestral alleles are "African," in the sense that they reach their highest frequency there. Conventional wisdom holds that this reflects a recent African origin of modern humans. This paper challenges that view by showing that the empirical pattern (of elevated allele frequencies within Africa) is not as pervasive as has been thought. We confirm this African bias in a set of mainly protein-coding loci, but find a smaller bias in Alu insertion polymorphisms, and an even smaller bias in noncoding loci. Thus, the strong bias that was originally observed must reflect some factor that varies among data sets, something other than population history. This factor may be the per-locus mutation rate: the African bias is most pronounced in loci where this rate is high. The distribution of ancestral alleles among populations has been studied using 2 methods. One of these involves comparing the fractions of loci that reach maximal frequency in each population. The other compares the average frequencies of ancestral alleles. The first of these methods reflects history in a manner that depends on the mutation rate. When that rate is high, ancestral alleles at most loci reach their highest frequency in the ancestral population. When that rate is low, the reverse is true. The other method, comparing averages, is unresponsive. Average ancestral allele frequencies are affected neither by mutation rate nor by the history of population size and migration. In the absence of selection and ascertainment bias, they should be the same everywhere. This is true of one data set, but not of 2 others. This also suggests the action of some factor, such as selection or ascertainment bias, that varies among data sets.

3.
Inference of intraspecific population divergence patterns typically requires genetic data for molecular markers with relatively high mutation rates. Microsatellites, or short tandem repeat (STR) polymorphisms, have proven informative in many such investigations. These markers are characterized, however, by high levels of homoplasy and varying mutational properties, often leading to inaccurate inference of population divergence. A SNPSTR is a genetic system that consists of an STR polymorphism closely linked (typically < 500 bp) to one or more single-nucleotide polymorphisms (SNPs). SNPSTR systems are characterized by lower levels of homoplasy than are STR loci. Divergence time estimates based on STR variation (on the derived SNP allele background) should, therefore, be more accurate and precise. We use coalescent-based simulations in the context of several models of demographic history to compare divergence time estimates based on SNPSTR haplotype frequencies and STR allele frequencies. We demonstrate that estimates of divergence time based on STR variation on the background of a derived SNP allele are more accurate (3% to 7% bias for SNPSTR versus 11% to 20% bias for STR) and more precise than STR-based estimates, conditional on a recent SNP mutation. These results hold even for models involving complex demographic scenarios with gene flow, population expansion, and population bottlenecks. Varying the timing of the mutation event generating the SNP revealed that estimates of divergence time are sensitive to SNP age, with more recent SNPs giving more accurate and precise estimates of divergence time. However, varying both mutational properties of STR loci and SNP age demonstrated that multiple independent SNPSTR systems provide less biased estimates of divergence time. Furthermore, the combination of estimates based separately on STR and SNPSTR variation provides insight into the age of the derived SNP alleles. In light of our simulations, we interpret estimates from data for human populations.

4.
An estimator for pairwise relatedness using molecular markers (total citations: 21; self-citations: 0; citations by others: 21)
Wang J. Genetics, 2002, 160(3): 1203-1215
I propose a new estimator for jointly estimating two-gene and four-gene coefficients of relatedness between individuals from an outbreeding population with data on codominant genetic markers and compare it, by Monte Carlo simulations, to previous ones in precision and accuracy for different distributions of population allele frequencies, numbers of alleles per locus, actual relationships, sample sizes, and proportions of relatives included in samples. In contrast to several previous estimators, the new estimator is well behaved and applies to any number of alleles per locus and any allele frequency distribution. The estimates for two- and four-gene coefficients of relatedness from the new estimator are unbiased irrespective of the sample size and have sampling variances decreasing consistently with an increasing number of alleles per locus to the minimum asymptotic values determined by the variation in identity-by-descent among loci per se, regardless of the actual relationship. The new estimator is also robust for small sample sizes and for unknown relatives being included in samples for estimating allele frequencies. Compared to previous estimators, the new one is generally advantageous, especially for highly polymorphic loci and/or small sample sizes.

5.
Recent admixture between genetically differentiated populations can result in high levels of association between alleles at loci that are ≤10 cM apart. The transmission/disequilibrium test (TDT) proposed by Spielman et al. (1993) can be a powerful test of linkage between disease and marker loci in the presence of association and therefore could be a useful test of linkage in admixed populations. The degree of association between alleles at two loci depends on the differences in allele frequencies, at the two loci, in the founding populations; therefore, the choice of marker is important. For a multiallelic marker, one strategy that may improve the power of the TDT is to group marker alleles within a locus, on the basis of information about the founding populations and the admixed population, thereby collapsing the marker into one with fewer alleles. We have examined the consequences of collapsing a microsatellite into a two-allele marker, when two founding populations are assumed for the admixed population, and have found that if there is random mating in the admixed population, then typically there is a collapsing for which the power of the TDT is greater than that for the original microsatellite marker. A method is presented for finding the optimal collapsing that has minimal dependence on the disease and that uses estimates either of marker allele frequencies in the two founding populations or of marker allele frequencies in the current, admixed population and in one of the founding populations. Furthermore, this optimal collapsing is not always the collapsing with the largest difference in allele frequencies in the founding populations. To demonstrate this strategy, we considered a recent data set, published previously, that provides frequency estimates for 30 microsatellites in 13 populations.

6.
Myotonic dystrophy (DM) is a dominant neuromuscular disease that results from an unstable CTG-repeat expansion in the 3' UTR of the myotonin kinase gene at 19q13.3. This repeat is normally polymorphic with a trimodal distribution reflecting 5-, 11-17-, and 19-30-repeat-length alleles. An absolute association between expanded CTG alleles and the 1-kb insertion allele of an intragenic polymorphism in Caucasians has led to the proposal that the 5-repeat allele gives rise to alleles of 19-30 repeats, from which expanded alleles are derived, a transition not involving the 11-17-repeat alleles. A survey of eight global populations confirms the stability of the 11-17-repeat alleles but shows dissociation between the 1-kb insertion polymorphism and both the 5- and 19-30-repeat-length alleles. These data indicate more than one ancestral allele from which expanded alleles are derived and suggest that widely variable population frequencies of DM may reflect distinct frequencies of such predisposed alleles.

7.
Rannala B, Qiu WG, Dykhuizen DE. Genetics, 2000, 155(2): 499-508
Recent breakthroughs in molecular technology, most significantly the polymerase chain reaction (PCR) and in situ hybridization, have allowed the detection of genetic variation in bacterial communities without prior cultivation. These methods often produce data in the form of the presence or absence of alleles or genotypes, however, rather than counts of alleles. Using relative allele frequencies from presence-absence data as estimates of population allele frequencies tends to underestimate the frequencies of common alleles and overestimate those of rare ones, potentially biasing the results of a test of neutrality in favor of balancing selection. In this study, a maximum-likelihood estimator (MLE) of bacterial allele frequencies designed for use with presence-absence data is derived using an explicit stochastic model of the host infection (or bacterial sampling) process. The performance of the MLE is evaluated using computer simulation and a method is presented for evaluating the fit of estimated allele frequencies to the neutral infinite alleles model (IAM). The methods are applied to estimate allele frequencies at two outer surface protein loci (ospA and ospC) of the Lyme disease spirochete, Borrelia burgdorferi, infecting local populations of deer ticks (Ixodes scapularis) and to test the fit to a neutral IAM.
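
The bias described in this abstract can be illustrated with a small calculation. The sketch below is not the estimator or the infection model of the paper; it assumes, purely for illustration, that every host carries a fixed number of independent infections drawn from the population allele frequencies. The function name `expected_presence_frequencies` and the parameter `infections_per_host` are hypothetical.

```python
def expected_presence_frequencies(allele_freqs, infections_per_host):
    """Expected relative allele frequencies computed from presence/absence data.

    Assumption (illustrative only): each host carries `infections_per_host`
    independent infections, each drawn from `allele_freqs`.
    """
    # Probability that allele i is detected at least once in a host.
    presence = [1.0 - (1.0 - p) ** infections_per_host for p in allele_freqs]
    # Naive estimate: share of all "present" calls that belong to allele i.
    total = sum(presence)
    return [x / total for x in presence]


true_freqs = [0.7, 0.2, 0.1]
naive = expected_presence_frequencies(true_freqs, infections_per_host=3)
for p, est in zip(true_freqs, naive):
    print(f"true {p:.2f} -> naive presence-based estimate {est:.3f}")
```

Under this assumed model the common allele is pulled down and the rare alleles are pulled up, which is the direction of bias the abstract reports. With a single infection per host the naive estimate equals the true frequencies, so the distortion comes entirely from multiple infections.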

8.
The D' coefficient is one of the most commonly used measures of the extent of gametic disequilibrium between multiallelic loci. It has been suggested that the range of the D' measure of overall disequilibrium between pairs of multiallelic loci depends on allele frequencies, except under some very restricted conditions. Nevertheless, the problem of dependence of the range of D' has not been characterized under a wide set of possible polymorphisms. Evaluation of the utility of D' as a measure of the strength of overall disequilibrium between all possible pairs of alleles at two multiallelic loci requires better knowledge of its range than is currently available. In this work, the conditions of polymorphism under which the range of D' is frequency independent are given. It is found that the range of D' is more often independent of allelic frequencies than is commonly thought. Furthermore, the range of D' undergoes only small fluctuations as a function of the polymorphisms at the loci. Numerical cases and microsatellite data from humans are used for illustration. These observations indicate that the D' coefficient is a useful tool for the estimation and comparison of the extent of overall disequilibrium across pairs of multiallelic loci.
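
The abstract does not spell out the formula for the overall D'. A minimal sketch follows, assuming the commonly used frequency-weighted definition in which each allele pair contributes |D_ij| / D_max,ij weighted by p_i q_j. The function name `multiallelic_d_prime` and the input layout are illustrative choices, not taken from the article.

```python
def multiallelic_d_prime(hap):
    """Overall D' between two multiallelic loci.

    hap[i][j] is the frequency of the haplotype carrying allele i at the
    first locus and allele j at the second; all entries sum to 1.
    Assumed definition: D' = sum_ij p_i * q_j * |D_ij| / D_max,ij.
    """
    p = [sum(row) for row in hap]        # allele frequencies, first locus
    q = [sum(col) for col in zip(*hap)]  # allele frequencies, second locus
    total = 0.0
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            d = hap[i][j] - pi * qj      # pairwise disequilibrium D_ij
            if d < 0:
                d_max = min(pi * qj, (1 - pi) * (1 - qj))
            else:
                d_max = min(pi * (1 - qj), (1 - pi) * qj)
            if d_max > 0:                # skip pairs with a fixed or absent allele
                total += pi * qj * abs(d) / d_max
    return total


# Complete association between two biallelic loci.
print(multiallelic_d_prime([[0.5, 0.0], [0.0, 0.5]]))
# Haplotype frequencies equal to the product of allele frequencies.
print(multiallelic_d_prime([[0.42, 0.28], [0.18, 0.12]]))
```

The first call is a case of complete association and the second a case of gametic equilibrium, so they mark the two ends of the scale.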

9.
Estimation of allele frequencies for VNTR loci (total citations: 9; self-citations: 4; citations by others: 5)
VNTR loci provide valuable information for a number of fields of study involving human genetics, ranging from forensics (DNA fingerprinting and paternity testing) to linkage analysis and population genetics. Alleles of a VNTR locus are simply fragments obtained from a particular portion of the DNA molecule and are defined in terms of their length. The essential element of a VNTR fragment is the repeat, which is a short sequence of base pairs. The core of the fragment is composed of a variable number of identical repeats that are linked in tandem. A sample of fragments from a population of individuals exhibits substantial variation in length because of variation in the number of repeats. Each distinct fragment length defines an allele, but any given fragment is measured with error. Therefore the observed distribution of fragment lengths is not discrete but is continuous, and determination of distinct allele classes is not straightforward. A mixture model is the natural statistical method for estimating the allele frequencies of VNTR loci. In this article we develop nonparametric methods for obtaining the distribution of allele sizes and estimates of their frequencies. Methods for obtaining maximum-likelihood estimates are developed. In addition, we suggest an empirical Bayes method to improve the maximum-likelihood estimates of the gene frequencies; the empirical Bayes procedure effects a local smoothing. The latter method works particularly well when measurement error is large relative to the repeat size, because the estimated distribution of allele frequencies when maximum likelihood is used is unreliable because of an alternating pattern of over- and underestimation. We define alleles and estimate the allele frequencies for two VNTR loci from the human genome (D17S79 and D2S44), from data obtained from Lifecodes, Inc.

10.
Pathogen resistance and genetic variation at MHC loci (total citations: 14; self-citations: 0; citations by others: 14)
Balancing selection in the form of heterozygote advantage, frequency-dependent selection, or selection that varies in time and/or space, has been proposed to explain the high variation at major histocompatibility complex (MHC) genes. Here the effect of variation of the presence and absence of pathogens over time on genetic variation at multiallelic loci is examined. In the basic model, resistance to each pathogen is conferred by a given allele, and this allele is assumed to be dominant. Given that $s$ is the selective disadvantage for homozygotes (and heterozygotes) without the resistance allele and that the proportion of generations in which a pathogen is present is $e$, fitnesses for homozygotes become $(1 - s)^{(n-1)e}$ and the fitnesses for heterozygotes become $(1 - s)^{(n-2)e}$, where $n$ is the number of alleles. In this situation, the conditions for a stable, multiallelic polymorphism are met even though there is no intrinsic heterozygote advantage. The distribution of allele frequencies and consequently heterozygosity are a function of the autocorrelation of the presence of the pathogen in subsequent generations. When there is a positive autocorrelation over generations, the observed heterozygosity is reduced. In addition, the effects of lower levels of selection and dominance and the influence of genetic drift were examined. These effects were compared to the observed heterozygosity for two MHC genes in several South American Indian samples. Overall, resistance conferred by specific alleles to temporally variable pathogens may contribute to the observed polymorphism at MHC genes and other similar host defense loci.
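
The fitness expressions in this abstract can be checked numerically. The sketch below reads the exponents as (n-1)e for homozygotes and (n-2)e for heterozygotes, which is an interpretation of the flattened superscripts in the listing. The function name `genotype_fitnesses` and the parameter values are illustrative assumptions.

```python
def genotype_fitnesses(s, e, n):
    """Fitnesses under the basic model as read from the abstract.

    s: selective disadvantage when the resistance allele for a pathogen is missing
    e: proportion of generations in which a given pathogen is present
    n: number of alleles (each assumed to confer resistance to one pathogen)
    """
    # A homozygote resists one pathogen and is exposed to the other n - 1.
    homozygote = (1.0 - s) ** ((n - 1) * e)
    # A heterozygote resists two pathogens and is exposed to the other n - 2.
    heterozygote = (1.0 - s) ** ((n - 2) * e)
    return homozygote, heterozygote


hom, het = genotype_fitnesses(s=0.1, e=0.5, n=4)
print(f"homozygote fitness   {hom:.4f}")
print(f"heterozygote fitness {het:.4f}")
print(f"ratio het/hom        {het / hom:.4f}")
```

The ratio of heterozygote to homozygote fitness is (1 - s)^(-e), which exceeds 1 whenever s and e are positive. This shows how a net heterozygote advantage can emerge from dominant resistance alleles alone, consistent with the abstract's statement that no intrinsic heterozygote advantage is needed.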

11.
Definition and Estimation of Higher-Order Gene Fixation Indices (total citations: 1; self-citations: 0; citations by others: 1)
Kermit Ritland. Genetics, 1987, 117(4): 783-793
Fixation indices summarize the associations between genes that arise from the joint effects of inbreeding and selection. In this paper, fixation indices are derived for pairs, triplets and quadruplets of genes at a single multiallelic locus. The fixation indices are obtained by dividing cumulants by constants; the cumulants describe the statistical distribution of alleles and the constants are functions of gene frequency. The use of cumulants instead of moments is necessary only for four-gene indices, when the fourth cumulant is used. A second type of four-gene index is also required, and this index is based upon the covariation of second-order cumulants. At multiallelic loci, a large number of indices is possible. If alleles are selectively neutral, the number of indices is reduced and the relationship between gene identity and gene cumulants is shown. Two-gene indices can always be estimated from genotypic frequency data at a single polymorphic locus. Three-gene indices are also estimable except when allele frequency equals one-half. Four-gene indices are not estimable unless selection is assumed to have an equal effect upon each allele (such as under selective neutrality) and the locus contains at least three alleles of unequal frequency. For diallelic or selected loci, an alternative four-gene fixation index is proposed. This index incorporates both types of four-gene associations but cannot be related to gene identity.

12.
Spatial structure of genetic variation within populations is well measured by statistics based on the distribution of pairs of individual genotypes, and various such statistics have been widely used in experimental studies. However, the problem of uncharacterized correlations among statistics for different alleles has limited the applications of multiallelic, multilocus summary measures, since these had unknown sampling distributions. Usually multiple alleles and/or multiple loci are required in order to precisely measure spatial structures, and to provide precise indirect estimates of the amount of dispersal in samples of reasonable size. This article examines the correlations among pair-wise statistics, including Moran I-statistics and various measures of conditional kinship, for different alleles of a locus. First the correlations are mathematically derived for random spatial distributions, which allow averages over alleles and loci to be used as more powerful yet exact test statistics for the null hypothesis. Then extensive computer simulations are conducted to examine the correlations among values for different alleles under isolation by distance processes. For loci with more than three alleles, the results show that the correlations are remarkably and perhaps surprisingly small, establishing the principle that alleles then behave as nearly independent realizations of space-time stochastic processes. The results also show that the correlations are largely robust with respect to the degree of spatial structure, and they can be used in a straightforward manner to form confidence intervals for averages. The results allow a precise connection between observations in experimental studies and levels of dispersal in theoretical models.

13.
We provide experimental evidence showing that, during the restriction-enzyme digestion of DNA samples, some of the HaeIII-digested DNA fragments are small enough to prevent their reliable sizing on a Southern gel. As a result of such nondetectability of DNA fragments, individuals who show a single-band DNA profile at a VNTR locus may not necessarily be true homozygotes. In a population database, when the presence of such nondetectable alleles is ignored, we show that a pseudodependence of alleles within as well as across loci may occur. Using a known statistical method, under the hypothesis of independence of alleles within loci, we derive an efficient estimate of null allele frequency, which may be subsequently used for testing allelic independence within and across loci. The estimates of null allele frequencies, thus derived, are shown to agree with direct experimental data on the frequencies of HaeIII-null alleles. Incorporation of null alleles into the analysis of the forensic VNTR database suggests that the assumptions of allelic independence within and between loci are appropriate. In contrast, a failure to incorporate the occurrence of null alleles would provide a wrong inference regarding the independence of alleles within and between loci.

14.
We consider non-neutral models for unlinked loci, where the fitness of a chromosome or individual is not multiplicative across loci. Such models are suitable for many complex diseases, where there are gene-interactions. We derive a genealogical process for such models, called the complex selection graph (CSG). This coalescent-type process is related to the ancestral selection graph, and is derived from the ancestral influence graph by considering the limit as the recombination rate between loci gets large. We analyse the CSG both theoretically and via simulation. The main results are that the gene-interactions do not produce linkage disequilibrium, but do produce dependencies in allele frequencies between loci. For small selection rates, the distributions of the genealogy and the allele frequencies at a single locus are well-approximated by their distributions under a single locus model, where the fitness of each allele is the average of the true fitnesses of that allele with respect to the distribution of alleles at other loci.

15.
Patterns of allozyme variation were surveyed in collections of cultivated and wild sorghum from Africa, the Middle East, and Asia. Data for 30 isozyme loci from a total of 2067 plants representing 429 accessions were analyzed. Regional levels of genetic diversity in the cultivars are greater in northern and central Africa compared to southern Africa, the Middle East, or Asia. The spatial distribution of individual alleles at the most variable loci was studied by plotting allele frequencies on geographic maps covering the distribution of sorghum. Generally, many of the alleles with frequencies below 0.25 are localized in specific portions of the range and are commonly present in more than one race in that region. Several alleles occur in both wild and cultivated sorghum of one region and are absent from sorghum elsewhere, suggesting local introgression between the wild and cultivated forms. Although the same most common allele was found in the wild and cultivated gene pools at 29 of the 30 loci, phenetic analyses separated the majority of wild collections from the cultivars, indicating that the two gene pools are distinct. Wild sorghum from northeast and central Africa exhibits greater genetic similarities to the cultivars compared to wild sorghum of northwest or southern Africa. This is consistent with the theory that wild sorghum of northeast-central Africa is ancestral to domesticated sorghum. Wild sorghums of race arundinaceum of northwest Africa and race virgatum from Egypt are shown to be genetically distinct from both other forms of wild sorghum and from the cultivars. Suggestions for genetic conservation are presented in light of these data.

16.
An Evaluation of Genetic Distances for Use with Microsatellite Loci (total citations: 49; self-citations: 8; citations by others: 41)
Mutations of alleles at microsatellite loci tend to result in alleles with repeat scores similar to those of the alleles from which they were derived. Therefore the difference in repeat score between alleles carries information about the amount of time that has passed since they shared a common ancestral allele. This information is ignored by genetic distances based on the infinite alleles model. Here we develop a genetic distance based on the stepwise mutation model that includes allelic repeat score. We adapt earlier treatments of the stepwise mutation model to show analytically that the expectation of this distance is a linear function of time. We then use computer simulations to evaluate the overall reliability of this distance and to compare it with allele sharing and Nei's distance. We find that no distance is uniformly superior for all purposes, but that for phylogenetic reconstruction of taxa that are sufficiently diverged, our new distance is preferable.

17.
The two alleles an individual carries at a locus are identical by descent (ibd) if they have descended from a single ancestral allele in a reference population, and the probability of such identity is the inbreeding coefficient of the individual. Inbreeding coefficients can be predicted from pedigrees with founders constituting the reference population, but estimation from genetic data is not possible without data from the reference population. Most inbreeding estimators that make explicit use of sample allele frequencies as estimates of allele probabilities in the reference population are confounded by average kinships with other individuals. This means that the ranking of those estimates depends on the scope of the study sample and we show the variation in rankings for common estimators applied to different subdivisions of 1000 Genomes data. Allele-sharing estimators of within-population inbreeding relative to average kinship in a study sample, however, do have invariant rankings across all studies including those individuals. They are unbiased with a large number of SNPs. We discuss how allele sharing estimates are the relevant quantities for a range of empirical applications. Subject terms: Population genetics, Evolutionary biology, Molecular ecology

18.
The effect of population bottlenecks on the components of the genetic variance/covariance generated by n neutral independent additive x additive loci has been studied theoretically. In its simplest version, this situation can be modelled by specifying the allele frequencies and homozygous effects at each locus, and an additional factor measuring the strength of the n-th order epistatic interaction. The variance/covariance components in an infinitely large panmictic population (ancestral components) were compared with their expected values at equilibrium over replicates randomly derived from the base population, after t bottlenecks of size N (derived components). Formulae were obtained giving the derived components (and the between-line variance) as functions of the ancestral ones (alternatively, in terms of allele frequencies and effects) and the corresponding inbreeding coefficient F(t). The n-th order derived component of the genetic variance/covariance is continuously eroded by inbreeding, but the remaining components may increase initially until a critical F(t) value is attained, which is inversely related to the order of the pertinent component, and subsequently decline to zero. These changes can be assigned to the between-line variances/covariances of gene substitution and epistatic effects induced by drift. Numerical examples indicate that: (1) the derived additive variance/covariance component will generally exceed its ancestral value unless epistasis is weak; (2) the derived epistatic variance/covariance components will generally exceed their ancestral values unless allele frequencies are extreme; (3) for systems showing equal ancestral additive and total non-additive variance/covariance components, those including a smaller number of epistatic loci may generate a larger excess in additive variance/covariance after bottlenecks than others involving a larger number of loci, provided that F(t) is low. Our results indicate that it is unlikely that the rate of evolution may be significantly accelerated after population bottlenecks, in spite of occasional increments of the derived additive variance over its ancestral value.

19.
Continuous selective models (total citations: 5; self-citations: 0; citations by others: 5)
Neglecting age-structure, but taking into account matings with differential fertility in Mendelian reproduction, continuous selective models are formulated for a single locus with an arbitrary number of alleles, with or without distinguishing the sexes, and for two alleles at each of two loci in a monoecious population. In each case, without restricting the mating system, differential equations are derived for the genotypic frequencies, and the validity of the customary Malthusian-parameter differential equations for the gametic frequencies is established. Particular attention is devoted to the conditions for Hardy-Weinberg proportions under random mating. For multiple alleles at a single locus in a monoecious population, exact solutions are obtained for the following three Hardy-Weinberg models: gametic selection, no dominance, and the same selective effect for all alleles but one. The last scheme includes, as special cases, a completely dominant or recessive distinguished allele, and arbitrary selection with only two alleles. Two single-locus assortative mating patterns are analyzed for a monoecious organism using the general formalism. One of these has an arbitrary number of alleles, all the genotypes being distinguishable, while the other involves two alleles, one of which is completely dominant to the other.

20.
Studies of major histocompatibility complex (MHC) diversity in non-model vertebrates typically focus on structure and sequence variation in the antigen-presenting loci: the highly variable and polymorphic class I and class IIB genes. Although these studies provide estimates of the number of genes and alleles/locus, they often overlook variation in functionally related and co-inherited genes important in the immune response. This study utilizes the sequence of the MHC B-locus derived from a commercial turkey to investigate MHC variation in wild birds. Sequences were obtained for nine interspersed MHC amplicons (non-class I/II) from each of 40 birds representing 3 subspecies of wild turkey (Meleagris gallopavo). Analysis of aligned sequences identified 238 single-nucleotide variants, approximately one-third of which had minor allele frequencies >0.2 in the sampled birds. PHASE analysis identified 70 prospective MHC haplotypes in the wild turkeys, whereas a combined analysis with commercial birds identified almost 100 haplotypes in the species. Denaturing gradient gel electrophoresis (DGGE) of the class IIB loci was used to test the efficacy of single-nucleotide polymorphism (SNP) haplotyping to capture locus-wide variation. Diversity in SNP haplotypes and haplotype sharing among individuals was directly reflected in the DGGE patterns. Utilization of a reference haplotype to sequence interspersed regions of the MHC has significant advantages over other methods of surveying diversity while identifying high-frequency SNPs for genotyping. SNP haplotyping provides a means to identify both divergent haplotypes and homozygous individuals for assessment of immunological variation in wild and domestic populations.
