Similar articles
 20 similar articles found; search time 15 ms
1.
Multilocus genotype probabilities, estimated using the assumption of independent association of alleles within and across loci, are subject to sampling fluctuation, since allele frequencies used in such computations are derived from samples drawn from a population. We derive exact sampling variances of estimated genotype probabilities and provide a simple approximation of the sampling variances. Computer simulations conducted using real DNA typing data indicate that, while the sampling distribution of estimated genotype probabilities is not symmetric around the point estimate, the confidence interval of estimated (single-locus or multilocus) genotype probabilities can be obtained from the sampling distribution of a logarithmic transformation of the estimated values. This, in turn, allows an examination of heterogeneity of estimators derived from data on different reference populations. Applications of this theory to DNA typing data at VNTR loci suggest that use of different reference population data may yield significantly different estimates. However, significant differences generally occur with rare (less than 1 in 40,000) genotype probabilities. Conservative estimates of five-locus DNA profile probabilities are always less than 1 in 1 million in an individual from the United States, irrespective of the racial/ethnic origin.
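The independence assumption described in this abstract can be sketched as a product of Hardy-Weinberg genotype probabilities across loci, with an interval built symmetrically on the log scale. This is a minimal illustration only: the function names and the `se_log` input (a standard error of the log-transformed estimate) are hypothetical, and the exact sampling variances derived in the paper are not reproduced here.

```python
import math


def locus_genotype_prob(p, q=None):
    """Hardy-Weinberg probability at one locus.

    p, q are allele frequencies; pass only p for a homozygote (p^2),
    or both p and q for a heterozygote (2pq).
    """
    return p * p if q is None else 2.0 * p * q


def multilocus_prob(genotypes):
    """Multiply single-locus probabilities, assuming independence across loci."""
    prob = 1.0
    for alleles in genotypes:
        prob *= locus_genotype_prob(*alleles)
    return prob


def log_scale_interval(estimate, se_log, z=1.96):
    """Interval that is symmetric on the log scale, then back-transformed.

    se_log is an assumed, externally supplied standard error of log(estimate).
    """
    center = math.log(estimate)
    return math.exp(center - z * se_log), math.exp(center + z * se_log)


# Illustrative frequencies (made up): one heterozygous and one homozygous locus.
profile = [(0.1, 0.2), (0.05,)]
estimate = multilocus_prob(profile)
lower, upper = log_scale_interval(estimate, se_log=0.5)
```

Because the interval is symmetric only after the log transformation, it is asymmetric around the point estimate on the original scale, which matches the skewness the abstract mentions.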

2.
We consider two methods of estimating phenotype probabilities for a number of standard genetic markers like the ABO, MNSs, and PGM markers. The first method is based on the maximum likelihood estimates of the allele probabilities, and the second (multinomial) method uses the phenotype proportions in the sample. The latter is easy to use, the estimates are always unbiased, and simple formulae for variances are available. The former method, although giving more efficient estimates, requires the assumption of panmixia so that the Hardy-Weinberg law can be used. The two methods are compared theoretically, where possible, or by simulation. Under panmixia, the maximum likelihood estimates can be substantially more efficient than the multinomial estimates. The estimates are also compared in the codominant allele case for nonpanmictic populations. The question of efficiency is of importance when estimating the probability of obtaining a given set of phenotypes, i.e., the product of individual phenotype estimators. This problem is discussed briefly.

3.
Z. B. Zeng  C. C. Cockerham 《Genetics》1991,129(2):535-553
The variances of genetic variances within and between finite populations were systematically studied using a general multiple allele model with mutation in terms of identity by descent measures. We partitioned the genetic variances into components corresponding to genetic variances and covariances within and between loci. We also analyzed the sampling variance. Both transient and equilibrium results were derived exactly and the results can be used in diverse applications. For the genetic variance within populations, σ²_w, the coefficient of variation can be very well approximated as [formula: see text] for a normal distribution of allelic effects, ignoring recurrent mutation in the absence of linkage, where m is the number of loci, N is the effective population size, θ_1(0) is the initial identity by descent measure of two genes within populations and t is the generation number. The first term is due to genic variance, the second due to linkage disequilibrium, and the third due to sampling. In the short term, the variation is predominantly due to linkage disequilibrium and sampling; but in the long term it can be largely due to genic variance. At equilibrium with mutation [formula: see text] where u is the mutation rate. The genetic variance between populations is a parameter. Variance arises only among sample estimates due to finite sampling of populations and individuals. The coefficient of variation for the sample genetic variance between populations, σ²_b, can be generally approximated as [formula: see text] when the number of loci is large, where S is the number of sampled populations.

4.
Current methods for measures of genetic diversity of populations and germplasm collections are often based on statistics calculated from molecular markers. The objective of this study was to investigate the precision and accuracy of the most common estimators of genetic variability and population structure, as calculated from simple sequence repeat (SSR) marker data from cacao (Theobroma cacao L.). Computer simulated genomes of replicate populations were generated from initial allele frequencies estimated using SSR data from cacao accessions in a collection. The simulated genomes consisted of ten linkage groups of 100 cM in length each. Heterozygosity, gene diversity and the F statistics were studied as a function of number of loci and trees sampled. The results showed that relatively small random samples of trees were needed to achieve consistency in the observed estimations. In contrast, very large random samples of loci per linkage group were required to enable reliable inferences on the whole genome. Precision of estimates was increased by more than 50% with an increase in sample size from one to five loci per linkage group or 50 per genome, and up to 70% with ten loci per linkage group, or equivalently, 100 loci per genome. The use of fewer, highly polymorphic loci to analyze genetic variability led to estimates with substantially smaller variance but with an upward bias. Nevertheless, the relative differences of estimates among populations were generally consistent for the different levels of polymorphism considered.

5.
Hill WG  Weir BS 《Molecular ecology》2004,13(4):895-908
A moment-based method for estimating a measure of population diversity, theta or Wright's FST, is given for dominant markers such as amplified fragment length polymorphisms (AFLPs) or RAPDs in noninbred populations. Basic assumptions are that there is random mating, Hardy-Weinberg equilibrium, linkage equilibrium, no mutation from the common ancestor and equally distant populations. It is based on the variances between and within populations of genotype frequencies, whereas previous moment methods for dominant markers have been indirect in that they have been based on first estimating allele frequencies and then using the variances of those frequencies. The use of genotype frequencies directly appears to be more robust. Approximate sampling errors of the estimates are given. Methods are extended to estimate genetic distances and their sampling errors. The AFLP data from samples of breeds of pig are used for illustration.

6.
The allele frequencies at ten polymorphic loci are described from 31 Bufo marinus populations in the Moreton Bay region in southeastern Queensland, Australia and the variation of these is found to be non-random in all cases. The pattern of non-randomness varies among loci, being clinal in two instances. The allele frequencies at the same ten loci are also described for 12 populations sampled from throughout B. marinus' Australian range. The frequency variation on this larger geographical scale is non-random at all but two loci (Mpi and Hbdh) and also varies among loci, in this case being clinal in four instances. In both cases, the patterns of variation are most reasonably explained as having resulted from genetic drift occurring during the recent range expansion which B. marinus is known to have experienced in Australia. It seems that natural selection has played little, if any, role in generating the observed gene frequency patterns. These results emphasize the need for caution in interpreting geographical patterns of variation. They show that even when clinal patterns exist at some loci but not at others, one cannot conclude that the patterns result from natural selection, unless the demographic histories of the studied populations are known and are inconsistent with the alternative hypothesis that the patterns result from genetic drift.

7.
Inferences of selection and migration in the Danish house mouse hybrid zone
We analysed the patterns of allele frequency change for ten diagnostic autosomal allozyme loci in the hybrid zone between the house mouse subspecies Mus musculus domesticus and M. m. musculus in central Jutland. After determining the general orientation of the clines of allele frequencies, we analysed the cline shapes along the direction of maximum gradient. Eight of the ten clines are best described by steep central steps with coincident positions and an average width of 8.9 km (support limits 7.6–12.4) flanked by tails of introgression, indicating the existence of a barrier to gene flow and only weak selection on the loci studied. We derived estimates of migration from linkage disequilibrium in the centre of the zone, and by applying isolation by distance methods to microsatellite data from some of these populations. These give concordant estimates of σ = 0.5–0.8 km generation. The barrier to gene flow is of the order of 20 km (support limits 14–28), and could be explained by selection of a few per cent at 43–120 underdominant loci that reduces the mean fitness in the central populations to 0.45. Some of the clines appear symmetrical, whereas others are strongly asymmetrical, and two loci appear to have escaped the central barrier to gene flow, reflecting the differential action of selection on different parts of the genome. Asymmetry is always in the direction of more introgression into musculus, indicating either a general progression of domesticus into the musculus territory, possibly mediated by differential behaviour, or past movement of the hybrid zone in the opposite direction, impeded by potential geographical barriers to migration in domesticus territory. © 2005 The Linnean Society of London, Biological Journal of the Linnean Society, 2005, 84, 593–616.

8.
Z. B. Zeng  D. Houle    C. C. Cockerham 《Genetics》1990,126(1):235-247
S. Wright suggested an estimator, m̂, of the number of loci, m, contributing to the difference in a quantitative character between two differentiated populations, which is calculated from the phenotypic means and variances in the two parental populations and their F1 and F2 hybrids. The same method can also be used to estimate m contributing to the genetic variance within a single population, by using divergent selection to create differentiated lines from the base population. In this paper we systematically examine the utility and problems of this technique under the influences of unequal allelic effects and initial allele frequencies, and linkage, which are known to lead m̂ to underestimate m. In addition, we examine the effects of population size and selection intensity during the generations of selection. During selection, the estimator m̂ rapidly approaches its expected value at the selection limit. With reasonable assumptions about unequal allelic effects and initial allele frequencies, the expected value of m̂ without linkage is likely to be on the order of one-third of the number of genes. The estimates suffer most seriously from linkage. The practical maximum expectation of m̂ is just about the number of chromosomes, considerably less than the "recombination index" which has been assumed to be the upper limit. The estimates are also associated with large sampling variances. An estimator of the variance of m̂ derived by R. Lande substantially underestimates the actual variance. Modifications to the method can ameliorate some of the problems. These include using F3 or later generation variances or the genetic variance in the base population, and replicating the experiments and estimation procedure. However, even in the best of circumstances, information from m̂ is very limited and can be misleading.
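For orientation, the classical form of Wright's estimator (often called the Castle-Wright estimator) divides the squared difference of the parental means by eight times the segregation variance, taken here as the F2 variance minus the F1 variance. The sketch below assumes that textbook form and additive, equal, unlinked effects; the function and argument names are hypothetical, and none of the modifications discussed in the abstract are included.

```python
def castle_wright_loci(mean_p1, mean_p2, var_f1, var_f2):
    """Classical estimate of the effective number of loci.

    mean_p1, mean_p2: phenotypic means of the two parental populations.
    var_f1, var_f2:   phenotypic variances of the F1 and F2 hybrids.
    The segregation variance is approximated as var_f2 - var_f1.
    """
    segregation_var = var_f2 - var_f1
    if segregation_var <= 0:
        raise ValueError("F2 variance must exceed F1 variance")
    return (mean_p1 - mean_p2) ** 2 / (8.0 * segregation_var)


# Illustrative numbers (made up): parental means 10 and 30,
# F1 variance 2, F2 variance 7.
estimated_loci = castle_wright_loci(10.0, 30.0, 2.0, 7.0)
```

Since the segregation variance sits in the denominator, small errors in the two hybrid variances translate into large errors in the estimate, which is in line with the large sampling variances the abstract reports.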

9.
We studied the patterns of within- and between-population variation at 29 trinucleotide loci in a random sample of 200 healthy individuals from four diverse populations: Germans, Nigerians, Chinese, and New Guinea highlanders. The loci were grouped as disease-causing (seven loci with CAG repeats), gene-associated (seven loci with CAG/CCG repeats and eight loci with AAT repeats), or anonymous (seven loci with AAT repeats). We used heterozygosity and variance of allele size (expressed in units of repeat counts) as measures of within-population variability and GST (based on heterozygosity as well as on allele size variance) as the measure of genetic differentiation between populations. Our observations are: (1) locus type is the major significant factor for differences in within-population genetic variability; (2) the disease-causing CAG repeats (in the nondisease range of repeat counts) have the highest within-population variation, followed by the AAT-repeat anonymous loci, the AAT-repeat gene-associated loci, and the CAG/CTG-repeat gene-associated loci; (3) an imbalance index beta, the ratio of the estimates of the product of effective population size and mutation rate based on allele size variance and heterozygosity, is the largest for disease-causing loci, followed by AAT- and CAG/CCG-repeat gene-associated loci and AAT-repeat anonymous loci; (4) mean allele size correlates positively with allele size variance for AAT- and CAG/CCG-repeat gene-associated loci and negatively for anonymous loci; and (5) GST is highest for the disease-causing loci. These observations are explained by specific differences of rates and patterns of mutations in these four groups of trinucleotide loci, taking into consideration the effects of the past demographic history of the modern human population.

10.
Slatkin M  Muirhead CA 《Genetics》2000,156(4):2119-2126
A method is proposed for estimating the intensity of overdominant selection scaled by the effective population size, S = 2Ns, from allele frequencies. The method is based on the assumption that, with strong overdominant selection, allele frequencies are nearly at their deterministic equilibrium values and that, to a first approximation, deviations depend only on S. Simulations verify that reasonably accurate estimates of S can be obtained for realistic sample sizes. The method is applied to data from several loci in the major histocompatibility complex (Mhc) in numerous human populations. For alleles distinguished by both serological typing and the sequence of the peptide-binding region, our estimates of S are comparable to those obtained by analysis of DNA sequences in showing that selection is strongest on HLA-B and weaker on HLA-A, HLA-DRB1, and HLA-DQA1. The intensity of selection on HLA-B varied considerably among populations. Two populations, Native American and Inuit, showed an excess rather than a deficiency in homozygosity. Comparable estimates of S were obtained for alleles at Mhc class II loci distinguished by serological reactions (serotyping) and by differences in the amino acid sequences of the peptide-binding region (molecular typing). A comparison of two types of data for DQA1 and DRB1 showed that serotyping led to generally lower estimates of S.

11.
We investigated 39 previously developed Betula, Alnus, and Corylus simple sequence repeat (SSR) markers for their utility in the cross-generic amplification of two European alder species, i.e., Alnus glutinosa and A. incana. Of these markers, ten loci had successful amplification within Alnus species. Finally, we designed two multiplexes composed of eight and nine loci for A. glutinosa and A. incana, respectively. Multiplexes were tested on 100 samples from five different populations of each species across Europe. The majority of loci had a relatively high genetic diversity, were in Hardy–Weinberg equilibrium, and showed low error rates and low occurrence of null alleles. By comparing sequences of source species and both Alnus species, we concluded that repeat motifs of five of these ten loci differed from those described for the source species. These differences represent mainly the modifications of the original motifs and affected compound or interrupted repeats as well as pure ones. The repeat motifs of three loci of the two alder species also differed. These mutations could lead to erroneous estimates of allele homology, because alleles with identical lengths will not have the same number of repeat units. Hence, before using microsatellite markers in studies comparing two or more species, they should be carefully examined and sequenced to ensure that allele homology is really stable and not affected by various inserts that change the sequence.

12.
Falush D  Stephens M  Pritchard JK 《Genetics》2003,164(4):1567-1587
We describe extensions to the method of Pritchard et al. for inferring population structure from multilocus genotype data. Most importantly, we develop methods that allow for linkage between loci. The new model accounts for the correlations between linked loci that arise in admixed populations ("admixture linkage disequilibrium"). This modification has several advantages, allowing (1) detection of admixture events farther back into the past, (2) inference of the population of origin of chromosomal regions, and (3) more accurate estimates of statistical uncertainty when linked loci are used. It is also of potential use for admixture mapping. In addition, we describe a new prior model for the allele frequencies within each population, which allows identification of subtle population subdivisions that were not detectable using the existing method. We present results applying the new methods to study admixture in African-Americans, recombination in Helicobacter pylori, and drift in populations of Drosophila melanogaster. The methods are implemented in a program, structure, version 2.0, which is available at http://pritch.bsd.uchicago.edu.

13.
Recent admixture between genetically differentiated populations can result in high levels of association between alleles at loci that are ≤10 cM apart. The transmission/disequilibrium test (TDT) proposed by Spielman et al. (1993) can be a powerful test of linkage between disease and marker loci in the presence of association and therefore could be a useful test of linkage in admixed populations. The degree of association between alleles at two loci depends on the differences in allele frequencies, at the two loci, in the founding populations; therefore, the choice of marker is important. For a multiallelic marker, one strategy that may improve the power of the TDT is to group marker alleles within a locus, on the basis of information about the founding populations and the admixed population, thereby collapsing the marker into one with fewer alleles. We have examined the consequences of collapsing a microsatellite into a two-allele marker, when two founding populations are assumed for the admixed population, and have found that if there is random mating in the admixed population, then typically there is a collapsing for which the power of the TDT is greater than that for the original microsatellite marker. A method is presented for finding the optimal collapsing that has minimal dependence on the disease and that uses estimates either of marker allele frequencies in the two founding populations or of marker allele frequencies in the current, admixed population and in one of the founding populations. Furthermore, this optimal collapsing is not always the collapsing with the largest difference in allele frequencies in the founding populations. To demonstrate this strategy, we considered a recent data set, published previously, that provides frequency estimates for 30 microsatellites in 13 populations.

14.
We have used a new method for binning minisatellite alleles (semi-automated allele aggregation) and report the extent of population diversity detectable by eleven minisatellite loci in 2,689 individuals from 19 human populations distributed widely throughout the world. Whereas population relationships are consistent with those found in other studies, our estimate of genetic differentiation (F_ST) between populations is less than 8%, which is lower than comparative estimates of between 10% and 15% obtained by using other sources of polymorphism data. We infer that mutational processes are involved in reducing F_ST estimates from minisatellite data because, first, the lowest F_ST estimates are found at loci showing autocorrelated frequencies among alleles of similar size and, second, F_ST declines with heterozygosity but by more than predicted assuming simple models of mutation. These conclusions are consistent with the view that minisatellites are subject to selective or mutational constraints in addition to those expected under simple step-wise mutation models.

15.
E S Dement'eva 《Genetika》1975,10(7):122-130
The article presents an analysis of the structure of the greater Pamirs population (of a higher rank) and of one of its parts, the subpopulation of the Bartang river valley. Wright's F coefficient was used for the statistical treatment of the data obtained in the course of the analysis. The FST estimates were obtained from the variances of the frequencies of the genes located in 5 loci (ABO, MN, P, Rh and P.T.C.) calculated for 23 samples of the greater Pamirs population and for 9 samples of the population of the Bartang river valley. The general inbreeding coefficient for the Pamirs is FIT = 0.0323, its random component is FST = 0.0017, and its non-random component is FIS = 0.0306.
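The three coefficients quoted in this abstract (FIT = 0.0323, FST = 0.0017, FIS = 0.0306) can be checked against Wright's standard identity (1 − FIT) = (1 − FIS)(1 − FST). The snippet below is only a consistency check under that identity; the abstract does not state how the authors combined the components, and the function name is hypothetical.

```python
def f_it_from_components(f_is, f_st):
    """Total inbreeding coefficient implied by Wright's identity
    (1 - F_IT) = (1 - F_IS) * (1 - F_ST)."""
    return 1.0 - (1.0 - f_is) * (1.0 - f_st)


# Values reported in the abstract.
reported_f_it = 0.0323
implied_f_it = f_it_from_components(f_is=0.0306, f_st=0.0017)

# Absolute gap between the reported and implied totals.
discrepancy = abs(implied_f_it - reported_f_it)
```

With both components this small, the product term is negligible, so the total is close to the simple sum FIS + FST, and the reported values agree with the identity to within rounding at the fourth decimal place.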

16.
Estimates of the effective number of breeding adults were derived for three semi-isolated populations of the common toad Bufo bufo based on temporal (i.e. adult-progeny) variance in allele frequency for three highly polymorphic minisatellite loci. Estimates of spatial variance in allele frequency among populations and of age-specific measures of genetic variability are also described. Each population was characterized by a low effective adult breeding number (N_b) based on a large age-specific variance in minisatellite allele frequency. Estimates of N_b (range 21–46 for population means across three loci) were approximately 55–230-fold lower than estimates of total adult census size. The implications of low effective breeding numbers for long-term maintenance of genetic variability and population viability are discussed relative to the species' reproductive ecology, current land-use practices, and present and historical habitat modification and loss. The utility of indirect measures of population parameters such as N_b and N_e based on time-series data of minisatellite allele frequencies is discussed relative to similar measures estimated from commonly used genetic markers such as protein allozymes.

17.
Studies of inbreeding depression or kin selection require knowledge of relatedness between individuals. If pedigree information is lacking, one has to rely on genotypic information to infer relatedness. In this study we investigated the performance (absolute and relative) of 10 marker-based relatedness estimators using allele frequencies at microsatellite loci obtained from natural populations of two bird species and one mammal species. Using Monte Carlo simulations we show that many factors affect the performance of estimators and that different sets of loci promote the use of different estimators: in general, there is no single best-performing estimator. The use of locus-specific weights turns out to greatly improve the performance of estimators when marker loci are used that differ strongly in allele frequency distribution. Microsatellite-based estimates are expected to explain between 25 and 79% of variation in true relatedness depending on the microsatellite dataset and on the population composition (i.e. the frequency distribution of relationship in the population). We recommend performing Monte Carlo simulations to decide which estimator to use in studies of pairwise relatedness.

18.
The effective population size (N_e) is notoriously difficult to accurately estimate in wild populations as it is influenced by a number of parameters that are difficult to delineate in natural systems. The different methods that are used to estimate N_e are affected variously by different processes at the population level, such as the life-history characteristics of the organism, gene flow, and population substructure, as well as by the frequency patterns of genetic markers used and the sampling design. Here, we compare N_e estimates obtained by different genetic methods and from demographic data and elucidate how the estimates are affected by various factors in an exhaustively sampled and comprehensively described natural brown trout (Salmo trutta) system. In general, the methods yielded rather congruent estimates, and we ascribe that to the adequate genotyping and exhaustive sampling. Effects of violating the assumptions of the different methods were nevertheless apparent. In accordance with theoretical studies, skewed allele frequencies would underestimate temporal allele frequency changes and thereby upwardly bias N_e if not accounted for. Overlapping generations and iteroparity would also upwardly bias N_e when applied to temporal samples taken over short time spans. Gene flow from a genetically not very dissimilar source population decreases temporal allele frequency changes and thereby acts to increase estimates of N_e. Our study reiterates the importance of adequate sampling, quantification of life-history parameters and gene flow, and incorporating these data into the N_e estimation.

19.
L. Ollivier  LLG. Janss 《Genetics》1993,135(3):907-909
A method of estimating the number of loci contributing to quantitative variation was proposed by S. Wright in 1921. The method makes use of the means of inbred lines and the variances of their F1, F2 and backcrosses. The method was extended to crosses between outbreeding populations by R. Lande in 1981. Additive gene action is one of the major assumptions required for obtaining valid estimates. It is shown here that this assumption may be relaxed. One can estimate both a total number of effective loci and a number of dominant loci (the latter only when the parents are inbred) by comparing the variances of the F1, F2 and backcrosses. Numerical illustrations are given, based on crossbreeding data.

20.
The Mauna Kea silversword, Argyroxiphium sandwicense ssp. sandwicense, has experienced both a severe population crash associated with an increase in alien ungulate populations on Mauna Kea, and a population bottleneck associated with reintroduction. In this paper, we address the genetic consequences of both demographic events using eight microsatellite loci. The population crash was not accompanied by a significant reduction in number of alleles or heterozygosity. However, the population bottleneck was accompanied by significant reductions in observed number of alleles, effective number of alleles, and expected heterozygosity, though not in observed heterozygosity. The effective size of the population bottleneck was calculated using both observed heterozygosities and allele frequency variances. Both methods corroborated the historical census size of the population bottleneck of at most three individuals. The results suggest that: (i) small populations, even those that result from severe reductions in historical population size and extent, are not necessarily genetically depauperate; and (ii) species reintroduction plans need to be conceived and implemented carefully, with due consideration to the genetic impact of sampling for reintroduction.
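One common way to turn a drop in heterozygosity into an effective bottleneck size is the single-generation drift expectation H_after = H_before × (1 − 1/(2Ne)), solved for Ne. The sketch below assumes that textbook relation and a one-generation bottleneck; it is not necessarily the calculation used in the paper, and the function name and example values are hypothetical.

```python
def bottleneck_ne_from_heterozygosity(h_before, h_after):
    """Effective size implied by a one-generation loss of heterozygosity.

    Solves h_after = h_before * (1 - 1 / (2 * Ne)) for Ne.
    """
    if not 0.0 < h_after < h_before:
        raise ValueError("expected 0 < h_after < h_before")
    loss_fraction = 1.0 - h_after / h_before
    return 1.0 / (2.0 * loss_fraction)


# Illustrative values (made up): heterozygosity falls from 0.6 to 0.5.
ne_estimate = bottleneck_ne_from_heterozygosity(0.6, 0.5)
```

The guard rejects cases where heterozygosity did not decline, because the relation then implies an infinite or negative effective size.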
