Similar articles
 20 similar articles found (search time: 125 ms)
1.
An estimator for pairwise relatedness using molecular markers   Total citations: 21, self-citations: 0, citations by others: 21
Wang J 《Genetics》2002,160(3):1203-1215
I propose a new estimator for jointly estimating two-gene and four-gene coefficients of relatedness between individuals from an outbreeding population with data on codominant genetic markers and compare it, by Monte Carlo simulations, to previous ones in precision and accuracy for different distributions of population allele frequencies, numbers of alleles per locus, actual relationships, sample sizes, and proportions of relatives included in samples. In contrast to several previous estimators, the new estimator is well behaved and applies to any number of alleles per locus and any allele frequency distribution. The estimates for two- and four-gene coefficients of relatedness from the new estimator are unbiased irrespective of the sample size and have sampling variances decreasing consistently with an increasing number of alleles per locus to the minimum asymptotic values determined by the variation in identity-by-descent among loci per se, regardless of the actual relationship. The new estimator is also robust for small sample sizes and for unknown relatives being included in samples for estimating allele frequencies. Compared to previous estimators, the new one is generally advantageous, especially for highly polymorphic loci and/or small sample sizes.

2.
Measurement of temporal change in allele frequencies represents an indirect method for estimating the genetically effective size of populations. When allele frequencies are estimated for gene markers that display dominant gene expression, such as random amplified polymorphic DNA (RAPD) and amplified fragment length polymorphism (AFLP) markers, the estimates can be seriously biased. We quantify bias for previous allele frequency estimators and present a new expression that is generally less biased and provides a more precise assessment of temporal allele frequency change. We further develop an estimator for effective population size that is appropriate when dealing with dominant gene markers. Comparison with estimates based on codominantly expressed genes, such as allozymes or microsatellites, indicates that about twice as many loci or sampled individuals are required when using dominant markers to achieve the same precision.

3.
A. M. Valdes  M. Slatkin  N. B. Freimer 《Genetics》1993,133(3):737-749
We summarize available data on the frequencies of alleles at microsatellite loci in human populations and compare observed distributions of allele frequencies to those generated by a simulation of the stepwise mutation model. We show that observed frequency distributions at 108 loci are consistent with the results of the model under the assumption that mutations cause an increase or decrease in repeat number by one and under the condition that the product Nu, where N is the effective population size and u is the mutation rate, is larger than one. We show that the variance of the distribution of allele sizes is a useful estimator of Nu and performs much better than previously suggested estimators for the stepwise mutation model. In the data, there is no correlation between the mean and variance in allele size at a locus or between the number of alleles and mean allele size, which suggests that the mutation rate at these loci is independent of allele size.

4.
Kitakado T  Kitada S  Obata Y  Kishino H 《Genetics》2006,173(4):2063-2072
In stock enhancement programs, it is important to assess mixing rates of released individuals in stocks. For this purpose, genetic stock identification has been applied. The allele frequencies in a composite population are expressed as a mixture of the allele frequencies in the natural and released populations. The estimation of mixing rates is possible, under successive sampling from the composite population, on the basis of temporal changes in allele frequencies. The allele frequencies in the natural population may be estimated from those of the composite population in the preceding year. However, it should be noted that these frequencies can vary between generations due to genetic drift. In this article, we develop a new method for simultaneous estimation of mixing rates and genetic drift in a stock enhancement program. Numerical simulation shows that our procedure estimates the mixing rate with little bias. Although the genetic drift is underestimated when the amount of information is small, reduction of the bias is possible by analyzing multiple unlinked loci. The method was applied to real data on mud crab stocking, and the result showed a yearly variation in the mixing rate.

5.
Tallmon DA  Luikart G  Beaumont MA 《Genetics》2004,167(2):977-988
We describe and evaluate a new estimator of the effective population size (N(e)), a critical parameter in evolutionary and conservation biology. This new "SummStat" N(e) estimator is based upon the use of summary statistics in an approximate Bayesian computation framework to infer N(e). Simulations of a Wright-Fisher population with known N(e) show that the SummStat estimator is useful across a realistic range of individuals and loci sampled, generations between samples, and N(e) values. We also address the paucity of information about the relative performance of N(e) estimators by comparing the SummStat estimator to two recently developed likelihood-based estimators and a traditional moment-based estimator. The SummStat estimator is the least biased of the four estimators compared. In 32 of 36 parameter combinations investigated using initial allele frequencies drawn from a Dirichlet distribution, it has the lowest bias. The relative mean square error (RMSE) of the SummStat estimator was generally intermediate to the others. All of the estimators had RMSE > 1 when small samples (n = 20, five loci) were collected a generation apart. In contrast, when samples were separated by three or more generations and N(e) ≤ 50, the SummStat and likelihood-based estimators all had greatly reduced RMSE. Under the conditions simulated, SummStat confidence intervals were more conservative than the likelihood-based estimators and more likely to include true N(e). The greatest strength of the SummStat estimator is its flexible structure. This flexibility allows it to incorporate any potentially informative summary statistic from population genetic data.

6.
Wang J 《Molecular ecology》2004,13(10):3169-3178
Knowledge of the genetic relatedness between a pair of individuals is important in many research areas of quantitative genetics, conservation genetics, evolution and ecology. Many estimators have been developed to estimate such pairwise relatedness (r) using codominant markers, such as microsatellites and enzymes. In contrast, only two estimators are proposed to use dominant markers, such as random amplified polymorphic DNAs (RAPDs) and amplified fragment length polymorphisms (AFLPs), in relatedness inference. They are both biased estimators, and their statistical properties and robustness to the sampling errors in allele frequency have not been investigated. In this short paper, I propose two new pairwise relatedness estimators for dominant markers, and compare them in precision, accuracy and robustness to sampling with the two previous estimators using simulations. It was found that the new estimator based on the least squares approach is unbiased when allele frequencies are known or estimated from a sample without correcting for sampling effects. It has, however, a low precision and as a result, an intermediate overall performance among the four estimators in terms of the mean squared deviation (MSD) of estimates from actual values of r. The new estimator based on a similarity index is slightly biased but has generally the lowest MSD among the four estimators compared, regardless of the number of loci, type of actual relationships, allele frequencies known or estimated from samples. Simulations also show that the confidence intervals estimated by bootstrapping are appropriate for different estimators provided that the number of loci used in the estimation is not small.

7.
E G Williamson  M Slatkin 《Genetics》1999,152(2):755-761
We develop a maximum-likelihood framework for using temporal changes in allele frequencies to estimate the number of breeding individuals in a population. We use simulations to compare the performance of this estimator to an F-statistic estimator of variance effective population size. The maximum-likelihood estimator had a lower variance and smaller bias. Taking advantage of the likelihood framework, we extend the model to include exponential growth and show that temporal allele frequency data from three or more sampling events can be used to test for population growth.
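
For orientation, the kind of F-statistic (moment-based) temporal estimator used as the baseline in comparisons like the one above can be sketched as follows. This is a minimal illustration, not the maximum-likelihood method of the abstract. It assumes the commonly used Fc statistic averaged over alleles and the usual correction for sampling diploid individuals; the function names `fc_statistic` and `temporal_ne` and their parameters are hypothetical.

```python
def fc_statistic(x, y):
    """Standardized variance in allele frequency change (Fc) at one locus.

    x, y: allele frequencies in the first and second sample, same allele order.
    Alleles absent from (or fixed in) both samples are skipped to avoid 0/0.
    """
    terms = []
    for xi, yi in zip(x, y):
        denom = (xi + yi) / 2 - xi * yi
        if denom > 0:
            terms.append((xi - yi) ** 2 / denom)
    return sum(terms) / len(terms)


def temporal_ne(x, y, generations, s0, st):
    """Moment-based estimate of variance effective size.

    s0, st: numbers of diploid individuals sampled at each time point
    (an assumption of this sketch). Returns infinity when the observed
    change is no larger than expected from sampling error alone.
    """
    drift = fc_statistic(x, y) - 1 / (2 * s0) - 1 / (2 * st)
    if drift <= 0:
        return float("inf")
    return generations / (2 * drift)
```

With two alleles shifting from (0.5, 0.5) to (0.6, 0.4) over one generation and 50 individuals per sample, Fc is 0.04, the sampling correction removes 0.02, and the estimate is 25.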

8.
One of the most common questions asked before starting a new population genetic study using microsatellite allele frequencies is “how many individuals do I need to sample from each population?” This question has previously been answered by addressing how many individuals are needed to detect all of the alleles present in a population (i.e. rarefaction based analyses). However, we argue that obtaining accurate allele frequencies and accurate estimates of diversity are much more important than detecting all of the alleles, given that very rare alleles (i.e. new mutations) are not very informative for assessing genetic diversity within a population or genetic structure among populations. Here we present a comparison of allele frequencies, expected heterozygosities and genetic distances between real and simulated populations by randomly subsampling 5–100 individuals from four empirical microsatellite genotype datasets (Formica lugubris, Sciurus vulgaris, Thalassarche melanophris, and Himantopus novaezelandiae) to create 100 replicate datasets at each sample size. Despite differences in taxon (two birds, one mammal, one insect), population size, number of loci and polymorphism across loci, the degree of differences between simulated and empirical dataset allele frequencies, expected heterozygosities and pairwise FST values were almost identical among the four datasets at each sample size. Variability in allele frequency and expected heterozygosity among replicates decreased with increasing sample size, but these decreases were minimal above sample sizes of 25 to 30. Therefore, there appears to be little benefit in sampling more than 25 to 30 individuals per population for population genetic studies based on microsatellite allele frequencies.
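
The subsampling design described above can be sketched in a few lines. This is only an illustration under stated assumptions: genotypes are stored as tuples of two allele labels at a single locus, individuals are drawn without replacement, and variability is summarized as the standard deviation of one allele's frequency across replicates. The names `allele_frequency` and `subsample_sd` are hypothetical and do not come from the study.

```python
import random
from statistics import pstdev


def allele_frequency(genotypes, allele):
    """Frequency of `allele` among all gene copies in a list of diploid genotypes."""
    copies = [a for genotype in genotypes for a in genotype]
    return copies.count(allele) / len(copies)


def subsample_sd(genotypes, allele, sample_size, replicates=100, seed=0):
    """SD of an allele frequency across replicate subsamples of individuals."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    freqs = [
        allele_frequency(rng.sample(genotypes, sample_size), allele)
        for _ in range(replicates)
    ]
    return pstdev(freqs)


# Toy single-locus dataset: 50 AA, 30 AB and 20 BB individuals.
population = [("A", "A")] * 50 + [("A", "B")] * 30 + [("B", "B")] * 20
sd_by_size = {n: subsample_sd(population, "A", n) for n in (5, 25, 50)}
```

Comparing the entries of `sd_by_size` shows how quickly the spread among replicates shrinks as more individuals are sampled, which is the quantity the abstract reports as levelling off.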

9.
GONe is a user-friendly, Windows-based program for estimating effective size (N(e)) in populations with overlapping generations. It uses the Jorde-Ryman modification to the temporal method to account for age structure in populations. This method requires estimates of age-specific survival and birth rate and allele frequencies measured in two or more consecutive cohorts. Allele frequencies are acquired by reading in genotypic data from files formatted for either GENEPOP or TEMPOFS. For each interval between consecutive cohorts, N(e) is estimated at each locus and over all loci. Furthermore, N(e) estimates are output for three different genetic drift estimators (F(s), F(c) and F(k)). Confidence intervals are derived from a chi-square distribution with degrees of freedom equal to the number of independent alleles. GONe has been validated over a wide range of N(e) values, and for scenarios where survival and birth rates differ between sexes, sex ratios are unequal and reproductive variances differ. GONe is freely available for download at https://bcrc.bio.umass.edu/pedigreesoftware/.

10.
A new genetic estimator of the effective population size (N(e)) is introduced. This likelihood-based (LB) estimator uses two temporally spaced genetic samples of individuals from a population. We compared its performance to that of the classical F-statistic-based N(e) estimator (N(eFk)) by using data from simulated populations with known N(e) and real populations. The new likelihood-based estimator (N(eLB)) showed narrower credible intervals and greater accuracy than N(eFk) when genetic drift was strong, but performed only slightly better when genetic drift was relatively weak. When drift was strong (e.g., N(e) = 20 for five generations), as few as approximately 10 loci (heterozygosity of 0.6; samples of 30 individuals) are sufficient to consistently achieve credible intervals with an upper limit <50 using the LB method. In contrast, approximately 20 loci are required for the same precision when using the classical F-statistic approach. The N(eLB) estimator is much improved over the classical method when there are many rare alleles. It will be especially useful in conservation biology because it less often overestimates N(e) than does N(eFk) and thus is less likely to erroneously suggest that a population is large and has a low extinction risk.

11.
DNA typing offers a unique opportunity to identify individuals for medical and forensic purposes. Probabilistic inference regarding the chance occurrence of a match between the DNA type of an evidentiary sample and that of an accused suspect, however, requires reliable estimation of genotype and allele frequencies in the population. Although population-based data on DNA typing at several hypervariable loci are being accumulated at various laboratories, a rigorous treatment of the sample size needed for such purposes has not been made from population genetic considerations. It is shown here that the loci that are potentially most useful for forensic identification of individuals have the intrinsic property that they involve a large number of segregating alleles, and a great majority of these alleles are rare. As a consequence, because of the large number of possible genotypes at the hypervariable loci that offer the maximum potential for individualization, the sample size needed to observe all possible genotypes in a sample is large. In fact, the size is so large that even if such a huge number of individuals could be sampled, it could not be guaranteed that such a sample was drawn from a single homogeneous population. Therefore adequate estimation of genotypic probabilities must be based on allele frequencies, and the sample size needed to represent all possible alleles is far more reasonable. Further economization of sample size is possible if one wants to have representation of only the frequent alleles in the sample, so that the rare allele frequencies can be approximated by an upper bound for forensic applications.

12.
Studies on Persea americana have been addressed in different ways with biochemical and molecular techniques. Microsatellites are able to detect multiple alleles for particular loci and are therefore a useful tool to study genealogical relationships, population structures and genetic mapping. Ninety-six samples from 49 cultivars including three horticultural groups and hybrids were collected from the avocado germplasm bank at INIA-CENIAP (Venezuela). A modified DNA extraction protocol was performed. Forty microsatellites were selected from previous references, PCR amplifications were performed, and presence/absence, size, and number of alleles were evaluated on polyacrylamide gels. Attributes for polymorphic alleles were analyzed with POPGENE, and genetic diversity was calculated by effective sample size, number of alleles per locus (Na), effective number of alleles (Ne), Shannon information index (In), observed heterozygosity (Ho), expected heterozygosity (He), Wright’s fixation index (Fis), and allele frequencies. Only 14 primers amplified, and the AVT106 primer proved monomorphic. Unique genotypes for each sample were obtained. Nine loci showed allele patterns that can be useful for taxonomic identification of cultivars or varieties. Comparing values of Fis with Ho and He, we found a direct relationship where low-heterozygosity alleles identified in the population may affect the expected level. Allele frequencies ranged from 0.5632 to 0.0105. For all loci, at least one rare allele was observed. With the available information from genetic analysis, an identification system was implemented for selected avocado cultivars maintained at the INIA-CENIAP Venezuelan germplasm bank on the basis of molecular data.

13.
Spatiotemporal diversity at 35 allozyme loci was assayed over 6 years in 1,207 individuals of wild emmer wheat (Triticum dicoccoides) from a microgeographic microsite, Ammiad, north Israel. This analysis used new methods and two additional sample sets (1988 and 1993) and previous allozymic data (1984–1987). This microsite includes four major habitats (North-facing slope, Valley, Ridge, and Karst) that show topographic and ecological heterogeneity. Significant temporal and spatial variations in allele frequencies and levels of genetic diversity were detected in the four subpopulations. Significant associations were observed among allele frequencies and gene diversities at different loci, indicating that many allele frequencies change over time in the same or opposite directions. Multiple regression analysis showed that variation in soil-water content and rainfall distribution in the growing season significantly affected 10 allele frequencies, numbers of alleles at 8 loci, and gene diversity at 4 loci. Random genetic drift and hitchhiking models may not explain such locus-specific spatiotemporal divergence and strong allelic correlation or locus correlation as well as the functional importance of allozymes. Natural ecological selection, presumably through water stress, might be an important force adaptively directing spatiotemporal allozyme diversity and divergence in wild emmer wheat at the Ammiad microsite. Received: 3 July 2000 / Accepted: 1 August 2000

14.
Summary: The main purpose of germplasm banks is to preserve the genetic variability existing in crop species. The effectiveness of the regeneration of collections stored in gene banks is affected by factors such as sample size, random genetic drift, and seed viability. The objective of this paper is to review probability models and population genetics theory to determine the choice of sample size used for seed regeneration. A number of conclusions can be drawn from the results. First, the size of the sample depends largely on the frequency of the least common allele or genotype. Genotypes or alleles occurring at frequencies of more than 10% can be preserved with a sample size of 40 individuals. A sample size of 100 individuals will preserve genotypes (alleles) that occur at frequencies of 5%. If the frequency of rare genotypes (alleles) drops below 5%, larger sample sizes are required. A second conclusion is that for two, three, and four alleles per locus the sample size required to include a copy of each allele depends more on the frequency of the rare allele or alleles than on the number. Samples of 300 to 400 are required to preserve alleles that are present at a frequency of 1%. Third, if seed is bulked, the expected number of parents involved in any sample drawn from the bulk will be less than the number of parents included in the bulk. Fourth, to maintain a rate of breeding (F) of 1%, the effective population size (N(e)) should be at least 150 for three alleles, and 300 for four alleles. Fifth, equalizing the reproductive output of each family to two progeny doubles the effective size of the population. Based on the results presented here, a practical option is considered for regenerating maize seed in a program constrained by limited funds. Part of this paper was presented at the Global Maize Germplasm Workshop, CIMMYT, El Batan, Mexico, March 6–12, 1988
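
The dependence of sample size on the frequency of the rarest allele can be illustrated with a simple probability sketch. The model below is an assumption made for illustration only: gene copies are treated as independent draws, so a sample of n diploid individuals misses an allele of frequency p with probability (1 − p)^(2n). The abstract does not state which probability models the paper reviews or what retention level it targets, so the numbers this sketch produces need not match the figures quoted above. The function names are hypothetical.

```python
import math


def prob_allele_captured(p, n_individuals, ploidy=2):
    """Probability that at least one copy of an allele with frequency p is sampled."""
    return 1.0 - (1.0 - p) ** (ploidy * n_individuals)


def min_individuals(p, target=0.95, ploidy=2):
    """Smallest number of individuals that captures the allele with probability >= target."""
    gene_copies = math.log(1.0 - target) / math.log(1.0 - p)
    return math.ceil(gene_copies / ploidy)
```

Under these assumptions the required sample grows roughly in proportion to 1/p, which is consistent with the abstract's point that the rarest allele, not the number of alleles, drives the sample size.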

15.
L Chikhi  M W Bruford  M A Beaumont 《Genetics》2001,158(3):1347-1362
When populations are separated for long periods and then brought into contact for a brief episode in part of their range, this can result in genetic admixture. To analyze this type of event we considered a simple model under which two parental populations (P1 and P2) mix and create a hybrid population (H). After that event, the three populations evolve under pure drift without exchange during T generations. We developed a new method, which allows the simultaneous estimation of the time since the admixture event (scaled by the population size t(i) = T/N(i), where N(i) is the effective population size of population i) and the contribution of one of two parental populations (which we call p1). This method takes into account drift since the admixture event, variation caused by sampling, and uncertainty in the estimation of the ancestral allele frequencies. The method is tested on simulated data sets and then applied to a human data set. We find that (i) for single-locus data, point estimates are poor indicators of the real admixture proportions even when there are many alleles; (ii) biallelic loci provide little information about the admixture proportion and the time since admixture, even for very small amounts of drift, but can be powerful when many loci are used; (iii) the precision of the parameters' estimates increases with sample size (n = 50 vs. n = 200) but this effect is larger for the t(i)'s than for p1; and (iv) the increase in precision provided by multiple loci is quite large, even when there is substantial drift (we found, for instance, that it is preferable to use five loci than one locus, even when drift is 100 times larger for the five loci). Our analysis of a previously studied human data set illustrates that the joint estimation of drift and p1 can provide additional insights into the data.

16.
Gene diversity is sometimes estimated from samples that contain inbred or related individuals. If inbred or related individuals are included in a sample, then the standard estimator for gene diversity produces a downward bias caused by an inflation of the variance of estimated allele frequencies. We develop an unbiased estimator for gene diversity that relies on kinship coefficients for pairs of individuals with known relationship and that reduces to the standard estimator when all individuals are noninbred and unrelated. Applying our estimator to data simulated based on allele frequencies observed for microsatellite loci in human populations, we find that the new estimator performs favorably compared with the standard estimator in terms of bias and similarly in terms of mean squared error. For human population-genetic data, we find that a close linear relationship previously seen between gene diversity and distance from East Africa is preserved when adjusting for the inclusion of close relatives.
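
As a point of reference, the standard estimator mentioned above is usually taken to be the sample-size-corrected form, n/(n − 1) × (1 − Σ p̂ᵢ²), where n is the number of sampled gene copies. The sketch below assumes that form and treats all gene copies as independent; it does not reproduce the kinship-adjusted estimator, whose formula the abstract does not give. The function name `gene_diversity` is hypothetical.

```python
from collections import Counter


def gene_diversity(allele_copies):
    """Sample-size-corrected gene diversity from a list of sampled allele labels.

    Assumes the gene copies come from noninbred, unrelated individuals;
    with relatives in the sample this estimate is expected to be too low.
    """
    n = len(allele_copies)
    counts = Counter(allele_copies)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)
```

For four gene copies split evenly between two alleles, the uncorrected value 1 − Σ p̂ᵢ² is 0.5 and the corrected estimate is 2/3, which shows how much the n/(n − 1) factor matters in very small samples.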

17.
Vitalis R  Couvet D 《Genetics》2001,157(2):911-925
Standard methods for inferring demographic parameters from genetic data are based mainly on one-locus theory. However, the association of genes at different loci (e.g., two-locus identity disequilibrium) may also contain some information about demographic parameters of populations. In this article, we define one- and two-locus parameters of population structure as functions of one- and two-locus probabilities for the identity in state of genes. Since these parameters are known functions of demographic parameters in an infinite island model, we develop moment-based estimators of effective population size and immigration rate from one- and two-locus parameters. We evaluate this method through simulation. Although variance and bias may be quite large, increasing the number of loci on which the estimates are derived improves the method. We simulate an infinite allele model and a K allele model of mutation. Bias and variance are smaller with increasing numbers of alleles per locus. This is, to our knowledge, the first attempt of a joint estimation of local effective population size and immigration rate.

18.
Nielsen R  Tarpy DR  Reeve HK 《Molecular ecology》2003,12(11):3157-3164
Estimating paternity and genetic relatedness is central to many empirical and theoretical studies of social insects. The two important measures of a queen's mating number are her actual number of mates and her effective number of mates. Estimating the effective number of mates is mathematically identical to the problem of estimating the effective number of alleles in population genetics, a common measure of genetic variability introduced by Kimura & Crow (1964). We derive a new bias-corrected estimator of effective number of types (mates or alleles) and compare this new method to previous methods for estimating true and effective numbers of types using Monte Carlo simulations. Our simulation results suggest that the examined estimators of the true number of types have very similar statistical properties, whereas the estimators of effective number of types have quite different statistical properties. Moreover, our new proposed estimator of effective number of types is approximately unbiased, and has considerably lower variance than the original estimator. Our new method will help researchers more accurately estimate intracolony genetic relatedness of social insects, which is an important measure in understanding their ecology and social behaviour. It should also be of use in population genetic studies in which the effective number of alleles is of interest.

19.
We present a Monte-Carlo simulation analysis of the statistical properties of absolute genetic distance and of Nei's minimum and standard genetic distances. The estimation of distances (bias) and of their variances is analysed as well as the distributions of distance and variance estimators, taking into account both gamete and locus samplings. Both of Nei's statistics are non-linear when distances are small and consequently the distributions of their estimators are extremely asymmetrical. It is difficult to find theoretical laws that fit such asymmetrical distributions. Absolute genetic distance is linear and its distributions are better fit by a normal distribution. When distances are medium or large, minimum distance and absolute distance distributions are close to a normal distribution, but those of the standard distance can never be considered as normal. For large distances the jack-knife estimator of the standard distance variance is bad; another standard distance estimator is suggested. Absolute distance, which has the best mathematical properties, is particularly interesting for small distances if the gamete sample size is large, even when the number of loci is small. When both distance and gamete sample size are small, this statistic is biased.

20.
J. S. F. Barker  P. D. East  B. S. Weir 《Genetics》1986,112(3):577-611
Temporal variation in allozyme frequencies at six loci was studied by making monthly collections over 4 yr in one population of the cactophilic species Drosophila buzzatii. Ten sites were defined within the study locality, and for all temporal samples, separate collections were made at each of these sites. Population structure over microgeographic space and changes in population structure over time were analyzed using F-statistic estimators, and multivariate analyses of allele and genotype frequencies with environmental variables were carried out. Allele frequencies showed significant variation over time, although there were no clear cyclical or seasonal patterns. A biplot analysis of allele frequencies over seasons within years and over years showed clear discrimination among years by alleles at four loci. During the 4 yr, three alleles showed directional changes which were associated with directional changes in environmental variables. Significant associations with one or more environmental variables were found for allele frequencies at every locus and for both expected and observed heterozygosities (except those for Est-1 and Est-2). Thus, variation in allele frequencies over time cannot be attributed solely to drift. Significant linkage disequilibria were detected among three loci (Est-2, Hex and Aldox), but there was no evidence for spatial or temporal patterns. The F-statistic analyses showed significant differentiation among months within years for all loci, but the statistic used (coancestry) was heterogeneous among loci. Estimates of F (inbreeding) for all loci were significantly different from zero, with the loci in four groups, Adh-1 (negative), Pgm (small positive), Est-2 and Hex (intermediate) and Est-1 and Aldox (high positive). The correlation of genes within individuals within populations (f) for each locus in each month by site sample differed among loci, as did the patterns of change in f over time (seasons). Heterogeneity in the F-statistic estimates indicates that natural selection is directly or indirectly affecting allele and genotype frequencies at some loci. However, the F-statistic analyses showed essentially no microgeographic structure (i.e., among sites), although there was significant heterogeneity in allele frequencies among flies emerging from individual rots. Thus, microspatial heterogeneity probably is most important at the level of individual rots, and coupled with habitat selection, it could be a major factor promoting diversifying selection and the maintenance of polymorphism. (ABSTRACT TRUNCATED AT 400 WORDS)
