Similar literature
20 similar articles found.
1.
The paper deals with the problem of estimating the mean life θ in an exponential model $f(x \mid \theta) = \frac{1}{\theta} e^{-x/\theta}$. It is assumed that, in addition to the current ordered sample, we have a sample collected sometime in the recent past when the mean life might have been β. We propose a sometimes-pool procedure based on the outcome of a preliminary test of H0: θ = β and obtain expressions for its bias and MSE. An attempt is made to locate the region of the parameter space in which the proposed estimator does better (in the MSE sense) than the usual estimator based only on the current sample.
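As a rough illustration of the sometimes-pool idea described in this abstract, the Python sketch below pools the past sample with the current one only when a preliminary test does not reject H0: θ = β. It assumes complete (uncensored) samples rather than the ordered sample of the paper. The function name `sometimes_pool` and the acceptance bounds `lower` and `upper` on the ratio of sample means are hypothetical; for complete exponential samples of sizes n1 and n2 the bounds could be taken from the F distribution with (2n1, 2n2) degrees of freedom.

```python
from statistics import fmean


def sometimes_pool(current, past, lower, upper):
    """Sometimes-pool estimate of the exponential mean life (illustrative sketch).

    current, past: lists of observed lifetimes (complete samples assumed).
    lower, upper:  hypothetical acceptance bounds for the preliminary test,
                   applied to the ratio of the two sample means.
    """
    mean_current = fmean(current)
    mean_past = fmean(past)
    ratio = mean_current / mean_past
    if lower <= ratio <= upper:
        # H0 not rejected: pool both samples.
        return (sum(current) + sum(past)) / (len(current) + len(past))
    # H0 rejected: fall back to the usual estimator from the current sample.
    return mean_current
```

The bias and MSE studied in the abstract arise because the final estimator switches between these two branches depending on the data.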

2.
Microsatellite loci mutate at an extremely high rate and are generally thought to evolve through a stepwise mutation model. Several differentiation statistics taking into account the particular mutation scheme of the microsatellite have been proposed. The most commonly used is R(ST), which is independent of the mutation rate under a generalized stepwise mutation model. F(ST) and R(ST) are commonly reported in the literature, but often differ widely. Here we compare their statistical performances using individual-based simulations of a finite island model. The simulations were run under different levels of gene flow, mutation rates, population number and sizes. In addition to the per locus statistical properties, we compare two ways of combining R(ST) over loci. Our simulations show that even under a strict stepwise mutation model, no statistic is best overall. All estimators suffer to different extents from large bias and variance. While R(ST) better reflects population differentiation in populations characterized by very low gene-exchange, F(ST) gives better estimates in cases of high levels of gene flow. The number of loci sampled (12, 24, or 96) has only a minor effect on the relative performance of the estimators under study. For all estimators there is a striking effect of the number of samples, with the differentiation estimates showing very odd distributions for two samples.

3.
Y X Fu, R Chakraborty. Genetics, 1998, 150(1): 487-497
Minisatellites and microsatellites are short tandemly repeated sequences dispersed in eukaryotic genomes, many of which are highly polymorphic due to copy number variation of the repeats. Because mutation changes copy numbers of the repeat sequences in a generalized stepwise fashion, stepwise mutation models are widely used for studying the dynamics of these loci. We propose a minimum chi-square (MCS) method for simultaneous estimation of all the parameters in a stepwise mutation model and the ancestral allelic type of a sample. The MCS estimator requires knowing the mean number of alleles of a certain size in a sample, which can be estimated using Monte Carlo samples generated by a coalescent algorithm. The method is applied to samples of seven (CA)n repeat loci from eight human populations and one chimpanzee population. The estimated values of parameters suggest that there is a general tendency for microsatellite alleles to expand in size, because (1) each mutation has a slight tendency to cause size increase and (2) the mean size increase is larger than the mean size decrease for a mutation. Our estimates also suggest that most of these CA-repeat loci evolve according to multistep mutation models rather than single-step mutation models. We also introduce several quantities for measuring the quality of the estimation of ancestral allelic type, and it appears that the majority of the estimated ancestral allelic types are reasonably accurate. Implications of our analysis and potential extensions of the method are discussed.
Since the discovery that a large number of loci with tandemly repeated sequences in human and many eukaryote species are highly polymorphic because of copy number variation of the repeats in different individuals (Jeffreys 1985; Litt and Luty 1989; Weber and May 1989), allele size data from such loci are rapidly becoming the dominant source of genetic markers for genome mapping, forensic testing, and population studies.
Loci with repeat sequences longer than 5 bp are generally referred to as minisatellite or variable number tandem repeat loci, and those with repeat sequences between 2 and 5 bp are referred to as microsatellite or short tandem repeat loci (Tautz 1993). Because mutations change the copy number of such loci in a stepwise fashion, rapid accumulation of population samples from minisatellite and microsatellite loci has resurrected interest in the stepwise mutation model (SMM), which was popular in the 1970s.

4.
A Coalescent Estimator of the Population Recombination Rate (cited 42 times: 10 self-citations, 32 by others)
J. Hey, J. Wakeley. Genetics, 1997, 145(3): 833-846
Population genetic models often use a population recombination parameter 4Nc, where N is the effective population size and c is the recombination rate per generation. In many ways 4Nc is comparable to 4Nu, the population mutation rate. Both combine genome level and population level processes, and together they describe the rate of production of genetic variation in a population. However, 4Nc is more difficult to estimate. For a population sample of DNA sequences, historical recombination can only be detected if polymorphisms exist, and even then most recombination events are not detectable. This paper describes an estimator of 4Nc, hereafter designated γ (gamma), that was developed using a coalescent model for a sample of four DNA sequences with recombination. The reliability of γ was assessed using multiple coalescent simulations. In general γ has low to moderate bias, and its reliability is comparable to, though lower than, that of a widely used estimator of 4Nu. If there exists an independent estimate of the recombination rate (per generation, per base pair), γ can be used to estimate the effective population size or the neutral mutation rate.

5.
A modified estimator of heritability is proposed under a heteroscedastic one-way unbalanced random model. The distribution, moments and probability of permissible values (PPV) of the conventional and modified estimators are derived. The behaviour of the two estimators is investigated numerically in order to devise a suitable estimator of heritability under variance heterogeneity. The numerical results reveal that in the balanced case heteroscedasticity affects the bias, MSE and PPV of the conventional estimator only marginally. In unbalanced situations, the conventional estimator underestimates the parameter when the more variable group has more observations and overestimates it when the more variable group has fewer observations; its MSE decreases when the more variable group has more observations and increases when the more variable group has fewer observations; and its PPV decreases marginally. The MSE and PPV of the two estimators are comparable, while the bias of the modified estimator is smaller than that of the conventional estimator, particularly for small and medium values of the parameter. These results suggest using the modified estimator, with equal or more observations in the more variable group, in the presence of variance heterogeneity.

6.
We consider a method of approximating Weir and Cockerham's theta, an unbiased estimator of genetic population structure, using values readily available from published studies using biased estimators (Wright's F(ST) or Nei's G(ST)). The estimation algorithm is shown to be useful for both model populations and real-world avian populations. However, the correlation between Wright's F(ST) and Weir and Cockerham's theta is strong when compared among 39 empirical avian datasets. Thus, the advantage of approximating an unbiased estimator is unclear considering the small actual effect of theta's bias-removing power on empirical datasets.

7.
Mutation rate variation at human dinucleotide microsatellites (cited 1 time: 0 self-citations, 1 by others)
Xu H, Chakraborty R, Fu YX. Genetics, 2005, 170(1): 305-312
Mutation is the ultimate source of genetic variation, and mutation rate is thus an important parameter governing the extent of genetic variation. Microsatellites are highly informative genetic markers that have been widely used in genetic studies. While previous studies showed that the mutation rate differs in di-, tri-, and tetranucleotide repeats, how the mutation rate is distributed within each class of repeat is poorly understood. This study reveals, for the first time, the pattern of mutation rate variation within the dinucleotide repeats. Two data sets were used. The first is the allele frequency data from 115 microsatellites with dinucleotide repeats distributed along the human genome in 10 worldwide populations. The second data set is much larger, consisting of the allele frequencies of 5252 dinucleotide repeats from the Genome Database. The mutation rate for each locus is estimated through a new homozygosity-based estimator, which has been shown to be unbiased and highly efficient and is reasonably robust against deviations from the single-step model. The mutation rates among loci can be approximated well by a gamma distribution, and its shape parameter can be accurately estimated with this approach. This result provides basic guidelines for analyzing large-scale genomic data from microsatellite loci.
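The abstract does not state the form of its homozygosity-based estimator. To show the general idea only, the Python sketch below inverts the classical single-step stepwise-mutation result of Ohta and Kimura, in which the expected equilibrium homozygosity is F = 1/sqrt(1 + 2θ) with θ = 4Nμ. This is a naive moment estimator, not the bias-corrected estimator of the paper, and the function names are hypothetical.

```python
def homozygosity(freqs):
    """Homozygosity F as the sum of squared allele frequencies."""
    return sum(p * p for p in freqs)


def theta_from_homozygosity(f):
    """Naive moment estimate of theta = 4*N*mu under the single-step model.

    Inverts F = 1 / sqrt(1 + 2*theta), giving theta = (1 / F**2 - 1) / 2.
    """
    if not 0.0 < f <= 1.0:
        raise ValueError("homozygosity must lie in (0, 1]")
    return (1.0 / (f * f) - 1.0) / 2.0
```

Dividing such an estimate of θ by 4N for an assumed effective population size N would give a per-locus mutation rate; a locus with a single allele (F = 1) yields θ = 0.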

8.
The distribution and moments of the ANOVA estimator of heritability are given under an unbalanced random model. These expressions are used to investigate numerically the effect of unbalancedness on the bias and variance/MSE of the estimator, as well as the validity of certain approximations for its variance. The computed results reveal that unbalancedness increases both the bias and the variance/MSE of the estimator, and that the Smith approximation for the variance of the estimator provides better accuracy.

9.
Hardy OJ, Charbonnel N, Fréville H, Heuertz M. Genetics, 2003, 163(4): 1467-1482
The mutation process at microsatellite loci typically occurs at high rates and with stepwise changes in allele sizes, features that may introduce bias when using classical measures of population differentiation based on allele identity (e.g., F(ST), Nei's Ds genetic distance). Allele size-based measures of differentiation, assuming a stepwise mutation process [e.g., Slatkin's R(ST), Goldstein et al.'s (δμ)²], may better reflect differentiation at microsatellite loci, but they suffer high sampling variance. The relative efficiency of allele size- vs. allele identity-based statistics depends on the relative contributions of mutations vs. drift to population differentiation. We present a simple test based on a randomization procedure of allele sizes to determine whether stepwise-like mutations contributed to genetic differentiation. This test can be applied to any microsatellite data set designed to assess population differentiation and can be interpreted as testing whether F(ST) = R(ST). Computer simulations show that the test efficiently identifies which of the F(ST) or R(ST) estimates has the lowest mean square error. A significant test, implying that R(ST) performs better than F(ST), is obtained when the mutation rate, μ, for a stepwise mutation process is (a) ≥ m in an island model (m being the migration rate among populations) or (b) ≥ 1/t in the case of isolated populations (t being the number of generations since population divergence). The test also informs on the efficiency of other statistics used in phylogenetic reconstruction [e.g., Ds and (δμ)²], a nonsignificant test meaning that allele identity-based statistics perform better than allele size-based ones. This test can also provide insights into the evolutionary history of populations, revealing, for example, phylogeographic patterns, as illustrated by applying it to three published data sets.

10.
Biao Li, Marek Kimmel. Genetics, 2013, 195(2): 563-572
Microsatellite loci play an important role as markers for identification, disease gene mapping, and evolutionary studies. Mutation rate, which is of fundamental importance, can be obtained from interspecies comparisons, which, however, are subject to ascertainment bias. This bias arises, for example, when a locus is selected on the basis of its large allele size in one species (cognate species 1), in which it is first discovered. This bias is reflected in average allele length in any noncognate species 2 being smaller than that in species 1. This phenomenon was observed in various pairs of species, including comparisons of allele sizes in human and chimpanzee. Various mechanisms were proposed to explain observed differences in mean allele lengths between two species. Here, we examine the framework of a single-step asymmetric and unrestricted stepwise mutation model with genetic drift. Analysis is based on coalescent theory. Analytical results are confirmed by simulations using the simuPOP software. The mechanism of ascertainment bias in this model is a tighter correlation of allele sizes within a cognate species 1 than of allele sizes in two different species 1 and 2. We present computations of the expected average allele size difference, given the mutation rate, population sizes of species 1 and 2, time of separation of species 1 and 2, and the age of the allele. We show that when the past demographic histories of the cognate and noncognate taxa are different, the rate and directionality of mutations affect the allele sizes in the two taxa differently from the simple effect of ascertainment bias. This effect may exaggerate or reverse the effect of difference in mutation rates. 
We reanalyze literature data, which indicate that despite the bias, the microsatellite mutation rate estimate in the ancestral population is consistently greater than that in either human or chimpanzee and the mutation rate estimate in human exceeds or equals that in chimpanzee with the rate of allele length expansion in human being greater than that in chimpanzee. We also demonstrate that population bottlenecks and expansions in the recent human history have little impact on our conclusions.

11.
Estimating Genetic Variability with Restriction Endonucleases (cited 16 times: 10 self-citations, 6 by others)
Richard R. Hudson. Genetics, 1982, 100(4): 711-719
The estimation of the amount of sequence variation in samples of homologous DNA segments is considered. The data are assumed to have been obtained by restriction endonuclease digestion of the segments, from which the numbers and frequencies of the cleavage sites in the sample are determined. An estimator, p, of the proportion of sites that are polymorphic in the sample is derived without assuming any particular population genetic model for the evolution of the population. The estimator is very close to the Ewens, Spielman and Harris (1981) estimator, which was derived with the symmetric Wright-Fisher neutral model. Engels (1981) has also recently proposed an estimator of the same quantity, and he arrived at his estimator without assuming a particular population genetic model. The sampling variances of p and of Engels' estimator are derived. It is found that the sampling variance of p is lower than the sampling variance of Engels' estimator. Also, the sampling variance of θ̂, an estimate of θ (= 4Nu), is obtained for the symmetric Wright-Fisher neutral model with free recombination and with no recombination.

12.
S D Walter, R J Cook. Biometrics, 1991, 47(3): 795-811
The relative performance of the unconditional maximum likelihood estimators (UMLEs), conditional MLEs (CMLEs), and Jewell-type estimators of the odds ratio (OR) and its logarithm was investigated in sets of single 2 × 2 contingency tables. The tables were generated by complete enumeration of all possible cell frequencies consistent with a single fixed margin. The bias, mean squared error (MSE), and average absolute error (AAE) were computed for all estimators using the individual table probabilities as weights. The results showed that, for the OR, Jewell's estimator usually had smaller bias, MSE, and AAE than either of the MLEs. While the differences were often slight for MSE and AAE, for bias they were sometimes substantial. For the log(OR), the UMLE usually had the lowest bias, and its MSE and AAE were only slightly greater than those of the other estimators. Overall, we recommend estimation on the log scale using the UMLE. If the OR is to be estimated, Jewell's method has strong merit, although it is nonsymmetric with respect to the table orientation. In view of this, the UMLE may again be favoured in some situations.
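The contrast between the UMLE and a Jewell-type estimator can be sketched in a few lines of Python. The abstract gives no formulas, so the forms below are assumptions: the UMLE is taken as ad/(bc) for a table with cells a, b (first row) and c, d (second row), and the Jewell-type estimator as ad/((b+1)(c+1)), the form usually attributed to Jewell. The conditional MLE is omitted because it requires a numerical solution. All function names are hypothetical.

```python
import math


def umle_odds_ratio(a, b, c, d):
    """Unconditional MLE of the odds ratio: a*d / (b*c)."""
    return (a * d) / (b * c)


def jewell_odds_ratio(a, b, c, d):
    """Jewell-type estimator (assumed form): a*d / ((b + 1) * (c + 1))."""
    return (a * d) / ((b + 1) * (c + 1))


def log_umle_odds_ratio(a, b, c, d):
    """UMLE on the log scale, the scale recommended in the abstract."""
    return math.log(umle_odds_ratio(a, b, c, d))
```

Swapping the two rows of the table inverts the UMLE exactly, whereas the Jewell-type value for the swapped table is not the reciprocal of the original value, which is the asymmetry with respect to table orientation mentioned in the abstract.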

13.
A Phylogenetic Estimator of Effective Population Size or Mutation Rate (cited 17 times: 7 self-citations, 10 by others)
Y. X. Fu. Genetics, 1994, 136(2): 685-692
A new estimator of the essential parameter θ = 4N_eμ from DNA polymorphism data is developed under the neutral Wright-Fisher model without recombination and population subdivision, where N_e is the effective population size and μ is the mutation rate per locus per generation. The new estimator has a variance only slightly larger than the minimum variance of all possible unbiased estimators of the parameter and is substantially smaller than that of any existing estimator. The high efficiency of the new estimator is achieved by making full use of phylogenetic information in a sample of DNA sequences from a population. An example of estimating θ by the new method is presented using the mitochondrial sequences from an American Indian population.

14.
We report the population genetic structure of the endangered tropical tree species Caryocar brasiliense, based on variability at 10 microsatellite loci. Additionally, we compare heterozygosity and inbreeding estimates for continuous and fragmented populations and discuss the consequences for conservation. For a total of 314 individuals over 10 populations, the number of alleles per locus ranged from 20 to 27 and expected and observed heterozygosity varied from 0.129 to 0.924 and 0.067 to 1.000, respectively. Significant values of θ and R(ST) showed important genetic differentiation among populations. θ was much lower than R(ST), suggesting that identity by state and identity by descent have diverged in these populations. Although a significant amount of inbreeding was found under the identity by descent model (f = 0.11), an estimate of inbreeding for microsatellite markers based on a more adequate stepwise mutation model showed no evidence of nonrandom mating (R(IS) = 0.04). Differentiation (pairwise F(ST)) was positively correlated with geographical distance, as expected under the isolation by distance model. No effect of fragmentation on heterozygosity or inbreeding could be detected. This is most likely due to the fact that Cerrado fragmentation is a relatively recent event (approximately 60 years) compared to the species life cycle. Also, the populations surveyed from both fragmented and disturbed areas were composed mainly of adult individuals, already present prior to ecosystem fragmentation. Adequate hypothesis testing of the effect of habitat fragmentation will require the recurrent analysis of juveniles across generations in both fragmented and nonfragmented areas.

15.
Commonly used semiparametric estimators of causal effects specify parametric models for the propensity score (PS) and the conditional outcome. An example is an augmented inverse probability weighting (IPW) estimator, frequently referred to as a doubly robust estimator, because it is consistent if at least one of the two models is correctly specified. However, in many observational studies, the role of the parametric models is often not to provide a representation of the data-generating process but rather to facilitate the adjustment for confounding, making the assumption of at least one true model unlikely to hold. In this paper, we propose a crude analytical approach to study the large-sample bias of estimators when the models are assumed to be approximations of the data-generating process, namely, when all models are misspecified. We apply our approach to three prototypical estimators of the average causal effect, two IPW estimators, using a misspecified PS model, and an augmented IPW (AIPW) estimator, using misspecified models for the outcome regression (OR) and the PS. For the two IPW estimators, we show that normalization, in addition to having a smaller variance, also offers some protection against bias due to model misspecification. To analyze the question of when the use of two misspecified models is better than one we derive necessary and sufficient conditions for when the AIPW estimator has a smaller bias than a simple IPW estimator and when it has a smaller bias than an IPW estimator with normalized weights. If the misspecification of the outcome model is moderate, the comparisons of the biases of the IPW and AIPW estimators show that the AIPW estimator has a smaller bias than the IPW estimators. However, all biases include a scaling with the PS-model error and we suggest caution in modeling the PS whenever such a model is involved. 
For numerical and finite sample illustrations, we include three simulation studies and corresponding approximations of the large-sample biases. In a dataset from the National Health and Nutrition Examination Survey, we estimate the effect of smoking on blood lead levels.
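The three prototypical estimators compared in this abstract can be written down compactly. The Python sketch below uses the standard textbook forms of the IPW, normalized IPW and AIPW estimators of the average causal effect; the abstract itself does not spell them out, so these forms and all names are assumptions. Here `t` holds the 0/1 treatment indicators, `y` the outcomes, `e` the fitted propensity scores, and `m1`, `m0` the fitted outcome-regression predictions under treatment and control, all supplied by the caller.

```python
def ipw(t, y, e):
    """Simple IPW estimate of the average causal effect."""
    n = len(y)
    return sum(ti * yi / ei - (1 - ti) * yi / (1 - ei)
               for ti, yi, ei in zip(t, y, e)) / n


def normalized_ipw(t, y, e):
    """IPW with weights normalized to sum to one within each arm."""
    w1 = [ti / ei for ti, ei in zip(t, e)]
    w0 = [(1 - ti) / (1 - ei) for ti, ei in zip(t, e)]
    treated = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    control = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return treated - control


def aipw(t, y, e, m1, m0):
    """Augmented IPW: outcome-model contrast plus weighted residuals."""
    n = len(y)
    total = 0.0
    for ti, yi, ei, a, b in zip(t, y, e, m1, m0):
        total += a - b
        total += ti * (yi - a) / ei
        total -= (1 - ti) * (yi - b) / (1 - ei)
    return total / n
```

With misspecified `e`, `m1` or `m0`, each function returns a biased estimate; the abstract's comparison concerns how those large-sample biases relate to one another.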

16.
Variable numbers of tandem repeats (VNTR) typing is widely used for studying the bacterial cause of tuberculosis. Knowledge of the rate of mutation of VNTR loci facilitates the study of the evolution and epidemiology of Mycobacterium tuberculosis. Previous studies have applied population genetic models to estimate the mutation rate, leading to estimates varying widely from around to per locus per year. Resolving this issue using more detailed models and statistical methods would lead to improved inference in the molecular epidemiology of tuberculosis. Here, we use a model-based approach that incorporates two alternative forms of a stepwise mutation process for VNTR evolution within an epidemiological model of disease transmission. Using this model in a Bayesian framework we estimate the mutation rate of VNTR in M. tuberculosis from four published data sets of VNTR profiles from Albania, Iran, Morocco and Venezuela. In the first variant, the mutation rate increases linearly with respect to repeat numbers (linear model); in the second, the mutation rate is constant across repeat numbers (constant model). We find that under the constant model, the mean mutation rate per locus is (95% CI: ,) and under the linear model, the mean mutation rate per locus per repeat unit is (95% CI: ,). These new estimates represent a high rate of mutation at VNTR loci compared to previous estimates. To compare the two models we use posterior predictive checks to ascertain which of the two models is better able to reproduce the observed data. From this procedure we find that the linear model performs better than the constant model. The general framework we use allows the possibility of extending the analysis to more complex models in the future.

17.
In population genetics, under a neutral Wright-Fisher model, the scaling parameter θ = 4Nμ represents twice the average number of new mutants per generation. The effective population size is N and μ is the mutation rate per sequence per generation. Watterson proposed a consistent estimator of this parameter based on the number of segregating sites in a sample of nucleotide sequences. We study the distribution of the Watterson estimator. Enlarging the size of the sample, we asymptotically set a Central Limit Theorem for the Watterson estimator. This exhibits asymptotic normality with a slow rate of convergence. We then prove the asymptotic efficiency of this estimator. In the second part, we illustrate the slow rate of convergence found in the Central Limit Theorem. To this end, by studying the confidence intervals, we show that the asymptotic Gaussian distribution is not a good approximation for the Watterson estimator.
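For reference, the Watterson estimator discussed in this abstract is the number of segregating sites S divided by the harmonic number a_n = 1 + 1/2 + ... + 1/(n-1), where n is the sample size. The Python sketch below computes it from aligned sequences of equal length; the function names and the input format (a list of strings) are assumptions made for illustration.

```python
def segregating_sites(seqs):
    """Count alignment columns in which more than one nucleotide occurs."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs):
    """Watterson estimate of theta for the whole sequence: S / a_n."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    a_n = sum(1.0 / i for i in range(1, n))  # harmonic number of order n - 1
    return segregating_sites(seqs) / a_n
```

Because a_n grows only logarithmically in n, enlarging the sample adds information slowly, which fits the slow rate of convergence reported in the abstract.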

18.
Microsatellites can be misleading: an empirical and simulation study (cited 10 times: 0 self-citations, 10 by others)
It has long been recognized that highly polymorphic genetic markers can lead to underestimation of divergence between populations when migration is low. Microsatellite loci, which are characterized by extremely high mutation rates, are particularly likely to be affected. Here, we report genetic differentiation estimates in a contact zone between two chromosome races of the common shrew (Sorex araneus), based on 10 autosomal microsatellites, a newly developed Y-chromosome microsatellite, and mitochondrial DNA. These results are compared to previous data on proteins and karyotypes. Estimates of genetic differentiation based on F- and R-statistics are much lower for autosomal microsatellites than for all other genetic markers. We show by simulations that this discrepancy stems mainly from the high mutation rate of microsatellite markers for F-statistics and from deviations from a single-step mutation model for R-statistics. The sex-linked genetic markers show that all gene exchange between races is mediated by females. The absence of male-mediated gene flow most likely results from male hybrid sterility.

19.
Using the stepwise mutation model of Ohta and Kimura (1973), formulas are developed for the correlation of heterozygosity and the variance of genetic distance between two finite populations. Studied in detail is the case where the sizes of the two descendant populations are equal to that of the ancestral population and the mutation rate is the same for all loci. Numerical computations are carried out by using the present formulas and those of Li and Nei (1975, Genet. Res. 25) for the infinite-allele model. The results are as follows: The correlation of heterozygosity decreases with time faster for the stepwise mutation model than for the infinite-allele model. However, the relationships between the correlation of heterozygosity and the normalized genetic identity for the two models are very similar, if the average heterozygosities of the two populations are around 0.20 or less. On the other hand, the variance of genetic distance for the stepwise mutation model may become considerably smaller than that for the infinite-allele model, if the average heterozygosities of the two populations are larger than 0.05. The ratio of the standard deviation to the mean is, however, very large for the stepwise mutation model as well as the infinite-allele model.

20.
Although microsatellites are one of the most popular tools in genetic studies, their mutational dynamics and evolution remain unclear. Here, we apply extensive pedigree genotyping to identify and analyze the patterns and factors associated with de novo germline mutations across nine microsatellite loci in a wild population of lesser kestrels (Falco naumanni). A total of 10 germline mutation events were unambiguously identified in four loci, yielding an average mutation rate of 2.96 × 10⁻³. Across loci, mutation rate was positively correlated with locus variability and average allele size. Mutations were primarily compatible with a stepwise mutation model, although they did not exclusively involve single-step changes. Unexpectedly, we found an excess of maternally transmitted mutations (male-to-female ratio of 0.1). One of the analyzed loci (Fn2.14) proved hypermutable (mutation rate = 0.87%). This locus showed a size-dependent mutation bias, with longer alleles displaying deletions or additions of a smaller number of repeats than shorter alleles. Mutation probability at Fn2.14 was higher for females and increased with parental (maternal) age but was not associated with individual physical condition, multilocus heterozygosity, allele length or allele span. Overall, our results do not support the male-biased mutation rate described in other organisms and suggest that mutation dynamics at microsatellite loci are a complex process which requires further research.
