Similar literature
Found 20 similar articles (search time: 46 ms)
1.
We consider the estimation of the scaled mutation parameter θ, which is one of the parameters of key interest in population genetics. We provide a general result showing when estimators of θ can be improved using shrinkage when taking the mean squared error as the measure of performance. As a consequence, we show that Watterson’s estimator is inadmissible, and propose an alternative shrinkage-based estimator that is easy to calculate and has a smaller mean squared error than Watterson’s estimator for all possible parameter values 0 < θ < ∞. This estimator is admissible in the class of all linear estimators. We then derive improved versions for other estimators of θ, including the MLE. We also investigate how an improvement can be obtained both when combining information from several independent loci and when explicitly taking into account recombination. A simulation study provides information about the amount of improvement achieved by our alternative estimators.
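
For orientation, Watterson's estimator divides the number of segregating sites S by the harmonic sum a_n = 1 + 1/2 + ... + 1/(n-1), where n is the number of sampled sequences. The Python sketch below computes it and applies a generic linear shrinkage factor. The function names and the factor `c` are illustrative assumptions; `c` is a placeholder and is not the constant derived in the paper abstracted above.

```python
def harmonic_sum(n_sequences: int) -> float:
    """Return a_n = sum_{i=1}^{n-1} 1/i for a sample of n sequences."""
    if n_sequences < 2:
        raise ValueError("need at least two sequences")
    return sum(1.0 / i for i in range(1, n_sequences))


def watterson_theta(segregating_sites: int, n_sequences: int) -> float:
    """Watterson's estimator: theta_W = S / a_n."""
    return segregating_sites / harmonic_sum(n_sequences)


def shrunk_theta(segregating_sites: int, n_sequences: int, c: float) -> float:
    """Generic linear shrinkage c * theta_W with 0 < c <= 1.

    The value of c is supplied by the caller (hypothetical placeholder);
    this sketch does not reproduce the paper's optimal choice.
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    return c * watterson_theta(segregating_sites, n_sequences)


if __name__ == "__main__":
    # 10 segregating sites in a sample of 5 sequences.
    print(watterson_theta(10, 5))
    print(shrunk_theta(10, 5, c=0.9))
```

Shrinking toward zero trades a small bias for lower variance, which is why the mean squared error can fall even though the unshrunk estimator is unbiased under the neutral model.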

2.
We analyze a decoupled Moran model with haploid population size N, a biallelic locus under mutation and drift with scaled forward and backward mutation rates θ1=μ1N and θ0=μ0N, and directional selection with scaled strength γ=sN. With small scaled mutation rates θ0 and θ1, which is appropriate for single nucleotide polymorphism data in highly recombining regions, we derive a simple approximate equilibrium distribution for polymorphic alleles with a constant of proportionality. We also put forth an even simpler model, where all mutations originate from monomorphic states. Using this model we derive the sojourn times, conditional on the ancestral and fixed allele, and under equilibrium the distributions of fixed and polymorphic alleles and fixation rates. Furthermore, we also derive the distribution of small samples in the diffusion limit and provide convenient recurrence relations for calculating this distribution. This enables us to give formulas analogous to the Ewens-Watterson estimator of θ for biased mutation rates and selection. We apply this theory to a polymorphism dataset of fourfold degenerate sites in Drosophila melanogaster.

3.
John Graunt (1662) was the first to estimate the ratio y/x where y represents the total population and x the known total number of registered births in the same areas during the preceding year. About 1765 Messance (Stephan, 1948) and Moheau (1778) published very carefully prepared estimates for France based on enumeration of population in certain districts and on the count of births, deaths and marriages as reported for the whole country. The districts from which the ratio of inhabitants to births was determined constituted only a sample. Laplace (1786) prepared similar estimates in 1802 based on a two-stage sampling plan. Recently Hansen and Hurwitz (1943) showed that the ratio estimate (yi/ni)X of Y is unbiased where all xi's are known and the nth cluster is selected with p.p.s. More recently Hájek (1949), Lahiri (1951), Midzuno (1952) and Sen (1952) independently developed the sampling of n clusters with p.p.s. to the totals of the sizes of the sample clusters S(xi). Des Raj (1954) and Sen (1952, 1953) gave an unbiased estimate of the variance of the estimator which was generally non-negative for samples with smaller probabilities. Rao and Vijayan (1977) gave an unbiased estimator which is non-negative for samples with larger probabilities. Hájek (1949) provided an almost unbiased estimator of the variance of the estimator. The paper discusses situations where Hájek's estimator of variance should be preferred to the Rao-Vijayan estimator and vice versa.
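
The ratio estimation idea that runs through this abstract can be sketched in a few lines of Python. The sketch uses the classical form, in which the population total Y is estimated as (sum of sampled y / sum of sampled x) times the known total X. The function name and the sample figures are hypothetical, and the sketch does not implement p.p.s. selection or any of the variance estimators discussed above.

```python
from typing import Sequence


def ratio_estimate(y_sample: Sequence[float],
                   x_sample: Sequence[float],
                   x_total: float) -> float:
    """Classical ratio estimate of a population total.

    y_sample: study variable in the sampled clusters (e.g. inhabitants)
    x_sample: auxiliary variable in the same clusters (e.g. births)
    x_total:  known population total of the auxiliary variable
    """
    if len(y_sample) != len(x_sample):
        raise ValueError("y_sample and x_sample must have the same length")
    sum_x = sum(x_sample)
    if sum_x == 0:
        raise ValueError("sampled auxiliary total must be non-zero")
    return sum(y_sample) / sum_x * x_total


if __name__ == "__main__":
    # Two sampled districts (made-up numbers): inhabitants and births.
    inhabitants = [280.0, 560.0]
    births = [10.0, 20.0]
    print(ratio_estimate(inhabitants, births, x_total=1000.0))
```

This mirrors the historical use described in the abstract: the ratio of inhabitants to births in a few enumerated districts is scaled by the nationally registered number of births.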

4.
We study the ancestral recombination graph for a pair of sites in a geographically structured population. In particular, we consider the limiting behavior of the graph, under Wright's island model, as the number of subpopulations, or demes, goes to infinity. After an instantaneous sample-size adjustment, the graph becomes identical to the two-locus graph in an unstructured population, but with a time scale that depends on the migration rate and the deme size. Interestingly, when migration is gametic, this rescaling of time increases the population mutation rate but does not affect the population recombination rate. We compare this to the case of a partially-selfing population, in which both mutation and recombination depend on the selfing rate. Our result for gametic migration holds both for finite-sized demes, and in the limit as the deme size goes to infinity. However, when migration occurs during the diploid phase of the life cycle and demes are finite in size, the population recombination rate does depend on the migration rate, in a way that is reminiscent of partial selfing. Simulations imply that convergence to a rescaled panmictic ancestral recombination graph occurs for any number of sites as the number of demes approaches infinity. Send offprint requests to: Sabin Lessard. S. Lessard was supported by grants from the Natural Sciences and Research Council of Canada, the Fonds Québécois de la Recherche sur la Nature et les Technologies, and the Université de Montréal. J. Wakeley was supported by a Career Award (DEB-0133760) and by a grant (DEB-9815367) from the National Science Foundation.

5.
Multivariate analysis is a branch of statistics that successfully exploits the powerful tools of linear algebra to obtain a fairly comprehensive theory of estimation. The purpose of this paper is to explore to what extent a linear theory of estimation can be developed in the context of coalescent models used in the analysis of DNA polymorphism. We consider a large class of coalescent models, of which the neutral infinite sites model is one example. In the process, we discover several limitations of linear estimators that are quite distinct from those in the classical theory. In particular, we prove that there does not exist a uniformly BLUE (best linear unbiased estimator) for the scaled mutation parameter, under the assumptions of the neutral model of evolution. In fact, we show that no linear estimator performs uniformly better than the Watterson (1975) method based on the total number of segregating sites. For certain coalescent models, the segregating-sites estimator is actually optimal. The general conclusion is the following. If genealogical information is useful for estimating the rate of evolution, then there is no optimal linear method. If there is an optimal linear method, then no information other than the total number of segregating sites is needed. Received: 29 July 1998 / Revised version: 9 October 1998

6.
In many applications we obtain test statistics by combining estimates from different experiments or studies. The usual combined estimator of the overall effect in independent studies leads to systematic overestimates of the significance level; see Li, Shi, and Roth (1994). This results in a large number of unjustified significant findings. By examining the convexity of the composed functions involved and applying higher and inverse moments of the χ² distribution, we propose corrections for the estimated standard deviation of the overall effect estimator. Analytical results and simulations show that we improve the estimated significance level in such models.

7.
Spontaneous mutation frequencies were determined for two loci in the fungus Schizophyllum commune, at meiosis and at mitosis. For both loci the meiotic frequency is significantly higher than the mitotic frequency. No correlation was found between meiotic mutagenesis and recombination of markers bracketing the mutant site. The meiotic temperature affected the spontaneous mutation frequency but not the recombination frequency in the cross examined. A number of suppressor mutations were detected for both loci examined. Almost all the suppressors are closely linked to the site they suppress. The distribution of mutations among the suppressor sites was different at meiosis and at mitosis.

8.
Y. X. Fu. Genetics, 1994, 138(4): 1375-1386
Mutations resulting in segregating sites of a sample of DNA sequences can be classified by size and type and the frequencies of mutations of different sizes and types can be inferred from the sample. A framework for estimating the essential parameter θ = 4Nμ utilizing the frequencies of mutations of various sizes and types is developed in this paper, where N is the effective size of a population and μ is the mutation rate per sequence per generation. The framework is a combination of coalescent theory, general linear model and Monte-Carlo integration, which leads to two new estimators θ(ξ) and θ(η) as well as a general Watterson’s estimator θ(K) and a general Tajima’s estimator θ(π). The greatest strength of the framework is that it can be used under a variety of population models. The properties of the framework and the four estimators θ(K), θ(π), θ(ξ) and θ(η) are investigated under three important population models: the neutral Wright-Fisher model, the neutral model with recombination and the neutral Wright’s finite-islands model. Under all these models, it is shown that θ(ξ) is the best estimator among the four even when recombination rate or migration rate has to be estimated. Under the neutral Wright-Fisher model, it is shown that the new estimator θ(ξ) has a variance close to a lower bound of variances of all unbiased estimators of θ, which suggests that θ(ξ) is a very efficient estimator.
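
As a point of reference for the estimators named in this abstract, the following Python sketch computes the two classical summary statistics from a set of aligned sequences: the number of segregating sites (the input to Watterson's estimator) and the mean number of pairwise differences (Tajima's estimator). These are the textbook forms, not the generalized θ(K), θ(π), θ(ξ) or θ(η) of the paper; the function names and the toy alignment are assumptions made for illustration.

```python
from itertools import combinations
from typing import Sequence


def _check_alignment(sequences: Sequence[str]) -> None:
    """Require at least two sequences of equal length."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")


def segregating_sites(sequences: Sequence[str]) -> int:
    """Count alignment columns that contain more than one distinct base."""
    _check_alignment(sequences)
    return sum(1 for column in zip(*sequences) if len(set(column)) > 1)


def tajima_pi(sequences: Sequence[str]) -> float:
    """Mean number of differences over all unordered pairs of sequences."""
    _check_alignment(sequences)
    n = len(sequences)
    total = sum(
        sum(a != b for a, b in zip(s, t))
        for s, t in combinations(sequences, 2)
    )
    return total / (n * (n - 1) / 2)


if __name__ == "__main__":
    alignment = ["AAAA", "AAAT", "AATT"]  # toy data
    print(segregating_sites(alignment))
    print(tajima_pi(alignment))
```

Both statistics estimate θ under the neutral infinite-sites model, but they weight mutations of different frequencies differently, which is the kind of information the framework above exploits.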

9.
Quantifying diversity is of central importance for the study of structure, function and evolution of microbial communities. The estimation of microbial diversity has received renewed attention with the advent of large-scale metagenomic studies. Here, we consider what the diversity observed in a sample tells us about the diversity of the community being sampled. First, we argue that one cannot reliably estimate the absolute and relative number of microbial species present in a community without making unsupported assumptions about species abundance distributions. The reason for this is that sample data do not contain information about the number of rare species in the tail of species abundance distributions. We illustrate the difficulty in comparing species richness estimates by applying Chao’s estimator of species richness to a set of in silico communities: they are ranked incorrectly in the presence of large numbers of rare species. Next, we extend our analysis to a general family of diversity metrics (‘Hill diversities’), and construct lower and upper estimates of diversity values consistent with the sample data. The theory generalizes Chao’s estimator, which we retrieve as the lower estimate of species richness. We show that Shannon and Simpson diversity can be robustly estimated for the in silico communities. We analyze nine metagenomic data sets from a wide range of environments, and show that our findings are relevant for empirically-sampled communities. Hence, we recommend the use of Shannon and Simpson diversity rather than species richness in efforts to quantify and compare microbial diversity.
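
The quantities compared in this abstract can be illustrated with a minimal Python sketch, assuming a plain list of per-species counts as input. It computes the Hill diversity of order q (q = 0 gives observed richness, q = 1 the exponential of Shannon entropy, q = 2 the inverse Simpson index) and the classical Chao1 lower-bound richness estimate. The function names are illustrative, and the lower and upper diversity estimates constructed in the paper are not reproduced here.

```python
import math
from typing import Sequence


def hill_diversity(counts: Sequence[int], q: float) -> float:
    """Hill diversity of order q computed from observed species counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    p = [c / total for c in counts if c > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


def chao1(counts: Sequence[int]) -> float:
    """Classical Chao1 richness estimate from singletons and doubletons."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # species seen exactly once
    f2 = sum(1 for c in counts if c == 2)  # species seen exactly twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    # Bias-corrected form, used here when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / 2.0


if __name__ == "__main__":
    sample = [10, 4, 2, 2, 1, 1, 1, 1]  # toy abundance vector
    print(chao1(sample))
    for order in (0, 1, 2):
        print(order, hill_diversity(sample, order))
```

Because the Chao1 estimate depends only on the rarest observed species, it is sensitive to the unseen tail of the abundance distribution, whereas the q = 1 and q = 2 diversities are dominated by common species; this is consistent with the recommendation in the abstract.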

10.
Integrons are able to incorporate exogenous genes embedded in mobile cassettes, by a site-specific recombination mechanism. Gene cassettes are collected at the attI site, via an integrase mediated recombination between the cassette recombination site, attC, and the attI site. Interestingly, only three nucleotides are conserved between attC and attI. Here, we have determined the requirements of these in recombination, using the recombination machinery from the paradigmatic class 1 integron. We found that, strikingly, the only requirement is to have identical first nucleotide in the two partner sites, but not the nature of this nucleotide. Furthermore, we showed that the reaction is close to wild-type efficiency when one of the nucleotides in the second or third position is mutated in either the attC or the attI1 site, while identical mutations can have drastic effects when both sites are mutated, resulting in a dramatic decrease of recombination frequency compared to that of the wild-type sites. Finally, we tested the functional role of the amino acids predicted from structural data to interact with the cleavage site. We found that, if the recombination site triplets are tolerant to mutation, the amino acids interacting with them are extremely constrained.

11.
The occurrence and frequency of outcrossing in homothallic fungal species in nature is an unresolved question. Here we report detection of frequent outcrossing in the homothallic fungus Sclerotinia sclerotiorum. In using multilocus linkage disequilibrium (LD) to infer recombination among microsatellite alleles, high mutation rates confound the estimates of recombination. To distinguish high mutation rates from recombination to infer outcrossing, 8 population samples comprising 268 S. sclerotiorum isolates from widely distributed agricultural fields were genotyped for 12 microsatellite markers, resulting in multiple polymorphic markers on three chromosomes. Each isolate was homokaryotic for the 12 loci. Pairwise LD was estimated using three methods: Fisher’s exact test, index of association (IA) and Hedrick’s D′. For most of the populations, pairwise LD decayed with increasing physical distance between loci in two of the three chromosomes. Therefore, the observed recombination of alleles cannot be simply attributed to mutation alone. Different recombination rates in various DNA regions (recombination hot/cold spots) and different evolutionary histories of the populations could explain the observed differences in rates of LD decay among the chromosomes and among populations. The majority of the isolates exhibited mycelial incompatibility, minimizing the possibility of heterokaryon formation and mitotic recombination. Thus, the observed high intrachromosomal recombination is due to meiotic recombination, suggesting frequent outcrossing in these populations, supporting the view that homothallism favors universal compatibility of gametes instead of traditionally believed haploid selfing in S. sclerotiorum. Frequent outcrossing facilitates emergence and spread of new traits such as fungicide resistance, increasing difficulties in managing Sclerotinia diseases.

12.
Genetic and physical mapping in the early region of bacteriophage T7 DNA. Total citations: 14 (self-citations: 0; citations by others: 14)
A detailed physical map of the early region of bacteriophage T7 DNA has been constructed. This map contains: locations for all the cuts made by the restriction endonucleases HindII, HpaII, HaeIII and HaeII, and many of the cuts by HhaI; the approximate end points for each of 61 different deletions; initiation sites and the termination site for RNAs made by Escherichia coli RNA polymerase; an initiation site for RNA made by T7 RNA polymerase; the five primary RNase III cleavage sites of the early region; and the coding sequences for perhaps nine different early proteins. Virtually all of the non-overlapping coding capacity of the five early messenger RNAs is used, except for untranslated stretches of perhaps 30 or so nucleotides at the ends. It seems likely that each of the nine early proteins is made from its own ribosome-binding and initiation site. The mapped restriction cuts provide fixed reference points, and allow DNA fragments containing specific genetic signals to be identified and isolated.
The nucleotide sequences around the ends of three different T7 deletions have been determined. Each deletion eliminated a segment of DNA between repeated sequences of seven, eight or ten base-pairs, located 578 to 2100 base-pairs apart in the wild-type sequence. In each case, one copy of the repeated sequence was retained in the deletion mutant. This is consistent with the deletions having arisen by a genetic crossover between the repeated sequences. The approximate frequency of genetic recombination per base-pair has been estimated within two early genes; in both cases, the value was close to 0.01% recombination per base-pair, consistent with the value expected from the total length of the T7 genetic map.
Genetic recombination between non-overlapping deletions appears to be severely depressed when the distance between the deletions is closer than about 40 to 50 base-pairs, but recombination between a point mutation and a deletion does not appear to be similarly depressed. This suggests that efficient genetic recombination in T7 may require a base-paired “synapse” of some minimum size between the recombining DNA molecules.

13.
Mutations in leucine-rich repeat kinase 2 (LRRK2) are a common cause of inherited Parkinson’s disease (PD). The protein is large and complex, but pathogenic mutations cluster in a region containing GTPase and kinase domains. LRRK2 can autophosphorylate in vitro within a dimer pair, although the significance of this reaction is unclear. Here, we mapped the sites of autophosphorylation within LRRK2 and found several potential phosphorylation sites within the GTPase domain. Using mass spectrometry, we found that Thr1343 is phosphorylated and, using kinase dead versions of LRRK2, show that this is an autophosphorylation site. However, we also find evidence for additional sites in the GTPase domain and in other regions of the protein suggesting that there may be multiple autophosphorylation sites within LRRK2. These data suggest that the kinase and GTPase activities of LRRK2 may exhibit complex autoregulatory interdependence.

14.
The integration of phage λ occurs by a reciprocal genetic exchange, promoted by the product of phage int gene, at specific sites on the phage and bacterial genomes (att's). Lysogenic bacteria thus contain two att's which bracket the inserted prophage. Genetically, the phage, bacterial and prophage att's differ from each other, indicating that each site has specific elements which segregate during recombination. In hosts that lack the bacterial att, phage integration occurs at about 0.5% the normal frequency. It results from Int-promoted recombination between the phage att and any one of many secondary sites in the bacterial genome. To analyze these sites, we measured Int-promoted recombination at the secondary prophage att's. We found that they differed from the normal prophage att's and from the phage att. The secondary sites, therefore, do not appear to carry any of the specific elements of the phage or bacterial att's. The transducing phage isolated from secondary site lysogens integrate at two loci. In the absence of helper, they insert via homology with the bacterial DNA. Co-infection with helper results in their integration at the normal bacterial att.

15.
The analysis of equilibrium binding isotherms obtained by methods such as the nitrocellulose filter binding assay, which measure the fraction, θ, of DNA to which at least one protein molecule is bound as a function of the free protein concentration (LF), requires a different type of theoretical framework from that required for analysis of conventional equilibrium binding data, in which the number of moles of protein bound per mole of DNA, θc, is measured as a function of LF. The theoretical framework required to analyse equilibrium binding data generated by measuring θ(LF) is developed for co-operative and non-co-operative binding of a protein to a large number of non-specific sites and to specific site(s) in the presence of a large number of non-specific sites on a DNA molecule. The theory is simple to apply, equations for θ(LF) being easy to derive and evaluate, and is suitable for least-squares analysis. Two examples of the application of the theory to the analysis of experimental data are provided for the specific and non-specific binding of the EcoRI restriction endonuclease to bacteriophage λ DNA, and for the specific and non-specific binding of the enzyme dihydrofolate reductase from Lactobacillus casei to pBR322 and pWDLcB1 DNA, the latter differing from the former only in a 2.9 × 10³ base-pair insert containing the L. casei dihydrofolate reductase structural gene. The theoretical and experimental advantages and disadvantages of measuring θ(LF) rather than θc(LF) are discussed.
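
The difference between the two measured quantities can be made concrete with a deliberately simplified Python sketch. It assumes independent, non-overlapping, non-co-operative sites, so the probability that a DNA molecule carries no protein is the product of the probabilities that each site is empty. Under that assumption θ = 1 − 1/[(1 + K_s·L_F)(1 + K_n·L_F)^N], while θc is the sum of the individual site occupancies. The parameter names (`k_specific`, `k_nonspecific`, `n_nonspecific`) are hypothetical, and the sketch does not reproduce the paper's treatment of co-operativity or overlapping sites.

```python
def fraction_dna_bound(free_protein: float,
                       k_nonspecific: float,
                       n_nonspecific: int,
                       k_specific: float = 0.0) -> float:
    """theta: fraction of DNA molecules with at least one protein bound.

    Assumes one optional specific site (association constant k_specific)
    plus n_nonspecific independent, non-overlapping non-specific sites.
    """
    prob_empty = 1.0 / (
        (1.0 + k_specific * free_protein)
        * (1.0 + k_nonspecific * free_protein) ** n_nonspecific
    )
    return 1.0 - prob_empty


def mean_proteins_bound(free_protein: float,
                        k_nonspecific: float,
                        n_nonspecific: int,
                        k_specific: float = 0.0) -> float:
    """theta_c: mean number of proteins bound per DNA molecule."""
    specific = k_specific * free_protein / (1.0 + k_specific * free_protein)
    nonspecific = (n_nonspecific * k_nonspecific * free_protein
                   / (1.0 + k_nonspecific * free_protein))
    return specific + nonspecific


if __name__ == "__main__":
    # Arbitrary illustrative values: concentration in M, constants in 1/M.
    for conc in (1e-10, 1e-9, 1e-8):
        print(conc,
              fraction_dna_bound(conc, 1e4, 5000, k_specific=1e9),
              mean_proteins_bound(conc, 1e4, 5000, k_specific=1e9))
```

Even in this reduced model θ saturates at 1 while θc keeps growing with the number of sites, which illustrates why filter-binding data need their own analysis.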

16.
Multidisciplinary studies carried out in recent years at the cave sites of Madonna dell’Arma, Arma delle Manie and Santa Lucia superiore, in Liguria, Italy, have clarified the conditions under which Neanderthals frequented these sites, at intervals ranging, depending on the site, from isotopic stage 5 to the beginning of stage 3. The Mousterian industries recovered at these sites, associated with abundant faunal remains, are analyzed here by stratigraphic level for each site. At Madonna dell’Arma and Arma delle Manie, the industries remain fairly constant across levels in technology, typology and raw material management. At Santa Lucia superiore, by contrast, two different types of occupation left lithic remains with different facies in the lower and upper levels. At all three sites, the analysis of raw material management shows essentially local procurement, but also some much more distant sources, as in the case of jasper. This transport of distant stone reveals one aspect of the mobility, or at least of the extent of the territories, of these Neanderthal groups in Liguria and beyond. In addition, the knappers preferentially chose certain raw materials for débitage or for the blanks of small retouched tools. The flaking techniques are identified for each site and certain regularities are noted, such as the high frequency of Levallois flaking in the different levels (including the external levels) of Madonna dell’Arma, which is not the case at the other two sites.

17.
Statistical Properties of a DNA Sample under the Finite-Sites Model. Total citations: 1 (self-citations: 0; citations by others: 1)
Z. Yang. Genetics, 1996, 144(4): 1941-1950
Statistical properties of a DNA sample from a random-mating population of constant size are studied under the finite-sites model. It is assumed that there is no migration and no recombination occurs within the locus. A Markov process model is used for nucleotide substitution, allowing for multiple substitutions at a single site. The evolutionary rates among sites are treated as either constant or variable. The general likelihood calculation using numerical integration involves intensive computation and is feasible for three or four sequences only; it may be used for validating approximate algorithms. Methods are developed to approximate the probability distribution of the number of segregating sites in a random sample of n sequences, with either constant or variable substitution rates across sites. Calculations using parameter estimates obtained for human D-loop mitochondrial DNAs show that among-site rate variation has a major effect on the distribution of the number of segregating sites; the distribution under the finite-sites model with variable rates among sites is quite different from that under the infinite-sites model.

18.
The adverse effect of co-inheritance linkage of a large number of sites on adaptation has been studied extensively for asexual populations. However, it is insufficiently understood for multi-site populations in the presence of recombination. In the present work, motivated by our studies of HIV evolution in infected patients, we consider a model of haploid populations with infrequent recombination. We assume that small quantities of beneficial alleles preexist at a large number of sites and neglect new mutation. Using a generalized form of the traveling wave method, we show that the effectiveness of recombination is impeded and the adaptation rate is decreased by inter-sequence correlations, arising due to the fact that some pairs of homologous sites have common ancestors existing after the onset of adaptation. As the recombination rate per individual becomes smaller, site pairs with common ancestors become more frequent, making recombination even less effective. In addition, an increasing number of sites become identical by descent across large samples of sequences, causing reversion of the direction of evolution and the loss of beneficial alleles at these sites. As a result, within a 10-fold range of the recombination rate, the average adaptation rate falls from 90% of the infinite-recombination value down to 10%. The entire transition from almost maximum to almost zero may occur at very small recombination rates. Interestingly, the strong effect of linkage on the adaptation rate is predicted in the absence of average linkage disequilibrium (Lewontin’s measure).

19.
In this paper some properties of a convenient estimator, derived from a martingale estimating function, for the basic reproduction number of the general epidemic model are given for both finite and large samples. These properties give some guidelines for using this convenient estimator. It is shown that it underestimates the parameter and that the bias tends to zero when the population size and the initial number of infectives are increased simultaneously. The bias cannot be removed for a fixed number of introductory infectives. However, the estimator is asymptotically unbiased, conditional on a major outbreak. A simulation study shows that the central limit theorem applies for moderate population sizes.

20.
The sable (Martes zibellina) is a medium-sized mustelid inhabiting forest environments in Siberia, northern China, the Korean Peninsula, and Hokkaido Island, Japan. To further understand the molecular evolution of the major histocompatibility complex (MHC), we sequenced part of exon 2 in MHC class II DRB genes, including codons encoding the antigen binding site, from 33 individuals from continental Eurasia and Japan. We identified 16 MHC class II DRB alleles (Mazi-DRBs), some of which were geographically restricted and others broadly distributed, and eight putative pseudogenes. A single-breakpoint recombination analysis detected a recombination site in the middle of exon 2. A mixed effects model of evolution analysis identified five amino acid sites presumably under positive selection. These sites were all located in the region 3′ to the recombination site, suggesting that positive selection and recombination could be committed to the diversity of the M. zibellina DRB gene. In a Bayesian phylogenetic tree, all Mazi-DRBs and the presumed pseudogenes grouped within a Mustelidae clade. The Mazi-DRBs showed trans-species polymorphism, with some alleles most closely related to alleles from other mustelid species. This result suggests that the sable DRBs have evolved under long-lasting balancing selection.
