Similar articles
20 similar articles found.
1.
The estimation of population allele frequencies using sample data forms a central component of studies in population genetics. These estimates can be used to test hypotheses on the evolutionary processes governing changes in genetic variation among populations. However, existing studies frequently do not account for sampling uncertainty in these estimates, thus compromising their utility. Incorporation of this uncertainty has been hindered by the lack of a method for constructing confidence intervals containing the population allele frequencies, for the general case of sampling from a finite diploid population of any size. In this study, we address this important knowledge gap by presenting a rigorous mathematical method to construct such confidence intervals. For a range of scenarios, the method is used to demonstrate that for a particular allele, in order to obtain accurate estimates within 0.05 of the population allele frequency with high probability (%), a sample size of is often required. This analysis is augmented by an application of the method to empirical sample allele frequency data for two populations of the checkerspot butterfly (Melitaea cinxia L.), occupying meadows in Finland. For each population, the method is used to derive % confidence intervals for the population frequencies of three alleles. These intervals are then used to construct two joint % confidence regions, one for the set of three frequencies for each population. These regions are then used to derive a % confidence interval for Jost's D, a measure of genetic differentiation between the two populations. Overall, the results demonstrate the practical utility of the method with respect to informing sampling design and accounting for sampling uncertainty in studies of population genetics, important for scientific hypothesis-testing and also for risk-based natural resource management.
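For readers unfamiliar with Jost's D, the following is a minimal sketch of the point estimate computed from population allele frequencies. It assumes equal population weights and treats the frequencies as known; it does not reproduce the abstract's confidence-interval method, and the function name and inputs are illustrative only.

```python
def jost_d(freqs_by_pop):
    """Jost's D from allele frequencies (assumption: equal population weights).

    freqs_by_pop: one list of allele frequencies per population, all listing
    the same alleles in the same order.
    """
    n = len(freqs_by_pop)
    k = len(freqs_by_pop[0])
    # Mean within-population expected heterozygosity.
    h_s = sum(1.0 - sum(p * p for p in pop) for pop in freqs_by_pop) / n
    # Total heterozygosity computed from the mean allele frequencies.
    mean_freqs = [sum(pop[i] for pop in freqs_by_pop) / n for i in range(k)]
    h_t = 1.0 - sum(p * p for p in mean_freqs)
    return (h_t - h_s) / (1.0 - h_s) * n / (n - 1)
```

Two populations with identical frequencies give D = 0, and two populations sharing no alleles give D = 1.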

2.
The construction of time-specified reference limits requires systematic sampling in clinical health, particularly for those variables characterized by a circadian rhythm of large amplitude, as is the case for blood pressure (BP). For the detection of false negatives, tolerance intervals (limits that will include at least a specified proportion of the population with a stated confidence) are important and should substitute when possible for prediction limits. We have previously described a nonparametric method for the computation of model-independent tolerance intervals that are constructed by first dividing the sampling range into several time spans in which no appreciable changes in population characteristics (namely, mean and variance) take place. The tolerance interval is then computed for each of the time spans. The limits thus computed, as well as results of any comparison of a given individual's profile against such tolerance intervals, are highly dependent on the sampling scheme of both the reference individuals and the test subject. To avoid this problem, we have developed an alternative method that allows the computation of model-dependent tolerance bands for hybrid time series. Assuming that a set X of longitudinal series monitored from a given group of reference individuals can be fitted with the same individual model, a population model C(X,t) can also be determined, as well as the deviation S(X,t) of each individual curve from the population model. The tolerance band will then have the form C(X,t) ± kS(X,t), where k is here estimated following a nonparametric approach based on bootstrap techniques. Alternatively, two different values of k can be estimated (for the lower and upper limits of the tolerance interval, respectively) in cases for which we cannot assume symmetry.
The method is generally applicable for any population model describing the reference population (including the fit of multiple significant components, nonsinusoidal waveforms, and/or trends). The method was used to establish time-specified tolerance bands for time series of blood pressure monitored automatically in healthy individuals of both genders. Model-dependent intervals are preferred to the model-independent limits when reliance on a specified sampling rate needs to be avoided. These limits may serve for an objective and positive definition of health, for the screening and diagnosis of disease, and for gauging the subject's response to treatment. (Chronobiology International, 17(4), 567–582, 2000)
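The band C(X,t) ± kS(X,t) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: C is taken as the pointwise mean of the fitted individual curves, S as their pointwise sample standard deviation, and the multipliers k_lower and k_upper are supplied by the caller rather than estimated by bootstrap.

```python
from statistics import mean, stdev

def tolerance_band(curves, k_lower, k_upper):
    """Pointwise band (C - k_lower*S, C + k_upper*S) on a shared time grid.

    curves: fitted curves, one list of values per reference individual
    (hypothetical input; every curve is evaluated at the same time points).
    """
    band = []
    for values in zip(*curves):      # all individuals at one time point
        c = mean(values)             # assumed population model C(X, t)
        s = stdev(values)            # assumed deviation S(X, t)
        band.append((c - k_lower * s, c + k_upper * s))
    return band

def outside_band(profile, band):
    """Indices of time points where a test profile leaves the band."""
    return [i for i, (y, (lo, hi)) in enumerate(zip(profile, band))
            if y < lo or y > hi]
```

Passing different values for k_lower and k_upper corresponds to the asymmetric case mentioned in the abstract.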

3.
Summary Many major genes have been identified that strongly influence the risk of cancer. However, there are typically many different mutations that can occur in the gene, each of which may or may not confer increased risk. It is critical to identify which specific mutations are harmful, and which ones are harmless, so that individuals who learn from genetic testing that they have a mutation can be appropriately counseled. This is a challenging task, since new mutations are continually being identified, and there is typically relatively little evidence available about each individual mutation. In an earlier article, we employed hierarchical modeling (Capanu et al., 2008, Statistics in Medicine 27, 1973–1992) using the pseudo‐likelihood and Gibbs sampling methods to estimate the relative risks of individual rare variants using data from a case–control study and showed that one can draw strength from the aggregating power of hierarchical models to distinguish the variants that contribute to cancer risk. However, further research is needed to validate the application of asymptotic methods to such sparse data. In this article, we use simulations to study in detail the properties of the pseudo‐likelihood method for this purpose. We also explore two alternative approaches: pseudo‐likelihood with correction for the variance component estimate as proposed by Lin and Breslow (1996, Journal of the American Statistical Association 91, 1007–1016) and a hybrid pseudo‐likelihood approach with Bayesian estimation of the variance component. We investigate the validity of these hierarchical modeling techniques by looking at the bias and coverage properties of the estimators as well as at the efficiency of the hierarchical modeling estimates relative to that of the maximum likelihood estimates.
The results indicate that the estimates of the relative risks of very sparse variants have small bias, and that the estimated 95% confidence intervals are typically anti‐conservative, though the actual coverage rates are generally above 90%. The widths of the confidence intervals narrow as the residual variance in the second‐stage model is reduced. The results also show that the hierarchical modeling estimates have shorter confidence intervals relative to estimates obtained from conventional logistic regression, and that these relative improvements increase as the variants become more rare.

4.
The mating system and allozyme variation at 20 loci in three Klamath Mountains and two Sierra Nevada populations of Jeffrey pine (Pinus jeffreyi Grev. & Balf.) were investigated. On average, multilocus estimates of the proportion of viable progeny due to outcrossing (tm) were high in all populations (mean tm = 0.935, range 0.881 to 0.971). Despite differences in stand structure, tm did not differ (P > 0.05) between the Klamath (mean tm = 0.933) and Sierra Nevada (mean tm = 0.937) populations. At all but one locus in one population and at two in another, genotype frequencies fit (P > 0.05) Hardy-Weinberg expectations. Mean estimates of observed heterozygosity in Klamath (0.182) and Sierra Nevada (0.327) populations were comparable to values reported for other conifers.

5.
Summary Maximum likelihood estimates of gene frequencies and their standard errors are presented for 21 blood group and serum protein polymorphisms. The observed frequencies for certain high and low frequency antigens also are reported. The data come from a sample of 399 same-sex twin pairs and two sets of triplets from the Greater Philadelphia urban region encompassing roughly five counties in southeastern Pennsylvania and three counties of southern New Jersey. Analyses are carried out separately for the four subgroups created by subdividing the sample by race and co-twin. Total sample estimates are also calculated within the two socially defined racial groups. The gene frequency estimates generally appear to be consistent with previously reported data for U.S. urban populations. The frequency of the Fy allele in the Duffy system, however, seems to be the highest value thus far published for a white population. The white sample Fy allele very well may be a heterogeneous class in which only a very small fraction is comparable to the Fy allele common in the black sample.

6.
Løkkeborg, Svein; Fernö, Anders; Jørgensen, Terje. Hydrobiologia, 2002, 483(1-3): 259-264
Ultrasonic telemetry using stationary positioning systems allows several fish to be tracked simultaneously, but systems that are incapable of sampling multiple frequencies simultaneously can record data from only one transmitter (individual) at a time. Tracking several individuals simultaneously thus results in longer intervals between successive position fixes for each fish. This deficiency leads to loss of detail in the tracking data collected, and may be expected to cause loss of accuracy in estimates of the swimming speeds and movement patterns of the fish tracked. Even systems that track fish on multiple frequencies are not capable of continuous tracking due to technical issues. We determined the swimming speed, area occupied, activity rhythm and movement pattern of cod (Gadus morhua) using a stationary single-channel positioning system, and analysed how estimates of these behavioural parameters were affected by the interval between successive position fixes. Single fish were tracked at a time, and position fixes were eliminated at regular intervals in the original data to generate new data sets, as if they had been collected in the course of tracking several fish (2–16). In comparison with the complete set, these data sets gave 30–70% decreases in estimates of swimming speed depending on the number of fish supposedly being tracked. These results were similar for two individuals of different size and activity level, indicating that they can be employed as correction factors to partly compensate for underestimates of swimming speed when several fish are tracked simultaneously. Tracking 'several' fish only slightly affected the estimates of area occupied (1–15%). The diurnal activity rhythm was also similar between the data sets, whereas details in search pattern were not seen when several fish were tracked simultaneously.
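The thinning experiment described above can be sketched in a few lines. This is a simplified illustration, assuming that tracking n fish on one channel is equivalent to keeping every n-th position fix and that speed is estimated as straight-line path length divided by elapsed time; the function names and the (time, x, y) data layout are hypothetical.

```python
import math

def mean_speed(fixes):
    """Mean speed from successive (time, x, y) position fixes."""
    distance = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (_, x1, y1), (_, x2, y2) in zip(fixes, fixes[1:])
    )
    return distance / (fixes[-1][0] - fixes[0][0])

def thin_fixes(fixes, n_fish):
    """Keep every n_fish-th fix, as if n_fish transmitters shared one channel."""
    return fixes[::n_fish]

# A zigzag track: straight lines between the thinned fixes cut the corners,
# so the estimated speed drops.
track = [(0, 0, 0), (1, 1, 1), (2, 2, 0), (3, 3, 1), (4, 4, 0)]
full_speed = mean_speed(track)
thinned_speed = mean_speed(thin_fixes(track, 2))
```

The ratio of the thinned to the full estimate is the kind of quantity the abstract proposes as a correction factor.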

7.
Inbreeding and relationship metrics among and within populations are useful measures for genetic management of wild populations, but accuracy and precision of estimates can be influenced by the number of individual genotypes analysed. Biologists are confronted with varied advice regarding the sample size necessary for reliable estimates when using genomic tools. We developed a simulation framework to identify the optimal sample size for three widely used metrics to enable quantification of expected variance and relative bias of estimates and a comparison of results among populations. We applied this approach to analyse empirical genomic data for 30 individuals from each of four different free‐ranging Rocky Mountain bighorn sheep (Ovis canadensis canadensis) populations in Montana and Wyoming, USA, through cross‐species application of an Ovine array and analysis of approximately 14,000 single nucleotide polymorphisms (SNPs) after filtering. We examined intra‐ and interpopulation relationships using kinship and identity by state metrics, as well as FST between populations. By evaluating our simulation results, we concluded that a sample size of 25 was adequate for assessing these metrics using the Ovine array to genotype Rocky Mountain bighorn sheep herds. However, we conclude that a universal sample size rule may not be able to sufficiently address the complexities that impact genomic kinship and inbreeding estimates. Thus, we recommend that a pilot study and sample size simulation using R code we developed that includes empirical genotypes from a subset of populations of interest would be an effective approach to ensure rigour in estimating genomic kinship and population differentiation.

8.
Estimating asymptotic size using the largest individuals per sample
Summary Estimates of asymptotic size are especially useful for comparative studies of taxonomic groups in which animals mature at small sizes relative to their final asymptotic sizes. The largest individuals per sample can provide reasonable estimates of asymptotic size if three conditions are met: 1) at least some adults in a population are near their final asymptotic size, 2) samples of a reasonable size are likely to contain a largest individual that is near the average asymptotic size for the members of its sex, and 3) the coefficient of variation in asymptotic size is small for the members of each sex. In the current study, we show that all three of these conditions are met for one species of Anolis lizards (A. limifrons). For a series of samples from the genus Anolis, the largest individual per sample produces estimates of asymptotic size that are virtually identical to those produced by fitting field data on growth rates to nonlinear growth equations. These results suggest that the largest individual method can provide reasonable estimates of asymptotic size for the members of this genus, and imply that this method may also be useful for estimating asymptotic sizes in other taxa that satisfy the criteria listed above.

9.
10.
Cortisol (CT) concentrations (in µg/dl) were determined by radioimmunoassay in plasma obtained at about 3-hr intervals during a 24-hr sampling span from 42 boys and 13 girls of short stature (2–4 standard deviations below their peer group mean), and from a reference group of 11 boys and 10 girls with standard stature, before any treatment was administered to the former. Subjects were 11.20 ± 0.37 years of age at the time of study, and were living on a diurnal waking (~07:30 to ~22:30), nocturnal resting routine during sampling, consuming the usual hospital diet. Circadian rhythm parameters were computed separately for each group by the single and population-mean cosinor fits of a 24-hr cosine curve. A comparison of circadian parameters indicates a statistically significant difference in acrophase (P = 0.033) between short and standard children, as well as added differences in rhythm-adjusted mean (M; P = 0.011) and (P = 0.035) between boys and girls of short stature. These differences, as well as any other added information from relevant marker rhythms, should be taken into account for the time-specification of therapy before treatment starts in children of short stature.
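A single-cosinor fit of a 24-hr cosine curve, y(t) = M + A cos(ωt + φ) with ω = 2π/24, can be sketched as below. The closed-form estimates used here assume equally spaced samples covering whole periods (for example, every 3 hr over 24 hr); irregularly spaced data would need a full least-squares fit. The function name and argument layout are illustrative, not taken from the study.

```python
import math

def cosinor_equispaced(times_h, values, period_h=24.0):
    """Return (mesor M, amplitude A, acrophase phi in radians).

    Assumption: samples are equally spaced and span an integer number of
    periods, so the cosine and sine regressors are orthogonal.
    """
    n = len(values)
    w = 2.0 * math.pi / period_h
    mesor = sum(values) / n
    # Linearised model: y = M + beta*cos(w t) + gamma*sin(w t),
    # with beta = A*cos(phi) and gamma = -A*sin(phi).
    beta = 2.0 / n * sum(y * math.cos(w * t) for t, y in zip(times_h, values))
    gamma = 2.0 / n * sum(y * math.sin(w * t) for t, y in zip(times_h, values))
    amplitude = math.hypot(beta, gamma)
    acrophase = math.atan2(-gamma, beta)
    return mesor, amplitude, acrophase
```

Group comparisons of M, A and φ, as reported in the abstract, would then be made on the per-subject or population-mean estimates.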

11.
12.
In bioassay, where different levels of the stimulus may represent different doses of a drug, the binary response is the death or survival of an individual receiving a specified dose. In such applications, it is common to model the probability of a positive response P at the stimulus level x by P = F(x′β), where F is a cumulative distribution function and β is a vector of unknown parameters which characterize the response function. The two most popular models used for modelling binary response bioassay are the probit model [BLISS (1935), FINNEY (1978)] and the logistic model [BERKSON (1944), BROWN (1982)]. However, these models have some limitations. The use of the probit model involves the inverse of the standard normal distribution function, making it rather intractable. The logistic model has a simple form and a closed expression for the inverse distribution function; however, neither the logistic nor the probit can provide a good fit to response functions which are not symmetric or are symmetric but have a steeper or gentler incline in the central probability region. In this paper we introduce a more realistic model for the analysis of quantal response bioassay. The proposed model, which we refer to as the generalized logistic model, is a family of response curves indexed by shape parameters m1 and m2. This family is rich enough to include the probit and logistic models as well as many others as special cases or limiting distributions. In particular, we consider the generalized logistic three parameter model where we assume that m1 = m, m is a positive real number, and m2 = 1. We apply this model to various sets of data, comparing the fit results to those obtained previously by other dose-response curves such as the logistic and probit, and showing that the fit can be improved by using the generalized logistic.
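The relation P = F(x′β) can be illustrated with the two standard choices of F. The third function below is an assumption: it uses one common asymmetric family, the logistic CDF raised to a power m, which reduces to the ordinary logistic at m = 1. The abstract does not give the exact parameterization of the generalized logistic model, so this should be read as a sketch of the idea rather than the paper's formula.

```python
import math

def logistic_cdf(z):
    """Logistic response curve."""
    return 1.0 / (1.0 + math.exp(-z))

def probit_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def skewed_logistic_cdf(z, m):
    """Assumed asymmetric family: logistic CDF to the power m (m > 0)."""
    return logistic_cdf(z) ** m

def response_probability(x, beta, cdf):
    """P = F(x'beta) for covariate vector x and coefficient vector beta."""
    return cdf(sum(xi * bi for xi, bi in zip(x, beta)))

# Example with an intercept and a log-dose term (values are made up).
p = response_probability([1.0, 2.0], [-2.0, 1.0], logistic_cdf)
```

For m different from 1 the curve is no longer symmetric about P = 0.5, which is the kind of flexibility the abstract argues the logistic and probit lack.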

13.
A differential elimination method (DEM) is developed to determine the kinetic coefficients for substrate self-inhibition. Finite differentiation of the equation eliminates either KI or KS, which enables the equation to be linearized so that q̂, KS, and KI can be estimated without using nonlinear least-squares regression (NLSR). The DEM options that eliminate KI or KS computed the parameter values exactly when the data did not contain any errors. If one-point or random errors were not too large, both DEM options worked as well as NLSR when data were acquired with geometric intervals for substrate concentration. The DEM was more accurate for fitting the data for the smallest and largest values of S, but relatively weaker in estimating the observed maximum substrate utilization rate, qmax. The estimates for Smax, the concentration at which the maximum specific substrate utilization rate is observed, were relatively invariant among the methods, even when KS and KI differed. When the intervals were arithmetic (i.e., equal intervals of substrate concentration) and the data contained errors, the DEM and NLSR estimated the parameters poorly, indicating that collecting data with an arithmetic interval greatly increases the risk of poor parameter estimation. Parameter estimates by DEM fit experimental data from nitrification or photosynthesis very well, which were taken with geometric intervals of substrate concentration or light intensity, but fit phenol-degradation data poorly, which were obtained with arithmetic substrate intervals. Besides providing a reasonable substitute for NLSR, the DEM also can be used as a tool to diagnose the quality of experimental data by comparing its estimates between the DEM options, or, more rigorously, to those from NLSR.
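The abstract does not state the rate equation. A minimal sketch, assuming the widely used Andrews (Haldane) form q = q̂·S / (KS + S + S²/KI), shows how Smax and qmax follow from the three coefficients and what a geometric sampling design looks like. It does not implement the DEM itself, and all names are illustrative.

```python
import math

def self_inhibition_rate(s, q_hat, k_s, k_i):
    """Assumed Andrews/Haldane rate: q = q_hat*S / (K_S + S + S^2/K_I)."""
    return q_hat * s / (k_s + s + s * s / k_i)

def peak_of_rate_curve(q_hat, k_s, k_i):
    """Concentration and rate at the maximum of the assumed curve.

    Setting dq/dS = 0 gives S_max = sqrt(K_S*K_I) and
    q_max = q_hat / (1 + 2*sqrt(K_S/K_I)).
    """
    s_max = math.sqrt(k_s * k_i)
    q_max = q_hat / (1.0 + 2.0 * math.sqrt(k_s / k_i))
    return s_max, q_max

def geometric_concentrations(s_first, ratio, count):
    """Substrate levels spaced by a constant ratio (geometric intervals)."""
    return [s_first * ratio ** i for i in range(count)]
```

Because Smax depends only on the product KS·KI, different (KS, KI) pairs can give the same Smax, which is consistent with the abstract's remark that Smax estimates were relatively invariant among methods.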

14.
The amount of between‐individual variation in the unobservable developmental instability (DI) has been the subject of intense recent debates. The unexpectedly high estimates of between‐individual variation in DI based on distributional characteristics of observable asymmetry values (of on average bilaterally symmetric traits) rely on statistical models that assume an underlying normal distribution of developmental errors. This prompted doubts on the assumption of the Gaussian nature of developmental errors. However, when applying other candidate distributions [log‐normal and gamma (γ)], recent analyses of empirical datasets have indicated that estimates remain generally high. Yet, all estimates were based on bilaterally symmetric traits, which did not allow for a formal comparison of the alternative distributions. In the present study, we extend a recent statistical model to allow statistical comparison of the different distributions based on traits that developed repeatedly under the same conditions, such as flower traits and regrown feathers. We analyse simulated and empirical data and show that: (1) it is statistically difficult to differentiate among the three alternatives when variances are small relative to the mean, as is often the case with DI; (2) the normal distribution fits the log‐normal or γ relatively well under those circumstances; (3) the deviance information criterion (DIC) is able to pick up differences in model fit among the three alternative distributions, yet more strongly so when levels of DI were high; (4) empirical datasets show a better fit of the normal over the log‐normal and γ‐distributions as judged by the DIC; and (5) estimates of between‐individual variation in DI in the three empirical datasets were relatively high (> 50%) under each distributional assumption. 
In conclusion, and based on our three datasets, the normal approximation appears to be a reasonable choice for statistical models of DI and the remarkably high estimates of variation in DI cannot be attributed to non‐normal developmental noise. Nevertheless, our method should be applied to a broad range of traits and organisms to evaluate the generality of this result. We argue that there is an urgent need for studies that reveal the underlying mechanisms of developmental noise and stability, as well as the role of developmental selection, in order to be able to determine the biological importance of the highly skewed distributions of developmental instability often observed. © 2007 The Linnean Society of London, Biological Journal of the Linnean Society, 2007, 92 , 197–210.

15.
This paper presents an analysis of variance (ANOVA) approach by which estimation of F-statistics can be made from data with an arbitrary s-level hierarchical population structure. Assuming a complete random-effect model, a general ANOVA procedure is developed to estimate F-statistics as ratios of different variance components for all levels of population subdivision in the hierarchy. A generalized relationship among F-statistics is also derived to extend the well-known relationship originally found by Sewall Wright. Although not entirely free from the bias particular to small number of subdivisions at each hierarchy and extreme gene frequencies, the ANOVA estimators of F-statistics consider sampling effects at each level of hierarchy, thus removing the bias incurred in the other estimators that are commonly based on direct substitution of unknown gene frequencies by their sample estimates. Therefore, the ANOVA estimation procedure presented here may become increasingly useful in analyzing complex population structure because of increasing use of the estimated hierarchical F-statistics to infer genetic and demographic structures of natural populations within and among species.
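For the simplest hierarchy (populations, individuals within populations, genes within individuals), F-statistics as ratios of variance components can be sketched as below. The notation follows the familiar Weir and Cockerham style and is an assumption here; the paper's general s-level estimators are not reproduced. The result satisfies Wright's relationship (1 − F_IT) = (1 − F_IS)(1 − F_ST).

```python
def f_statistics(var_among_pops, var_among_inds, var_within_inds):
    """Return (F_IS, F_ST, F_IT) from three variance components.

    var_among_pops:  component among populations
    var_among_inds:  component among individuals within populations
    var_within_inds: component among genes within individuals
    """
    total = var_among_pops + var_among_inds + var_within_inds
    f_st = var_among_pops / total
    f_it = (var_among_pops + var_among_inds) / total
    f_is = var_among_inds / (var_among_inds + var_within_inds)
    return f_is, f_st, f_it

# Made-up components, used only to check Wright's identity numerically.
f_is, f_st, f_it = f_statistics(0.1, 0.2, 0.7)
identity_gap = (1 - f_it) - (1 - f_is) * (1 - f_st)
```

In practice the variance components would themselves be estimated from ANOVA mean squares, which is where the sampling corrections described in the abstract enter.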

16.
17.
Boron Uptake by Excised Barley Roots
Active uptake of boron (B) by excised barley roots is linear with time for at least 1.5 h. Although no evidence was found for accumulation of B against a concentration gradient, this component of B uptake does satisfy other criteria for an active transport process. Transport is inhibited by 0.05 mM 2,4-dinitrophenol, 0.05 mM azide, 5 mM arsenate and 5 mM dicoumarol. Also, uptake is temperature-sensitive, being nil at 2°C and maximal at 34 to 38°C. Boron uptake by barley roots increases with time when they are washed in aerated 0.5 mM CaSO4 solution. A double reciprocal plot of the B uptake data manifests a series of phases separated by sharp transitions or “jumps”, and is compatible with the concept of multiphasic uptake mechanisms. Kinetic constants and transition points for the various phases were calculated accordingly. The fit of these data was compared statistically to three other relevant models, viz, the dual model, the “single + diffusion” model (a Michaelis–Menten term and a diffusion term), and the negative cooperativity model. In each case, the data were better represented by the multiphasic model.
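A double reciprocal (Lineweaver–Burk) plot linearizes a single Michaelis–Menten phase as 1/v = 1/Vmax + (Km/Vmax)(1/S). The sketch below fits one such phase by ordinary least squares on the reciprocals; in a multiphasic analysis it would be applied separately to the data between transition points, which are assumed to be known. The function name and inputs are illustrative.

```python
def fit_double_reciprocal(concentrations, rates):
    """Estimate (V_max, K_m) for one phase from a straight-line fit of 1/v on 1/S."""
    xs = [1.0 / s for s in concentrations]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    v_max = 1.0 / intercept      # intercept is 1/V_max
    k_m = slope * v_max          # slope is K_m/V_max
    return v_max, k_m

# Error-free synthetic data from V_max = 2, K_m = 0.5 (made-up values).
conc = [0.1, 0.5, 1.0, 2.0, 5.0]
uptake = [2.0 * s / (0.5 + s) for s in conc]
v_max_est, k_m_est = fit_double_reciprocal(conc, uptake)
```

Taking reciprocals gives the lowest concentrations the most leverage, so with noisy data the estimates can differ noticeably from a direct nonlinear fit.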

18.
Forty-five sorghum germplasm growing in the Eastern Highlands of Ethiopia were evaluated for 10 qualitative traits. Phenotypic frequencies between the accessions from each of the nine Aanaas and Alemaya University, grouped in 10 localities were tabulated. Phenotypic diversity index, H′, was analysed and the result indicated the between localities component of diversity to be relatively smaller than the variation in H′ among characters within localities. The value of H′ for all sample germplasm ranged from 0.36 to 0.95 with a mean of 0.71. The results showed that there is a wide morpho-agronomical diversity among the sample germplasm studied. This information can be used for the conservation of these germplasm resources and future improvement work of the sorghum crop.
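The abstract does not define H′. A minimal sketch, assuming it is the Shannon–Weaver index normalized by its maximum, H′ = −Σ p_i ln p_i / ln n, so that values lie between 0 and 1 (consistent with the reported range of 0.36 to 0.95):

```python
import math

def normalized_shannon(frequencies):
    """Normalized Shannon-Weaver index for one trait.

    frequencies: phenotypic class frequencies summing to 1; n is taken as
    the number of classes listed (assumption).
    """
    n = len(frequencies)
    if n < 2:
        return 0.0               # a single class carries no diversity
    entropy = -sum(p * math.log(p) for p in frequencies if p > 0)
    return entropy / math.log(n)
```

A trait whose classes are equally frequent scores 1, and a trait fixed for one class scores 0.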

19.
The gradual loss of diversity and the establishment of clines in allele frequencies associated with range expansions are patterns observed in many species, including humans. These patterns can result from a series of founder events occurring as populations colonize previously unoccupied areas. We develop a model of an expanding population and, using a branching process approximation, show that spatial gradients reflect different amounts of genetic drift experienced by different subpopulations. We then use this model to measure the net average strength of the founder effect, and we demonstrate that the predictions from the branching process model fit simulation results well. We further show that estimates of the effective founder size are robust to potential confounding factors such as migration between subpopulations. We apply our method to data from Arabidopsis thaliana. We find that the average founder effect is approximately three times larger in the Americas than in Europe, possibly indicating that a more recent, rapid expansion occurred.

20.