Similar documents (20 results)
1.
We consider estimation after a group sequential test. An estimator that is unbiased or has small bias may have substantial conditional bias (Troendle and Yu, 1999; Coburger and Wassmer, 2001). In this paper we derive the conditional maximum likelihood estimators of both the primary parameter and a secondary parameter, and investigate their properties within a conditional inference framework. The method applies to both the usual and adaptive group sequential test designs. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

2.
Brownian motions on coalescent structures have biological relevance, either as an approximation of the stepwise mutation model for microsatellites or as a model of spatial evolution that tracks the locations of individuals at successive generations. We discuss estimation procedures for the dispersal parameter of a Brownian motion defined on coalescent trees. First, we consider the mean square distance unbiased estimator and compute its variance. In a second approach, we introduce a phylogenetic estimator: given the UPGMA topology, the likelihood of the parameter is computed using a new dynamic programming method, and, with a proper correction, an unbiased estimator is derived from the pseudomaximum of the likelihood. The last approach computes the likelihood by a Markov chain Monte Carlo sampling method. For one-dimensional Brownian motion, this last method seems less reliable than pseudomaximum likelihood.

3.
Anderson and Pospahala (1970) investigated the estimation of wildlife population size using the belt or line transect sampling method and devised a correction for bias, thus leading to an estimator with interesting characteristics. This work was given a uniform mathematical framework in Burnham and Anderson (1976). In this paper we show that the Anderson-Pospahala estimator is optimal in the sense of being the (unique) best linear unbiased estimator within the class of estimators which are linear combinations of cell frequencies, provided certain assumptions are met.

4.
Targeted maximum likelihood estimation of a parameter of a data-generating distribution, known to be an element of a semiparametric model, involves constructing a parametric model through an initial density estimator with parameter ε representing an amount of fluctuation of the initial density estimator, where the score of this fluctuation model at ε = 0 equals the efficient influence curve/canonical gradient. The latter constraint can be satisfied by many parametric fluctuation models, since it represents only a local constraint on behavior at zero fluctuation. However, it is very important that the fluctuations stay within the semiparametric model for the observed data distribution, even if the parameter can be defined on fluctuations that fall outside the assumed observed data model. In particular, in the context of sparse data, by which we mean situations where the Fisher information is low, a violation of this property can heavily affect the performance of the estimator. This paper presents a fluctuation approach that guarantees the fluctuated density estimator remains inside the bounds of the data model. We demonstrate this in the context of estimating the causal effect of a binary treatment on a continuous outcome that is bounded. The result is a targeted maximum likelihood estimator that inherently respects known bounds, and consequently is more robust in sparse data situations than the targeted MLE using a naive fluctuation model. When an estimation procedure incorporates weights, observations with large weights relative to the rest heavily influence the point estimate and inflate the variance. Truncating these weights is a common approach to reducing the variance, but it can also introduce bias into the estimate. We present an alternative targeted maximum likelihood estimation (TMLE) approach that dampens the effect of these heavily weighted observations. As a substitution estimator, TMLE respects the global constraints of the observed data model. For example, when outcomes are binary, a fluctuation of an initial density estimate on the logit scale constrains predicted probabilities to be between 0 and 1. This inherent enforcement of bounds has been extended to continuous outcomes. Simulation study results indicate that this approach is on a par with, and often superior to, fluctuating on the linear scale, and in particular is more robust when there is sparsity in the data.
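The bounding idea described in this abstract can be sketched in a few lines (a minimal illustration of logit-scale fluctuation for a bounded continuous outcome, not the authors' TMLE implementation; the bounds a, b and the fluctuation parameter eps are hypothetical):

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def fluctuate_bounded(y_hat, eps, a, b):
    """Fluctuate an initial prediction of an outcome known to lie in [a, b]
    on the logit scale, so that any fluctuation size stays inside (a, b)."""
    p = (y_hat - a) / (b - a)          # rescale prediction to (0, 1)
    p_new = expit(logit(p) + eps)      # logistic fluctuation stays in (0, 1)
    return a + (b - a) * p_new         # map back to the original scale

# Even an extreme fluctuation cannot push the estimate outside (0, 5),
# unlike a fluctuation on the linear scale (y_hat + eps).
print(fluctuate_bounded(2.5, eps=10.0, a=0.0, b=5.0))
```

With eps = 0 the prediction is returned unchanged, so the fluctuation passes through the initial estimator as required.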

5.
The fate of scientific hypotheses often relies on the ability of a computational model to explain the data, quantified in modern statistical approaches by the likelihood function. The log-likelihood is the key element for parameter estimation and model evaluation. However, the log-likelihood of complex models in fields such as computational biology and neuroscience is often intractable to compute analytically or numerically. In those cases, researchers can often only estimate the log-likelihood by comparing observed data with synthetic observations generated by model simulations. Standard techniques to approximate the likelihood via simulation either rely on summary statistics of the data or risk producing substantially biased estimates. Here, we explore another method, inverse binomial sampling (IBS), which can estimate the log-likelihood of an entire data set efficiently and without bias. For each observation, IBS draws samples from the simulator model until one matches the observation; the log-likelihood estimate is then a function of the number of samples drawn. The variance of this estimator is uniformly bounded and achieves the minimum possible for an unbiased estimator, and calibrated estimates of the variance can be computed. We provide theoretical arguments in favor of IBS and an empirical assessment of the method for maximum-likelihood estimation with simulation-based models. As case studies, we take three model-fitting problems of increasing complexity from computational and cognitive neuroscience. In all problems, IBS generally produces lower error in the estimated parameters and maximum log-likelihood values than alternative sampling methods with the same average number of samples. Our results demonstrate the potential of IBS as a practical, robust, and easy-to-implement method for log-likelihood evaluation when exact techniques are not available.
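The core of IBS can be sketched briefly (a simplified illustration with a toy Bernoulli simulator of our own, not the authors' code; for an observation matched on the K-th draw, the per-observation estimate is the negative harmonic sum −(1 + 1/2 + … + 1/(K−1))):

```python
import random

def ibs_loglik(simulate, data, rng):
    """Inverse binomial sampling estimate of the log-likelihood of `data`.
    For each observation, draw from the simulator until a draw matches it;
    with K draws, the contribution is -(1 + 1/2 + ... + 1/(K-1))."""
    total = 0.0
    for obs in data:
        k = 1
        while simulate(rng) != obs:
            k += 1
        total -= sum(1.0 / j for j in range(1, k))  # empty sum when k == 1
    return total

# Toy simulator (hypothetical model): Bernoulli with success probability 0.7.
rng = random.Random(1)
simulate = lambda r: 1 if r.random() < 0.7 else 0
estimate = ibs_loglik(simulate, [1, 0, 1, 1], rng)
```

A simulator that reproduces every observation on the first draw yields an estimate of exactly 0, i.e. log-likelihood of a perfectly matching deterministic model.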

6.
Wijsman EM, Nur N. Human Heredity 2001;51(3):145-149
The measured genotype approach can be used to estimate the variance contributions of specific candidate loci to quantitative traits of interest. We show here that both the naive estimate of measured-locus heritability, obtained by invoking infinite-sample theory, and an estimate obtained from a bias-corrected variance estimate based on finite-sample theory, produce biased estimates of heritability. We identify the sources of bias, and quantify their effects. The two sources of bias are: (1) the estimation of heritability from population samples as the ratio of two variances, and (2) the existence of sampling error. We show that neither heritability estimator is less biased (in absolute value) than the other in all situations, and the choice of an ideal estimator is therefore a function of the sample size and magnitude of the locus-specific contribution to the overall phenotypic variance. In most cases the bias is small, so that the practical implications of using either estimator are expected to be minimal.

7.
Cheng Y, Shen Y. Biometrics 2004;60(4):910-918
For confirmatory trials used in regulatory decision making, it is important that adaptive designs under consideration provide inference at the correct nominal level, as well as unbiased estimates and confidence intervals for the treatment comparisons in the actual trials. However, the naive point estimate and its confidence interval are often biased in adaptive sequential designs. We develop a new procedure for estimation following a test from a sample size reestimation design. The method for obtaining an exact confidence interval and point estimate is based on a general distribution property of a pivot function of the self-designing group sequential clinical trial of Shen and Fisher (1999, Biometrics 55, 190-197). A modified estimate is proposed to explicitly account for the futility stopping boundary, with reduced bias when block sizes are small. The proposed estimates are shown to be consistent, and their computation is straightforward. We also provide a modified weight function to improve the power of the test. Extensive simulation studies show that the exact confidence intervals attain accurate nominal coverage probability, and that the proposed point estimates are nearly unbiased with practical sample sizes.

8.
Zavala A, Naya H, Romero H, Sabbia V, Piovani R, Musto H. Gene 2005;357(2):137-143
GC level is a key feature of prokaryotic genomes. Although it is widely employed in evolutionary studies, new insights remain limited by the relatively small number of characterized genomes. Since public databases mainly comprise several hundred prokaryotes with a low number of sequences per genome, a reliable prediction method based on available sequences may be useful for studies that need a trustworthy estimate of whole-genome GC. Because the analysis of completely sequenced genomes shows great variability in distributional shapes, it is of interest to compare different estimators. Our analysis shows that the mean of the GC values of a random sample of genes is a reasonable estimator, based on the simplicity of the calculation and its overall performance. However, available sequences usually come from a process that cannot be considered random sampling. When we analyzed two introduced sources of bias (gene length and protein functional category), we detected an additional bias in the estimate in some cases, although precision was not affected. We conclude that the mean genic GC level of a sample of 10 genes is a reliable estimator of genomic GC content, with accuracy comparable to that of many widely employed experimental methods.
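The sample-mean estimator this abstract recommends is simple to express (a sketch with made-up toy sequences; real applications would use ~10 annotated genes from the genome of interest):

```python
def gc_content(seq):
    """Fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def estimate_genomic_gc(genes):
    """Estimate whole-genome GC content as the unweighted mean of the
    per-gene GC values of a sample of genes."""
    return sum(gc_content(g) for g in genes) / len(genes)

# Three toy gene sequences (hypothetical data).
sample = ["ATGCGC", "GGCCAT", "ATATGC"]
print(round(estimate_genomic_gc(sample), 3))  # → 0.556
```

Averaging per-gene fractions (rather than pooling all bases) matches the abstract's "mean genic GC level" and keeps long genes from dominating the estimate.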

9.
Point estimation in group sequential and adaptive trials is an important issue in analysing a clinical trial. Most literature in this area is concerned only with estimation after completion of a trial. Since adaptive designs allow reassessment of the sample size during the trial, reliable point estimation of the true effect while the trial is continuing is also needed. We present a bias-adjusted estimator which allows a more exact sample size determination based on the conditional power principle than the naive sample mean does.

10.
Barabesi L, Pisani C. Biometrics 2002;58(3):586-592
In practical ecological sampling studies, a certain design (such as plot sampling or line-intercept sampling) is usually replicated more than once. For each replication, the Horvitz-Thompson estimator of the objective parameter is considered. Finally, an overall estimator is obtained by averaging the single Horvitz-Thompson estimators. Because the design replications are drawn independently and under the same conditions, the overall estimator is simply the sample mean of the Horvitz-Thompson estimators under simple random sampling. This procedure can be improved by using ranked set sampling. Hence, we propose the replicated protocol under ranked set sampling, which gives rise to more accurate estimation than the replicated protocol under simple random sampling.

11.
Cai J, Sen PK, Zhou H. Biometrics 1999;55(1):182-189
A random effects model for analyzing multivariate failure time data is proposed. The work is motivated by the need for assessing the mean treatment effect in a multicenter clinical trial study, assuming that the centers are a random sample from an underlying population. An estimating equation for the mean hazard ratio parameter is proposed. The proposed estimator is shown to be consistent and asymptotically normally distributed. A variance estimator, based on large sample theory, is proposed. Simulation results indicate that the proposed estimator performs well in finite samples. The proposed variance estimator effectively corrects the bias of the naive variance estimator, which assumes independence of individuals within a group. The methodology is illustrated with a clinical trial data set from the Studies of Left Ventricular Dysfunction. This shows that the variability of the treatment effect is higher than found by means of simpler models.

12.
Gene diversity is sometimes estimated from samples that contain inbred or related individuals. If inbred or related individuals are included in a sample, then the standard estimator for gene diversity produces a downward bias caused by an inflation of the variance of estimated allele frequencies. We develop an unbiased estimator for gene diversity that relies on kinship coefficients for pairs of individuals with known relationship and that reduces to the standard estimator when all individuals are noninbred and unrelated. Applying our estimator to data simulated based on allele frequencies observed for microsatellite loci in human populations, we find that the new estimator performs favorably compared with the standard estimator in terms of bias and similarly in terms of mean squared error. For human population-genetic data, we find that a close linear relationship previously seen between gene diversity and distance from East Africa is preserved when adjusting for the inclusion of close relatives.
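For reference, the standard gene diversity estimator that the kinship-adjusted estimator reduces to for noninbred, unrelated individuals can be sketched as follows (our own illustration of the classical formula H = n/(n−1) · (1 − Σ p̂ᵢ²), not the paper's adjusted estimator):

```python
from collections import Counter

def gene_diversity(alleles):
    """Standard estimator of gene diversity (expected heterozygosity):
    H = n/(n-1) * (1 - sum of squared sample allele frequencies),
    where n is the number of sampled gene copies at the locus."""
    n = len(alleles)
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Four gene copies, two alleles at equal frequency (hypothetical data).
print(gene_diversity(["A", "a", "A", "a"]))  # → 0.666... (= 2/3)
```

The n/(n−1) factor corrects the small-sample bias of the plug-in estimate; the paper's contribution is a further correction when sampled individuals are inbred or related.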

13.
In ecology, as in other research fields, efficient sampling for population estimation often drives sample designs toward unequal probability sampling, as in stratified sampling. Design-based statistical analysis tools are appropriate for seamless integration of the sample design into the statistical analysis. However, it is also common and necessary, after a sampling design has been implemented, to use the resulting datasets to address questions that, in many cases, were not considered during the sampling design phase. Questions may arise requiring model-based statistical tools such as multiple regression, quantile regression, or regression tree analysis. Such model-based tools may require, to ensure unbiased estimation, data from simple random samples, which is problematic when analyzing data from unequal probability designs. Despite the numerous method-specific tools available to properly account for sampling design, too often in the analysis of ecological data the sample design is ignored and the consequences are not properly considered. We demonstrate here that violation of this assumption can lead to biased parameter estimates in ecological research. In addition to the set of tools available for researchers to properly account for sampling design in model-based analysis, we introduce inverse probability bootstrapping (IPB), an easily implemented method for obtaining equal-probability resamples from a probability sample, from which unbiased model-based estimates can be made. We demonstrate the potential for bias in model-based analyses that ignore sample inclusion probabilities, and the effectiveness of IPB sampling in eliminating this bias, using both simulated and actual ecological data. For illustration, we considered three model-based analysis tools: linear regression, quantile regression, and boosted regression tree analysis. In all models, using both simulated and actual ecological data, we found inferences to be biased, sometimes severely, when sample inclusion probabilities were ignored, while IPB sampling effectively produced unbiased parameter estimates.
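The resampling step of IPB, drawing bootstrap samples with weights inversely proportional to the inclusion probabilities so that the resample behaves like an equal-probability sample, can be sketched as follows (a minimal illustration under assumed inclusion probabilities, not the authors' implementation):

```python
import random

def ipb_resample(data, incl_prob, rng, size=None):
    """Draw one inverse-probability bootstrap resample: each unit is selected
    with weight 1/pi_i, undoing the unequal inclusion probabilities
    of the original sampling design."""
    weights = [1.0 / p for p in incl_prob]
    size = len(data) if size is None else size
    return rng.choices(data, weights=weights, k=size)

# Hypothetical sample: the first unit was heavily oversampled by the design.
data = [10, 20, 30]
pi = [0.9, 0.3, 0.3]
resample = ipb_resample(data, pi, random.Random(0), size=1000)
# Units with small inclusion probability are drawn more often in the
# resample, restoring the balance of an equal-probability sample.
```

Model-based tools (regression, quantile regression, regression trees) are then fit to such resamples, and estimates are aggregated across bootstrap replicates.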

14.
Two-stage, drop-the-losers designs for adaptive treatment selection have been considered by many authors. The distributions of conditional sufficient statistics and the Rao-Blackwell technique were used to obtain an unbiased estimate and to construct an exact confidence interval for the parameter of interest. In this paper, we characterize the selection process from a binomial drop-the-losers design using a truncated binomial distribution. We propose a new estimator and show that it is asymptotically consistent with a large sample size in either the first stage or the second stage. Supported by simulation analyses, we recommend the new estimator over the naive estimator and the Rao-Blackwell-type estimator based on its robustness in the finite-sample setting. We frame the concept as a simple and easily implemented procedure for phase 2 oncology trial design that can be confirmatory in nature, and we use an example to illustrate its application.

15.
Ranked set sampling is a method which may be used to increase the efficiency of the estimator of the mean of a population. Ranked set sampling with size-biased probability of selection (i.e., items are selected with probability proportional to their size) is combined with the line intercept method to increase the efficiency of estimating cover, density, and the total amount of some variable of interest (e.g., biomass). A two-stage sampling plan is suggested with line intercept sampling in the first stage. Simple random sampling and ranked set sampling are compared in the second stage to show that the unbiased estimators of density, cover, and total amount based on ranked set sampling have smaller variances than the usual unbiased estimator based on simple random sampling. Efficiency is increased by reducing the number of items measured on a transect or by increasing the number of independent transects utilized in a study area. An application procedure is given for estimating the coverage, density, and number of stems of mountain mahogany (Cercocarpus montanus) in a study area east of Laramie, Wyoming.
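One cycle of balanced ranked set sampling, the building block behind several of the designs above, can be sketched as follows (our own minimal illustration: for set size m, draw m independent judgment sets of m units, rank each set, and measure only the i-th ranked unit of the i-th set):

```python
import random

def rss_mean(population, m, rng):
    """One cycle of balanced ranked set sampling: from the i-th set of m
    randomly drawn units, measure only the i-th smallest, then average
    the m measured values."""
    measured = []
    for i in range(m):
        judgment_set = rng.sample(population, m)
        measured.append(sorted(judgment_set)[i])  # i-th order statistic
    return sum(measured) / m

# Hypothetical population of 100 unit sizes.
pop = list(range(100))
print(rss_mean(pop, m=3, rng=random.Random(42)))
```

In practice the ranking is done by a cheap judgment criterion (e.g., visual size) while only the m selected units are measured exactly, which is where the efficiency gain over simple random sampling of m units comes from.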

16.
Estimating the rate of change of the composition of communities is of direct interest for many fundamental and applied questions in ecology. One methodological problem is that it is hard to detect all the species present in a community. Nichols et al. presented an estimator of the local extinction rate that takes species detection probability into account, but little information is available on its performance. They predicted, however, that if a covariance between species detection probability and local extinction rate exists in a community, the estimator of the complement of the local extinction rate would be positively biased. Here, we show, using simulations over a wide range of parameters, that the estimator performs reasonably well: the bias induced by biological factors appears relatively weak. The most important factor enhancing the performance (bias and precision) of the estimator is sampling effort. Interestingly, a potentially important biological bias, the covariance effect, improves the estimation for small sampling efforts without inducing additional overestimation when sampling effort is high. In the field, all species are rarely detectable, so we recommend the use of estimators that account for heterogeneity in species detection probability when estimating the vital rates responsible for community changes.

17.
In this article, we provide a method of estimation for the treatment effect in the adaptive design for censored survival data with or without adjusting for risk factors other than the treatment indicator. Within the semiparametric Cox proportional hazards model, we propose a bias-adjusted parameter estimator for the treatment coefficient and its asymptotic confidence interval at the end of the trial. The method for obtaining an asymptotic confidence interval and point estimator is based on a general distribution property of the final test statistic from the weighted linear rank statistics at the interims with or without considering the nuisance covariates. The computation of the estimates is straightforward. Extensive simulation studies show that the asymptotic confidence intervals have reasonable nominal probability of coverage, and the proposed point estimators are nearly unbiased with practical sample sizes.

18.
Pan W, Lin X, Zeng D. Biometrics 2006;62(2):402-412
We propose a new class of models, transition measurement error models, to study the effects of covariates and past responses on the current response in longitudinal studies when one of the covariates is measured with error. We show that the response variable, conditional on the error-prone covariate, follows a complex transition mixed effects model. The naive model obtained by ignoring the measurement error correctly specifies the transition part of the model, but misspecifies the covariate effect structure and ignores the random effects. We next study the asymptotic bias in the naive estimator obtained by ignoring the measurement error, for both continuous and discrete outcomes. We show that the naive estimator of the regression coefficient of the error-prone covariate is attenuated, while the naive estimators of the regression coefficients of the past responses are generally inflated. We then develop a structural modeling approach for parameter estimation by maximum likelihood. In view of the multidimensional integration required by full maximum likelihood estimation, an EM algorithm is developed in which Monte Carlo simulations are used to evaluate the conditional expectations in the E-step. We evaluate the performance of the proposed method through a simulation study and apply it to a longitudinal social support study of elderly women with heart disease. An additional simulation study shows that the Bayesian information criterion (BIC) performs well in choosing the correct transition orders of the models.

19.
A finite population consists of kN individuals of N different categories with k individuals each. It is required to estimate the unknown parameter N, the number of different classes in the population. A sequential sampling scheme is considered in which individuals are sampled until a preassigned number of repetitions of already observed categories occur in the sample. Corresponding fixed sample size schemes were considered by Charalambides (1981). The sequential sampling scheme has the advantage of always allowing unbiased estimation of the size parameter N. It is shown that relative to Charalambides' fixed sample size scheme only minor adjustments are required to account for the sequential scheme. In particular, MVU estimators of parametric functions are expressible in terms of the C-numbers introduced by Charalambides.

20.
Estimating effective population size or mutation rate with microsatellites
Xu H, Fu YX. Genetics 2004;166(1):555-563
Microsatellites are short tandem repeats that are widely dispersed among eukaryotic genomes. Many of them are highly polymorphic, and they have been used widely in genetic studies. Statistical properties of all measures of genetic variation at microsatellites depend critically on the composite parameter θ = 4Nμ, where N is the effective population size and μ is the mutation rate per locus per generation. Since mutation leads to expansion or contraction of the repeat number in a stepwise fashion, the stepwise mutation model has been widely used to study the dynamics of these loci. We developed an estimator of θ, θ̂_F, based on sample homozygosity under the single-step stepwise mutation model. The estimator is unbiased and is much more efficient than the variance-based estimator under the single-step stepwise mutation model. It also has smaller bias and mean square error (MSE) than the variance-based estimator when the mutation follows the multistep generalized stepwise mutation model. Compared with the maximum-likelihood estimator θ̂_L, θ̂_F has less bias and smaller MSE in general. θ̂_L has a slight advantage when θ is small, but in such a situation the bias in θ̂_L may be more of a concern.
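Under the single-step stepwise mutation model, the expected homozygosity is F = 1/√(1 + 2θ) (a classical result of the ladder model), so a moment-type estimator in the spirit of θ̂_F can be sketched by inverting that relation (our simplified illustration only; the paper's estimator additionally involves a bias correction based on the sample):

```python
import math

def theta_from_homozygosity(F):
    """Invert F = 1 / sqrt(1 + 2*theta), the expected homozygosity under
    the single-step stepwise mutation model, to obtain a moment-type
    estimate of theta = 4*N*mu from an observed homozygosity F."""
    return (1.0 / F**2 - 1.0) / 2.0

# Round trip: theta = 1 implies F = 1/sqrt(3).
print(theta_from_homozygosity(1.0 / math.sqrt(3.0)))  # ≈ 1.0
```

Plugging in the sample homozygosity directly, as done here, ignores sampling variability; the abstract's point is that the corrected θ̂_F removes the resulting bias while retaining the efficiency advantage over the variance-based estimator.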


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号