Similar literature
 20 similar articles found; search time: 46 ms
1.
Multiple logistic regression analysis is used to estimate the relative risk in case-control studies. The estimators obtained are valid when the disease is rare. In this paper, an estimator of relative risk in a case-control study is proposed that uses logistic regression results when the incidence of disease is not small. The bias of the usual logistic regression estimator relative to the new estimator is worked out. An expression for the mean square error of the proposed estimator is derived both when the incidence of disease is known exactly and when it is estimated through an independent survey. The conventional estimator of relative risk shows a significant bias when the incidence of disease is high. In such situations the proposed estimator can be used with advantage.
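
To illustrate why the rare-disease assumption matters, here is a minimal sketch of the textbook identity that converts an odds ratio into a relative risk when the risk in the unexposed group is known. It is not the estimator proposed in the abstract above, whose form is not given here; the function name and the parameter `baseline_risk` are assumptions made for this example.

```python
def odds_ratio_to_relative_risk(odds_ratio, baseline_risk):
    """Convert an odds ratio into a relative risk.

    baseline_risk is the (assumed known) disease risk p0 among the unexposed.
    From OR = [p1/(1-p1)] / [p0/(1-p0)] it follows that
    RR = p1/p0 = OR / (1 - p0 + p0 * OR).
    """
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    if not 0.0 <= baseline_risk < 1.0:
        raise ValueError("baseline_risk must lie in [0, 1)")
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)


# Rare disease: the odds ratio is close to the relative risk.
rare = odds_ratio_to_relative_risk(2.0, 0.001)
# Common disease: the odds ratio overstates the relative risk.
common = odds_ratio_to_relative_risk(2.0, 0.4)
```

With an odds ratio of 2, the implied relative risk stays near 2 when the baseline risk is 0.1% but drops to about 1.43 when the baseline risk is 40%, which is the kind of gap the abstract describes for high incidence.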

2.
Yao YC, Tai JJ. Biometrics 2000, 56(3): 795-800
Segregation ratio estimation has long been important in human genetics. A simple truncated binomial model is considered that assumes complete ascertainment and a deterministic genotype-phenotype relationship. A simple but intuitively appealing estimator of the segregation ratio, previously proposed, is shown to have a negative bias. It is also shown that the bias of this estimator can be largely reduced via a randomization device, resulting in a new estimator that has the same large-sample behavior but with a negligible bias (decaying at a geometric rate). Numerical results are given to show the small-sample performance of this new estimator. An extension to incomplete ascertainment is also considered.
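
For orientation, the zero-truncated binomial distribution that underlies a model of this kind can be written out as below. The symbols are assumptions chosen for this sketch rather than the paper's notation: $s$ is the sibship size, $R$ the number of affected sibs, and $p$ the segregation ratio; complete ascertainment means that only sibships with $R \ge 1$ are observed.

```latex
\[
  \Pr(R = r \mid R \ge 1)
    = \frac{\binom{s}{r}\, p^{r} (1-p)^{s-r}}{1 - (1-p)^{s}},
  \qquad r = 1, \dots, s,
\]
\[
  \mathbb{E}[R \mid R \ge 1] = \frac{s\,p}{1 - (1-p)^{s}} \;>\; s\,p .
\]
```

Because the conditional mean exceeds $sp$, the raw proportion of affected sibs among ascertained sibships overstates $p$, which is why estimators for this model need an explicit correction for truncation.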

3.
This paper considers a Stein‐rule mixed regression estimator for estimating a normal linear regression model in the presence of stochastic linear constraints. We derive the small disturbance asymptotic bias and risk of the proposed estimator, and analytically compare its risk with other related estimators. A Monte‐Carlo experiment investigates the empirical risk performance of the proposed estimator.

4.
Is cross-validation valid for small-sample microarray classification?   Total citations: 5 (self-citations: 0; citations by others: 5)
MOTIVATION: Microarray classification typically possesses two striking attributes: (1) classifier design and error estimation are based on remarkably small samples and (2) cross-validation error estimation is employed in the majority of the papers. Thus, it is necessary to have a quantifiable understanding of the behavior of cross-validation in the context of very small samples. RESULTS: An extensive simulation study has been performed comparing cross-validation, resubstitution and bootstrap estimation for three popular classification rules (linear discriminant analysis, 3-nearest-neighbor, and CART decision trees), using both synthetic and real breast-cancer patient data. Comparison is via the distribution of differences between the estimated and true errors. Various statistics for the deviation distribution have been computed: mean (for estimator bias), variance (for estimator precision), root-mean-square error (for the combination of bias and variance) and quartile ranges, including outlier behavior. In general, while cross-validation error estimation is much less biased than resubstitution, it displays excessive variance, which makes individual estimates unreliable for small samples. Bootstrap methods provide improved performance relative to variance, but at a high computational cost and often with increased bias (albeit much less than with resubstitution).
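
The two simplest error estimators compared above can be sketched as follows for a 3-nearest-neighbor rule. This is a minimal illustration using NumPy only, not the study's simulation code; the function names and the binary 0/1 label convention are assumptions made for this example.

```python
import numpy as np


def knn_predict(train_x, train_y, query_x, k=3):
    """Majority-vote k-nearest-neighbor prediction for binary labels 0/1."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(train_x[None, :, :] - query_x[:, None, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest].mean(axis=1)
    return (votes > 0.5).astype(int)


def resubstitution_error(x, y, k=3):
    """Error rate when the classifier is tested on its own training data."""
    return float(np.mean(knn_predict(x, y, x, k) != y))


def leave_one_out_error(x, y, k=3):
    """Leave-one-out cross-validation error rate."""
    n = len(y)
    mistakes = 0
    for i in range(n):
        keep = np.arange(n) != i  # hold out sample i
        pred = knn_predict(x[keep], y[keep], x[i:i + 1], k)
        mistakes += int(pred[0] != y[i])
    return mistakes / n
```

Resubstitution lets every point vote for itself, which is one source of its optimistic bias; leave-one-out removes that vote, but each estimate then rests on very few held-out decisions when the sample is small.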

5.
Estimating the rate of change of the composition of communities is of direct interest to address many fundamental and applied questions in ecology. One methodological problem is that it is hard to detect all the species present in a community. Nichols et al. presented an estimator of the local extinction rate that takes into account species' probability of detection, but little information is available on its performance. They predicted, however, that if a covariance between species detection probability and local extinction rate exists in a community, the estimator of the local extinction rate complement would be positively biased.
Here, we show, using simulations over a wide range of parameters, that the estimator performs reasonably well. The bias induced by biological factors appears relatively weak. The most important factor enhancing the performance (bias and precision) of the local extinction rate complement estimator is sampling effort. Interestingly, a potentially important biological bias, such as the covariance effect, improves the estimation for small sampling efforts, without inducing a supplementary overestimation when these sampling efforts are high. In the field, all species are rarely detectable, so we recommend the use of such estimators that take into account heterogeneity in species detection probability when estimating vital rates responsible for community changes.

6.
The purpose of the study is to estimate the population size under a homogeneous truncated count model and under model contaminations via the Horvitz‐Thompson approach on the basis of a count capture‐recapture experiment. The proposed estimator is based on a mixture of zero‐truncated Poisson distributions. The benefits of the proposed model are that it supports statistical inference for long‐tailed or skewed distributions and that the likelihood function is concave, with strong results available on the nonparametric maximum likelihood estimator (NPMLE). A simulation study compares McKendrick's, Mantel‐Haenszel's, Zelterman's, Chao's, the maximum likelihood, and the proposed methods. Under model contaminations, the proposed estimator is the best choice, with the smallest bias and smallest mean square error for sufficiently large population sizes; further results show that it also performs well in the homogeneous situation. Two empirical examples, the cholera epidemic in India (homogeneous case) and the heroin user data from Bangkok in 2002 (heterogeneous case), are fitted with excellent goodness of fit, and the confidence interval estimates may also be of considerable interest. (© 2008 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
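
The homogeneous building block of this approach can be sketched as follows: fit a single zero-truncated Poisson to the observed capture counts by maximum likelihood, then divide the number of observed units by the estimated probability of being seen at least once. This is a minimal sketch of the homogeneous case only, not the mixture-based estimator proposed in the abstract; the function names are assumptions made for this example.

```python
import math


def fit_zero_truncated_poisson(counts):
    """Maximum likelihood estimate of lambda for zero-truncated Poisson counts.

    The MLE solves lambda / (1 - exp(-lambda)) = sample mean, found here by
    bisection because the left-hand side is increasing in lambda.
    """
    if min(counts) < 1:
        raise ValueError("zero-truncated counts must all be >= 1")
    mean = sum(counts) / len(counts)
    if mean <= 1.0:
        raise ValueError("sample mean must exceed 1 for a finite positive MLE")
    lo, hi = 1e-12, mean
    for _ in range(200):
        mid = (lo + hi) / 2.0
        # -expm1(-mid) equals 1 - exp(-mid), computed stably for small mid.
        if mid / -math.expm1(-mid) < mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def horvitz_thompson_population_size(counts):
    """Estimate N as n / P(count >= 1) under the fitted Poisson model."""
    lam = fit_zero_truncated_poisson(counts)
    prob_observed = -math.expm1(-lam)
    return len(counts) / prob_observed


# Hypothetical capture counts for six observed individuals.
example_counts = [1, 1, 1, 2, 2, 3]
n_hat = horvitz_thompson_population_size(example_counts)
```

The estimate always exceeds the number of observed units, since the fitted probability of being seen at least once is below one; heterogeneity in capture rates is what the mixture model in the abstract is meant to handle.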

7.
For multicenter randomized trials or multilevel observational studies, the Cox regression model has long been the primary approach to study the effects of covariates on time-to-event outcomes. A critical assumption of the Cox model is the proportionality of the hazard functions for modeled covariates, violations of which can result in ambiguous interpretations of the hazard ratio estimates. To address this issue, the restricted mean survival time (RMST), defined as the mean survival time up to a fixed time in a target population, has been recommended as a model-free target parameter. In this article, we generalize the RMST regression model to clustered data by directly modeling the RMST as a continuous function of restriction times with covariates while properly accounting for within-cluster correlations to achieve valid inference. The proposed method estimates regression coefficients via weighted generalized estimating equations, coupled with a cluster-robust sandwich variance estimator to achieve asymptotically valid inference with a sufficient number of clusters. In small-sample scenarios where a limited number of clusters are available, however, the proposed sandwich variance estimator can exhibit negative bias in capturing the variability of regression coefficient estimates. To overcome this limitation, we further propose and examine bias-corrected sandwich variance estimators to reduce the negative bias of the cluster-robust sandwich variance estimator. We study the finite-sample operating characteristics of proposed methods through simulations and reanalyze two multicenter randomized trials.
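
The definition of the RMST as the mean survival time up to a fixed time corresponds to the area under the survival curve up to that time. A minimal sketch follows, assuming a single unclustered sample and using the Kaplan-Meier curve as the survival estimate; this illustrates the target parameter only, not the regression method of the abstract, and the function names are assumptions made for this example.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1=event, 0=censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def restricted_mean_survival_time(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)
```

Without censoring, the result equals the sample mean of min(T, tau), which gives a quick sanity check on the implementation.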

8.
A modified estimator of heritability is proposed under a heteroscedastic one-way unbalanced random model. The distribution, moments and probability of permissible values (PPV) of the conventional and modified estimators are derived. The behaviour of the two estimators is investigated numerically to identify a suitable estimator of heritability under variance heterogeneity. The numerical results reveal that in the balanced case heteroscedasticity affects the bias, MSE and PPV of the conventional estimator only marginally. In unbalanced situations, the conventional estimator underestimates the parameter when the more variable group has more observations and overestimates it when the more variable group has fewer observations; its MSE decreases when the more variable group has more observations and increases when it has fewer; and its PPV is marginally decreased. The MSE and PPV are comparable for the two estimators, while the bias of the modified estimator is smaller than that of the conventional estimator, particularly for small and medium values of the parameter. These results suggest using the modified estimator, with equal or more observations for the more variable group, in the presence of variance heterogeneity.

9.
Mancl and DeRouen (2001, Biometrics 57, 126-134) and Kauermann and Carroll (2001, JASA 96, 1387-1398) proposed alternative bias-corrected covariance estimators for generalized estimating equations parameter estimates of regression models for marginal means. The finite sample properties of these estimators are compared to those of the uncorrected sandwich estimator that underestimates variances in small samples. Although the formula of Mancl and DeRouen generally overestimates variances, it often leads to coverage of 95% confidence intervals near the nominal level even in some situations with as few as 10 clusters. An explanation for these seemingly contradictory results is that the tendency to undercoverage resulting from the substantial variability of sandwich estimators counteracts the impact of overcorrecting the bias. However, these positive results do not generally hold; for small cluster sizes (e.g., <10) their estimator often results in overcoverage, and the bias-corrected covariance estimator of Kauermann and Carroll may be preferred. The methods are illustrated using data from a nested cross-sectional cluster intervention trial on reducing underage drinking.

10.
Haas PJ, Liu Y, Stokes L. Biometrics 2006, 62(1): 135-141
We consider the problem of estimating the number of distinct species S in a study area from the recorded presence or absence of species in each of a sample of quadrats. A generalized jackknife estimator of S is derived, along with an estimate of its variance. It is compared with the jackknife estimator for S proposed by Heltshe and Forrester and the empirical Bayes estimator of Mingoti and Meeden. We show that the empirical Bayes estimator has the form of a generalized jackknife estimator under a specific model for species distribution. We compare the new estimators of S to the empirical Bayes estimator via simulation. We characterize circumstances under which each is superior.
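
As a point of reference for the comparison above, here is a minimal sketch of the first-order jackknife richness estimator for quadrat presence/absence data, in the form usually attributed to Heltshe and Forrester: the observed species count plus (n - 1)/n times the number of species found in exactly one quadrat. It is not the generalized jackknife derived in the abstract; the function name and the set-based input format are assumptions made for this example.

```python
def first_order_jackknife_richness(quadrats):
    """First-order jackknife estimate of species richness.

    quadrats: a list with one collection of species identifiers per quadrat.
    """
    n = len(quadrats)
    if n == 0:
        raise ValueError("at least one quadrat is required")
    # Count in how many quadrats each species was recorded.
    occupancy = {}
    for quadrat in quadrats:
        for species in set(quadrat):
            occupancy[species] = occupancy.get(species, 0) + 1
    observed = len(occupancy)
    uniques = sum(1 for count in occupancy.values() if count == 1)
    return observed + uniques * (n - 1) / n


# Hypothetical survey: species "a" is everywhere, "b", "c", "d" occur once each.
survey = [{"a", "b"}, {"a", "c"}, {"a"}, {"a", "d"}]
richness = first_order_jackknife_richness(survey)
```

In the hypothetical survey, four species are observed and three of them occur in a single quadrat, so the estimate is 4 + 3 × 3/4 = 6.25.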

11.
Outcome misclassification occurs frequently in binary-outcome studies and can result in biased estimation of quantities such as the incidence, prevalence, cause-specific hazards, cumulative incidence functions, and so forth. A number of remedies have been proposed to address the potential misclassification of the outcomes in such data. The majority of these remedies lie in the estimation of misclassification probabilities, which are in turn used to adjust analyses for outcome misclassification. A number of authors advocate using a gold-standard procedure on a sample internal to the study to learn about the extent of the misclassification. With this type of internal validation, the problem of quantifying the misclassification also becomes a missing data problem as, by design, the true outcomes are only ascertained on a subset of the entire study sample. Although the process of estimating misclassification probabilities appears conceptually simple, the estimation methods proposed so far have several methodological and practical shortcomings. Most methods rely on the missing outcome data being missing completely at random (MCAR), a rather stringent assumption that is unlikely to hold in practice. Some of the existing methods also tend to be computationally intensive. To address these issues, we propose a computationally efficient, easy-to-implement, pseudo-likelihood estimator of the misclassification probabilities under a missing at random (MAR) assumption, in studies with an available internal-validation sample. We present the estimator through the lens of studies with competing-risks outcomes, though the estimator extends beyond this setting. We describe the consistency and asymptotic distributional properties of the resulting estimator, and derive a closed-form estimator of its variance. The finite-sample performance of this estimator is evaluated via simulations.
Using data from a real-world study with competing-risks outcomes, we illustrate how the proposed method can be used to estimate misclassification probabilities. We also show how the estimated misclassification probabilities can be used in an external study to adjust for possible misclassification bias when modeling cumulative incidence functions.

12.
A simple linear regression model is considered where the independent variable assumes only a finite number of values and the response variable is randomly right censored. However, the censoring distribution may depend on the covariate values. A class of noniterative estimators for the slope parameter is proposed, namely the noniterative unrestricted estimator, the noniterative restricted estimator and the noniterative improved pretest estimator. The asymptotic bias and mean squared errors of the proposed estimators are derived and compared. The relative dominance picture of the estimators is investigated. A simulation study is also performed to assess the properties of the various estimators for small samples.

13.
Empirical Bayes models have been shown to be powerful tools for identifying differentially expressed genes from gene expression microarray data. An example is the WAME model, where a global covariance matrix accounts for array-to-array correlations as well as differing variances between arrays. However, the existing method for estimating the covariance matrix is very computationally intensive and the estimator is biased when the data contain many regulated genes. In this paper, two new methods for estimating the covariance matrix are proposed. The first method is a direct application of the EM algorithm for fitting the multivariate t-distribution of the WAME model. In the second method, a prior distribution for the log fold-change is added to the WAME model, and a discrete approximation is used for this prior. Both methods are evaluated using simulated and real data. The first method performs as well as the existing method in terms of bias and variability, but is superior in terms of computer time. For large data sets (>15 arrays), the second method also shows superior computer run time. Moreover, for simulated data with regulated genes the second method greatly reduces the bias. With the proposed methods it is possible to apply the WAME model to large data sets with reasonable computer run times. The second method shows a small bias for simulated data, but appears to have a larger bias for real data with many regulated genes.

14.
Several methods have been designed to infer species trees from gene trees while taking into account gene tree/species tree discordance. Although some of these methods provide consistent species tree topology estimates under a standard model, most either do not estimate branch lengths or are computationally slow. An exception, the GLASS method of Mossel and Roch, is consistent for the species tree topology, estimates branch lengths, and is computationally fast. However, GLASS systematically overestimates divergence times, leading to biased estimates of species tree branch lengths. By assuming a multispecies coalescent model in which multiple lineages are sampled from each of two taxa at L independent loci, we derive the distribution of the waiting time until the first interspecific coalescence occurs between the two taxa, considering all loci and measuring from the divergence time. We then use the mean of this distribution to derive a correction to the GLASS estimator of pairwise divergence times. We show that our improved estimator, which we call iGLASS, consistently estimates the divergence time between a pair of taxa as the number of loci approaches infinity, and that it is an unbiased estimator of divergence times when one lineage is sampled per taxon. We also show that many commonly used clustering methods can be combined with the iGLASS estimator of pairwise divergence times to produce a consistent estimator of the species tree topology. Through simulations, we show that iGLASS can greatly reduce the bias and mean squared error in obtaining estimates of divergence times in a species tree.

15.
Zhang Z, Chen Z, Troendle JF, Zhang J. Biometrics 2012, 68(3): 697-706
Summary: The current statistical literature on causal inference is primarily concerned with population means of potential outcomes, while the current statistical practice also involves other meaningful quantities such as quantiles. Motivated by the Consortium on Safe Labor (CSL), a large observational study of obstetric labor progression, we propose and compare methods for estimating marginal quantiles of potential outcomes as well as quantiles among the treated. By adapting existing methods and techniques, we derive estimators based on outcome regression (OR), inverse probability weighting, and stratification, as well as a doubly robust (DR) estimator. By incorporating stratification into the DR estimator, we further develop a hybrid estimator with enhanced numerical stability at the expense of a slight bias under misspecification of the OR model. The proposed methods are illustrated with the CSL data and evaluated in simulation experiments mimicking the CSL.

16.
In diagnostic medicine, the volume under the receiver operating characteristic (ROC) surface (VUS) is a commonly used index to quantify the ability of a continuous diagnostic test to discriminate between three disease states. In practice, verification of the true disease status may be performed only for a subset of subjects under study since the verification procedure is invasive, risky, or expensive. The selection for disease examination might depend on the results of the diagnostic test and other clinical characteristics of the patients, which in turn can cause bias in estimates of the VUS. This bias is referred to as verification bias. Existing verification bias correction in three‐way ROC analysis focuses on ordinal tests. We propose verification bias‐correction methods to construct the ROC surface and estimate the VUS for a continuous diagnostic test, based on inverse probability weighting. By applying U‐statistics theory, we develop asymptotic properties for the estimator. A jackknife estimator of variance is also derived. Extensive simulation studies are performed to evaluate the performance of the new estimators in terms of bias correction and variance. The proposed methods are used to assess the ability of a biomarker to accurately identify stages of Alzheimer's disease.
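
For context, when every subject's disease status is verified, the VUS can be estimated as the proportion of triples, one test value from each disease class, that fall in the correct order. The sketch below covers only that fully verified case and ignores ties; it does not implement the inverse-probability-weighted correction proposed in the abstract. The function name and argument names are assumptions made for this example.

```python
import numpy as np


def empirical_vus(class1, class2, class3):
    """Proportion of triples (x1, x2, x3) with x1 < x2 < x3.

    class1, class2, class3 hold test values for the three disease states,
    listed in the order in which the test is expected to increase.
    """
    x1 = np.asarray(class1, dtype=float)[:, None, None]
    x2 = np.asarray(class2, dtype=float)[None, :, None]
    x3 = np.asarray(class3, dtype=float)[None, None, :]
    # Broadcasting builds an (n1, n2, n3) grid of ordering indicators.
    return float(np.mean((x1 < x2) & (x2 < x3)))
```

A test with no discriminating ability gives a VUS near 1/6, the chance that three exchangeable values land in one particular order, while a perfectly discriminating test gives 1.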

17.
The problem of estimating the ratio of population proportions is considered, and a difference-type estimator using auxiliary information is proposed. The bias and mean squared error of the proposed estimator are found and compared with those of the usual estimator and of WYNN'S (1976) type estimator. An example is included for illustration.

18.
19.
In this article we construct and study estimators of the causal effect of a time-dependent treatment on survival in longitudinal studies. We employ a particular marginal structural model (MSM), proposed by Robins (2000), and follow a general methodology for constructing estimating functions in censored data models. The inverse probability of treatment weighted (IPTW) estimator of Robins et al. (2000) is used as an initial estimator and forms the basis for an improved, one-step estimator that is consistent and asymptotically linear when the treatment mechanism is consistently estimated. We extend these methods to handle informative censoring. The proposed methodology is employed to estimate the causal effect of exercise on mortality in a longitudinal study of seniors in Sonoma County. A simulation study demonstrates the bias of naive estimators in the presence of time-dependent confounders and also shows the efficiency gain of the IPTW estimator, even in the absence of such confounding. The efficiency gain of the improved, one-step estimator is demonstrated through simulation.

20.
Antoniadou T, Wallach D. Biometrics 2000, 56(2): 420-426
It is important, both for farmer profit and for the environment, to correctly dose nitrogen fertilizer for crop growth. Fertilizer recommendations are embodied in decision rules, which give a recommended dose of nitrogen (N) as a function of information available at the time the decision is made. In this paper, we first propose a criterion for evaluating decision rules. The proposed criterion is the expectation of the objective function when the decision rule is implemented. The major problem here is the estimation of this criterion. Two estimators are considered, a model-based and a nonparametric estimator. A simulation study shows that, in essentially all cases, the nonparametric estimator is better or no worse than the model-based estimator. The bias in the nonparametric estimator is always very small.
