Similar Articles (20 results)
1.
Estimation of a common effect parameter from sparse follow-up data (cited 30 times: 0 self-citations, 30 by others)
Breslow (1981, Biometrika 68, 73-84) has shown that the Mantel-Haenszel odds ratio is a consistent estimator of a common odds ratio in sparse stratifications. For cohort studies, however, estimation of a common risk ratio or risk difference can be of greater interest. Under a binomial sparse-data model, the Mantel-Haenszel risk ratio and risk difference estimators are consistent in sparse stratifications, while the maximum likelihood and weighted least squares estimators are biased. Under Poisson sparse-data models, the Mantel-Haenszel and maximum likelihood rate ratio estimators have equal asymptotic variances under the null hypothesis and are consistent, while the weighted least squares estimators are again biased; similarly, of the common rate difference estimators the weighted least squares estimators are biased, while the estimator employing "Mantel-Haenszel" weights is consistent in sparse data. Variance estimators that are consistent in both sparse data and large strata can be derived for all the Mantel-Haenszel estimators.
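For reference, the Mantel-Haenszel risk ratio and risk difference have simple closed forms over strata. The sketch below uses made-up stratified 2×2 counts and standard notation (a = exposed cases, n1 = exposed total, c = unexposed cases, n0 = unexposed total); it illustrates the point estimators only, not the sparse-data variance estimators discussed in the abstract.

```python
import numpy as np

# Toy stratified 2x2 data: each row is one stratum with
# (a = exposed cases, n1 = exposed total, c = unexposed cases, n0 = unexposed total).
strata = np.array([
    [3, 10, 1, 12],
    [2,  8, 2,  9],
    [4, 15, 1, 14],
], dtype=float)

a, n1, c, n0 = strata.T
n = n1 + n0  # stratum size

# Mantel-Haenszel risk ratio and risk difference for binomial cohort data.
rr_mh = np.sum(a * n0 / n) / np.sum(c * n1 / n)
rd_mh = np.sum((a * n0 - c * n1) / n) / np.sum(n1 * n0 / n)

print(f"MH risk ratio = {rr_mh:.3f}, MH risk difference = {rd_mh:.3f}")
```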

2.
This paper proposes a new estimator of the variance of the Mantel-Haenszel estimator of the common odds ratio that is easily computed and consistent in both sparse data and large-strata limiting models. Monte Carlo experiments compare its performance to that of previously proposed variance estimators.
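A minimal sketch of a dually consistent variance estimator of this type for the log Mantel-Haenszel odds ratio (the form commonly attributed to Robins, Breslow, and Greenland) is shown below. The stratum counts are invented, and whether this expression matches the paper's exact estimator should be checked against the original.

```python
import numpy as np

# Toy stratified 2x2 tables with cells (a, b, c, d) =
# (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases).
tables = np.array([
    [4, 16, 2, 18],
    [1,  9, 3, 12],
    [6, 14, 2, 20],
], dtype=float)

a, b, c, d = tables.T
n = a + b + c + d

R, S = a * d / n, b * c / n
P, Q = (a + d) / n, (b + c) / n

or_mh = R.sum() / S.sum()
var_log_or = (np.sum(P * R) / (2 * R.sum() ** 2)
              + np.sum(P * S + Q * R) / (2 * R.sum() * S.sum())
              + np.sum(Q * S) / (2 * S.sum() ** 2))

print(f"MH OR = {or_mh:.3f}, SE(log OR) = {np.sqrt(var_log_or):.3f}")
```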

3.
Randomized trials with continuous outcomes are often analyzed using analysis of covariance (ANCOVA), with adjustment for prognostic baseline covariates. The ANCOVA estimator of the treatment effect is consistent under arbitrary model misspecification. In an article recently published in the journal, Wang et al. proved that the model-based variance estimator for the treatment effect is also consistent under outcome model misspecification, assuming the probability of randomization to each treatment is 1/2. In this reader reaction, we derive explicit expressions which show that when randomization is unequal, the model-based variance estimator can be biased upwards or downwards. In contrast, robust sandwich variance estimators can provide asymptotically valid inferences under arbitrary misspecification, even when randomization probabilities are not equal.
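An illustrative simulation sketch (not the paper's derivation) comparing model-based and sandwich standard errors for the ANCOVA treatment effect under unequal randomization; all data-generating values are made up, and statsmodels' HC1 covariance is used as the robust sandwich estimator.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                 # prognostic baseline covariate
treat = rng.binomial(1, 0.25, size=n)  # unequal (1:3) randomization
y = 1.0 * treat + 0.8 * x + (1 + 0.5 * np.abs(x)) * rng.normal(size=n)  # heteroscedastic errors

X = np.column_stack([np.ones(n), treat, x])      # ANCOVA design: intercept, treatment, covariate
fit_classical = sm.OLS(y, X).fit()               # model-based variance
fit_sandwich = sm.OLS(y, X).fit(cov_type="HC1")  # robust sandwich variance

print(f"model-based SE = {fit_classical.bse[1]:.3f}, sandwich SE = {fit_sandwich.bse[1]:.3f}")
```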

4.
Yu KF. Biometrics 1992;48(3):961-963; discussion 963-964.
An estimator proposed by Greenland and Holland (1991, Biometrics 47, 319-322) for a standardized risk difference parameter is shown to be a maximum likelihood estimator if the consistent estimator of the common odds ratio is appropriately chosen. The statistical problem under consideration is reparameterized. Likelihood equations are derived.

5.
Standard prospective logistic regression analysis of case–control data often leads to very imprecise estimates of gene-environment interactions due to small numbers of cases or controls in cells formed by crossing genotype and exposure. In contrast, under the assumption of gene-environment independence, modern "retrospective" methods, including the "case-only" approach, can estimate the interaction parameters much more precisely, but they can be seriously biased when the underlying assumption of gene-environment independence is violated. In this article, we propose a novel empirical Bayes-type shrinkage estimator to analyze case–control data that can relax the gene-environment independence assumption in a data-adaptive fashion. In the special case involving a binary gene and a binary exposure, the method leads to an estimator of the interaction log odds ratio parameter in a simple closed form that corresponds to a weighted average of the standard case-only and case–control estimators. We also describe a general approach for deriving the new shrinkage estimator and its variance within the retrospective maximum-likelihood framework developed by Chatterjee and Carroll (2005, Biometrika 92, 399–418). Both simulated and real data examples suggest that the proposed estimator strikes a balance between bias and efficiency depending on the true nature of the gene-environment association and the sample size for a given study.
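The closed-form estimator in the paper is a data-adaptive weighted average of the case-control and case-only interaction estimates. The sketch below is only a generic illustration of that idea, with the weight on the case-control estimate growing with the estimated gene-environment association among controls; the exact weights and variance terms in the paper differ and should be taken from the original, and all inputs here are hypothetical.

```python
def shrinkage_interaction(beta_cc, beta_co, var_scale):
    """Generic shrinkage between the case-control (beta_cc) and case-only (beta_co)
    interaction log odds ratios. theta is the implied G-E log OR among controls;
    the weight on beta_cc grows as the evidence of G-E dependence grows.
    Illustrative only -- not the paper's exact empirical Bayes weights."""
    theta = beta_co - beta_cc          # estimated G-E association among controls
    w = theta ** 2 / (theta ** 2 + var_scale)
    return w * beta_cc + (1.0 - w) * beta_co

# Near G-E independence -> close to the efficient case-only estimate.
print(shrinkage_interaction(beta_cc=0.60, beta_co=0.55, var_scale=0.09))
# Strong G-E dependence -> close to the robust case-control estimate.
print(shrinkage_interaction(beta_cc=0.60, beta_co=1.40, var_scale=0.09))
```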

6.
We consider the estimation of the scaled mutation parameter θ, which is one of the parameters of key interest in population genetics. We provide a general result showing when estimators of θ can be improved using shrinkage when taking the mean squared error as the measure of performance. As a consequence, we show that Watterson’s estimator is inadmissible, and propose an alternative shrinkage-based estimator that is easy to calculate and has a smaller mean squared error than Watterson’s estimator for all possible parameter values 0 < θ < ∞. This estimator is admissible in the class of all linear estimators. We then derive improved versions for other estimators of θ, including the MLE. We also investigate how an improvement can be obtained both when combining information from several independent loci and when explicitly taking into account recombination. A simulation study provides information about the amount of improvement achieved by our alternative estimators.
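Watterson's estimator and a generic MSE-motivated plug-in shrinkage are sketched below for concreteness. The shrinkage factor shown is the textbook linear-shrinkage multiplier θ²/(Var + θ²) evaluated at the plug-in value; it illustrates the idea but is not necessarily the estimator derived in the paper.

```python
import numpy as np

def watterson_theta(S, n):
    """Watterson's estimator: S segregating sites in a sample of n sequences."""
    a_n = np.sum(1.0 / np.arange(1, n))
    return S / a_n

def shrunk_theta(S, n):
    """Plug-in linear shrinkage of Watterson's estimator (illustrative, not the
    paper's estimator): multiply by theta^2 / (Var(theta_W) + theta^2), using the
    standard no-recombination variance Var = theta/a_n + b_n*theta^2/a_n^2."""
    a_n = np.sum(1.0 / np.arange(1, n))
    b_n = np.sum(1.0 / np.arange(1, n) ** 2)
    theta_w = S / a_n
    var_w = theta_w / a_n + (b_n / a_n ** 2) * theta_w ** 2
    c = theta_w ** 2 / (var_w + theta_w ** 2)
    return c * theta_w

print(watterson_theta(12, 20), shrunk_theta(12, 20))
```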

7.
A general statistical framework is proposed for comparing linear models of spatial process and pattern. A spatial linear model for nested analysis of variance can be based on either fixed effects or random effects. Greig-Smith (1952) originally used a fixed effects model, but there are also examples of random effects models in the soil science literature. Assuming intrinsic stationarity for a linear model, the expectations of a spatial nested ANOVA and two-term local variance (TTLV, Hill 1973) are functions of the variogram, and several examples are given. Paired quadrat variance (PQV, Ludwig & Goodall 1978) is a variogram estimator which can be used to approximate TTLV, and we provide an example from ecological data. Both nested ANOVA and TTLV can be seen as weighted lag-1 variogram estimators that are functions of support, rather than distance. We show that there are two unbiased estimators for the variogram under aggregation, and computer simulation shows that the estimator with smaller variance depends on the process autocorrelation.
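As a concrete anchor for the variogram connection, the sketch below computes a paired-quadrat-style empirical variogram (half the mean squared difference between quadrats a given lag apart) along a simulated transect; the transect values and lags are invented, and the support-based nested ANOVA/TTLV quantities discussed in the abstract are not reproduced.

```python
import numpy as np

def paired_quadrat_variance(z, lag):
    """Matheron-type empirical variogram at a given lag along a transect of quadrat
    values z: half the mean squared difference of values `lag` quadrats apart."""
    diffs = z[lag:] - z[:-lag]
    return 0.5 * np.mean(diffs ** 2)

rng = np.random.default_rng(1)
z = np.cumsum(rng.normal(size=200))  # toy autocorrelated quadrat series
print({h: round(paired_quadrat_variance(z, h), 2) for h in (1, 2, 4, 8)})
```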

8.
In observational cohort studies with complex sampling schemes, truncation arises when the time to event of interest is observed only when it falls below or exceeds another random time, that is, the truncation time. In more complex settings, observation may require a particular ordering of event times; we refer to this as sequential truncation. Estimators of the event time distribution have been developed for simple left-truncated or right-truncated data. However, these estimators may be inconsistent under sequential truncation. We propose nonparametric and semiparametric maximum likelihood estimators for the distribution of the event time of interest in the presence of sequential truncation, under two truncation models. We show the equivalence of an inverse probability weighted estimator and a product limit estimator under one of these models. We study the large sample properties of the proposed estimators and derive their asymptotic variance estimators. We evaluate the proposed methods through simulation studies and apply the methods to an Alzheimer's disease study. We have developed an R package, seqTrun, implementing our method.
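For background, the classical product-limit estimator for simply left-truncated data (the kind the abstract notes can be inconsistent under sequential truncation) adjusts the risk set so a subject counts only after its entry/truncation time. A minimal sketch with invented data follows; the paper's sequential-truncation estimators and the seqTrun package are not reproduced here.

```python
import numpy as np

def product_limit_left_truncated(entry, time, event):
    """Lynden-Bell-type product-limit estimator for left-truncated data:
    at each event time t, the risk set contains subjects with entry <= t <= time."""
    event_times = np.sort(np.unique(time[event == 1]))
    surv, curve = 1.0, []
    for t in event_times:
        at_risk = np.sum((entry <= t) & (time >= t))
        deaths = np.sum((time == t) & (event == 1))
        surv *= 1.0 - deaths / at_risk
        curve.append((float(t), round(surv, 3)))
    return curve

entry = np.array([0.0, 0.5, 1.0, 0.2, 0.8])
time = np.array([2.0, 1.5, 3.0, 0.9, 2.5])
event = np.array([1, 1, 0, 1, 1])
print(product_limit_left_truncated(entry, time, event))
```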

9.
BACKGROUND: The ratio of two measured fluorescence signals (called x and y) is used in different applications in fluorescence microscopy. Multiple instances of both signals can be combined in different ways to construct different ratio estimators. METHODS: The mean and variance of three estimators for the ratio between two random variables, x and y, are discussed. Given n samples of x and y, we can intuitively construct two different estimators: the mean of the per-sample ratios x/y, and the ratio of the mean of x to the mean of y. The former is biased and the latter is only asymptotically unbiased. Using the statistical characteristics of this estimator, a third, unbiased estimator can be constructed. RESULTS: We tested the three estimators on simulated data, real-world fluorescence test images, and comparative genomic hybridization (CGH) data. The results on the simulated and real-world test images confirm the presented theory. The CGH experiments show that our new estimator performs better than the existing estimators. CONCLUSIONS: We have derived an unbiased ratio estimator that outperforms intuitive ratio estimators.
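A small simulation (with made-up signal and noise levels) illustrates the bias behaviour of the two intuitive estimators described above; the paper's corrected unbiased estimator is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(2)
true_ratio, signal, sigma = 2.0, 5.0, 1.0
reps, n = 20000, 10

mean_of_ratios, ratio_of_means = [], []
for _ in range(reps):
    y = signal + rng.normal(0, sigma, n)               # noisy denominator channel
    x = true_ratio * signal + rng.normal(0, sigma, n)  # noisy numerator channel
    mean_of_ratios.append(np.mean(x / y))              # biased: E[1/y] != 1/E[y]
    ratio_of_means.append(np.mean(x) / np.mean(y))     # only asymptotically unbiased

print(f"bias of mean-of-ratios: {np.mean(mean_of_ratios) - true_ratio:+.4f}")
print(f"bias of ratio-of-means: {np.mean(ratio_of_means) - true_ratio:+.4f}")
```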

10.
Estimating the encounter rate variance in distance sampling (cited 1 time: 0 self-citations, 1 by others)
The dominant source of variance in line transect sampling is usually the encounter rate variance. Systematic survey designs are often used to reduce the true variability among different realizations of the design, but estimating the variance is difficult and estimators typically approximate the variance by treating the design as a simple random sample of lines. We explore the properties of different encounter rate variance estimators under random and systematic designs. We show that a design-based variance estimator improves upon the model-based estimator of Buckland et al. (2001, Introduction to Distance Sampling. Oxford: Oxford University Press, p. 79) when transects are positioned at random. However, if populations exhibit strong spatial trends, both estimators can have substantial positive bias under systematic designs. We show that poststratification is effective in reducing this bias.
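For concreteness, one commonly used design-based, between-transect estimator of the encounter rate variance has the squared-line-length-weighted form sketched below with invented transect data; the paper compares several such estimators, and their exact weightings may differ from this sketch.

```python
import numpy as np

# Toy survey: line lengths l_i and detection counts n_i for k transects.
l = np.array([4.0, 5.5, 3.0, 6.0, 4.5])
counts = np.array([7, 12, 3, 15, 9], dtype=float)

k, L, n = len(l), l.sum(), counts.sum()
enc_rate = n / L  # overall encounter rate n/L

# Between-transect empirical variance of n/L, weighting transects by l_i^2.
var_enc = k / (L ** 2 * (k - 1)) * np.sum(l ** 2 * (counts / l - enc_rate) ** 2)
print(f"encounter rate = {enc_rate:.3f}, estimated var = {var_enc:.5f}")
```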

11.
Commonly used semiparametric estimators of causal effects specify parametric models for the propensity score (PS) and the conditional outcome. An example is an augmented inverse probability weighting (IPW) estimator, frequently referred to as a doubly robust estimator, because it is consistent if at least one of the two models is correctly specified. However, in many observational studies, the role of the parametric models is often not to provide a representation of the data-generating process but rather to facilitate the adjustment for confounding, making the assumption of at least one true model unlikely to hold. In this paper, we propose a crude analytical approach to study the large-sample bias of estimators when the models are assumed to be approximations of the data-generating process, namely, when all models are misspecified. We apply our approach to three prototypical estimators of the average causal effect: two IPW estimators, using a misspecified PS model, and an augmented IPW (AIPW) estimator, using misspecified models for the outcome regression (OR) and the PS. For the two IPW estimators, we show that normalization, in addition to having a smaller variance, also offers some protection against bias due to model misspecification. To analyze the question of when the use of two misspecified models is better than one, we derive necessary and sufficient conditions for when the AIPW estimator has a smaller bias than a simple IPW estimator and when it has a smaller bias than an IPW estimator with normalized weights. If the misspecification of the outcome model is moderate, the comparisons of the biases of the IPW and AIPW estimators show that the AIPW estimator has a smaller bias than the IPW estimators. However, all biases include a scaling with the PS-model error, and we suggest caution in modeling the PS whenever such a model is involved. For numerical and finite sample illustrations, we include three simulation studies and corresponding approximations of the large-sample biases. In a dataset from the National Health and Nutrition Examination Survey, we estimate the effect of smoking on blood lead levels.
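The three prototypical estimators can be written down compactly. The sketch below simulates data, deliberately misspecifies both working models by omitting a covariate and a nonlinearity, and computes the unnormalized IPW, normalized (Hajek-type) IPW, and AIPW estimates of the average causal effect; the data-generating values and model choices are invented for illustration and are not the paper's simulation designs.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=(n, 2))
ps_true = 1.0 / (1.0 + np.exp(-(0.5 * x[:, 0] - 0.3 * x[:, 1])))
a = rng.binomial(1, ps_true)
y = 1.0 * a + x[:, 0] + 0.5 * x[:, 1] ** 2 + rng.normal(size=n)  # true effect = 1

# Deliberately misspecified working models: both use only the first covariate.
X_ps = np.column_stack([np.ones(n), x[:, 0]])
ps = sm.Logit(a, X_ps).fit(disp=0).predict(X_ps)

X_or = np.column_stack([np.ones(n), a, x[:, 0]])
or_fit = sm.OLS(y, X_or).fit()
m1 = or_fit.predict(np.column_stack([np.ones(n), np.ones(n), x[:, 0]]))
m0 = or_fit.predict(np.column_stack([np.ones(n), np.zeros(n), x[:, 0]]))

w1, w0 = a / ps, (1 - a) / (1 - ps)
ipw = np.mean(w1 * y) - np.mean(w0 * y)                               # unnormalized IPW
ipw_norm = np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)  # normalized IPW
aipw = (np.mean(a * y / ps - (a - ps) / ps * m1)
        - np.mean((1 - a) * y / (1 - ps) + (a - ps) / (1 - ps) * m0))  # augmented IPW

print(f"IPW = {ipw:.3f}, normalized IPW = {ipw_norm:.3f}, AIPW = {aipw:.3f}")
```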

12.
The standardized mean difference is a well‐known effect size measure for continuous, normally distributed data. In this paper we present a general basis for important other distribution families. As a general concept, usable for every distribution family, we introduce the relative effect, also called Mann–Whitney effect size measure of stochastic superiority. This measure is a truly robust measure, needing no assumptions about a distribution family. It is thus the preferred tool for assumption‐free, confirmatory studies. For normal distribution shift, proportional odds, and proportional hazards, we show how to derive many global values such as risk difference average, risk difference extremum, and odds ratio extremum. We demonstrate that the well‐known benchmark values of Cohen with respect to group differences—small, medium, large—can be translated easily into corresponding Mann–Whitney values. From these, we get benchmarks for parameters of other distribution families. Furthermore, it is shown that local measures based on binary data (2 × 2 tables) can be associated with the Mann–Whitney measure: The concept of stochastic superiority can always be used. It is a general statistical value in every distribution family. It therefore yields a procedure for standardizing the assessment of effect size measures. We look at the aspect of relevance of an effect size and—introducing confidence intervals—present some examples for use in statistical practice.
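The relative effect (probability of stochastic superiority) and its link to Cohen's benchmarks under a normal shift can be made explicit. The sketch below estimates p = P(X < Y) + 0.5 P(X = Y) by pairwise comparison and prints the translation p = Φ(d/√2) for Cohen's small/medium/large values; the simulated data are invented.

```python
import numpy as np
from scipy.stats import norm

def relative_effect(x, y):
    """Mann-Whitney effect / stochastic superiority: P(X < Y) + 0.5 * P(X = Y),
    estimated by comparing all pairs."""
    lt = np.mean(x[:, None] < y[None, :])
    eq = np.mean(x[:, None] == y[None, :])
    return lt + 0.5 * eq

# Under a normal shift, Cohen's d maps to the relative effect via p = Phi(d / sqrt(2)).
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: relative effect = {norm.cdf(d / np.sqrt(2)):.3f}")

rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, 200)
y = rng.normal(0.5, 1.0, 200)  # shift corresponding to d = 0.5
print(f"estimated relative effect: {relative_effect(x, y):.3f}")
```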

13.
Distance-based algorithms are a common technique in the construction of phylogenetic trees from taxonomic sequence data. The first step in the implementation of these algorithms is the calculation of a pairwise distance matrix to give a measure of the evolutionary change between any pair of the extant taxa. A standard technique is to use the log det formula to construct pairwise distances from aligned sequence data. We review a distance measure valid for the most general models, and show how the log det formula can be used as an estimator thereof. We then show that the foundation upon which the log det formula is constructed can be generalized to produce a previously unknown estimator which improves the consistency of the distance matrices constructed from the log det formula. This distance estimator provides a consistent technique for constructing quartets from phylogenetic sequence data under the assumption of the most general Markov model of sequence evolution.
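For orientation, a standard log det (paralinear-type) pairwise distance can be computed directly from the 4×4 matrix of joint site-pattern frequencies, as sketched below with toy aligned sequences; the generalized, more consistent estimator proposed in the paper is not reproduced.

```python
import numpy as np

ALPHABET = "ACGT"

def logdet_distance(seq1, seq2):
    """Standard log det / paralinear-type distance between two aligned sequences:
    d = -(1/4) * [ln det F - 0.5 * (sum ln f + sum ln g)], where F holds the joint
    site-pattern frequencies and f, g are the two marginal base frequencies."""
    idx = {base: i for i, base in enumerate(ALPHABET)}
    F = np.zeros((4, 4))
    for p, q in zip(seq1, seq2):
        F[idx[p], idx[q]] += 1.0
    F /= F.sum()
    f, g = F.sum(axis=1), F.sum(axis=0)
    return -0.25 * (np.log(np.linalg.det(F))
                    - 0.5 * (np.sum(np.log(f)) + np.sum(np.log(g))))

s1 = "ACGTACGTACGTACGTACGTACGTACGT"
s2 = "ACGTACGAACGTTCGTACGTACGCACGT"
print(f"log det distance = {logdet_distance(s1, s2):.4f}")
```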

14.
We describe an estimator of the parameter indexing a model for the conditional odds ratio between a binary exposure and a binary outcome given a high-dimensional vector of confounders, when the exposure and a subset of the confounders are missing, not necessarily simultaneously, in a subsample. We argue that a recently proposed estimator restricted to complete-cases confers more protection to model misspecification than existing ones in the sense that the set of data laws under which it is consistent strictly contains each set of data laws under which each of the previous estimators are consistent.

15.
Greenland S. Biometrics 1989;45(1):183-191.
Mickey and Elashoff (1985, Biometrics 41, 623-635) gave an extension of Mantel-Haenszel estimation to log-linear models for 2 × J × K tables. Their extension yields two generalizations of the Mantel-Haenszel odds ratio estimator to K 2 × J tables. This paper provides variance and covariance estimators for these generalized Mantel-Haenszel estimators that are dually consistent (i.e., consistent in both large strata and sparse data), and presents comparisons of the efficiency of the generalized Mantel-Haenszel estimators.

16.
In a randomized clinical trial (RCT), noncompliance with an assigned treatment can occur due to serious side effects, while missing outcomes on patients may happen due to patients' withdrawal or loss to follow-up. To avoid the possible loss of power to detect a given risk difference (RD) of interest between two treatments, it is essential to incorporate the information on noncompliance and missing outcomes into the sample size calculation. Under the compound exclusion restriction model proposed elsewhere, we first derive the maximum likelihood estimator (MLE) of the RD among compliers between two treatments for an RCT with noncompliance and missing outcomes and its asymptotic variance in closed form. Based on the MLE with the tanh^(-1)(x) transformation, we develop an asymptotic test procedure for testing equality of two treatment effects among compliers. We further derive a sample size calculation formula accounting for both noncompliance and missing outcomes for a desired power 1 − β at a nominal α-level. To evaluate the performance of the test procedure and the accuracy of the sample size calculation formula, we employ Monte Carlo simulation to calculate the estimated Type I error and power of the proposed test procedure corresponding to the resulting sample size in a variety of situations. We find that both the test procedure and the sample size formula developed here can perform well. Finally, we include a discussion on the effects of various parameters, including the proportion of compliers, the probability of non-missing outcomes, and the ratio of sample size allocation, on the minimum required sample size.

17.
Inferring admixture proportions from molecular data (cited 19 times: 2 self-citations, 17 by others)
We derive here two new estimators of admixture proportions based on a coalescent approach that explicitly takes into account molecular information as well as gene frequencies. These estimators can be applied to any type of molecular data (such as DNA sequences, restriction fragment length polymorphisms [RFLPs], or microsatellite data) for which the extent of molecular diversity is related to coalescent times. Monte Carlo simulation studies are used to analyze the behavior of our estimators. We show that one of them (mY) appears suitable for estimating admixture from molecular data because of its absence of bias and relatively low variance. We then compare it to two conventional estimators that are based on gene frequencies. mY proves to be less biased than conventional estimators over a wide range of situations and especially for microsatellite data. However, its variance is larger than that of conventional estimators when parental populations are not very differentiated. The variance of mY becomes smaller than that of conventional estimators only if parental populations have been kept separated for about N generations and if the mutation rate is high. Simulations also show that several loci should always be studied to achieve a drastic reduction of variance and that, for microsatellite data, the mean square error of mY rapidly becomes smaller than that of conventional estimators if enough loci are surveyed. We apply our new estimator to the case of admixed wolflike Canid populations tested for microsatellite data.

18.
DeGiorgio M, Jankovic I, Rosenberg NA. Genetics 2010;186(4):1367-1387.
Gene diversity, a commonly used measure of genetic variation, evaluates the proportion of heterozygous individuals expected at a locus in a population, under the assumption of Hardy-Weinberg equilibrium. When using the standard estimator of gene diversity, the inclusion of related or inbred individuals in a sample produces a downward bias. Here, we extend a recently developed estimator shown to be unbiased in a diploid autosomal sample that includes known related or inbred individuals to the general case of arbitrary ploidy. We derive an exact formula for the variance of the new estimator, H, and present an approximation to facilitate evaluation of the variance when each individual is related to at most one other individual in a sample. When examining samples from the human X chromosome, which represent a mixture of haploid and diploid individuals, we find that H performs favorably compared to the standard estimator, both in theoretical computations of mean squared error and in data analysis. We thus propose that H is a useful tool in characterizing gene diversity in samples of arbitrary ploidy that contain related or inbred individuals.
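For comparison, the standard (Nei-type) gene diversity estimator for a sample of allele copies is sketched below with made-up genotypes; the relatedness- and ploidy-corrected estimator studied in the paper is not reproduced.

```python
import numpy as np
from collections import Counter

def gene_diversity(allele_copies):
    """Standard unbiased gene diversity estimator for n allele copies at one locus:
    H = n/(n-1) * (1 - sum_k p_k^2). Assumes unrelated, non-inbred individuals."""
    n = len(allele_copies)
    p = np.array(list(Counter(allele_copies).values()), dtype=float) / n
    return n / (n - 1) * (1.0 - np.sum(p ** 2))

# Ten allele copies (e.g., five diploid individuals) at a hypothetical locus.
print(gene_diversity(["A1", "A1", "A2", "A3", "A2", "A1", "A4", "A2", "A1", "A3"]))
```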

19.
Datta S, Satten GA. Biometrics 2002;58(4):792-802.
We propose nonparametric estimators of the stage occupation probabilities and transition hazards for a multistage system that is not necessarily Markovian, using data that are subject to dependent right censoring. We assume that the hazard of being censored at a given instant depends on a possibly time-dependent covariate process as opposed to assuming a fixed censoring hazard (independent censoring). The estimator of the integrated transition hazard matrix has a Nelson-Aalen form where each of the counting processes counting the number of transitions between states and the risk sets for leaving each stage have an IPCW (inverse probability of censoring weighted) form. We estimate these weights using Aalen's linear hazard model. Finally, the stage occupation probabilities are obtained from the estimated integrated transition hazard matrix via product integration. Consistency of these estimators under the general paradigm of non-Markov models is established and asymptotic variance formulas are provided. Simulation results show satisfactory performance of these estimators. An analysis of data on graft-versus-host disease for bone marrow transplant patients is used as an illustration.

20.
In case-control studies with matched pairs, the traditional point estimator of odds ratio (OR) is well-known to be biased with no exact finite variance under binomial sampling. In this paper, we consider use of inverse sampling in which we continue to sample subjects to form matched pairs until we obtain a pre-determined number (>0) of index pairs with the case unexposed but the control exposed. In contrast to use of binomial sampling, we show that the uniformly minimum variance unbiased estimator (UMVUE) of OR does exist under inverse sampling. We further derive an exact confidence interval of OR in closed form. Finally, we develop an exact test and an asymptotic test for testing the null hypothesis H0: OR = 1, as well as discuss sample size determination on the minimum required number of index pairs for a desired power at α-level.
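For context, under ordinary binomial sampling the traditional matched-pair analysis uses only the discordant pairs: the point estimator of OR is n10/n01 and an exact test of H0: OR = 1 is a binomial (McNemar-type) test, as sketched below with invented counts; the UMVUE and exact interval under inverse sampling derived in the paper are not reproduced.

```python
from scipy.stats import binomtest

# Discordant matched pairs (toy counts):
# n10 = case exposed / control unexposed, n01 = case unexposed / control exposed.
n10, n01 = 24, 11

or_hat = n10 / n01  # traditional estimator: biased, no exact finite variance under binomial sampling
exact_p = binomtest(n10, n10 + n01, 0.5).pvalue  # exact conditional test of H0: OR = 1

print(f"OR estimate = {or_hat:.2f}, exact two-sided p-value = {exact_p:.4f}")
```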
