Similar articles
1.
Standard errors for attributable risk for simple and complex sample designs   Total citations: 1 (self-citations: 0, citations by others: 1)
Graubard BI, Fears TR. Biometrics 2005, 61(3): 847-855
Adjusted attributable risk (AR) is the proportion of diseased individuals in a population that is due to an exposure. We consider estimates of adjusted AR based on odds ratios from logistic regression to adjust for confounding. Influence function methods used in survey sampling are applied to obtain simple and easily programmable expressions for estimating the variance of AR. These variance estimators can be applied to data from case-control, cross-sectional, and cohort studies with or without frequency or individual matching and for sample designs with subject samples that range from simple random samples to (sample) weighted multistage stratified cluster samples like those used in national household surveys. The variance estimation of AR is illustrated with: (i) a weighted stratified multistage clustered cross-sectional study of childhood asthma from the Third National Health and Nutrition Examination Survey (NHANES III), and (ii) a frequency-matched case-control study of melanoma skin cancer.

2.
This paper outlines methods of determining sample size for epidemiologic research in studies of the etiologic fraction. The basic model with a dichotomous disease and a single dichotomous exposure factor is considered. To determine sample size, the researcher must specify: the magnitude of the etiologic fraction ε to be detected as statistically significant; the level of significance α; the power 1 - β of the test; p, the proportion of the population exposed to the risk factor; and R, the proportion of the population with the disease. Sample size formulas and tables are presented for the case-control, cohort and cross-sectional designs. Optimal allocation considerations are examined to minimize cost for a specified power. Extensive use is made of Walter's results concerning the asymptotic variance of the maximum likelihood estimator of the etiologic fraction for the three epidemiologic study designs.

3.
L A Kalish. Biometrics 1990, 46(2): 493-499
The standard estimator of the common odds ratio for pair-matched case-control studies, the stratified estimate, is consistent, but it ignores all information from the concordant pairs. At the other extreme, the pooled estimator is more efficient because it uses all the data, but it is not consistent. To trade off bias against precision, Liang and Zeger (1988, Biometrics 44, 1145-1156) proposed an estimator that is a compromise between the stratified and pooled estimates. In the current paper, the possibility of optimizing the trade-off is explored. Specifically, the family of weighted averages of the stratified and pooled estimates is considered, and the weight that minimizes an asymptotic approximation of mean squared error is derived. In practice, the optimal weight must be estimated from the data, so the estimator is only approximately optimal. Small-sample properties are evaluated via simulations.
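
The two estimators and their weighted average can be illustrated with a minimal Python sketch. It assumes the pair counts are given as n11 (case and control both exposed), n10 (only the case exposed), n01 (only the control exposed) and n00 (neither exposed). The function names, the averaging on the raw odds-ratio scale and the user-supplied weight are illustrative assumptions; the optimal weight derived in the paper is not given in the abstract and is not reproduced here.

```python
def paired_odds_ratios(n11, n10, n01, n00):
    """Return (stratified, pooled) odds-ratio estimates from pair counts.

    n11: pairs with case and control both exposed
    n10: pairs with only the case exposed
    n01: pairs with only the control exposed
    n00: pairs with neither exposed
    """
    # Stratified (matched) estimate: uses the discordant pairs only.
    stratified = n10 / n01
    # Pooled estimate: ignore the matching and form a single 2x2 table.
    cases_exposed = n11 + n10
    cases_unexposed = n01 + n00
    controls_exposed = n11 + n01
    controls_unexposed = n10 + n00
    pooled = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)
    return stratified, pooled


def weighted_odds_ratio(n11, n10, n01, n00, weight):
    """Weighted average of the two estimates (hypothetical helper).

    `weight` is the share given to the stratified estimate; it is supplied by
    the caller here, not estimated from the data as in the paper.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    stratified, pooled = paired_odds_ratios(n11, n10, n01, n00)
    return weight * stratified + (1.0 - weight) * pooled
```

With weight 1 the function returns the stratified estimate and with weight 0 the pooled estimate, so the family spans the two extremes described in the abstract.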

4.
Fewster RM. Biometrics 2011, 67(4): 1518-1531
Summary. In spatial surveys for estimating the density of objects in a survey region, systematic designs will generally yield lower variance than random designs. However, estimating the systematic variance is well known to be a difficult problem. Existing methods tend to overestimate the variance, so although the variance is genuinely reduced, it is over‐reported, and the gain from the more efficient design is lost. The current approaches to estimating a systematic variance for spatial surveys are to approximate the systematic design by a random design, or approximate it by a stratified design. Previous work has shown that approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. We develop a new estimator based on modeling the encounter process over space. The new “striplet” estimator has negligible bias and excellent precision in a wide range of simulation scenarios, including strip‐sampling, distance‐sampling, and quadrat‐sampling surveys, and including populations that are highly trended or have strong aggregation of objects. We apply the new estimator to survey data for the spotted hyena (Crocuta crocuta) in the Serengeti National Park, Tanzania, and find that the reported coefficient of variation for estimated density is 20% using approximation by a random design, 17% using approximation by a stratified design, and 11% using the new striplet estimator. This large reduction in reported variance is verified by simulation.

5.
J M Robins, M H Gail, J H Lubin. Biometrics 1986, 42(2): 293-299
The authors consider several aspects of the design and analysis of synthetic case-control studies of cohort data under a proportional hazards model. First, in highly stratified data, consistent estimates of the relative risk are shown to result only if controls are sampled randomly with replacement from the entire risk set or without replacement from the noncases. Second, if previous controls are excluded from consideration as future controls but are included as cases if they fail, then inconsistent estimates of the relative risk can occur if "time" in the proportional hazards model represents an individual's chronological age and age at entry into follow-up is variable. On the other hand, if "time" represents time since the beginning of follow-up, estimates of the relative risk will be consistent, but the usual variance estimator will be inconsistent.

6.
When the underlying disease is rare, we may wish to apply inverse sampling to control the coefficient of variation for the sample proportion of cases. In this paper, we derive the uniformly minimum variance unbiased estimator (UMVUE) of relative risk and its variance in closed form under inverse sampling. On the basis of a Monte Carlo simulation, we demonstrate that using the UMVUE of relative risk can substantially reduce the mean squared error relative to the maximum likelihood estimator, especially when the number of index cases in both comparison samples is small. We also include a program that, for a given fixed total cost, finds the optimal allocation of the number of index cases to minimize the variance of the UMVUE.
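
A minimal Python sketch of an unbiased relative-risk estimate under inverse sampling follows. It assumes that, in each group, sampling continues until a preset number of cases x is reached, with y non-cases observed along the way. It uses two standard negative binomial facts: (x - 1)/(x + y - 1) is unbiased for the case probability when x >= 2, and (x + y)/x is unbiased for its reciprocal. The product of the two, taken over independent groups, is offered as an illustration of the idea; the abstract does not state the paper's closed form, so the function names and this construction are assumptions.

```python
def umvue_relative_risk(x1, y1, x0, y0):
    """Unbiased estimate of p1 / p0 under inverse sampling (illustrative).

    x1, x0: preset numbers of index cases in the exposed and unexposed groups
    y1, y0: numbers of non-cases observed before the x-th case in each group
    """
    if x1 < 2:
        raise ValueError("x1 must be at least 2 for an unbiased estimate of p1")
    if x0 < 1:
        raise ValueError("x0 must be at least 1")
    n1 = x1 + y1  # total subjects sampled in the exposed group
    n0 = x0 + y0  # total subjects sampled in the unexposed group
    p1_unbiased = (x1 - 1) / (n1 - 1)   # unbiased for p1
    inv_p0_unbiased = n0 / x0           # unbiased for 1 / p0
    return p1_unbiased * inv_p0_unbiased


def mle_relative_risk(x1, y1, x0, y0):
    """Maximum likelihood estimate of p1 / p0 for comparison."""
    return (x1 / (x1 + y1)) / (x0 / (x0 + y0))
```

For example, with 5 index cases in each group, 45 non-cases in the exposed group and 95 in the unexposed group, the maximum likelihood estimate is 2.0 while the unbiased estimate is 80/49, about 1.63.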

7.
K Y Liang. Biometrics 1987, 43(2): 289-299
A class of estimating functions is proposed for the estimation of multivariate relative risk in stratified case-control studies. It reduces to the well-known Mantel-Haenszel estimator when there is a single binary risk factor. Large-sample properties of the solutions to the proposed estimating equations are established for two distinct situations. Efficiency calculations suggest that the proposed estimators are nearly fully efficient relative to the conditional maximum likelihood estimator for the parameters considered. Application of the proposed method to family data and longitudinal data, where the conditional likelihood approach fails, is discussed. Two examples from case-control studies and one example from a study on familial aggregation are presented.

8.
Bertail P, Tressou J. Biometrics 2006, 62(1): 66-74
This article proposes statistical tools for quantitative evaluation of the risk due to the presence of some particular contaminants in food. We focus on estimating the probability that exposure exceeds the so-called provisional tolerable weekly intake (PTWI), when both consumption data and contamination data are independently available. A Monte Carlo approximation of the plug-in estimator, which may be seen as an incomplete generalized U-statistic, is investigated. We obtain the asymptotic properties of this estimator and propose several confidence intervals, based on two estimators of the asymptotic variance: (i) a bootstrap type estimator and (ii) an approximate jackknife estimator relying on the Hoeffding decomposition of the original U-statistics. As an illustration, we present an evaluation of the exposure to Ochratoxin A in France.

9.
Mendelian randomization (MR) utilizes genetic variants as instrumental variables (IVs) to estimate the causal effect of an exposure variable on an outcome of interest even in the presence of unmeasured confounders. However, the popular inverse-variance weighted (IVW) estimator could be biased in the presence of weak IVs, a common challenge in MR studies. In this article, we develop a novel penalized inverse-variance weighted (pIVW) estimator, which adjusts the original IVW estimator to account for the weak IV issue by using a penalization approach to prevent the denominator of the pIVW estimator from being close to zero. Moreover, we adjust the variance estimation of the pIVW estimator to account for the presence of balanced horizontal pleiotropy. We show that the recently proposed debiased IVW (dIVW) estimator is a special case of our proposed pIVW estimator. We further prove that the pIVW estimator has smaller bias and variance than the dIVW estimator under some regularity conditions. We also conduct extensive simulation studies to demonstrate the performance of the proposed pIVW estimator. Furthermore, we apply the pIVW estimator to estimate the causal effects of five obesity-related exposures on three coronavirus disease 2019 (COVID-19) outcomes. Notably, we find that hypertensive disease is associated with an increased risk of hospitalized COVID-19; and peripheral vascular disease and higher body mass index are associated with increased risks of COVID-19 infection, hospitalized COVID-19, and critically ill COVID-19.
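
To make the weak-IV issue concrete, the following Python sketch computes the IVW estimator and the debiased IVW (dIVW) estimator from summary statistics, using their commonly stated forms. It assumes per-variant estimates of the variant-exposure effect (gamma_hat), the variant-outcome effect (Gamma_hat) and their standard errors; these argument names are illustrative. The pIVW penalization itself is not specified in the abstract and is not reproduced here.

```python
def ivw_estimate(gamma_hat, Gamma_hat, se_outcome):
    """Inverse-variance weighted causal effect estimate.

    gamma_hat:  variant-exposure effect estimates
    Gamma_hat:  variant-outcome effect estimates
    se_outcome: standard errors of the variant-outcome estimates
    """
    numerator = sum(g * G / s ** 2 for g, G, s in zip(gamma_hat, Gamma_hat, se_outcome))
    denominator = sum(g ** 2 / s ** 2 for g, s in zip(gamma_hat, se_outcome))
    return numerator / denominator


def divw_estimate(gamma_hat, Gamma_hat, se_exposure, se_outcome):
    """Debiased IVW estimate.

    The denominator subtracts the sampling variance of each gamma_hat, since
    gamma_hat**2 overestimates the squared true effect on average. With weak
    IVs this denominator can be close to zero, which is the instability the
    abstract says pIVW is designed to prevent.
    """
    numerator = sum(g * G / s ** 2 for g, G, s in zip(gamma_hat, Gamma_hat, se_outcome))
    denominator = sum(
        (g ** 2 - sx ** 2) / sy ** 2
        for g, sx, sy in zip(gamma_hat, se_exposure, se_outcome)
    )
    return numerator / denominator
```

Because the dIVW denominator is smaller than the IVW denominator, the dIVW estimate is larger in magnitude whenever both denominators are positive, which offsets the attenuation of IVW toward zero.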

10.
Combining information across genes in the statistical analysis of microarray data is desirable because of the relatively small number of data points obtained for each individual gene. Here we develop an estimator of the error variance that can borrow information across genes using the James-Stein shrinkage concept. A new test statistic (FS) is constructed using this estimator. The new statistic is compared with other statistics used to test for differential expression: the gene-specific F test (F1), the pooled-variance F statistic (F3), a hybrid statistic (F2) that uses the average of the individual and pooled variances, the regularized t-statistic, the posterior odds statistic B, and the SAM t-test. The FS-test shows best or nearly best power for detecting differentially expressed genes over a wide range of simulated data in which the variance components associated with individual genes are either homogeneous or heterogeneous. Thus FS provides a powerful and robust approach to test differential expression of genes that utilizes information not available in individual gene testing approaches and does not suffer from biases of the pooled variance approach.

11.
Summary. Occupational, environmental, and nutritional epidemiologists are often interested in estimating the prospective effect of time‐varying exposure variables such as cumulative exposure or cumulative updated average exposure, in relation to chronic disease endpoints such as cancer incidence and mortality. From exposure validation studies, it is apparent that many of the variables of interest are measured with moderate to substantial error. Although the ordinary regression calibration (ORC) approach is approximately valid and efficient for measurement error correction of relative risk estimates from the Cox model with time‐independent point exposures when the disease is rare, it is not adaptable for use with time‐varying exposures. By recalibrating the measurement error model within each risk set, a risk set regression calibration (RRC) method is proposed for this setting. An algorithm for a bias‐corrected point estimate of the relative risk using an RRC approach is presented, followed by the derivation of an estimate of its variance, resulting in a sandwich estimator. Emphasis is on methods applicable to the main study/external validation study design, which arises in important applications. Simulation studies under several assumptions about the error model were carried out, which demonstrated the validity and efficiency of the method in finite samples. The method was applied to a study of diet and cancer from Harvard's Health Professionals Follow‐up Study (HPFS).

12.
Isofemale lines are commonly used in Drosophila and other genera for the purpose of assaying genetic variation. Isofemale lines can be kept in the laboratory for many generations before genetic work is carried out, and permit the confirmation of newly discovered alleles. A problem not realized by many workers is that the commonly used estimate of allele frequency from these lines is biased. This estimation bias occurs at all times after the first laboratory generation, regardless of whether single individuals or pooled samples are used in each well of an electrophoretic gel. This bias can potentially affect the estimation of population genetic parameters, and in the case of rare allele analysis it can cause gross overestimates of gene flow. This paper provides a correction for allele frequency estimates derived from isofemale lines for any time after the lines are established in the laboratory. When pooled samples are used, this estimator performs better than the standard estimator at all times after the first generation. The estimator is also insensitive to multiple inseminations. After the lines have drifted one N_e generations, multiple inseminations actually make the new estimator perform better than it does in singly inseminated females. Simulations show that estimates made using either estimator after the lines have drifted to fixation have a much greater error associated with their use than do those estimates made earlier in time using the correction. In general it is better to use corrected estimates of gene frequency soon after lines are established than to use uncorrected estimates made after the first laboratory generation. This work was supported by an NSERC fellowship to A.D.L.

13.
Böhning D, Sarol J. Biometrics 2000, 56(1): 304-308
In this paper, we consider efficient estimation of the risk difference in a multicenter study allowing for baseline heterogeneity. We consider the optimally weighted estimator for the common risk difference and show that this estimator has considerable bias when the true weights (which are inversely proportional to the variances of the center-specific risk difference estimates) are replaced by their sample estimates. For this situation, we propose a new estimator of the Mantel-Haenszel type that is unbiased and also has a smaller variance for small sample sizes within the study centers. Simulations illustrate these findings.
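
The contrast between estimated inverse-variance weights and Mantel-Haenszel-type weights can be sketched in Python as follows. Each center is assumed to be a tuple (x1, n1, x0, n0) of events and sample sizes in the treatment and control arms. The Mantel-Haenszel weights n1*n0/(n1+n0) are the classical ones for a common risk difference and are used here as an assumption; the abstract does not spell out the exact form of the paper's estimator.

```python
def inverse_variance_rd(centers):
    """Common risk difference with estimated inverse-variance weights.

    centers: iterable of (x1, n1, x0, n0) tuples, one per study center.
    The weights depend on the observed proportions, so they are random.
    """
    numerator = 0.0
    denominator = 0.0
    for x1, n1, x0, n0 in centers:
        p1, p0 = x1 / n1, x0 / n0
        variance = p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0
        if variance == 0:
            raise ValueError("estimated variance is zero; the weight is undefined")
        numerator += (p1 - p0) / variance
        denominator += 1 / variance
    return numerator / denominator


def mantel_haenszel_rd(centers):
    """Common risk difference with Mantel-Haenszel weights.

    The weights depend only on the sample sizes, not on the outcomes,
    so they stay defined even when a center has zero events.
    """
    numerator = 0.0
    denominator = 0.0
    for x1, n1, x0, n0 in centers:
        weight = n1 * n0 / (n1 + n0)
        numerator += weight * (x1 / n1 - x0 / n0)
        denominator += weight
    return numerator / denominator
```

Since the Mantel-Haenszel weights are fixed by design, the weighted average of unbiased center-specific differences stays unbiased, whereas the estimated inverse-variance weights are correlated with the differences they multiply.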

14.
T Sato. Biometrics 1991, 47(3): 1165-1170
This paper proposes an extension of the Mantel-Haenszel rate ratio from a dichotomous exposure to multiple exposure levels. This extension is based on the unbiased estimating function approach and yields closed-form Mantel-Haenszel rate ratio estimators. Dually consistent variance and covariance estimators of the estimating functions are given, and a quasi-score-based confidence interval for each individual common rate ratio is provided. A similar extension to the common rate difference case is also given.
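
As background, the dichotomous-exposure Mantel-Haenszel rate ratio that the paper starts from can be sketched in Python. Each stratum is assumed to be a tuple (a, t1, b, t0): events and person-time among the exposed, then events and person-time among the unexposed. The function name and input layout are illustrative, and the multi-level extension described in the abstract is not reproduced.

```python
def mantel_haenszel_rate_ratio(strata):
    """Mantel-Haenszel common rate ratio for a dichotomous exposure.

    strata: iterable of (a, t1, b, t0) tuples, where
      a, t1: events and person-time among the exposed
      b, t0: events and person-time among the unexposed
    """
    numerator = 0.0
    denominator = 0.0
    for a, t1, b, t0 in strata:
        total_time = t1 + t0
        numerator += a * t0 / total_time
        denominator += b * t1 / total_time
    if denominator == 0:
        raise ValueError("no unexposed events; the rate ratio is undefined")
    return numerator / denominator
```

For two strata in which the exposed rate is twice the unexposed rate, for instance (10, 100, 5, 100) and (20, 200, 10, 200), the estimate is 2.0.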

15.
Summary. We consider methods for estimating the effect of a covariate on a disease onset distribution when the observed data structure consists of right-censored data on diagnosis times and current status data on onset times amongst individuals who have not yet been diagnosed. Dunson and Baird (2001, Biometrics 57, 306–403) approached this problem using maximum likelihood, under the assumption that the ratio of the diagnosis and onset distributions is monotonic nondecreasing. As an alternative, we propose a two-step estimator, an extension of the approach of van der Laan, Jewell, and Petersen (1997, Biometrika 84, 539–554) in the single sample setting, which is computationally much simpler and requires no assumptions on this ratio. A simulation study is performed comparing estimates obtained from these two approaches, as well as that from a standard current status analysis that ignores diagnosis data. Results indicate that the Dunson and Baird estimator outperforms the two-step estimator when the monotonicity assumption holds, but the reverse is true when the assumption fails. The simple current status estimator loses only a small amount of precision in comparison to the two-step procedure but requires monitoring time information for all individuals. In the data that motivated this work, a study of uterine fibroids and chemical exposure to dioxin, the monotonicity assumption is seen to fail. Here, the two-step and current status estimators both show no significant association between the level of dioxin exposure and the hazard for onset of uterine fibroids; the two-step estimator of the relative hazard associated with increasing levels of exposure has the least estimated variance amongst the three estimators considered.

16.
Sequencing pools of individuals rather than individuals separately reduces the costs of estimating allele frequencies at many loci in many populations. Theoretical and empirical studies show that sequencing pools comprising a limited number of individuals (typically fewer than 50) provides reliable allele frequency estimates, provided that the DNA pooling and DNA sequencing steps are carefully controlled. Unequal contributions of different individuals to the DNA pool and the mean and variance in sequencing depth can both affect the standard error of allele frequency estimates. To our knowledge, no study has separately investigated the effect of these two factors on allele frequency estimates, so there is currently no method to estimate a priori the relative importance of unequal individual DNA contributions independently of sequencing depth. We develop a new analytical model for allele frequency estimation that explicitly distinguishes these two effects. Our model shows that the DNA pooling variance in a pooled sequencing experiment depends solely on two factors: the number of individuals within the pool and the coefficient of variation of individual DNA contributions to the pool. We present a new method to experimentally estimate this coefficient of variation when planning a pooled sequencing design where samples are pooled either before or after DNA extraction. Using this analytical and experimental framework, we provide guidelines to optimize the design of pooled sequencing experiments. Finally, we sequence replicated pools of inbred lines of the plant Medicago truncatula and show that the predictions from our model generally hold true when estimating the frequency of known multilocus haplotypes using pooled sequencing.

17.
18.
The state of readiness for high-dimensional single nucleotide polymorphism (SNP) epidemiologic association studies is described, as background for a discussion of statistical aspects of case-control study design and analysis. Specifically, the important role that multistage designs can play in the elimination of false-positive associations and in the control of study costs will be noted. Also, the trade-offs associated with using pooled DNA at early design stages for additional important cost reductions will be discussed in some detail. An odds ratio approach to relating SNP alleles to disease risk using pooled DNA will be proposed, in conjunction with a simple empirical variance estimator, based on comparisons among log-odds ratio estimators from distinct pairs of case and control pools. Simulation studies will be presented to evaluate the moderate sample size properties of such multistage designs and estimation procedures. The design of an ongoing three-stage study in the Women's Health Initiative to relate 250,000 SNPs to the risk of coronary heart disease, stroke, and breast cancer will provide illustration, and will be used to motivate the choice of simulation configurations.

19.
Multiple logistic regression analysis is used to estimate the relative risk in case-control studies. The estimators obtained are valid when the disease is rare. In this paper, an estimator of relative risk in a case-control study is proposed that uses logistic regression results when the incidence of disease is not small. The bias of the usual logistic regression estimator relative to the new estimator is worked out. An expression for the mean squared error of the proposed estimator is derived both for the situation in which the incidence of disease is known exactly and for the situation in which it is estimated through an independent survey. The conventional estimator of relative risk is observed to have a significant bias when the incidence of disease is high. In such situations the proposed estimator can be used to advantage.

20.
MOTIVATION: Pre-processing of SELDI-TOF mass spectrometry data is currently performed on a largely ad hoc basis. This makes comparison of results from independent analyses troublesome and does not provide a framework for distinguishing different sources of variation in data. RESULTS: In this article, we consider the task of pooling a large number of single-shot spectra, a task commonly performed automatically by the instrument software. By viewing the underlying statistical problem as one of heteroscedastic linear regression, we provide a framework for introducing robust methods and for dealing with missing data resulting from a limited span of recordable intensity values provided by the instrument. Our framework provides an interpretation of currently used methods as a maximum-likelihood estimator and allows theoretical derivation of its variance. We observe that this variance depends crucially on the total number of ionic species, which can vary considerably between different pooled spectra. This variation in variance can potentially invalidate the results from naive methods of discrimination/classification and we outline appropriate data transformations. Introducing methods from robust statistics did not improve the standard errors of the pooled samples. Imputing missing values using the EM algorithm, however, had a notable effect on the result; for our data, the pooled height of peaks which were frequently truncated increased by up to 30%.
