1.
This note clarifies under what conditions a naive analysis using a misclassified predictor will induce bias for the regression coefficients of other perfectly measured predictors in the model. An apparent discrepancy between some previous results and a result for measurement error of a continuous variable in linear regression is resolved. We show that, similar to the linear setting, misclassification (even when not related to the other predictors) induces bias in the coefficients of the perfectly measured predictors, unless the misclassified variable and the perfectly measured predictors are independent. Conditional and asymptotic biases are discussed in the case of linear regression, and explored numerically for an example relating birth weight to the weight and smoking status of the mother.
2.
It is well known that imprecision in the measurement of predictor variables typically leads to bias in estimated regression coefficients. We compare the bias induced by measurement error in a continuous predictor with that induced by misclassification of a binary predictor in the contexts of linear and logistic regression. To make the comparison fair, we consider misclassification probabilities for a binary predictor that correspond to dichotomizing an imprecise continuous predictor in lieu of its precise counterpart. On this basis, nondifferential binary misclassification is seen to yield more bias than nondifferential continuous measurement error. However, it is known that differential misclassification results if a binary predictor is actually formed by dichotomizing a continuous predictor subject to nondifferential measurement error. When the postulated model linking the response and precise continuous predictor is correct, this differential misclassification is found to yield less bias than continuous measurement error, in contrast with nondifferential misclassification, i.e., dichotomization reduces the bias due to mismeasurement. This finding, however, is sensitive to the form of the underlying relationship between the response and the continuous predictor. In particular, we give a scenario where dichotomization involves a trade-off between model fit and misclassification bias. We also examine how the bias depends on the choice of threshold in the dichotomization process and on the correlation between the imprecise predictor and a second precise predictor.
3.
Misclassification of exposure variables is a common problem in epidemiologic studies. This paper compares the matrix method (Barron, 1977, Biometrics 33, 414-418; Greenland, 1988a, Statistics in Medicine 7, 745-757) and the inverse matrix method (Marshall, 1990, Journal of Clinical Epidemiology 43, 941-947) to the maximum likelihood estimator (MLE) that corrects the odds ratio for bias due to a misclassified binary covariate. Under the assumption of differential misclassification, the inverse matrix method is always more efficient than the matrix method; however, the efficiency depends strongly on the values of the sensitivity, specificity, baseline probability of exposure, the odds ratio, case-control ratio, and validation sampling fraction. In a study on sudden infant death syndrome (SIDS), an estimate of the asymptotic relative efficiency (ARE) of the inverse matrix estimate was 0.99, while the matrix method's ARE was 0.19. Under nondifferential misclassification, neither the matrix nor the inverse matrix estimator is uniformly more efficient than the other; the efficiencies again depend on the underlying parameters. In the SIDS data, the MLE was more efficient than the matrix method (ARE = 0.39). In a study investigating the effect of vitamin A intake on the incidence of breast cancer, the MLE was more efficient than the matrix method (ARE = 0.75).
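As a rough illustration of the kind of correction this abstract compares, the sketch below applies the matrix-method idea to the simplest case: a binary exposure with known sensitivity and specificity that are the same for cases and controls (nondifferential misclassification). The function names and the counts are hypothetical and chosen only for illustration; the abstract's estimators use validation data rather than known error rates, which this sketch does not attempt.

```python
def corrected_exposed(observed_exposed, total, sensitivity, specificity):
    """Back out the expected number truly exposed from a misclassified count.

    Uses E[observed] = Se * true + (1 - Sp) * (total - true), solved for true.
    Assumes Se and Sp are known (an assumption of this sketch).
    """
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed_exposed - (1.0 - specificity) * total) / youden


def corrected_odds_ratio(cases_exposed, cases_total,
                         controls_exposed, controls_total,
                         sensitivity, specificity):
    """Odds ratio computed from the corrected 2x2 table (hypothetical helper)."""
    a = corrected_exposed(cases_exposed, cases_total, sensitivity, specificity)
    b = cases_total - a
    c = corrected_exposed(controls_exposed, controls_total, sensitivity, specificity)
    d = controls_total - c
    return (a * d) / (b * c)


def naive_odds_ratio(cases_exposed, cases_total, controls_exposed, controls_total):
    """Odds ratio computed directly from the misclassified counts."""
    a, b = cases_exposed, cases_total - cases_exposed
    c, d = controls_exposed, controls_total - controls_exposed
    return (a * d) / (b * c)


# Illustrative counts: a true table of 60/40 (cases) and 30/70 (controls),
# misclassified with Se = 0.9 and Sp = 0.8, gives observed exposed counts 62 and 41.
naive = naive_odds_ratio(62, 100, 41, 100)
corrected = corrected_odds_ratio(62, 100, 41, 100, 0.9, 0.8)
```

In this made-up table the naive odds ratio is pulled toward 1, while the corrected table recovers the odds ratio of the true counts. With real data the corrected cell counts can be negative or non-integer, which is one reason likelihood-based estimators such as the MLE are studied as alternatives.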
4.
A two-stage Bayesian method is presented for analyzing case-control studies in which a binary variable is sometimes measured with error but the correct values of the variable are known for a random subset of the study group. The first stage of the method is analytically tractable and MCMC methods are used for the second stage. The posterior distribution from the first stage becomes the prior distribution for the second stage, thus transferring all relevant information between the stages. The method makes few distributional assumptions and requires no asymptotic approximations. It is computationally fast and can be run using standard software. It is applied to two data sets that have been analyzed by other methods, and results are compared.
5.
We propose a methodology for modeling correlated binary data measured with diagnostic error. A shared random effect is used to induce correlations in repeated true latent binary outcomes and in observed responses and to link the probability of a true positive outcome with the probability of having a diagnosis error. We evaluate the performance of our proposed approach through simulations and compare it with an ad hoc approach. The methodology is illustrated with data from a study that assessed the probability of corneal arcus in patients with familial hypercholesterolemia.
6.
Modeling diagnostic error without a gold standard has been an active area of biostatistical research. In a majority of the approaches, model-based estimates of sensitivity, specificity, and prevalence are derived from a latent class model in which the latent variable represents an individual's true unobserved disease status. For simplicity, initial approaches assumed that the diagnostic test results on the same subject were independent given the true disease status (i.e., the conditional independence assumption). More recently, various authors have proposed approaches for modeling the dependence structure between test results given true disease status. This note discusses a potential problem with these approaches. Namely, we show that when the conditional dependence between tests is misspecified, estimators of sensitivity, specificity, and prevalence can be biased. Importantly, we demonstrate that with small numbers of tests, likelihood comparisons and other model diagnostics may not be able to distinguish between models with different dependence structures. We present asymptotic results that show the generality of the problem. Further, data analysis and simulations demonstrate the practical implications of model misspecification. Finally, we present some guidelines about the use of these models for practitioners.
7.
Low sensitivity and/or specificity of a diagnostic test for outcome results in biased estimates of the time to first event using product limit estimation. For example, if a test has low specificity, estimates of the cumulative distribution function (cdf) are biased towards time zero, while estimates of the cdf are biased away from time zero if a test has low sensitivity. In the context of discrete time survival analysis for infectious disease data, we develop self-consistent algorithms to obtain unbiased estimates of the time to first event when the sensitivity and/or specificity of the diagnostic test for the outcome is less than 100%. Two examples are presented. The first involves estimating time to first detection of HIV-1 infection in infants in a randomized clinical trial, and the second involves estimating time to first Neisseria gonorrhoeae infection in a cohort of Kenyan prostitutes.
8.
We consider a Bayesian analysis for modeling a binary response that is subject to misclassification. Additionally, an explanatory variable is assumed to be unobservable, but measurements are available on its surrogate. A binary regression model is developed to incorporate the measurement error in the covariate as well as the misclassification in the response. Unlike existing methods, no model parameters need be assumed known. Markov chain Monte Carlo methods are utilized to perform the necessary computations. The methods developed are illustrated using atomic bomb survival data. A simulation experiment explores advantages of the approach.
9.
We have developed a new general approach for handling misclassification in discrete covariates or responses in regression models. The simulation and extrapolation (SIMEX) method, which was originally designed for handling additive covariate measurement error, is applied to the case of misclassification. The statistical model for characterizing misclassification is given by the transition matrix Pi from the true to the observed variable. We exploit the relationship between the size of misclassification and bias in estimating the parameters of interest. Assuming that Pi is known or can be estimated from validation data, we simulate data with higher misclassification and extrapolate back to the case of no misclassification. We show that our method is quite general and applicable to models with misclassified response and/or misclassified discrete regressors. In the case of a binary response with misclassification, we compare our method to the approach of Neuhaus, and to the matrix method of Morrissey and Spiegelman in the case of a misclassified binary regressor. We apply our method to a study on caries with a misclassified longitudinal response.
10.
Background: Acute Lymphoblastic Leukemia (ALL) has a high survival rate, but cancer-related late effects in the early post-treatment years need documentation. Hospitalizations are an indicator of the burden of late effects. We identify rates and risk factors for hospitalization from five to ten years after diagnosis for childhood and adolescent ALL survivors compared to siblings and a matched population sample. Methods: 176 ALL survivors were diagnosed at ≤22 years between 1998 and 2008 and treated at an Intermountain Healthcare facility. The Utah Population Database identified siblings, an age- and sex-matched sample of the Utah population, and statewide inpatient hospital discharges. Sex- and birth year-adjusted Poisson models with Generalized Estimating Equations and robust standard errors calculated rates and rate ratios. Cox proportional hazards models identified demographic and clinical risk factors for hospitalizations among survivors. Results: Hospitalization rates for survivors (Rate: 3.76, 95% CI = 2.22–6.36) were higher than siblings (Rate: 2.69, 95% CI = 1.01–7.18) and the population sample (Rate: 1.87, 95% CI = 1.13–3.09). Compared to siblings and population comparisons, rate ratios (RR) were significantly higher for survivors diagnosed between age 6 and 22 years (RR: 2.87, 95% CI = 1.03–7.97 vs siblings; RR: 2.66, 95% CI = 1.17–6.04 vs population comparisons). Rate ratios for diagnosis between 2004 and 2008 were significantly higher compared to the population sample (RR: 4.29, 95% CI = 1.49–12.32), but not siblings (RR: 2.73, 95% CI = 0.54–13.68). Survivors originally diagnosed with high-risk ALL did not have a significantly higher risk than siblings or population comparators. However, high-risk ALL survivors (Hazard ratio [HR]: 3.36, 95% CI = 1.33–8.45) and survivors diagnosed from 2004 to 2008 (HR: 9.48, 95% CI = 1.93–46.59) had the highest risk compared to their survivor counterparts. Conclusions: Five to ten years after diagnosis is a sensitive time period for hospitalizations in the ALL population. Survivors of childhood ALL require better long-term surveillance.
11.
The accuracy of the Kato-Katz technique in identifying individuals with soil-transmitted helminth (STH) infections is limited by day-to-day variation in helminth egg excretion, confusion with other parasites and the laboratory technicians’ experience. We aimed to estimate the sensitivity and specificity of the Kato-Katz technique to detect infection with Ascaris lumbricoides, hookworm and Trichuris trichiura using a Bayesian approach in the absence of a ‘gold standard’. Data were obtained from a longitudinal study conducted between January 2004 and December 2005 in Samar Province, the Philippines. Each participant provided between one and three stool samples over consecutive days. Stool samples were examined using the Kato-Katz technique and reported as positive or negative for STHs. In the presence of measurement error, the true status of each individual is considered as latent data. Using a Bayesian method, we calculated marginal posterior densities of sensitivity and specificity parameters from the product of the likelihood function of observed and latent data. A uniform prior distribution was used (beta distribution: α = 1, β = 1). A total of 5624 individuals provided at least one stool sample. One, two and three stool samples were provided by 1582, 1893 and 2149 individuals, respectively. All STHs showed variation in test results from day to day. Sensitivity estimates of the Kato-Katz technique for one stool sample were 96.9% (95% Bayesian Credible Interval [BCI]: 96.1%, 97.6%), 65.2% (60.0%, 69.8%) and 91.4% (90.5%, 92.3%), for A. lumbricoides, hookworm and T. trichiura, respectively. Specificity estimates for one stool sample were 96.1% (95.5%, 96.7%), 93.8% (92.4%, 95.4%) and 94.4% (93.2%, 95.5%), for A. lumbricoides, hookworm and T. trichiura, respectively. Our results show that the Kato-Katz technique can perform with reasonable accuracy with one day’s stool collection for A. lumbricoides and T. trichiura. 
Low sensitivity of the Kato-Katz technique for detection of hookworm infection may be related to rapid degeneration of delicate hookworm eggs with time.
12.
A comparative study of the effect of measurement error on model parameter estimates and of parameter estimation methods
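Based on the model $V = aD^{b}$, a simulation experiment in Matlab was first used to study the effect of measurement error on the estimation of the model parameters. The results show that when the error in V is held fixed and the error in D increases, fitting the model by ordinary least squares gives an estimate of a that steadily increases and an estimate of b that steadily decreases, so that the estimates move further from the true parameter values as the measurement error in D grows. Parameter estimation methods that remove the effect of measurement error were then studied: regression calibration, simulation extrapolation, and the measurement error model method were each applied to data with measurement error in both V and D. The results show that all three methods yield unbiased parameter estimates and overcome the systematic bias produced by ordinary least squares, and further indicate that the measurement error model method outperforms regression calibration and simulation extrapolation.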
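The direction of the bias described in this abstract can be reproduced with a small simulation. The sketch below is not the study's Matlab experiment: it assumes the power model is fitted by ordinary least squares on the log-log scale, that the error in D is multiplicative (additive on the log scale), and it uses arbitrary illustrative values for a, b, and the spread of D. The function names are hypothetical.

```python
import math
import random


def ols_slope_intercept(x, y):
    """Simple-regression OLS estimates (slope, intercept) for y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_with_error_in_d(error_sd, n=5000, a=0.0001, b=2.5, seed=0):
    """Fit log V = log a + b log D after adding error to log D.

    All numeric settings are assumptions made for this illustration only.
    Returns the implied estimates (a_hat, b_hat).
    """
    rng = random.Random(seed)
    log_d = [rng.gauss(3.0, 0.4) for _ in range(n)]        # true log D
    log_v = [math.log(a) + b * ld + rng.gauss(0.0, 0.1)    # fixed error in V
             for ld in log_d]
    log_d_obs = [ld + rng.gauss(0.0, error_sd) for ld in log_d]  # error in D
    slope, intercept = ols_slope_intercept(log_d_obs, log_v)
    return math.exp(intercept), slope


# Increasing error in D while the error in V stays fixed.
results = [fit_with_error_in_d(sd) for sd in (0.0, 0.2, 0.4)]
for sd, (a_hat, b_hat) in zip((0.0, 0.2, 0.4), results):
    print(f"error sd in log D = {sd:.1f}: a_hat = {a_hat:.6f}, b_hat = {b_hat:.3f}")
```

Under these assumptions the slope is attenuated toward zero as the error in D grows, and because the mean of log D is positive the intercept, and hence the estimate of a, rises to compensate, which matches the pattern the abstract reports.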
13.
Case-control designs are widely used in rare disease studies. In a typical case-control study, data are collected from a sample of all available subjects who have experienced a disease (cases) and a sub-sample of subjects who have not experienced the disease (controls) in a study cohort. Cases are oversampled in case-control studies. Logistic regression is a common tool to estimate the relative risks of the disease with respect to a set of covariates. Very often in such a study, the ages at onset of the disease for all cases and the ages at survey of controls are known. Standard logistic regression analysis using age as a covariate is based on a dichotomous outcome and does not efficiently use such age-at-onset (time-to-event) information. We propose to analyze age-at-onset data using a modified case-cohort method by treating the control group as an approximation of a subcohort assuming rare events. We investigate the asymptotic bias of this approximation and show that the asymptotic bias of the proposed estimator is small when the disease rate is low. We evaluate the finite sample performance of the proposed method through a simulation study and illustrate the method using a breast cancer case-control data set.
14.
Planning studies involving diagnostic tests is complicated by the fact that virtually no test provides perfectly accurate results. The misclassification induced by imperfect sensitivities and specificities of diagnostic tests must be taken into account, whether the primary goal of the study is to estimate the prevalence of a disease in a population or to investigate the properties of a new diagnostic test. Previous work on sample size requirements for estimating the prevalence of disease in the case of a single imperfect test showed very large discrepancies in size when compared to methods that assume a perfect test. In this article we extend these methods to include two conditionally independent imperfect tests, and apply several different criteria for Bayesian sample size determination to the design of such studies. We consider both disease prevalence studies and studies designed to estimate the sensitivity and specificity of diagnostic tests. As the problem is typically nonidentifiable, we investigate the limits on the accuracy of parameter estimation as the sample size approaches infinity. Through two examples from infectious diseases, we illustrate the changes in sample sizes that arise when two tests are applied to individuals in a study rather than a single test. Although smaller sample sizes are often found in the two-test situation, they can still be prohibitively large unless accurate information is available about the sensitivities and specificities of the tests being used.
15.
16.
An estimator of relative risk in a case-control study has been proposed in terms of observed cell frequencies and the probability of disease. The bias of the usual estimator, i.e., the odds ratio, as compared to the new estimator has been worked out. The expression for the mean squared error of the proposed estimator has been derived for situations where the probability of disease is exactly known and where it is estimated through an independent survey. It has been observed that there is a serious error in using the odds ratio as an estimate of relative risk when the probability of disease is not negligible. In such situations the proposed estimator can be used with advantage.
17.
18.
The intra- and inter-observer measurement error variability was studied using univariate and multivariate statistical tests. Eleven skeletal variables of four individuals each in four primate species were measured ten times by three different researchers, using six different tools. An average measurement error of 0.52 mm was obtained. Univariate statistics showed significant differences among researchers. A multivariate discriminant analysis could also discriminate them. The measurement error may be either systematic or random, and depends not only on the researcher, but also on the tool used, the variable measured, and on the magnitude of the variable. The technique of Measurement Replication is proposed in order to reduce the measurement error, especially when comparing small samples or when trying to find small average differences between populations. The replication technique also reduces the standard deviation of the population sample.
19.
20.
Gerhard Hommel. Biometrical Journal (Biometrische Zeitschrift), 2001, 43(5): 581-589
It is investigated how one can modify hypotheses in a trial after an interim analysis such that the type I error rate is controlled. If only a global statement is desired, a solution was given by Bauer (1989). For a general multiple testing problem, Kieser, Bauer and Lehmacher (1999) and Bauer and Kieser (1999) gave solutions by means of which the initial set of hypotheses can be reduced after the interim analysis. The same techniques can be applied to obtain more flexible strategies, such as changing the weights of hypotheses, changing an a priori order, or even including new hypotheses. It is emphasized that the application of these methods requires very careful planning of a trial as well as a critical discussion of the scientific aims in order to avoid any manipulation.