Similar Articles
20 similar articles found.
1.
M A Espeland, S L Hui. Biometrics, 1987, 43(4):1001-1012
Misclassification is a common source of bias and reduced efficiency in the analysis of discrete data. Several methods have been proposed to adjust for misclassification using information on error rates (i) gathered by resampling the study population, (ii) gathered by sampling a separate population, or (iii) assumed a priori. We present unified methods for incorporating these types of information into analyses based on log-linear models and maximum likelihood estimation. General variance expressions are developed. Examples from epidemiologic studies are used to demonstrate the proposed methodology.
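As a minimal illustration of adjusting discrete data when error rates are assumed known a priori, the simple matrix method recovers the true category probabilities by inverting the misclassification matrix. (This is a sketch of the general idea, not the paper's log-linear ML machinery; the 3-category error rates below are invented.)

```python
import numpy as np

# M[i, j] = P(observe category i | true category j); columns sum to 1.
# These rates are illustrative assumptions, e.g. as if taken from a
# validation sample or assumed a priori.
M = np.array([[0.90, 0.05, 0.00],
              [0.10, 0.90, 0.10],
              [0.00, 0.05, 0.90]])

p_true = np.array([0.5, 0.3, 0.2])   # true cell probabilities (unknown in practice)
p_obs = M @ p_true                   # distribution that misclassification produces

# Matrix-method adjustment: solve M p = p_obs for the true distribution.
p_adj = np.linalg.solve(M, p_obs)
```

In real data `p_obs` would be the observed cell proportions, and the sampling variability of both `p_obs` and any estimated error rates would propagate into the variance of `p_adj`, which is what the paper's general variance expressions account for.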

2.
In many observational studies, individuals are measured repeatedly over time, although not necessarily at a set of pre-specified occasions. Instead, individuals may be measured at irregular intervals, with those having a history of poorer health outcomes being measured with somewhat greater frequency and regularity. In this paper, we consider likelihood-based estimation of the regression parameters in marginal models for longitudinal binary data when the follow-up times are not fixed by design, but can depend on previous outcomes. In particular, we consider assumptions regarding the follow-up time process that result in the likelihood function separating into two components: one for the follow-up time process, the other for the outcome measurement process. The practical implication of this separation is that the follow-up time process can be ignored when making likelihood-based inferences about the marginal regression model parameters. That is, maximum likelihood (ML) estimation of the regression parameters relating the probability of success at a given time to covariates does not require that a model for the distribution of follow-up times be specified. However, to obtain consistent parameter estimates, the multinomial distribution for the vector of repeated binary outcomes must be correctly specified. In general, ML estimation requires specification of all higher-order moments, and the likelihood for a marginal model can be intractable except in cases where the number of repeated measurements is relatively small. To circumvent these difficulties, we propose a pseudolikelihood for estimation of the marginal model parameters. The pseudolikelihood uses a linear approximation for the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only. When the follow-up times depend only on the previous outcome, the pseudolikelihood requires correct specification of the conditional distribution of the current outcome given the outcome at the previous occasion only. Results from a simulation study and a study of asymptotic bias are presented. Finally, we illustrate the main results using data from a longitudinal observational study that explored the cardiotoxic effects of doxorubicin chemotherapy for the treatment of acute lymphoblastic leukemia in children.

3.
This note clarifies under what conditions a naive analysis using a misclassified predictor will induce bias for the regression coefficients of other perfectly measured predictors in the model. An apparent discrepancy between some previous results and a result for measurement error of a continuous variable in linear regression is resolved. We show that similar to the linear setting, misclassification (even when not related to the other predictors) induces bias in the coefficients of the perfectly measured predictors, unless the misclassified variable and the perfectly measured predictors are independent. Conditional and asymptotic biases are discussed in the case of linear regression, and explored numerically for an example relating birth weight to the weight and smoking status of the mother.
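The effect the note describes can be reproduced in a small simulation. Everything below is an illustrative assumption (effect sizes, the 20% nondifferential flip rate, the way `x` and `z` are made dependent), not the note's birth-weight example: when the misclassified binary `x` is correlated with the perfectly measured `z`, the naive fit attenuates the coefficient of `x` and biases the coefficient of `z`.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
z = rng.normal(size=n)                            # perfectly measured predictor
x = (z + rng.normal(size=n) > 0).astype(float)    # binary predictor, correlated with z
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(size=n)  # true model

flip = rng.random(n) < 0.2                        # 20% nondifferential misclassification
x_obs = np.where(flip, 1.0 - x, x)

# Naive least-squares fit using the misclassified x_obs in place of x.
X = np.column_stack([np.ones(n), x_obs, z])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
# beta[1] is attenuated below the true 2.0, and beta[2] is pulled above the
# true 3.0 because z absorbs part of the signal lost from x.
```

If `x` is instead generated independently of `z`, repeating the fit shows `beta[2]` returning to (approximately) 3.0, matching the independence condition in the note.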

4.
Data with missing covariate values but fully observed binary outcomes are an important subset of the missing data challenge. Common approaches are complete case analysis (CCA) and multiple imputation (MI). While CCA relies on missing completely at random (MCAR), MI usually relies on a missing at random (MAR) assumption to produce unbiased results. For MI involving logistic regression models, it is also important to consider several missing not at random (MNAR) conditions under which CCA is asymptotically unbiased and, as we show, MI is also valid in some cases. We use a data application and simulation study to compare the performance of several machine learning and parametric MI methods under a fully conditional specification framework (MI-FCS). Our simulation includes five scenarios involving MCAR, MAR, and MNAR under predictable and nonpredictable conditions, where “predictable” indicates missingness is not associated with the outcome. We build on previous results in the literature to show MI and CCA can both produce unbiased results under more conditions than some analysts may realize. When both approaches were valid, we found that MI-FCS was at least as good as CCA in terms of estimated bias and coverage, and was superior when missingness involved a categorical covariate. We also demonstrate how MNAR sensitivity analysis can build confidence that unbiased results were obtained, including under MNAR-predictable, when CCA and MI are both valid. Since the missingness mechanism cannot be identified from observed data, investigators should compare results from MI and CCA when both are plausibly valid, followed by MNAR sensitivity analysis.
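The baseline claim that CCA is unbiased under MCAR can be checked with a small simulation. The model, sample size, and 30% missingness rate below are assumptions for illustration; the logistic fit is a plain Newton-Raphson routine, not any specific package's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40_000
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))   # true intercept 0.5, true slope 1.0
y = rng.binomial(1, p)

def logistic_fit(X, y, iters=25):
    """Plain Newton-Raphson (IRLS) for logistic regression."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = 1.0 / (1.0 + np.exp(-X @ b))
        w = mu * (1.0 - mu)
        b += np.linalg.solve((X * w[:, None]).T @ X, X.T @ (y - mu))
    return b

X = np.column_stack([np.ones(n), x])
keep = rng.random(n) > 0.3            # MCAR: 30% of rows lost, independent of x and y
b_cc = logistic_fit(X[keep], y[keep]) # complete case analysis
# b_cc recovers (0.5, 1.0) up to sampling error, despite discarding 30% of rows.
```

Making `keep` depend on `y` (an MNAR-nonpredictable mechanism, in the paper's terminology) would generally break this recovery for the intercept, which is one of the conditions the paper catalogues.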

7.
The results of an epidemiological survey on surgical cases of human hydatidosis in nine Italian regions (Central, Southern, and Insular Italy) with the highest incidence of the disease and a population of 27,054,000 inhabitants are reported. The period considered was from 1980 through 1984. A total of 2,592 cases were collected and related to the sex, age, occupation, and residence of the surgically treated patients and to cyst localization. Results from the present survey were compared with those of a previous one.

8.
Neuhaus JM. Biometrics, 2002, 58(3):675-683
Misclassified clustered and longitudinal data arise in studies where the response indicates a condition identified through an imperfect diagnostic procedure. Examples include longitudinal studies that use an imperfect diagnostic test to assess whether or not an individual has been infected with a specific virus. This article presents methods to implement both population-averaged and cluster-specific analyses of such data when the misclassification rates are known. The methods exploit the fact that the class of generalized linear models enjoys a closure property in the case of misclassified responses. Data from longitudinal studies of infectious disease will illustrate the findings.
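The closure property rests on a simple fact: with known sensitivity and specificity, the observed success probability is a linear function of the true one, and that map can be inverted. A minimal numeric sketch (the sensitivity, specificity, and true probability below are assumed, not from the article):

```python
# P(test positive | infected) and P(test negative | not infected),
# assumed known, e.g. from the diagnostic test's validation study.
sens, spec = 0.95, 0.90
p_true = 0.20                  # true infection probability (unknown in practice)

# Probability of an *observed* positive under misclassification:
p_obs = sens * p_true + (1.0 - spec) * (1.0 - p_true)

# Invert the linear map to recover the true probability; this only works
# when sens + spec > 1, i.e. the test is better than chance.
p_back = (p_obs - (1.0 - spec)) / (sens + spec - 1.0)
```

Because this map is linear in `p_true`, applying it to the mean of a generalized linear model yields another model of the same class, which is the closure property the article exploits for both population-averaged and cluster-specific analyses.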

10.
Choosing the right exposure index for epidemiological studies on 50–60 Hz magnetic fields is difficult due to the lack of knowledge about critical exposure parameters for the biological effects of magnetic fields. This paper uses data from a previously published epidemiological investigation on early pregnancy loss (EPL) to study the methods of evaluating the exposure-response relationship of 50 Hz magnetic fields. Two approaches were used. The first approach was to apply generalized additive modeling to suggest the functional form of the relationship between EPL and magnetic field strength. The second approach evaluated the goodness of fit of the EPL data with eight alternative exposure indices: the 24 h average of magnetic field strength, three indices measuring the proportion of time above specified thresholds, and four indices measuring the proportion of time within specified intensity windows. Because the original exposure data included only spot measurements, estimates for the selected exposure indices were calculated indirectly from the spot measurements using empirical nonlinear equations derived from 24 h recordings in 60 residences. The results did not support intensity windows, and a threshold-type dependence on field strength appeared to be more plausible than a linear relationship. In addition, the study produced data suggesting that spot measurements may be used as surrogates for other exposure indices besides the time average field strength. No final conclusions should be drawn from this study alone, but we hope that this exercise stimulates evaluation of alternative exposure indices in other planned and ongoing epidemiological studies. © 1996 Wiley-Liss, Inc.

11.
In this paper, we develop Poisson-type regression methods that require that the durations of exposure be measured only on a possibly nonrandom subset of the cohort members. These methods can be used to make inferences about the incidence density during exposure as well as the ratio of incidence densities during exposure versus not during exposure. Numerical studies demonstrate that the proposed methods yield reliable results in practical settings. We describe an application to a population-based case-control study assessing the transient increase in the risk of primary cardiac arrest during leisure-time physical activity.
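The two target quantities, incidence density during exposure and the ratio of incidence densities, reduce to events divided by person-time. The counts and person-time totals below are invented for illustration (they are not the cardiac-arrest study's data):

```python
# Hypothetical cohort totals: events and person-time (in hours)
# accumulated during exposure vs. not during exposure.
events_exp, time_exp = 9, 1_500.0         # e.g. hours of leisure-time activity
events_unexp, time_unexp = 30, 50_000.0   # remaining follow-up time

id_exp = events_exp / time_exp            # incidence density during exposure
id_unexp = events_unexp / time_unexp      # incidence density otherwise
idr = id_exp / id_unexp                   # transient relative risk
print(round(idr, 1))                      # 10.0
```

The paper's contribution is that `time_exp` need only be measured on a possibly nonrandom subset of the cohort; the sketch above assumes it is fully observed, which is the simple case the regression methods generalize.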

12.
In longitudinal studies where time to a final event is the ultimate outcome, information is often available about intermediate events that individuals may experience during the observation period. Even though many extensions of the Cox proportional hazards model have been proposed to model such multivariate time-to-event data, these approaches are still rarely applied to real datasets. The aim of this paper is to illustrate the application of extended Cox models for multiple time-to-event data and to show their implementation in popular statistical software packages. We demonstrate a systematic way of jointly modelling similar or repeated transitions in follow-up data by analysing an event-history dataset of 270 breast cancer patients who were followed up for different clinical events during treatment of metastatic disease. First, we show how this methodology can also be applied to non-Markovian stochastic processes by representing these processes as "conditional" Markov processes. Second, we compare the application of different Cox-related approaches to the breast cancer data by varying their key model components (i.e., analysis time scale, risk set, and baseline hazard function). Our study showed that extended Cox models are a powerful tool for analysing complex event-history datasets, since the approach can address many dynamic data features such as multiple time scales, dynamic risk sets, time-varying covariates, transition-by-covariate interactions, autoregressive dependence, or intra-subject correlation.
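Extended Cox models for multiple transitions are typically fit on data in counting-process ("start, stop, status") format, with one row per at-risk spell per transition type. A minimal sketch of that layout (patient histories and transition names below are made up; this is the data shape, not the paper's dataset):

```python
# Each row: one spell during which a patient is at risk for one transition.
rows = []

def add_spell(pid, start, stop, event, transition):
    """Append one at-risk spell in counting-process format."""
    rows.append({"id": pid, "start": start, "stop": stop,
                 "event": event, "transition": transition})

# Hypothetical patient 1: progression at t=4, then death at t=10.
add_spell(1, 0, 4, 1, "diagnosis->progression")
add_spell(1, 4, 10, 1, "progression->death")
# Hypothetical patient 2: censored at t=7 without progression.
add_spell(2, 0, 7, 0, "diagnosis->progression")
```

Stratifying the baseline hazard by `transition`, choosing `start` on either the study time scale or the time-since-last-event scale, and restricting the risk set per transition are exactly the model components the paper varies across its Cox-related approaches.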

13.
Analysis of bioassay data based on support vector regression
Wang Zhiming, Tan Xiansheng, Zhou Wei, Yuan Zheming. Acta Entomologica Sinica (昆虫学报), 2010, 53(12):1436-1441
Bioassay is a fundamental component of biology, medicine, and toxicology. The time-dose-mortality (TDM) model, a common method for analyzing quantitative bioassay data, cannot build a unified model for complex bioassay data and makes insufficient use of the available information. Based on support vector regression (SVR), this paper proposes a new method that can build a unified model for complex bioassay data across different test factors, test organisms, and environmental conditions. Comparative analyses of 14 simple bioassay datasets and 2 complex bioassay datasets show that the SVR model outperforms the TDM model in both fitting accuracy and leave-one-out prediction accuracy, and yields more credible estimates of indices such as LD50 and LT50. The SVR model is expected to serve as a useful complement to the TDM model and to find wide application in quantitative bioassay data analysis.
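The paper's exact SVR formulation, kernel, and hyperparameters are not given in the abstract; as a flavor of kernel-based dose-response fitting, here is a least-squares RBF-kernel sketch (closely related to LS-SVR) on a toy dose-mortality curve, with an LD50 read off the fitted curve. All data values, `gamma`, and `lam` are assumptions.

```python
import numpy as np

def rbf(a, b, gamma=2.0):
    """RBF (Gaussian) kernel matrix between 1-D point sets a and b."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

# Toy assay: mortality fraction at five log10-dose levels (invented data).
log_dose = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
mortality = np.array([0.05, 0.20, 0.50, 0.80, 0.95])

# Least-squares kernel fit with a small ridge penalty for stability.
lam = 1e-4
K = rbf(log_dose, log_dose)
alpha = np.linalg.solve(K + lam * np.eye(len(log_dose)), mortality)

# Predict on a dense grid and read off a rough LD50: the log-dose whose
# predicted mortality is closest to 0.5.
grid = np.linspace(0.0, 2.0, 201)
pred = rbf(grid, log_dose) @ alpha
ld50 = grid[np.argmin(np.abs(pred - 0.5))]
```

In the paper's setting the input would additionally include time, test factor, and environmental covariates in one feature vector, which is what lets a single SVR model cover datasets the TDM model must fit separately.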

14.
Electric field strength values calculated by wave propagation modeling were applied as an exposure metric in a case–control study conducted in Germany to investigate a possible association between radio frequency electromagnetic fields (RF-EMF) emitted from television and radio broadcast transmitters and the risk of childhood leukemia. To validate this approach, it was examined at 850 measurement sites whether calculated RF-EMF are an improvement over an exposure proxy based on distance from the place of residence to a transmitter. Further, the agreement between measured and calculated RF-EMF was explored. For dichotomization at the 90% quantiles of the exposure distributions, it was found that distance agreed less with measured RF-EMF (Kappa coefficient: 0.55) than did calculated RF-EMF (Kappa coefficient: 0.74). Distance was a good exposure proxy only for a single transmitter operating in the amplitude-modulated radio frequency bands, whereas it appeared to be of limited informative value in studies involving several transmitters, particularly if these are operating in different frequency bands. The analysis of the agreement between calculated RF-EMF and measured RF-EMF showed a sensitivity of 76.6% and a specificity of 97.4%, leading to an exposure misclassification that still allows one to detect a true odds ratio as low as 1.4 with a statistical power of >80% at a two-sided significance level of 5% in a study with 2,000 cases and 6,000 controls. Thus, calculated RF-EMF is confirmed to be an appropriate exposure metric in large-scale epidemiological studies on broadcast transmitters. Bioelectromagnetics 30:81–91, 2009. © 2008 Wiley-Liss, Inc.
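The Kappa coefficients quoted above measure chance-corrected agreement between two dichotomized exposure metrics. A small sketch of the computation, on an invented 2x2 table (the counts are not the study's; only the total of 850 sites matches the abstract):

```python
# Cohen's kappa from a 2x2 agreement table of two dichotomized metrics.
# n11: both "high", n10/n01: one "high", n00: both "low".
def kappa(n11, n10, n01, n00):
    n = n11 + n10 + n01 + n00
    po = (n11 + n00) / n                         # observed agreement
    p_high = (n11 + n10) / n * (n11 + n01) / n   # chance both classify "high"
    p_low = (n00 + n01) / n * (n00 + n10) / n    # chance both classify "low"
    pe = p_high + p_low                          # expected chance agreement
    return (po - pe) / (1 - pe)

# Illustrative table for 850 sites dichotomized at the 90% quantile.
print(round(kappa(60, 25, 25, 740), 2))
```

Because the dichotomization is at the 90% quantile, the marginals are heavily skewed toward "low", which is exactly why chance correction matters: raw agreement here is 94%, yet kappa is far lower.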

15.
Marginal methods have been widely used for the analysis of longitudinal ordinal and categorical data. These models do not require full parametric assumptions on the joint distribution of repeated response measurements but only specify the marginal or even association structures. However, inference results obtained from these methods often incur serious bias when variables are subject to error. In this paper, we tackle the problem that misclassification exists in both response and categorical covariate variables. We develop a marginal method for misclassification adjustment, which utilizes second-order estimating functions and a functional modeling approach, and can yield consistent estimates and valid inference for mean and association parameters. We propose a two-stage estimation approach for cases in which validation data are available. Our simulation studies show good performance of the proposed method under a variety of settings. Although the proposed method is phrased in terms of a longitudinal design, it also applies to correlated data arising from clustered and family studies, in which association parameters may be of scientific interest. The proposed method is applied to analyze a dataset from the Framingham Heart Study as an illustration.

16.
Goetghebeur E, Ryan L. Biometrics, 2000, 56(4):1139-1144
We propose a semiparametric approach to the proportional hazards regression analysis of interval-censored data. An EM algorithm based on an approximate likelihood leads to an M-step that involves maximizing a standard Cox partial likelihood to estimate regression coefficients and then using the Breslow estimator for the unknown baseline hazards. The E-step takes a particularly simple form because all incomplete data appear as linear terms in the complete-data log likelihood. The algorithm of Turnbull (1976, Journal of the Royal Statistical Society, Series B 38, 290-295) is used to determine times at which the hazard can take positive mass. We found multiple imputation to yield an easily computed variance estimate that appears to be more reliable than asymptotic methods with small to moderately sized data sets. In the right-censored survival setting, the approach reduces to the standard Cox proportional hazards analysis, while the algorithm reduces to the one suggested by Clayton and Cuzick (1985, Applied Statistics 34, 148-156). The method is illustrated on data from the breast cancer cosmetics trial, previously analyzed by Finkelstein (1986, Biometrics 42, 845-854) and several subsequent authors.

18.
Percentage is widely used to describe different results in food microbiology, e.g., probability of microbial growth, percent inactivated, and percent of positive samples. Four sets of percentage data, percent-growth-positive, germination extent, probability for one cell to grow, and maximum fraction of positive tubes, were obtained from our own experiments and the literature. These data were modeled using linear and logistic regression. Five methods were used to compare the goodness of fit of the two models: percentage of predictions closer to observations, range of the differences (predicted value minus observed value), deviation of the model, linear regression between the observed and predicted values, and bias and accuracy factors. Logistic regression was a better predictor of at least 78% of the observations in all four data sets. In all cases, the deviation of logistic models was much smaller. The linear correlation between observations and logistic predictions was always stronger. Validation (accomplished using part of one data set) also demonstrated that the logistic model was more accurate in predicting new data points. Bias and accuracy factors were found to be less informative when evaluating models developed for percentage data, since neither of these indices can compare predictions at zero. Model simplification for the logistic model was demonstrated with one data set. The simplified model was as powerful in making predictions as the full linear model, and it also gave clearer insight in determining the key experimental factors.
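One structural reason logistic regression suits percentage data is visible in a toy fit: a straight line on raw fractions extrapolates outside [0, 1], while a logistic curve cannot. The percent-growth-positive values and temperatures below are invented, and the logistic fit here is a simple logit-transform least squares rather than the paper's full regression.

```python
import numpy as np

# Toy data: fraction of growth-positive samples vs. temperature (assumed).
temp = np.array([5.0, 10.0, 15.0, 20.0, 25.0, 30.0])
frac = np.array([0.02, 0.10, 0.35, 0.70, 0.92, 0.98])

new_temps = np.array([0.0, 40.0])          # extrapolation points

# Linear model: straight line on the raw percentages.
b1, b0 = np.polyfit(temp, frac, 1)
lin = b0 + b1 * new_temps                  # can fall below 0 or above 1

# Logistic model, fit by least squares on the logit scale.
logit = np.log(frac / (1.0 - frac))
c1, c0 = np.polyfit(temp, logit, 1)
logi = 1.0 / (1.0 + np.exp(-(c0 + c1 * new_temps)))  # always inside (0, 1)
```

This also explains the abstract's remark about bias and accuracy factors: those indices are ratio-based, so a prediction of exactly zero, which the linear model can and does produce, cannot be compared at all.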

20.
Chromosome analyses of children after ecological lead exposure
In the present work, chromosome analysis was performed in a group of 30 children living in a town with a lead plant. Owing to the emissions of the smelter, individual lead uptake through food, drinking water, and inhalation was increased. The children were selected from 1,600 children whose blood lead level, erythrocyte delta-aminolevulinic acid dehydratase activity, and erythrocyte porphyrin level were measured. In the investigated group of children, the values of these parameters were indicative of significant lead exposure. A total of 10,000 cells were scored after a 48 h culture time. Despite a significantly increased lead load compared with two groups of 10 children from a suburb and the island of Helgoland, there was no evidence of either a higher number of cells with structural chromosome aberrations or an increased aberration yield.
