Similar articles
1.
Missing data is a common issue in research using observational studies to investigate the effect of treatments on health outcomes. When missingness occurs only in the covariates, a simple approach is to use missing indicators to handle the partially observed covariates. The missing indicator approach has been criticized for giving biased results in outcome regression. However, recent papers have suggested that the missing indicator approach can provide unbiased results in propensity score analysis under certain assumptions. We consider assumptions under which the missing indicator approach can provide valid inferences, namely, (1) no unmeasured confounding within missingness patterns; either (2a) covariate values of patients with missing data were conditionally independent of treatment or (2b) these values were conditionally independent of outcome; and (3) the outcome model is correctly specified: specifically, the true outcome model does not include interactions between missing indicators and fully observed covariates. We prove that, under the assumptions above, the missing indicator approach with outcome regression can provide unbiased estimates of the average treatment effect. We use a simulation study to investigate the extent of bias in estimates of the treatment effect when the assumptions are violated and we illustrate our findings using data from electronic health records. In conclusion, the missing indicator approach can provide valid inferences for outcome regression, but the plausibility of its assumptions must first be considered carefully.
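The missing indicator construction described in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' code: the function name `missing_indicator_columns`, the use of `None` to mark a missing value, and the fill value of 0.0 are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def missing_indicator_columns(
    values: Sequence[Optional[float]], fill_value: float = 0.0
) -> Tuple[List[float], List[int]]:
    """Split one partially observed covariate into two model columns.

    Returns (filled, indicator): `filled` holds the observed values with
    missing entries replaced by `fill_value` (an arbitrary constant), and
    `indicator` is 1 where the original value was missing, 0 otherwise.
    """
    filled = [fill_value if v is None else float(v) for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator


# Hypothetical covariate with two missing entries (made-up numbers).
bmi = [22.5, None, 30.1, None]
bmi_filled, bmi_missing = missing_indicator_columns(bmi)
```

Both columns would then enter the outcome regression alongside the treatment and the fully observed covariates. Whether the resulting estimate is unbiased depends on the assumptions listed in the abstract, not on the coding itself.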

2.
This paper proposes dynamic treatment regimes (DTRs) as effective individualized treatment strategies for managing chronic periodontitis. The proposed DTRs are studied via SMARTp, a two-stage sequential multiple assignment randomized trial (SMART) design. For this design, we propose a statistical analysis plan and a novel cluster-level sample size calculation method that factors in typical features of periodontal responses such as non-Gaussianity, spatial clustering, and nonrandom missingness. Here, each patient is viewed as a cluster, and a tooth within a patient's mouth is viewed as an individual unit inside the cluster, with the tooth-level covariance structure described by a conditionally autoregressive structure. To accommodate possible skewness and tail behavior, the tooth-level clinical attachment level (CAL) response is assumed to be skew-t, with the nonrandomly missing structure captured via a shared parameter model corresponding to the missingness indicator. The proposed method considers mean comparison for the regimes with or without sharing an initial treatment, where the expected values and corresponding variances or covariance for the sample means of a pair of DTRs are derived by the inverse probability weighting and method of moments. Simulation studies are conducted to investigate the finite-sample performance of the proposed sample size formulas under a variety of outcome-generating scenarios. An R package SMARTp implementing our sample size formula is available at the Comprehensive R Archive Network for free download.

3.
Generalized additive models (GAMs) have been widely used for flexible modeling of various types of outcomes. When the outcome in a GAM is subject to missingness, practical analyses often assume that the data are missing at random (MAR). This assumption may be suspect when the missingness is not by design. Evaluating the potential effects of alternative nonignorable missing data mechanisms on the MAR inference from a GAM can be important but is often challenging because of the complexity of alternative nonignorable models. We apply the index approach to local sensitivity (Troxel, Ma, and Heitjan, 2004, Statistica Sinica 14, 1221–1237) to evaluate the potential changes of the GAM estimates in the neighborhood of the MAR model. The approach avoids fitting any complicated nonignorable GAM. Only MAR estimates are required to calculate the resulting sensitivity index and adjust the GAM estimates to account for nonignorable missingness. Thus the proposed approach is considerably simpler to conduct than the alternative methods. The simulation study shows that the index provides valid assessment of the local sensitivity of the GAM estimates to nonignorable missingness. We then illustrate the method using a rheumatoid arthritis clinical trial data set.

4.
Lauri Oksanen, Marek Sammul, Merike Mägi. Oikos 2006, 112(1): 149-155
The index of relative competition intensity (RCI) has serious built‐in biases, due to its asymptotic behavior when competition intensity is high and its tendency to obtain very low values when plants with neighbors intact perform better than neighbor removal plants. These biases have been partially corrected in the index of relative neighbor effect (RNE), but the existence of fixed upper and lower bounds (−1 ≤ RNE ≤ +1) still creates problems and biases in communities where the average intensity of competition or facilitation is high and plant performance pronouncedly varies in space. The third commonly used index, the logarithm of response ratio (lnRR), is mathematically and statistically sound, but when computed from pair‐wise comparisons between neighbor removal and control plants, this index reflects the geometric mean of the treatment effect. Moreover, linear patterns in lnRR reflect exponential patterns in the intensity of competition. As the interest of ecologists usually focuses on arithmetic means, we propose a corrected index of relative competition intensity, CRCI = arcsin(RNE). This index is fairly linear within the observed ranges of competition and facilitation, and for the range of competition intensities where RNE behaves reasonably, the two indices obtain almost identical values. We compared the performance of the four indices, using both imagined and real data, the latter from systems where the responses of plants to neighbor removal ranged from weak to moderate, so that RNE and CRCI were expected to behave similarly. The indices were computed both from pooled data for each community and as averages of pair‐wise comparisons. lnRR and CRCI were found to behave in a consistent and bias‐free manner, yielding similar results regardless of method of computation. This was, by and large, the case with RNE, too, but as the values of indices grew, the values from pair‐wise comparisons became increasingly smaller than values computed from pooled data. RCI yielded grossly aberrant results in computations based on pair‐wise comparisons. Therefore, the further use of RCI is inadvisable and studies where RCI has been derived from pair‐wise comparisons should be excluded from meta‐analyses.
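A small sketch may help compare the four indices on a single removal/control pair. Only CRCI = arcsin(RNE) is taken from the abstract. The formulas used here for RCI, RNE and lnRR are the definitions commonly found in the plant competition literature and are assumptions in this context, as are the function name and the sign convention for lnRR.

```python
import math
from typing import Dict


def competition_indices(removal: float, control: float) -> Dict[str, float]:
    """Compute four competition indices for one pair of plants.

    removal: performance (e.g. biomass) with neighbours removed
    control: performance with neighbours intact
    The RCI, RNE and lnRR formulas are assumed conventional definitions;
    CRCI = arcsin(RNE) follows the abstract.
    """
    rci = (removal - control) / removal                 # unbounded below
    rne = (removal - control) / max(removal, control)   # bounded in [-1, +1]
    ln_rr = math.log(removal / control)                 # log response ratio
    crci = math.asin(rne)                               # corrected index
    return {"RCI": rci, "RNE": rne, "lnRR": ln_rr, "CRCI": crci}


# Made-up values: competition (removal plant does better) vs facilitation.
competition = competition_indices(removal=10.0, control=5.0)
facilitation = competition_indices(removal=5.0, control=10.0)
```

With these made-up numbers RCI is 0.5 under competition but -1.0 under facilitation of the same magnitude, while RNE, lnRR and CRCI are symmetric around zero. That asymmetry is the kind of behavior the abstract criticizes.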

5.
Recent reviews of specific topics, such as the relationship between male attractiveness to females and fluctuating asymmetry or attractiveness and the expression of secondary sexual characters, suggest that publication bias might be a problem in ecology and evolution. In these cases, there is a significant negative correlation between the sample size of published studies and the magnitude or strength of the research findings (formally the ‘effect size’). If all studies that are conducted are equally likely to be published, irrespective of their findings, there should not be a directional relationship between effect size and sample size; only a decrease in the variance in effect size as sample size increases due to a reduction in sampling error. One interpretation of these reports of negative correlations is that studies with small sample sizes and weaker findings (smaller effect sizes) are less likely to be published. If the biological literature is systematically biased this could undermine the attempts of reviewers to summarise actual biological relationships by inflating estimates of average effect sizes. But how common is this problem? And does it really affect the general conclusions of literature reviews? Here, we examine data sets of effect sizes extracted from 40 peer‐reviewed, published meta‐analyses. We estimate how many studies are missing using the newly developed ‘trim and fill’ method. This method uses asymmetry in plots of effect size against sample size (‘funnel plots’) to detect ‘missing’ studies. For random‐effect models of meta‐analysis 38% (15/40) of data sets had a significant number of ‘missing’ studies. After correcting for potential publication bias, 21% (8/38) of weighted mean effects were no longer significantly greater than zero, and 15% (5/34) were no longer statistically robust when we used random‐effects models in a weighted meta‐analysis. The mean correlation between sample size and the magnitude of standardised effect size was also significantly negative (rs = −0.20, P < 0.0001). Individual correlations were significantly negative (P < 0.10) in 35% (14/40) of cases. Publication bias may therefore affect the main conclusions of at least 15–21% of meta‐analyses. We suggest that future literature reviews assess the robustness of their main conclusions by correcting for potential publication bias using the ‘trim and fill’ method.
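The rank correlation between sample size and effect size mentioned in this abstract can be illustrated with a short sketch. This is not the authors' analysis and it does not implement the ‘trim and fill’ method. It only computes a Spearman correlation, and the function names and data are invented for the example.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Made-up studies: effect size shrinks as sample size grows.
sample_sizes = [10, 20, 40, 80, 160]
effect_sizes = [0.9, 0.7, 0.5, 0.3, 0.1]
rs = spearman(sample_sizes, effect_sizes)
```

A clearly negative value, as in this toy data, is the pattern the abstract treats as a possible sign of publication bias. A value near zero would be expected if publication did not depend on the findings.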

6.
Summary. The present article deals with informative missing (IM) exposure data in matched case–control studies. When the missingness mechanism depends on the unobserved exposure values, modeling the missing data mechanism is inevitable. Therefore, a full likelihood-based approach for handling IM data has been proposed by positing a model for selection probability, and a parametric model for the partially missing exposure variable among the control population along with a disease risk model. We develop an EM algorithm to estimate the model parameters. Three special cases: (a) binary exposure variable, (b) normally distributed exposure variable, and (c) lognormally distributed exposure variable are discussed in detail. The method is illustrated by analyzing a real matched case–control data set with a missing exposure variable. The performance of the proposed method is evaluated through simulation studies, and the robustness of the proposed method for violation of different types of model assumptions has been considered.

7.
The present article deals with informative missing (IM) exposure data in matched case-control studies. When the missingness mechanism depends on the unobserved exposure values, modeling the missing data mechanism is inevitable. Therefore, a full likelihood-based approach for handling IM data has been proposed by positing a model for selection probability, and a parametric model for the partially missing exposure variable among the control population along with a disease risk model. We develop an EM algorithm to estimate the model parameters. Three special cases: (a) binary exposure variable, (b) normally distributed exposure variable, and (c) lognormally distributed exposure variable are discussed in detail. The method is illustrated by analyzing a real matched case-control data set with a missing exposure variable. The performance of the proposed method is evaluated through simulation studies, and the robustness of the proposed method for violation of different types of model assumptions has been considered.

8.
Marginal structural models (MSMs) have been proposed for estimating a treatment's effect in the presence of time‐dependent confounding. We aimed to evaluate the performance of the Cox MSM in the presence of missing data and to explore methods to adjust for missingness. We simulated data with a continuous time‐dependent confounder and a binary treatment. We explored two classes of missing data: (i) missed visits, which resemble clinical cohort studies; (ii) missing confounder values, which correspond to interval cohort studies. Missing data were generated under various mechanisms. In the first class, the source of the bias was the extreme treatment weights. Truncation or normalization improved estimation. Therefore, particular attention must be paid to the distribution of weights, and truncation or normalization should be applied if extreme weights are noticed. In the second class, bias was due to the misspecification of the treatment model. Last observation carried forward (LOCF), multiple imputation (MI), and inverse probability of missingness weighting (IPMW) were used to correct for the missingness. We found that alternatives, especially the IPMW method, perform better than the classic LOCF method. Nevertheless, in situations with high marker variance and rarely recorded measurements, none of the examined methods adequately corrected the bias.
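The weight truncation and normalization mentioned in this abstract might look roughly like the sketch below. The abstract does not say how either step was carried out, so everything here is an assumption: the function names, the crude index-based quantile rule, and the reading of "normalization" as rescaling the weights to have mean 1.

```python
from typing import List, Sequence


def truncate_weights(
    weights: Sequence[float], lower_q: float = 0.01, upper_q: float = 0.99
) -> List[float]:
    """Clip weights to the [lower_q, upper_q] empirical quantile range.

    Quantiles are taken by simple index lookup on the sorted weights,
    a deliberately crude rule chosen to keep the sketch short.
    """
    ordered = sorted(weights)
    n = len(ordered)
    low = ordered[int(lower_q * (n - 1))]
    high = ordered[int(upper_q * (n - 1))]
    return [min(max(w, low), high) for w in weights]


def normalize_weights(weights: Sequence[float]) -> List[float]:
    """Rescale weights so that their mean equals 1 (assumed meaning)."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]


# Made-up treatment weights with one extreme value.
raw = [0.5, 1.0, 1.2, 0.9, 50.0]
truncated = truncate_weights(raw, lower_q=0.0, upper_q=0.75)
normalized = normalize_weights(truncated)
```

In this toy example the extreme weight of 50.0 is pulled down to 1.2, so a single observation no longer dominates the weighted analysis. The cut-offs are arbitrary and would need to be chosen by inspecting the actual weight distribution.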

9.
Summary. Longitudinal studies often generate incomplete response patterns according to a missing not at random mechanism. Shared parameter models provide an appealing framework for the joint modelling of the measurement and missingness processes, especially in the nonmonotone missingness case, and assume a set of random effects to induce the interdependence. Parametric assumptions are typically made for the random effects distribution, violation of which leads to model misspecification with a potential effect on the parameter estimates and standard errors. In this article we avoid any parametric assumption for the random effects distribution and leave it completely unspecified. The estimation of the model is then made using a semi-parametric maximum likelihood method. Our proposal is illustrated on a randomized longitudinal study on patients with rheumatoid arthritis exhibiting nonmonotone missingness.

10.
Analysis with time-to-event data in clinical and epidemiological studies often encounters missing covariate values, and the missing at random assumption is commonly adopted, which assumes that missingness depends on the observed data, including the observed outcome, which is the minimum of survival and censoring time. However, it is conceivable that in certain settings, missingness of covariate values is related to the survival time but not to the censoring time. This is especially so when covariate missingness is related to an unmeasured variable affected by the patient's illness and prognosis factors at baseline. If this is the case, then the covariate missingness is not at random as the survival time is censored, and it creates a challenge in data analysis. In this article, we propose an approach to deal with such survival-time-dependent covariate missingness based on the well-known Cox proportional hazards model. Our method is based on inverse propensity weighting with the propensity estimated by nonparametric kernel regression. Our estimators are consistent and asymptotically normal, and their finite-sample performance is examined through simulation. An application to a real-data example is included for illustration.

11.
Toledano AY, Gatsonis C. Biometrics 1999, 55(2): 488-496
We propose methods for regression analysis of repeatedly measured ordinal categorical data when there is nonmonotone missingness in these responses and when a key covariate is missing depending on observables. The methods use ordinal regression models in conjunction with generalized estimating equations (GEEs). We extend the GEE methodology to accommodate arbitrary patterns of missingness in the responses when this missingness is independent of the unobserved responses. We further extend the methodology to provide correction for possible bias when missingness in knowledge of a key covariate may depend on observables. The approach is illustrated with the analysis of data from a study in diagnostic oncology in which multiple correlated receiver operating characteristic curves are estimated and corrected for possible verification bias when the true disease status is missing depending on observables.

12.
Cho Paik M. Biometrics 2004, 60(2): 306-314
Matched case-control data analysis is often challenged by a missing covariate problem, the mishandling of which could cause bias or inefficiency. Satten and Carroll (2000, Biometrics 56, 384-388) and other authors have proposed methods to handle missing covariates when the probability of missingness depends on the observed data, i.e., when data are missing at random. In this article, we propose a conditional likelihood method to handle the case when the probability of missingness depends on the unobserved covariate, i.e., when data are nonignorably missing. When the missing covariate is binary, the proposed method can be implemented using standard software. Using the Northern Manhattan Stroke Study data, we illustrate the method and discuss how sensitivity analysis can be conducted.

13.
Liu M, Taylor JM, Belin TR. Biometrics 2000, 56(4): 1157-1163
This paper outlines a multiple imputation method for handling missing data in designed longitudinal studies. A random coefficients model is developed to accommodate incomplete multivariate continuous longitudinal data. Multivariate repeated measures are jointly modeled; specifically, an i.i.d. normal model is assumed for time-independent variables and a hierarchical random coefficients model is assumed for time-dependent variables in a regression model conditional on the time-independent variables and time, with heterogeneous error variances across variables and time points. Gibbs sampling is used to draw model parameters and for imputations of missing observations. An application to data from a study of startle reactions illustrates the model. A simulation study compares the multiple imputation procedure to the weighting approach of Robins, Rotnitzky, and Zhao (1995, Journal of the American Statistical Association 90, 106-121) that can be used to address similar data structures.

14.
We consider the effect of informative missingness on association tests that use parental genotypes as controls and that allow for missing parental data. Parental data can be informatively missing when the probability of a parent being available for study is related to that parent's genotype; when this occurs, the distribution of genotypes among observed parents is not representative of the distribution of genotypes among the missing parents. Many previously proposed procedures that allow for missing parental data assume that these distributions are the same. We propose association tests that behave well when parental data are informatively missing, under the assumption that, for a given trio of paternal, maternal, and affected offspring genotypes, the genotypes of the parents and the sex of the missing parents, but not the genotype of the affected offspring, can affect parental missingness. (This same assumption is required for validity of an analysis that ignores incomplete parent-offspring trios.) We use simulations to compare our approach with previously proposed procedures, and we show that if even small amounts of informative missingness are not taken into account, they can have large, deleterious effects on the performance of tests.

15.
Abstract. This study shows how a Gibbs sampling approach can be used for Bayesian inference of inbreeding depression. The method presented is mainly concerned with organisms that can be both selfed and outcrossed. Tests performed on simulated data with unequal variances and missing observations show that the method works well. Real data from the plant Scabiosa canescens are also analyzed.

16.
Summary. In individually matched case–control studies, when some covariates are incomplete, an analysis based on the complete data may result in a large loss of information both in the missing and completely observed variables. This usually results in a bias and loss of efficiency. In this article, we propose a new method for handling the problem of missing covariate data based on a missing‐data‐induced intensity approach when the missingness mechanism does not depend on case–control status and show that this leads to a generalization of the missing indicator method. We derive the asymptotic properties of the estimates from the proposed method and, using an extensive simulation study, assess the finite sample performance in terms of bias, efficiency, and 95% confidence coverage under several missing data scenarios. We also make comparisons with complete‐case analysis (CCA) and some missing data methods that have been proposed previously. Our results indicate that, under the assumption of predictable missingness, the suggested method provides valid estimation of parameters, is more efficient than CCA, and is competitive with other, more complex methods of analysis. A case–control study of multiple myeloma risk and a polymorphism in the receptor Inter‐Leukin‐6 (IL‐6‐α) is used to illustrate our findings.

17.
For regression with covariates missing not at random where the missingness depends on the missing covariate values, complete-case (CC) analysis leads to consistent estimation when the missingness is independent of the response given all covariates, but it may not have the desired level of efficiency. We propose a general empirical likelihood framework to improve estimation efficiency over the CC analysis. We expand on methods in Bartlett et al. (2014, Biostatistics 15, 719–730) and Xie and Zhang (2017, Int J Biostat 13, 1–20) that improve efficiency by modeling the missingness probability conditional on the response and fully observed covariates by allowing the possibility of modeling other data distribution-related quantities. We also give guidelines on what quantities to model and demonstrate that our proposal has the potential to yield smaller biases than existing methods when the missingness probability model is incorrect. Simulation studies are presented, as well as an application to data collected from the US National Health and Nutrition Examination Survey.

18.
Albert PS, Follmann DA, Wang SA, Suh EB. Biometrics 2002, 58(3): 631-642
Longitudinal clinical trials often collect long sequences of binary data. Our application is a recent clinical trial in opiate addicts that examined the effect of a new treatment on repeated binary urine tests to assess opiate use over an extended follow-up. The dataset had two sources of missingness: dropout and intermittent missing observations. The primary endpoint of the study was comparing the marginal probability of a positive urine test over follow-up across treatment arms. We present a latent autoregressive model for longitudinal binary data subject to informative missingness. In this model, a Gaussian autoregressive process is shared between the binary response and missing-data processes, thereby inducing informative missingness. Our approach extends the work of others who have developed models that link the various processes through a shared random effect but do not allow for autocorrelation. We discuss parameter estimation using Monte Carlo EM and demonstrate through simulations that incorporating within-subject autocorrelation through a latent autoregressive process can be very important when longitudinal binary data are subject to informative missingness. We illustrate our new methodology using the opiate clinical trial data.

19.
Emerging ecological time series from long-term ecological studies and remote sensing provide excellent opportunities for ecologists to study the dynamic patterns and governing processes of ecological systems. However, signal extraction from long-term time series often requires system learning (e.g., estimation of true system state) to process the large amount of information, to reconstruct system state, to account for measurement error, and to handle missing data. State-space models (SSMs) are a natural choice for these tasks and thus have received increasing attention in ecological and environmental studies. Data-based learning using SSMs that connect ecological processes to the measurement of system state is becoming a useful technique in the ecological informatics toolkit. The present study illustrates the use of the Kalman filter (KF), an estimator of SSMs, with case studies of population dynamics. The examples of the SSM applications include the reconstruction of system state using the KF method and Markov chain Monte Carlo (MCMC) methods, estimation of measurement-error variances in the estimates of animal population abundance using basic structural models (BSMs), and estimation of missing values using the KF and Kalman smoother. Estimation of measurement-error variances by BSMs does not require knowledge of the functional form that generates the time series data. Instead, BSMs approximate the trajectory or deterministic skeleton of a system's dynamics in a semi-parametric fashion, and provide a robust estimator of measurement-error variances. The present study also compares Bayesian SSMs with non-Bayesian SSMs. The joint use of the KF method or its extensions and MCMC methods is a promising approach to the parameter estimation of SSMs.
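How a Kalman filter copes with missing observations can be shown with a minimal sketch. It assumes the simplest possible state-space model, a scalar random-walk ("local level") state observed with noise, which is not necessarily the model used in the study. The function name, the diffuse starting variance, and the data are all illustrative assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def local_level_filter(
    observations: Sequence[Optional[float]],
    process_var: float,
    obs_var: float,
    init_mean: float = 0.0,
    init_var: float = 1e6,
) -> List[Tuple[float, float]]:
    """Kalman filter for state[t] = state[t-1] + noise, y[t] = state[t] + error.

    Missing observations are given as None. For those time steps only the
    prediction step runs, so the state mean is carried forward and its
    variance grows by `process_var`.
    Returns a list of (filtered mean, filtered variance) per time step.
    """
    mean, var = init_mean, init_var
    filtered = []
    for y in observations:
        # Prediction step: random-walk state, so only the variance changes.
        var = var + process_var
        if y is not None:
            # Update step: blend prediction and observation via the gain.
            gain = var / (var + obs_var)
            mean = mean + gain * (y - mean)
            var = (1.0 - gain) * var
        filtered.append((mean, var))
    return filtered


# Made-up log-abundance series with one missing survey year.
series = [2.0, None, 2.4, 2.1]
estimates = local_level_filter(series, process_var=0.1, obs_var=1.0)
```

At the missing time step the filtered mean stays where it was and the uncertainty widens, which is how the filter supplies an estimate for a gap in the series. A Kalman smoother, as mentioned in the abstract, would additionally use later observations to refine that estimate.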

20.
Huang Y, Dagne G. Biometrics 2012, 68(3): 943-953
Summary. It is a common practice to analyze complex longitudinal data using semiparametric nonlinear mixed-effects (SNLME) models with a normal distribution. Normality assumption of model errors may unrealistically obscure important features of subject variations. To partially explain between- and within-subject variations, covariates are usually introduced in such models, but some covariates may often be measured with substantial errors. Moreover, the responses may be missing and the missingness may be nonignorable. Inferential procedures can be complicated dramatically when data with skewness, missing values, and measurement error are observed. In the literature, there has been considerable interest in accommodating either skewness, incompleteness or covariate measurement error in such models, but there has been relatively little study concerning all three features simultaneously. In this article, our objective is to address the simultaneous impact of skewness, missingness, and covariate measurement error by jointly modeling the response and covariate processes based on a flexible Bayesian SNLME model. The method is illustrated using a real AIDS data set to compare potential models with various scenarios and different distribution specifications.
