首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
When novel scientific questions arise after longitudinal binary data have been collected, the subsequent selection of subjects from the cohort for whom further detailed assessment will be undertaken is often necessary to efficiently collect new information. Key examples of additional data collection include retrospective questionnaire data, novel data linkage, or evaluation of stored biological specimens. In such cases, all data required for the new analyses are available except for the new target predictor or exposure. We propose a class of longitudinal outcome-dependent sampling schemes and detail a design corrected conditional maximum likelihood analysis for highly efficient estimation of time-varying and time-invariant covariate coefficients when resource limitations prohibit exposure ascertainment on all participants. Additionally, we detail an important study planning phase that exploits available cohort data to proactively examine the feasibility of any proposed substudy as well as to inform decisions regarding the most desirable study design. The proposed designs and associated analyses are discussed in the context of a study that seeks to examine the modifying effect of an interleukin-10 cytokine single nucleotide polymorphism on asthma symptom regression in adolescents participating Childhood Asthma Management Program Continuation Study. Using this example we assume that all data necessary to conduct the study are available except subject-specific genotype data. We also assume that these data would be ascertained by analyzing stored blood samples, the cost of which limits the sample size.  相似文献   

2.
Ekholm A  McDonald JW  Smith PW 《Biometrics》2000,56(3):712-718
Models for a multivariate binary response are parameterized by univariate marginal probabilities and dependence ratios of all orders. The w-order dependence ratio is the joint success probability of w binary responses divided by the joint success probability assuming independence. This parameterization supports likelihood-based inference for both regression parameters, relating marginal probabilities to explanatory variables, and association model parameters, relating dependence ratios to simple and meaningful mechanisms. Five types of association models are proposed, where responses are (1) independent given a necessary factor for the possibility of a success, (2) independent given a latent binary factor, (3) independent given a latent beta distributed variable, (4) follow a Markov chain, and (5) follow one of two first-order Markov chains depending on the realization of a binary latent factor. These models are illustrated by reanalyzing three data sets, foremost a set of binary time series on auranofin therapy against arthritis. Likelihood-based approaches are contrasted with approaches based on generalized estimating equations. Association models specified by dependence ratios are contrasted with other models for a multivariate binary response that are specified by odds ratios or correlation coefficients.  相似文献   

3.
This paper proposes a method for modeling longitudinal binary data when nonresponse depends on unobserved responses. The proposed method presumes that the target of inference is the marginal distribution of the response at each occasion and its dependence on covariates, and can accommodate both monotone and non-monotone missingness. The approach involves a marginally specified pattern-mixture model that directly parameterizes both the marginal means at each occasion and the dependence of each response on indicators of nonresponse pattern. This formulation readily incorporates a variety of nonresponse processes assumed within a sensitivity analysis. Once identifying restrictions have been made, estimation of model parameters proceeds via solution to a set of modified generalized estimating equations. The proposed method provides an alternative to standard selection and pattern-mixture modeling frameworks, while featuring certain advantages of each. The paper concludes with application of the method to data from a contraceptive clinical trial with substantial dropout.  相似文献   

4.
When two binary responses are measured for each study subject across time, it may be of interest to model how the bivariate associations and marginal univariate risks involving the two responses change across time. To achieve such a goal, marginal models with bivariate log odds ratio and univariate logit components are extended to include random effects for all components. Specifically, separate normal random effects are specified on the log odds ratio scale for bivariate responses and on the logit scale for univariate responses. Assuming conditional independence given the random effects facilitates the modeling of bivariate associations across time with missing at random incomplete data. We fit the model to a dataset for which such structures are feasible: a longitudinal randomized trial of a cardiovascular educational program where the responses of interest are change in hypertension and hypercholestemia status. The proposed model is compared to a naive bivariate model that assumes independence between time points and univariate mixed effects logit models.  相似文献   

5.
In longitudinal studies investigators frequently have to assess and address potential biases introduced by missing data. New methods are proposed for modeling longitudinal categorical data with nonignorable dropout using marginalized transition models and shared random effects models. Random effects are introduced for both serial dependence of outcomes and nonignorable missingness. Fisher‐scoring and Quasi–Newton algorithms are developed for parameter estimation. Methods are illustrated with a real dataset.  相似文献   

6.
Longitudinal data usually consist of a number of short time series. A group of subjects or groups of subjects are followed over time and observations are often taken at unequally spaced time points, and may be at different times for different subjects. When the errors and random effects are Gaussian, the likelihood of these unbalanced linear mixed models can be directly calculated, and nonlinear optimization used to obtain maximum likelihood estimates of the fixed regression coefficients and parameters in the variance components. For binary longitudinal data, a two state, non-homogeneous continuous time Markov process approach is used to model serial correlation within subjects. Formulating the model as a continuous time Markov process allows the observations to be equally or unequally spaced. Fixed and time varying covariates can be included in the model, and the continuous time model allows the estimation of the odds ratio for an exposure variable based on the steady state distribution. Exact likelihoods can be calculated. The initial probability distribution on the first observation on each subject is estimated using logistic regression that can involve covariates, and this estimation is embedded in the overall estimation. These models are applied to an intervention study designed to reduce children's sun exposure.  相似文献   

7.
Heagerty PJ 《Biometrics》2002,58(2):342-351
Marginal generalized linear models are now frequently used for the analysis of longitudinal data. Semiparametric inference for marginal models was introduced by Liang and Zeger (1986, Biometrics 73, 13-22). This article develops a general parametric class of serial dependence models that permits likelihood-based marginal regression analysis of binary response data. The methods naturally extend the first-order Markov models of Azzalini (1994, Biometrika 81, 767-775) and prove computationally feasible for long series.  相似文献   

8.
For analyzing longitudinal binary data with nonignorable and nonmonotone missing responses, a full likelihood method is complicated algebraically, and often requires intensive computation, especially when there are many follow-up times. As an alternative, a pseudolikelihood approach has been proposed in the literature under minimal parametric assumptions. This formulation only requires specification of the marginal distributions of the responses and missing data mechanism, and uses an independence working assumption. However, this estimator can be inefficient for estimating both time-varying and time-stationary effects under moderate to strong within-subject associations among repeated responses. In this article, we propose an alternative estimator, based on a bivariate pseudolikelihood, and demonstrate in simulations that the proposed method can be much more efficient than the previous pseudolikelihood obtained under the assumption of independence. We illustrate the method using longitudinal data on CD4 counts from two clinical trials of HIV-infected patients.  相似文献   

9.
We propose and compare two approaches for regression analysis of multilevel binary data when clusters are not necessarily nested: a GEE method that relies on a working independence assumption coupled with a three-step method for obtaining empirical standard errors, and a likelihood-based method implemented using Bayesian computational techniques. Implications of time-varying endogenous covariates are addressed. The methods are illustrated using data from the Breast Cancer Surveillance Consortium to estimate mammography accuracy from a repeatedly screened population.  相似文献   

10.
The choice of an appropriate family of linear models for the analysis of longitudinal data is often a matter of concern for practitioners. To attenuate such difficulties, we discuss some issues that emerge when analyzing this type of data via a practical example involving pretest–posttest longitudinal data. In particular, we consider log‐normal linear mixed models (LNLMM), generalized linear mixed models (GLMM), and models based on generalized estimating equations (GEE). We show how some special features of the data, like a nonconstant coefficient of variation, may be handled in the three approaches and evaluate their performance with respect to the magnitude of standard errors of interpretable and comparable parameters. We also show how different diagnostic tools may be employed to identify outliers and comment on available software. We conclude by noting that the results are similar, but that GEE‐based models may be preferable when the goal is to compare the marginal expected responses.  相似文献   

11.
Wang L  Zhou J  Qu A 《Biometrics》2012,68(2):353-360
We consider the penalized generalized estimating equations (GEEs) for analyzing longitudinal data with high-dimensional covariates, which often arise in microarray experiments and large-scale health studies. Existing high-dimensional regression procedures often assume independent data and rely on the likelihood function. Construction of a feasible joint likelihood function for high-dimensional longitudinal data is challenging, particularly for correlated discrete outcome data. The penalized GEE procedure only requires specifying the first two marginal moments and a working correlation structure. We establish the asymptotic theory in a high-dimensional framework where the number of covariates p(n) increases as the number of clusters n increases, and p(n) can reach the same order as n. One important feature of the new procedure is that the consistency of model selection holds even if the working correlation structure is misspecified. We evaluate the performance of the proposed method using Monte Carlo simulations and demonstrate its application using a yeast cell-cycle gene expression data set.  相似文献   

12.
13.
This paper highlights the consequences of incomplete observations in the analysis of longitudinal binary data, in particular non-monotone missing data patterns. Sensitivity analysis is advocated and a method is proposed based on a log-linear model. A sensitivity parameter that represents the relationship between the response mechanism and the missing data mechanism is introduced. It is shown that although this parameter is identifiable, its estimation is highly questionable. A far better approach is to consider a range of plausible values and to estimate the parameters of interest conditionally upon each value of the sensitivity parameter. This allows us to assess the sensitivity of study's conclusion to assumptions regarding the missing data mechanism. The method is applied to a randomized clinical trial comparing the efficacy of two treatment regimens in patients with persistent asthma.  相似文献   

14.
Yi GY  He W 《Biometrics》2009,65(2):618-625
Summary .  Recently, median regression models have received increasing attention. When continuous responses follow a distribution that is quite different from a normal distribution, usual mean regression models may fail to produce efficient estimators whereas median regression models may perform satisfactorily. In this article, we discuss using median regression models to deal with longitudinal data with dropouts. Weighted estimating equations are proposed to estimate the median regression parameters for incomplete longitudinal data, where the weights are determined by modeling the dropout process. Consistency and the asymptotic distribution of the resultant estimators are established. The proposed method is used to analyze a longitudinal data set arising from a controlled trial of HIV disease ( Volberding et al., 1990 , The New England Journal of Medicine 322, 941–949). Simulation studies are conducted to assess the performance of the proposed method under various situations. An extension to estimation of the association parameters is outlined.  相似文献   

15.
Marginal models for longitudinal continuous proportional data   总被引:5,自引:0,他引:5  
Song PX  Tan M 《Biometrics》2000,56(2):496-502
Summary. Continuous proportional data arise when the response of interest is a percentage between zero and one, e.g., the percentage of decrease in renal function at different follow‐up times from the baseline. In this paper, we propose methods to directly model the marginal means of the longitudinal proportional responses using the simplex distribution of Barndorff‐Nielsen and Jørgensen that takes into account the fact that such responses are percentages restricted between zero and one and may as well have large dispersion. Parameters in such a marginal model are estimated using an extended version of the generalized estimating equations where the score vector is a nonlinear function of the observed response. The method is illustrated with an ophthalmology study on the use of intraocular gas in retinal repair surgeries.  相似文献   

16.
This article presents a likelihood-based method for handling nonignorable dropout in longitudinal studies with binary responses. The methodology developed is appropriate when the target of inference is the marginal distribution of the response at each occasion and its dependence on covariates. A "hybrid" model is formulated, which is designed to retain advantageous features of the selection and pattern-mixture model approaches. This formulation accommodates a variety of assumed forms of nonignorable dropout, while maintaining transparency of the constraints required for identifying the overall model. Once appropriate identifying constraints have been imposed, likelihood-based estimation is conducted via the EM algorithm. The article concludes by applying the approach to data from a randomized clinical trial comparing two doses of a contraceptive.  相似文献   

17.
Albert PS  Follmann DA  Wang SA  Suh EB 《Biometrics》2002,58(3):631-642
Longitudinal clinical trials often collect long sequences of binary data. Our application is a recent clinical trial in opiate addicts that examined the effect of a new treatment on repeated binary urine tests to assess opiate use over an extended follow-up. The dataset had two sources of missingness: dropout and intermittent missing observations. The primary endpoint of the study was comparing the marginal probability of a positive urine test over follow-up across treatment arms. We present a latent autoregressive model for longitudinal binary data subject to informative missingness. In this model, a Gaussian autoregressive process is shared between the binary response and missing-data processes, thereby inducing informative missingness. Our approach extends the work of others who have developed models that link the various processes through a shared random effect but do not allow for autocorrelation. We discuss parameter estimation using Monte Carlo EM and demonstrate through simulations that incorporating within-subject autocorrelation through a latent autoregressive process can be very important when longitudinal binary data is subject to informative missingness. We illustrate our new methodology using the opiate clinical trial data.  相似文献   

18.
In this paper, we develop a Gaussian estimation (GE) procedure to estimate the parameters of a regression model for correlated (longitudinal) binary response data using a working correlation matrix. A two‐step iterative procedure is proposed for estimating the regression parameters by the GE method and the correlation parameters by the method of moments. Consistency properties of the estimators are discussed. A simulation study was conducted to compare 11 estimators of the regression parameters, namely, four versions of the GE, five versions of the generalized estimating equations (GEEs), and two versions of the weighted GEE. Simulations show that (i) the Gaussian estimates have the smallest mean square error and best coverage probability if the working correlation structure is correctly specified and (ii) when the working correlation structure is correctly specified, the GE and the GEE with exchangeable correlation structure perform best as opposed to when the correlation structure is misspecified.  相似文献   

19.
Albert PS 《Biometrics》2000,56(2):602-608
Binary longitudinal data are often collected in clinical trials when interest is on assessing the effect of a treatment over time. Our application is a recent study of opiate addiction that examined the effect of a new treatment on repeated urine tests to assess opiate use over an extended follow-up. Drug addiction is episodic, and a new treatment may affect various features of the opiate-use process such as the proportion of positive urine tests over follow-up and the time to the first occurrence of a positive test. Complications in this trial were the large amounts of dropout and intermittent missing data and the large number of observations on each subject. We develop a transitional model for longitudinal binary data subject to nonignorable missing data and propose an EM algorithm for parameter estimation. We use the transitional model to derive summary measures of the opiate-use process that can be compared across treatment groups to assess treatment effect. Through analyses and simulations, we show the importance of properly accounting for the missing data mechanism when assessing the treatment effect in our example.  相似文献   

20.
Generalized estimating equations (Liang and Zeger, 1986) is a widely used, moment-based procedure to estimate marginal regression parameters. However, a subtle and often overlooked point is that valid inference requires the mean for the response at time t to be expressed properly as a function of the complete past, present, and future values of any time-varying covariate. For example, with environmental exposures it may be necessary to express the response as a function of multiple lagged values of the covariate series. Despite the fact that multiple lagged covariates may be predictive of outcomes, researchers often focus interest on parameters in a 'cross-sectional' model, where the response is expressed as a function of a single lag in the covariate series. Cross-sectional models yield parameters with simple interpretations and avoid issues of collinearity associated with multiple lagged values of a covariate. Pepe and Anderson (1994), showed that parameter estimates for time-varying covariates may be biased unless the mean, given all past, present, and future covariate values, is equal to the cross-sectional mean or unless independence estimating equations are used. Although working independence avoids potential bias, many authors have shown that a poor choice for the response correlation model can lead to highly inefficient parameter estimates. The purpose of this paper is to study the bias-efficiency trade-off associated with working correlation choices for application with binary response data. We investigate data characteristics or design features (e.g. cluster size, overall response association, functional form of the response association, covariate distribution, and others) that influence the small and large sample characteristics of parameter estimates obtained from several different weighting schemes or equivalently 'working' covariance models. We find that the impact of covariance model choice depends highly on the specific structure of the data features, and that key aspects should be examined before choosing a weighting scheme.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号