Similar literature
Found 20 similar documents (search time: 156 ms)
1.
Researchers interested in the association of a predictor with an outcome will often collect information about that predictor from more than one source. Standard multiple regression methods allow estimation of the effect of each predictor on the outcome while controlling for the remaining predictors. The resulting regression coefficient for each predictor has an interpretation that is conditional on all other predictors. In settings in which interest is in comparison of the marginal pairwise relationships between each predictor and the outcome separately (e.g., studies in psychiatry with multiple informants or comparison of the predictive values of diagnostic tests), standard regression methods are not appropriate. Instead, the generalized estimating equations (GEE) approach can be used to simultaneously estimate, and make comparisons among, the separate pairwise marginal associations. In this paper, we consider maximum likelihood (ML) estimation of these marginal relationships when the outcome is binary. ML enjoys benefits over GEE methods in that it is asymptotically efficient, can accommodate missing data that are ignorable, and allows likelihood-based inferences about the pairwise marginal relationships. We also explore the asymptotic relative efficiency of ML and GEE methods in this setting.

2.
In many observational studies, individuals are measured repeatedly over time, although not necessarily at a set of prespecified occasions. Instead, individuals may be measured at irregular intervals, with those having a history of poorer health outcomes being measured with somewhat greater frequency and regularity; i.e., those individuals with poorer health outcomes may have more frequent follow-up measurements and the intervals between their repeated measurements may be shorter. In this article, we consider estimation of regression parameters in models for longitudinal data where the follow-up times are not fixed by design but can depend on previous outcomes. In particular, we focus on general linear models for longitudinal data where the repeated measures are assumed to have a multivariate Gaussian distribution. We consider assumptions regarding the follow-up time process that result in the likelihood function separating into two components: one for the follow-up time process, the other for the outcome process. The practical implication of this separation is that the former process can be ignored when making likelihood-based inferences about the latter; i.e., maximum likelihood (ML) estimation of the regression parameters relating the mean of the longitudinal outcomes to covariates does not require that a model for the distribution of follow-up times be specified. As a result, standard statistical software, e.g., SAS PROC MIXED (Littell et al., 1996, SAS System for Mixed Models), can be used to analyze the data. However, we also demonstrate that misspecification of the model for the covariance among the repeated measures will, in general, result in regression parameter estimates that are biased. Furthermore, results of a simulation study indicate that the potential bias due to misspecification of the covariance can be quite considerable in this setting. Finally, we illustrate these results using data from a longitudinal observational study (Lipshultz et al., 1995, New England Journal of Medicine 332, 1738-1743) that explored the cardiotoxic effects of doxorubicin chemotherapy for the treatment of acute lymphoblastic leukemia in children.

3.
Association Models for Clustered Data with Binary and Continuous Responses   Total citations: 1, self-citations: 0, citations by others: 1
Summary. We consider analysis of clustered data with mixed bivariate responses, i.e., where each member of the cluster has a binary and a continuous outcome. We propose a new bivariate random effects model that induces associations among the binary outcomes within a cluster, among the continuous outcomes within a cluster, between a binary outcome and a continuous outcome from different subjects within a cluster, as well as the direct association between the binary and continuous outcomes within the same subject. For ease of interpretation of the regression effects, the marginal model of the binary response probability integrated over the random effects preserves the logistic form and the marginal expectation of the continuous response preserves the linear form. We implement maximum likelihood estimation of our model parameters using standard software such as PROC NLMIXED in SAS. Our simulation study demonstrates the robustness of our method with respect to the misspecification of the regression model as well as the random effects model. We illustrate our methodology by analyzing a developmental toxicity study of ethylene glycol in mice.

4.
O'Brien SM, Dunson DB. Biometrics 2004, 60(3): 739-746
Bayesian analyses of multivariate binary or categorical outcomes typically rely on probit or mixed effects logistic regression models that do not have a marginal logistic structure for the individual outcomes. In addition, difficulties arise when simple noninformative priors are chosen for the covariance parameters. Motivated by these problems, we propose a new type of multivariate logistic distribution that can be used to construct a likelihood for multivariate logistic regression analysis of binary and categorical data. The model for individual outcomes has a marginal logistic structure, simplifying interpretation. We follow a Bayesian approach to estimation and inference, developing an efficient data augmentation algorithm for posterior computation. The method is illustrated with application to a neurotoxicology study.

5.
This article presents a likelihood-based method for handling nonignorable dropout in longitudinal studies with binary responses. The methodology developed is appropriate when the target of inference is the marginal distribution of the response at each occasion and its dependence on covariates. A "hybrid" model is formulated, which is designed to retain advantageous features of the selection and pattern-mixture model approaches. This formulation accommodates a variety of assumed forms of nonignorable dropout, while maintaining transparency of the constraints required for identifying the overall model. Once appropriate identifying constraints have been imposed, likelihood-based estimation is conducted via the EM algorithm. The article concludes by applying the approach to data from a randomized clinical trial comparing two doses of a contraceptive.

6.
In longitudinal studies, measurements of the same individuals are taken repeatedly through time. Often, the primary goal is to characterize the change in response over time and the factors that influence change. Factors can affect not only the location but also more generally the shape of the distribution of the response over time. To make inference about the shape of a population distribution, the widely popular mixed-effects regression, for example, would be inadequate, if the distribution is not approximately Gaussian. We propose a novel linear model for quantile regression (QR) that includes random effects in order to account for the dependence between serial observations on the same subject. The notion of QR is synonymous with robust analysis of the conditional distribution of the response variable. We present a likelihood-based approach to the estimation of the regression quantiles that uses the asymmetric Laplace density. In a simulation study, the proposed method had an advantage in terms of mean squared error of the QR estimator, when compared with the approach that considers penalized fixed effects. Following our strategy, a nearly optimal degree of shrinkage of the individual effects is automatically selected by the data and their likelihood. Also, our model appears to be a robust alternative to the mean regression with random effects when the location parameter of the conditional distribution of the response is of interest. We apply our model to a real data set which consists of self-reported amount of labor pain measurements taken on women repeatedly over time, whose distribution is characterized by skewness, and the significance of the parameters is evaluated by the likelihood ratio statistic.
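The connection between the asymmetric Laplace density and quantile estimation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard parameterization f(y) = τ(1 − τ)/σ · exp(−ρ_τ((y − μ)/σ)), where ρ_τ(u) = u(τ − 1{u < 0}) is the check function. The function names are hypothetical, and the random effects of the proposed model are left out, so this is not the authors' estimator, only the fixed-location special case.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def ald_logpdf(y, mu, sigma, tau):
    """Log-density of the asymmetric Laplace distribution (assumed parameterization)."""
    z = (np.asarray(y, dtype=float) - mu) / sigma
    return np.log(tau * (1 - tau) / sigma) - check_loss(z, tau)

def ald_location_mle(y, tau):
    """For fixed sigma, maximizing the ALD likelihood in mu is the same as
    minimizing the total check loss. A minimizer always exists among the
    observed values, so a search over the data points is sufficient."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - m, tau).sum() for m in y]
    return float(y[int(np.argmin(losses))])

y = list(range(1, 12))                 # toy data: 1, 2, ..., 11
median_hat = ald_location_mle(y, 0.5)  # tau = 0.5 targets the median
q25_hat = ald_location_mle(y, 0.25)    # tau = 0.25 targets the lower quartile
```

With τ = 0.5 the check loss is half the absolute error, so the maximizer of the likelihood in μ is a sample median; other values of τ target other quantiles in the same way.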

7.
For analyzing longitudinal binary data with nonignorable and nonmonotone missing responses, a full likelihood method is complicated algebraically, and often requires intensive computation, especially when there are many follow-up times. As an alternative, a pseudolikelihood approach has been proposed in the literature under minimal parametric assumptions. This formulation only requires specification of the marginal distributions of the responses and missing data mechanism, and uses an independence working assumption. However, this estimator can be inefficient for estimating both time-varying and time-stationary effects under moderate to strong within-subject associations among repeated responses. In this article, we propose an alternative estimator, based on a bivariate pseudolikelihood, and demonstrate in simulations that the proposed method can be much more efficient than the previous pseudolikelihood obtained under the assumption of independence. We illustrate the method using longitudinal data on CD4 counts from two clinical trials of HIV-infected patients.

8.
For observational longitudinal studies of geriatric populations, outcomes such as disability or cognitive functioning are often censored by death. Statistical analysis of such data may explicitly condition on either vital status or survival time when summarizing the longitudinal response. For example, a pattern-mixture model characterizes the mean response at time t conditional on death at time S = s (for s > t), and thus uses future status as a predictor for the time t response. As an alternative, we define regression conditioning on being alive as a regression model that conditions on survival status, rather than a specific survival time. Such models may be referred to as partly conditional since the mean at time t is specified conditional on being alive (S > t), rather than using finer stratification (S = s for s > t). We show that naive use of standard likelihood-based longitudinal methods and generalized estimating equations with non-independence weights may lead to biased estimation of the partly conditional mean model. We develop a taxonomy for accommodation of both dropout and death, and describe estimation for binary longitudinal data that applies selection weights to estimating equations with independence working correlation. Simulation studies and an analysis of monthly disability status illustrate potential bias in regression methods that do not explicitly condition on survival.
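The relation between the two conditioning schemes contrasted in this abstract can be written out using the abstract's own symbols Y(t) and S. The identity below is simply the law of total expectation and assumes, for illustration only, that death times take discrete values; it is not taken from the article itself.

```latex
\mathrm{E}\bigl[\,Y(t) \mid S > t\,\bigr]
  \;=\; \sum_{s > t} \mathrm{E}\bigl[\,Y(t) \mid S = s\,\bigr]\,
        \Pr\bigl(S = s \mid S > t\bigr)
```

Under this assumption the partly conditional mean on the left is an average of the pattern-mixture means on the right, weighted by the distribution of death times among those still alive at time t.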

9.
Wang Z, Louis TA. Biometrics 2004, 60(4): 884-891
Marginal models and conditional mixed-effects models are commonly used for clustered binary data. However, regression parameters and predictions in nonlinear mixed-effects models usually do not have a direct marginal interpretation, because the conditional functional form does not carry over to the margin. Because both marginal and conditional inferences are of interest, a unified approach is attractive. To this end, we investigate a parameterization of generalized linear mixed models with a structured random-intercept distribution that matches the conditional and marginal shapes. We model the marginal mean of response distribution and select the distribution of the random intercept to produce the match and also to model covariate-dependent random effects. We discuss the relation between this approach and some existing models and compare the approaches on two datasets.

10.
Summary. In this article, we propose a positive stable shared frailty Cox model for clustered failure time data where the frailty distribution varies with cluster‐level covariates. The proposed model accounts for covariate‐dependent intracluster correlation and permits both conditional and marginal inferences. We obtain marginal inference directly from a marginal model, then use a stratified Cox‐type pseudo‐partial likelihood approach to estimate the regression coefficient for the frailty parameter. The proposed estimators are consistent and asymptotically normal and a consistent estimator of the covariance matrix is provided. Simulation studies show that the proposed estimation procedure is appropriate for practical use with a realistic number of clusters. Finally, we present an application of the proposed method to kidney transplantation data from the Scientific Registry of Transplant Recipients.

11.
Longitudinal data usually consist of a number of short time series. A group of subjects or groups of subjects are followed over time and observations are often taken at unequally spaced time points, and may be at different times for different subjects. When the errors and random effects are Gaussian, the likelihood of these unbalanced linear mixed models can be directly calculated, and nonlinear optimization used to obtain maximum likelihood estimates of the fixed regression coefficients and parameters in the variance components. For binary longitudinal data, a two state, non-homogeneous continuous time Markov process approach is used to model serial correlation within subjects. Formulating the model as a continuous time Markov process allows the observations to be equally or unequally spaced. Fixed and time varying covariates can be included in the model, and the continuous time model allows the estimation of the odds ratio for an exposure variable based on the steady state distribution. Exact likelihoods can be calculated. The initial probability distribution on the first observation on each subject is estimated using logistic regression that can involve covariates, and this estimation is embedded in the overall estimation. These models are applied to an intervention study designed to reduce children's sun exposure.

12.
We consider the estimation of a nonparametric smooth function of some event time in a semiparametric mixed effects model from repeatedly measured data when the event time is subject to right censoring. The within-subject correlation is captured by both cross-sectional and time-dependent random effects, where the latter is modeled by a nonhomogeneous Ornstein–Uhlenbeck stochastic process. When the censoring probability depends on other variables in the model, which often happens in practice, the event time data are not missing completely at random. Hence, the complete case analysis by eliminating all the censored observations may yield biased estimates of the regression parameters including the smooth function of the event time, and is less efficient. To remedy this, we derive the likelihood function for the observed data by modeling the event time distribution given other covariates. We propose a two-stage pseudo-likelihood approach for the estimation of model parameters by first plugging an estimator of the conditional event time distribution into the likelihood and then maximizing the resulting pseudo-likelihood function. Empirical evaluation shows that the proposed method yields negligible bias while significantly reducing the estimation variability. This research is motivated by the project of hormone profile estimation around age at the final menstrual period for the cohort of women in the Michigan Bone Health and Metabolism Study.

13.
14.
Capturing complex dependence structures between outcome variables (e.g., study endpoints) is of high relevance in contemporary biomedical data problems and medical research. Distributional copula regression provides a flexible tool to model the joint distribution of multiple outcome variables by disentangling the marginal response distributions and their dependence structure. In a regression setup, each parameter of the copula model, that is, the marginal distribution parameters and the copula dependence parameters, can be related to covariates via structured additive predictors. We propose a framework to fit distributional copula regression via model-based boosting, which is a modern estimation technique that incorporates useful features like an intrinsic variable selection mechanism, parameter shrinkage and the capability to fit regression models in high-dimensional data settings, that is, situations with more covariates than observations. Thus, model-based boosting not only complements existing Bayesian and maximum-likelihood based estimation frameworks for this model class but also enables unique intrinsic mechanisms that can be helpful in many applied problems. The performance of our boosting algorithm for copula regression models with continuous margins is evaluated in simulation studies that cover low- and high-dimensional data settings and situations with and without dependence between the responses. Moreover, distributional copula boosting is used to jointly analyze and predict the length and the weight of newborns conditional on sonographic measurements of the fetus before delivery together with other clinical variables.

15.
Dropouts are common in longitudinal studies. If the dropout probability depends on the missing observations at or after dropout, this type of dropout is called informative (or nonignorable) dropout (ID). Failure to accommodate such a dropout mechanism in the model will bias the parameter estimates. We propose a conditional autoregressive model for longitudinal binary data with an ID model such that the probabilities of positive outcomes as well as the dropout indicator at each occasion are logit linear in some covariates and outcomes. This model, which adopts a marginal model for outcomes and a conditional model for dropouts, is called a selection model. To allow for the heterogeneity and clustering effects, the outcome model is extended to incorporate mixture and random effects. Lastly, the model is further extended to a novel model that models the outcome and dropout jointly such that their dependency is formulated through an odds ratio function. Parameters are estimated by a Bayesian approach implemented using the user‐friendly Bayesian software WinBUGS. A methadone clinic dataset is analyzed to illustrate the proposed models. Results show that the treatment time effect is still significant but weaker after allowing for an ID process in the data. Finally, the effect of dropout on parameter estimates is evaluated through simulation studies.

16.
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one‐step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster‐deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts.

17.
Ekholm A, McDonald JW, Smith PW. Biometrics 2000, 56(3): 712-718
Models for a multivariate binary response are parameterized by univariate marginal probabilities and dependence ratios of all orders. The w-order dependence ratio is the joint success probability of w binary responses divided by the joint success probability assuming independence. This parameterization supports likelihood-based inference for both regression parameters, relating marginal probabilities to explanatory variables, and association model parameters, relating dependence ratios to simple and meaningful mechanisms. Five types of association models are proposed, where responses (1) are independent given a necessary factor for the possibility of a success, (2) are independent given a latent binary factor, (3) are independent given a latent beta-distributed variable, (4) follow a Markov chain, or (5) follow one of two first-order Markov chains depending on the realization of a binary latent factor. These models are illustrated by reanalyzing three data sets, foremost a set of binary time series on auranofin therapy against arthritis. Likelihood-based approaches are contrasted with approaches based on generalized estimating equations. Association models specified by dependence ratios are contrasted with other models for a multivariate binary response that are specified by odds ratios or correlation coefficients.
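The definition of the dependence ratio given in this abstract can be made concrete with a small empirical sketch. It computes P(Y_1 = 1, ..., Y_w = 1) divided by the product of the marginal success probabilities, using sample proportions in place of model-based probabilities. The function name and the toy data are hypothetical, and the likelihood-based estimation described in the abstract is not attempted here.

```python
import numpy as np

def dependence_ratio(y, columns):
    """Empirical dependence ratio for the chosen binary response columns:
    joint success proportion divided by the product of marginal proportions."""
    sub = np.asarray(y)[:, list(columns)]
    joint = np.mean(np.all(sub == 1, axis=1))   # proportion with all successes
    independent = np.prod(sub.mean(axis=0))     # joint probability under independence
    return float(joint / independent)

# Rows are subjects, columns are binary responses (toy data).
perfectly_dependent = [[1, 1], [1, 1], [0, 0], [0, 0]]
independent_pair = [[1, 1], [1, 0], [0, 1], [0, 0]]

ratio_dep = dependence_ratio(perfectly_dependent, [0, 1])
ratio_ind = dependence_ratio(independent_pair, [0, 1])
```

A ratio of 1 corresponds to independence, and values above 1 indicate that joint successes occur more often than independence would predict.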

18.
A pseudolikelihood method for analyzing interval censored data   Total citations: 1, self-citations: 0, citations by others: 1
We introduce a method based on a pseudolikelihood ratio for estimating the distribution function of the survival time in a mixed-case interval censoring model. In a mixed-case model, an individual is observed a random number of times, and at each time it is recorded whether an event has happened or not. One seeks to estimate the distribution of time to event. We use a Poisson process as the basis of a likelihood function to construct a pseudolikelihood ratio statistic for testing the value of the distribution function at a fixed point, and show that this converges under the null hypothesis to a known limit distribution, that can be expressed as a functional of different convex minorants of a two-sided Brownian motion process with parabolic drift. Construction of confidence sets then proceeds by standard inversion. The computation of the confidence sets is simple, requiring the use of the pool-adjacent-violators algorithm or a standard isotonic regression algorithm. We also illustrate the superiority of the proposed method over competitors based on resampling techniques or on the limit distribution of the maximum pseudolikelihood estimator, through simulation studies, and illustrate the different methods on a dataset involving time to HIV seroconversion in a group of haemophiliacs.
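The pool-adjacent-violators algorithm named in this abstract can be sketched in a few lines. The version below is a generic weighted least-squares isotonic regression, shown only to illustrate the computational building block; it is not the pseudolikelihood ratio procedure itself, and the function name `pava` is an assumption.

```python
def pava(y, weights=None):
    """Nondecreasing least-squares fit to y via pool-adjacent-violators."""
    n = len(y)
    w = [1.0] * n if weights is None else [float(v) for v in weights]
    blocks = []  # each block is [weighted mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), wi, 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)  # every point in a block gets the block mean
    return fit

fit_a = pava([1, 3, 2, 4])   # the violating pair (3, 2) is pooled
fit_b = pava([3, 2, 1])      # everything pools into one block
```

In an interval-censoring setting the inputs would typically be event indicators ordered by inspection time, so the fitted values form a monotone estimate of a distribution function; that usage is described here only as context, not as a reproduction of the article's estimator.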

19.
We describe a new multivariate gamma distribution and discuss its implication in a Poisson-correlated gamma-frailty model. This model is introduced to account for between-subjects correlation occurring in longitudinal count data. For likelihood-based inference involving distributions in which high-dimensional dependencies are present, it may be useful to approximate likelihoods based on the univariate or bivariate marginal distributions. The merit of composite likelihood is to reduce the computational complexity of the full likelihood. A 2-stage composite-likelihood procedure is developed for estimating the model parameters. The suggested method is applied to a meta-analysis study for survival curves.

20.
An important issue in the phylogenetic analysis of nucleotide sequence data using the maximum likelihood (ML) method is the underlying evolutionary model employed. We consider the problem of simultaneously estimating the tree topology and the parameters in the underlying substitution model and of obtaining estimates of the standard errors of these parameter estimates. Given a fixed tree topology and corresponding set of branch lengths, the ML estimates of standard evolutionary model parameters are asymptotically efficient, in the sense that their joint distribution is asymptotically normal with the variance–covariance matrix given by the inverse of the Fisher information matrix. We propose a new estimate of this conditional variance based on estimation of the expected information using a Monte Carlo sampling (MCS) method. Simulations are used to compare this conditional variance estimate to the standard technique of using the observed information under a variety of experimental conditions. In the case in which one wishes to estimate simultaneously the tree and parameters, we provide a bootstrapping approach that can be used in conjunction with the MCS method to estimate the unconditional standard error. The methods developed are applied to a real data set consisting of 30 papillomavirus sequences. This overall method is easily incorporated into standard bootstrapping procedures to allow for proper variance estimation.
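The idea of estimating the expected (Fisher) information by Monte Carlo sampling, and inverting it to obtain a variance, can be illustrated on a toy model. The sketch below uses a Bernoulli model with success probability p rather than a nucleotide substitution model, so it shows only the general principle and not the authors' MCS method; all function names, the seed, and the number of draws are assumptions.

```python
import numpy as np

def score_bernoulli(x, p):
    """Derivative of the Bernoulli log-likelihood with respect to p."""
    return x / p - (1 - x) / (1 - p)

def mc_expected_information(p, n_draws=200_000, seed=0):
    """Monte Carlo estimate of the expected information per observation:
    simulate data under p and average the squared score."""
    rng = np.random.default_rng(seed)
    x = (rng.random(n_draws) < p).astype(float)
    return float(np.mean(score_bernoulli(x, p) ** 2))

p = 0.3
info_mc = mc_expected_information(p)
info_exact = 1.0 / (p * (1 - p))      # closed form for the Bernoulli model
var_mc = 1.0 / info_mc                # asymptotic variance for one observation
```

In this toy case the closed form is available, which makes it easy to check that the simulated average of squared scores approaches the exact information as the number of draws grows.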
