首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
2.
Variable Selection for Semiparametric Mixed Models in Longitudinal Studies   总被引:2,自引:0,他引:2  
Summary .  We propose a double-penalized likelihood approach for simultaneous model selection and estimation in semiparametric mixed models for longitudinal data. Two types of penalties are jointly imposed on the ordinary log-likelihood: the roughness penalty on the nonparametric baseline function and a nonconcave shrinkage penalty on linear coefficients to achieve model sparsity. Compared to existing estimation equation based approaches, our procedure provides valid inference for data with missing at random, and will be more efficient if the specified model is correct. Another advantage of the new procedure is its easy computation for both regression components and variance parameters. We show that the double-penalized problem can be conveniently reformulated into a linear mixed model framework, so that existing software can be directly used to implement our method. For the purpose of model inference, we derive both frequentist and Bayesian variance estimation for estimated parametric and nonparametric components. Simulation is used to evaluate and compare the performance of our method to the existing ones. We then apply the new method to a real data set from a lactation study.  相似文献   

3.
Yu Z  Lin X  Tu W 《Biometrics》2012,68(2):429-436
We consider frailty models with additive semiparametric covariate effects for clustered failure time data. We propose a doubly penalized partial likelihood (DPPL) procedure to estimate the nonparametric functions using smoothing splines. We show that the DPPL estimators could be obtained from fitting an augmented working frailty model with parametric covariate effects, whereas the nonparametric functions being estimated as linear combinations of fixed and random effects, and the smoothing parameters being estimated as extra variance components. This approach allows us to conveniently estimate all model components within a unified frailty model framework. We evaluate the finite sample performance of the proposed method via a simulation study, and apply the method to analyze data from a study of sexually transmitted infections (STI).  相似文献   

4.
Summary High‐dimensional and highly correlated data leading to non‐ or weakly identified effects are commonplace. Maximum likelihood will typically fail in such situations and a variety of shrinkage methods have been proposed. Standard techniques, such as ridge regression or the lasso, shrink estimates toward zero, with some approaches allowing coefficients to be selected out of the model by achieving a value of zero. When substantive information is available, estimates can be shrunk to nonnull values; however, such information may not be available. We propose a Bayesian semiparametric approach that allows shrinkage to multiple locations. Coefficients are given a mixture of heavy‐tailed double exponential priors, with location and scale parameters assigned Dirichlet process hyperpriors to allow groups of coefficients to be shrunk toward the same, possibly nonzero, mean. Our approach favors sparse, but flexible, structure by shrinking toward a small number of random locations. The methods are illustrated using a study of genetic polymorphisms and Parkinson's disease.  相似文献   

5.
Summary In studies involving functional data, it is commonly of interest to model the impact of predictors on the distribution of the curves, allowing flexible effects on not only the mean curve but also the distribution about the mean. Characterizing the curve for each subject as a linear combination of a high‐dimensional set of potential basis functions, we place a sparse latent factor regression model on the basis coefficients. We induce basis selection by choosing a shrinkage prior that allows many of the loadings to be close to zero. The number of latent factors is treated as unknown through a highly‐efficient, adaptive‐blocked Gibbs sampler. Predictors are included on the latent variables level, while allowing different predictors to impact different latent factors. This model induces a framework for functional response regression in which the distribution of the curves is allowed to change flexibly with predictors. The performance is assessed through simulation studies and the methods are applied to data on blood pressure trajectories during pregnancy.  相似文献   

6.
We propose methods for Bayesian inference for a new class of semiparametric survival models with a cure fraction. Specifically, we propose a semiparametric cure rate model with a smoothing parameter that controls the degree of parametricity in the right tail of the survival distribution. We show that such a parameter is crucial for these kinds of models and can have an impact on the posterior estimates. Several novel properties of the proposed model are derived. In addition, we propose a class of improper noninformative priors based on this model and examine the properties of the implied posterior. Also, a class of informative priors based on historical data is proposed and its theoretical properties are investigated. A case study involving a melanoma clinical trial is discussed in detail to demonstrate the proposed methodology.  相似文献   

7.
Biomedical studies often collect multivariate event time data from multiple clusters (either subjects or groups) within each of which event times for individuals are correlated and the correlation may vary in different classes. In such survival analyses, heterogeneity among clusters for shared and specific classes can be accommodated by incorporating parametric frailty terms into the model. In this article, we propose a Bayesian approach to relax the parametric distribution assumption for shared and specific‐class frailties by using a Dirichlet process prior while also allowing for the uncertainty of heterogeneity for different classes. Multiple cluster‐specific frailty selections rely on variable selection‐type mixture priors by applying mixtures of point masses at zero and inverse gamma distributions to the variance of log frailties. This selection allows frailties with zero variance to effectively drop out of the model. A reparameterization of log‐frailty terms is performed to reduce the potential bias of fixed effects due to variation of the random distribution and dependence among the parameters resulting in easy interpretation and faster Markov chain Monte Carlo convergence. Simulated data examples and an application to a lung cancer clinical trial are used for illustration.  相似文献   

8.
Summary : We propose a semiparametric Bayesian method for handling measurement error in nutritional epidemiological data. Our goal is to estimate nonparametrically the form of association between a disease and exposure variable while the true values of the exposure are never observed. Motivated by nutritional epidemiological data, we consider the setting where a surrogate covariate is recorded in the primary data, and a calibration data set contains information on the surrogate variable and repeated measurements of an unbiased instrumental variable of the true exposure. We develop a flexible Bayesian method where not only is the relationship between the disease and exposure variable treated semiparametrically, but also the relationship between the surrogate and the true exposure is modeled semiparametrically. The two nonparametric functions are modeled simultaneously via B‐splines. In addition, we model the distribution of the exposure variable as a Dirichlet process mixture of normal distributions, thus making its modeling essentially nonparametric and placing this work into the context of functional measurement error modeling. We apply our method to the NIH‐AARP Diet and Health Study and examine its performance in a simulation study.  相似文献   

9.
Confidence intervals on the total variance in an unbalanced random two-fold nested design are constructed and compared. Computer simulation indicates the proposed intervals provide confidence coefficients that are generally close to the stated level.  相似文献   

10.
Mukherjee B  Zhang L  Ghosh M  Sinha S 《Biometrics》2007,63(3):834-844
In case-control studies of gene-environment association with disease, when genetic and environmental exposures can be assumed to be independent in the underlying population, one may exploit the independence in order to derive more efficient estimation techniques than the traditional logistic regression analysis (Chatterjee and Carroll, 2005, Biometrika92, 399-418). However, covariates that stratify the population, such as age, ethnicity and alike, could potentially lead to nonindependence. In this article, we provide a novel semiparametric Bayesian approach to model stratification effects under the assumption of gene-environment independence in the control population. We illustrate the methods by applying them to data from a population-based case-control study on ovarian cancer conducted in Israel. A simulation study is conducted to compare our method with other popular choices. The results reflect that the semiparametric Bayesian model allows incorporation of key scientific evidence in the form of a prior and offers a flexible, robust alternative when standard parametric model assumptions do not hold.  相似文献   

11.
Summary In National Toxicology Program (NTP) studies, investigators want to assess whether a test agent is carcinogenic overall and specific to certain tumor types, while estimating the dose‐response profiles. Because there are potentially correlations among the tumors, a joint inference is preferred to separate univariate analyses for each tumor type. In this regard, we propose a random effect logistic model with a matrix of coefficients representing log‐odds ratios for the adjacent dose groups for tumors at different sites. We propose appropriate nonparametric priors for these coefficients to characterize the correlations and to allow borrowing of information across different dose groups and tumor types. Global and local hypotheses can be easily evaluated by summarizing the output of a single Monte Carlo Markov chain (MCMC). Two multiple testing procedures are applied for testing local hypotheses based on the posterior probabilities of local alternatives. Simulation studies are conducted and an NTP tumor data set is analyzed illustrating the proposed approach.  相似文献   

12.
Summary .  We consider semiparametric transition measurement error models for longitudinal data, where one of the covariates is measured with error in transition models, and no distributional assumption is made for the underlying unobserved covariate. An estimating equation approach based on the pseudo conditional score method is proposed. We show the resulting estimators of the regression coefficients are consistent and asymptotically normal. We also discuss the issue of efficiency loss. Simulation studies are conducted to examine the finite-sample performance of our estimators. The longitudinal AIDS Costs and Services Utilization Survey data are analyzed for illustration.  相似文献   

13.
14.
Analysis of longitudinal data with excessive zeros has gained increasing attention in recent years; however, current approaches to the analysis of longitudinal data with excessive zeros have primarily focused on balanced data. Dropouts are common in longitudinal studies; therefore, the analysis of the resulting unbalanced data is complicated by the missing mechanism. Our study is motivated by the analysis of longitudinal skin cancer count data presented by Greenberg, Baron, Stukel, Stevens, Mandel, Spencer, Elias, Lowe, Nierenberg, Bayrd, Vance, Freeman, Clendenning, Kwan, and the Skin Cancer Prevention Study Group[New England Journal of Medicine 323 , 789–795]. The data consist of a large number of zero responses (83% of the observations) as well as a substantial amount of dropout (about 52% of the observations). To account for both excessive zeros and dropout patterns, we propose a pattern‐mixture zero‐inflated model with compound Poisson random effects for the unbalanced longitudinal skin cancer data. We also incorporate an autoregressive of order 1 correlation structure in the model to capture longitudinal correlation of the count responses. A quasi‐likelihood approach has been developed in the estimation of our model. We illustrated the method with analysis of the longitudinal skin cancer data.  相似文献   

15.
Ying Yuan  Guosheng Yin 《Biometrics》2010,66(1):105-114
Summary .  We study quantile regression (QR) for longitudinal measurements with nonignorable intermittent missing data and dropout. Compared to conventional mean regression, quantile regression can characterize the entire conditional distribution of the outcome variable, and is more robust to outliers and misspecification of the error distribution. We account for the within-subject correlation by introducing a   ℓ2   penalty in the usual QR check function to shrink the subject-specific intercepts and slopes toward the common population values. The informative missing data are assumed to be related to the longitudinal outcome process through the shared latent random effects. We assess the performance of the proposed method using simulation studies, and illustrate it with data from a pediatric AIDS clinical trial.  相似文献   

16.
17.
In the case of the mixed linear model the random effects are usually assumed to be normally distributed in both the Bayesian and classical frameworks. In this paper, the Dirichlet process prior was used to provide nonparametric Bayesian estimates for correlated random effects. This goal was achieved by providing a Gibbs sampler algorithm that allows these correlated random effects to have a nonparametric prior distribution. A sampling based method is illustrated. This method which is employed by transforming the genetic covariance matrix to an identity matrix so that the random effects are uncorrelated, is an extension of the theory and the results of previous researchers. Also by using Gibbs sampling and data augmentation a simulation procedure was derived for estimating the precision parameter M associated with the Dirichlet process prior. All needed conditional posterior distributions are given. To illustrate the application, data from the Elsenburg Dormer sheep stud were analysed. A total of 3325 weaning weight records from the progeny of 101 sires were used.  相似文献   

18.
Zhang D 《Biometrics》2004,60(1):8-15
The routinely assumed parametric functional form in the linear predictor of a generalized linear mixed model for longitudinal data may be too restrictive to represent true underlying covariate effects. We relax this assumption by representing these covariate effects by smooth but otherwise arbitrary functions of time, with random effects used to model the correlation induced by among-subject and within-subject variation. Due to the usually intractable integration involved in evaluating the quasi-likelihood function, the double penalized quasi-likelihood (DPQL) approach of Lin and Zhang (1999, Journal of the Royal Statistical Society, Series B61, 381-400) is used to estimate the varying coefficients and the variance components simultaneously by representing a nonparametric function by a linear combination of fixed effects and random effects. A scaled chi-squared test based on the mixed model representation of the proposed model is developed to test whether an underlying varying coefficient is a polynomial of certain degree. We evaluate the performance of the procedures through simulation studies and illustrate their application with Indonesian children infectious disease data.  相似文献   

19.
20.
G. Y. Yi  W. Liu  Lang Wu 《Biometrics》2011,67(1):67-75
Summary Longitudinal data arise frequently in medical studies and it is common practice to analyze such data with generalized linear mixed models. Such models enable us to account for various types of heterogeneity, including between‐ and within‐subjects ones. Inferential procedures complicate dramatically when missing observations or measurement error arise. In the literature, there has been considerable interest in accommodating either incompleteness or covariate measurement error under random effects models. However, there is relatively little work concerning both features simultaneously. There is a need to fill up this gap as longitudinal data do often have both characteristics. In this article, our objectives are to study simultaneous impact of missingness and covariate measurement error on inferential procedures and to develop a valid method that is both computationally feasible and theoretically valid. Simulation studies are conducted to assess the performance of the proposed method, and a real example is analyzed with the proposed method.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号