Nummi T  Pan J  Siren T  Liu K 《Biometrics》2011,67(3):871-875
Summary In most research on smoothing splines the focus has been on estimation, while inference, especially hypothesis testing, has received less attention. By defining design matrices for fixed and random effects and the structure of the covariance matrices of random errors in an appropriate way, the cubic smoothing spline admits a mixed model formulation, which places this nonparametric smoother firmly in a parametric setting. Thus nonlinear curves can be included with random effects and random coefficients. The smoothing parameter is the ratio of the random‐coefficient and error variances and tests for linear regression reduce to tests for zero random‐coefficient variances. We propose an exact F‐test for the situation and investigate its performance in a real pine stem data set and by simulation experiments. Under certain conditions the suggested methods can also be applied when the data are dependent.  相似文献   

The restricted mean survival time (RMST) evaluates the expectation of survival time truncated by a prespecified time point, because the mean survival time in the presence of censoring is typically not estimable. The frequentist inference procedure for RMST has been widely advocated for comparison of two survival curves, while research from the Bayesian perspective is rather limited. For the RMST of both right- and interval-censored data, we propose Bayesian nonparametric estimation and inference procedures. By assigning a mixture of Dirichlet processes (MDP) prior to the distribution function, we can estimate the posterior distribution of RMST. We also explore another Bayesian nonparametric approach using the Dirichlet process mixture model and make comparisons with the frequentist nonparametric method. Simulation studies demonstrate that the Bayesian nonparametric RMST under diffuse MDP priors leads to robust estimation and under informative priors it can incorporate prior knowledge into the nonparametric estimator. Analysis of real trial examples demonstrates the flexibility and interpretability of the Bayesian nonparametric RMST for both right- and interval-censored data.  相似文献   
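
As a rough illustration of the quantity itself, the RMST at a truncation time tau is the area under the survival curve on [0, tau]. The sketch below uses the frequentist Kaplan-Meier plug-in for right-censored data, not the Bayesian MDP procedure proposed in the abstract; the function names `kaplan_meier` and `rmst` are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1 = event, 0 = censored)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Final rectangle from the last event time before tau up to tau.
    area += prev_s * (tau - prev_t)
    return area
```

With no censoring, `rmst([1, 2, 3], [1, 1, 1], 3)` equals the sample mean of min(T, 3); with censoring, the survival curve simply stays flat across censored times.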

Bornkamp B  Ickstadt K 《Biometrics》2009,65(1):198-205
Summary. In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose–response analysis, and provides possibilities to elicit informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose–response analysis.

Prenatal exposure to carcinogenic polycyclic aromatic hydrocarbons (c‐PAHs) through maternal inhalation induces higher risk for a wide range of fetotoxic effects. However, the most health‐relevant dose function from chronic gestational exposure remains unclear. Whether there is a gestational window during which the human embryo/fetus is particularly vulnerable to PAHs has not been examined thoroughly. We consider a longitudinal semiparametric‐mixed effect model to characterize the individual prenatal PAH exposure trajectory, where a nonparametric cyclic smooth function plus a linear function are used to model the time effect and random effects are used to account for the within‐subject correlation. We propose a penalized least squares approach to estimate the parametric regression coefficients and the nonparametric function of time. The smoothing parameter and variance components are selected using the generalized cross‐validation (GCV) criteria. The estimated subject‐specific trajectory of prenatal exposure is linked to the birth outcomes through a set of functional linear models, where the coefficient of log PAH exposure is a fully nonparametric function of gestational age. This allows the effect of PAH exposure on each birth outcome to vary at different gestational ages, and the window associated with significant adverse effect is identified as a vulnerable prenatal window to PAHs on fetal growth. We minimize the penalized sum of squared errors using a spline‐based expansion of the nonparametric coefficient function to draw statistical inferences, and the smoothing parameter is chosen through GCV.  相似文献   

A nationwide health card recording system for dairy cattle was introduced in Norway in 1975 (the Norwegian Cattle Health Services). The database holds information on mastitis occurrences on an individual cow basis. A reduction in mastitis frequency across the population is desired, and for this purpose risk factors are investigated. In this paper a Bayesian proportional hazards model is used for modelling the time to first veterinary treatment of clinical mastitis, including both genetic and environmental covariates. Sire effects were modelled as shared random components, and veterinary district was included as an environmental effect with prior spatial smoothing. A non-informative smoothing prior was assumed for the baseline hazard, and Markov chain Monte Carlo methods (MCMC) were used for inference. We propose a new measure of quality for sires, in terms of their posterior probability of being among the, say, 10% best sires. The probability is an easily interpretable measure that can be directly used to rank sires. Estimating these complex probabilities is straightforward in an MCMC setting. The results indicate considerable differences between sires with regard to their daughters' disease resistance. A regional effect was also discovered with the lowest risk of disease in the south-eastern parts of Norway.
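
The ranking probability described here can be estimated directly from MCMC output: for each posterior draw, rank the sires and record which fall in the best fraction, then average over draws. The sketch below is a minimal illustration under stated assumptions, not the authors' code; the function name `prob_top_fraction` is hypothetical, and it assumes a smaller sire effect means lower mastitis hazard and therefore a better sire.

```python
def prob_top_fraction(draws, fraction=0.10):
    """Estimate each sire's posterior probability of being in the best fraction.

    draws[s][j] is sire j's effect in MCMC draw s.
    Assumption: lower effect = lower hazard = better sire.
    """
    n_sires = len(draws[0])
    k = max(1, int(round(fraction * n_sires)))  # size of the "best" group
    counts = [0] * n_sires
    for draw in draws:
        # Indices of sires ordered from best (smallest effect) to worst.
        ranked = sorted(range(n_sires), key=lambda j: draw[j])
        for j in ranked[:k]:
            counts[j] += 1
    return [c / len(draws) for c in counts]
```

Because the ranking is recomputed within every draw, the resulting probabilities reflect posterior uncertainty in all sire effects jointly rather than ranking point estimates.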

Disease incidence or mortality data are typically available as rates or counts for specified regions, collected over time. We propose Bayesian nonparametric spatial modeling approaches to analyze such data. We develop a hierarchical specification using spatial random effects modeled with a Dirichlet process prior. The Dirichlet process is centered around a multivariate normal distribution. This latter distribution arises from a log-Gaussian process model that provides a latent incidence rate surface, followed by block averaging to the areal units determined by the regions in the study. With regard to the resulting posterior predictive inference, the modeling approach is shown to be equivalent to an approach based on block averaging of a spatial Dirichlet process to obtain a prior probability model for the finite dimensional distribution of the spatial random effects. We introduce a dynamic formulation for the spatial random effects to extend the model to spatio-temporal settings. Posterior inference is implemented through Gibbs sampling. We illustrate the methodology with simulated data as well as with a data set on lung cancer incidences for all 88 counties in the state of Ohio over an observation period of 21 years.  相似文献   

Fully Bayesian methods for Cox models specify a model for the baseline hazard function. Parametric approaches generally provide monotone estimations. Semi‐parametric choices allow for more flexible patterns but they can suffer from overfitting and instability. Regularization methods through prior distributions with correlated structures usually give reasonable answers to these types of situations. We discuss Bayesian regularization for Cox survival models defined via flexible baseline hazards specified by a mixture of piecewise constant functions and by a cubic B‐spline function. For those “semi‐parametric” proposals, different prior scenarios ranging from prior independence to particular correlated structures are discussed in a real study with microvirulence data and in an extensive simulation scenario that includes different data sample and time axis partition sizes in order to capture risk variations. The posterior distribution of the parameters was approximated using Markov chain Monte Carlo methods. Model selection was performed in accordance with the deviance information criteria and the log pseudo‐marginal likelihood. The results obtained reveal that, in general, Cox models present great robustness in covariate effects and survival estimates independent of the baseline hazard specification. In relation to the “semi‐parametric” baseline hazard specification, the B‐splines hazard function is less dependent on the regularization process than the piecewise specification because it demands a smaller time axis partition to estimate a similar behavior of the risk.  相似文献   

We consider testing whether the nonparametric function in a semiparametric additive mixed model is a simple fixed degree polynomial, for example, a simple linear function. This test provides a goodness-of-fit test for checking parametric models against nonparametric models. It is based on the mixed-model representation of the smoothing spline estimator of the nonparametric function and the variance component score test by treating the inverse of the smoothing parameter as an extra variance component. We also consider testing the equivalence of two nonparametric functions in semiparametric additive mixed models for two groups, such as treatment and placebo groups. The proposed tests are applied to data from an epidemiological study and a clinical trial and their performance is evaluated through simulations.  相似文献   

Variable Selection for Semiparametric Mixed Models in Longitudinal Studies   Total citations: 2, self-citations: 0, citations by others: 2
Summary. We propose a double-penalized likelihood approach for simultaneous model selection and estimation in semiparametric mixed models for longitudinal data. Two types of penalties are jointly imposed on the ordinary log-likelihood: the roughness penalty on the nonparametric baseline function and a nonconcave shrinkage penalty on linear coefficients to achieve model sparsity. Compared to existing estimating-equation-based approaches, our procedure provides valid inference for data missing at random, and will be more efficient if the specified model is correct. Another advantage of the new procedure is its easy computation for both regression components and variance parameters. We show that the double-penalized problem can be conveniently reformulated into a linear mixed model framework, so that existing software can be directly used to implement our method. For the purpose of model inference, we derive both frequentist and Bayesian variance estimation for estimated parametric and nonparametric components. Simulation is used to evaluate and compare the performance of our method to the existing ones. We then apply the new method to a real data set from a lactation study.

Morita S  Thall PF  Müller P 《Biometrics》2008,64(2):595-602
Summary .   We present a definition for the effective sample size of a parametric prior distribution in a Bayesian model, and propose methods for computing the effective sample size in a variety of settings. Our approach first constructs a prior chosen to be vague in a suitable sense, and updates this prior to obtain a sequence of posteriors corresponding to each of a range of sample sizes. We then compute a distance between each posterior and the parametric prior, defined in terms of the curvature of the logarithm of each distribution, and the posterior minimizing the distance defines the effective sample size of the prior. For cases where the distance cannot be computed analytically, we provide a numerical approximation based on Monte Carlo simulation. We provide general guidelines for application, illustrate the method in several standard cases where the answer seems obvious, and then apply it to some nonstandard settings.  相似文献   

Inference of the insulin secretion rate (ISR) from C-peptide measurements as a quantification of pancreatic β-cell function is clinically important in diseases related to reduced insulin sensitivity and insulin action. ISR derived from C-peptide concentration is an example of nonparametric Bayesian model selection where a proposed ISR time-course is considered to be a "model". An inferred value of inaccessible continuous variables from discrete observable data is often problematic in biology and medicine, because it is a priori unclear how robust the inference is to the deletion of data points, and a closely related question, how much smoothness or continuity the data actually support. Predictions weighted by the posterior distribution can be cast as functional integrals as used in statistical field theory. Functional integrals are generally difficult to evaluate, especially for nonanalytic constraints such as positivity of the estimated parameters. We propose a computationally tractable method that uses the exact solution of an associated likelihood function as a prior probability distribution for a Markov-chain Monte Carlo evaluation of the posterior for the full model. As a concrete application of our method, we calculate the ISR from actual clinical C-peptide measurements in human subjects with varying degrees of insulin sensitivity. Our method demonstrates the feasibility of functional integral Bayesian model selection as a practical method for such data-driven inference, allowing the data to determine the smoothing timescale and the width of the prior probability distribution on the space of models. In particular, our model comparison method determines the discrete time-step for interpolation of the unobservable continuous variable that is supported by the data. Attempts to go to finer discrete time-steps lead to less likely models.  相似文献   

Yue YR  Loh JM 《Biometrics》2011,67(3):937-946
In this work we propose a fully Bayesian semiparametric method to estimate the intensity of an inhomogeneous spatial point process. The basic idea is to first convert intensity estimation into a Poisson regression setting via binning data points on a regular grid, and then model the log intensity semiparametrically using an adaptive version of Gaussian Markov random fields to smooth the corresponding counts. The inference is carried out by an efficient Markov chain Monte Carlo simulation algorithm. Compared to existing methods for intensity estimation, for example, parametric modeling and kernel smoothing, the proposed estimator not only provides inference regarding the dependence of the intensity function on possible covariates, but also uses information from the data to adaptively determine the amount of smoothing at the local level. The effectiveness of using our method is demonstrated through simulation studies and an application to a rainforest dataset.
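
The binning step mentioned in this abstract can be sketched as follows: points in a rectangular window are counted on a regular grid, and the cell counts then serve as Poisson responses. This is only an illustration of the first step, not the adaptive Gaussian Markov random field model; the function name `bin_points` and the use of the cell area as an offset are assumptions.

```python
def bin_points(points, x_range, y_range, nx, ny):
    """Count points in each cell of an nx-by-ny grid over a rectangular window.

    Returns (counts, cell_area), where counts[j][i] is the count in the cell
    at column i (x direction) and row j (y direction).
    Points outside the window are ignored.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue
        # Points on the upper boundary are assigned to the last cell.
        i = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        j = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        counts[j][i] += 1
    cell_area = (x1 - x0) / nx * (y1 - y0) / ny
    return counts, cell_area
```

In a Poisson regression on these counts, the expected count in a cell would be roughly the intensity times `cell_area`, so the logarithm of `cell_area` would typically enter as an offset on the log scale.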

We propose a Bayesian hypothesis testing procedure for comparing the distributions of paired samples. The procedure is based on a flexible model for the joint distribution of both samples. The flexibility is given by a mixture of Dirichlet processes. Our proposal uses a spike-slab prior specification for the base measure of the Dirichlet process and a particular parametrization for the kernel of the mixture in order to facilitate comparisons and posterior inference. The joint model allows us to derive the marginal distributions and test whether they differ or not. The procedure exploits the correlation between samples, relaxes the parametric assumptions, and detects possible differences throughout the entire distributions. A Monte Carlo simulation study comparing the performance of this strategy to other traditional alternatives is provided. Finally, we apply the proposed approach to spirometry data collected in the United States to investigate changes in pulmonary function in children and adolescents in response to air polluting factors.  相似文献   

We discuss inference for data with repeated measurements at multiple levels. The motivating example is data with blood counts from cancer patients undergoing multiple cycles of chemotherapy, with days nested within cycles. Some inference questions relate to repeated measurements over days within cycle, while other questions are concerned with the dependence across cycles. When the desired inference relates to both levels of repetition, it becomes important to reflect the data structure in the model. We develop a semiparametric Bayesian modeling approach, restricting attention to two levels of repeated measurements. For the top-level longitudinal sampling model we use random effects to introduce the desired dependence across repeated measurements. We use a nonparametric prior for the random effects distribution. Inference about dependence across second-level repetition is implemented by the clustering implied in the nonparametric random effects model. Practical use of the model requires that the posterior distribution on the latent random effects be reasonably precise.  相似文献   

In the case of the mixed linear model the random effects are usually assumed to be normally distributed in both the Bayesian and classical frameworks. In this paper, the Dirichlet process prior was used to provide nonparametric Bayesian estimates for correlated random effects. This goal was achieved by providing a Gibbs sampler algorithm that allows these correlated random effects to have a nonparametric prior distribution. A sampling based method is illustrated. This method, which transforms the genetic covariance matrix to an identity matrix so that the random effects are uncorrelated, is an extension of the theory and the results of previous researchers. Also, by using Gibbs sampling and data augmentation, a simulation procedure was derived for estimating the precision parameter M associated with the Dirichlet process prior. All needed conditional posterior distributions are given. To illustrate the application, data from the Elsenburg Dormer sheep stud were analysed. A total of 3325 weaning weight records from the progeny of 101 sires were used.

Zhang D  Lin X  Sowers M 《Biometrics》2000,56(1):31-39
We consider semiparametric regression for periodic longitudinal data. Parametric fixed effects are used to model the covariate effects and a periodic nonparametric smooth function is used to model the time effect. The within-subject correlation is modeled using subject-specific random effects and a random stochastic process with a periodic variance function. We use maximum penalized likelihood to estimate the regression coefficients and the periodic nonparametric time function, whose estimator is shown to be a periodic cubic smoothing spline. We use restricted maximum likelihood to simultaneously estimate the smoothing parameter and the variance components. We show that all model parameters can be easily obtained by fitting a linear mixed model. A common problem in the analysis of longitudinal data is to compare the time profiles of two groups, e.g., between treatment and placebo. We develop a scaled chi-squared test for the equality of two nonparametric time functions. The proposed model and the test are illustrated by analyzing hormone data collected during two consecutive menstrual cycles and their performance is evaluated through simulations.  相似文献   

Yu Z  Lin X  Tu W 《Biometrics》2012,68(2):429-436
We consider frailty models with additive semiparametric covariate effects for clustered failure time data. We propose a doubly penalized partial likelihood (DPPL) procedure to estimate the nonparametric functions using smoothing splines. We show that the DPPL estimators can be obtained by fitting an augmented working frailty model with parametric covariate effects, with the nonparametric functions estimated as linear combinations of fixed and random effects and the smoothing parameters estimated as extra variance components. This approach allows us to conveniently estimate all model components within a unified frailty model framework. We evaluate the finite sample performance of the proposed method via a simulation study, and apply the method to analyze data from a study of sexually transmitted infections (STI).

Ryu D  Li E  Mallick BK 《Biometrics》2011,67(2):454-466
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves.  相似文献   

In this article we study the relationship between virologic and immunologic responses in AIDS clinical trials. Since plasma HIV RNA copies (viral load) and CD4+ cell counts are crucial virologic and immunologic markers for HIV infection, it is important to study their relationship during HIV/AIDS treatment. We propose a mixed-effects varying-coefficient model based on an exploratory analysis of data from a clinical trial. Since both viral load and CD4+ cell counts are subject to measurement error, we also consider the measurement error problem in covariates in our model. The regression spline method is proposed for inference for parameters in the proposed model. The regression spline method transforms the unknown nonparametric components into parametric functions. It is relatively simple to implement using readily available software, and parameter inference can be developed from standard parametric models. We apply the proposed models and methods to an AIDS clinical study. From this study, we find an interesting relationship between viral load and CD4+ cell counts during antiviral treatments. Biological interpretations and clinical implications are discussed.  相似文献   

Summary : Recent studies have shown that grassland birds are declining more rapidly than any other group of terrestrial birds. Current methods of estimating avian age‐specific nest survival rates require knowing the ages of nests, assuming homogeneous nests in terms of nest survival rates, or treating the hazard function as a piecewise step function. In this article, we propose a Bayesian hierarchical model with nest‐specific covariates to estimate age‐specific daily survival probabilities without the above requirements. The model provides a smooth estimate of the nest survival curve and identifies the factors that are related to the nest survival. The model can handle irregular visiting schedules and it has the least restrictive assumptions compared to existing methods. Without assuming proportional hazards, we use a multinomial semiparametric logit model to specify a direct relation between age‐specific nest failure probability and nest‐specific covariates. An intrinsic autoregressive prior is employed for the nest age effect. This nonparametric prior provides a more flexible alternative to the parametric assumptions. The Bayesian computation is efficient because the full conditional posterior distributions either have closed forms or are log concave. We use the method to analyze a Missouri dickcissel dataset and find that (1) nest survival is not homogeneous during the nesting period, and it reaches its lowest at the transition from incubation to nestling; and (2) nest survival is related to grass cover and vegetation height in the study area.  相似文献   
