Similar literature
 Found 20 similar documents; search took 31 ms
1.
In this paper, we propose a functional partially linear regression model with latent group structures to accommodate the heterogeneous relationship between a scalar response and functional covariates. The proposed model is motivated by a salinity tolerance study of barley families, whose main objective is to detect salinity-tolerant barley plants. Our model is flexible, allowing for heterogeneous functional coefficients while being efficient by pooling information within a group for estimation. We develop an algorithm in the spirit of K-means clustering to identify latent groups of the subjects under study. We establish the consistency of the proposed estimator, derive the convergence rate and the asymptotic distribution, and develop inference procedures. We show by simulation studies that the proposed method has higher accuracy for recovering latent groups and for estimating the functional coefficients than existing methods. The analysis of the barley data shows that the proposed method can help identify groups of barley families with different levels of salinity tolerance.
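
The grouping idea in this abstract can be illustrated with a much simpler stand-in. The sketch below is not the authors' algorithm: it assumes a scalar covariate and a single slope per group (no functional coefficients), and the names `fit_slope`, `rss` and `group_regression` are hypothetical. It alternates between assigning each subject to the group whose slope fits it best and refitting each group's slope on the pooled data of its members, in the spirit of K-means.

```python
def fit_slope(subjects):
    """Pooled least-squares slope through the origin for a list of (xs, ys) subjects."""
    sxy = sum(x * y for xs, ys in subjects for x, y in zip(xs, ys))
    sxx = sum(x * x for xs, _ in subjects for x in xs)
    return sxy / sxx


def rss(subject, slope):
    """Residual sum of squares of one subject under a candidate group slope."""
    xs, ys = subject
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys))


def group_regression(subjects, init_slopes, max_iter=50):
    """Alternate assignment and refitting until the group labels stop changing."""
    slopes = list(init_slopes)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each subject joins the group with the smallest RSS.
        new_labels = [
            min(range(len(slopes)), key=lambda g: rss(s, slopes[g]))
            for s in subjects
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: refit each group's slope on its pooled members.
        for g in range(len(slopes)):
            members = [s for s, lab in zip(subjects, labels) if lab == g]
            if members:  # keep the old slope if a group is empty
                slopes[g] = fit_slope(members)
    return labels, slopes


xs = [1.0, 2.0, 3.0]
subjects = [
    (xs, [1.0, 2.0, 3.0]),
    (xs, [1.1, 1.9, 3.0]),
    (xs, [3.0, 6.0, 9.0]),
    (xs, [2.9, 6.1, 9.0]),
]
labels, slopes = group_regression(subjects, init_slopes=[0.0, 5.0])
```

As with K-means, the result of such a scheme depends on the starting values, so in practice several initializations would usually be compared.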

2.
Neelon B  Dunson DB 《Biometrics》2004,60(2):398-406
In many applications, the mean of a response variable can be assumed to be a nondecreasing function of a continuous predictor, controlling for covariates. In such cases, interest often focuses on estimating the regression function, while also assessing evidence of an association. This article proposes a new framework for Bayesian isotonic regression and order-restricted inference. Approximating the regression function with a high-dimensional piecewise linear model, the nondecreasing constraint is incorporated through a prior distribution for the slopes consisting of a product mixture of point masses (accounting for flat regions) and truncated normal densities. To borrow information across the intervals and smooth the curve, the prior is formulated as a latent autoregressive normal process. This structure facilitates efficient posterior computation, since the full conditional distributions of the parameters have simple conjugate forms. Point and interval estimates of the regression function and posterior probabilities of an association for different regions of the predictor can be estimated from a single MCMC run. Generalizations to categorical outcomes and multiple predictors are described, and the approach is applied to an epidemiology application.

3.
Burgette LF  Reiter JP 《Biometrics》2012,68(1):92-100
We describe a Bayesian quantile regression model that uses a confirmatory factor structure for part of the design matrix. This model is appropriate when the covariates are indicators of scientifically determined latent factors, and it is these latent factors that analysts seek to include as predictors in the quantile regression. We apply the model to a study of birth weights in which the effects of latent variables representing psychosocial health and actual tobacco usage on the lower quantiles of the response distribution are of interest. The models can be fit using an R package called factorQR.

4.
In this article, we develop a latent class model with class probabilities that depend on subject-specific covariates. One of our major goals is to identify important predictors of latent classes. We consider methodology that allows estimation of latent classes while allowing for variable selection uncertainty. We propose a Bayesian variable selection approach and implement a stochastic search Gibbs sampler for posterior computation to obtain model-averaged estimates of quantities of interest such as marginal inclusion probabilities of predictors. Our methods are illustrated through simulation studies and application to data on weight gain during pregnancy, where it is of interest to identify important predictors of latent weight gain classes.

5.
Ritz C  Streibig JC 《Biometrics》2009,65(2):609-617
Summary. Fluorescence curves are useful for monitoring changes in photosynthesis activity. Various summary measures have been used to quantify differences among fluorescence curves corresponding to different treatments, but these approaches may forfeit valuable information. As each individual fluorescence curve is a functional observation, it is natural to consider a functional regression model. The proposed model consists of a nonparametric component capturing the general form of the curves and a semiparametric component describing the differences among treatments and allowing comparisons of treatments. Several graphical model-checking approaches are introduced. Both approximate, asymptotic confidence intervals as well as simulation-based confidence intervals are available. Analysis of data from a crop experiment using the proposed model shows that the salient features in the fluorescence curves are captured adequately. The proposed functional regression model is useful for analysis of high throughput fluorescence curve data from regular monitoring or screening of plant growth.

6.
7.
Question: We provide a method to calculate the power of ordinal regression models for detecting temporal trends in plant abundance measured as ordinal cover classes. Does power depend on the shape of the unobserved (latent) distribution of percentage cover? How do cover class schemes that differ in the number of categories affect power? Methods: We simulated cover class data by “cutting‐up” a continuous logit‐beta distributed variable using 7‐point and 15‐point cover classification schemes. We used Monte Carlo simulation to estimate power for detecting trends with two ordinal models, proportional odds logistic regression (POM) and logistic regression with cover classes re‐binned into two categories, a model we term an assessment point model (APM). We include a model fit to the logit‐transformed percentage cover data for comparison, which is a latent model. Results: The POM had equal or higher power compared to the APM and latent model, but power varied in complex ways as a function of the assumed latent beta distribution. We discovered that if the latent distribution is skewed, a cover class scheme with more categories might yield higher power to detect trend. Conclusions: Our power analysis method maintains the connection between the observed ordinal cover classes and the unmeasured (latent) percentage cover variable, allowing for a biologically meaningful trend to be defined on the percentage cover scale. Both the shape of the latent beta distribution and the alternative hypothesis should be considered carefully when determining sample size requirements for long‐term vegetation monitoring using cover class measurements.
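
The "cutting-up" step described in the Methods can be sketched as follows. This is only an illustration under stated assumptions: the class limits in `SEVEN_CLASS_UPPER` are hypothetical (the abstract does not give the breakpoints of its 7-point or 15-point schemes), the latent cover is drawn as a plain beta-distributed proportion, and the function names are invented here. The power simulation itself is not reproduced.

```python
import bisect
import random

# Hypothetical upper class limits (percent cover) for a 7-class scheme;
# the last class runs from the final limit up to 100%.
SEVEN_CLASS_UPPER = [1, 5, 25, 50, 75, 95]


def to_cover_class(percent_cover, upper_limits):
    """Return the 1-based ordinal class whose interval contains percent_cover.

    Upper limits are treated as inclusive: with the limits above, a cover of
    exactly 5% falls in class 2, and anything above 95% falls in class 7.
    """
    return bisect.bisect_left(upper_limits, percent_cover) + 1


def simulate_cover_classes(n, alpha, beta, upper_limits, seed=0):
    """Draw n latent cover values from Beta(alpha, beta) and cut them into classes."""
    rng = random.Random(seed)
    return [
        to_cover_class(100 * rng.betavariate(alpha, beta), upper_limits)
        for _ in range(n)
    ]


classes = simulate_cover_classes(200, alpha=0.5, beta=2.0,
                                 upper_limits=SEVEN_CLASS_UPPER)
```

Swapping in a finer list of limits is one way to see how a scheme with more categories partitions the same skewed latent distribution.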

8.
Summary. A variety of flexible approaches have been proposed for functional data analysis, allowing both the mean curve and the distribution about the mean to be unknown. Such methods are most useful when there is limited prior information. Motivated by applications to modeling of temperature curves in the menstrual cycle, this article proposes a flexible approach for incorporating prior information in semiparametric Bayesian analyses of hierarchical functional data. The proposed approach is based on specifying the distribution of functions as a mixture of a parametric hierarchical model and a nonparametric contamination. The parametric component is chosen based on prior knowledge, while the contamination is characterized as a functional Dirichlet process. In the motivating application, the contamination component allows unanticipated curve shapes in unhealthy menstrual cycles. Methods are developed for posterior computation, and the approach is applied to data from a European fecundability study.

9.

Summary

We consider a functional linear Cox regression model for characterizing the association between time‐to‐event data and a set of functional and scalar predictors. The functional linear Cox regression model incorporates a functional principal component analysis for modeling the functional predictors and a high‐dimensional Cox regression model to characterize the joint effects of both functional and scalar predictors on the time‐to‐event data. We develop an algorithm to calculate the maximum approximate partial likelihood estimates of unknown finite and infinite dimensional parameters. We also systematically investigate the rate of convergence of the maximum approximate partial likelihood estimates and a score test statistic for testing the nullity of the slope function associated with the functional predictors. We demonstrate our estimation and testing procedures by using simulations and the analysis of the Alzheimer's Disease Neuroimaging Initiative (ADNI) data. Our real data analyses show that high‐dimensional hippocampus surface data may be an important marker for predicting time to conversion to Alzheimer's disease. Data used in the preparation of this article were obtained from the ADNI database ( adni.loni.usc.edu ).

10.
An interpretation for the ROC curve and inference using GLM procedures   Total citations: 7 (self-citations: 0; citations by others: 7)
Pepe MS 《Biometrics》2000,56(2):352-359
The accuracy of a medical diagnostic test is often summarized in a receiver operating characteristic (ROC) curve. This paper puts forth an interpretation for each point on the ROC curve as being a conditional probability of a test result from a random diseased subject exceeding that from a random nondiseased subject. This interpretation gives rise to new methods for making inference about ROC curves. It is shown that inference can be achieved with binary regression techniques applied to indicator variables constructed from pairs of test results, one component of the pair being from a diseased subject and the other from a nondiseased subject. Within the generalized linear model (GLM) binary regression framework, ROC curves can be estimated, and we highlight a new semiparametric estimator. Covariate effects can also be evaluated with the GLM models. The methodology is applied to a pancreatic cancer dataset where we use the regression framework to compare two different serum biomarkers. Asymptotic distribution theory is developed to facilitate inference and to provide insight into factors influencing variability of estimated model parameters.
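
The pairwise interpretation can be made concrete with the plain empirical ROC curve. The sketch below is not the GLM-based or semiparametric estimator from the paper; it only assumes that larger test results indicate disease, and the function name is hypothetical. For each nondiseased result used as a reference, it averages the indicators "diseased result is at least as large as the reference" over the diseased subjects, which gives the true positive rate at the false positive rate where that reference sits.

```python
def roc_points_from_pairs(diseased, nondiseased):
    """Empirical ROC points built from diseased/nondiseased comparisons.

    Returns a list of (false_positive_rate, true_positive_rate) pairs,
    one per distinct nondiseased result, ordered by increasing FPR.
    """
    points = []
    for y_ref in sorted(set(nondiseased), reverse=True):
        # Share of nondiseased results at or above the reference value.
        fpr = sum(y >= y_ref for y in nondiseased) / len(nondiseased)
        # Mean of the pairwise indicators over the diseased subjects.
        tpr = sum(y >= y_ref for y in diseased) / len(diseased)
        points.append((fpr, tpr))
    return points


points = roc_points_from_pairs(diseased=[3, 5, 7, 9], nondiseased=[1, 2, 4, 6])
```

A regression approach would instead model these indicators as a function of the false positive rate and any covariates, rather than averaging them cell by cell.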

11.
In many studies, the association between longitudinal measurements of a continuous response and a binary outcome is of interest. A convenient framework for this type of problem is the joint model, which is formulated to investigate the association between a binary outcome and features of longitudinal measurements through a common set of latent random effects. The joint model, which is the focus of this article, is a logistic regression model with covariates defined as the individual‐specific random effects in a non‐linear mixed‐effects model (NLMEM) for the longitudinal measurements. We discuss different estimation procedures, which include two‐stage, best linear unbiased predictors, and various numerical integration techniques. The proposed methods are illustrated using a real data set where the objective is to study the association between longitudinal hormone levels and the pregnancy outcome in a group of young women. The numerical performance of the estimating methods is also evaluated by means of simulation.

12.
13.
Gaussian process functional regression modeling for batch data   Total citations: 2 (self-citations: 0; citations by others: 2)
A Gaussian process functional regression model is proposed for the analysis of batch data. Covariance structure and mean structure are considered simultaneously, with the covariance structure modeled by a Gaussian process regression model and the mean structure modeled by a functional regression model. The model allows the inclusion of covariates in both the covariance structure and the mean structure. It models the nonlinear relationship between a functional output variable and a set of functional and nonfunctional covariates. Several applications and simulation studies are reported and show that the method provides very good results for curve fitting and prediction.

14.
Lloyd CJ 《Biometrics》2000,56(3):862-867
The performance of a diagnostic test is summarized by its receiver operating characteristic (ROC) curve. Under quite natural assumptions about the latent variable underlying the test, the ROC curve is convex. Empirical data on a test's performance often comes in the form of observed true positive and false positive relative frequencies under varying conditions. This paper describes a family of regression models for analyzing such data. The underlying ROC curves are specified by a quality parameter delta and a shape parameter mu and are guaranteed to be convex provided delta > 1. Both the position along the ROC curve and the quality parameter delta are modeled linearly with covariates at the level of the individual. The shape parameter mu enters the model through the link functions log(p^mu) - log(1 - p^mu) of a binomial regression and is estimated either by search or from an appropriate constructed variate. One simple application is to the meta-analysis of independent studies of the same diagnostic test, illustrated on some data of Moses, Shapiro, and Littenberg (1993). A second application, to so-called vigilance data, is given, where ROC curves differ across subjects and modeling of the position along the ROC curve is of primary interest.
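
The link function mentioned in this abstract can be written out explicitly. The display below assumes that mu is an exponent on p (the plain-text rendering of the abstract does not show superscripts, so this reading is an assumption), and the symbol g_mu is introduced here only for illustration.

```latex
g_{\mu}(p) \;=\; \log\!\left(p^{\mu}\right) - \log\!\left(1 - p^{\mu}\right)
          \;=\; \log\frac{p^{\mu}}{1 - p^{\mu}},
\qquad
g_{1}(p) \;=\; \log\frac{p}{1 - p}.
```

Under this reading, setting mu = 1 recovers the ordinary logit link, so the family can be seen as a one-parameter extension of logistic binomial regression.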

15.
Practitioners of current data analysis are regularly confronted with the situation where the heavy-tailed skewed response is related to both multiple functional predictors and high-dimensional scalar covariates. We propose a new class of partially functional penalized convolution-type smoothed quantile regression to characterize the conditional quantile level between a scalar response and predictors of both functional and scalar types. The new approach overcomes the lack of smoothness and strong convexity of the standard quantile empirical loss, considerably improving the computing efficiency of partially functional quantile regression. We investigate a folded concave penalized estimator for simultaneous variable selection and estimation by the modified local adaptive majorize-minimization (LAMM) algorithm. The functional predictors can be dense or sparse and are approximated by the principal component basis. Under mild conditions, the consistency and oracle properties of the resulting estimators are established. Simulation studies demonstrate a competitive performance against the partially functional standard penalized quantile regression. A real application using Alzheimer's Disease Neuroimaging Initiative data is utilized to illustrate the practicality of the proposed model.

16.
Summary. Physical activity has many well‐documented health benefits for cardiovascular fitness and weight control. For pregnant women, the American College of Obstetricians and Gynecologists currently recommends 30 minutes of moderate exercise on most, if not all, days; however, very few pregnant women achieve this level of activity. Traditionally, studies have focused on examining individual or interpersonal factors to identify predictors of physical activity. There is a renewed interest in whether characteristics of the physical environment in which we live and work may also influence physical activity levels. We consider one of the first studies of pregnant women that examines the impact of characteristics of the built environment on physical activity levels. Using a socioecologic framework, we study the associations between physical activity and several factors including personal characteristics, meteorological/air quality variables, and neighborhood characteristics for pregnant women in four counties of North Carolina. We simultaneously analyze six types of physical activity and investigate cross‐dependencies between these activity types. Exploratory analysis suggests that the associations are different in different regions. Therefore, we use a multivariate regression model with spatially varying regression coefficients. This model includes a regression parameter for each covariate at each spatial location. For our data with many predictors, some form of dimension reduction is clearly needed. We introduce a Bayesian variable selection procedure to identify subsets of important variables. Our stochastic search algorithm determines the probabilities that each covariate's effect is null, non‐null but constant across space, and spatially varying. We found that individual‐level covariates had a greater influence on women's activity levels than neighborhood environmental characteristics, and some individual‐level covariates had spatially varying associations with the activity levels of pregnant women.

17.
Aim: To investigate the impact of positional uncertainty in species occurrences on the predictions of seven commonly used species distribution models (SDMs), and explore its interaction with spatial autocorrelation in predictors. Methods: A series of artificial datasets covering 155 scenarios including different combinations of five positional uncertainty scenarios and 31 spatial autocorrelation scenarios were simulated. The level of positional uncertainty was defined by the standard deviation of a normally distributed zero‐mean random variable. Each dataset included two environmental gradients (predictor variables) and one set of species occurrence sample points (response variable). Seven commonly used models were selected to develop SDMs: generalized linear models, generalized additive models, boosted regression trees, multivariate adaptive regression spline, random forests, genetic algorithm for rule‐set production and maximum entropy. A probabilistic approach was employed to model and simulate five levels of error in the species locations. To analyse the propagation of positional uncertainty, Monte Carlo simulation was applied to each scenario for each SDM. The models were evaluated for performance using simulated independent test data with Cohen’s Kappa and the area under the receiver operating characteristic curve. Results: Positional uncertainty in species location led to a reduction in prediction accuracy for all SDMs, although the magnitude of the reduction varied between SDMs. In all cases the magnitude of this impact varied according to the degree of spatial autocorrelation in predictors and the levels of positional uncertainty. It was shown that when the range of spatial autocorrelation in the predictors was less than or equal to three times the standard deviation of the positional error, the models were less affected by error and, consequently, had smaller decreases in prediction accuracy. When the range of spatial autocorrelation in predictors was larger than three times the standard deviation of positional error, the prediction accuracy was low for all scenarios. Main conclusions: The potential impact of positional uncertainty in species occurrences on the predictions of SDMs can be understood by comparing it with the spatial autocorrelation range in predictor variables.
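
One of the two evaluation measures named in this abstract, Cohen's Kappa, can be computed directly from a presence/absence confusion table. The sketch below uses the standard definition (observed agreement corrected for chance agreement); the function name and the example counts are invented for illustration and are not taken from the study.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's Kappa for a 2x2 table of predicted vs. observed presence/absence.

    tp: predicted present, observed present
    fp: predicted present, observed absent
    fn: predicted absent,  observed present
    tn: predicted absent,  observed absent
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)


kappa = cohens_kappa(tp=40, fp=10, fn=5, tn=45)
```

With these made-up counts the observed agreement is 0.85 and the chance agreement is 0.5, which gives a Kappa of 0.7.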

18.
Chenlin Zhang  Huazhen Lin  Li Liu  Jin Liu  Yi Li 《Biometrics》2023,79(3):2232-2245
Functional data analysis has emerged as a powerful tool in response to the ever-increasing resources and efforts devoted to collecting information about response curves or anything that varies over a continuum. However, limited progress has been made with regard to linking the covariance structures of response curves to external covariates, as most functional models assume a common covariance structure. We propose a new functional regression model with covariate-dependent mean and covariance structures. Particularly, by allowing variances of random scores to be covariate-dependent, we identify eigenfunctions for each individual from the set of eigenfunctions that govern the variation patterns across all individuals, resulting in high interpretability and prediction power. We further propose a new penalized quasi-likelihood procedure that combines regularization and B-spline smoothing for model selection and estimation and establish the convergence rate and asymptotic normality of the proposed estimators. The utility of the developed method is demonstrated via simulations, as well as an analysis of the Avon Longitudinal Study of Parents and Children concerning parental effects on the growth curves of their offspring, which yields biologically interesting results.

19.
Aim: This study used data from temperate forest communities to assess: (1) five different stepwise selection methods with generalized additive models, (2) the effect of weighting absences to ensure a prevalence of 0.5, (3) the effect of limiting absences beyond the environmental envelope defined by presences, (4) four different methods for incorporating spatial autocorrelation, and (5) the effect of integrating an interaction factor defined by a regression tree on the residuals of an initial environmental model. Location: State of Vaud, western Switzerland. Methods: Generalized additive models (GAMs) were fitted using the grasp package (generalized regression analysis and spatial predictions, http://www.cscf.ch/grasp ). Results: Model selection based on cross‐validation appeared to be the best compromise between model stability and performance (parsimony) among the five methods tested. Weighting absences returned models that perform better than models fitted with the original sample prevalence. This appeared to be mainly due to the impact of very low prevalence values on evaluation statistics. Removing zeroes beyond the range of presences on main environmental gradients changed the set of selected predictors, and potentially their response curve shape. Moreover, removing zeroes slightly improved model performance and stability when compared with the baseline model on the same data set. Incorporating a spatial trend predictor improved model performance and stability significantly. Even better models were obtained when including local spatial autocorrelation. A novel approach to include interactions proved to be an efficient way to account for interactions between all predictors at once. Main conclusions: Models and spatial predictions of 18 forest communities were significantly improved by using either: (1) cross‐validation as a model selection method, (2) weighted absences, (3) limited absences, (4) predictors accounting for spatial autocorrelation, or (5) a factor variable accounting for interactions between all predictors. The final choice of model strategy should depend on the nature of the available data and the specific study aims. Statistical evaluation is useful in searching for the best modelling practice. However, one should not neglect to consider the shapes and interpretability of response curves, as well as the resulting spatial predictions in the final assessment.

20.
Dynamic Model for Multivariate Markers of Fecundability   Total citations: 1 (self-citations: 0; citations by others: 1)
Summary: Dynamic latent class models provide a flexible framework for studying biologic processes that evolve over time. Motivated by studies of markers of the fertile days of the menstrual cycle, we propose a discrete‐time dynamic latent class framework, allowing change points to depend on time, fixed predictors, and random effects. Observed data consist of multivariate categorical indicators, which change dynamically in a flexible manner according to latent class status. Given the flexibility of the framework, which incorporates semi‐parametric components using mixtures of betas, identifiability constraints are needed to define the latent classes. Such constraints are most appropriately based on the known biology of the process. The Bayesian method is developed particularly for analyzing mucus symptom data from a study of women using natural family planning.
