Similar articles
20 similar articles found.
1.
In clinical trials, the comparison of two different populations is a common problem. Nonlinear (parametric) regression models are commonly used to describe the relationship between covariates, such as concentration or dose, and a response variable in the two groups. In some situations, it is reasonable to assume some model parameters to be the same, for instance, the placebo effect or the maximum treatment effect. In this paper, we develop a (parametric) bootstrap test to establish the similarity of two regression curves sharing some common parameters. We show by theoretical arguments and by means of a simulation study that the new test controls its significance level and achieves a reasonable power. Moreover, it is demonstrated that under the assumption of common parameters, a considerably more powerful test can be constructed compared with the test that does not use this assumption. Finally, we illustrate the potential applications of the new methodology by a clinical trial example.

2.
Research has shown that high blood glucose levels are important predictors of incident diabetes. However, they are also strongly associated with other cardiometabolic risk factors such as high blood pressure, adiposity, and cholesterol, which are also highly correlated with one another. The aim of this analysis was to ascertain how these highly correlated cardiometabolic risk factors might be associated with high levels of blood glucose in adults aged 50 or older from wave 2 of the English Longitudinal Study of Ageing (ELSA). Due to the high collinearity of predictor variables and our interest in extreme values of blood glucose, we proposed a new method, called quantile profile regression, to answer this question. Profile regression, a Bayesian nonparametric model for clustering responses and covariates simultaneously, is a powerful tool to model the relationship between a response variable and covariates, but the standard approach of using a mixture of Gaussian distributions for the response model will not identify the underlying clusters correctly, particularly with outliers in the data or a heavy-tailed distribution of the response. Therefore, we propose quantile profile regression to model the response variable with an asymmetric Laplace distribution, allowing us to model more accurately clusters that are asymmetric and predict more accurately for extreme values of the response variable and/or outliers. In simulations, our new method performs more accurately than the Normal profile regression approach and remains robust when outliers are present in the data. We conclude with an analysis of the ELSA.

3.
Summary. In this article, we study the estimation of mean response and regression coefficient in semiparametric regression problems when the response variable is subject to nonrandom missingness. When the missingness is independent of the response conditional on high-dimensional auxiliary information, the parametric approach may misspecify the relationship between covariates and response while the nonparametric approach is infeasible because of the curse of dimensionality. To overcome this, we study a model-based approach to condense the auxiliary information and estimate the parameters of interest nonparametrically on the condensed covariate space. Our estimators possess the double robustness property, i.e., they are consistent whenever the model for the response given auxiliary covariates or the model for the missingness given auxiliary covariates is correct. We conduct a number of simulations to compare the numerical performance between our estimators and other existing estimators in the current missing data literature, including the propensity score approach and the inverse probability weighted estimating equation. A set of real data is used to illustrate our approach.

4.
Dunson DB, Neelon B. Biometrics, 2003, 59(2): 286-295
In biomedical studies, there is often interest in assessing the association between one or more ordered categorical predictors and an outcome variable, adjusting for covariates. For a k-level predictor, one typically uses either a k-1 degree of freedom (df) test or a single df trend test, which requires scores for the different levels of the predictor. In the absence of knowledge of a parametric form for the response function, one can incorporate monotonicity constraints to improve the efficiency of tests of association. This article proposes a general Bayesian approach for inference on order-constrained parameters in generalized linear models. Instead of choosing a prior distribution with support on the constrained space, which can result in major computational difficulties, we propose to map draws from an unconstrained posterior density using an isotonic regression transformation. This approach allows flat regions over which increases in the level of a predictor have no effect. Bayes factors for assessing ordered trends can be computed based on the output from a Gibbs sampling algorithm. Results from a simulation study are presented and the approach is applied to data from a time-to-pregnancy study.
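
Illustrative sketch (not taken from the article): the idea of mapping unconstrained posterior draws through an isotonic regression transformation can be pictured with the pool-adjacent-violators algorithm. The draws below are simulated, the unit weights are an assumption (the article may weight levels differently), and the names `pava`, `draws` and `constrained` are hypothetical.

```python
import numpy as np

def pava(y, w=None):
    """Weighted least-squares nondecreasing fit via pool-adjacent-violators."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    # Each block stores [weighted mean, total weight, number of pooled points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Pool backwards while the ordering constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    return np.concatenate([np.full(n, m) for m, _, n in blocks])

# Hypothetical unconstrained posterior draws for a 4-level ordered predictor
# (assumption: in practice these would come from a Gibbs sampler).
rng = np.random.default_rng(0)
draws = rng.normal(loc=[0.0, 0.5, 0.4, 1.0], scale=0.3, size=(1000, 4))

# Map every draw onto the order-constrained space, one row at a time.
constrained = np.apply_along_axis(pava, 1, draws)
```

Pooled neighbours share one value, which is how a transformation of this kind can produce the flat regions mentioned in the abstract.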

5.
Two-dimensional response curves are an important experimental outcome in speech kinematics and other areas of research. These parameterized curves are usually obtained by recording the two-dimensional location of an object over time. In this setting, time is the independent variable and the x and y locations on specified coordinate axes define the multivariate response. Collections of such parameterized curves can be obtained either from one subject or from a number of different subjects, each producing one or several realizations of the response curve. When only one dependent variable is observed over time and no parametric model is specified, self-modeling regression (SEMOR) is an attractive modeling approach. SEMOR assumes that each of a collection of curves differs from a smooth, average shape function by some simple parametric transformation of the coordinate axes (usually linear). We will describe the extension of SEMOR to two-dimensional parameterized curves using affine transformations of a two-dimensional, time-parameterized shape function.

6.
This paper focuses on the problems of estimation and variable selection in the functional linear regression model (FLM) with functional response and scalar covariates. To this end, two different types of regularization (L1 and L2) are considered in this paper. On the one hand, a sample approach for functional LASSO in terms of basis representation of the sample values of the response variable is proposed. On the other hand, we propose a penalized version of the FLM by introducing a P-spline penalty in the least squares fitting criterion. Our aim, however, is to propose P-splines as a powerful tool for simultaneous variable selection and functional parameter estimation. In that sense, the importance of smoothing the response variable before fitting the model is also studied. In summary, penalized (L1 and L2) and nonpenalized regression are combined with a presmoothing of the response variable sample curves, based on regression splines or P-splines, providing a total of six approaches to be compared in two simulation schemes. Finally, the most competitive approach is applied to a real data set based on the graft-versus-host disease, which is one of the most frequent complications (30%–50%) in allogeneic hematopoietic stem-cell transplantation.

7.
Abstract. Ecologists often develop complex regression models that include multiple categorical and continuous variables, interactions among predictors, and nonlinear relationships between the response and predictor variables. Nomograms, which are graphical devices for presenting mathematical functions and calculating output values, can aid biologists in interpreting and presenting these complex models. To illustrate benefits of nomograms, we developed a logistic regression model of elk (Cervus elaphus) resource selection. With this model, we demonstrated how a nomogram helps scientists and managers interpret interactions among variables, compare the relative biological importance of variables, and examine predicted shapes of relationships (e.g., linear vs. nonlinear) between response and predictor variables. Although our example focused on logistic regression, nomograms are equally useful for other linear and nonlinear models. Regardless of the approach used for model development, nomograms and other graphical summaries can help scientists and managers develop, interpret, and apply statistical models.

8.
Ibrahim JG, Chen MH, Lipsitz SR. Biometrics, 1999, 55(2): 591-596
We propose a method for estimating parameters for general parametric regression models with an arbitrary number of missing covariates. We allow any pattern of missing data and assume that the missing data mechanism is ignorable throughout. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). We extend this method to continuous or mixed categorical and continuous covariates, and for arbitrary parametric regression models, by adapting a Monte Carlo version of the EM algorithm as discussed by Wei and Tanner (1990, Journal of the American Statistical Association 85, 699-704). In addition, we discuss the Gibbs sampler for sampling from the conditional distribution of the missing covariates given the observed data and show that the appropriate complete conditionals are log-concave. The log-concavity property of the conditional distributions will facilitate a straightforward implementation of the Gibbs sampler via the adaptive rejection algorithm of Gilks and Wild (1992, Applied Statistics 41, 337-348). We assume the model for the response given the covariates is an arbitrary parametric regression model, such as a generalized linear model, a parametric survival model, or a nonlinear model. We model the marginal distribution of the covariates as a product of one-dimensional conditional distributions. This allows us a great deal of flexibility in modeling the distribution of the covariates and reduces the number of nuisance parameters that are introduced in the E-step. We present examples involving both simulated and real data.

9.
Semiparametric Regression in Size-Biased Sampling (total citations: 1; self-citations: 0; citations by others: 1)
Ying Qing Chen. Biometrics, 2010, 66(1): 149-158
Summary. Size-biased sampling arises when a positive-valued outcome variable is sampled with selection probability proportional to its size. In this article, we propose a semiparametric linear regression model to analyze size-biased outcomes. In our proposed model, the regression parameters of covariates are of major interest, while the distribution of random errors is unspecified. Under the proposed model, we discover that regression parameters are invariant regardless of size-biased sampling. Following this invariance property, we develop a simple estimation procedure for inferences. Our proposed methods are evaluated in simulation studies and applied to two real data analyses.
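
Illustrative sketch (not taken from the article): the sampling mechanism itself, selection probability proportional to the outcome's size, can be simulated in a few lines. The exponential population and all variable names are assumptions chosen for illustration; the sketch shows only the bias in the sampled mean, not the article's regression model or its invariance result.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical positive-valued outcomes (assumption: exponential with mean 1).
population = rng.exponential(scale=1.0, size=200_000)

# Size-biased sampling: each unit is drawn with probability proportional to its value.
probs = population / population.sum()
biased_sample = rng.choice(population, size=50_000, replace=True, p=probs)

population_mean = population.mean()
biased_mean = biased_sample.mean()

# Under size-biased sampling the expected sampled value is E[Y^2] / E[Y].
expected_biased_mean = (population**2).sum() / population.sum()
```

Comparing `biased_mean` with `population_mean` shows why a naive analysis of size-biased outcomes overstates the mean, which is the problem the semiparametric model is designed to handle.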

10.
Abstract. Vegetation models based on multiple logistic regression are of growing interest in environmental studies and decision making. The relatively simple sigmoid Gaussian optimum curves are most common in current vegetation models, although several different other response shapes are known. However, improvements in the technical means for handling statistical data now facilitate fast and interactive calculation of alternative complex, more data-related, non-parametric models. The aim in this study was to determine whether, and if so how often, a complex response shape could be more adequate than a linear or quadratic one. Using the framework of Generalized Additive Models, both parametric (linear and quadratic) and non-parametric (smoothed) stepwise multiple logistic regression techniques were applied to a large data set on wetlands and water plants and to six environmental variables: pH, chloride, orthophosphate, inorganic nitrogen, thickness of the sapropelium layer and depth of the water-body. All models were tested for their goodness-of-fit and significance. Of all 156 generalized additive models calculated, 77 % were found to contain at least one smoothed predictor variable, i.e. an environmental variable with a response better fitted by a complex, non-parametric, than by a linear or quadratic parametric curve. Chloride was the variable with the highest incidence of smoothed responses (48 %). Generally, a smoothed curve was preferable in 23 % of all species-variable correlations calculated, compared to 25 % and 18 % for sigmoid and Gaussian shaped curves, respectively. Regression models of two plant species are presented in detail to illustrate the potential of smoothers to produce good fitting and biologically sound response models in comparison to linear and polynomial regression models. 
We found Generalized Additive Modelling a useful and practical technique for improving current regression-based vegetation models by allowing for alternative, complex response shapes.

11.
Bornkamp B, Ickstadt K. Biometrics, 2009, 65(1): 198-205
Summary. In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose–response analysis, and provides possibilities to elicit informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose–response analysis.

12.
Multiple linear regression analyses (also often referred to as generalized linear models – GLMs, or generalized linear mixed models – GLMMs) are widely used in the analysis of data in molecular ecology, often to assess the relative effects of genetic characteristics on individual fitness or traits, or how environmental characteristics influence patterns of genetic differentiation. However, the coefficients resulting from multiple regression analyses are sometimes misinterpreted, which can lead to incorrect interpretations and conclusions within individual studies, and can propagate to wider‐spread errors in the general understanding of a topic. The primary issue revolves around the interpretation of coefficients for independent variables when interaction terms are also included in the analyses. In this scenario, the coefficients associated with each independent variable are often interpreted as the independent effect of each predictor variable on the predicted variable. However, this interpretation is incorrect. The correct interpretation is that these coefficients represent the effect of each predictor variable on the predicted variable when all other predictor variables are zero. This difference may sound subtle, but the ramifications cannot be overstated. Here, my goals are to raise awareness of this issue, to demonstrate and emphasize the problems that can result and to provide alternative approaches for obtaining the desired information.
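
Illustrative sketch (not taken from the article): the point about coefficients in the presence of an interaction term can be checked on simulated data. The data-generating values and the helper `fit` are hypothetical. Mean-centering is shown as one commonly used alternative, without claiming it is the approach the article recommends.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical predictors whose means are far from zero.
x1 = rng.normal(5.0, 1.0, n)
x2 = rng.normal(3.0, 1.0, n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2 + rng.normal(0, 0.1, n)

def fit(a, b, y):
    """Ordinary least squares for y ~ 1 + a + b + a:b; returns the 4 coefficients."""
    X = np.column_stack([np.ones_like(a), a, b, a * b])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Raw predictors: coefficient 1 is the slope of x1 when x2 == 0,
# a value that lies well outside the observed range of x2.
raw = fit(x1, x2, y)

# Centered predictors: coefficient 1 is the slope of x1 at the mean of x2.
centered = fit(x1 - x1.mean(), x2 - x2.mean(), y)
```

Here `raw[1]` is close to 2.0 while `centered[1]` is close to 3.5, even though both fits describe the same regression surface: the "main effect" depends entirely on where zero sits for the other predictor.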

13.
Asymmetric regression is an alternative to conventional linear regression that allows us to model the relationship between predictor variables and the response variable while accommodating skewness. Advantages of asymmetric regression include incorporating realistic ecological patterns observed in data, robustness to model misspecification and less sensitivity to outliers. Bayesian asymmetric regression relies on asymmetric distributions such as the asymmetric Laplace (ALD) or asymmetric normal (AND) in place of the normal distribution used in classic linear regression models. Asymmetric regression concepts can be used for process and parameter components of hierarchical Bayesian models and have a wide range of applications in data analyses. In particular, asymmetric regression allows us to fit more realistic statistical models to skewed data and pairs well with Bayesian inference. We first describe asymmetric regression using the ALD and AND. Second, we show how the ALD and AND can be used for Bayesian quantile and expectile regression for continuous response data. Third, we consider an extension to generalize Bayesian asymmetric regression to survey data consisting of counts of objects. Fourth, we describe a regression model using the ALD, and show that it can be applied to add needed flexibility, resulting in better predictive models compared to Poisson or negative binomial regression. We demonstrate concepts by analyzing a data set consisting of counts of Henslow’s sparrows following prescribed fire and provide annotated computer code to facilitate implementation. Our results suggest Bayesian asymmetric regression is an essential component of a scientist’s statistical toolbox.

14.
Genome-Wide Regression and Prediction with the BGLR Statistical Package (total citations: 1; self-citations: 0; citations by others: 1)
Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be confronted using Bayesian methods. This approach allows integrating various parametric and nonparametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semiparametric procedures (Bayesian reproducing kernel Hilbert spaces regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many nongenomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis.

15.
Summary. Quantile regression, which models the conditional quantiles of the response variable given covariates, usually assumes a linear model. However, this kind of linearity is often unrealistic in real life. One situation where linear quantile regression is not appropriate is when the response variable is piecewise linear but still continuous in covariates. To analyze such data, we propose a bent line quantile regression model. We derive its parameter estimates, prove that they are asymptotically valid given the existence of a change‐point, and discuss several methods for testing the existence of a change‐point in bent line quantile regression together with a power comparison by simulation. An example of land mammal maximal running speeds is given to illustrate an application of bent line quantile regression in which this model is theoretically justified and its parameters are of direct biological interest.
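
Illustrative sketch (not taken from the article): a bent line is a continuous two-segment line, and a quantile fit minimizes the check (pinball) loss of its residuals. The parameterization `alpha + beta1 * x + beta2 * max(x - t, 0)`, the simulated data and all names are assumptions for illustration; the article's own estimator and tests are not reproduced here.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (pinball) loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    u = np.asarray(u, dtype=float)
    return u * np.where(u < 0, tau - 1.0, tau)

def bent_line(x, alpha, beta1, beta2, t):
    """Continuous line with slope beta1 left of t and beta1 + beta2 right of t."""
    x = np.asarray(x, dtype=float)
    return alpha + beta1 * x + beta2 * np.maximum(x - t, 0.0)

def objective(params, x, y, tau):
    """Total check loss of a bent line with params = (alpha, beta1, beta2, t)."""
    alpha, beta1, beta2, t = params
    return check_loss(y - bent_line(x, alpha, beta1, beta2, t), tau).sum()

# Hypothetical data with a change-point at x = 4.
rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 201)
true_params = (1.0, 2.0, -1.5, 4.0)
y = bent_line(x, *true_params) + rng.normal(0, 0.2, x.size)

# Compare the median-regression objective for the bent line and a straight line.
slope, intercept = np.polyfit(x, y, 1)
loss_bent = objective(true_params, x, y, tau=0.5)
loss_straight = objective((intercept, slope, 0.0, 0.0), x, y, tau=0.5)
```

In practice the objective would be minimized over all four parameters, for example by profiling over a grid of candidate change-points `t`. That step is omitted to keep the sketch short.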

16.
We develop a new method for variable selection in a nonlinear additive function-on-scalar regression (FOSR) model. Existing methods for variable selection in FOSR have focused on the linear effects of scalar predictors, which can be a restrictive assumption in the presence of multiple continuously measured covariates. We propose a computationally efficient approach for variable selection in existing linear FOSR using functional principal component scores of the functional response and extend this framework to a nonlinear additive function-on-scalar model. The proposed method provides a unified and flexible framework for variable selection in FOSR, allowing nonlinear effects of the covariates. Numerical analysis using a simulation study illustrates the advantages of the proposed method over existing variable selection methods in FOSR even when the underlying covariate effects are all linear. The proposed procedure is demonstrated on accelerometer data from the 2003–2004 cohorts of the National Health and Nutrition Examination Survey (NHANES) in understanding the association between diurnal patterns of physical activity and demographic, lifestyle, and health characteristics of the participants.

17.
  1. The relationships between an environmental variable and an ecological response are usually estimated by models fitted through the conditional mean of the response given environmental stress. For example, nonparametric loess and parametric piecewise linear regression model (PLRM) are often used to represent simple to complex nonlinear relationships. In contrast, piecewise linear quantile regression models (PQRM) fitted across various quantiles of the response can reveal nonlinearities in its range of variation across the explanatory variable.
  2. We assess the number and positions of candidate breakpoints using loess and compare the relative efficiencies of PLRM and PQRM to quantitatively determine the breakpoints' location and precision. We propose a nonparametric method to generate bootstrap confidence intervals for breakpoints using PQRM and prediction bands for loess and PQRM. We illustrate the applications using data from two aquatic studies suspected to exhibit multiple environmental breakpoints: relating a fish multimetric index of community health (MMI) to agricultural activity in wetlands' adjacent drainage basins; and relating cyanobacterial biomass to total phosphorus concentration in Canadian lakes.
  3. Two statistically significant breakpoints were detected in each dataset, demarcating boundaries of three linear segments, each with markedly different slopes. PQRM generated less biased, more accurate, and narrower confidence intervals for the breakpoints and narrower prediction bands than PLRM, especially for small samples and large error variability. In both applications, the relationship between the response and environmental variables was weak/nonsignificant below the lower threshold, strong through the midrange of the environmental gradient, and weak/nonsignificant beyond the upper threshold.
  4. We describe several advantages of PQRM over PLRM in characterizing environmental relationships where the scatter of points represents natural environmental variation rather than measurement error. The proposed methodology will be useful for detecting multiple breakpoints in ecological applications where the limits of variation are as important as the conditional mean of a function.

18.
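A piecewise regression model based on the Median function and its application in biology (total citations: 1; self-citations: 0; citations by others: 1)
In biological research, the relationship between the dependent and independent variables often follows several different trends, and a piecewise regression model can fit this kind of nonlinear trend well. This paper introduces a piecewise regression model based on the Median function, which estimates the regression parameters and the breakpoints simultaneously. Finally, the paper fits the piecewise regression model to a biological example using the SAS statistical software.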

19.
There are a number of applied settings where a response is measured repeatedly over time, and the impact of a stimulus at one time is distributed over several subsequent response measures. In the motivating application the stimulus is an air pollutant such as airborne particulate matter and the response is mortality. However, several other variables (e.g. daily temperature) impact the response in a possibly non-linear fashion. To quantify the effect of the stimulus in the presence of covariate data we combine two established regression techniques: generalized additive models and distributed lag models. Generalized additive models extend multiple linear regression by allowing for continuous covariates to be modeled as smooth, but otherwise unspecified, functions. Distributed lag models aim to relate the outcome variable to lagged values of a time-dependent predictor in a parsimonious fashion. The resultant, which we call generalized additive distributed lag models, are seen to effectively quantify the so-called 'mortality displacement effect' in environmental epidemiology, as illustrated through air pollution/mortality data from Milan, Italy.
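
Illustrative sketch (not taken from the article): the distributed lag component can be pictured as a regression on lagged copies of the stimulus, with the lag coefficients constrained to a low-degree polynomial for parsimony. The Gaussian response, the polynomial constraint, the simulated series and the names `lag_matrix` and `fit_polynomial_dlm` are all assumptions. The smooth covariate terms of the generalized additive part are left out.

```python
import numpy as np

def lag_matrix(x, max_lag):
    """Rows are times t = max_lag..n-1; column l holds x[t - l]."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.column_stack([x[max_lag - l : n - l] for l in range(max_lag + 1)])

def fit_polynomial_dlm(y, x, max_lag, degree):
    """Least-squares lag coefficients constrained to a polynomial in the lag index."""
    X = lag_matrix(x, max_lag)
    lags = np.arange(max_lag + 1)
    # Basis with columns 1, l, l^2, ...; the lag coefficients are B @ gamma.
    B = np.vander(lags, degree + 1, increasing=True).astype(float)
    Z = np.column_stack([np.ones(X.shape[0]), X @ B])
    gamma, *_ = np.linalg.lstsq(Z, y[max_lag:], rcond=None)
    return B @ gamma[1:]  # drop the intercept, map back to lag coefficients

# Hypothetical daily stimulus series and a response whose lag effects decay smoothly.
rng = np.random.default_rng(7)
n, max_lag = 2000, 5
x = rng.normal(size=n)
lags = np.arange(max_lag + 1)
beta_true = 0.5 - 0.1 * lags + 0.005 * lags**2
y = np.full(n, np.nan)  # the first max_lag responses have no complete lag history
y[max_lag:] = 3.0 + lag_matrix(x, max_lag) @ beta_true + rng.normal(0, 0.2, n - max_lag)

beta_hat = fit_polynomial_dlm(y, x, max_lag, degree=2)
```

The sum of `beta_hat` estimates the total effect of a unit increase in the stimulus spread over the following `max_lag` time points.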

20.
Neelon B, Dunson DB. Biometrics, 2004, 60(2): 398-406
In many applications, the mean of a response variable can be assumed to be a nondecreasing function of a continuous predictor, controlling for covariates. In such cases, interest often focuses on estimating the regression function, while also assessing evidence of an association. This article proposes a new framework for Bayesian isotonic regression and order-restricted inference. Approximating the regression function with a high-dimensional piecewise linear model, the nondecreasing constraint is incorporated through a prior distribution for the slopes consisting of a product mixture of point masses (accounting for flat regions) and truncated normal densities. To borrow information across the intervals and smooth the curve, the prior is formulated as a latent autoregressive normal process. This structure facilitates efficient posterior computation, since the full conditional distributions of the parameters have simple conjugate forms. Point and interval estimates of the regression function and posterior probabilities of an association for different regions of the predictor can be estimated from a single MCMC run. Generalizations to categorical outcomes and multiple predictors are described, and the approach is applied to an epidemiology application.
