Similar articles
 Found 20 similar articles; search took 33 ms
1.
Summary. Quantile regression, which models the conditional quantiles of the response variable given covariates, usually assumes a linear model. However, this kind of linearity is often unrealistic in real life. One situation where linear quantile regression is not appropriate is when the response variable is piecewise linear but still continuous in covariates. To analyze such data, we propose a bent line quantile regression model. We derive its parameter estimates, prove that they are asymptotically valid given the existence of a change‐point, and discuss several methods for testing the existence of a change‐point in bent line quantile regression together with a power comparison by simulation. An example of land mammal maximal running speeds is given to illustrate an application of bent line quantile regression in which this model is theoretically justified and its parameters are of direct biological interest.
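
The bent line idea in this abstract can be sketched in a few lines. The block below is a minimal illustration, not the authors' estimator: the parameter names `b0`, `b1`, `b2`, `t` and the toy data are assumptions made here, and no fitting routine is included. It shows the check loss that quantile regression minimizes and a continuous two-segment line whose slope changes at a change-point `t`.

```python
def check_loss(u, tau):
    """Quantile (check) loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def bent_line(x, b0, b1, b2, t):
    """Continuous piecewise-linear curve: slope b1 up to t, slope b1 + b2 after t."""
    return b0 + b1 * x + b2 * max(x - t, 0.0)


def total_loss(xs, ys, tau, b0, b1, b2, t):
    """Sum of check losses of the residuals y - bent_line(x)."""
    return sum(check_loss(y - bent_line(x, b0, b1, b2, t), tau)
               for x, y in zip(xs, ys))


# Toy data lying exactly on a bent line with change-point t = 3 (hypothetical values).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [bent_line(x, 1.0, 2.0, -1.5, 3.0) for x in xs]

loss_true = total_loss(xs, ys, 0.5, 1.0, 2.0, -1.5, 3.0)   # true change-point
loss_wrong = total_loss(xs, ys, 0.5, 1.0, 2.0, -1.5, 4.0)  # misplaced change-point
print(loss_true, loss_wrong)
```

A misplaced change-point raises the total loss, which is why a profile search over candidate values of `t` is one plausible way to estimate it; whether the paper uses such a search is not stated in the abstract.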

2.
Practitioners of current data analysis are regularly confronted with the situation where the heavy-tailed skewed response is related to both multiple functional predictors and high-dimensional scalar covariates. We propose a new class of partially functional penalized convolution-type smoothed quantile regression to characterize the conditional quantile level between a scalar response and predictors of both functional and scalar types. The new approach overcomes the lack of smoothness and severe convexity of the standard quantile empirical loss, considerably improving the computing efficiency of partially functional quantile regression. We investigate a folded concave penalized estimator for simultaneous variable selection and estimation by the modified local adaptive majorize-minimization (LAMM) algorithm. The functional predictors can be dense or sparse and are approximated by the principal component basis. Under mild conditions, the consistency and oracle properties of the resulting estimators are established. Simulation studies demonstrate a competitive performance against the partially functional standard penalized quantile regression. A real application using Alzheimer's Disease Neuroimaging Initiative data is utilized to illustrate the practicality of the proposed model.

3.
The successful development and implementation of precision immuno-oncology therapies requires a deeper understanding of the immune architecture at a patient level. T-cell receptor (TCR) repertoire sequencing is a relatively new technology that enables monitoring of T-cells, a subset of immune cells that play a central role in modulating immune response. These immunologic relationships are complex and are governed by various distributional aspects of an individual patient's tumor profile. We propose Bayesian QUANTIle regression for hierarchical COvariates (QUANTICO), which allows simultaneous modeling of hierarchical relationships between multilevel covariates, conducts explicit variable selection, and estimates quantile- and patient-specific coefficient effects to induce individualized inference. We show that QUANTICO outperforms existing approaches in multiple simulation scenarios. We demonstrate the utility of QUANTICO to investigate the effect of TCR variables on immune response in a cohort of lung cancer patients. At the population level, our analyses reveal the mechanistic role of T-cell proportion on the immune cell abundance, with tumor mutation burden as an important factor modulating this relationship. At the patient level, we find several outlier patients based on their quantile-specific coefficient functions, who have higher mutational rates and different smoking history.

4.
Asymmetric regression is an alternative to conventional linear regression that allows us to model the relationship between predictor variables and the response variable while accommodating skewness. Advantages of asymmetric regression include incorporating realistic ecological patterns observed in data, robustness to model misspecification and less sensitivity to outliers. Bayesian asymmetric regression relies on asymmetric distributions such as the asymmetric Laplace (ALD) or asymmetric normal (AND) in place of the normal distribution used in classic linear regression models. Asymmetric regression concepts can be used for process and parameter components of hierarchical Bayesian models and have a wide range of applications in data analyses. In particular, asymmetric regression allows us to fit more realistic statistical models to skewed data and pairs well with Bayesian inference. We first describe asymmetric regression using the ALD and AND. Second, we show how the ALD and AND can be used for Bayesian quantile and expectile regression for continuous response data. Third, we consider an extension to generalize Bayesian asymmetric regression to survey data consisting of counts of objects. Fourth, we describe a regression model using the ALD, and show that it can be applied to add needed flexibility, resulting in better predictive models compared to Poisson or negative binomial regression. We demonstrate concepts by analyzing a data set consisting of counts of Henslow’s sparrows following prescribed fire and provide annotated computer code to facilitate implementation. Our results suggest Bayesian asymmetric regression is an essential component of a scientist’s statistical toolbox.

5.
In this paper, we propose a simple parametric modal linear regression model where the response variable is gamma distributed, using a new parameterization of this distribution that is indexed by mode and precision parameters. That is, in this new regression model, the modal and precision responses are related to a linear predictor through a link function, and the linear predictor involves covariates and unknown regression parameters. The main advantage of our new parameterization is the straightforward interpretation of the regression coefficients in terms of the mode of the positive response variable, as is usual in the context of generalized linear models, and direct inference in parametric mode regression based on the likelihood paradigm. Furthermore, we discuss residuals and influence diagnostic tools. A Monte Carlo experiment is conducted to evaluate the performance of these estimators in finite samples, with a discussion of the results. Finally, we illustrate the usefulness of the new model with two applications, to biology and demography.

6.
In this paper, we propose a frequentist model averaging method for quantile regression with high-dimensional covariates. Although research on these subjects has proliferated as separate approaches, no study has considered them in conjunction. Our method entails reducing the covariate dimensions through ranking the covariates based on marginal quantile utilities. The second step of our method implements model averaging on the models containing the covariates that survive the screening of the first step. We use a delete-one cross-validation method to select the model weights, and prove that the resultant estimator possesses an optimal asymptotic property uniformly over any compact (0,1) subset of the quantile indices. Our proof, which relies on empirical process theory, is arguably more challenging than proofs of similar results in other contexts owing to the high-dimensional nature of the problem and our relaxation of the conventional assumption of the weights summing to one. Our investigation of finite-sample performance demonstrates that the proposed method exhibits very favorable properties compared to the least absolute shrinkage and selection operator (LASSO) and smoothly clipped absolute deviation (SCAD) penalized regression methods. The method is applied to a microarray gene expression data set.

7.
We consider a regression model where the error term is assumed to follow a type of asymmetric Laplace distribution. We explore its use in the estimation of conditional quantiles of a continuous outcome variable given a set of covariates in the presence of random censoring. Censoring may depend on covariates. Estimation of the regression coefficients is carried out by maximizing a non‐differentiable likelihood function. In the scenarios considered in a simulation study, the Laplace estimator showed correct coverage and shorter computation time than the alternative methods considered, some of which occasionally failed to converge. We illustrate the use of Laplace regression with an application to survival time in patients with small cell lung cancer.
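
The link between an asymmetric Laplace error and quantile estimation can be written out briefly. The block below uses the common three-parameter form of the asymmetric Laplace density; the symbols $\mu$, $\sigma$, $\tau$, and $\rho_\tau$ are notation assumed here, the abstract does not say which parameterization the paper adopts, and censoring is ignored in this sketch.

```latex
% Check loss and asymmetric Laplace density (assumed standard parameterization)
\rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr), \qquad 0 < \tau < 1,
\qquad
f(y \mid \mu, \sigma, \tau) = \frac{\tau(1-\tau)}{\sigma}
  \exp\!\left\{-\rho_\tau\!\left(\frac{y-\mu}{\sigma}\right)\right\}.

% With \mu_i = x_i^\top \beta and uncensored data, the log-likelihood is
\ell(\beta, \sigma) = n \log\frac{\tau(1-\tau)}{\sigma}
  - \frac{1}{\sigma}\sum_{i=1}^{n} \rho_\tau\!\bigl(y_i - x_i^\top \beta\bigr),

% so for fixed \sigma, maximizing \ell over \beta is the same as minimizing
% \sum_i \rho_\tau(y_i - x_i^\top \beta), the usual quantile regression objective.
```

Because $\rho_\tau$ has a kink at zero, this likelihood is not differentiable everywhere in $\beta$, which matches the abstract's description of the estimation problem.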

8.
Community resilience offers a conceptual framework for assessing a community's capacity for coping with environmental changes and emergency situations. It is perceived as a core element of a sustainable lifestyle, helping to mitigate the community's reaction to crises by facilitating purposeful and collective action on the part of its members. The conjoint community resilience assessment measure (CCRAM) provides a standard measure of community resilience comprising five factors: leadership, collective efficacy, preparedness, place attachment, and social trust. The mean scores of each of the factors portray a community resilience profile, and the overall CCRAM score is calculated as the equally weighted average of the scores of the 21 survey items. Two regression models were employed: logistic regression, a commonly used tool in applied statistics, and quantile regression, a non-parametric method that facilitates the detection of the effect of a regressor on various quantiles of the dependent variable. The study aims to demonstrate the innovative use of quantile regression modeling in community resilience analysis. The results demonstrate that the quantile regression was significantly more sensitive to sub-populations than the logistic regression. Having a below-average income, which was negatively correlated with perceived community resilience in the logistic model, was found to be significant only in the lower (Q10, Q25) resilience quantiles. Age (per year) and previous involvement in emergency situations, which were not noted as significant in the logistic regression, were found to be positively associated with perceived community resilience in the lowest quantile. A difference between quantiles of perceived community resilience was noted in regard to size of community. The association between size of community and perceived community resilience, which was negative in the logistic regression (residents of larger towns had lower community resilience), was found to be such only up to quantile 75, and it reversed in the highest quantile. It was concluded that the use of quantile regression analysis in studies of community resilience can facilitate the creation of tailored response plans, adapted to the needs of sub-populations (such as weaker ones), and help enhance overall community resilience in crises.

9.
Quantile regression methods have been used to estimate upper and lower quantile reference curves as a function of several covariates. In survival analysis in particular, median regression models for right‐censored data have been suggested under several assumptions. In this article, we consider a median regression model for interval‐censored data and construct an estimating equation based on weights derived from interval‐censored data. In a simulation study, the performance of the proposed method is evaluated for both symmetric and right‐skewed failure time distributions. A well‐known breast cancer data set is analyzed to illustrate the proposed method.

10.
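Crown width is an important variable for describing the growth status of individual trees and for building tree growth and yield models. This study examined 10- to 55-year-old Korean pine plantations at the Dabiangou Forest Farm in the mountainous region of eastern Liaoning, using tree-by-tree measurement data for 2763 Korean pines in 66 permanent sample plots. Starting from a selected base crown width model, we introduced an individual-tree competition index (Rd) by reparameterization and introduced stand density and canopy layer as dummy variables. We then built crown width quantile regression models at several quantiles (0.50, 0.90, 0.93, 0.95, 0.96, 0.99), compared them with the traditional method, and selected the optimal quantile for modeling maximum crown width in the stand. To reflect differences in crown width among individual trees within a stand, we built a plot-level linear mixed-effects quantile regression crown width model at the optimal quantile and analyzed the effect of each variable on individual-tree crown width. The results showed that, based on F tests, crown width models differed significantly among stand densities and canopy layers. After canopy layer, stand density, and competition were introduced into the base model, the adjusted R² (Ra2) increased by 0.0104, the root mean square error decreased by 0.0115, and the mean square error decreased to 7.4%. Compared with least squares, the quantile regression models better simulated the maximum crown width of individual trees under stand conditions, and the 0.96 and 0.93 quantiles were selected as the optimal quantiles for the upper and lower canopy layers, respectively. The linear quantile regression model with mixed effects outperformed conventional quantile regression on the Akaike information criterion, the Bayesian information criterion, and the HQ information criterion; the standard errors of the parameters decreased markedly, and the mixed effects accounted well for differences among plots. For both the upper and lower canopy layers, maximum crown width decreased as stand density increased and increased as relative diameter increased. Stand density affected crown width more strongly in the lower layer than in the upper layer, and when stand density was sufficiently high, crown width first increased and then decreased with increasing diameter at breast height. The mixed-effects quantile regression model built in this study effectively improves goodness of fit. In future, measures such as regulating stand density and moderate tending thinning can support the scientific establishment and sustainable development of Korean pine plantations in the mountainous region of eastern Liaoning.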

11.
ABSTRACT: BACKGROUND: Mass spectrometry (MS) data are often generated from various biological or chemical experiments and there may exist outlying observations, which are extreme due to technical reasons. The determination of outlying observations is important in the analysis of replicated MS data because elaborate pre-processing is essential for successful analysis with reliable results, and manual outlier detection as one of the pre-processing steps is time-consuming. The heterogeneity of variability and low replication are often obstacles to successful analysis, including outlier detection. Existing approaches, which assume constant variability, can generate many false positives (outliers) and/or false negatives (non-outliers). Thus, a more powerful and accurate approach is needed to account for the heterogeneity of variability and low replication. FINDINGS: We proposed an outlier detection algorithm using projection and quantile regression in MS data from multiple experiments. The performance of the algorithm and program was demonstrated by using both simulated and real-life data. The projection approach with linear, nonlinear, or nonparametric quantile regression was appropriate in heterogeneous high-throughput data with low replication. CONCLUSION: Various quantile regression approaches combined with projection were proposed for detecting outliers. The choice among linear, nonlinear, and nonparametric regressions is dependent on the degree of heterogeneity of the data. The proposed approach was illustrated with MS data with two or more replicates.

12.
Ecologists often estimate population trends of animals in time series of counts using linear regression to estimate parameters in a linear transformation of multiplicative growth models, where logarithms of rates of change in counts in time intervals are used as response variables. We present quantile regression estimates for the median (0.50) and interquartile (0.25, 0.75) relationships as an alternative to mean regression estimates for common density-dependent and density-independent population growth models. We demonstrate that the quantile regression estimates are more robust to outliers and require fewer distributional assumptions than conventional mean regression estimates and can provide information on heterogeneous rates of change ignored by mean regression. We provide quantile regression trend estimates for 2 populations of greater sage-grouse (Centrocercus urophasianus) in Wyoming, USA, and for the Crawford population of Gunnison sage-grouse (Centrocercus minimus) in southwestern Colorado, USA. Our selected Gompertz models of density dependence for both populations of greater sage-grouse had smaller negative estimates of density-dependence terms and less variation in corresponding predicted growth rates (λ) for quantile than mean regression models. In contrast, our selected Gompertz models of density dependence with piecewise linear effects of years for the Crawford population of Gunnison sage-grouse had predicted changes in λ across years from quantile regressions that varied more than those from mean regression because of heterogeneity in estimated λs that were both less and greater than mean estimates. Our results add to literature establishing that quantile regression provides better behaved estimates than mean regression when there are outlying growth rates, including those induced by adjustments for zeros in the time series of counts. 
The 0.25 and 0.75 quantiles bracketing the median provide robust estimates of population changes (λ) for the central 50% of time series data and provide a 50% prediction interval for a single new prediction without making parametric distributional assumptions or assuming homogeneous λs. Compared to mean estimates, our quantile regression trend estimates for greater sage-grouse indicated less variation in density-dependent λs by minimizing sensitivity to outlying values, and for Gunnison sage-grouse indicated greater variation in density-dependent λs associated with heterogeneity among quantiles.
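
The robustness claim in this abstract can be illustrated in the simplest, density-independent case, where the response is the log rate of change between consecutive counts and the model has only an intercept. There the mean regression estimate is the sample mean of the log rates and the 0.50 quantile regression estimate is their sample median. The counts below are made up for illustration and are not from the study; the function name is also an assumption.

```python
import math
import statistics


def log_growth_rates(counts):
    """r_t = ln(N_{t+1} / N_t) for consecutive positive counts."""
    return [math.log(b / a) for a, b in zip(counts, counts[1:])]


# Hypothetical series: a steady 5% annual decline, then one anomalous final count.
counts = [100.0 * 0.95 ** t for t in range(10)]
counts[-1] *= 3.0  # single outlying survey year

rates = log_growth_rates(counts)
mean_rate = statistics.mean(rates)      # intercept-only mean regression estimate
median_rate = statistics.median(rates)  # intercept-only 0.50 quantile estimate

print(f"mean log rate:   {mean_rate:+.4f}")
print(f"median log rate: {median_rate:+.4f}")
```

In this toy series the single outlier pulls the mean log rate above zero, suggesting growth, while the median still reflects the underlying decline. Density-dependent models such as the Gompertz form add a covariate and need a proper quantile regression solver, which this sketch does not include.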

13.
In longitudinal studies, measurements of the same individuals are taken repeatedly through time. Often, the primary goal is to characterize the change in response over time and the factors that influence change. Factors can affect not only the location but also more generally the shape of the distribution of the response over time. To make inference about the shape of a population distribution, the widely popular mixed-effects regression, for example, would be inadequate, if the distribution is not approximately Gaussian. We propose a novel linear model for quantile regression (QR) that includes random effects in order to account for the dependence between serial observations on the same subject. The notion of QR is synonymous with robust analysis of the conditional distribution of the response variable. We present a likelihood-based approach to the estimation of the regression quantiles that uses the asymmetric Laplace density. In a simulation study, the proposed method had an advantage in terms of mean squared error of the QR estimator, when compared with the approach that considers penalized fixed effects. Following our strategy, a nearly optimal degree of shrinkage of the individual effects is automatically selected by the data and their likelihood. Also, our model appears to be a robust alternative to the mean regression with random effects when the location parameter of the conditional distribution of the response is of interest. We apply our model to a real data set which consists of self-reported amount of labor pain measurements taken on women repeatedly over time, whose distribution is characterized by skewness, and the significance of the parameters is evaluated by the likelihood ratio statistic.

14.
Flexible estimation of multiple conditional quantiles is of interest in numerous applications, such as studying the effect of pregnancy-related factors on low and high birth weight. We propose a Bayesian nonparametric method to simultaneously estimate noncrossing, nonlinear quantile curves. We expand the conditional distribution function of the response in I-spline basis functions where the covariate-dependent coefficients are modeled using neural networks. By leveraging the approximation power of splines and neural networks, our model can approximate any continuous quantile function. Compared to existing models, our model estimates all rather than a finite subset of quantiles, scales well to high dimensions, and accounts for estimation uncertainty. While the model is arbitrarily flexible, interpretable marginal quantile effects are estimated using accumulative local effect plots and variable importance measures. A simulation study shows that our model can better recover quantiles of the response distribution when the data are sparse, and an analysis of birth weight data is presented.

15.
We develop a new method for variable selection in a nonlinear additive function-on-scalar regression (FOSR) model. Existing methods for variable selection in FOSR have focused on the linear effects of scalar predictors, which can be a restrictive assumption in the presence of multiple continuously measured covariates. We propose a computationally efficient approach for variable selection in existing linear FOSR using functional principal component scores of the functional response and extend this framework to a nonlinear additive function-on-scalar model. The proposed method provides a unified and flexible framework for variable selection in FOSR, allowing nonlinear effects of the covariates. A simulation study illustrates the advantages of the proposed method over existing variable selection methods in FOSR even when the underlying covariate effects are all linear. The proposed procedure is demonstrated on accelerometer data from the 2003–2004 cohorts of the National Health and Nutrition Examination Survey (NHANES) in understanding the association between diurnal patterns of physical activity and demographic, lifestyle, and health characteristics of the participants.

16.
Censored quantile regression models, which offer great flexibility in assessing covariate effects on event times, have attracted considerable research interest. In this study, we consider flexible estimation and inference procedures for competing risks quantile regression, which not only provides meaningful interpretations by using cumulative incidence quantiles but also extends the conventional accelerated failure time model by relaxing some of the stringent model assumptions, such as global linearity and unconditional independence. Current methods for censored quantile regression often involve minimizing an L1‐type convex function or solving nonsmooth estimating equations. This approach could lead to multiple roots in practical settings, particularly with multiple covariates. Moreover, variance estimation involves an unknown error distribution and most methods rely on computationally intensive resampling techniques such as bootstrapping. We extend the induced smoothing procedure for censored quantile regression to the competing risks setting. The proposed procedure permits the fast and accurate computation of quantile regression parameter estimates and standard variances by using conventional numerical methods such as the Newton–Raphson algorithm. Numerical studies show that the proposed estimators perform well and the resulting inference is reliable in practical settings. The method is finally applied to data from a soft tissue sarcoma study.

17.
Summary. In this article, we study the estimation of mean response and regression coefficient in semiparametric regression problems when the response variable is subject to nonrandom missingness. When the missingness is independent of the response conditional on high-dimensional auxiliary information, the parametric approach may misspecify the relationship between covariates and response while the nonparametric approach is infeasible because of the curse of dimensionality. To overcome this, we study a model-based approach to condense the auxiliary information and estimate the parameters of interest nonparametrically on the condensed covariate space. Our estimators possess the double robustness property, i.e., they are consistent whenever the model for the response given auxiliary covariates or the model for the missingness given auxiliary covariates is correct. We conduct a number of simulations to compare the numerical performance of our estimators with that of other existing estimators in the current missing data literature, including the propensity score approach and the inverse probability weighted estimating equation. A set of real data is used to illustrate our approach.

18.
Gaussian process functional regression modeling for batch data   Cited by: 2 (self-citations: 0, citations by others: 2)
A Gaussian process functional regression model is proposed for the analysis of batch data. Covariance structure and mean structure are considered simultaneously, with the covariance structure modeled by a Gaussian process regression model and the mean structure modeled by a functional regression model. The model allows the inclusion of covariates in both the covariance structure and the mean structure. It models the nonlinear relationship between a functional output variable and a set of functional and nonfunctional covariates. Several applications and simulation studies are reported and show that the method provides very good results for curve fitting and prediction.

19.
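Gao Huilin, Dong Lihu, Li Fengri 《生态学杂志》2016,27(11):3420-3426
Based on survey data from 378 permanent sample plots and 415 temporary sample plots in Northeast China and on the Reineke equation, linear quantile regression was used to model the relationship between maximum stand density and mean tree diameter at breast height in Changbai larch plantations at different quantiles (τ = 0.90, 0.95, 0.99), and the best model for fitting the maximum density line of Changbai larch plantations was selected. Maximum density line models were also fitted by ordinary least squares (OLS) and maximum likelihood (ML) regression to manually selected maximum data points. The generalized Pareto model from extreme value theory was used to estimate the limiting maximum number of stems for specific diameter classes in actual stands, and a limiting density line model was then built. The linear quantile regression models were compared with the other methods. The results showed that fitting to the 5 largest data points selected across the full range of diameter classes yielded the maximum density line of actual stands, that selecting too many points made the fitted line deviate from the maximum density line, and that ML outperformed OLS. The linear quantile regression model at the 0.99 quantile gave fits close to those of ML, but its parameter estimates were more stable. Because manual selection of the fitting data is somewhat subjective, the quantile regression model at the 0.99 quantile was finally chosen as the best model for fitting the maximum density line, with parameter estimates k=11.790 and β=-1.586; the parameter estimates for the limiting density line model were k=11.820 and β=-1.594. The limiting density line lay slightly above the maximum density line, but the difference between the two was not pronounced. Validation with the permanent sample plot data showed that the maximum stand density line and the limiting density line can predict the maximum density and limiting density of actual stands, providing a basis for the rational management of Changbai larch plantations.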
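
The two fitted lines reported above can be evaluated directly. The sketch below assumes the natural-log form of the Reineke equation, ln N = k + β ln D, with N the number of stems per unit area and D the mean diameter at breast height; the abstract does not state the log base or the units, so both are assumptions here, as are the function name and the 20 cm example diameter.

```python
import math


def stems_on_line(dbh, k, beta):
    """Stand density N on a Reineke-type line, assuming ln(N) = k + beta * ln(D)."""
    return math.exp(k + beta * math.log(dbh))


# Parameter estimates quoted in the abstract.
K_MAX, BETA_MAX = 11.790, -1.586      # maximum density line (0.99 quantile regression)
K_LIMIT, BETA_LIMIT = 11.820, -1.594  # limiting density line (generalized Pareto)

dbh = 20.0  # example mean diameter, assumed to be in centimetres
max_density = stems_on_line(dbh, K_MAX, BETA_MAX)
limit_density = stems_on_line(dbh, K_LIMIT, BETA_LIMIT)

print(f"maximum density line:  {max_density:.0f} stems")
print(f"limiting density line: {limit_density:.0f} stems")
```

At this example diameter the limiting line gives a slightly larger stem count than the maximum density line, in line with the abstract's remark that the two lines are close.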

20.
A protective effect of breastfeeding on overweight (binary) has been reported by meta-analyses using logistic regression, whereas studies using linear regression and BMI (continuous) detected no significant association. To assess the relationship of these differences with different outcome classification, we compared results for linear, logistic, and quantile regression models in a cross-sectional data set of considerable size. Height, weight, and questionnaire data on 9,368 preschool children were collected during school-entry examinations in 1999 and 2002 in Bavaria, Southern Germany. We calculated multivariable linear, logistic, and quantile regression models with outcomes BMI, overweight, obesity, and BMI quantiles (as appropriate). Models considered the covariates breastfeeding (breastfed vs. never breastfed), gender, age, smoking in pregnancy, TV watching, maternal BMI, parental education, and early infant weight gain. No significant association was found in the linear regression model. In the logistic model, a significant association was observed for obesity (odds ratio: 0.72 (95% confidence interval (CI) 0.55, 0.94)). In quantile regression no significant point estimates were observed for the quantiles from 0.4 to 0.8. However, breastfeeding shifted the BMI of children with values at the 90th and 97th percentiles by -0.23 (95% CI -0.39, -0.07) and -0.26 (95% CI -0.45, -0.07) kg/m², respectively, on average. In contrast, breastfeeding was significantly associated with a small shift toward higher BMI values for BMI quantiles of 0.03 and from 0.1 to 0.3. The detection of associations between breastfeeding and childhood body composition might be related to the coding of the response variable (continuous or binary) and the statistical method used (linear, logistic, or quantile regression). Quantile regression should additionally be applied in such studies.
