Similar Literature
20 similar documents found (search time: 31 ms)
1.
This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS), and investigates their application in the instrumental analysis of nutraceuticals (specifically, fluorescence quenching of the merbromin reagent upon lipoic acid addition). These robust regression methods were used to process calibration data from the fluorescence quenching reaction (ΔF and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: ordinary least squares (OLS), LMS and IRLS. Linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully assessed for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. Under the ideal linearity condition, the intercept and slope changed only slightly, whereas a dramatic change in the intercept was observed under the non-ideal linearity condition. Under both linearity conditions, LOD and LOQ values after robust regression line fitting were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and by robust LMS and IRLS, were compared for both linearity conditions.
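As a hedged illustration of the IRLS idea (not the authors' exact LMS/IRLS procedure), the sketch below fits a made-up calibration line with ordinary least squares and with statsmodels' IRLS-based M-estimation using a Huber norm; LMS itself is not available in statsmodels, so the Huber fit stands in as the robust alternative.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical calibration data: lipoic acid level vs. fluorescence quenching (dF); values are invented
conc = np.array([2, 4, 6, 8, 10, 12, 14, 16], dtype=float)
dF = np.array([5.1, 10.3, 15.2, 19.8, 25.4, 29.9, 36.0, 55.0])  # last point is a gross outlier

X = sm.add_constant(conc)                                # design matrix with intercept
ols = sm.OLS(dF, X).fit()                                # ordinary least squares
irls = sm.RLM(dF, X, M=sm.robust.norms.HuberT()).fit()   # IRLS with Huber weights

print("OLS  intercept, slope:", ols.params)
print("IRLS intercept, slope:", irls.params)
print("IRLS weights (outlier down-weighted):", np.round(irls.weights, 2))
```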

2.
Ordinary least squares (OLS) regression has been widely used to analyze patient-level data in cost-effectiveness analysis (CEA). However, estimates, inference and decision making in economic evaluations based on OLS estimation may be biased by the presence of outliers. Robust estimation, by contrast, remains largely unaffected and provides results that are resistant to outliers. The objective of this study is to explore the impact of outliers on net-benefit regression (NBR) in CEA using OLS and to propose a potential solution using robust estimations, i.e. Huber M-estimation, Hampel M-estimation, Tukey's bisquare M-estimation, MM-estimation and least trimmed squares estimation. Simulations under different outlier-generating scenarios and an empirical example were used to obtain the regression estimates of NBR by OLS and the five robust estimations. Empirical size and empirical power of OLS and the robust estimations were then compared in the context of hypothesis testing. Simulations showed that the five robust approaches led to lower empirical sizes and higher empirical powers than OLS estimation in testing cost-effectiveness. In a real example of antiplatelet therapy, the incremental net benefit estimated by OLS was lower than those from the robust approaches because of outliers in the cost data, and the robust estimations gave a higher probability of cost-effectiveness than OLS. The presence of outliers can bias the results of NBR and their interpretation; robust estimation in NBR is recommended as an appropriate way to avoid such biased decision making.
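A minimal net-benefit regression sketch under assumed, simulated patient-level data: the individual net benefit NB_i = λ·E_i − C_i is regressed on a treatment indicator with OLS and with three of the M-estimators named above via statsmodels (MM-estimation and least trimmed squares would require other packages).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
treat = rng.integers(0, 2, n)                              # 0 = control, 1 = new therapy
effect = 0.60 + 0.05 * treat + rng.normal(0, 0.10, n)      # simulated QALYs
cost = 4000 + 1500 * treat + rng.gamma(2, 500, n)          # skewed simulated costs
cost[:3] += 60000                                          # a few extreme cost outliers

wtp = 30000                                                # willingness to pay per QALY (lambda)
nb = wtp * effect - cost                                   # patient-level net benefit
X = sm.add_constant(treat.astype(float))

print("OLS             INB:", round(sm.OLS(nb, X).fit().params[1], 1))
for name, norm in [("Huber", sm.robust.norms.HuberT()),
                   ("Hampel", sm.robust.norms.Hampel()),
                   ("Tukey bisquare", sm.robust.norms.TukeyBiweight())]:
    print(f"{name:15s} INB:", round(sm.RLM(nb, X, M=norm).fit().params[1], 1))
```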

3.
In simple regression, two serious problems with the ordinary least squares (OLS) estimator are that its efficiency can be relatively poor when the error term is normal but heteroscedastic, and that the usual confidence interval for the slope can have highly unsatisfactory probability coverage. When the error term is nonnormal, these problems are exacerbated. Two other concerns are that the OLS estimator has an unbounded influence function and a breakdown point of zero. Wilcox (1996) compared several estimators under heteroscedasticity and found two that have relatively good efficiency while providing protection against outliers: an M-estimator with Schweppe weights and an estimator proposed by Cohen, Dalal and Tukey (1993). However, the M-estimator can handle only one outlier in the X-domain or among the Y values, and among the methods considered by Wilcox for computing confidence intervals for the slope, none performed well with the Cohen-Dalal-Tukey estimator. This note points out that the small-sample efficiency of the Theil-Sen estimator competes well with the estimators considered by Wilcox, and a method for computing a confidence interval was found that performs well in simulations. The Theil-Sen estimator has a reasonably high breakdown point, a bounded influence function, and in some cases its small-sample efficiency offers a substantial advantage over all of the estimators compared in Wilcox (1996).
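A short sketch of the Theil-Sen estimator (the median of all pairwise slopes) on simulated heavy-tailed data with one gross outlier, using scipy; the confidence interval shown is SciPy's rank-based interval, which is only an analogue of the interval method studied in the note.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 0.5 * x + rng.standard_t(df=3, size=50)   # heavy-tailed errors
y[0] += 15                                    # one gross outlier among the Y values

slope_ols, intercept_ols = np.polyfit(x, y, 1)
ts = stats.theilslopes(y, x, alpha=0.95)      # median of pairwise slopes

print("OLS slope      :", round(slope_ols, 3))
print("Theil-Sen slope:", round(ts[0], 3))
print("95% CI for Theil-Sen slope:", (round(ts[2], 3), round(ts[3], 3)))
```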

4.
Physiological and ecological allometries often pose linear regression problems characterized by (1) noncausal, phylogenetically autocorrelated independent (x) and dependent (y) variables (characters); (2) random variation in both variables; and (3) a focus on regression slopes (allometric exponents). Remedies for the phylogenetic autocorrelation of species values (phylogenetically independent contrasts) and variance structure of the data (reduced major axis [RMA] regression) have been developed, but most functional allometries are reported as ordinary least squares (OLS) regression without use of phylogenetically independent contrasts. We simulated Brownian diffusive evolution of functionally related characters and examined the importance of regression methodologies and phylogenetic contrasts in estimating regression slopes for phylogenetically constrained data. Simulations showed that both OLS and RMA regressions exhibit serious bias in estimated regression slopes under different circumstances but that a modified orthogonal (least squares variance-oriented residual [LSVOR]) regression was less biased than either OLS or RMA regressions. For strongly phylogenetically structured data, failure to use phylogenetic contrasts as regression data resulted in overestimation of the strength of the regression relationship and a significant increase in the variance of the slope estimate. Censoring of data sets by simulated extinction of taxa did not affect the importance of appropriate regression models or the use of phylogenetic contrasts.
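A small numeric illustration (simulated data standing in for phylogenetic contrasts; LSVOR regression is not shown) of how OLS and RMA slope estimates behave when both characters carry random variation: OLS is attenuated toward zero, while RMA simply returns the ratio of standard deviations.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
true_slope = 0.75                       # an assumed allometric exponent
x_true = rng.normal(0, 1, n)
x = x_true + rng.normal(0, 0.5, n)      # random variation in the "independent" character
y = true_slope * x_true + rng.normal(0, 0.5, n)

b_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)                                   # attenuated
b_rma = np.sign(np.corrcoef(x, y)[0, 1]) * np.std(y, ddof=1) / np.std(x, ddof=1)

print("true:", true_slope, " OLS:", round(b_ols, 3), " RMA:", round(b_rma, 3))
```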

5.
高慧淋, 董利虎, 李凤日. 生态学杂志, 2016, 27(11): 3420-3426
Based on survey data from 378 permanent plots and 415 temporary plots in northeastern China and the Reineke equation, linear quantile regression was used to model the relationship between maximum stand density and mean diameter at breast height (DBH) for Larix olgensis plantations at different quantiles (τ = 0.90, 0.95, 0.99), and the best model for fitting the maximum density line of these plantations was selected. Maximum density line models were also built with ordinary least squares (OLS) and maximum likelihood (ML) regression using manually selected maximum data points. A generalized Pareto model from extreme-value statistics was used to estimate the limiting maximum number of trees for specific diameter classes in actual stands, from which a limiting density line model was further developed. The linear quantile regression models were compared with these other methods. The results showed that fitting to the five largest data points across the full range of diameter classes yielded the maximum density line of actual stands; selecting too many points shifted the fitted line away from the maximum density line, and the ML method outperformed OLS. The linear quantile regression model at τ = 0.99 achieved fits close to ML, but its parameter estimates were more stable. Because manual selection of fitting data is somewhat subjective, the quantile regression model at τ = 0.99 was ultimately chosen as the best model for the maximum density line, with parameter estimates k = 11.790 and β = −1.586; the limiting density line parameters were k = 11.820 and β = −1.594. The limiting density line lay slightly above the maximum density line, but the difference was not pronounced. Validation against the permanent plot data showed that the fitted maximum and limiting density lines can predict the maximum and limiting densities of actual stands, providing a basis for the rational management of Larix olgensis plantations.
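A hedged sketch of the quantile-regression step (assuming statsmodels and simulated stand data rather than the authors' plots): the Reineke relationship ln N = k + β ln D is fitted at τ = 0.99 so that the line tracks the upper boundary (maximum density line) instead of the mean; the generalized Pareto / limiting-density step is not shown.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n_plots = 300
Dq = rng.uniform(6, 30, n_plots)                    # quadratic mean DBH (cm), simulated
ceiling = np.exp(11.79) * Dq ** -1.586              # an assumed Reineke-type density ceiling
N = ceiling * rng.uniform(0.3, 1.0, n_plots)        # most stands lie below the maximum line

df = pd.DataFrame({"lnN": np.log(N), "lnD": np.log(Dq)})
ols = smf.ols("lnN ~ lnD", df).fit()                # tracks the mean, not the upper boundary
q99 = smf.quantreg("lnN ~ lnD", df).fit(q=0.99)     # upper-boundary (maximum density) line

print("OLS      k, beta:", round(ols.params["Intercept"], 3), round(ols.params["lnD"], 3))
print("tau=0.99 k, beta:", round(q99.params["Intercept"], 3), round(q99.params["lnD"], 3))
```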

6.
Soil cation exchange capacity (CEC) is a key basis for soil fertilization and improvement and an indicator of soil quality; studying its spatial distribution and predictive modeling provides a scientific basis for soil nutrient monitoring and management and for precision agriculture. Taking the silt loam soils of wolfberry (goji) plantations in Zhongning as the study object, and building on autocorrelation and cross-correlation analyses, co-kriging (CoKriging), ordinary least squares (OLS), geographically weighted regression (GWR) and random forest (RF) models were used for regression analysis of soil CEC, and their mapping quality and prediction accuracy were compared. The results showed that the mean CEC of the silt loam in the Zhongning wolfberry plantations was 13.12 cmol·kg⁻¹, indicating medium fertility. The spatial distribution of soil CEC was autocorrelated and showed different spatial interrelationships with soil pH, organic matter, clay content and electrical conductivity at different lag distances. The RF prediction map avoided the heavy fragmentation and abrupt changes on either side of patch boundaries seen in the CoKriging, OLS and GWR prediction maps, so that soil CEC showed a natural, gentle spatial transition. The RMSE of the RF model was 33.82%, 20.55% and 19.81% lower than those of the CoKriging, OLS and GWR models, respectively, and its R² was 8.84%, 51.92% and 7.69% higher. The RF model takes the spatial locations of the sample points into account, clearly improving interpolation accuracy and producing smoother maps.
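A hedged sketch (simulated covariates and CEC values, not the study's survey data) of the random forest versus OLS comparison using scikit-learn, with sample coordinates added as predictors so that the forest can pick up spatial structure.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 400
xc, yc = rng.uniform(0, 10, n), rng.uniform(0, 10, n)                      # sample coordinates
ph, om = rng.normal(8, 0.3, n), rng.gamma(4, 3, n)                         # pH, organic matter
clay, ec = rng.uniform(5, 30, n), rng.gamma(2, 0.5, n)                     # clay %, EC
cec = 2 + 0.3 * om + 0.25 * clay + 0.5 * np.sin(xc) + rng.normal(0, 1, n)  # cmol/kg, simulated

X = np.column_stack([ph, om, clay, ec, xc, yc])
X_tr, X_te, y_tr, y_te = train_test_split(X, cec, test_size=0.3, random_state=0)

for name, model in [("OLS", LinearRegression()),
                    ("RF ", RandomForestRegressor(n_estimators=500, random_state=0))]:
    pred = model.fit(X_tr, y_tr).predict(X_te)
    print(name, "RMSE:", round(mean_squared_error(y_te, pred) ** 0.5, 3),
          " R2:", round(r2_score(y_te, pred), 3))
```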

7.
A GWR model for hyperspectral prediction of soil chromium content in Fuzhou City
江振蓝, 杨玉盛, 沙晋明. 生态学报, 2017, 37(23): 8117-8127
By systematically analyzing how spectral resolution and spectral transformations affect the uncertainty of hyperspectral prediction models for soil chromium, the optimal spectral resolution and spectral variables were selected to build a geographically weighted regression (GWR) model for predicting soil chromium content. The model was used to predict soil chromium across Fuzhou City, the predictions were compared with those of ordinary least squares (OLS) regression, and the applicability and limitations of GWR for hyperspectral prediction of soil chromium were examined. The results showed the following. (1) At 10 nm spectral resolution, the GWR model with total soil chromium content as the dependent variable and the second derivative of reflectance and the second derivative of the reciprocal of reflectance as independent variables predicted soil chromium best: its R² and adjusted R² were 0.821 and 0.716, higher than the OLS model by 0.529 and 0.450, respectively; its AIC was 720.703, 22 units lower than that of the OLS model; and its residual sum of squares was only one quarter that of the OLS model, indicating a marked improvement over OLS. (2) The accuracy of soil chromium prediction depended on spectral resolution: the OLS model performed best at 3 nm resolution, whereas the GWR model performed best at 10 nm resolution, where its improvement over OLS was also most pronounced, making 10 nm the optimal spectral resolution for GWR prediction of soil chromium. (3) The first-derivative transformation of the spectra effectively enhanced the spectral features of soil chromium; the other transformations did not enhance these features but still improved model prediction. (4) The optimal spectral resolution for GWR prediction of soil chromium, 10 nm, matches the spectral resolution of EO-1 Hyperion imagery, and GWR predictions stabilized as the number of sampling points increased, making the model suitable for regional-scale prediction of soil chromium in areas with strong spatial heterogeneity. Combined with hyperspectral imagery, the model can therefore be extended from the laboratory scale to the regional scale, enabling grid-scale spatial prediction of soil chromium.
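A small sketch (assuming scipy and an invented reflectance spectrum) of the second-derivative spectral transforms described above, computed with a Savitzky-Golay filter; the window length and polynomial order are arbitrary choices, not the study's settings.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
# Hypothetical soil reflectance spectrum resampled to 10 nm resolution (400-2400 nm)
wavelength = np.arange(400, 2401, 10, dtype=float)
reflectance = (0.25 + 0.15 * np.tanh((wavelength - 1200) / 400)
               + 0.02 * np.sin(wavelength / 90)
               + rng.normal(0, 0.002, wavelength.size))

step = 10.0  # nm between adjacent bands
d2_refl = savgol_filter(reflectance, window_length=11, polyorder=3, deriv=2, delta=step)
d2_inv = savgol_filter(1.0 / reflectance, window_length=11, polyorder=3, deriv=2, delta=step)

# d2_refl and d2_inv are the kind of spectral variables fed to the chromium GWR/OLS models
print(d2_refl[:3], d2_inv[:3])
```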

8.
A dimensionless approach to the study of life-history evolution has been applied to a wide variety of variables in the search for life-history invariants. This approach usually employs ordinary least squares (OLS) regressions of log-transformed data. In several well-studied combinations of variables, the range of values of one parameter is bounded or limited by the value of the other. In this situation, the null hypothesis normally applied to regression analysis is not appropriate. We generate the null expectations and confidence intervals (CI) for OLS and reduced major axis (RMA) regressions using random variables that are bounded in this way. Comparisons of these CI show that, for log-transformed data, the patterns generated by random data and those predicted by life-history invariant theory often cannot be distinguished, because both predict a slope of 1. We recommend that tests based on the putative invariant ratios, and not the correlations between the two variables, be used in the exploration of life-history invariants using bounded data. Because empirical data are often not normally distributed, randomization tests may be more appropriate than standard statistical tests.

9.
Longitudinal studies are often applied in biomedical research and clinical trials to evaluate the treatment effect. The association pattern within the subject must be considered in both sample size calculation and the analysis. One of the most important approaches to analyzing such a study is the generalized estimating equation (GEE) proposed by Liang and Zeger, in which a "working correlation structure" is introduced and the association pattern within the subject depends on a vector of association parameters denoted by ρ. Explicit sample size formulas for two-group comparison in linear and logistic regression models were obtained by Liu and Liang based on the GEE method. For cluster randomized trials (CRTs), researchers have proposed optimal sample sizes at both the cluster and the individual level as a function of sampling costs and the intracluster correlation coefficient (ICC). In these approaches, the optimal sample sizes depend strongly on the ICC. However, the ICC is usually unknown for CRTs and multicenter trials. To overcome this shortcoming, Van Breukelen et al. consider a range of possible ICC values identified from literature reviews and present Maximin designs (MMDs) based on relative efficiency (RE) and efficiency under budget and cost constraints. In this paper, the optimal sample size and number of repeated measurements using GEE models with an exchangeable working correlation matrix are proposed under the consideration of a fixed budget, where "optimal" refers to maximum power for a given sampling budget. The equations for the sample size and number of repeated measurements for a known parameter value ρ are derived, and a straightforward algorithm for unknown ρ is developed. Applications in practice are discussed. We also discuss the existence of the optimal design when an AR(1) working correlation matrix is assumed. The proposed method can be extended to scenarios in which the true and working correlation matrices differ.
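A hedged sketch (simulated longitudinal data, statsmodels assumed) of fitting a GEE with an exchangeable working correlation — the setting in which the association parameter ρ above enters; the budget-constrained optimization itself is not reproduced.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(11)
n_subj, n_rep = 100, 4
subj = np.repeat(np.arange(n_subj), n_rep)
time = np.tile(np.arange(n_rep), n_subj)
group = np.repeat(rng.integers(0, 2, n_subj), n_rep)        # treatment vs. control
subj_eff = np.repeat(rng.normal(0, 1.0, n_subj), n_rep)     # induces within-subject correlation
y = 1.0 + 0.5 * group + 0.2 * time + subj_eff + rng.normal(0, 1.0, n_subj * n_rep)

df = pd.DataFrame({"y": y, "group": group, "time": time, "subj": subj})
model = sm.GEE.from_formula("y ~ group + time", groups="subj", data=df,
                            cov_struct=sm.cov_struct.Exchangeable(),
                            family=sm.families.Gaussian())
res = model.fit()
print(res.params)
print(model.cov_struct.summary())    # estimated exchangeable correlation (rho)
```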

10.
Aim  The objective of this paper is to obtain a net primary production (NPP) regression model based on the geographically weighted regression (GWR) method, which includes spatial non-stationarity in the parameters estimated for forest ecosystems in China.
Location  We used data across China.
Methods  We examine the relationships between NPP of Chinese forest ecosystems and environmental variables, specifically altitude, temperature, precipitation and time-integrated normalized difference vegetation index (TINDVI), based on the ordinary least squares (OLS) regression, the spatial lag model and GWR methods.
Results  The GWR method made significantly better predictions of NPP in simulations than did OLS, as indicated by both the corrected Akaike Information Criterion (AICc) and R². GWR provided a value of 4891 for AICc and 0.66 for R², compared with 5036 and 0.58, respectively, by OLS. GWR has the potential to reveal local patterns in the spatial distribution of a parameter, which would be ignored by the OLS approach. Furthermore, OLS may provide a false general relationship between spatially non-stationary variables. Spatial autocorrelation violates a basic assumption of the OLS method. The spatial lag model, which accounts for spatial autocorrelation, improved the NPP simulation relative to OLS (5001 for AICc and 0.60 for R²), but it was still not as good as the GWR method. Moreover, statistically significant positive spatial autocorrelation remained in the NPP residuals of the spatial lag model at small spatial scales, while no positive spatial autocorrelation across spatial scales could be found in the GWR residuals.
Conclusions  The regression analysis for Chinese forest NPP with respect to environmental factors, based alternatively on OLS, the spatial lag model and GWR, indicated a significant improvement in model performance of GWR over OLS and the spatial lag model.
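A hedged GWR-versus-OLS sketch on simulated data with a spatially varying coefficient, using the mgwr package (assumed installed); the bandwidth search, data and AICc values are illustrative only.

```python
import numpy as np
import statsmodels.api as sm
from mgwr.gwr import GWR
from mgwr.sel_bw import Sel_BW

rng = np.random.default_rng(5)
n = 300
u, v = rng.uniform(0, 100, n), rng.uniform(0, 100, n)     # site coordinates
coords = list(zip(u, v))
temp = rng.normal(10, 5, n)                                # e.g. mean annual temperature
beta = 20 + 0.5 * u                                        # effect of temperature varies in space
npp = 200 + beta * temp + rng.normal(0, 50, n)

y = npp.reshape(-1, 1)
X = temp.reshape(-1, 1)

ols = sm.OLS(npp, sm.add_constant(temp)).fit()
bw = Sel_BW(coords, y, X).search()                         # bandwidth selection
gwr = GWR(coords, y, X, bw).fit()

print("OLS AIC :", round(ols.aic, 1))
print("GWR AICc:", round(gwr.aicc, 1), " bandwidth:", bw)
```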

11.
When it comes to fitting simple allometric slopes through measurement data, evolutionary biologists have been torn between regression methods. On the one hand, there is the ordinary least squares (OLS) regression, which is commonly used across many disciplines of biology to fit lines through data, but which has a reputation for underestimating slopes when measurement error is present. On the other hand, there is the reduced major axis (RMA) regression, which is often recommended as a substitute for OLS regression in studies of allometry, but which has several weaknesses of its own. Here, we review statistical theory as it applies to evolutionary biology and studies of allometry. We point out that the concerns that arise from measurement error for OLS regression are small and straightforward to deal with, whereas RMA has several key properties that make it unfit for use in the field of allometry. The recommended approach for researchers interested in allometry is to use OLS regression on measurements taken with low (but realistically achievable) measurement error. If measurement error is unavoidable and relatively large, it is preferable to correct for slope attenuation rather than to turn to RMA regression, or to take the expected amount of attenuation into account when interpreting the data.
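A small numeric sketch of the attenuation correction favoured above: if the measurement-error variance in x is known (e.g. from repeatability data), the OLS slope can be divided by the reliability ratio Var(true x) / Var(observed x); all values here are made up.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 500
x_true = rng.normal(0, 1, n)                   # true log trait value
sigma_err = 0.4                                # assumed known measurement-error SD
x_obs = x_true + rng.normal(0, sigma_err, n)
y = 0.75 * x_true + rng.normal(0, 0.3, n)      # true allometric slope 0.75

b_ols = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)
reliability = 1 - sigma_err**2 / np.var(x_obs, ddof=1)   # Var(true x) / Var(observed x)

print("attenuated OLS slope :", round(b_ols, 3))
print("attenuation-corrected:", round(b_ols / reliability, 3))   # close to 0.75
```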

12.
Aim  Although parameter estimates are not as affected by spatial autocorrelation as Type I errors, the change from classical null hypothesis significance testing to model selection under an information-theoretic approach does not completely avoid the problems caused by spatial autocorrelation. Here we briefly review the model selection approach based on the Akaike information criterion (AIC) and present a new routine for the Spatial Analysis in Macroecology (SAM) software that helps establish minimum adequate models in the presence of spatial autocorrelation.
Innovation  We illustrate how a model selection approach based on the AIC can be used with geographical data by modelling patterns of mammal species in South America, represented in a grid system (n = 383) with 2° resolution, as a function of five environmental explanatory variables, performing an exhaustive search for minimum adequate models under three regression methods: non-spatial ordinary least squares (OLS), spatial eigenvector mapping and the autoregressive (lagged-response) model. The models selected by the spatial methods included fewer explanatory variables than the one selected by OLS, and the minimum adequate models contained different explanatory variables, although model averaging revealed a similar ranking of explanatory variables.
Main conclusions  We stress that the AIC is sensitive to the presence of spatial autocorrelation, generating unstable and overfitted minimum adequate models for macroecological data when based on non-spatial OLS regression. Alternative regression techniques provided different minimum adequate models and have different uncertainty levels. Despite this, the averaged model based on Akaike weights generates consistent and robust results across methods and may be the best approach for understanding macroecological patterns.
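A hedged sketch of the non-spatial part of this workflow (simulated richness and environmental data, statsmodels assumed): an exhaustive OLS search over subsets of five explanatory variables, ranked by AIC and summarized with Akaike weights; the spatial eigenvector and autoregressive alternatives are not reproduced here.

```python
import itertools
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 383                                              # grid cells, as in the example
env = pd.DataFrame(rng.normal(size=(n, 5)), columns=["temp", "prec", "alt", "ndvi", "aet"])
richness = 50 + 8 * env["temp"] + 5 * env["prec"] + rng.normal(0, 10, n)

candidates = []
for k in range(1, 6):
    for subset in itertools.combinations(env.columns, k):
        fit = sm.OLS(richness, sm.add_constant(env[list(subset)])).fit()
        candidates.append((fit.aic, subset))

candidates.sort()
aics = np.array([aic for aic, _ in candidates])
weights = np.exp(-0.5 * (aics - aics.min()))
weights /= weights.sum()                             # Akaike weights over the candidate set

for (aic, subset), w in list(zip(candidates, weights))[:5]:
    print(round(aic, 1), round(w, 3), subset)
```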

13.
Forest carbon stocks have an important influence on global climate change, but previous model estimates have not accounted for the spatial correlation of model residuals or the non-stationarity of carbon stock data, which reduces prediction accuracy. Based on ETM+ remote sensing imagery and 193 permanent plots of the Maoershan Experimental Forest Farm of Northeast Forestry University, this study used geographically weighted regression kriging (GWRK) to build a regression model relating forest carbon stocks to remote sensing and topographic factors, and compared its prediction accuracy with ordinary least squares (OLS) and geographically weighted regression (GWR) models. The results showed that, for carbon stock estimation in the Maoershan area, the mean absolute error (MAE) and root mean square error (RMSE) of GWRK were lower than those of the OLS and GWR models, and its mean error (ME) was lower than that of GWR and close to that of OLS. The prediction accuracy of the GWRK model was 83.2%, roughly 10% and 6% higher than the OLS model (73.7%) and the GWR model (77.3%), respectively, a clear improvement in fit, indicating that GWRK is an effective method for estimating forest carbon stocks. The mean forest carbon stock of the study area predicted by GWRK was 70.31 t·hm⁻²; values were relatively high at higher elevations, indicating that elevation has a considerable influence on carbon stocks.
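A hedged sketch of the regression-kriging idea behind GWRK (with a plain OLS trend standing in for the geographically weighted trend): fit a trend on the covariates, krige the residuals with pykrige (assumed installed), and add the two surfaces; plots, covariates and carbon values are simulated.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from pykrige.ok import OrdinaryKriging

rng = np.random.default_rng(9)
n = 193                                                    # plots, as in the example
x, y = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
ndvi = rng.uniform(0.2, 0.9, n)
elev = rng.uniform(300, 800, n)
carbon = 20 + 60 * ndvi + 0.03 * elev + 5 * np.sin(x) + rng.normal(0, 3, n)  # t/ha, simulated

train, test = slice(0, 150), slice(150, n)
covars = np.column_stack([ndvi, elev])

trend = LinearRegression().fit(covars[train], carbon[train])
resid_train = carbon[train] - trend.predict(covars[train])

ok = OrdinaryKriging(x[train], y[train], resid_train, variogram_model="spherical")
resid_test, _ = ok.execute("points", x[test], y[test])     # kriged residuals at held-out plots

pred_trend = trend.predict(covars[test])
pred_rk = pred_trend + resid_test                          # regression-kriging prediction

def rmse(err):
    return float(np.sqrt(np.mean(np.square(err))))

print("RMSE, trend only        :", round(rmse(carbon[test] - pred_trend), 2))
print("RMSE, regression kriging:", round(rmse(carbon[test] - pred_rk), 2))
```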

14.
1. Despite a substantial body of work, there remains much disagreement about the form of the relationship between organism abundance and body size. In an attempt to resolve these disagreements, the shape and slope of samples from simulated and real abundance-mass distributions were assessed by ordinary least squares (OLS) regression and the reduced major axis (RMA) method.
2. It is suggested that the data gathered by ecologists to assess these relationships are usually truncated with respect to density. Under these conditions, RMA gives slope estimates that are consistently closer to the true slopes than OLS regression.
3. The triangular relationships reported by some workers are found over smaller mass and abundance ranges than linear relations. Scatter in slope estimates is much greater, and positive slopes are more common, at small sample sizes and sample ranges. These results support the notion that inadequate and truncated sampling is responsible for much of the disagreement reported in the literature.
4. The results strongly support the notion that density declines with increasing body mass in a broad, linear band with a slope around −1. However, there is some evidence to suggest that this overall relation results from a series of component relations with slopes that differ from the overall slope.

15.
Pop-Inference is an educational tool designed to help teach hypothesis testing using populations. The application allows for the statistical comparison of demographic parameters among populations. Input demographic data are projection matrices or raw demographic data. Randomization tests are used to compare populations: the tests evaluate the hypothesis that demographic parameters differ among groups of individuals more than would be expected from random allocation of individuals to populations. Confidence intervals for demographic parameters are obtained using the bootstrap. Tests may be global or pairwise. In addition to tests on differences, one-way life table response experiments (LTRE) are available for random and fixed factors. Planned (a priori) comparisons are possible. The power of the comparison tests is evaluated by constructing the distribution of the test statistic when the null hypothesis is true and when it is false. The relationship between power and sample size is explored by evaluating differences among populations at increasing population sizes, while keeping vital rates constant.
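A hedged, deliberately simplified sketch of one of the resampling ideas above — a bootstrap confidence interval for the population growth rate λ (dominant eigenvalue of a projection matrix) built from made-up individual records for a two-stage population; the tool's full bookkeeping, randomization tests and LTRE are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(4)

def lam(matrix):
    """Dominant eigenvalue (asymptotic growth rate) of a projection matrix."""
    return float(np.max(np.real(np.linalg.eigvals(matrix))))

def build_matrix(juv_fate, adult_surv, adult_fec):
    """Two-stage matrix from individual fates (0 = died, 1 = stayed juvenile, 2 = matured)."""
    return np.array([[np.mean(juv_fate == 1), np.mean(adult_fec)],
                     [np.mean(juv_fate == 2), np.mean(adult_surv)]])

# Made-up individual demographic records for one population
juv_fate = rng.choice([0, 1, 2], size=120, p=[0.40, 0.35, 0.25])
adult_surv = rng.binomial(1, 0.8, size=80)
adult_fec = rng.poisson(1.2, size=80)

obs = lam(build_matrix(juv_fate, adult_surv, adult_fec))

# Bootstrap CI for lambda: resample individuals within each stage
boot = [lam(build_matrix(rng.choice(juv_fate, juv_fate.size, replace=True),
                         rng.choice(adult_surv, adult_surv.size, replace=True),
                         rng.choice(adult_fec, adult_fec.size, replace=True)))
        for _ in range(2000)]

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"lambda = {obs:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```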

16.
Determination of material parameters for soft tissue frequently involves regression of material parameters for nonlinear, anisotropic constitutive models against experimental data from heterogeneous tests. Here, parameter estimation based on membrane inflation is considered. A four parameter nonlinear, anisotropic hyperelastic strain energy function was used to model the material, in which the parameters are cast in terms of key response features. The experiment was simulated using finite element (FE) analysis in order to predict the experimental measurements of pressure versus profile strain. Material parameter regression was automated using inverse FE analysis; parameter values were updated by use of both local and global techniques, and the ability of these techniques to efficiently converge to a best case was examined. This approach provides a framework in which additional experimental data, including surface strain measurements or local structural information, may be incorporated in order to quantify heterogeneous nonlinear material properties.

17.
H. C. Thode, S. J. Finch, N. R. Mendell. Biometrics, 1988, 44(4): 1195-1201
We find the percentage points of the likelihood ratio test of the null hypothesis that a sample of n observations is from a normal distribution with unknown mean and variance against the alternative that the sample is from a mixture of two distinct normal distributions, each with unknown mean and unknown (but equal) variance. The mixing proportion π is also unknown under the alternative hypothesis. For 2,500 samples of sizes n = 15, 20, 25, 40, 50, 70, 75, 80, 100, 150, 250, 500, and 1,000, we calculated the likelihood ratio statistic, and from these values estimated the percentage points of the null distributions. Our algorithm for the calculation of the maximum likelihood estimates of the unknown parameters included precautions against convergence of the maximization algorithm to a local rather than a global maximum. Investigations of convergence to an asymptotic distribution indicated that convergence was very slow and that stability was not apparent for samples as large as 1,000. Comparison of the percentage points with the commonly assumed chi-squared distribution with 2 degrees of freedom indicated that this assumption is too liberal; i.e., one's P-value is greater than that indicated by χ²(2). We conclude that one would need what is usually an unfeasibly large sample size (n > 1,000) for the use of large-sample approximations to be justified.
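A hedged sketch of how such null percentage points can be simulated (scikit-learn and scipy assumed, far fewer replicates than the paper's 2,500): the likelihood-ratio statistic compares a single normal fit with a two-component, equal-variance Gaussian mixture fitted by EM with multiple restarts to guard against local maxima.

```python
import numpy as np
from scipy.stats import chi2, norm
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)

def lrt_stat(sample):
    """2 * (log L of 2-component equal-variance mixture - log L of a single normal)."""
    x = sample.reshape(-1, 1)
    ll1 = norm.logpdf(x, loc=x.mean(), scale=x.std()).sum()          # single-normal MLE fit
    gm = GaussianMixture(n_components=2, covariance_type="tied",
                         n_init=10, random_state=0).fit(x)            # restarts vs. local maxima
    ll2 = gm.score(x) * x.shape[0]                                    # total mixture log-likelihood
    return 2.0 * (ll2 - ll1)

n, n_sim = 50, 500                      # the paper uses 2,500 replicates per sample size
null_stats = np.array([lrt_stat(rng.normal(size=n)) for _ in range(n_sim)])

print("simulated 95th percentile     :", round(np.percentile(null_stats, 95), 2))
print("chi-squared(2) 95th percentile:", round(chi2.ppf(0.95, df=2), 2))
```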

18.
Y. Shin, S. W. Raudenbush. Biometrics, 2007, 63(4): 1262-1268
The development of model-based methods for incomplete data has been a seminal contribution to statistical practice. Under the assumption of ignorable missingness, one estimates the joint distribution of the complete data for θ ∈ Θ from the incomplete or observed data y_obs. Many interesting models involve one-to-one transformations of θ. For example, with y_i ∼ N(μ, Σ) for i = 1, ..., n and θ = (μ, Σ), an ordinary least squares (OLS) regression model is a one-to-one transformation of θ. Inferences based on such a transformation are equivalent to inferences based on OLS using data multiply imputed from f(y_mis | y_obs, θ) for missing y_mis. Thus, identification of θ from y_obs is equivalent to identification of the regression model. In this article, we consider a model for two-level data with continuous outcomes where the observations within each cluster are dependent. The parameters of the hierarchical linear model (HLM) of interest, however, generally lie in a subspace of Θ. This identification of the joint distribution overidentifies the HLM. We show how to characterize the joint distribution so that its parameters are a one-to-one transformation of the parameters of the HLM. This leads to efficient estimation of the HLM from incomplete data using either the transformation method or the method of multiple imputation. The approach allows outcomes and covariates to be missing at either of the two levels, and the HLM of interest can involve the regression of any subset of variables on a disjoint subset of variables conceived as covariates.
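A self-contained sketch of the basic multiple-imputation logic referred to above (not the authors' two-level machinery): missing outcomes are drawn repeatedly from an estimated conditional distribution f(y_mis | y_obs, θ) and the OLS estimates are pooled with Rubin's rules; the imputation model here is a single-level normal regression with made-up data, and a fully proper implementation would also draw the regression parameters from their posterior.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 300
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, n)
miss = rng.random(n) < 0.3 * (x > 0)                 # y missing at random, depending on observed x
y_obs = y.copy()
y_obs[miss] = np.nan

cc = ~miss                                           # complete cases estimate f(y | x)
fit_cc = sm.OLS(y_obs[cc], sm.add_constant(x[cc])).fit()
sigma = np.sqrt(fit_cc.scale)                        # residual SD of the imputation model

M = 20
betas, variances = [], []
for _ in range(M):
    y_imp = y_obs.copy()
    mu = fit_cc.predict(sm.add_constant(x))[miss]    # conditional mean for the missing cases
    y_imp[miss] = mu + rng.normal(0, sigma, miss.sum())
    fit = sm.OLS(y_imp, sm.add_constant(x)).fit()
    betas.append(fit.params[1])
    variances.append(fit.bse[1] ** 2)

qbar = np.mean(betas)                                # Rubin's rules: pooled point estimate
total_var = np.mean(variances) + (1 + 1 / M) * np.var(betas, ddof=1)
print("pooled slope:", round(qbar, 3), " pooled SE:", round(np.sqrt(total_var), 3))
```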

19.
J. E. Kolassa, M. A. Tanner. Biometrics, 1999, 55(4): 1291-1294
This article presents an algorithm for small-sample conditional confidence regions for two or more parameters for any discrete regression model in the generalized linear interactive model family. Regions are constructed by careful inversion of conditional hypothesis tests. This method presupposes the use of approximate or exact techniques for enumerating the sample space for some components of the vector of sufficient statistics conditional on other components. Such enumeration may be performed exactly or by exact or approximate Monte Carlo, including the algorithms of Kolassa and Tanner (1994, Journal of the American Statistical Association 89, 697-702; 1999, Biometrics 55, 246-251). This method also assumes that one can compute certain conditional probabilities for a fixed value of the parameter vector. Because of a property of exponential families, one can use this set of conditional probabilities to directly compute the conditional probabilities associated with any other value of the vector of the parameters of interest. This observation dramatically reduces the computational effort required to invert the hypothesis test to obtain the confidence region. To construct a region with confidence level 1 − α, the algorithm begins with a grid of values for the parameters of interest. For each parameter vector on the grid (corresponding to the current null hypothesis), one transforms the initial set of conditional probabilities using exponential tilting and then calculates the p value for this current null hypothesis. The confidence region is the set of parameter values for which the p value is at least α.

20.
In this article, we consider two families of predictors for the simultaneous prediction of the actual and average values of the study variable in a linear regression model when a set of stochastic linear constraints binding the regression coefficients is available. These families arise from the method of mixed regression estimation. Performance properties of these families are analyzed when the objective is to predict values outside the sample and within the sample.
