Similar articles
1.
By treating the nonlinear model as if it were linear in the parameterization θ in the neighbourhood of the least squares estimate θ̂, two-sided nominally-q-prediction intervals can be constructed by applying the usual linear model theory. The quadratic approximation of the expected coverage of the prediction intervals is derived for a p-parameter nonlinear model. An adjustment of the nominally-q-prediction intervals is proposed. It is shown that, to the extent that the quadratic approximation is adequate, the actual expected coverage of the adjusted prediction intervals is q.

2.
This article derives generalized prediction intervals for random effects in linear random‐effects models. For balanced and unbalanced data in two‐way layouts, models are considered with and without interaction. Coverage of the proposed generalized prediction intervals was estimated in a simulation study based on an agricultural field experiment. Generalized prediction intervals were compared with prediction intervals based on the restricted maximum likelihood (REML) procedure and the approximate methods of Satterthwaite and Kenward and Roger. The simulation study showed that coverage of generalized prediction intervals was closer to the nominal level 0.95 than coverage of prediction intervals based on the REML procedure.

3.
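Application of a combined ARIMA and SVM model to pest forecasting   Cited by: 2 (self-citations: 0; citations by others: 2)
Xiang Changsheng  Zhou Ziying 《Acta Entomologica Sinica (昆虫学报)》2010,53(9):1055-1060
Pest occurrence is a complex, dynamic time series. Single forecasting models assume either linear or nonlinear data, cannot capture the linear and nonlinear patterns of pest occurrence at the same time, and therefore rarely reach satisfactory prediction accuracy. This study first fits an autoregressive integrated moving average (ARIMA) model to the linear component of the pest occurrence time series, then models the nonlinear component with a support vector machine (SVM), and finally combines the outputs of the two models into a single forecast. The combined model was applied to forecasting the occurrence area of the pine caterpillar Dendrolimus punctatus. The results show that its prediction accuracy is clearly better than that of either single model, exploiting the strengths of both. The combined model is a practical and feasible method for pest forecasting.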

4.
We propose a method to construct simultaneous confidence intervals for a parameter vector by inverting a series of randomization tests (RT). The randomization tests are facilitated by an efficient multivariate Robbins–Monro procedure that takes the correlation information of all components into account. The estimation method does not require any distributional assumption about the population other than the existence of second moments. The resulting simultaneous confidence intervals are not necessarily symmetric about the point estimate of the parameter vector but possess the property of equal tails in all dimensions. In particular, we present the construction for the mean vector of one population and for the difference between the mean vectors of two populations. Extensive simulations are conducted to provide a numerical comparison with four methods. We illustrate the application of the proposed method to testing bioequivalence with multiple endpoints on some real data.

5.
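Objective: To build a quantitative structure-activity relationship (QSAR) model with high prediction accuracy, providing a theoretical basis for designing and synthesizing more active cephalosporin antibiotics. Methods: A nonlinear combined prediction method (SVR-KNN) based on support vector regression (SVR) and k-nearest neighbors (KNN) was developed and used to study systematically the QSAR of 48 cephalosporin derivatives active against Haemophilus influenzae. Results: Leave-one-out predictions showed that nonlinear screening of descriptors and sub-models clearly improved prediction accuracy, and that the combined prediction after sub-model selection was more accurate than any single sub-model; the MSE and MAPE of SVR-KNN were 0.019 and 1.81%, respectively. On independent test samples, SVR-KNN had the best prediction accuracy and stability among all reference models, with an MSE of 0.010 and a MAPE of 1.33%. Conclusion: The SVR-KNN model has strong predictive power and excellent generalization ability, and has broad application prospects in QSAR studies of antibiotics and other drugs.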

6.
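Empirical likelihood inference for parameters in nonlinear semiparametric regression models with censored data   Cited by: 1 (self-citations: 0; citations by others: 1)
This paper considers a nonlinear semiparametric regression model in which the response variable is randomly censored. An empirical log-likelihood ratio statistic and an adjusted empirical log-likelihood ratio statistic are constructed for the unknown parameter, and it is proved that, under certain conditions, these statistics are asymptotically χ² distributed, which yields confidence regions for the unknown parameter. A least squares estimator of the unknown parameter is also constructed and its asymptotic properties are established. A simulation study shows that the empirical likelihood method outperforms the least squares method in both the coverage probability and the accuracy of the confidence regions.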

7.
Fay MP  Tiwari RC  Feuer EJ  Zou Z 《Biometrics》2006,62(3):847-854
The annual percent change (APC) is often used to measure trends in disease and mortality rates, and a common estimator of this parameter uses a linear model on the log of the age-standardized rates. Under the assumption of linearity on the log scale, which is equivalent to a constant change assumption, APC can be equivalently defined in three ways as transformations of either (1) the slope of the line that runs through the log of each rate, (2) the ratio of the last rate to the first rate in the series, or (3) the geometric mean of the proportional changes in the rates over the series. When the constant change assumption fails then the first definition cannot be applied as is, while the second and third definitions unambiguously define the same parameter regardless of whether the assumption holds. We call this parameter the percent change annualized (PCA) and propose two new estimators of it. The first, the two-point estimator, uses only the first and last rates, assuming nothing about the rates in between. This estimator requires fewer assumptions and is asymptotically unbiased as the size of the population gets large, but has more variability since it uses no information from the middle rates. The second estimator is an adaptive one and equals the linear model estimator with a high probability when the rates are not significantly different from linear on the log scale, but includes fewer points if there are significant departures from that linearity. For the two-point estimator we can use confidence intervals previously developed for ratios of directly standardized rates. For the adaptive estimator, we show through simulation that the bootstrap confidence intervals give appropriate coverage.
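
The three definitions in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the made-up rates, and the assumption of equally spaced annual rates are all illustrative, and the transformations used (100·(exp(slope) − 1) for the slope, and the (n − 1)-th root of the last-to-first ratio) are the usual textbook forms rather than anything quoted from the paper.

```python
import math

def pca_two_point(rates):
    """Percent change annualized using only the first and last rates."""
    n_steps = len(rates) - 1  # number of year-to-year steps
    return 100.0 * ((rates[-1] / rates[0]) ** (1.0 / n_steps) - 1.0)

def pca_geometric_mean(rates):
    """Same quantity via the geometric mean of year-over-year ratios."""
    ratios = [b / a for a, b in zip(rates, rates[1:])]
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return 100.0 * (math.exp(log_mean) - 1.0)

def apc_log_linear(rates):
    """APC from the least-squares slope of log(rate) on the year index."""
    n = len(rates)
    xs = range(n)
    ys = [math.log(r) for r in rates]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return 100.0 * (math.exp(sxy / sxx) - 1.0)

# Hypothetical rates: one series with a constant 5% yearly decline, one without.
constant = [100.0 * 0.95 ** t for t in range(6)]
irregular = [100.0, 90.0, 95.0, 80.0, 70.0]
```

On `constant` all three functions agree, while on `irregular` only the two-point and geometric-mean versions are guaranteed to coincide, which is the distinction the abstract draws between APC and PCA.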

8.
The low complexity of minimotif patterns results in a high false-positive prediction rate, hampering protein function prediction. A multi-filter algorithm, trained and tested on a linear regression model, support vector machine model, and neural network model, using a large dataset of verified minimotifs, vastly improves minimotif prediction accuracy while generating few false positives. An optimal threshold for the best accuracy reaches an overall accuracy above 90%, while a stringent threshold for the best specificity generates less than 1% false positives or even no false positives and still produces more than 90% true positives for the linear regression and neural network models. The minimotif multi-filter with its excellent accuracy represents the state-of-the-art in minimotif prediction and is expected to be very useful to biologists investigating protein function and how missense mutations cause disease.

9.
In this article, we provide a method of estimation for the treatment effect in the adaptive design for censored survival data with or without adjusting for risk factors other than the treatment indicator. Within the semiparametric Cox proportional hazards model, we propose a bias-adjusted parameter estimator for the treatment coefficient and its asymptotic confidence interval at the end of the trial. The method for obtaining an asymptotic confidence interval and point estimator is based on a general distribution property of the final test statistic from the weighted linear rank statistics at the interims with or without considering the nuisance covariates. The computation of the estimates is straightforward. Extensive simulation studies show that the asymptotic confidence intervals have reasonable nominal probability of coverage, and the proposed point estimators are nearly unbiased with practical sample sizes.

10.
L Hu  GW Wei 《Biophysical journal》2012,103(4):758-766
The Poisson equation is a widely accepted model for electrostatic analysis. However, the Poisson equation is derived based on electric polarizations in a linear, isotropic, and homogeneous dielectric medium. This article introduces a nonlinear Poisson equation to take into consideration hyperpolarization effects due to intensive charges and possible nonlinear, anisotropic, and heterogeneous media. A variational principle is used to derive the nonlinear Poisson model from an electrostatic energy functional. To apply the proposed nonlinear Poisson equation to solvation analysis, we also construct a nonpolar solvation energy functional based on the nonlinear Poisson equation by using geometric measure theory. At a fixed temperature, the proposed nonlinear Poisson theory is extensively validated by the electrostatic analysis of the Kirkwood model and a set of 20 proteins, and by the solvation analysis of a set of 17 small molecules whose experimental measurements are also available for comparison. Moreover, the nonlinear Poisson equation is further applied to the solvation analysis of 21 compounds at different temperatures. Numerical results are compared to theoretical predictions, experimental measurements, and results obtained from other theoretical methods in the literature. Good agreement between our results and experimental data as well as theoretical results suggests that the proposed nonlinear Poisson model is a potentially useful model for electrostatic analysis involving hyperpolarization effects.

11.
This study compares the performance of statistical methods for predicting age-standardized cancer incidence, including Poisson generalized linear models, age-period-cohort (APC) and Bayesian age-period-cohort (BAPC) models, autoregressive integrated moving average (ARIMA) time series, and simple linear models. The methods are evaluated via leave-future-out cross-validation, and performance is assessed using the normalized root mean square error, interval score, and coverage of prediction intervals. Methods were applied to cancer incidence from the three Swiss cancer registries of Geneva, Neuchatel, and Vaud combined, considering the five most frequent cancer sites: breast, colorectal, lung, prostate, and skin melanoma and bringing all other sites together in a final group. Best overall performance was achieved by ARIMA models, followed by linear regression models. Prediction methods based on model selection using the Akaike information criterion resulted in overfitting. The widely used APC and BAPC models were found to be suboptimal for prediction, particularly in the case of a trend reversal in incidence, as it was observed for prostate cancer. In general, we do not recommend predicting cancer incidence for periods far into the future but rather updating predictions regularly.
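
Two of the evaluation measures named in this abstract, the interval score and the coverage of prediction intervals, can be sketched as follows. This is an illustration under assumptions, not the study's code: the function names are hypothetical, and the interval score is written in its common form for a central (1 − alpha) prediction interval (width plus a 2/alpha penalty for observations outside the interval); the abstract itself does not give a formula.

```python
def interval_score(lower, upper, observed, alpha=0.05):
    """Interval score for a central (1 - alpha) prediction interval.

    Smaller is better: narrow intervals are rewarded, and observations
    falling outside the interval are penalized in proportion to the miss.
    """
    width = upper - lower
    below = (2.0 / alpha) * max(lower - observed, 0.0)
    above = (2.0 / alpha) * max(observed - upper, 0.0)
    return width + below + above

def empirical_coverage(lowers, uppers, observations):
    """Fraction of observations that fall inside their prediction interval."""
    hits = sum(
        1 for lo, hi, y in zip(lowers, uppers, observations) if lo <= y <= hi
    )
    return hits / len(observations)
```

An observation inside the interval scores only the interval width, so a method can trade off sharpness against the risk of large penalties when it misses.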

12.
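Application of support vector machines to forecasting pest occurrence   Cited by: 6 (self-citations: 0; citations by others: 6)
Pest occurrence is related to its influencing factors in a complex, nonlinear, and time-lagged way. Traditional methods cannot adequately analyze and fit such highly nonlinear dynamics, so their prediction accuracy is unsatisfactory. To model this complex nonlinear relationship effectively and improve prediction accuracy, a pest occurrence forecasting method based on support vector machines is proposed. The method first determines the optimal lag order of pest occurrence with an F test and uses that lag order to reconstruct the samples; it then applies forward floating factor selection to the influencing factors, retaining those that contribute most to the prediction; finally, 10-fold cross-validation is used to obtain the optimal prediction model. The method was tested and analyzed on larval density data for the armyworm on the MATLAB 7.0 platform. The results show that, compared with other forecasting methods, the support vector machine improved prediction accuracy, overcame the shortcomings of traditional methods, and is better suited to forecasting nonlinear, small-sample pest occurrence data.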
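
The sample reconstruction step in this abstract (rebuilding the training samples from a chosen lag order) can be sketched as below. This is a hypothetical illustration: the function name `lag_embed` is invented here, and the lag order is simply passed in rather than selected by the F test the abstract mentions.

```python
def lag_embed(series, lag_order):
    """Rebuild a time series as (features, target) pairs.

    Each target series[t] is paired with the lag_order values that
    immediately precede it, so a regression model such as an SVM can
    be trained on the lagged values.
    """
    samples = []
    for t in range(lag_order, len(series)):
        features = series[t - lag_order:t]
        samples.append((features, series[t]))
    return samples

# Hypothetical larval densities for five consecutive periods.
densities = [1, 2, 3, 4, 5]
pairs = lag_embed(densities, lag_order=2)
```

A series of length n yields n − lag_order samples, which is one reason lag selection matters for the small-sample setting the abstract describes.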

13.
Accurate prediction of the phenotypic performance of a hybrid plant based on the molecular fingerprints of its parents should lead to a more cost-effective breeding programme, as it allows the number of expensive field evaluations to be reduced. The construction of a reliable prediction model requires a representative sample of hybrids for which both molecular and phenotypic information are accessible. This phenotypic information is usually readily available, as typical breeding programmes test numerous new hybrids in multi-location field trials on a yearly basis. Earlier studies indicated that a linear mixed model analysis of these typically unbalanced phenotypic data makes it possible to construct ɛ-insensitive support vector machine regression and best linear prediction models for predicting the performance of single-cross maize hybrids. We compare these prediction methods using different subsets of the phenotypic and marker data of a commercial maize breeding programme and evaluate the resulting prediction accuracies by means of a specifically designed field experiment. This balanced field trial makes it possible to assess the reliability of the cross-validation prediction accuracies reported here and in earlier studies. The limits of the predictive capabilities of both prediction methods are further examined by reducing the number of training hybrids and the size of the molecular fingerprints. The results indicate a considerable discrepancy between prediction accuracies obtained by cross-validation procedures and those obtained by correlating the predictions with the results of a validation field trial. The prediction accuracy of best linear prediction was less sensitive to a reduction of the number of training examples than that of support vector machine regression. The latter was, however, better at predicting hybrid performance when the size of the molecular fingerprints was reduced, especially if the initial set of markers had a low information content.

14.
The present paper reports the results of a Monte Carlo simulation study to examine the performance of several approximate confidence intervals for the Relative Risk Ratio (RRR) parameter in an epidemiologic study, involving two groups of individuals. The first group consists of n1 individuals, called the experimental group, who are exposed to some carcinogen, say radiation, whose effect on the incidence of some form of cancer, say skin cancer, is being investigated. The second group consists of n2 individuals (called the control group) who are not exposed to the carcinogen. Two cases are considered in which the life times (or time to cancer) in the two groups follow (i) the exponential and (ii) the Weibull distributions. The case when the life times follow a Rayleigh distribution follows as a particular case. A general random censorship model is considered in which the life times of the individuals are censored on the right by random censoring times following (i) the exponential and (ii) the Weibull distributions. The Relative Risk Ratio parameter in the study is defined as the ratio of the hazard rates in the two distributions of the times to cancer. Approximate confidence intervals are constructed for the RRR parameter using its maximum likelihood estimator (m.l.e.) and several other methods, including a method due to Fieller. Sprott's (1973) and Cox's (1953) suggestions, as well as the Box-Cox (1964) transformation, are also utilized to construct approximate confidence intervals. The performance of these confidence intervals in small samples is investigated by means of some Monte Carlo simulations based on 500 random samples. Our simulation study indicates that many of these confidence intervals perform quite well in samples of size 10 and 15, in terms of the coverage probability and expected length of the interval.

15.
We recorded the electric organ discharges of resting Gymnotus carapo specimens. We analyzed the time series formed by the sequence of interdischarge intervals. Nonlinear prediction, false nearest neighbor analyses, and comparison between the performance of nonlinear and linear autoregressive models fitted to the data indicated that nonlinear correlations between intervals were absent, or were present to a minor extent only. Following these analyses, we showed that linear autoregressive models with combined Gaussian and shot noise reproduced the variability and correlations of the resting discharge pattern. We discuss the implications of our findings for the mechanisms underlying the timing of electric organ discharge generation. We also argue that autoregressive models can be used to evaluate the changes arising during a wide variety of behaviors, such as the modification in the discharge intervals during interaction between fish pairs. Received: 14 March 2000 / Accepted in revised form: 9 October 2000

16.
Summary: The study of dependence between random variables is a mainstay in statistics. In many cases, the strength of dependence between two or more random variables varies according to the values of a measured covariate. We propose inference for this type of variation using a conditional copula model where the copula function belongs to a parametric copula family and the copula parameter varies with the covariate. In order to estimate the functional relationship between the copula parameter and the covariate, we propose a nonparametric approach based on local likelihood. Of importance is also the choice of the copula family that best represents a given set of data. The proposed framework naturally leads to a novel copula selection method based on cross‐validated prediction errors. We derive the asymptotic bias and variance of the resulting local polynomial estimator, and outline how to construct pointwise confidence intervals. The finite‐sample performance of our method is investigated using simulation studies and is illustrated using a subset of the Matched Multiple Birth data.

17.
A byproduct of genome-wide association studies is the possibility of carrying out genome-enabled prediction of disease risk or of quantitative traits. This study is concerned with predicting two quantitative traits, milk yield in dairy cattle and grain yield in wheat, using dense molecular markers as predictors. Two support vector regression (SVR) models, ε-SVR and least-squares SVR, were explored and compared to a widely applied linear regression model, the Bayesian Lasso, the latter assuming additive marker effects. Predictive performance was measured using predictive correlation and mean squared error of prediction. Depending on the kernel function chosen, SVR can model either linear or nonlinear relationships between phenotypes and marker genotypes. For milk yield, where phenotypes were estimated breeding values of bulls (a linear combination of the data), SVR with a Gaussian radial basis function (RBF) kernel had a slightly better performance than with a linear kernel, and was similar to the Bayesian Lasso. For the wheat data, where phenotype was raw grain yield, the RBF kernel provided clear advantages over the linear kernel, e.g., a 17.5% increase in correlation when using the ε-SVR. SVR with an RBF kernel also compared favorably to the Bayesian Lasso in this case. It is concluded that a nonlinear RBF kernel may be an optimal choice for SVR, especially when phenotypes to be predicted have a nonlinear dependency on genotypes, as might have been the case in the wheat data.
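
The difference between the two kernels compared in this abstract can be sketched as follows. This is an illustration only: the function names, the `gamma` default, and the 0/1/2 genotype coding of the example vectors are assumptions, not details taken from the study.

```python
import math

def linear_kernel(x, z):
    """Inner product: SVR with this kernel fits a linear function of markers."""
    return sum(a * b for a, b in zip(x, z))

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF: similarity decays with squared Euclidean distance,
    which lets SVR represent nonlinear genotype-phenotype relationships."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Hypothetical marker genotypes coded as allele counts (0, 1, or 2).
line_a = [0, 1, 2, 1]
line_b = [2, 1, 0, 1]
```

The RBF value depends only on how far apart two genotype vectors are, so it equals 1 for identical lines and approaches 0 for very different ones, whereas the linear kernel grows with the allele counts themselves.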

18.
19.
Qiu J  Hwang JT 《Biometrics》2007,63(3):767-776
Summary: Simultaneous inference for a large number, N, of parameters is a challenge. In some situations, such as microarray experiments, researchers are only interested in making inference for the K parameters corresponding to the K most extreme estimates. Hence it seems important to construct simultaneous confidence intervals for these K parameters. The naïve simultaneous confidence intervals for the K means (applied directly without taking into account the selection) have low coverage probabilities. We take an empirical Bayes approach (or an approach based on the random effect model) to construct simultaneous confidence intervals with good coverage probabilities. For N = 10,000 and K = 100, typical for microarray data, our confidence intervals could be 77% shorter than the naïve K‐dimensional simultaneous intervals.

20.
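Single nucleotide polymorphisms (SNPs) are genetic markers commonly used in forensic genetics for individual identification and ancestry inference. This study pooled ancestry informative SNPs (AISNPs) from the literature and public databases and applied three algorithms, softmax regression, support vector machine, and random forest, to evaluate ancestry inference for the three main populations of northern East Asia (northern Han Chinese, Japanese, and Korean). We analyzed genotypes at 428 AISNP loci in 103 northern Han Chinese samples and 104 Japanese samples from the 1000 Genomes Project and 100 Korean samples from the Asian Diversity Project. Collinearity diagnostics from multiple linear regression were used to select a panel of 67 highly informative AISNPs, on which the softmax regression and support vector machine ancestry inference models were built; random forest mean decrease in accuracy analysis was used to select a panel of 42 highly informative AISNPs, on which the random forest model was built. The three models were applied to ancestry inference for northern Han Chinese, Japanese, and Koreans; under five repetitions of 10-fold cross-validation (training:testing = 9:1), their mean accuracies were 95.19%, 95.77%, and 94.53%, respectively. All three models can be used for genetic inference in the three major populations of northern East Asia; the 42-AISNP panel contains fewer loci, is better suited to building a forensic testing system, and has high practical value.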
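
The evaluation scheme in this abstract, five repetitions of 10-fold cross-validation with a 9:1 training-to-testing split, can be sketched as index splitting. This is a hypothetical illustration: the function name, the seed, and the simple unstratified shuffling are assumptions, and the study may well have stratified the folds by population.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=5, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation.

    Each repeat reshuffles the sample indices, cuts them into n_folds
    folds, and uses every fold once as the test set.
    """
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for k in range(n_folds):
            test = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, test

# 103 + 104 + 100 = 307 samples, as reported in the abstract.
splits = list(repeated_kfold(307))
```

Averaging a model's accuracy over all 50 splits gives one mean accuracy per model, which is the kind of figure the abstract reports.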
