Similar literature
 20 similar articles found (search time: 15 ms)
1.
2.
A method for fitting regression models to data that exhibit spatial correlation and heteroskedasticity is proposed. It is well known that ignoring a nonconstant variance does not bias least-squares estimates of regression parameters; thus, data analysts are easily led to the false belief that moderate heteroskedasticity can generally be ignored. Unfortunately, ignoring nonconstant variance when fitting variograms can seriously bias estimated correlation functions. By modeling heteroskedasticity and standardizing by estimated standard deviations, our approach eliminates this bias in the correlations. A combination of parametric and nonparametric regression techniques is used to iteratively estimate the various components of the model. The approach is demonstrated on a large data set of predicted nitrogen runoff from agricultural lands in the Midwest and Northern Plains regions of the U.S.A. For this data set, the model comprises three main components: (1) the mean function, which includes farming practice variables, local soil and climate characteristics, and the nitrogen application treatment, is assumed to be linear in the parameters and is fitted by generalized least squares; (2) the variance function, which contains a local and a spatial component whose shapes are left unspecified, is estimated by local linear regression; and (3) the spatial correlation function is estimated by fitting a parametric variogram model to the standardized residuals, with the standardization adjusting the variogram for the presence of heteroskedasticity. The fitting of these three components is iterated until convergence. The model provides an improved fit to the data compared with a previous model that ignored the heteroskedasticity and the spatial correlation.

3.
The weights used in iterative weighted least squares (IWLS) regression are usually estimated parametrically using a working model for the error variance. When the variance function is misspecified, the IWLS estimates of the regression coefficients β are still asymptotically consistent but there is some loss in efficiency. Since second moments can be quite hard to model, it makes sense to estimate the error variances nonparametrically and to employ weights inversely proportional to the estimated variances in computing the WLS estimate for β. Surprisingly, this approach had not received much attention in the literature. The aim of this note is to demonstrate that such a procedure can be implemented easily in S-plus using standard functions with default options making it suitable for routine applications. The particular smoothing method that we use is local polynomial regression applied to the logarithm of the squared residuals but other smoothers can be tried as well. The proposed procedure is applied to data on the use of two different assay methods for a hormone. Efficiency calculations based on the estimated model show that the nonparametric IWLS estimates are more efficient than the parametric IWLS estimates based on three different plausible working models for the variance function. The proposed estimators also perform well in a simulation study using both parametric and nonparametric variance functions as well as normal and gamma errors.
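
As a rough illustration of the procedure this abstract describes, the Python sketch below iterates weighted least squares with weights taken from a local linear smooth of the log squared residuals. It is not the authors' S-plus code. The single-predictor setup, the Gaussian kernel, the bandwidth, the iteration count, the small constant added before taking logs, and all function names are assumptions made for the example.

```python
import math


def wls_line(x, y, w):
    """Weighted least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def local_linear(x, z, bandwidth):
    """Local linear smooth of z on x (Gaussian kernel), evaluated at each x."""
    fitted = []
    for x0 in x:
        kernel = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
        a, b = wls_line(x, z, kernel)
        fitted.append(a + b * x0)
    return fitted


def nonparametric_iwls(x, y, bandwidth=1.5, n_iter=5):
    """IWLS with nonparametric variance weights; returns (b0, b1, weights)."""
    w = [1.0] * len(x)  # start from ordinary least squares
    for _ in range(n_iter):
        b0, b1 = wls_line(x, y, w)
        # Assumed guard: 1e-12 keeps the logarithm finite for zero residuals.
        log_r2 = [math.log((yi - b0 - b1 * xi) ** 2 + 1e-12)
                  for xi, yi in zip(x, y)]
        log_var = local_linear(x, log_r2, bandwidth)
        # Weights inversely proportional to the estimated variances.
        w = [math.exp(-v) for v in log_var]
    b0, b1 = wls_line(x, y, w)
    return b0, b1, w
```

Smoothing on the log scale keeps the estimated variances positive. The smooth of log squared residuals is offset from the log variance by a constant, which does not matter here because only relative weights enter the fit.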

4.
Assessing animal population growth curves is an essential feature of field studies in ecology and wildlife management. We used five models to assess population growth rates with a number of sets of population growth rate data. A 'generalized' logistic curve provides a better model than do four other popular models. Use of difference equations for fitting was checked by a comparison of that method and direct fitting of the analytical (integrated) solution for three of the models. Fits to field data indicate that estimates of the asymptote, K, from the 'generalized logistic' and the ordinary logistic agree well enough to support use of estimates of K from the ordinary logistic on data that cannot be satisfactorily fitted with the generalized logistic. Akaike's information criterion is widely used, often with a small-sample version, AICc. Our study of five models indicated a bias in the AICc criterion, so we recommend checking results with estimates of variance about regression for fitted models. Fitting growth curves provides a valuable supplement to, and check on, computer models of populations.
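
The two quantities this abstract compares can be sketched in a few lines of Python. The formulas are the standard textbook definitions of AIC and its small-sample correction, plus the residual variance about the fitted curve; they are not taken from the paper itself, and the function names and argument conventions are assumptions made for the example.

```python
def aic(log_likelihood, k):
    """Akaike's information criterion for a model with k estimated parameters."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for n observations and k parameters."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


def residual_variance(residuals, n_params):
    """Variance about regression: residual sum of squares over n - n_params."""
    n = len(residuals)
    if n <= n_params:
        raise ValueError("need more observations than fitted parameters")
    return sum(r * r for r in residuals) / (n - n_params)
```

Comparing models on both `aicc` and `residual_variance` is one way to carry out the cross-check the authors recommend, since the correction term grows quickly when n is only slightly larger than k.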

5.
Yuan Y  Yin G 《Biometrics》2011,67(4):1543-1554
In the estimation of a dose-response curve, parametric models are straightforward and efficient but subject to model misspecifications; nonparametric methods are robust but less efficient. As a compromise, we propose a semiparametric approach that combines the advantages of parametric and nonparametric curve estimates. In a mixture form, our estimator takes a weighted average of the parametric and nonparametric curve estimates, in which a higher weight is assigned to the estimate with a better model fit. When the parametric model assumption holds, the semiparametric curve estimate converges to the parametric estimate and thus achieves high efficiency; when the parametric model is misspecified, the semiparametric estimate converges to the nonparametric estimate and remains consistent. We also consider an adaptive weighting scheme to allow the weight to vary according to the local fit of the models. We conduct extensive simulation studies to investigate the performance of the proposed methods and illustrate them with two real examples.

6.
Wildlife monitoring for open populations can be performed using a number of different survey methods. Each survey method gives rise to a type of data and, in the last five decades, a large number of associated statistical models have been developed for analyzing these data. Although these models have been parameterized and fitted using different approaches, they have all been designed to either model the pattern with which individuals enter and/or exit the population, or to estimate the population size by accounting for the corresponding observation process, or both. However, existing approaches rely on a predefined model structure and complexity, either by assuming that parameters linked to the entry and exit pattern (EEP) are specific to sampling occasions, or by employing parametric curves to describe the EEP. Instead, we propose a novel Bayesian nonparametric framework for modeling EEPs based on the Polya tree (PT) prior for densities. Our Bayesian nonparametric approach avoids overfitting when inferring EEPs, while simultaneously allowing more flexibility than is possible using parametric curves. Finally, we introduce the replicate PT prior for defining classes of models for these data allowing us to impose constraints on the EEPs, when required. We demonstrate our new approach using capture–recapture, count, and ring-recovery data for two different case studies.

7.
8.
Previous work has shown that it is often essential to account for the variation in rates at different sites in phylogenetic models in order to avoid phylogenetic artifacts such as long branch attraction. In most current models, the gamma distribution is used for the rates-across-sites distributions and is implemented as an equal-probability discrete gamma. In this article, we introduce discrete distribution estimates with large numbers of equally spaced rate categories allowing us to investigate the appropriateness of the gamma model. With large numbers of rate categories, these discrete estimates are flexible enough to approximate the shape of almost any distribution. Likelihood ratio statistical tests and a nonparametric bootstrap confidence-bound estimation procedure based on the discrete estimates are presented that can be used to test the fit of a parametric family. We applied the methodology to several different protein data sets, and found that although the gamma model often provides a good parametric model for this type of data, rate estimates from an equal-probability discrete gamma model with a small number of categories will tend to underestimate the largest rates. In cases when the gamma model assumption is in doubt, rate estimates coming from the discrete rate distribution estimate with a large number of rate categories provide a robust alternative to gamma estimates. An alternative implementation of the gamma distribution is proposed that, for equal numbers of rate categories, is computationally more efficient during optimization than the standard gamma implementation and can provide more accurate estimates of site rates.

9.
Cadigan NG  Brattey J 《Biometrics》2003,59(4):869-876
We present a semiparametric likelihood approach to estimating reporting rates and tag-loss rates from the tags returned from capture-recapture studies. Such studies are commonly used to estimate critical population parameters. Tag-loss rates are estimated using double-tagged animals, while reporting rates are estimated using information from high-reward tags. A likelihood function is constructed based on the conditional distribution of the type of tag returned (low or high reward, single or double tag), given that a tag has been returned. This involves many sparse 5 × 1 tag-return contingency tables, and choosing a good functional form for the tag-loss rate is difficult with such data. We model tag-loss rates using monotone-smoothing splines, and use these nonparametric estimates to diagnose the parametric form of the tag-loss rate. The nonparametric methods can also be used directly to model tag-loss rates.

10.
Lian H  Chen X  Yang JY 《Biometrics》2012,68(2):437-445
The additive model is a semiparametric class of models that has become extremely popular because it is more flexible than the linear model and can be fitted to high-dimensional data when fully nonparametric models become infeasible. We consider the problem of simultaneous variable selection and parametric component identification using spline approximation aided by two smoothly clipped absolute deviation (SCAD) penalties. The advantage of our approach is that one can automatically choose between additive models, partially linear additive models and linear models, in a single estimation step. Simulation studies are used to illustrate our method, and we also present its applications to motif regression.

11.
Huggins RM  Hall P  Yip PS  Bui QM 《Biometrics》2007,63(3):708-713
A semivarying coefficient model for the monthly numbers of suicides in Hong Kong is developed and a new estimation procedure for estimating the parametric component is proposed. The estimators are examined in a small simulation study and fitted to monthly suicide data to estimate a nonparametric long-term trend and parametric seasonal and socioeconomic effects. Fitting the model detected interpretable structure in the data that is consistent with that driving public health policy. While exploratory, the analysis motivates the collection of more detailed data and the development of more sophisticated models to help determine target groups and strategies to reduce the suicide rate in Hong Kong.

12.
A relatively simple method is proposed for the estimation of parameters of stage-structured populations from sample data for situations where (a) unit time survival rates may vary with time, and (b) the distribution of entry times to stage 1 is too complicated to be fitted with a simple parametric model such as a normal or gamma distribution. The key aspects of this model are that the entry time distribution is approximated by an exponential function with p parameters, the unit time survival rates in stages are approximated by an r-parameter exponential polynomial in the stage number, and the durations of stages are assumed to be the same for all individuals. The new method is applied to four zooplankton data sets, with parametric bootstrapping used to assess the bias and variation in estimates. It is concluded that good estimates of demographic parameters from stage-frequency data from natural populations will usually only be possible if extra information such as the durations of stages is known.

13.
Cao J  Wang L  Xu J 《Biometrics》2011,67(4):1305-1313
Applied scientists often like to use ordinary differential equations (ODEs) to model complex dynamic processes that arise in biology, engineering, medicine, and many other areas. It is interesting but challenging to estimate ODE parameters from noisy data, especially when the data have some outliers. We propose a robust method to address this problem. The dynamic process is represented with a nonparametric function, which is a linear combination of basis functions. The nonparametric function is estimated by a robust penalized smoothing method. The penalty term is defined with the parametric ODE model, which controls the roughness of the nonparametric function and maintains the fidelity of the nonparametric function to the ODE model. The basis coefficients and ODE parameters are estimated in two nested levels of optimization. The coefficient estimates are treated as an implicit function of ODE parameters, which enables one to derive the analytic gradients for optimization using the implicit function theorem. Simulation studies show that the robust method gives satisfactory estimates for the ODE parameters from noisy data with outliers. The robust method is demonstrated by estimating a predator-prey ODE model from real ecological data.

14.
Peng Y  Dear KB 《Biometrics》2000,56(1):237-243
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.

15.
Summary: Given a large number of t‐statistics, we consider the problem of approximating the distribution of noncentrality parameters (NCPs) by a continuous density. This problem is closely related to the control of false discovery rates (FDR) in massive hypothesis testing applications, e.g., microarray gene expression analysis. Our methodology is similar to, but improves upon, the existing approach by Ruppert, Nettleton, and Hwang (2007, Biometrics, 63, 483–495). We provide parametric, nonparametric, and semiparametric estimators for the distribution of NCPs, as well as estimates of the FDR and local FDR. In the parametric situation, we assume that the NCPs follow a distribution that leads to an analytically available marginal distribution for the test statistics. In the nonparametric situation, we use convex combinations of basis density functions to estimate the density of the NCPs. A sequential quadratic programming procedure is developed to maximize the penalized likelihood. The smoothing parameter is selected with the approximate network information criterion. A semiparametric estimator is also developed to combine both parametric and nonparametric fits. Simulations show that, under a variety of situations, our density estimates are closer to the underlying truth and our FDR estimates are improved compared with alternative methods. Data‐based simulations and the analyses of two microarray datasets are used to evaluate the performance in realistic situations.

16.
The use of faecal marker concentration curves, in conjunction with compartmental analysis, is examined as a method for predicting faecal output in ruminants. Formulae for faecal production are derived for the various multicompartment models currently used to interpret marker concentration data. A comparison of observed and model-derived estimates of faecal dry matter production using three different markers is given for sheep consuming hay or a mixed diet.

17.
This paper reviews a general framework for the modelling of longitudinal data with random measurement times based on marked point processes and presents a worked example. We construct quite general regression models for longitudinal data, which may in particular include censoring that depends only on the past and outside random variation, and dependencies between measurement times and measurements. The modelling also generalises statistical counting process models. We review a non-parametric Nadaraya-Watson kernel estimator of the regression function, and a parametric analysis that is based on a conditional least squares (CLS) criterion. The parametric analysis presented is a conditional version of the generalised estimating equations of LIANG and ZEGER (1986). We conclude that the usual nonparametric and parametric regression modelling can be applied to this general set-up, with some modifications. The presented framework provides an easily implemented and powerful tool for model building for repeated measurements.

18.
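Li T  Wang P 《Acta Ecologica Sinica》2013,33(1):286-293
Parametric models and nonparametric estimators were each used to predict bacterial richness in sediment core MD05-2896 from the continental slope of the South China Sea. Using a culture-independent, PCR-RFLP-based 16S rRNA gene molecular technique, bacterial 16S rRNA gene sequences from the sediment core were amplified and a 16S rRNA gene library was constructed. Phylogenetic analysis showed that most sequences in the library belong to 17 known phyla. The 16S rRNA gene sequences were grouped into taxonomic units using sequence identity cutoffs of 99%, 97%, 90%, and 80%. Inverse Gaussian, lognormal, negative binomial, Pareto, and double exponential distribution models, together with estimators such as ACE and ACE-1, were used to predict bacterial richness at each of these taxonomic levels. The results show that at the species level the negative binomial distribution was the best estimating model, giving an estimated bacterial richness of 244 ± 10 (SE). However, owing to the limitations of the experimental conditions, this estimate may be too low.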
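
For readers unfamiliar with the ACE estimator named in this abstract, the Python sketch below implements the commonly cited abundance-based coverage estimator. It is an illustration only, not the authors' analysis: the rare-taxon threshold of 10, the function name, and the input format (one count per observed taxon) are assumptions, and the ACE-1 variant is not covered.

```python
from collections import Counter


def ace(abundances, rare_threshold=10):
    """Abundance-based coverage estimate of total richness.

    abundances: one observed count per taxon (zeros are ignored).
    rare_threshold: taxa with counts at or below this value are "rare"
    (10 is the conventional choice, assumed here).
    """
    counts = [a for a in abundances if a > 0]
    rare = [a for a in counts if a <= rare_threshold]
    s_rare = len(rare)
    s_abund = len(counts) - s_rare
    n_rare = sum(rare)                 # individuals belonging to rare taxa
    freq = Counter(rare)               # freq[i] = number of taxa seen i times
    f1 = freq.get(1, 0)                # singletons
    if n_rare < 2 or f1 == n_rare:
        raise ValueError("ACE is undefined: sample coverage of rare taxa is zero")
    coverage = 1 - f1 / n_rare
    spread = sum(i * (i - 1) * fi for i, fi in freq.items())
    # Squared coefficient of variation of rare-taxon abundances, floored at 0.
    gamma2 = max((s_rare / coverage) * spread / (n_rare * (n_rare - 1)) - 1, 0.0)
    return s_abund + s_rare / coverage + (f1 / coverage) * gamma2
```

The estimate always equals or exceeds the observed number of taxa, and it grows as singletons make up a larger share of the rare group.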

19.
Hanson T  Yang M 《Biometrics》2007,63(1):88-95
Methodology for implementing the proportional odds regression model for survival data assuming a mixture of finite Polya trees (MPT) prior on baseline survival is presented. Extensions to frailties and generalized odds rates are discussed. Although all manner of censoring and truncation can be accommodated, we discuss model implementation, regression diagnostics, and model comparison for right-censored data. An advantage of the MPT model is the relative ease with which predictive densities, survival, and hazard curves are generated. Much discussion is devoted to practical implementation of the proposed models, and a novel MCMC algorithm based on an approximating parametric normal model is developed. A modest simulation study comparing the small sample behavior of the MPT model to a rank-based estimator and a real data example is presented.

20.
Nummi T  Pan J  Siren T  Liu K 《Biometrics》2011,67(3):871-875
Summary: In most research on smoothing splines the focus has been on estimation, while inference, especially hypothesis testing, has received less attention. By defining design matrices for fixed and random effects and the structure of the covariance matrices of random errors in an appropriate way, the cubic smoothing spline admits a mixed model formulation, which places this nonparametric smoother firmly in a parametric setting. Thus nonlinear curves can be included with random effects and random coefficients. The smoothing parameter is the ratio of the random‐coefficient and error variances and tests for linear regression reduce to tests for zero random‐coefficient variances. We propose an exact F‐test for the situation and investigate its performance in a real pine stem data set and by simulation experiments. Under certain conditions the suggested methods can also be applied when the data are dependent.
