Similar articles
Found 20 similar articles (search time: 31 ms)
1.
Time-varying parametric linear and time-varying nonparametric regression models, as well as a time-varying nonparametric median regression model, are developed to predict the daily pollen concentration for Szeged in Hungary using previous-day meteorological parameters and the daily pollen concentration. The models are applied separately to rainy days and non-rainy days. The most important predictor is the previous-day pollen concentration level, and the only other predictor retained by a stepwise regression procedure is the daily mean global solar flux for rainy days and the daily mean temperature for non-rainy days. Although the variance percentage explained by these two predictors is higher for non-rainy (55.2%) days than for rainy (51.9%) days, the prediction rate is slightly better for rainy than for non-rainy days. Nonparametric regression yields substantially better estimates, especially for rainy days, indicating a nonlinear relationship between the predictors and the pollen concentration. The explained variance percentage is 71.4% and 64.6% for rainy and non-rainy days, respectively. Concerning the mean absolute error, the nonparametric median regression provides the best estimate. The quantile regression shows that the probability distribution of daily ragweed concentration is much more skewed for non-rainy days, while the more concentrated probability distribution for rainy days exhibits relatively stable ragweed pollen concentrations. The lowest possible limits of concentrations are also calculated. Under highly favorable conditions for peak concentrations, the pollen level reaches at least 350 grains m⁻³ and 450 grains m⁻³ for rainy and non-rainy days, respectively. These values again underline the excessive ragweed pollen load over the area of Szeged.

2.
Dunson DB, Neelon B. Biometrics, 2003, 59(2): 286-295
In biomedical studies, there is often interest in assessing the association between one or more ordered categorical predictors and an outcome variable, adjusting for covariates. For a k-level predictor, one typically uses either a k-1 degree of freedom (df) test or a single df trend test, which requires scores for the different levels of the predictor. In the absence of knowledge of a parametric form for the response function, one can incorporate monotonicity constraints to improve the efficiency of tests of association. This article proposes a general Bayesian approach for inference on order-constrained parameters in generalized linear models. Instead of choosing a prior distribution with support on the constrained space, which can result in major computational difficulties, we propose to map draws from an unconstrained posterior density using an isotonic regression transformation. This approach allows flat regions over which increases in the level of a predictor have no effect. Bayes factors for assessing ordered trends can be computed based on the output from a Gibbs sampling algorithm. Results from a simulation study are presented and the approach is applied to data from a time-to-pregnancy study.
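The idea of mapping unconstrained draws onto the order-constrained space can be illustrated with the pool-adjacent-violators algorithm. This is only a minimal sketch: the function name `isotonic_projection`, the equal default weights, and the example draws are assumptions made for illustration, and the article's own transformation and weighting may differ.

```python
def isotonic_projection(values, weights=None):
    """Pool-adjacent-violators: closest nondecreasing sequence
    to `values` in (weighted) least squares."""
    if weights is None:
        weights = [1.0] * len(values)  # assumption: equal weights
    blocks = []  # each block holds [mean, total_weight, n_levels]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge neighbouring blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical unconstrained posterior draws for a 4-level predictor.
draws = [[0.1, 0.4, 0.3, 0.9], [0.2, 0.1, 0.5, 0.4]]
constrained = [isotonic_projection(d) for d in draws]
```

Levels that get pooled share one value, which is how flat regions (no effect of increasing the predictor level) can arise in the transformed draws.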

3.
ABSTRACT Ecologists often develop complex regression models that include multiple categorical and continuous variables, interactions among predictors, and nonlinear relationships between the response and predictor variables. Nomograms, which are graphical devices for presenting mathematical functions and calculating output values, can aid biologists in interpreting and presenting these complex models. To illustrate benefits of nomograms, we developed a logistic regression model of elk (Cervus elaphus) resource selection. With this model, we demonstrated how a nomogram helps scientists and managers interpret interactions among variables, compare the relative biological importance of variables, and examine predicted shapes of relationships (e.g., linear vs. nonlinear) between response and predictor variables. Although our example focused on logistic regression, nomograms are equally useful for other linear and nonlinear models. Regardless of the approach used for model development, nomograms and other graphical summaries can help scientists and managers develop, interpret, and apply statistical models.

4.
Ecological indicators are often collected to detect and monitor environmental change. Statistical models are used to estimate natural variability, pre-existing trends, and environmental predictors of baseline indicator conditions. Establishing standard models for baseline characterization is critical to the effective design and implementation of environmental monitoring programs. An anthropogenic activity that requires monitoring is the development of Marine Renewable Energy sites. Currently, there are no standards for the analysis of environmental monitoring data for these development sites. Marine Renewable Energy monitoring data are used as a case study to develop and apply a model evaluation to establish best practices for characterizing baseline ecological indicator data. We examined a range of models, including six generalized regression models, four time series models, and three nonparametric models. Because monitoring data are not always normally distributed, we evaluated model ability to characterize normal and non-normal data using hydroacoustic metrics that serve as proxies for ecological indicator data. The nonparametric support vector regression and random forest models, and parametric state-space time series models generally were the most accurate in interpolating the normal metric data. Support vector regression and state-space models best interpolated the non-normally distributed data. If parametric results are preferred, then state-space models are the most robust for baseline characterization. Evaluation of a wide range of models provides a comprehensive characterization of the case study data, and highlights advantages of models rarely used in Marine Renewable Energy environmental monitoring. Our model findings are relevant for any ecological indicator data with similar properties, and the evaluation approach is applicable to any monitoring program.

5.
Bornkamp B, Ickstadt K. Biometrics, 2009, 65(1): 198-205
Summary. In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose–response analysis, and provides possibilities to elicit informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose–response analysis.

6.
Addressing forecasting issues is one of the core objectives of developing and restructuring the electric power industry in China. However, not enough effort has been made to develop an accurate electricity consumption forecasting procedure. In this paper, a panel semiparametric quantile regression neural network (PSQRNN) is developed by combining an artificial neural network and semiparametric quantile regression for panel data. By embedding penalized quantile regression with least absolute shrinkage and selection operator (LASSO), ridge regression and backpropagation, PSQRNN keeps the flexibility of nonparametric models and the interpretability of parametric models simultaneously. The prediction accuracy is evaluated on China's electricity consumption data set, and the results indicate that PSQRNN performs better than three benchmark methods: BP neural network (BP), Support Vector Machine (SVM) and Quantile Regression Neural Network (QRNN).

7.
ABSTRACT Habitat suitability is often used as a surrogate for demographic responses (i.e., abundance, survival, fecundity, or population viability) in the application of habitat suitability index (HSI) models. Whether habitat suitability actually relates to demographics, however, has rarely been evaluated. We validated HSI models of breeding habitat suitability for wood thrush (Hylocichla mustelina) and yellow-breasted chat (Icteria virens) in Missouri, USA. First, we evaluated HSI models as a predictor of 3 demographic responses: within-site territory density, site-level territory density, and nest success. We demonstrated a link between HSI values and all 3 types of demographic responses for the yellow-breasted chat and site-level territory density for the wood thrush. Second, we evaluated support for models containing HSI values, models containing measured habitat features (e.g., tree age, tree species, ecological land type), and models containing management treatments (e.g., even-aged and uneven-aged forest regeneration treatments) for each demographic response using model selection. Models containing HSI values received more support, in general, than models containing only habitat features or management treatments for all 3 types of wildlife response. The assumption that changes in habitat suitability represent wildlife demographic response to vegetation change is supported by our models. However, differences in species ecology may contribute to the degree to which HSI values are related to specific demographic responses. We recommend validation of HSI models with the particular demographic data of interest (i.e., density, productivity) to increase confidence in the model used for conservation planning.

8.
The half-maximal inhibitory concentration IC50 is an important pharmacodynamic index of drug effectiveness. To estimate this value, the dose response relationship needs to be established, which is generally achieved by fitting monotonic sigmoidal models. However, recent studies on Human Immunodeficiency Virus (HIV) mutants developing resistance to antiviral drugs show that the dose response curve may not be monotonic. Traditional models can fail for nonmonotonic data and ignore observations that may be of biologic significance. Therefore, we propose a nonparametric model to describe the dose response relationship and fit the curve using local polynomial regression. The nonparametric approach is shown to be promising, especially for estimating the IC50 of some HIV inhibitory drugs, in which there is a dose-dependent stimulation of response for mutant strains. This model strategy may be applicable to general pharmacologic, toxicologic, or other biomedical data that exhibit a nonmonotonic dose response relationship for which traditional parametric models fail.
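Local polynomial regression can be sketched in its simplest nontrivial form, a local linear fit. The function name `local_linear`, the Gaussian kernel, the bandwidth value and the dose-response numbers below are all assumptions chosen for illustration, not details taken from the article.

```python
import math


def local_linear(x, y, x0, bandwidth):
    """Local linear estimate of E[y | x = x0] using a Gaussian kernel.

    Solves the kernel-weighted least-squares problem for an intercept and
    slope centred at x0 and returns the intercept (the fitted value at x0).
    Assumes at least two distinct x values carry non-negligible weight.
    """
    w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    s0 = sum(w)
    s1 = sum(wi * (xi - x0) for wi, xi in zip(w, x))
    s2 = sum(wi * (xi - x0) ** 2 for wi, xi in zip(w, x))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * (xi - x0) * yi for wi, xi, yi in zip(w, x, y))
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)


# Hypothetical nonmonotonic dose-response data (log-dose, % response):
# a low-dose stimulation followed by inhibition.
log_dose = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
response = [100.0, 108.0, 115.0, 104.0, 80.0, 52.0, 30.0, 15.0, 8.0]
curve = [local_linear(log_dose, response, d, bandwidth=0.5) for d in log_dose]
```

Because no sigmoidal shape is imposed, the fitted curve can follow an initial rise before the decline; the dose at which the curve crosses 50% could then be read off by interpolation.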

9.
Summary. Naive use of misclassified covariates leads to inconsistent estimators of covariate effects in regression models. A variety of methods have been proposed to address this problem including likelihood, pseudo-likelihood, estimating equation methods, and Bayesian methods, with all of these methods typically requiring either internal or external validation samples or replication studies. We consider a problem arising from a series of orthopedic studies in which interest lies in examining the effect of a short-term serological response and other covariates on the risk of developing a longer term thrombotic condition called deep vein thrombosis. The serological response is an indicator of whether the patient developed antibodies following exposure to an antithrombotic drug, but the seroconversion status of patients is only available at the time of a blood sample taken upon discharge from hospital. The seroconversion time is therefore subject to a current status observation scheme, or Case I interval censoring, and subjects tested before seroconversion are misclassified as nonseroconverters. We develop a likelihood-based approach for fitting regression models that accounts for misclassification of the seroconversion status due to early testing using parametric and nonparametric estimates of the seroconversion time distribution. The method is shown to reduce the bias resulting from naive analyses in simulation studies, and an application to the data from the orthopedic studies provides further illustration.

10.
Market impact cost is the most significant portion of the implicit transaction costs whose reduction can lower the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models (neural networks, a Bayesian neural network, a Gaussian process, and support vector regression) to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data from the US stock market via the Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, on four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives for reducing transaction costs by considerably improving prediction performance.

11.
Qu A, Li R. Biometrics, 2006, 62(2): 379-391
Nonparametric smoothing methods are used to model longitudinal data, but the challenge remains to incorporate correlation into nonparametric estimation procedures. In this article, we propose an efficient estimation procedure for varying-coefficient models for longitudinal data. The proposed procedure can easily take into account correlation within subjects and deal directly with both continuous and discrete response longitudinal data under the framework of generalized linear models. The proposed approach yields a more efficient estimator than the generalized estimation equation approach when the working correlation is misspecified. For varying-coefficient models, it is often of interest to test whether coefficient functions are time varying or time invariant. We propose a unified and efficient nonparametric hypothesis testing procedure, and further demonstrate that the resulting test statistics have an asymptotic chi-squared distribution. In addition, the goodness-of-fit test is applied to test whether the model assumption is satisfied. The corresponding test is also useful for choosing basis functions and the number of knots for regression spline models in conjunction with the model selection criterion. We evaluate the finite sample performance of the proposed procedures with Monte Carlo simulation studies. The proposed methodology is illustrated by the analysis of an acquired immune deficiency syndrome (AIDS) data set.

12.
In functional linear models (FLMs), the relationship between the scalar response and the functional predictor process is often assumed to be identical for all subjects. Motivated by both practical and methodological considerations, we relax this assumption and propose a new class of functional regression models that allow the regression structure to vary for different groups of subjects. By projecting the predictor process onto its eigenspace, the new functional regression model is simplified to a framework that is similar to classical mixture regression models. This leads to the proposed approach, named functional mixture regression (FMR). The estimation of FMR can be readily carried out using existing software implemented for functional principal component analysis and mixture regression. The practical necessity and performance of FMR are illustrated through applications to a longevity analysis of female medflies and a human growth study. Theoretical investigations concerning the consistent estimation and prediction properties of FMR, along with simulation experiments illustrating its empirical properties, are presented in the supplementary material available at Biostatistics online. Corresponding results demonstrate that the proposed approach could potentially achieve substantial gains over traditional FLMs.

13.
Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision and other areas. These shape spaces are non-Euclidean quotient manifolds. To conduct nonparametric inferences, one may define notions of centre and spread on this manifold and work with their estimates. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a broad class of mixture models constructed using suitable kernels on a general compact metric space and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation and classification with shape-based predictors. Simulation studies show improved estimation performance relative to existing approaches.

14.
Research has shown that high blood glucose levels are important predictors of incident diabetes. However, they are also strongly associated with other cardiometabolic risk factors such as high blood pressure, adiposity, and cholesterol, which are also highly correlated with one another. The aim of this analysis was to ascertain how these highly correlated cardiometabolic risk factors might be associated with high levels of blood glucose in older adults aged 50 or older from wave 2 of the English Longitudinal Study of Ageing (ELSA). Due to the high collinearity of predictor variables and our interest in extreme values of blood glucose, we proposed a new method, called quantile profile regression, to answer this question. Profile regression, a Bayesian nonparametric model for clustering responses and covariates simultaneously, is a powerful tool to model the relationship between a response variable and covariates, but the standard approach of using a mixture of Gaussian distributions for the response model will not identify the underlying clusters correctly, particularly with outliers in the data or a heavy-tailed distribution of the response. Therefore, we propose quantile profile regression to model the response variable with an asymmetric Laplace distribution, allowing us to model more accurately clusters that are asymmetric and predict more accurately for extreme values of the response variable and/or outliers. In simulations, our new method is more accurate than the Normal profile regression approach and remains robust when outliers are present in the data. We conclude with an analysis of the ELSA.
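The link between the asymmetric Laplace distribution and quantiles can be sketched as follows. The density used here is the common parameterisation with location `mu`, scale `sigma` and quantile level `tau`; the function names and the toy data are assumptions for illustration, and the article's exact parameterisation may differ.

```python
import math


def check_loss(u, tau):
    """Quantile-regression check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ald_logpdf(y, mu, sigma, tau):
    """Log-density of an asymmetric Laplace distribution:
    f(y) = tau * (1 - tau) / sigma * exp(-rho_tau((y - mu) / sigma))."""
    return math.log(tau * (1.0 - tau) / sigma) - check_loss((y - mu) / sigma, tau)


def best_location(data, tau):
    """Data point minimising the total check loss, i.e. maximising the
    asymmetric Laplace likelihood in mu for fixed sigma and tau."""
    return min(data, key=lambda c: sum(check_loss(y - c, tau) for y in data))


# Toy data with one extreme outlier (hypothetical values).
glucose = [1.0, 2.0, 3.0, 4.0, 100.0]
median_like = best_location(glucose, tau=0.5)
```

Maximising this likelihood in `mu` is equivalent to minimising the check loss, so the fitted location tracks the `tau`-th quantile and is far less sensitive to the outlier than a Gaussian mean would be.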

15.
We propose a new class of semiparametric generalized linear models. As with existing models, these models are specified via a linear predictor and a link function for the mean of response Y as a function of predictors X. Here, however, the "baseline" distribution of Y at a given reference mean μ0 is left unspecified and is estimated from the data. The response distribution when the mean differs from μ0 is then generated via exponential tilting of the baseline distribution, yielding a response model that is a natural exponential family, with corresponding canonical link and variance functions. The resulting model has a level of flexibility similar to the popular proportional odds model. Maximum likelihood estimation is developed for response distributions with finite support, and the new model is studied and illustrated through simulations and example analyses from aging research.
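Exponential tilting of a finite-support baseline distribution can be sketched as below. The names `tilt`, `dist_mean` and `solve_theta`, the bisection bounds, and the baseline probabilities are assumptions made for this illustration; the sketch shows only the tilting step, not the article's estimation procedure.

```python
import math


def tilt(support, baseline, theta):
    """Exponentially tilt a finite-support distribution:
    f_theta(y) is proportional to f0(y) * exp(theta * y)."""
    weights = [p * math.exp(theta * y) for y, p in zip(support, baseline)]
    total = sum(weights)
    return [w / total for w in weights]


def dist_mean(support, probs):
    return sum(y * p for y, p in zip(support, probs))


def solve_theta(support, baseline, target_mean, lo=-50.0, hi=50.0, tol=1e-10):
    """Find theta whose tilted mean equals target_mean by bisection.

    The tilted mean is increasing in theta, so bisection applies. The
    bounds are an assumption that suits small supports like the example.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dist_mean(support, tilt(support, baseline, mid)) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical baseline distribution with reference mean 1.0.
support = [0, 1, 2, 3]
baseline = [0.4, 0.3, 0.2, 0.1]
theta = solve_theta(support, baseline, target_mean=2.0)
tilted = tilt(support, baseline, theta)
```

Every mean strictly inside the range of the support corresponds to exactly one tilt parameter, which is what makes the tilted family a natural exponential family indexed by its mean.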

16.
MOTIVATION: One particular application of microarray data is to uncover the molecular variation among cancers. One feature of microarray studies is that the number n of samples collected is relatively small compared to the number p of genes per sample, which is usually in the thousands. In statistical terms, this very large number of predictors compared to a small number of samples or observations makes the classification problem difficult. An efficient way to solve this problem is by using dimension reduction statistical techniques in conjunction with nonparametric discriminant procedures. RESULTS: We view the classification problem as a regression problem with few observations and many predictor variables. We use an adaptive dimension reduction method for generalized semi-parametric regression models that allows us to solve the 'curse of dimensionality problem' arising in the context of expression data. The predictive performance of the resulting classification rule is illustrated on two well-known data sets in the microarray literature: the leukemia data, which is known to contain classes that are easily 'separable', and the colon data set.

17.
Many empirical studies have revealed considerable differences between nonparametric bootstrapping and Bayesian posterior probabilities in terms of the support values for branches, despite claimed predictions about their approximate equivalence. We investigated this problem by simulating data, which were then analyzed by maximum likelihood bootstrapping and Bayesian phylogenetic analysis using identical models and reoptimization of parameter values. We show that Bayesian posterior probabilities are significantly higher than corresponding nonparametric bootstrap frequencies for true clades, but also that erroneous conclusions will be made more often. These errors are strongly accentuated when the models used for analyses are underparameterized. When data are analyzed under the correct model, nonparametric bootstrapping is conservative. Bayesian posterior probabilities are also conservative in this respect, but less so.

18.
We propose a generalization of the varying coefficient model for longitudinal data to cases where not only current but also recent past values of the predictor process affect current response. More precisely, the targeted regression coefficient functions of the proposed model have sliding window supports around current time t. A variant of a recently proposed two-step estimation method for varying coefficient models is proposed for estimation in the context of these generalized varying coefficient models, and is found to lead to improvements, especially for the case of additive measurement errors in both response and predictors. The proposed methodology for estimation and inference is also applicable for the case of additive measurement error in the common versions of varying coefficient models that relate only current observations of predictor and response processes to each other. Asymptotic distributions of the proposed estimators are derived, and the model is applied to the problem of predicting protein concentrations in a longitudinal study. Simulation studies demonstrate the efficacy of the proposed estimation procedure.

19.
We introduce an unsupervised competitive learning rule, called the extended Maximum Entropy learning Rule (eMER), for topographic map formation. Unlike Kohonen's Self-Organizing Map (SOM) algorithm, the presence of a neighborhood function is not a prerequisite for achieving topology-preserving mappings, but instead it is intended: (1) to speed up the learning process and (2) to perform nonparametric regression. We show that, when the neighborhood function vanishes, the neural weight density at convergence approaches a linear function of the input density so that the map can be regarded as a nonparametric model of the input density. We apply eMER to density estimation and compare its performance with that of the SOM algorithm and the variable kernel method. Finally, we apply the ‘batch’ version of eMER to nonparametric projection pursuit regression and compare its performance with that of back-propagation learning, projection pursuit learning, constrained topological mapping, and the Heskes and Kappen approach. Received: 12 August 1996 / Accepted in revised form: 9 April 1997

20.
Two two-parameter models are developed for testing the hypothesis of no treatment effect against the alternative that a subset of the treated patients will show an improvement. To keep the range of measurements the same for treated and control patients, Lehmann alternatives are used in both models. Locally most powerful rank tests are developed for each model and each parameter. The asymptotic relative efficiency leads to a test that uses the scores s(i) = [i/(N + 1)]^4. Two examples that support the usefulness of this nonparametric test are presented.
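A rank test built on these scores can be sketched as a linear rank statistic. The function name `score_test`, the toy measurements, and the normal approximation based on permutation moments are assumptions for illustration (a standard construction for linear rank statistics, not necessarily the article's exact procedure); ties are not handled.

```python
import math


def score_test(control, treated):
    """Linear rank statistic with scores s(i) = (i / (N + 1)) ** 4.

    Returns the sum of scores over treated observations and a z-value
    standardised with the permutation mean and variance under the null
    hypothesis. Assumes no tied observations.
    """
    pooled = sorted([(v, 0) for v in control] + [(v, 1) for v in treated])
    n_total = len(pooled)
    scores = [((i + 1) / (n_total + 1)) ** 4 for i in range(n_total)]
    statistic = sum(s for s, (_, group) in zip(scores, pooled) if group == 1)

    m, n = len(treated), len(control)
    mean_score = sum(scores) / n_total
    variance = (m * n / (n_total * (n_total - 1))
                * sum((s - mean_score) ** 2 for s in scores))
    z = (statistic - m * mean_score) / math.sqrt(variance)
    return statistic, z


# Hypothetical measurements: only some treated patients respond.
control = [1.2, 0.8, 1.5, 1.1, 0.9]
treated = [1.0, 1.3, 2.9, 3.4, 1.4]
statistic, z = score_test(control, treated)
```

Raising the relative rank to the fourth power puts most of the weight on the largest observations, which suits the alternative that only a subset of treated patients improves.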
