Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
ABSTRACT Ecologists often develop complex regression models that include multiple categorical and continuous variables, interactions among predictors, and nonlinear relationships between the response and predictor variables. Nomograms, which are graphical devices for presenting mathematical functions and calculating output values, can aid biologists in interpreting and presenting these complex models. To illustrate benefits of nomograms, we developed a logistic regression model of elk (Cervus elaphus) resource selection. With this model, we demonstrated how a nomogram helps scientists and managers interpret interactions among variables, compare the relative biological importance of variables, and examine predicted shapes of relationships (e.g., linear vs. nonlinear) between response and predictor variables. Although our example focused on logistic regression, nomograms are equally useful for other linear and nonlinear models. Regardless of the approach used for model development, nomograms and other graphical summaries can help scientists and managers develop, interpret, and apply statistical models.

2.
ABSTRACT Count data with means <2 are often assumed to follow a Poisson distribution. However, in many cases these kinds of data, such as number of young fledged, are more appropriately considered to be multinomial observations due to naturally occurring upper truncation of the distribution. We evaluated the performance of several versions of multinomial regression, plus Poisson and normal regression, for analysis of count data with means <2 through Monte Carlo simulations. Simulated data mimicked observed counts of number of young fledged (0, 1, 2, or 3) by California spotted owls (Strix occidentalis occidentalis). We considered size and power of tests to detect differences among 10 levels of a categorical predictor, as well as tests for trends across 10-year periods. We found regular regression and analysis of variance procedures based on a normal distribution to perform satisfactorily in all cases we considered, whereas failure rate of multinomial procedures was often excessively high, and the Poisson model demonstrated inappropriate test size for data where the variance/mean ratio was <1 or >1.2. Thus, managers can use simple statistical methods with which they are likely already familiar to analyze the kinds of count data we described here.
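
The variance/mean ratio mentioned in this abstract can be checked directly on a set of counts. The following is a minimal sketch, not the authors' simulation code: the function name `dispersion_ratio` and the example counts are invented for illustration, and the sample (n - 1) variance is an assumption about which estimator is intended.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return the sample variance divided by the sample mean.

    A Poisson variable has a ratio near 1; the abstract reports that the
    Poisson model showed inappropriate test size when the ratio was
    below 1 or above 1.2.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero, so the ratio is undefined")
    # statistics.variance uses the n - 1 (sample) denominator.
    return variance(counts) / m


# Hypothetical counts of young fledged (0 to 3 per nest), made up for this sketch.
example_counts = [0, 0, 1, 1, 1, 2, 2, 3, 0, 1]
ratio = dispersion_ratio(example_counts)
print(f"variance/mean ratio: {ratio:.3f}")
```

A ratio below 1, as with upper-truncated counts like these, indicates underdispersion relative to the Poisson assumption.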

3.
A general class of sequential models for the analysis of ordered categorical variables is developed and discussed. The models apply if the ordinal response may be subdivided into two or more meaningful sets of response categories. The parametrization explicitly makes use of this subdivision. The models furnish a linear alternative to non-linear models which incorporate a scale parameter. They are shown to be special cases of multivariate generalized linear models. Applications are discussed with the use of several examples.

4.
5.
Behavioural research often produces data that have a complicated structure. For instance, data can represent repeated observations of the same individual and suffer from heteroscedasticity as well as other technical snags. The regression analysis of such data is often complicated by the fact that the observations (response variables) are mutually correlated. The correlation structure can be quite complex and might or might not be of direct interest to the user. In any case, one needs to take correlations into account (e.g. by means of random-effect specification) in order to arrive at correct statistical inference (e.g. for construction of the appropriate test or confidence intervals). Over the last decade, such data have been more and more frequently analysed using repeated-measures ANOVA and mixed-effects models. Some researchers invoke the heavy machinery of mixed-effects modelling to obtain the desired population-level (marginal) inference, which can be achieved by using simpler tools, namely marginal models. This paper highlights marginal modelling (using generalized least squares [GLS] regression) as an alternative method. In various concrete situations, such marginal models can be based on fewer assumptions and directly generate estimates (population-level parameters) which are of immediate interest to the behavioural researcher (such as population mean). Sometimes, they might be not only easier to interpret but also easier to specify than their competitors (e.g. mixed-effects models). Using five examples from behavioural research, we demonstrate the use, advantages, limits and pitfalls of marginal and mixed-effects models implemented within the functions of the ‘nlme’ package in R.

6.
If a dependent variable in a regression analysis is exceptionally expensive or hard to obtain, the overall sample size used to fit the model may be limited. To avoid this, one may use a cheaper or more easily collected “surrogate” variable to supplement the expensive variable. The regression analysis will be enhanced to the degree the surrogate is associated with the costly dependent variable. We develop a Bayesian approach incorporating surrogate variables in regression based on a two-stage experiment. Illustrative examples are given, along with comparisons to an existing frequentist method. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

7.
We propose a mixed-effect linear model, as a particular case of the two-level regression model, for analyzing repeated measures made at completely irregular time points. The model allows for subject-level covariates, so as to study the trend and the variability of the individual growth curves. Application of this model is illustrated on a published data set.

8.
The present study begins with a discussion of a topic frequently mentioned in the relevant literature, namely the division of multiple regression models with two predictors into five mutually exclusive categories. The theoretical basis for this classification is criticized, and a system of three mutually exclusive categories is suggested which is free of the criticized inconsistencies.

9.
The dose-response model aims to establish a relationship between a dose and the magnitude of the response produced by the dose. A common complication in the dose-response model for jejunal crypt cell survival data is overdispersion, where the observed variation exceeds that predicted from the binomial distribution. In this study, two different methods for analyzing jejunal crypt cell survival after regimens of several fractions are contrasted and compared. One method is the logistic regression approach, where the numbers of surviving crypts are predicted by the logistic function of a single dose of radiation. The other is the transform-both-sides approach, where the arcsine transformation family is applied based on the first-order variance-stabilizing transformation. This family includes the square root, arcsine, and hyperbolic arcsine transformations, which have been used for Poisson, binomial, and negative binomial count data, as special cases. These approaches are applied to a data set from radiobiology. A simulation study indicates that the arcsine transformation family is more efficient than logistic regression when moderate overdispersion exists.
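
The three special cases named in this abstract can be recovered from the usual first-order variance-stabilizing rule. The block below is a sketch under stated assumptions: the variance functions and the symbols n (binomial size) and k (negative binomial dispersion) are the textbook forms, and the abstract does not give the exact parametrization of the authors' family.

```latex
% First-order variance-stabilizing transformation for a variance function V(\mu):
g(\mu) = \int \frac{d\mu}{\sqrt{V(\mu)}}

% Poisson counts, V(\mu) = \mu:
g(\mu) = 2\sqrt{\mu}

% Binomial proportion, V(p) = p(1-p)/n:
g(p) = 2\sqrt{n}\,\arcsin\sqrt{p}

% Negative binomial counts, V(\mu) = \mu + \mu^{2}/k:
g(\mu) = 2\sqrt{k}\,\operatorname{arcsinh}\sqrt{\mu/k}
```

Differentiating each g and squaring gives 1/V, which is why the transformed variable has approximately constant variance.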

10.
Regression tree analysis, a non-parametric method, was undertaken to identify predictors of the serum concentration of polychlorinated biphenyls (sum of marker PCB 138, 153, and 180) in humans. This method was applied on biomonitoring data of the Flemish Environment and Health study (2002–2006) and included 1679 adolescents and 1583 adults. Potential predictor variables were collected via a self-administered questionnaire, assessing information on lifestyle, food intake, use of tobacco and alcohol, residence history, health, education, hobbies, and occupation. Relevant predictors of human PCB exposure were identified with regression tree analysis using ln-transformed sum of PCBs, separately in adolescents and adults. The obtained results were compared with those from a standard linear regression approach. The results of the non-parametric analysis confirm the selection of the covariates in the multiple regression models. In both analyses, blood fat, gender, age, body-mass index (BMI) or change in bodyweight, former breast-feeding, and a number of nutritional factors were identified as statistically significant predictors of the serum PCB concentration, either in adolescents, in adults or in both. Regression trees can be used as an explorative analysis in combination with multiple linear regression models, where relationships between the determinants and the biomarkers can be quantified. Abbreviations: BMI: body-mass index, CV: cross validation, ln: natural logarithm, ns: not significant, PCAHs: polychlorinated aromatic hydrocarbons, PCBs: polychlorinated biphenyls, R2 a: adjusted coefficient of determination, VIF: variance inflation factor.

11.
Aim: We modelled the relationship of breeding evidence for five species of forest songbirds (ruby-crowned kinglet (Regulus calendula), Blackburnian warbler (Dendroica fusca), black-throated blue warbler (Dendroica caerulescens), bay-breasted warbler (Dendroica castanea) and Connecticut warbler (Oporornis agilis)) and a variety of macro-climate variables to examine the importance of climate as a factor determining distribution of breeding in these species and to assess the usefulness of spatial predictions generated from these models. Location: Modelling was conducted over the entire province of Ontario, Canada, an area of ≈900,000 km². Methods: Data on the distribution of breeding in the province were derived from the Breeding Bird Atlas of Ontario. We used logistic regression to model the relationship between the probability of breeding (assessed in 10 km × 10 km blocks) and estimates of a variety of climate variables at the same scale. Models were selected that had the least number of explanatory variables while at the same time having close to the best possible classification accuracy. Results: The final models for these five species had from one to six explanatory variables and an overall concordance of 70.4% to 86.3%, indicating a good classification accuracy. Results from subsampling 50% of the original data ten times indicate that (1) the classification accuracy of the model for data used to generate the model is not very sensitive to the specific observations used to generate the model; (2) the classification accuracy of test data is close to the classification accuracy of the model data; and (3) the classification accuracy of the test data is not dependent on the specific observations used to generate the model. We generated a spatial prediction of the probability of occurrence of each species for Ontario using the relationships defined by the logistic regression models and using 1 km gridded estimates of the necessary climate variables. These probability maps closely matched the maps of observed evidence of breeding from the Atlas. Main conclusions: Although mechanisms controlling breeding distribution cannot be determined using this method, we can conclude that (1) macro-climate is an important factor directly and/or indirectly determining distribution of breeding in these species and (2) spatial predictions of probability of breeding are accurate enough to be useful in predicting probability of breeding in unsampled areas.

12.
Abstract I provide a brief introduction to the concept of spatial autocorrelation and its incorporation into regression-type models. Spatial autocorrelation occurs when the response variable is correlated with itself at other locations in the region of interest. The autocorrelation usually takes a specific form where observations close in space are more correlated than those farther apart, and the rate of decay of the correlation is a function of the distance separating 2 locations. I present 2 commonly used models: 1) geostatistical modeling in which data are collected at points in the study region and 2) conditional autoregression (lattice) models in which data are aggregated over small nonoverlapping sub-areas of the study region. I also describe incorporation of explanatory covariates, such as habitat or physico-chemical attributes. I emphasize frequentist methods, but I briefly describe Bayesian approaches. I also provide some advantages, such as obtaining correct standard errors for estimators, and disadvantages, such as requirements for larger sample sizes, of incorporating spatial autocorrelation into the modeling effort. This information can aid researchers in designing and analyzing models of the relationships between species distributions and habitat. As a result, more informative models can be developed which further aid in management of wildlife.
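
The idea that correlation decays with separation distance can be illustrated with a small sketch. The exponential form and the names `exponential_correlation`, `correlation_matrix`, and `range_param` are assumptions chosen for illustration; the abstract does not say which correlation function the author uses.

```python
import math


def exponential_correlation(distance, range_param):
    """Correlation between two sites separated by `distance`.

    Assumes an exponential decay, exp(-d / range); this is one common
    geostatistical choice, not necessarily the one in the paper.
    """
    if range_param <= 0:
        raise ValueError("range_param must be positive")
    return math.exp(-distance / range_param)


def correlation_matrix(points, range_param):
    """Pairwise correlation matrix for a list of (x, y) sampling points."""
    return [
        [exponential_correlation(math.dist(p, q), range_param) for q in points]
        for p in points
    ]


# Hypothetical sampling locations (arbitrary units), invented for this sketch.
sites = [(0.0, 0.0), (1.0, 0.0), (4.0, 3.0)]
for row in correlation_matrix(sites, range_param=2.0):
    print([round(value, 3) for value in row])
```

Nearby sites receive correlations close to 1 and distant sites receive correlations close to 0, which is the structure a geostatistical model would build into the error covariance.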

13.
As the molecular marker density grows, there is a strong need in both genome-wide association studies and genomic selection to fit models with a large number of parameters. Here we present a computationally efficient generalized ridge regression (RR) algorithm for situations in which the number of parameters largely exceeds the number of observations. The computationally demanding parts of the method depend mainly on the number of observations and not the number of parameters. The algorithm was implemented in the R package bigRR based on the previously developed package hglm. Using such an approach, a heteroscedastic effects model (HEM) was also developed, implemented, and tested. The efficiency for different data sizes was evaluated via simulation. The method was tested for a bacteria-hypersensitive trait in a publicly available Arabidopsis data set including 84 inbred lines and 216,130 SNPs. The computation of all the SNP effects required <10 sec using a single 2.7-GHz core. The advantage in run time makes permutation testing feasible for such a whole-genome model, so that a genome-wide significance threshold can be obtained. HEM was found to be more robust than ordinary RR (a.k.a. SNP-best linear unbiased prediction) in terms of QTL mapping, because SNP-specific shrinkage was applied instead of a common shrinkage. The proposed algorithm was also assessed for genomic evaluation and was shown to give better predictions than ordinary RR.
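
One standard way to see why the cost of ridge regression can depend on the number of observations n rather than the number of parameters p is the identity below. It is a sketch for ordinary RR with a single shrinkage parameter \lambda (an assumed symbol), with X the n-by-p marker matrix and y the phenotype vector; the abstract does not state that bigRR uses exactly this form.

```latex
% Ordinary ridge estimate, primal form (requires solving a p x p system):
\hat{\beta} = \left(X^{\top}X + \lambda I_{p}\right)^{-1} X^{\top} y

% Equivalent dual form (requires solving only an n x n system):
\hat{\beta} = X^{\top}\left(XX^{\top} + \lambda I_{n}\right)^{-1} y
```

With 84 lines and 216,130 SNPs, the dual form inverts an 84-by-84 matrix instead of a 216,130-by-216,130 one.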

14.
A goodness-of-fit test for multinomial logistic regression   (cited by: 1 in total, 0 self-citations, 1 by others)
Goeman JJ, le Cessie S. Biometrics, 2006, 62(4): 980-985
This article presents a score test to check the fit of a logistic regression model with two or more outcome categories. The null hypothesis that the model fits well is tested against the alternative that residuals of samples close to each other in covariate space tend to deviate from the model in the same direction. We propose a test statistic that is a sum of squared smoothed residuals, and show that it can be interpreted as a score test in a random effects model. By specifying the distance metric in covariate space, users can choose the alternative against which the test is directed, making it either an omnibus goodness-of-fit test or a test for lack of fit of specific model variables or outcome categories.

15.
By using deviance standardized residuals, the seemingly unrelated regression estimation procedure is extended to generalized linear models, and fitted by an iterative procedure. The matrix of cross products of standardized residuals is asymptotically multivariate normal, and can be used for further multivariate analyses and for hypothesis testing.

16.
Models and estimation procedures are given for linear regression models in discrete distributions when the regression contains both fixed and random effects. The methods are developed for discrete variables with typically a small number of possible outcomes, such as occur in ordinal regression. The method is applied to a problem arising in the comparison of microbiological test methods.

17.
Summary. Regression models are often used to test for cause-effect relationships from data collected in randomized trials or experiments. This practice has deservedly come under heavy scrutiny, because commonly used models such as linear and logistic regression will often not capture the actual relationships between variables, and incorrectly specified models potentially lead to incorrect conclusions. In this article, we focus on hypothesis tests of whether the treatment given in a randomized trial has any effect on the mean of the primary outcome, within strata of baseline variables such as age, sex, and health status. Our primary concern is ensuring that such hypothesis tests have correct type I error for large samples. Our main result is that for a surprisingly large class of commonly used regression models, standard regression-based hypothesis tests (but using robust variance estimators) are guaranteed to have correct type I error for large samples, even when the models are incorrectly specified. To the best of our knowledge, this robustness of such model-based hypothesis tests to incorrectly specified models was previously unknown for Poisson regression models and for other commonly used models we consider. Our results have practical implications for understanding the reliability of commonly used, model-based tests for analyzing randomized trials.

18.
On occasion, generalized linear models for counts based on Poisson or overdispersed count distributions may encounter lack of fit due to disproportionately large frequencies of zeros. Three alternative types of regression models that utilize all the information and explicitly account for excess zeros are examined and given general formulations. A simple mechanism for added zeros is assumed that directly motivates one type of model, here called the added-zero type, particular forms of which have been proposed independently by D. LAMBERT (1992) and in unpublished work by the author. An original regression formulation (the zero-altered model) is presented as a reduced form of the two-part model for count data, which is also discussed. It is suggested that two-part models be used to aid in development of an added-zero model when the latter is thought to be appropriate.
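
The added-zero mechanism described in this abstract can be sketched with the zero-inflated Poisson form associated with Lambert (1992): with probability `pi_zero` an observation is a structural zero, and otherwise it is drawn from a Poisson distribution. The function name `zip_pmf` and the parameter values are hypothetical, and the abstract's own general formulation may differ.

```python
import math


def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson probability of observing the count k.

    pi_zero is the probability of an added (structural) zero;
    lam is the mean of the underlying Poisson component.
    """
    if not 0.0 <= pi_zero <= 1.0:
        raise ValueError("pi_zero must lie in [0, 1]")
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise from both the added-zero mechanism and the Poisson part.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


# Illustrative (made-up) parameters: 30% added zeros, Poisson mean 1.5.
for count in range(4):
    print(count, round(zip_pmf(count, lam=1.5, pi_zero=0.3), 4))
```

In a regression setting, both `lam` and `pi_zero` would typically be linked to covariates; that step is omitted here.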

19.
20.
Royston P, Ferreira A. Biometrics, 1999, 55(4): 1005-1013
Standard conception probability models assume that different acts of intercourse make independent contributions to the probability of conception in viable cycles. We propose an alternative, approximate model based on the assumption that the act of intercourse closest to the time of maximum fertility is the one most likely to have caused conception. We describe an adaptive algorithm [the most fertile intercourse day (MFID) algorithm] that estimates the most fertile intercourse day in each cycle. The approach is easily extended to include covariates and random between-couple differences in fecundability that affect the probability of conception in a given cycle. Reanalyses of two data sets reported in the literature are presented. Estimates of the probability of conception during the most fertile period of the cycle and of the effects of covariates are similar to estimates found using standard models.
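
The contrast between the two modelling assumptions can be sketched as follows. This is not the MFID algorithm itself, which is adaptive and estimates the most fertile day from data; the sketch treats the day-specific probabilities as known, and the function names and numbers are hypothetical.

```python
from math import prod


def p_conception_independent(day_probs, intercourse):
    """Standard-style model: each act contributes independently.

    Returns 1 minus the product of (1 - p_i) over days with intercourse.
    """
    return 1.0 - prod(1.0 - p for p, had in zip(day_probs, intercourse) if had)


def p_conception_most_fertile(day_probs, intercourse):
    """Approximation: only the act on the most fertile intercourse day counts."""
    candidates = [p for p, had in zip(day_probs, intercourse) if had]
    return max(candidates, default=0.0)


# Made-up day-specific probabilities and an intercourse pattern for one cycle.
day_probs = [0.05, 0.20, 0.30, 0.10]
intercourse = [True, False, True, True]

print(round(p_conception_independent(day_probs, intercourse), 4))
print(round(p_conception_most_fertile(day_probs, intercourse), 4))
```

The two values are close when one day dominates the fertile window, which is consistent with the abstract's report that estimates are similar to those from standard models.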
