Similar articles
1.
We discuss a log-linear model for series of regular bird counts taken at a number of survey sites. The model is parameterized in terms of annual growth rates rather than actual indices of abundance, as is more frequently done. This not only permits easy estimation of and inference about these rates, but also allows us to model the effects upon population growth of covariates, such as the local presence of a competitor or predator, which may themselves vary in space and over time. A recursive relationship permits the expected count at a site to be functionally dependent upon the expected count at the previous visit. We discuss the advantages of using this relationship rather than replacing the previous expected counts with their observed counterparts, as has been done previously.
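
The recursive relationship described in this abstract can be illustrated with a minimal sketch. The abstract does not give the model equations, so the form used here is an assumption: the expected count at each visit equals the previous expected count multiplied by the exponential of a log-scale growth rate, and that rate is a linear function of one covariate. The function names, coefficient values, and covariate series are all hypothetical.

```python
import math


def growth_rate(beta0, beta1, covariate):
    """Log-scale growth rate as a linear function of one covariate (assumed form)."""
    return beta0 + beta1 * covariate


def expected_counts(initial_mean, growth_rates):
    """Expected counts for visits 0..T, built recursively from the previous expected count."""
    means = [initial_mean]
    for rate in growth_rates:
        # Each expected count depends on the previous *expected* count,
        # not on the previous observed count.
        means.append(means[-1] * math.exp(rate))
    return means


# Hypothetical covariate: 1 when a predator is present at the site in that year.
predator_present = [0, 1, 1, 0]
rates = [growth_rate(0.05, -0.20, x) for x in predator_present]
means = expected_counts(40.0, rates)

for year, mean in enumerate(means):
    print(f"year {year}: expected count {mean:.2f}")
```

Under this parameterization the log of the ratio between the last and first expected counts equals the sum of the growth rates, which is why the rates, rather than abundance indices, are the natural quantities to estimate.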

2.
Biological data are often intrinsically hierarchical (e.g., species from different genera, plants within different mountain regions), which has made mixed-effects models a common analysis tool in ecology and evolution because they can account for the non-independence. Many questions around their practical application are solved, but one is still debated: Should we treat a grouping variable with a low number of levels as a random or fixed effect? In such situations, the variance estimate of the random effect can be imprecise, but it is unknown whether this affects statistical power and type I error rates of the fixed effects of interest. Here, we analyzed the consequences of treating a grouping variable with 2–8 levels as a fixed or random effect in correctly specified and alternative models (under- or overparametrized models). We calculated type I error rates and statistical power for all model specifications and quantified the influences of study design on these quantities. We found no influence of model choice on type I error rate and power on the population-level effect (slope) for random intercept-only models. However, with varying intercepts and slopes in the data-generating process, using a random slope and intercept model, and switching to a fixed-effects model in case of a singular fit, avoids overconfidence in the results. Additionally, the number of levels and the differences between them strongly influence power and type I error. We conclude that inferring the correct random-effect structure is of great importance to obtain correct type I error rates. We encourage starting with a mixed-effects model independent of the number of levels in the grouping variable and switching to a fixed-effects model only in case of a singular fit. With these recommendations, we allow for more informed choices about study design and data analysis and make ecological inference with mixed-effects models more robust for small numbers of levels.

3.
N-alkanes are long-chain saturated hydrocarbons occurring in plant cuticles that can be used as chemical markers for estimating the diet composition of herbivores. An important constraint of using n-alkanes to estimate diet composition with currently employed mathematical procedures is that the number of markers must be equal to or larger than the number of diet components. This is a considerable limitation when dealing with free-ranging herbivores feeding on complex plant communities. We present a novel approach for the estimation of diet composition using n-alkanes which applies equally to cases where the number of markers is lower than, equal to, or greater than the number of plant species in the diet. The model uses linear programming to estimate the minimum and maximum proportions of each plant in the diet, and avoids the need for grouping species in order to reduce the number of estimated dietary components. We illustrate the model with two data sets of n-alkane content of plants and faeces obtained from a sheep grazing experiment conducted in Australia and a red deer study in Portugal. Our results are consistent with previous studies on those data sets and provide additional information on the proportions of individual species in the diet. Results show that the sheep diet included high proportions of white clover (from 0.25 to 0.37) and relatively high proportions of grasses (e.g. brome from 0.14 to 0.26), but the sheep tended to avoid Lotus spp. (always less than 0.04 of the diet). For red deer we found high proportions of legumes (e.g. Trifolium angustifolium and Vicia sativa reaching maximum proportions of 0.42 and 0.30 of the diet, respectively), with grasses being less important and Cistus ladanifer, a browse, also having relevance (from 0.21 to 0.42 of the diet).

4.
Genotype-by-environment interaction is caused by variation in genetic environmental sensitivity (GES), which can be subdivided into macro- and micro-GES. Macro-GES is genetic sensitivity to macro-environments (definable environments often shared by groups of animals), while micro-GES is genetic sensitivity to micro-environments (individual environments). A combined reaction norm and double hierarchical generalised linear model (RN-DHGLM) allows for simultaneous estimation of base genetic, macro- and micro-GES effects. The accuracy of variance components estimated using a RN-DHGLM has been explicitly studied for balanced data, and a data size of at least 100 sires with at least 100 offspring each has been recommended. In the current study, the data size (numbers of sires and progeny) and structure requirements of the RN-DHGLM were investigated for two types of unbalanced datasets. Both datasets had a variable number of offspring per sire, but one dataset also had a variable number of offspring within macro-environments. The accuracy and bias of the estimated macro- and micro-GES effects and the estimated breeding values (EBVs) obtained using the RN-DHGLM depended on the data size. Reasonably accurate and unbiased estimates were obtained with data containing 500 sires with 20 offspring or 100 sires with 50 offspring, regardless of the data structure. Variable progeny group sizes, alone or in combination with an unequal number of offspring within macro-environments, had little impact on the dispersion of the EBVs or the bias and accuracy of variance component estimation, but resulted in lower accuracies of the EBVs. Compared to genetic correlations of zero, a genetic correlation of 0.5 between base genetic, macro- and micro-GES components resulted in a slight decrease in the percentage of replicates that converged out of 100 replicates, but had no effect on the dispersion and accuracy of variance component estimation or the dispersion of the EBVs.
The results show that it is possible to apply the RN-DHGLM to unbalanced datasets to obtain estimates of variance due to macro- and micro-GES. Furthermore, the levels of accuracy and bias of variance estimates when analysing macro- and micro-GES simultaneously are determined by average family size, with limited impact from variability in family size and/or cohort size. This creates opportunities for the use of field data from populations with unbalanced data structures when estimating macro- and micro-GES.

5.
Proportion data from dose-response experiments are often overdispersed, characterised by a larger variance than assumed by the standard binomial model. Here, we present different models proposed in the literature that incorporate overdispersion. We also discuss how to select the best model to describe the data and present, using R software, specific code used to fit and interpret binomial, quasi-binomial, beta-binomial, and binomial-normal models, as well as to assess goodness-of-fit. We illustrate applications of these generalized linear models and generalized linear mixed models with a case study from a biological control experiment, where different isolates of Isaria fumosorosea (Hypocreales: Cordycipitaceae) were used to assess which ones showed higher resistance to UV-B radiation. We show how to test for differences between isolates and also how to statistically group isolates that behave similarly.
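
As a rough illustration of what overdispersion means for proportion data, the sketch below computes a Pearson dispersion statistic for replicate binomial counts that share a single success probability. It is not the authors' R code: the function name and the data are hypothetical, and the intercept-only setting is an assumption chosen to keep the example short. A value near 1 is what the standard binomial model implies; values well above 1 point to overdispersion.

```python
def pearson_dispersion(successes, trials):
    """Pearson chi-square divided by its degrees of freedom for an intercept-only binomial model."""
    p_hat = sum(successes) / sum(trials)  # pooled estimate of the success probability
    if p_hat in (0.0, 1.0):
        raise ValueError("dispersion is undefined when all outcomes are identical")
    chi_square = sum(
        (y - n * p_hat) ** 2 / (n * p_hat * (1.0 - p_hat))
        for y, n in zip(successes, trials)
    )
    degrees_of_freedom = len(successes) - 1  # one parameter (p_hat) was estimated
    return chi_square / degrees_of_freedom


# Hypothetical replicate counts out of 20 units each.
germinated = [2, 18, 5, 15]
tested = [20, 20, 20, 20]
dispersion = pearson_dispersion(germinated, tested)
print(f"estimated dispersion: {dispersion:.2f}")
```

A quasi-binomial fit scales the binomial variance by a dispersion estimate of this kind, whereas beta-binomial and binomial-normal models attribute the extra variation to a distribution on the success probability or to a random effect.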

6.
In the present paper the linear logistic extension of latent class analysis is described. It is assumed that the item latent probabilities as well as the class sizes can be attributed to some explanatory variables. The basic equations of the model state the decomposition of the log-odds of the item latent probabilities and of the class sizes into weighted sums of basic parameters representing the effects of the predictor variables. Further, the maximum likelihood equations for these effect parameters and statistical tests for goodness-of-fit are given. Finally, an example illustrates the practical application of the model and the interpretation of the model parameters.

7.
We propose an extension to the estimating equations in generalized linear models to estimate parameters in the link function and variance structure simultaneously with regression coefficients. Rather than focusing on the regression coefficients, the purpose of these models is inference about the mean of the outcome as a function of a set of covariates, and various functionals of the mean function used to measure the effects of the covariates. A commonly used functional in econometrics, referred to as the marginal effect, is the partial derivative of the mean function with respect to any covariate, averaged over the empirical distribution of covariates in the model. We define an analogous parameter for discrete covariates. The proposed estimation method not only helps to identify an appropriate link function and to suggest an underlying distribution for a specific application but also serves as a robust estimator when no specific distribution for the outcome measure can be identified. Using Monte Carlo simulations, we show that the resulting parameter estimators are consistent. The method is illustrated with an analysis of inpatient expenditure data from a study of hospitalists.

8.
Extensions of linear models are very commonly used in the analysis of biological data. Whereas goodness of fit measures such as the coefficient of determination (R²) or the adjusted R² are well established for linear models, it is not obvious how such measures should be defined for generalized linear and mixed models. There are by now several proposals but no consensus has yet emerged as to the best unified approach in these settings. In particular, it is an open question how to best account for heteroscedasticity and for covariance among observations present in residual error or induced by random effects. This paper proposes a new approach that addresses this issue and is universally applicable for arbitrary variance-covariance structures including spatial models and repeated measures. It is exemplified using three biological examples.

9.
To facilitate decision support in freshwater ecosystem protection and restoration management, habitat suitability models can be very valuable. Data-driven methods such as artificial neural networks (ANNs) are particularly useful in this context, given their time-efficient development and relatively high reliability. However, specialized and technical literature on neural network modelling offers a variety of model development criteria to select model architecture, training procedure, etc. This may lead to confusion among ecosystem modellers and managers regarding the optimal training and validation methodology. This paper focuses on the analysis of ANN development and application for predicting macroinvertebrate communities, a species group commonly used in freshwater assessment worldwide. This review reflects on the different aspects regarding model development and application based on a selection of 26 papers reporting the use of ANN models for the prediction of macroinvertebrates. This analysis revealed that the applied model training and validation methodologies can often be improved and, moreover, that crucial steps in the modelling process are often poorly documented. Therefore, suggestions to improve model development, assessment and application in ecological river management are presented. In particular, data pre-processing determines to a high extent the reliability of the induced models and their predictive relevance. The same holds for the validation criteria, which need to be better tuned to the practical simulation requirements. Moreover, the use of sensitivity methods can help to extract knowledge on the habitat preference of species and allow peer review by ecological experts. The selection of relevant input variables remains a critical challenge as well. Model coupling is a missing crucial step to link human activities, hydrology, physical habitat conditions, water quality and ecosystem status.
This last aspect is probably the most valuable for enabling decision support in water management based on ANN models.

10.
Model-based geostatistical design involves the selection of locations to collect data to minimize an expected loss function over a set of all possible locations. The loss function is specified to reflect the aim of data collection, which, for geostatistical studies, could be to minimize the prediction uncertainty at unobserved locations. In this paper, we propose a new approach to design such studies via a loss function derived through considering the entropy about the model predictions and the parameters of the model. The approach includes a multivariate extension to generalized linear spatial models, and thus can be used to design experiments with more than one response. Unfortunately, evaluating our proposed loss function is computationally expensive, so we provide an approximation such that our approach can be adopted to design realistically sized geostatistical studies. This is demonstrated through a simulated study and through designing an air quality monitoring program in Queensland, Australia. The results show that our designs remain highly efficient in achieving each experimental objective individually, providing an ideal compromise between the two objectives. Accordingly, we advocate that our approach could be adopted more generally in model-based geostatistical design.

11.
Simple ratios in which a measurement variable is divided by a size variable are commonly used but known to be inadequate for eliminating size correlations from morphometric data. Deficiencies in the simple ratio can be alleviated by incorporating regression coefficients describing the bivariate relationship between the measurement and size variables. Recommendations have included: 1) subtracting the regression intercept to force the bivariate relationship through the origin (intercept-adjusted ratios); 2) exponentiating either the measurement or the size variable using an allometry coefficient to achieve linearity (allometrically adjusted ratios); or 3) both subtracting the intercept and exponentiating (fully adjusted ratios). These three strategies for deriving size-adjusted ratios imply different data models for describing the bivariate relationship between the measurement and size variables (i.e., the linear, simple allometric, and full allometric models, respectively). Algebraic rearrangement of the equation associated with each data model leads to a correctly formulated adjusted ratio whose expected value is constant (i.e., size correlation is eliminated). Alternatively, simple algebra can be used to derive an expected value function for assessing whether any proposed ratio formula is effective in eliminating size correlations. Some published ratio adjustments were incorrectly formulated as indicated by expected values that remain a function of size after ratio transformation. Regression coefficients incorporated into adjusted ratios must be estimated using least-squares regression of the measurement variable on the size variable. Use of parameters estimated by any other regression technique (e.g., major axis or reduced major axis) results in residual correlations between size and the adjusted measurement variable. Correctly formulated adjusted ratios, whose parameters are estimated by least-squares methods, do control for size correlations. 
The size-adjusted results are similar to those based on analysis of least-squares residuals from the regression of the measurement on the size variable. However, adjusted ratios introduce size-related changes in distributional characteristics (variances) that differentially alter relationships among animals in different size classes. © 1993 Wiley-Liss, Inc.
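
The intercept-adjusted ratio from this abstract can be sketched in a few lines. The example assumes the linear data model, in which the measurement equals an intercept plus a slope times size, and uses noise-free hypothetical data so the effect is exact. The helper name `least_squares` and all values are illustrative only. The simple ratio still changes with size, whereas subtracting the least-squares intercept before dividing gives a constant.

```python
def least_squares(x, y):
    """Ordinary least-squares intercept and slope for the regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical noise-free data following the linear model: measurement = 2 + 3 * size.
size = [10.0, 20.0, 30.0, 40.0]
measurement = [2.0 + 3.0 * s for s in size]

intercept, slope = least_squares(size, measurement)

# Simple ratio: equals slope + intercept / size, so it still depends on size.
simple_ratio = [m / s for m, s in zip(measurement, size)]

# Intercept-adjusted ratio: forces the relationship through the origin.
adjusted_ratio = [(m - intercept) / s for m, s in zip(measurement, size)]

print("simple ratios:  ", [round(r, 4) for r in simple_ratio])
print("adjusted ratios:", [round(r, 4) for r in adjusted_ratio])
```

With real, noisy data the adjusted ratio equals the slope plus a residual divided by size. Its expected value is then constant, but its variance shrinks as size grows, which matches the abstract's caveat about size-related changes in variances.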

12.
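Estimating the development rate of cotton bollworm pupae under constant and alternating temperatures with nonlinear models   Total citations: 4, self-citations: 3, citations by others: 1
To analyse in depth the relationship between insect development and ambient temperature, the developmental duration (d) of pupae of the cotton bollworm Helicoverpa armigera was measured under constant temperatures (15–37℃) and alternating temperatures (12/18–34/40℃), and the development rate (1/d) data were fitted with a linear model and with nonlinear models (the Logan, Lactin, and Wang models). The results show that the three nonlinear models describe the curvilinear relationship between development rate and temperature more accurately, with coefficients of determination (R²) between 0.9878 and 0.9991. Further study of the full set of observations shows that as few as six suitably distributed observations are enough for these nonlinear models to give quite satisfactory estimates. Without measurements at high temperatures, the insect development rates predicted by the nonlinear models may be distorted. Possible reasons for the difference in pupal development rate between constant and alternating temperatures are analysed, the advantages and disadvantages of using the three nonlinear models to predict pupal development are discussed, and the rationale and need for replacing the linear day-degree model with nonlinear models in forecasting pest occurrence and managing the rearing of beneficial insects are pointed out.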

13.
Modeling plant growth using functional traits is important for understanding the mechanisms that underpin growth and for predicting growth in new situations. We use three data sets on plant height over time and two validation methods (in-sample model fit and leave-one-species-out cross-validation) to evaluate non-linear growth model predictive performance based on functional traits. In-sample measures of model fit differed substantially from out-of-sample model predictive performance; the best fitting models were rarely the best predictive models. Careful selection of predictor variables reduced the bias in parameter estimates, and there was no single best model across our three data sets. Testing and comparing multiple model forms is important. We developed an R package with a formula interface for straightforward fitting and validation of hierarchical, non-linear growth models. Our intent is to encourage thorough testing of multiple growth model forms and an increased emphasis on assessing model fit relative to a model's purpose.

14.
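Application and comparison of generalized models and classification and regression trees in simulating species distributions   Total citations: 19, self-citations: 0, citations by others: 19
Cao Mingchang, Zhou Guangsheng, Weng Ensheng 《Acta Ecologica Sinica》2005,25(8):2031-2040
Three widely used models for simulating the geographical distribution of species, the generalized linear model (GLM), the generalized additive model (GAM), and classification and regression trees (CART), were compared for their performance in simulating the geographical distributions of tree species in China, in order to identify a more suitable model for simulating species distributions and for predicting the effects of climate change on them. Simulations of the distributions of 15 tree species in China showed that, apart from slightly lower accuracy for Pinus tabuliformis and Quercus liaotungensis, simulation accuracy was high for the remaining species, with the GAM performing best. Combined with a geographical information system (GIS), the simulated distributions of four tree species (Cyclobalanopsis glauca, Schima superba, Pinus koraiensis, and Pinus tabuliformis) were compared across the three models. The results likewise showed that all three models simulated the distributions of Cyclobalanopsis glauca and Schima superba well, that the GLM gave a less satisfactory result for Pinus koraiensis, and that none of the three models gave a satisfactory result for Pinus tabuliformis, with the GLM performing worst. Predictions from the three models of the distributions of Cyclobalanopsis glauca and Quercus mongolica under future climate change showed that the GLM and GAM gave similar results for Cyclobalanopsis glauca, which expands westward and northward under the future climate scenario, whereas CART predicted that, in addition to a westward and northward expansion, the distribution area of Cyclobalanopsis glauca in Guangdong and southern Guangxi would disappear. All three models predicted a westward expansion of Quercus mongolica under the future climate scenario, with the extent of expansion ordered as: simulated area of the … model > … model > … model.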

15.
Summary. An equivalence between restricted best linear unbiased prediction (and thus restricted selection index) and a particular example of a selection model is presented. Specifically, the equivalence is between restricted selection and a model of selection on the residuals of the general mixed linear model. This result illustrates that restricted selection acts by nonrandomly sampling those genes that act pleiotropically in multiple trait genetic models. An expression for a mixed linear model which includes restrictions is also presented.

16.
The identification of core habitat areas and resulting prediction maps are vital tools for land managers. Often, agencies have large datasets from multiple studies over time that could be combined for a more informed and complete picture of a species. Colorado Parks and Wildlife has a large database for greater sage-grouse (Centrocercus urophasianus) including 11 radio-telemetry studies completed over 12 years (1997–2008) across northwestern Colorado. We divided the 49,470-km² study area into 1-km² grid cells and used the number of sage-grouse locations in each cell that contained at least 1 location as the response variable. We used a generalized linear mixed model (GLMM) with land cover variables as fixed effects and individual birds and populations as random effects to predict greater sage-grouse location counts during breeding, summer, and winter seasons. The mixed effects model enabled us to model correlations that may exist in grouped data (e.g., correlations among individuals and populations). We found that only individual groupings accounted for variation in the summer and breeding seasons, but not the winter season. The breeding and summer seasonal models predicted sage-grouse presence in the currently delineated populations for Colorado, but we found little evidence supporting a winter season model. According to our models, about 50% of the study area in Colorado is considered highly or moderately suitable habitat in both the breeding and summer seasons. As oil and gas development and other landscape changes occur in this portion of Colorado, knowledge of where management actions can be accomplished or possible restoration can occur becomes more critical. These seasonal models provide data-driven distribution maps that managers and biologists can use for identification and exploration when investigating greater sage-grouse issues across the Colorado range.
Using historic data for future decisions on species management while accounting for issues found from combining datasets allows land managers the flexibility to use all information available. © 2013 The Wildlife Society.

17.
Restricted breeding seasons used in beef cattle produce censored data for reproduction traits measured in regard to these seasons. To analyze these data, adequate methods must be used. The objective of this paper was to compare three approaches aiming to evaluate sexual precocity in Nellore cattle. The final data set contained 6699 records of age at first conception (AFC14) (in days) and of heifer pregnancy (HP14) (binary) obtained from females exposed to the bulls for the first time at about 14 months of age. Records of females that did not calve in the following year after being exposed to a sire were considered censored (77.5% of total). The models used to obtain genetic parameters and expected progeny differences (EPDs) were a Weibull mixed model and a censored linear model for AFC14 and a threshold model for HP14. The mean heritabilities obtained were 0.76 and 0.44, respectively, for survival and censored linear models (for AFC14), and 0.58 for HP14. Rank and Pearson correlations varied (in absolute values) from 0.54 to 0.99 (considering different percentages of sires selected), indicating moderate changes in the classification. Considering survival analysis as the best selection criterion (the one that would result in the best response to selection), it was observed that selection for HP14 would lead to a larger decrease in selection response than selection for AFC14 analysed with the censored linear model, whose results were very similar to those of the survival analysis.

18.
In clinical research and practice, landmark models are commonly used to predict the risk of an adverse future event, using patients' longitudinal biomarker data as predictors. However, these data are often observable only at intermittent visits, making their measurement times irregularly spaced and unsynchronized across different subjects. This poses challenges to conducting dynamic prediction at any post-baseline time. A simple solution is the last-value-carry-forward method, but this may result in bias for the risk model estimation and prediction. Another option is to jointly model the longitudinal and survival processes with a shared random effects model. However, when dealing with multiple biomarkers, this approach often results in high-dimensional integrals without a closed-form solution, and thus the computational burden limits its software development and practical use. In this article, we propose to process the longitudinal data by functional principal component analysis techniques, and then use the processed information as predictors in a class of flexible linear transformation models to predict the distribution of residual time-to-event occurrence. The measurement schemes for multiple biomarkers are allowed to be different within subject and across subjects. Dynamic prediction can be performed in a real-time fashion. The advantages of our proposed method are demonstrated by simulation studies. We apply our approach to the African American Study of Kidney Disease and Hypertension, predicting patients' risk of kidney failure or death by using four important longitudinal biomarkers for renal functions.

19.
In this paper we consider balanced three-treatment three-period crossover designs. Using a strategy similar to that applied in the analysis of split-plot experiments we describe both the within and the between sample units models, as well as the corresponding Analysis of Variance. We illustrate these procedures with a numerical example and discuss their implementation through computer programs designed for the analysis of the general linear model.

20.
Huang X 《Biometrics》2009,65(2):361-368
Summary. Generalized linear mixed models (GLMMs) are widely used in the analysis of clustered data. However, the validity of likelihood-based inference in such analyses can be greatly affected by the assumed model for the random effects. We propose a diagnostic method for random-effect model misspecification in GLMMs for clustered binary response. We provide a theoretical justification of the proposed method and investigate its finite sample performance via simulation. The proposed method is applied to data from a longitudinal respiratory infection study.
