Selecting discriminant function models for predicting the expected richness of aquatic macroinvertebrates

Authors:	JOHN VAN SICKLE; DAVID D. HUFF; CHARLES P. HAWKINS

Institution:	US Environmental Protection Agency, National Health and Environmental Effects Laboratory, Western Ecology Division, Corvallis, OR, U.S.A.; Oregon Department of Environmental Quality, Watershed Assessment Section, Portland, OR, U.S.A.; Western Center for Monitoring and Assessment of Freshwater Ecosystems, Department of Aquatic, Watershed and Earth Resources, Utah State University, Logan, UT, U.S.A.

Abstract:	1. The predictive modelling approach to bioassessment estimates the macroinvertebrate assemblage expected at a stream site if it were in a minimally disturbed reference condition. The difference between expected and observed assemblages then measures the departure of the site from reference condition.
2. Most predictive models employ site classification, followed by discriminant function (DF) modelling, to predict the expected assemblage from a suite of environmental variables. Stepwise DF analysis is normally used to choose a single subset of DF predictor variables with a high accuracy for classifying sites. An alternative is to screen all possible combinations of predictor variables, in order to identify several ‘best’ subsets that yield good overall performance of the predictive model.
3. We applied best‐subsets DF analysis to assemblage and environmental data from 199 reference sites in Oregon, U.S.A. Two sets of 66 best DF models containing between one and 14 predictor variables (that is, having model orders from one to 14) were developed, for five‐group and 11‐group site classifications.
4. Resubstitution classification accuracy of the DF models increased consistently with model order, but cross‐validated classification accuracy did not improve beyond seventh or eighth‐order models, suggesting that the larger models were overfitted.
5. Overall predictive model performance at model training sites, measured by the root‐mean‐squared error of the observed/expected species richness ratio, also improved steadily with DF model order. But high‐order DF models usually performed poorly at an independent set of validation sites, another sign of model overfitting.
6. Models selected by stepwise DF analysis showed evidence of overfitting and were outperformed by several of the best‐subsets models.
7. The group separation strength of a DF model, as measured by Wilks’ Λ, was more strongly correlated with overall predictive model performance at training sites than was DF classification accuracy.
8. Our results suggest improved strategies for developing reliable, parsimonious predictive models. We emphasise the value of independent validation data for obtaining a realistic picture of model performance. We also recommend assessing not just one or two, but several, candidate models based on their overall performance as well as the performance of their DF component.
9. We provide links to our free software for stepwise and best‐subsets DF analysis.
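
The screening idea in the abstract (try every combination of predictors, keep several good subsets per model order, then judge the O/E ratio) can be illustrated with a minimal Python sketch. This is not the authors' software. All function names are hypothetical, subsets are ranked only by Wilks' Λ computed as det(W)/det(T) from the within-group and total scatter matrices, and the RMSE is assumed to be taken about 1, the O/E value expected at a reference site. Cross-validation and the DF classification step itself are left out.

```python
from itertools import combinations
from math import sqrt


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def scatter(rows):
    """Sums of squares and cross-products about the column means."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows)
             for j in range(p)] for i in range(p)]


def wilks_lambda(X, groups, cols):
    """det(W) / det(T) for the predictor columns in `cols`.

    Smaller values mean stronger group separation. Assumes the total
    scatter matrix is non-singular (no constant or collinear columns).
    """
    sub = [[row[c] for c in cols] for row in X]
    total = scatter(sub)
    within = [[0.0] * len(cols) for _ in cols]
    for g in set(groups):
        s = scatter([r for r, lab in zip(sub, groups) if lab == g])
        within = [[w + v for w, v in zip(wr, sr)] for wr, sr in zip(within, s)]
    return det(within) / det(total)


def best_subsets(X, groups, max_order, keep=3):
    """For each model order, keep the `keep` subsets with the lowest Wilks' lambda."""
    n_pred = len(X[0])
    best = {}
    for order in range(1, max_order + 1):
        scored = [(wilks_lambda(X, groups, cols), cols)
                  for cols in combinations(range(n_pred), order)]
        best[order] = sorted(scored)[:keep]
    return best


def rmse_oe(observed, expected):
    """RMSE of O/E richness ratios about 1 (assumed reference value)."""
    return sqrt(sum((o / e - 1.0) ** 2
                    for o, e in zip(observed, expected)) / len(observed))


# Toy data (invented for illustration): 6 sites, 3 predictors, 2 site groups.
X = [[1.0, 5.0, 2.0], [1.2, 3.0, 2.5], [0.8, 4.0, 1.5],
     [3.0, 4.5, 2.2], [3.2, 3.5, 2.8], [2.8, 5.5, 1.9]]
groups = ["a", "a", "a", "b", "b", "b"]

for order, models in best_subsets(X, groups, max_order=2).items():
    print(order, models)
print(rmse_oe([10, 12, 8], [10, 10, 10]))
```

Exhaustive enumeration grows combinatorially with the number of candidate predictors, so a sketch like this suits only small predictor sets. As the abstract stresses, any subsets it shortlists would still need to be checked against independent validation sites.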

Keywords:	discriminant function; expected richness; model selection; model validation; O/E model; RIVPACS
