Similar literature
 20 similar records found
1.
Wenkai Li  Qinghua Guo 《Ecography》2013,36(7):788-799
It is very common that only presence data are available in ecological niche modeling. However, most existing methods for evaluating the accuracy of presence–absence (binary) predictions of species require presence–absence data. The aim of this study is to present a new method for accuracy assessment that does not rely on absence data. Two new statistics, Fpb and Fcpb, were derived based on presence–background data. With six generated virtual species, we used DOMAIN, generalized linear modeling (GLM), and maximum entropy (MAXENT) to produce different species presence–absence predictions. To investigate the effectiveness of the new statistics in accuracy assessment, we used Fpb, Fcpb, the traditional F-measure (F), kappa coefficient, true skill statistic (TSS), area under the receiver operating characteristic curve (AUC), and the contrast validation index (CVI) to evaluate the accuracy of predictions, and the behaviors of these accuracy measures were compared. The effectiveness of Fpb for threshold selection and estimation of species prevalence was also investigated. Experimental results show that Fcpb is an estimate of F. The Pearson's correlation coefficient (COR) between Fcpb and F is 0.9882, with a root-mean-square error (RMSE) of 0.0171. In general, Fpb, Fcpb, F, kappa coefficient, TSS, and CVI can sort models by the accuracy of binary prediction, but AUC is not appropriate for evaluating the accuracy of binary prediction. For DOMAIN, GLM, and MAXENT, finding the threshold by maximizing Fpb and by maximizing F results in similar accuracies. In addition, the estimation of species prevalence based on binary output with maximizing Fpb as the thresholding method is significantly more accurate than simply averaging the original continuous output. The best estimate of prevalence is provided by the binary output of MAXENT, with an RMSE of 0.0116. Finally, we conclude that the new method is promising in accuracy assessment, threshold selection, and estimation of species prevalence, all of which are important but challenging problems with presence-only data. Because it does not require absence data, the new method will have important applications in ecological niche modeling.
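For orientation, the traditional measures that the abstract compares against (F, kappa, TSS) can all be computed from a 2x2 confusion matrix, which is exactly why they need absence data. The following is a minimal sketch using the standard textbook definitions; the function name and the example counts are illustrative assumptions, and Fpb and Fcpb are not shown because the abstract does not give their formulas.

```python
def binary_accuracy_measures(tp, fp, fn, tn):
    """Return F-measure, Cohen's kappa and TSS from confusion-matrix counts.

    tp/fn: presences predicted present/absent.
    fp/tn: absences predicted present/absent.
    """
    n = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # sensitivity
    specificity = tn / (tn + fp)

    f_measure = 2 * precision * recall / (precision + recall)

    # Kappa: agreement beyond what the marginal totals give by chance.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)

    # True skill statistic: sensitivity + specificity - 1.
    tss = recall + specificity - 1
    return {"F": f_measure, "kappa": kappa, "TSS": tss}


# Hypothetical counts, for illustration only.
measures = binary_accuracy_measures(tp=40, fp=10, fn=10, tn=40)
```

Every term here except recall uses the absence counts (fp, tn), which is the gap the presence–background statistics are meant to fill.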

2.
In the development of a species distribution model based on regression techniques such as generalized linear or additive modelling (GLM/GAM), a basic assumption is that records of species presence and absence are real. However, a common concern in many studies examining species distributions is that absences cannot be inferred with certainty. This is particularly the case where the species is rare, difficult to detect and/or does not occupy all available habitat considered suitable. The western ground parrot (Pezoporus wallicus flaviventris) of southern Western Australia, Australia, is a case in point, as not only is it rare and difficult to detect, but it is also unlikely to occupy all available suitable habitat. A recent survey of ground parrots provided the opportunity to develop a predictive distribution model. As the data were susceptible to false absences, these were replaced with randomly selected 'pseudo' absences and modelled using GLM. As a comparison, presence-only information was modelled using a relatively new approach, MAXENT, a machine-learning technique that has been shown to perform comparatively well. The predictive performance of both models, as assessed by the receiver operating characteristic plot (ROC), was high (AUC > 0.8), with MAXENT performing only marginally better than the GLM. These approaches both indicated that the ground parrot prefers areas relatively high in altitude, distant from rivers, gently sloping to level habitat, with an intermediate cover of vegetation and where there is a mosaic of vegetation ages. In this case, the use of presence-only information resulted in the identification of important environmental attributes defining the occurrence of the ground parrot, but additional factors that account for the inability of the bird to occupy all suitable habitat should be a component of model refinement.

3.
Approaches for modelling the distribution of animals in relation to their environment can be divided into two basic types, those which use records of absence as well as records of presence and those which use only presence records. For terrestrial species, presence–absence approaches have been found to produce models with greater predictive ability than presence-only approaches. This study compared the predictive ability of both approaches for a marine animal, the harbour porpoise (Phocoena phocoena). Using data on the occurrence of harbour porpoises in the Sea of Hebrides, Scotland, the predictive abilities of one presence–absence approach (generalised linear modelling, GLM) and three presence-only approaches (principal component analysis, PCA; ecological niche factor analysis, ENFA; and genetic algorithm for rule-set prediction, GARP) were compared. When the predictive ability of the models was assessed using receiver operating characteristic (ROC) plots, the presence–absence approach (GLM) was found to have the greatest predictive ability. However, all approaches were found to produce models that predicted occurrence significantly better than a random model and the GLM model did not perform significantly better than ENFA and GARP. The PCA had a significantly lower predictive ability than GLM but not the other approaches. In addition, all models predicted a similar spatial distribution. Therefore, while models constructed using presence–absence approaches are likely to provide the best understanding of species distribution within a surveyed area, presence-only models can perform almost as well. However, careful consideration of the potential limitations and biases in the data, especially with regard to representativeness, is needed if the results of presence-only models are to be used for conservation and/or management purposes. Guest editor: V. D. Valavanis. Essential Habitat Mapping in the Mediterranean.
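The ROC-based comparison described above reduces each model to an area under the curve (AUC). As a rough sketch, AUC can be read as the probability that a randomly chosen presence site receives a higher score than a randomly chosen absence site, with ties counted as one half. The function name and the scores below are hypothetical, and this is not the procedure used in the study, only the standard rank interpretation.

```python
def auc(presence_scores, absence_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0      # presence ranked above absence
            elif p == a:
                wins += 0.5      # ties count as half
    return wins / (len(presence_scores) * len(absence_scores))


# Hypothetical model scores at presence and absence sites.
model_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

A value of 0.5 corresponds to the "random model" baseline mentioned in the abstract.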

4.
Saccharomyces cerevisiae alcohol dehydrogenases responsible for NADH- and NADPH-specific reduction of the furaldehydes 5-hydroxymethyl-furfural (HMF) and furfural have previously been identified. In the present study, strains overexpressing the corresponding genes (mut-ADH1 and ADH6), together with a control strain, were compared in defined medium for anaerobic fermentation of glucose in the presence and absence of HMF. All strains showed a similar fermentation pattern in the absence of HMF. In the presence of HMF, the strain overexpressing ADH6 showed the highest HMF reduction rate and the highest specific ethanol productivity, followed by the strain overexpressing mut-ADH1. This correlated with the in vitro HMF reduction capacity observed in the ADH6 overexpressing strain. Acetate and glycerol yields per biomass increased considerably in the ADH6 strain. In the other two strains, only the overall acetate yield per biomass was affected. When compared in batch fermentation of spruce hydrolysate, strains overexpressing ADH6 and mut-ADH1 had a five times higher HMF uptake rate than the control strain and improved specific ethanol productivity. Overall, our results demonstrate that (1) the cofactor usage in the HMF reduction affects the product distribution, and (2) increased HMF reduction activity results in increased specific ethanol productivity in defined mineral medium and in spruce hydrolysate.

5.
Dynamical modeling has proven useful for understanding how complex biological processes emerge from the many components and interactions composing genetic regulatory networks (GRNs). However, the development of models is hampered by large uncertainties in both the network structure and parameter values. To remedy this problem, the models are usually developed through an iterative process based on numerous simulations, confronting model predictions with experimental data and refining the model structure and/or parameter values to repair the inconsistencies. In this paper, we propose an alternative to this generate-and-test approach. We present a four-step method for the systematic construction and analysis of discrete models of GRNs by means of a declarative approach. Instead of instantiating the models as in classical modeling approaches, the biological knowledge on the network structure and its dynamics is formulated in the form of constraints. The compatibility of the network structure with the constraints is queried and, in case of inconsistencies, some constraints are relaxed. Common properties of the consistent models are then analyzed by means of dedicated languages. Two such languages are introduced in the paper. Removing questionable constraints or adding interesting ones allows further analysis of the models. This approach makes it possible to identify the best experiments to carry out in order to discriminate between sets of consistent models and refine our knowledge of how the system functions. We test the feasibility of our approach by applying it to the re-examination of a model describing the nutritional stress response in the bacterium Escherichia coli.

6.
The breeding system of an annual Cruciferae, Arabidopsis kamchatica subsp. kawasakiana, was studied in three natural populations. We applied four experimental treatments: open pollination, bagging, emasculation + bagging, and emasculation + hand-pollination + bagging. None of the emasculated flowers with bags produced fruits but we observed high fruit sets in the other three treatments. The results confirmed that A. kamchatica subsp. kawasakiana is a self-compatible, non-apomictic species that can produce seeds through auto-pollination. Considering the life cycle as an annual, increased reproductive assurance through auto-pollination should be critical for the maintenance of populations of A. kamchatica subsp. kawasakiana.

7.
A common feature of ecological data sets is their tendency to contain many zero values. Statistical inference based on such data is likely to be inefficient or wrong unless careful thought is given to how these zeros arose and how best to model them. In this paper, we propose a framework for understanding how zero-inflated data sets originate and deciding how best to model them. We define and classify the different kinds of zeros that occur in ecological data and describe how they arise: either from 'true zero' or 'false zero' observations. After reviewing recent developments in modelling zero-inflated data sets, we use practical examples to demonstrate how failing to account for the source of zero inflation can reduce our ability to detect relationships in ecological data and at worst lead to incorrect inference. The adoption of methods that explicitly model the sources of zero observations will sharpen insights and improve the robustness of ecological analyses.

8.
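Zhang Lei  Liu Shirong  Sun Pengsen  Wang Tongli 《生态学报》(Acta Ecologica Sinica) 2011,31(19):5749-5761
Species distribution models are the main tool for predicting and assessing the effects of climate change on species distributions. To reduce the uncertainty in species distribution model predictions, some researchers have recently proposed ensemble forecasting, in which multiple sets of modeling data, modeling techniques, model parameters, and environmental scenario data are used to predict a species' distribution, forming an ensemble of distribution predictions. However, little is known about how much each component of an ensemble forecast contributes to the variation, so the sources of variation need to be partitioned in order to use ensemble forecasting more effectively to reduce prediction uncertainty. Taking Chinese pine (油松) as an example, we used 8 niche models, 9 sets of model training data, 3 GCMs, and one SRES (A2) emission scenario to model its potential distribution under current climate (1961-1990) and under future climate in three periods (2010-2039, 2040-2069, and 2070-2099). In total, 72 sets of current distribution predictions and 216 sets of predictions for each future period were obtained. The ClimateChina software was used to downscale the current and future climate data. Kappa, the true skill statistic (TSS), and the area under the receiver operating characteristic curve (AUC) were used to evaluate model predictive ability. The results showed that random forest (RF), generalized linear models (GLM), generalized additive models (GAM), multivariate adaptive regression splines (MARS), and boosting (GBM) predicted well and were almost unaffected by differences among modeling data sets. Mixture discriminant analysis (MDA) was very sensitive to differences among modeling data sets and in some cases failed to fit a model. Three-way analysis of variance was used to partition the sources of uncertainty in the ensemble forecast. The results showed that differences among models contributed most to the uncertainty of the predictions, accounting for a very high proportion, whereas differences among modeling data sets contributed least, with the GCM contribution in between. This study will help deepen the understanding of uncertainty in species distribution modeling and prediction.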
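The variance partitioning described above can be illustrated with a small sketch. Assuming a balanced, fully crossed design (8 models x 9 training data sets x 3 GCMs = 216 predictions per future period, matching the counts in the abstract), the share of the total sum of squares attributable to each factor's main effect is computed below. The function name and the synthetic values are assumptions for illustration; the abstract does not say which summary of each prediction was analyzed, and interaction terms are left in the residual here.

```python
from itertools import product
from statistics import mean


def main_effect_shares(values, factors):
    """Share of total sum of squares explained by each factor's main effect.

    values:  dict mapping a tuple of factor levels to one numeric summary
             of a prediction (for example, predicted suitable area).
    factors: list of level lists, one per factor, in key order.
    Assumes a balanced, fully crossed design.
    """
    all_values = list(values.values())
    grand = mean(all_values)
    total_ss = sum((v - grand) ** 2 for v in all_values)
    shares = []
    for i, levels in enumerate(factors):
        ss = 0.0
        for level in levels:
            cell = [v for key, v in values.items() if key[i] == level]
            ss += len(cell) * (mean(cell) - grand) ** 2
        shares.append(ss / total_ss)
    return shares


# Synthetic example: model effects are large, data-set effects are small.
models, datasets, gcms = list(range(8)), list(range(9)), list(range(3))
synthetic = {
    (m, d, g): 10.0 * m + 0.1 * d + 1.0 * g
    for m, d, g in product(models, datasets, gcms)
}
model_share, data_share, gcm_share = main_effect_shares(
    synthetic, [models, datasets, gcms]
)
```

With these synthetic values the ordering mirrors the reported pattern (models largest, GCMs intermediate, training data smallest), but only because the example was constructed that way.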

9.
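Liang Hongyan  Jiang Xiaolei  Kong Yuhua  Yang Xitian 《生态学报》(Acta Ecologica Sinica) 2018,38(23):8345-8353
To clarify how the suitable areas of Cymbidium goeringii and C. faberi in China will change under climate warming, we applied a maximum entropy species distribution model, based on 157 distribution records and 19 bioclimatic variables, to predict the suitable areas of the two species in China in 2070 under four greenhouse gas emission scenarios, and screened the main climatic factors affecting their geographic distribution. The results showed that: (1) in 2070, the annual mean temperature (bio1), minimum temperature of the coldest month (bio6), and mean temperature of the coldest quarter (bio11) at the distribution points of C. goeringii and C. faberi all increase, indicating a warming trend; (2) the area under the receiver operating characteristic curve (AUC) was between 0.9 and 1.0, indicating that the model predictions are highly reliable; (3) the main limiting climatic factors affecting the current and 2070 geographic distributions of the two species were the minimum temperature of the coldest month (bio6), mean temperature of the coldest quarter (bio11), annual precipitation (bio12), and precipitation of the driest month (bio14); (4) climate warming will affect the range and area of suitable habitat for both species. The suitable habitat area of C. goeringii is predicted to decrease somewhat by 2070, whereas that of C. faberi is predicted to increase, with an overall northward shift. The results provide an important basis for ecological risk assessment and introduction of wild C. goeringii and C. faberi.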

10.
Filtering and feeding rates of cyclopoid copepods feeding on phytoplankton
Rita Adrian 《Hydrobiologia》1991,210(3):217-223
The algal biomass ingested by omnivorous cyclopoid copepods (Cyclops kolensis and C. vicinus) was measured by two methods in the hypertrophic Heiligensee in Berlin (West Germany). The clearance and ingestion rates inferred from measurements of natural populations of 14C labelled phytoplankton were compared with those obtained from chlorophyll a determinations using the presence/absence method (observed chlorophyll a content of natural lake phytoplankton with and without addition of cyclopoids). Both methods gave similar results. Nevertheless, the radio tracer method is preferred, mainly because the short feeding duration excludes high variations in both the food composition and food concentration that limit the presence/absence method.

11.
Can we model the probability of presence of species without absence data?
In ecological studies, it is useful to estimate the probability that a species occurs at given locations. The probability of presence can be modeled by traditional statistical methods, if both presence and absence data are available. However, the challenge is that most species records contain only presence data, without reliable absence data. Previous presence-only methods can estimate a relative index of habitat suitability, but cannot estimate the actual probability of presence. In this study, we develop a presence and background learning algorithm (PBL) that is successful in modeling the conditional probability of presence of a simulated species. The model is trained by two completely separate sets: observed presence and background data. Assuming that the probability of presence is one for ‘prototypical presence’ locations where the habitats are maximally suitable for a species, we can estimate a constant that can calibrate the trained model into the actual probability of presence. Experimental results show that the PBL method performs similarly to a presence-absence method, and significantly better than the widely used maximum entropy method. The new algorithm enables us to model the probability that a species occurs conditional on environmental covariates without absence data. Hence, it has potential to improve modeling of the geographical distributions of species.

12.
The expected presence/absence of a target species outside the area of actual observations is commonly estimated using statistical models or decision criteria. This investigation demonstrates a similarity-based solution for predictive mapping as an alternative to generalizing models. The maps of the expected distribution of 12 orchids were created using find sites from field observations and absence sites generated onto the observation track. The expected presence/absence of a species in a location was calculated according to the similarity between the location and selected examples of presence and absence sites. A machine learning system selected the best predictive sets for each species out of 161 cartographic and remote sensing features. The usefulness of the predictive distribution maps was expressed as the ratio of the density of find sites per track in the predicted presence area relative to the density per track in the predicted absence area. The predictive mapping was more efficient for Dactylorhiza incarnata, D. russowii, Gymnadenia conopsea, and Goodyera repens. Soil properties and the proportion of find sites for the other species in the vicinity were the most indicative site characteristics. The rarer species were found to be better indicators of the occurrence of the other species than were the more common orchids. The proposed approach, which directs subsequent field observations to sites where the occurrence of the target species was predicted but has not yet been recorded, helped discover new populations of orchids and enhance the representativeness of absence sites.

13.
Since its introduction into North America in the late 19th century, Celastrus orbiculatus (Thunb.) has become a serious ecological threat to native ecosystems. Development of a method to accurately map the occurrence of invasive plants, including C. orbiculatus, would greatly assist in their assessment and control. Using an innovative map regression model, we predicted 85% of presence and absence of C. orbiculatus within our study area. We identify environmental characteristics associated with C. orbiculatus and demonstrate the use of this information to predict occurrence of C. orbiculatus across a broad area in Southern Illinois, USA. Presence and absence information were obtained at sample points within discrete areas of C. orbiculatus occurrence. Forest cover, elevation, slope gradient and aspect, soil pH and texture, distance to nearest road, and potential annual direct incident radiation were recorded for invaded and adjacent non-invaded areas. Presence of oak, elevation, slope gradient, soil pH, soil texture, and distance to road were significant factors associated with the presence or absence of C. orbiculatus. Probability of occurrence of C. orbiculatus was highest on gently sloping interfluves with successional forest canopy not dominated by oak, and less acidic, mesic soil. A logistic regression model was developed and extrapolated over a raster GIS data layer using map algebra to predict current invasion throughout the study area. The model correctly predicted at least 85% occurrence of C. orbiculatus. When combined with logistic regression, map algebra is a potentially powerful tool for evaluating the spatial distribution of invasive plants provided sound statistical principles are applied in extrapolating validated regression models.
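The combination of logistic regression and map algebra described above amounts to evaluating the fitted equation cell by cell over aligned raster layers. The sketch below shows that idea on tiny nested-list rasters. The layer names, coefficients, and intercept are hypothetical placeholders, not values from the study, and a real workflow would use a GIS or raster library rather than plain lists.

```python
import math


def occurrence_probability(layers, coefficients, intercept):
    """Apply a fitted logistic model to every cell of aligned raster layers.

    layers:       dict of layer name -> 2D list of cell values.
    coefficients: dict of layer name -> regression coefficient.
    Returns a 2D list of predicted probabilities of occurrence.
    """
    names = list(coefficients)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Linear predictor, then the logistic (inverse logit) transform.
            z = intercept + sum(coefficients[n] * layers[n][r][c] for n in names)
            out_row.append(1.0 / (1.0 + math.exp(-z)))
        result.append(out_row)
    return result


# Hypothetical 1 x 2 rasters and coefficients, for illustration only.
layers = {"slope": [[0.0, 0.0]], "soil_ph": [[5.0, 7.0]]}
coefficients = {"slope": -0.1, "soil_ph": 0.5}
probability_map = occurrence_probability(layers, coefficients, intercept=-2.5)
```

Thresholding the resulting probability map would give the kind of presence/absence prediction whose accuracy the abstract reports.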

14.
The interaction between three independent data sets (anatomy/morphology, cytology, molecules) has been evaluated within the controversial genus Trichomanes (Hymenophyllaceae). Anatomy/morphology, cytology, and rbcL sequences, despite their high and significant level of incongruence, were thus empirically combined with differential weighting in a cladistic analysis within Trichomanes in order to give an appreciation of the contribution of each data set in the resulting topologies and to study more precisely the nature of potential conflicts. Results show that any standard statistics values (such as bootstrap) do not appear to be objectively useful for the choice of the “best” topology or the “good” clades provided by the combination. This weighting approach reveals three cases: (i) some clades (such as subgenus Didymoglossum) are always retrieved and correspond to the absence of conflicts between the different data, (ii) some new clades (such as subgenus Achomanes) are either provided or reinforced as a “synergetic” result of the combination of the data and (iii) the remaining conflicting clades reflect the persistence of incongruence between data whatever the weighting.

15.
The dynamics of allele frequencies changing under migration and heterogeneous selection in a subdivided population are investigated. Using perturbation techniques, a stationary state is obtained when migration and selection are both small. Heterogeneous selection leads to a positive correlation between values of F-statistics and heterozygosities when these are compared among sets of subdivided populations. This contrasts with a negative value of the correlation obtained under Wright's classical model of homogeneous selection, and with the absence of correlation in the completely neutral situation. Research supported in part by NIH grants GM 28016 and GM 10452 and a grant from the John D. and Catherine T. MacArthur Foundation.
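The two quantities being correlated above can be made concrete with a small sketch. For one biallelic locus, the fixation index F_ST compares the expected heterozygosity of the pooled population, H_T = 2p(1 - p) at the mean allele frequency, with the mean within-subpopulation heterozygosity H_S. The function name, the equal-subpopulation-size assumption, and the example frequencies are all illustrative; the paper's own model is analytical and is not reproduced here.

```python
def f_st_and_heterozygosity(freqs):
    """Return (F_ST, H_S) for one biallelic locus.

    freqs: allele frequency in each subpopulation (equal sizes assumed).
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)                          # H_T
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # H_S
    return (h_total - h_within) / h_total, h_within


# Hypothetical allele frequencies in two subpopulations.
fst, h_s = f_st_and_heterozygosity([0.2, 0.8])
```

Computing such pairs for many sets of subdivided populations and correlating them is the kind of comparison the abstract refers to.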

16.
Ecological systems are governed by complex interactions which are mainly nonlinear. In order to capture the inherent complexity and nonlinearity of ecological, and in general biological, systems, empirical models have recently gained popularity. However, although these models, particularly connectionist approaches such as multilayered backpropagation networks, are commonly applied as predictive models in ecology to a wide variety of ecosystems and questions, there are no studies to date aiming to assess the performance, both in terms of data fitting and generalizability, and applicability of empirical models in ecology. Our aim is hence to provide an overview of the nature of the wide range of data sets and predictive variables, from both aquatic and terrestrial ecosystems with different scales of time-dependent dynamics, and of the applicability and robustness of predictive modeling methods on such data sets, by comparing different empirical modeling approaches. The models used in this study range from predicting the occurrence of submerged plants in shallow lakes to predicting nest occurrence of bird species from environmental variables and satellite images. The methods considered include k-nearest neighbor (k-NN), linear and quadratic discriminant analysis (LDA and QDA), generalized linear models (GLM), feedforward multilayer backpropagation networks and the pseudo-supervised network ARTMAP. Our results show that the predictive performances of the models on training data could be misleading, and one should consider the predictive performance of a given model on an independent test set for assessing its predictive power. Moreover, our results suggest that for ecosystems involving time-dependent dynamics and periodicities whose frequencies are possibly less than the time scale of the data considered, GLM and connectionist neural network models appear to be most suitable and robust, provided that a predictive variable reflecting these time-dependent dynamics is included in the model either implicitly or explicitly. For spatial data, which do not include any time-dependence comparable to the time scale covered by the data, on the other hand, neighborhood-based methods such as k-NN and ARTMAP proved to be more robust than the other methods considered in this study. In addition, for predictive modeling purposes, a suitable, computationally inexpensive method should be applied to the problem at hand first; if it performs well, the computational cost and effort associated with complex variants become unnecessary.

17.
Qihuang Zhang  Grace Y. Yi 《Biometrics》2023,79(2):1089-1102
Zero-inflated count data arise frequently from genomics studies. Analysis of such data is often based on a mixture model which facilitates excess zeros in combination with a Poisson distribution, and various inference methods have been proposed under such a model. Those analysis procedures, however, are challenged by the presence of measurement error in count responses. In this article, we propose a new measurement error model to describe error-contaminated count data. We show that ignoring the measurement error effects in the analysis may generally lead to invalid inference results, and meanwhile, we identify situations where ignoring measurement error can still yield consistent estimators. Furthermore, we propose a Bayesian method to address the effects of measurement error under the zero-inflated Poisson model and discuss the identifiability issues. We develop a data-augmentation algorithm that is easy to implement. Simulation studies are conducted to evaluate the performance of the proposed method. We apply our method to analyze the data arising from a prostate adenocarcinoma genomic study.
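The mixture model mentioned above, excess zeros combined with a Poisson distribution, has a simple probability mass function: with probability pi the count is a structural zero, and otherwise it is drawn from a Poisson distribution with mean lam. The sketch below implements only this standard zero-inflated Poisson form; the parameter names are assumptions, and the measurement error model and Bayesian method proposed in the article are not represented.

```python
import math


def zip_pmf(k, lam, pi):
    """Probability of observing count k under a zero-inflated Poisson model.

    lam: mean of the Poisson component.
    pi:  probability of a structural (excess) zero.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from both the structural part and the Poisson part.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson


# Hypothetical parameters: 30% structural zeros, Poisson mean of 2.
p_zero = zip_pmf(0, lam=2.0, pi=0.3)
```

Comparing p_zero with the plain Poisson value exp(-lam) shows how much of the zero mass is attributed to the inflation component.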

18.
Allozyme variation was examined in Carex sect. Phyllostachys (Cyperaceae) to provide insight into phylogenetic relationships hypothesized in an earlier study and to determine the degree of genetic differentiation within and between taxa. Genetic identity values are concordant with the morphological differences found between species. The lowest values are found between species with the greatest morphological dissimilarity. Conversely, the highest values are associated with species pairs distinguished by relatively few morphological differences. Conspecific populations possess high genetic identities, although interpopulation differentiation has characterized the evolutionary history of some species. Geographic patterning is also evident within species, with geographically proximate populations often having the highest identity values. Phylogenetic trees produced using different cladistic methods were poorly supported and varied in their depiction of relationships among species. One cladogram produced using presence/absence allelic data is more or less congruent with a topology recovered from an earlier analysis utilizing molecular and morphological data. The wide- and narrow-scaled clades are maintained, as are the sister species pairs C. backii/C. saximontana, C. basiantha/C. superata, and C. jamesii/C. juniperorum. Contrary to the finding of our previous study, however, C. willdenowii is aligned with C. jamesii/C. juniperorum.
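The abstract does not say which genetic identity measure was used. Assuming it is Nei's standard genetic identity, which is common in allozyme work, the value for two populations is I = J_xy / sqrt(J_x * J_y), where each J is an average over loci of sums of allele-frequency products. The sketch below follows that assumed definition; the function name and example frequencies are hypothetical.

```python
import math


def nei_identity(loci):
    """Nei's genetic identity between two populations (assumed measure).

    loci: list of (freqs_x, freqs_y) pairs, one per locus, where each
          element lists allele frequencies in the same allele order.
    """
    n = len(loci)
    j_x = sum(sum(x * x for x in fx) for fx, _ in loci) / n
    j_y = sum(sum(y * y for y in fy) for _, fy in loci) / n
    j_xy = sum(sum(x * y for x, y in zip(fx, fy)) for fx, fy in loci) / n
    return j_xy / math.sqrt(j_x * j_y)


# Hypothetical single-locus example: one population fixed for an allele,
# the other with two alleles at equal frequency.
identity = nei_identity([([1.0, 0.0], [0.5, 0.5])])
```

Values near 1 indicate nearly identical allele frequencies, and values near 0 indicate few shared alleles, which matches how the abstract relates identity to morphological similarity.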

19.
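Objective: The traditional Chinese medicine Strychnos nux-vomica L. (SN) is used clinically to reduce swelling and relieve pain. However, because it contains alkaloids, SN has a certain toxicity. Little is known about the changes in endogenous metabolism caused by SN toxicity in rats or about its potential effect on metabolic dysregulation of the gut microbiota, so toxicological study of SN is important for evaluating its safety. This study combined metabolomics with 16S rRNA gene sequencing to explore the mechanism of SN toxicity. Methods: Acute, cumulative, and subacute toxicity tests were used to determine the toxic dose, toxic intensity, and toxic target organs of SN, respectively. Ultra-high-performance liquid chromatography coupled with mass spectrometry was used to analyze serum, liver, and kidney samples from rats after intragastric administration of SN. Decision tree and K nearest neighbor (KNN) models based on the bagging algorithm were used to classify the omics data. After samples were extracted from rat feces, the V3-V4 region of bacterial 16S rRNA was analyzed on a high-throughput sequencing platform. Results: The bagging algorithm improved the accuracy of sample classification. A total of 12 biomarkers were identified, and their metabolic dysregulation may be the cause of the in vivo toxicity of SN. The gut bacteria 拟杆菌, 粪厌氧棒菌, 颤螺菌, and 双茎体菌 were closely related to physiological indicators of kidney and liver function, which suggests that the liver and kidney damage caused by SN may be related to metabolic disorder of these gut bacteria. Conclusion: This paper reveals the in vivo mechanism of SN toxicity and provides a scientific basis for the safe and rational clinical use of SN.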
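The classification step described above, KNN models combined through bagging, can be sketched in a few lines. Each model is a k-nearest-neighbor classifier fitted to a bootstrap resample of the training samples, and the final label is the majority vote across models. Everything below (function names, k, the number of models, and the two-feature toy data) is a hypothetical illustration rather than the study's pipeline, and the decision tree variant is not shown.

```python
import random
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training samples nearest to query."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def bagged_knn_predict(train, query, n_models=25, k=3, seed=0):
    """Majority vote of KNN classifiers fitted to bootstrap resamples."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        # Bootstrap: sample with replacement, same size as the training set.
        sample = [rng.choice(train) for _ in train]
        votes[knn_predict(sample, query, k)] += 1
    return votes.most_common(1)[0][0]


# Toy data: two well-separated groups of (feature vector, label) pairs.
train = [
    ((0.0, 0.0), "control"), ((0.0, 1.0), "control"),
    ((1.0, 0.0), "control"), ((1.0, 1.0), "control"),
    ((10.0, 10.0), "dosed"), ((10.0, 11.0), "dosed"),
    ((11.0, 10.0), "dosed"), ((11.0, 11.0), "dosed"),
]
label = bagged_knn_predict(train, (0.5, 0.5))
```

Averaging over resamples reduces the variance of any single classifier, which is one plausible reason bagging improved classification accuracy in the study.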

20.
An understanding of the influence of climate change on Ixodes scapularis, the main vector of Lyme disease in North America, is a fundamental component in assessing changes in the spatial distribution of human risk for the disease. We used a climate suitability model of I. scapularis to examine the potential effects of global climate change on future Lyme disease risk in North America. A climate-based logistic model was first used to explain the current distribution of I. scapularis in North America. Climate-change scenarios were then applied to extrapolate the model in time and to forecast vector establishment. The spatially modeled relationship between I. scapularis presence and large-scale environmental data generated the current pattern of I. scapularis across North America with an accuracy of 89% (P < 0.0001). Extrapolation of the model revealed a significant expansion of I. scapularis north into Canada with an increase in suitable habitat of 213% by the 2080s. Climate change will also result in a retraction of the vector from the southern U.S. and movement into the central U.S. This report predicts the effect of climate change on Lyme disease risk and specifically forecasts the emergence of a tickborne infectious disease in Canada. Our modeling approach could thus be used to outline where future control strategies and prevention efforts need to be applied.

