Similar literature
 20 similar articles found; search took 15 ms
1.
Species distribution models (SDMs) are now being widely used in ecology for management and conservation purposes across terrestrial, freshwater, and marine realms. The increasing interest in SDMs has drawn the attention of ecologists to spatial models and, in particular, to geostatistical models, which are used to associate observations of species occurrence or abundance with environmental covariates in a finite number of locations in order to predict where (and how much of) a species is likely to be present in unsampled locations. Standard geostatistical methodology assumes that the choice of sampling locations is independent of the values of the variable of interest. However, in natural environments, due to practical limitations related to time and financial constraints, this theoretical assumption is often violated. In fact, data commonly derive from opportunistic sampling (e.g., whale or bird watching), in which observers tend to look for a specific species in areas where they expect to find it. These are examples of what is referred to as preferential sampling, which can lead to biased predictions of the distribution of the species. The aim of this study is to discuss an SDM that addresses this problem and that is more computationally efficient than existing MCMC methods. From a statistical point of view, we interpret the data as a marked point pattern, where the sampling locations form a point pattern and the measurements taken in those locations (i.e., species abundance or occurrence) are the associated marks. Inference and prediction of species distribution is performed using a Bayesian approach, and integrated nested Laplace approximation (INLA) methodology and software are used for model fitting to minimize the computational burden. We show that abundance is highly overestimated at low abundance locations when preferential sampling effects are not accounted for, in both a simulated example and a practical application using fishery data. This highlights that ecologists should be aware of the potential bias resulting from preferential sampling and account for it in a model when a survey is based on non‐randomized and/or non‐systematic sampling.
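
The bias described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' marked point pattern model and does not use INLA; it only compares the naive sample mean under uniform and preferential site selection. The abundance surface, the weighting exponent `beta`, and the function name are assumptions made for illustration.

```python
import math
import random


def simulate_preferential_bias(n_sites=2000, n_samples=200, beta=2.0, seed=0):
    """Compare the naive sample mean under uniform and preferential sampling."""
    rng = random.Random(seed)
    # Hypothetical smooth "abundance" surface along a 1-D transect in [0, 1].
    xs = [i / (n_sites - 1) for i in range(n_sites)]
    field = [math.exp(math.sin(2 * math.pi * x)) for x in xs]
    true_mean = sum(field) / n_sites

    # Uniform design: every site is equally likely to be visited.
    uniform = rng.choices(field, k=n_samples)

    # Preferential design (assumed form): the chance of visiting a site
    # grows with its abundance, with weight = abundance ** beta.
    weights = [value ** beta for value in field]
    preferential = rng.choices(field, weights=weights, k=n_samples)

    uniform_mean = sum(uniform) / n_samples
    preferential_mean = sum(preferential) / n_samples
    return true_mean, uniform_mean, preferential_mean


if __name__ == "__main__":
    true_mean, uniform_mean, preferential_mean = simulate_preferential_bias()
    print(f"true mean:         {true_mean:.3f}")
    print(f"uniform sample:    {uniform_mean:.3f}")
    print(f"preferential mean: {preferential_mean:.3f}")
```

Under these assumptions the preferential sample mean sits well above the true mean, while the uniform sample mean stays close to it. A model-based correction, as in the abstract, would additionally model the sampling locations themselves.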

2.
Many publications make use of opportunistic data, such as citizen science observation data, to infer large‐scale properties of species’ distributions. However, the few publications that use opportunistic citizen science data to study animal ecology at a habitat level do so without accounting for spatial biases in opportunistic records or using methods that are difficult to generalize. In this study, we explore the biases that exist in opportunistic observations and suggest an approach to correct for them. We first examined the extent of the biases in opportunistic citizen science observations of three wild ungulate species in Norway by comparing them to data from GPS telemetry. We then quantified the extent of the biases by specifying a model of the biases. From the bias model, we sampled available locations within the species’ home range. Along with opportunistic observations, we used the corrected availability locations to estimate a resource selection function (RSF). We tested this method with simulations and empirical datasets for the three species. We compared the results of our correction method to RSFs obtained using opportunistic observations without correction and to RSFs using GPS‐telemetry data. Finally, we compared habitat suitability maps obtained using each of these models. Opportunistic observations are more affected by human access and visibility than locations derived from GPS telemetry. This has consequences for drawing inferences about species’ ecology. Models naïvely using opportunistic observations in habitat‐use studies can result in spurious inferences. However, sampling availability locations based on the spatial biases in opportunistic data improves the estimation of the species’ RSFs and predicted habitat suitability maps in some cases. This study highlights the challenges and opportunities of using opportunistic observations in habitat‐use studies. 
While our method is not foolproof, it is a first step toward unlocking the potential of opportunistic citizen science data for habitat‐use studies.

3.
Contemporary small-molecule drug discovery frequently involves the screening of large compound files as a core activity. Consequently, cost, speed, and safety become critical issues. In order to meet this need, numerous technologies have been developed to allow mix-and-measure approaches, to facilitate miniaturization, to increase speed, and to minimize the use of potentially hazardous reagents such as radioactive materials. However, despite the on-paper advantages of these new technologies, risks can remain undefined. For example, the question arises of whether the novel method will facilitate identification of active chemical series in a way that is comparable with conventional methods. In order to address this question, we have taken the approach of carrying out experiments to directly compare the output of high-throughput screens using a given novel approach and a traditional method. The concordance between the screening methods can then be determined via comparison of the numbers and structures of the active molecules identified. This article describes the approach taken in our laboratory to minimize variability in such experiments and shows data that exemplify the general result of lower than expected concordance. Statistical modeling was subsequently used to facilitate this interpretation. The model used a beta distribution function to generate a real-activity frequency relationship with added normal random error and occasional outliers to represent assay variability. Hence, the effect of assay parameters such as the threshold, the number of real actives, the number of outliers, and the standard deviation could readily be explored. The model was found to describe the data reasonably and moreover was found to be of great utility when it came to planning further optimal experiments. A key conclusion from the model was that concordance between screening methods could appear poor even when one approach is compared with itself. This occurs simply because the result is a function of assay threshold, standard deviation and the true compound % activity. In response to this finding we have adopted alternative experimental designs that more reliably measure the concordance between screening methods.
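
The modeling idea in this abstract (beta-distributed true activity, normal assay error, occasional outliers, and a hit threshold) can be sketched as a simulation of one screening method compared with itself. The beta shape parameters, the outlier mechanism, and the use of a Jaccard overlap as the concordance measure are assumptions for illustration; the abstract does not specify them.

```python
import random


def screen(true_activity, sd, outlier_rate, rng):
    """One simulated screen: true % activity plus normal noise and rare outliers."""
    measured = []
    for activity in true_activity:
        value = activity + rng.gauss(0.0, sd)
        if rng.random() < outlier_rate:
            # Hypothetical outlier mechanism: a reading unrelated to the truth.
            value = rng.uniform(0.0, 100.0)
        measured.append(value)
    return measured


def self_concordance(n=20000, threshold=50.0, sd=10.0, outlier_rate=0.001, seed=1):
    """Overlap of hit lists when the same assay is run twice on the same compounds."""
    rng = random.Random(seed)
    # Assumed shape: Beta(0.5, 8) scaled to 0-100 %, so most compounds are inactive.
    true_activity = [100.0 * rng.betavariate(0.5, 8.0) for _ in range(n)]
    run_a = screen(true_activity, sd, outlier_rate, rng)
    run_b = screen(true_activity, sd, outlier_rate, rng)
    hits_a = {i for i, value in enumerate(run_a) if value >= threshold}
    hits_b = {i for i, value in enumerate(run_b) if value >= threshold}
    union = hits_a | hits_b
    # Jaccard overlap: shared hits divided by all hits found in either run.
    return len(hits_a & hits_b) / len(union) if union else 1.0


if __name__ == "__main__":
    print("noisy assay vs itself:     ", round(self_concordance(), 3))
    print("noise-free assay vs itself:", self_concordance(sd=0.0, outlier_rate=0.0))
```

With noise switched off the two runs agree exactly; with noise, compounds whose true activity lies near the threshold move in and out of the hit list, which lowers the overlap even though the method is unchanged.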

4.
How many membrane proteins are there?   (Total citations: 9; self-citations: 1; citations by others: 8)
One of the basic issues that arises in functional genomics is the ability to predict the subcellular location of proteins that are deduced from gene and genome sequencing. In particular, one would like to be able to readily specify those proteins that are soluble and those that are inserted in a membrane. Traditional methods of distinguishing between these two locations have relied on extensive, time-consuming biochemical studies. The alternative approach has been to make inferences based on a visual search of the amino acid sequences of presumed gene products for stretches of hydrophobic amino acids. This numerical, sequence-based approach is usually seen as a first approximation pending more reliable biochemical data. The recent availability of large and complete sequence data sets for several organisms allows us to determine just how accurate such a numerical approach could be, and to attempt to minimize and quantify the error involved. We have optimized a statistical approach to protein location determination. Using our approach, we have determined that surprisingly few proteins are misallocated using the numerical method. We also examine the biological implications of the success of this technique.
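
A search for stretches of hydrophobic amino acids, as mentioned in this abstract, is often implemented as a sliding-window hydropathy scan. The sketch below uses the Kyte-Doolittle hydropathy scale with a 19-residue window and a cutoff of 1.6; the scale and both parameters are commonly quoted conventions chosen here as assumptions, not the optimized statistical procedure the abstract refers to.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_segments(sequence, window=19, cutoff=1.6):
    """Return start indices of windows whose mean hydropathy reaches the cutoff.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    starts = []
    for i in range(len(scores) - window + 1):
        if sum(scores[i:i + window]) / window >= cutoff:
            starts.append(i)
    return starts


def looks_like_membrane_protein(sequence, window=19, cutoff=1.6):
    """Crude first-pass call: at least one sufficiently hydrophobic window."""
    return bool(hydrophobic_segments(sequence, window, cutoff))


if __name__ == "__main__":
    candidate = "MKT" + "L" * 20 + "KRE"       # contains a long leucine stretch
    soluble = "DEKRDEKRDEKRDEKRDEKRDEKR"        # charged residues only
    print(looks_like_membrane_protein(candidate))
    print(looks_like_membrane_protein(soluble))
```

Sequences shorter than the window produce no windows and are therefore never flagged, which is one of the simplifications of this first-pass approach.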

5.
Abstract: I provide a brief introduction to the concept of spatial autocorrelation and its incorporation into regression-type models. Spatial autocorrelation occurs when the response variable is correlated with itself at other locations in the region of interest. The autocorrelation usually takes a specific form where observations close in space are more correlated than those farther apart, and the rate of decay of the correlation is a function of the distance separating 2 locations. I present 2 commonly used models: 1) geostatistical modeling in which data are collected at points in the study region and 2) conditional autoregression (lattice) models in which data are aggregated over small nonoverlapping sub-areas of the study region. I also describe incorporation of explanatory covariates, such as habitat or physico-chemical attributes. I emphasize frequentist methods, but I briefly describe Bayesian approaches. I also provide some advantages, such as obtaining correct standard errors for estimators, and disadvantages, such as requirements for larger sample sizes, of incorporating spatial autocorrelation into the modeling effort. This information can aid researchers in designing and analyzing models of the relationships between species distributions and habitat. As a result, more informative models can be developed which further aid in management of wildlife.
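
One common way to check whether a response is correlated with itself at nearby locations is Moran's I. The abstract does not name this statistic, so the sketch below is offered only as a standard diagnostic under a simplifying assumption: observations lie along a transect and only adjacent observations count as neighbours.

```python
def morans_i_transect(values):
    """Moran's I for values along a 1-D transect with adjacent-neighbour weights.

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where w_ij = 1 if sites i and j are adjacent and 0 otherwise.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]

    # Each adjacent pair (i, i + 1) contributes twice: w_ij and w_ji.
    cross = 2.0 * sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    total_weight = 2.0 * (n - 1)
    denominator = sum(d * d for d in deviations)
    return (n / total_weight) * (cross / denominator)


if __name__ == "__main__":
    gradient = [1, 2, 3, 4, 5, 6]          # neighbours are similar
    alternating = [1, -1, 1, -1, 1, -1]    # neighbours are dissimilar
    print(morans_i_transect(gradient))
    print(morans_i_transect(alternating))
```

Positive values indicate that neighbouring observations resemble each other, which is the situation in which the models described in the abstract become relevant.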

6.
At the end of an enzymic hydrolysis process involving a solid lignocellulosic substrate, enzymes are found both in solution and adsorbed to the substrate residue. Removal of residue from the system will result in loss of some of the enzymes, the extent of which will depend on the design of the process. To minimize enzyme loss, a study has been conducted in which six process models have been formulated and an enzyme loss function derived for each model based on the total amount of enzymes lost through residue removal. Model 1 is a reference model, characterized by an uninterrupted hydrolysis throughout the entire hydrolysis period. The residue is then washed in order to recover both sugar and adsorbed enzymes before the residue is discarded. Models 2-6 are all characterized by the removal of hydrolysate three times during the process, recirculation of dissolved and adsorbed enzymes to various points in the process and selection of a stage at which the residue is removed. The following conclusions could be drawn from the derived enzyme loss functions: Increased enzyme adsorption leads to increased enzyme loss. The enzyme loss decreases if the solid residue is removed late in the process. Both adsorbed and dissolved enzymes should be introduced at the starting point of the process. This is particularly important for dissolved enzymes. Three models were chosen for experimental studies, which are reported in a second, accompanying article. The experimental results obtained are compared with the theoretical study reported here.

7.
A generalized case-control (GCC) study, like the standard case-control study, leverages outcome-dependent sampling (ODS) to extend to nonbinary responses. We develop a novel, unifying approach for analyzing GCC study data using the recently developed semiparametric extension of the generalized linear model (GLM), which is substantially more robust to model misspecification than existing approaches based on parametric GLMs. For valid estimation and inference, we use a conditional likelihood to account for the biased sampling design. We describe analysis procedures for estimation and inference for the semiparametric GLM under a conditional likelihood, and we discuss problems with estimation and inference under a conditional likelihood when the response distribution is misspecified. We demonstrate the flexibility of our approach over existing ones through extensive simulation studies, and we apply the methodology to an analysis of the Asset and Health Dynamics Among the Oldest Old study, which motivates our research. The proposed approach yields a simple yet versatile solution for handling ODS in a wide variety of possible response distributions and sampling schemes encountered in practice.

8.
Abstract

The spatial distribution of vital root tips and ectomycorrhizal (ECM) communities in forest soils is characterized by patchiness at a microscale level, mostly related to the distribution patterns of biotic and abiotic factors. A geostatistical model was applied to verify if spatial analyses could be useful in identifying an appropriate sampling method to study root tip vitality, ectomycorrhization and the ECM community. Root samples were collected from two high mountain Norway spruce forests (Trentino province, Italy) following a geometrical design. Laboratory microscopic and geostatistical ordinary kriging analyses were used to map tip vitality and ectomycorrhization degree, ECM richness and distribution grouped in “exploration types” (amount of emanating hyphae or presence and differentiation of rhizomorphs). Spatial gradients of the examined features existed at plant level, associated with the up-downslope direction (root tip vitality and ectomycorrhization, ECM richness) and distance from the stem base (ECM exploration types). The effectiveness of the geostatistical model used demonstrates that a geometrical sampling design, associated with spatial mapping techniques, can be useful in research where the tree, and not the forest, is the subject (mycological and phytopathological studies).

9.
Many critical ecological issues require the analysis of large spatial point data sets, for example, modelling species distributions, abundance and spread from survey data. But modelling spatial relationships, especially in large point data sets, presents major computational challenges. We use a novel Bayesian hierarchical statistical approach, 'spatial predictive process' modelling, to predict the distribution of a major invasive plant species, Celastrus orbiculatus, in the northeastern USA. The model runs orders of magnitude faster than traditional geostatistical models on a large data set of c. 4000 points, and performs better than generalized linear models, generalized additive models and geographically weighted regression in cross-validation. We also use this approach to model simultaneously the distributions of a set of four major invasive species in a spatially explicit multivariate model. This multispecies analysis demonstrates that some pairs of species exhibit negative residual spatial covariation, suggesting potential competitive interaction or divergent responses to unmeasured factors.

10.
The combination of population pharmacokinetic studies   (Total citations: 4; self-citations: 0; citations by others: 4)
Wakefield J, Rahman N. Biometrics 2000, 56(1): 263-270
Pharmacokinetic data consist of drug concentrations with associated known sampling times and are collected following the administration of known dosage regimens. Population pharmacokinetic data consist of such data on a number of individuals, possibly along with individual-specific characteristics. During drug development, a number of population pharmacokinetic studies are typically carried out and the combination of such studies is of great importance for characterizing the drug and, in particular, for the design of future studies. In this paper, we describe a model that may be used to combine population pharmacokinetic data. The model is illustrated using six phase I studies of the antiasthmatic drug fluticasone propionate. Our approach is Bayesian and computation is carried out using Markov chain Monte Carlo. We provide a number of simplifications to the model that may be made in order to ease simulation from the posterior distribution.

11.
Recent technological advances have made it possible to collect high-dimensional genomic data along with clinical data on a large number of subjects. In the studies of chronic diseases such as cancer, it is of great interest to integrate clinical and genomic data to build a comprehensive understanding of the disease mechanisms. Despite extensive studies on integrative analysis, it remains an ongoing challenge to model the interaction effects between clinical and genomic variables, due to high dimensionality of the data and heterogeneity across data types. In this paper, we propose an integrative approach that models interaction effects using a single-index varying-coefficient model, where the effects of genomic features can be modified by clinical variables. We propose a penalized approach for separate selection of main and interaction effects. Notably, the proposed methods can be applied to right-censored survival outcomes based on a Cox proportional hazards model. We demonstrate the advantages of the proposed methods through extensive simulation studies and provide applications to a motivating cancer genomic study.

12.
Disease incidence or mortality data are typically available as rates or counts for specified regions, collected over time. We propose Bayesian nonparametric spatial modeling approaches to analyze such data. We develop a hierarchical specification using spatial random effects modeled with a Dirichlet process prior. The Dirichlet process is centered around a multivariate normal distribution. This latter distribution arises from a log-Gaussian process model that provides a latent incidence rate surface, followed by block averaging to the areal units determined by the regions in the study. With regard to the resulting posterior predictive inference, the modeling approach is shown to be equivalent to an approach based on block averaging of a spatial Dirichlet process to obtain a prior probability model for the finite dimensional distribution of the spatial random effects. We introduce a dynamic formulation for the spatial random effects to extend the model to spatio-temporal settings. Posterior inference is implemented through Gibbs sampling. We illustrate the methodology with simulated data as well as with a data set on lung cancer incidences for all 88 counties in the state of Ohio over an observation period of 21 years.

13.
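Wildlife crossings are an effective measure for mitigating the habitat isolation that expressways impose on surrounding wildlife, and the location of a crossing is a key factor in how effectively it is used; however, existing studies rarely address crossing site selection. Taking the Wuhan-Shenzhen Expressway (武深高速) as an example, this study recommends a crossing site selection method based on identifying species movement paths. Environmental factors that influence animal habitat selection were chosen to build an evaluation system, and GIS was used to analyze habitat suitability for wildlife around the road. On this basis, principles borrowed from hydrological analysis were used to delineate the potential movement paths of species within the habitat quickly and accurately, which identified five ideal locations for building wildlife crossings on the expressway. The results show that the method quantitatively reflects the influence of the habitat quality pattern on species movement and accurately locates the key areas where species movement is obstructed; at the landscape level, the proposed crossing locations can effectively relieve the ecological pressure caused by habitat fragmentation. The study not only fills a gap in current research but also provides a scientific reference for related fields such as road network design and urban ecological planning.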

14.
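Accurate prediction of the spatial distribution of soil organic carbon is important for the development and protection of soil resources, for responding to climate change, and for ecosystem health. Using a 1300 m × 1700 m plot of saline soil on the northern margin of the Tarim Basin as the study area, 144 soil samples were collected at a depth of 5-10 cm, and a Bayesian geostatistical spatial prediction model of soil organic carbon content was built. Ordinary kriging, sequential Gaussian simulation, and inverse distance weighting served as reference methods for evaluating the predictive performance of Bayesian geostatistics for soil organic carbon content. The results showed that soil organic carbon content in the study area ranged from 1.59 to 9.30 g·kg⁻¹, with a mean of 4.36 g·kg⁻¹ and a standard deviation of 1.62 g·kg⁻¹. The semivariogram fitted an exponential model, and the spatial structure ratio parameter was 0.57. The Bayesian geostatistical method produced a map of the spatial distribution of soil organic carbon content, together with maps of the prediction variance and of the upper and lower 95% quantiles for assessing prediction uncertainty. Compared with ordinary kriging, sequential Gaussian simulation, and inverse distance weighting, the Bayesian geostatistical method gave higher spatial prediction accuracy for soil organic carbon content, indicating its advantage for predicting soil organic carbon content.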
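
Of the reference methods named in this abstract, inverse distance weighting is the simplest to write down. The sketch below is a generic implementation with an assumed power of 2; the coordinates and values in the usage example are invented for illustration and are not data from the study.

```python
import math


def idw_predict(known, target, power=2.0):
    """Inverse distance weighted prediction at `target`.

    known  : list of ((x, y), value) pairs for sampled locations
    target : (x, y) location to predict
    power  : distance exponent (2.0 is a common default, assumed here)
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in known:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # IDW is an exact interpolator: return the observed value.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical soil organic carbon samples (g/kg) at plot coordinates (m).
    samples = [((0.0, 0.0), 2.0), ((2.0, 0.0), 6.0)]
    print(idw_predict(samples, (1.0, 0.0)))  # midway between the two samples
    print(idw_predict(samples, (0.0, 0.0)))  # at a sampled location
```

Unlike kriging or the Bayesian geostatistical model, this method gives no measure of prediction uncertainty, which is part of the contrast the abstract draws.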

15.
In recent years, small-area-based ecological regression analyses have been published that study the association between a health outcome and a covariate in several cities. These analyses have usually been performed independently for each city and have therefore yielded unrelated estimates for the cities considered, even though the same process has been studied in all of them. In this study, we propose a joint ecological regression model for multiple cities that accounts for spatial structure both within and between cities and explore the advantages of this model. The proposed model merges both disease mapping and geostatistical ideas. Our proposal is compared with two alternatives, one that models the association for each city as fixed effects and another that treats them as independent and identically distributed random effects. The proposed model allows us to estimate the association (and assess its significance) at locations with no available data. Our proposal is illustrated by an example of the association between unemployment (as a deprivation surrogate) and lung cancer mortality among men in 31 Spanish cities. In this example, the associations found were far more accurate for the proposed model than those from the fixed effects model. Our main conclusion is that ecological regression analyses can be markedly improved by performing joint analyses at several locations that share information among them. This finding should be taken into consideration in the design of future epidemiological studies.

16.
Recently, there has been an increased interest in modeling the association between aggregate disease counts and environmental exposures measured, for example via air pollution monitors, at point locations. This paper has two aims: first, we develop a model for such data in order to avoid ecological bias; second, we illustrate that modeling the exposure surface and estimating exposures may lead to bias in estimation of health effects. Design issues are also briefly considered, in particular the loss of information in moving from individual to ecological data, and the at-risk populations to consider in relation to the pollution monitor locations. The approach is investigated initially through simulations, and is then applied to a study of the association between mortality in those over 65 in the year 2000 and the previous year's SO2, in London. We conclude that the use of the proposed model can provide valid inference, but the use of estimated exposures should be carried out with great caution.

17.
Observations can be used to reduce model uncertainties through data assimilation. When observations cannot cover the whole model area because of limited spatial availability or instrument capability, how should data assimilation be performed at locations not covered by observations? Two commonly used strategies are first described: covariance localization (CL) and observation localization (OL). Compared with CL, OL is easy to parallelize and more efficient for large-scale analysis. This paper evaluated OL in soil moisture profile characterizations, in which the geostatistical semivariogram was used to fit the spatially correlated characteristics of synthetic L-band microwave brightness temperature measurements. The fitted semivariogram model and the local ensemble transform Kalman filter algorithm are combined to weight and assimilate the observations within a local region surrounding the grid cell of the land surface model to be analyzed. Six scenarios were compared: 1_Obs, with the one nearest observation assimilated; 5_Obs, with no more than five nearest local observations assimilated; 9_Obs, with no more than nine nearest local observations assimilated; and scenarios with no more than 16, 25, and 36 local observations. The results indicate that assimilating more local observations improves the estimates, up to an upper bound of nine observations in this case. This study demonstrates the potential of geostatistical correlation representation in OL to improve data assimilation of catchment-scale soil moisture using synthetic L-band microwave brightness temperature, which cannot fully cover the study area in space because of vegetation effects.

18.
One barrier to interpreting the observational evidence concerning the adverse health effects of air pollution for public policy purposes is the measurement error inherent in estimates of exposure based on ambient pollutant monitors. Exposure assessment studies have shown that data from monitors at central sites may not adequately represent personal exposure. Thus, the exposure error resulting from using centrally measured data as a surrogate for personal exposure can potentially lead to a bias in estimates of the health effects of air pollution. This paper develops a multi-stage Poisson regression model for evaluating the effects of exposure measurement error on estimates of effects of particulate air pollution on mortality in time-series studies. To implement the model, we have used five validation data sets on personal exposure to PM10. Our goal is to combine data on the associations between ambient concentrations of particulate matter and mortality for a specific location, with the validation data on the association between ambient and personal concentrations of particulate matter at the locations where data have been collected. We use these data in a model to estimate the relative risk of mortality associated with estimated personal-exposure concentrations and make a comparison with the risk of mortality estimated with measurements of ambient concentration alone. We apply this method to data comprising daily mortality counts, ambient concentrations of PM10 measured at a central site, and temperature for Baltimore, Maryland from 1987 to 1994. We have selected our home city of Baltimore to illustrate the method; the measurement error correction model is general and can be applied to other appropriate locations. Our approach uses a combination of: (1) a generalized additive model with log link and Poisson error for the mortality-personal-exposure association; (2) a multi-stage linear model to estimate the variability across the five validation data sets in the personal-ambient-exposure association; (3) data augmentation methods to address the uncertainty resulting from the missing personal exposure time series in Baltimore. In the Poisson regression model, we account for smooth seasonal and annual trends in mortality using smoothing splines. Taking into account the heterogeneity across locations in the personal-ambient-exposure relationship, we quantify the degree to which the exposure measurement error biases the results toward the null hypothesis of no effect, and estimate the loss of precision in the estimated health effects due to indirectly estimating personal exposures from ambient measurements.

19.
The sterile insect technique (SIT) is an environmentally friendly method used against Anastrepha ludens Loew (Diptera: Tephritidae) populations. This study aimed to perform an analysis of the spatial variability of the field distribution of sterile A. ludens using a geostatistical approach along with Geographic Information Systems (GIS). Field data on captures of sterile A. ludens during a Valencia orange season over a release area were analysed using spherical, exponential and Gaussian variograms. Such variograms were evaluated by criteria such as the mean absolute error, average standard error, root mean square error and the coefficient of determination. Results revealed a spatially structured distribution of sterile A. ludens across the release area. Models interpolated by the Ordinary Kriging technique exhibited continuous surfaces evidencing spatial heterogeneity of the distribution of flies. Such a result evidenced that the spatial dynamics of flies significantly varied despite the planned uniform coverage of the release. The GIS made it possible to integrate spatial information on the spatial dynamics into one single model. The release activity should be improved westward of the studied area, as the final model suggested that the sterile:wild ratio is lower there than in the east. This study provides insights into the spatial analysis of the distribution of sterile flies beyond one single geographical point. Moreover, it highlights geostatistical techniques and GIS as useful tools for the assessment of the impact and quality of the release activity over fruit-growing areas subjected to an area-wide integrated pest management approach.
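
The three variogram families named in this abstract, and two of its evaluation criteria, can be written compactly. The sketch below uses one common parameterization in which `sill` is the total sill (nugget included) and `practical_range` is the distance at which the exponential and Gaussian models reach about 95% of the sill; other conventions exist, and the abstract does not say which one the authors used.

```python
import math


def spherical(h, nugget, sill, practical_range):
    """Spherical variogram: reaches the sill exactly at the range."""
    if h == 0.0:
        return 0.0
    if h >= practical_range:
        return sill
    r = h / practical_range
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, sill, practical_range):
    """Exponential variogram: approaches the sill asymptotically."""
    if h == 0.0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / practical_range))


def gaussian(h, nugget, sill, practical_range):
    """Gaussian variogram: parabolic near the origin, very smooth surfaces."""
    if h == 0.0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * (h / practical_range) ** 2))


def mean_absolute_error(observed, predicted):
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


if __name__ == "__main__":
    for model in (spherical, exponential, gaussian):
        print(model.__name__, round(model(5.0, 0.0, 1.0, 10.0), 4))
```

In a cross-validation setting, each fitted model would be used for kriging at held-out trap locations, and the error functions above would then compare the predictions with the observed captures.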

20.
Abstract: The Mahalanobis distance statistic (D²) has emerged as an effective tool to identify suitable habitat from presence data alone, but there has been no mechanism to select among potential habitat covariates. We propose that the best combination of explanatory variables for a D² model can be identified by ranking potential models based on the proportion of the entire study area that is classified as potentially suitable habitat given that a predetermined proportion of occupied locations are correctly classified. In effect, our approach seeks to minimize errors of commission, or maximize specificity, while holding the omission error rate constant. We used this approach to identify potentially suitable habitat for the Olympic marmot (Marmota olympus), a declining species endemic to Olympic National Park, Washington, USA. We compared models built with all combinations of 11 habitat variables. A 7-variable model identified 21,143 ha within the park as potentially suitable for marmots, correctly classifying 80% of occupied locations. Additional refinements to the 7-variable model (e.g., eliminating small patches) further reduced the predicted area to 18,579 ha with little reduction in predictive power. Although we sought a model that would allow field workers to find 80% of Olympic marmot locations, in fact, <3% of 376 occupied locations and <9% of abandoned locations were >100 m from habitat predicted by the final model, suggesting that >90% of occupied marmot habitat could be found by observant workers surveying predicted habitat. The model comparison procedure allowed us to identify the suite of covariates that maximized specificity of our model and, thus, limited the amount of less favorable habitat included in the final prediction area. We expect that by maximizing specificity of models built from presence-only data, our model comparison procedure will be useful to conservation practitioners planning reintroductions, searching for rare species, or identifying habitat for protection.
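
The ranking criterion in this abstract can be sketched in two steps: compute D² for each map cell relative to the occupied locations, then measure how much of the study area falls under the D² threshold that retains a fixed share of occupied locations. The code below is a minimal two-variable illustration with invented numbers; the function names and the quantile rule used for the threshold are assumptions, not the authors' implementation.

```python
import math


def mahalanobis_d2(point, samples):
    """Squared Mahalanobis distance of a 2-variable point from the sample mean."""
    n = len(samples)
    mean_x = sum(s[0] for s in samples) / n
    mean_y = sum(s[1] for s in samples) / n
    # Sample covariance matrix entries (n - 1 denominator).
    sxx = sum((s[0] - mean_x) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - mean_y) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mean_x) * (s[1] - mean_y) for s in samples) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean_x, point[1] - mean_y
    # d' S^-1 d, with the 2x2 inverse written out explicitly.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def suitable_area_fraction(occupied_d2, area_d2, keep=0.8):
    """Fraction of study-area cells classified suitable when `keep` of the
    occupied locations must be classified correctly (assumed quantile rule)."""
    ranked = sorted(occupied_d2)
    threshold = ranked[math.ceil(keep * len(ranked)) - 1]
    return sum(1 for d2 in area_d2 if d2 <= threshold) / len(area_d2)


if __name__ == "__main__":
    # Hypothetical habitat values (two covariates) at occupied locations.
    occupied = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    print(mahalanobis_d2((3.0, 1.0), occupied))
    print(suitable_area_fraction([1, 2, 3, 4, 5], [0.5, 3, 4, 6, 10, 12, 1, 20]))
```

Among candidate covariate sets, the one with the smallest suitable-area fraction at the same `keep` value would rank best under the criterion described in the abstract.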
