Similar literature
 Found 20 similar documents; search took 31 ms
1.
Diversity indices might be used to assess the impact of treatments on the relative abundance patterns in species communities. When several treatments are to be compared, simultaneous confidence intervals for the differences of diversity indices between treatments may be used. The simultaneous confidence interval methods described until now are either constructed or validated under the assumption of the multinomial distribution for the abundance counts. Motivated by four example data sets with background in agricultural and marine ecology, we focus on the situation when available replications show that the count data exhibit extra‐multinomial variability. Based on simulated overdispersed count data, we compare previously proposed methods assuming multinomial distribution, a method assuming normal distribution for the replicated observations of the diversity indices and three different bootstrap methods to construct simultaneous confidence intervals for multiple differences of Simpson and Shannon diversity indices. The focus of the simulation study is on comparisons to a control group. The severe failure of asymptotic multinomial methods in overdispersed settings is illustrated. Among the bootstrap methods, the widely known Westfall–Young method performs best for the Simpson index, while for the Shannon index, two methods based on stratified bootstrap and summed count data are preferable. The application of the methods is illustrated with an example.
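As a rough illustration of the quantities being compared in this abstract, the sketch below computes Shannon and Simpson indices from abundance counts and takes their difference between a treatment and a control. The function names, the choice of the 1 - sum(p_i^2) form of the Simpson index, and the example counts are assumptions made for illustration; the sketch does not reproduce the simultaneous confidence interval or bootstrap methods of the paper.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i), skipping species with zero count."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Simpson index in the 1 - sum(p_i^2) form (one common convention)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical abundance counts for four species (illustrative only).
control = [25, 25, 25, 25]
treated = [70, 10, 10, 10]

# Point estimates of the treatment-versus-control differences.
shannon_diff = shannon_index(treated) - shannon_index(control)
simpson_diff = simpson_index(treated) - simpson_index(control)
```

Both differences are negative here because the treated community is dominated by one species; an interval method such as those compared in the abstract would be needed to judge whether such a difference is meaningful.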

2.
Monitoring the abundance of cryptic species inevitably relies on the use of index methods. Unfortunately, detectability is often confounded by unidentified covariates. One such species is the critically endangered Australasian Bittern Botaurus poiciloptilus. Current monitoring relies upon the ability to count males based on the conspicuous breeding calls of males. However, as in many vocal species, calling rates vary spatially and temporally, making it necessary to account for this when using call counts to index abundance. We undertook 461 15‐min call counts of Australasian Bitterns, in a range of conditions, during two breeding seasons at Whangamarino wetland, New Zealand. We fitted a range of generalized linear mixed models to these data to determine which factors were the best predictors of calling rate per individual Bittern (CRPI), allowing us to make recommendations regarding the optimum time and conditions for monitoring. Bittern CRPI was predictable in terms of time of day, month, cloud cover, rainfall and certain moon parameters, but some spatial and temporal variation remained unexplained. Results showed that the best time to detect Australasian Bitterns was 1 h before sunrise, in September (austral spring), on a moonlit night with no cloud or rain. Such models are useful for identifying times and conditions when counts are the highest and least variable, and could be applied to any species or cue count monitoring method where detection depends on counting calling individuals. Results can be used to standardize index counts, or sensibly to adjust and compare counts from different times. Standardizing monitoring in this way can lead to the development of monitoring methods that have a greater power to show population changes across shorter time periods. Moreover, the use of modelling processes to estimate effect sizes creates potential for such methods to be applied in circumstances where monitoring conditions are rarely optimum and standardization creates logistical trade‐offs, something that is particularly common in studies of cryptic species.

3.
Abundance data are widely used to monitor long-term population trends for management and conservation of species of interest. Programs that collect count data are often prohibitively expensive and time intensive, limiting the number of species that can be simultaneously monitored. Presence data, on the other hand, can often be collected in less time and for multiple species simultaneously. We investigate the relationship of counts to presence using 49 butterfly species across 4 sites over 9 years, and then compare trends produced from each index. We also employed simulated datasets to test the effect of reduced sampling on the relationship of counts to presence data and to investigate changes in each index’s power to reveal population trends. Presence and counts were highly correlated for most species tested, and population trends based on each index were concordant for most species. The effect of reduced sampling was species-specific, but on the whole, sensitivity of both indices to detect population trends was reduced. Common and rare species, as well as those with a range of life-history and behavioral traits, performed equally well. The relationship between presence and count data may break down in cases of very abundant and widespread species with extended flight seasons. Our results suggest that when used cautiously, presence data have the potential to be used as a surrogate for counts. Collection of presence data may be useful for multi-species monitoring or to reduce the duration of monitoring visits without fully sacrificing the ability to infer population trends.

4.
Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are ‘structural’ (induced subgraphs) and ‘functional’ (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File.

5.

Background

Much of our current understanding of the epidemiology of Ascaris lumbricoides infections in humans has been acquired by analyzing worm count data. These data are collected by treating infected individuals with anthelmintics so that worms are expelled intact from the gastrointestinal tract. Analysis of such data established that individuals are predisposed to infection with few or many worms and members of the same household tend to harbor similar numbers of worms. These effects, known respectively as individual predisposition and household clustering, are considered characteristic of the epidemiology of ascariasis. The mechanisms behind these phenomena, however, remain unclear. In particular, the impact of heterogeneous individual exposures to infectious stages has not been thoroughly explored.

Methodology/Principal Findings

Bayesian methods were used to fit a three-level hierarchical statistical model to A. lumbricoides worm counts derived from a three-round chemo-expulsion study carried out in Dhaka, Bangladesh. The effects of individual predisposition, household clustering and household covariates of the numbers of worms per host (worm burden) were considered simultaneously. Individual predisposition was found to be of limited epidemiological significance once household clustering had been accounted for. The degree of intra-household variability among worm burdens was found to be reduced by approximately 58% when household covariates were included in the model. Covariates relating to decreased affluence and quality of housing construction were associated with a statistically significant increase in worm burden.

Conclusions/Significance

Heterogeneities in the exposure of individuals to infectious eggs have an important role in the epidemiology of A. lumbricoides infection. The household covariates identified as being associated with worm burden provide valuable insights into the source of these heterogeneities although, above all, they emphasize and reiterate that infection with A. lumbricoides is inextricably associated with acute poverty.

6.
Point counts are commonly used to assess changes in bird abundance, including analytical approaches such as distance sampling that estimate density. Point‐count methods have come under increasing scrutiny because effects of detection probability and field error are difficult to quantify. For seven forest songbirds, we compared fixed‐radii counts (50 m and 100 m) and density estimates obtained from distance sampling to known numbers of birds determined by territory mapping. We applied point‐count analytic approaches to a typical forest management question and compared results to those obtained by territory mapping. We used a before–after control impact (BACI) analysis with a data set collected across seven study areas in the central Appalachians from 2006 to 2010. Using a 50‐m fixed radius, variance in error was at least 1.5 times that of the other methods, whereas a 100‐m fixed radius underestimated actual density by >3 territories per 10 ha for the most abundant species. Distance sampling improved accuracy and precision compared to fixed‐radius counts, although estimates were affected by birds counted outside 10‐ha units. In the BACI analysis, territory mapping detected an overall treatment effect for five of the seven species, and effects were generally consistent each year. In contrast, all point‐count methods failed to detect two treatment effects due to variance and error in annual estimates. Overall, our results highlight the need for adequate sample sizes to reduce variance, and skilled observers to reduce the level of error in point‐count data. Ultimately, the advantages and disadvantages of different survey methods should be considered in the context of overall study design and objectives, allowing for trade‐offs among effort, accuracy, and power to detect treatment effects.

7.
For time series of count data, correlated measurements, clustering as well as excessive zeros occur simultaneously in biomedical applications. Ignoring such effects might contribute to misleading treatment outcomes. A generalized mixture Poisson geometric process (GMPGP) model and a zero‐altered mixture Poisson geometric process (ZMPGP) model are developed from the geometric process model, which was originally developed for modelling positive continuous data and was extended to handle count data. These models are motivated by evaluating the trend development of new tumour counts for bladder cancer patients as well as by identifying useful covariates which affect the count level. The models are implemented using a Bayesian method with Markov chain Monte Carlo (MCMC) algorithms and are assessed using the deviance information criterion (DIC).

8.
This paper considers the clustering problem of physical step count data recorded on wearable devices. Clustering step data gives insight into an individual's activity status and further provides the groundwork for health‐related policies. However, classical methods, such as K‐means clustering and hierarchical clustering, are not suitable for step count data that are typically high‐dimensional and zero‐inflated. This paper presents a new clustering method for step data based on a novel combination of ensemble clustering and binning. We first construct multiple sets of binned data by changing the size and starting position of the bin, and then merge the clustering results from the binned data using a voting method. The advantage of binning, as a critical component, is that it substantially reduces the dimension of the original data while preserving the essential characteristics of the data. As a result, combining clustering results from multiple binned data can provide an improved clustering result that reflects both local and global structures of the data. Simulation studies and real data analysis were carried out to evaluate the empirical performance of the proposed method and demonstrate its general utility.
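The binning step described in this abstract, where several binned versions of the same series are built by changing the bin size and starting position, might be sketched as follows. The function name `bin_steps`, the handling of observations before the starting position (collected into one leading bin), and the example series are assumptions for illustration; the clustering and voting stages of the method are not shown.

```python
def bin_steps(steps, size, start=0):
    """Sum step counts into consecutive bins of `size`, beginning at index `start`.

    Observations before `start` (if any) are collected into one leading bin,
    and the final bin may be shorter than `size`.
    """
    edges = [0] if start > 0 else []
    edges += list(range(start, len(steps), size))
    edges.append(len(steps))
    return [sum(steps[a:b]) for a, b in zip(edges, edges[1:])]


# Hypothetical step counts per time slot, with many zeros (illustrative only).
steps = [0, 0, 120, 300, 0, 40]

# Several binned views of the same series, varying bin size and start position.
binned_views = [
    bin_steps(steps, size=2, start=0),
    bin_steps(steps, size=2, start=1),
    bin_steps(steps, size=3, start=0),
]
```

Each binned view is shorter than the original series, which is the dimension reduction the abstract refers to; every view would then be clustered separately before the results are merged.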

9.
Zero‐truncated data arise in various disciplines where counts are observed but the zero count category cannot be observed during sampling. Maximum likelihood estimation can be used to model these data; however, due to its nonstandard form it cannot be easily implemented using well‐known software packages, and additional programming is often required. Motivated by the Rao–Blackwell theorem, we develop a weighted partial likelihood approach to estimate model parameters for zero‐truncated binomial and Poisson data. The resulting estimating function is equivalent to a weighted score function for standard count data models, and allows for applying readily available software. We evaluate the efficiency of this new approach and show that it performs almost as well as maximum likelihood estimation. The weighted partial likelihood approach is then extended to regression modelling and variable selection. We examine the performance of the proposed methods through simulation and present two case studies using real data.
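To make the "nonstandard form" concrete, the sketch below writes out the standard zero‐truncated Poisson probability mass function, which rescales the ordinary Poisson probabilities by 1 - exp(-lambda) so that they sum to one over y = 1, 2, .... The function names are illustrative assumptions; this shows only the distribution itself and is not the weighted partial likelihood estimator proposed in the paper.

```python
import math


def zt_poisson_pmf(y, lam):
    """P(Y = y | Y > 0) for a Poisson(lam) count with the zero class unobservable."""
    if y < 1:
        return 0.0
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return poisson / (1.0 - math.exp(-lam))


def zt_poisson_mean(lam):
    """Mean of the zero-truncated Poisson: lam / (1 - exp(-lam))."""
    return lam / (1.0 - math.exp(-lam))


lam = 2.0
total_mass = sum(zt_poisson_pmf(y, lam) for y in range(1, 60))
```

The extra factor 1 - exp(-lambda) in the denominator is what makes the likelihood differ from the one that ordinary Poisson routines maximize.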

10.
MicroRNAs (miRNAs) are important regulatory molecules in eukaryotic organisms. Existing methods for the identification of mature miRNA sequences in plants rely extensively on the search for stem–loop structures, leading to high false negative rates. Here, we describe a probabilistic method for ranking putative plant miRNAs using a naïve Bayes classifier and its publicly available implementation. We use a number of properties to construct the classifier, including sequence length, number of observations, existence of detectable predicted miRNA* sequences, the distribution of nearby reads and mapping multiplicity. We apply the method to small RNA sequence data from soybean, peach, Arabidopsis and rice and provide experimental validation of several predictions in soybean. The approach performs well overall and strongly enriches for known miRNAs over other types of sequences. By utilizing a Bayesian approach to rank putative miRNAs, our method is able to score miRNAs that would be eliminated by other methods, such as those that have low counts or lack detectable miRNA* sequences. As a result, we are able to detect several soybean miRNA candidates, including some that are 24 nucleotides long, a class that is almost universally eliminated by other methods.

11.
Apex carnivores are wide‐ranging, low‐density, hard to detect, and declining throughout most of their range, making population monitoring both critical and challenging. Rapid and inexpensive index calibration survey (ICS) methods have been developed to monitor large African carnivores. ICS methods assume constant detection probability and a predictable relationship between the index and the actual population of interest. The precision and utility of the resulting estimates from ICS methods have been questioned. We assessed the performance of one ICS method for large carnivores, track counts, with data from two long‐term studies of African lion populations. We conducted Monte Carlo simulation of intersections between transects (road segments) and lion movement paths (from GPS collar data) at varying survey intensities. Then, using the track count method we estimated population size and its confidence limits. We found that estimates either overstate precision or are too imprecise to be meaningful. Overstated precision stemmed from discarding the variance from population estimates when developing the method and from treating the conversion from track counts to population density as a back‐transformation, rather than applying the equation for the variance of a linear function. To effectively assess the status of species, the IUCN has set guidelines, and these should be integrated in survey designs. We propose reporting the half relative confidence interval width (HRCIW) as an easily calculable and interpretable measure of precision. We show that track counts do not adhere to IUCN criteria, and we argue that ICS methods for wide‐ranging low‐density species are unlikely to meet those criteria. Established, intensive methods lead to precise estimates, but some new approaches, like short, intensive, (spatial) capture–mark–recapture (CMR/SECR) studies, aided by camera trapping and/or genetic identification of individuals, hold promise. A handbook of best practices in monitoring populations of apex carnivores is strongly recommended.
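The abstract names the half relative confidence interval width (HRCIW) but does not give its formula. The sketch below assumes the reading suggested by the name, half the width of the confidence interval divided by the point estimate; that definition, the function name `hrciw` and the example numbers are assumptions, not taken from the paper.

```python
def hrciw(estimate, lower, upper):
    """Half relative confidence interval width (assumed definition):
    half the confidence interval width, expressed as a fraction of the estimate."""
    if estimate <= 0:
        raise ValueError("estimate must be positive")
    return (upper - lower) / (2.0 * estimate)


# Hypothetical population estimate of 100 animals with a 95% CI of 80 to 120.
precision = hrciw(100.0, 80.0, 120.0)
```

Under this reading, a value of 0.2 would mean the interval extends roughly 20% of the estimate on either side, which is the kind of single-number precision summary the authors recommend reporting.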

12.
Spatial cluster detection is an important methodology for identifying regions with excessive numbers of adverse health events without making strong model assumptions on the underlying spatial dependence structure. Previous work has focused on point or individual‐level outcome data and few advances have been made when the outcome data are reported at an aggregated level, for example, at the county‐ or census‐tract level. This article proposes a new class of spatial cluster detection methods for point or aggregate data, comprising continuous, binary, and count data. Compared with the existing spatial cluster detection methods it has the following advantages. First, it readily incorporates region‐specific weights, for example, based on a region's population or a region's outcome variance, which is the key for aggregate data. Second, the established general framework allows for area‐level and individual‐level covariate adjustment. A simulation study is conducted to evaluate the performance of the method. The proposed method is then applied to assess spatial clustering of high Body Mass Index in a health maintenance organization population in the Seattle, Washington, USA area.

13.
The validity of treating counts as indices to abundance is based on the assumption that the expected detection probability, E(p), is constant over time or comparison groups or, more realistically, that variation in p is small relative to variation in population size that investigators seek to detect. Unfortunately, reliable estimates of E(p) and var(p) are lacking for most index methods. As a case study, we applied the time‐of‐detection method to temporally replicated (within season) aural counts of crowing male Ring‐necked Pheasants (Phasianus colchicus) at 18 sites in southern Minnesota in 2007 to evaluate the detectability assumptions. More specifically, we used the time‐of‐detection method to estimate E(p) and var(p), and then used these estimates in a Monte Carlo simulation to evaluate bias‐variance tradeoffs associated with adjusting count indices for imperfect detection. The estimated mean detection probability in our case study was 0.533 (SE = 0.030) and estimated spatial variation in E(p) was 0.081 (95% CI: 0.057–0.126). On average, both adjusted (for) and unadjusted counts of crowing males qualitatively described the simulated relationship between pheasant abundance and grassland abundance, but the bias‐variance tradeoff was smaller for adjusted counts (MSE = 0.003 vs. 0.045, respectively). Our case study supports the general recommendation to use, whenever feasible, formal population‐estimation procedures (e.g., mark‐recapture, distance sampling, double sampling) to account for imperfect detection. However, we caution that interpreting estimates of absolute abundance can be complicated, even if formal estimation methods are used. For example, the time‐of‐detection method was useful for evaluating detectability assumptions in our case study and the method could be used to adjust aural count indices for imperfect detection. Conversely, using the time‐of‐detection method to estimate absolute abundances in our case study was problematic because the biological populations and sampling coverage could not be clearly delineated. These estimation and inference challenges may also be important in other avian surveys that involve mobile species (whose home ranges may overlap several sampling sites), temporally replicated counts, and inexact sampling coverage.

14.
Dimension reduction of high‐dimensional microbiome data facilitates subsequent analysis such as regression and clustering. Most existing reduction methods cannot fully accommodate the special features of the data, such as count‐valued observations and excessive zero reads. We propose a zero‐inflated Poisson factor analysis model in this paper. The model assumes that microbiome read counts follow zero‐inflated Poisson distributions with library size as offset and Poisson rates negatively related to the inflated zero occurrences. The latent parameters of the model form a low‐rank matrix consisting of interpretable loadings and low‐dimensional scores that can be used for further analyses. We develop an efficient and robust expectation‐maximization algorithm for parameter estimation. We demonstrate the efficacy of the proposed method using comprehensive simulation studies. The application to the Oral Infections, Glucose Intolerance, and Insulin Resistance Study provides valuable insights into the relation between subgingival microbiome and periodontal disease.
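The building block of the model in this abstract is the zero‐inflated Poisson distribution, which mixes a point mass at zero (weight pi) with an ordinary Poisson count (weight 1 - pi). A minimal sketch of its probability mass function is given below; the function name and parameter values are illustrative assumptions, and the library-size offset, the link between Poisson rates and zero inflation, and the low‐rank factor structure of the proposed model are not represented.

```python
import math


def zip_pmf(y, lam, pi):
    """Zero-inflated Poisson pmf: extra zeros with probability pi,
    otherwise a Poisson(lam) count."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    structural_zero = pi if y == 0 else 0.0
    return structural_zero + (1.0 - pi) * poisson


lam, pi = 3.0, 0.4
p_zero = zip_pmf(0, lam, pi)
total_mass = sum(zip_pmf(y, lam, pi) for y in range(0, 80))
```

With these illustrative values the probability of a zero read is about 0.43, compared with about 0.05 under a plain Poisson with the same rate, which is the excess-zero behaviour the abstract refers to.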

15.

Introduction

With the renewed drive towards malaria elimination, there is a need for improved surveillance tools. While time series analysis is an important tool for surveillance, prediction and for measuring interventions’ impact, approximations by commonly used Gaussian methods are prone to inaccuracies when case counts are low. Therefore, statistical methods appropriate for count data are required, especially during “consolidation” and “pre-elimination” phases.

Methods

Generalized autoregressive moving average (GARMA) models were extended to generalized seasonal autoregressive integrated moving average (GSARIMA) models for parsimonious observation-driven modelling of non-Gaussian, non-stationary and/or seasonal time series of count data. The models were applied to monthly malaria case time series in a district in Sri Lanka, where malaria has decreased dramatically in recent years.

Results

The malaria series showed long-term changes in the mean, unstable variance and seasonality. After fitting negative-binomial Bayesian models, both a GSARIMA and a GARIMA deterministic seasonality model were selected based on different criteria. Posterior predictive distributions indicated that negative-binomial models provided better predictions than Gaussian models, especially when counts were low. The G(S)ARIMA models were able to capture the autocorrelation in the series.

Conclusions

G(S)ARIMA models may be particularly useful in the drive towards malaria elimination, since episode count series are often seasonal and non-stationary, especially when control is increased. Although building and fitting GSARIMA models is laborious, they may provide more realistic prediction distributions than do Gaussian methods and may be more suitable when counts are low.

16.
Outlier detection and cleaning procedures were evaluated to estimate mathematical restricted variogram models with discrete insect population count data. Because variogram modeling is significantly affected by outliers, methods to detect and clean outliers from data sets are critical for proper variogram modeling. In this study, we examined spatial data in the form of discrete measurements of insect counts on a rectangular grid. Two well-known insect pest population data sets were analyzed; one data set was the western flower thrips, Frankliniella occidentalis (Pergande) on greenhouse cucumbers and the other was the greenhouse whitefly, Trialeurodes vaporariorum (Westwood) on greenhouse cherry tomatoes. A spatial additive outlier model was constructed to detect outliers in both the isolated and patchy spatial distributions of outliers, and the outliers were cleaned with the neighboring median cleaner. To analyze the effect of outliers, we compared the relative nugget effects of data cleaned of outliers and data still containing outliers after transformation. In addition, the correlation coefficients between the actual and predicted values were compared using the leave-one-out cross-validation method with data cleaned of outliers and non-cleaned data after unbiased back transformation. The outlier detection and cleaning procedure improved geostatistical analysis, particularly by reducing the nugget effect, which greatly impacts the prediction variance of kriging. Consequently, the outlier detection and cleaning procedures used here improved the results of geostatistical analysis with highly skewed and extremely fluctuating data, such as insect counts.

17.
Count data with means <2 are often assumed to follow a Poisson distribution. However, in many cases these kinds of data, such as number of young fledged, are more appropriately considered to be multinomial observations due to naturally occurring upper truncation of the distribution. We evaluated the performance of several versions of multinomial regression, plus Poisson and normal regression, for analysis of count data with means <2 through Monte Carlo simulations. Simulated data mimicked observed counts of number of young fledged (0, 1, 2, or 3) by California spotted owls (Strix occidentalis occidentalis). We considered size and power of tests to detect differences among 10 levels of a categorical predictor, as well as tests for trends across 10-year periods. We found regular regression and analysis of variance procedures based on a normal distribution to perform satisfactorily in all cases we considered, whereas failure rate of multinomial procedures was often excessively high, and the Poisson model demonstrated inappropriate test size for data where the variance/mean ratio was <1 or >1.2. Thus, managers can use simple statistical methods with which they are likely already familiar to analyze the kinds of count data we described here.
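The variance/mean ratio mentioned in this abstract is a quick check of whether a Poisson assumption is plausible, since a Poisson distribution has a ratio of 1. A minimal sketch using the sample variance is shown below; the function name and the example counts are hypothetical and are not data from the study.

```python
from statistics import mean, variance


def variance_mean_ratio(counts):
    """Sample variance divided by sample mean; equal to 1 in expectation for Poisson data."""
    return variance(counts) / mean(counts)


# Hypothetical numbers of young fledged per nest (values 0 to 3, illustrative only).
fledged = [0, 0, 1, 1, 1, 2, 2, 3]
ratio = variance_mean_ratio(fledged)
```

For these illustrative counts the ratio falls below 1, the underdispersed situation in which the abstract reports inappropriate test size for the Poisson model.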

18.
19.
Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol are freely available at https://sourceforge.net/projects/asgenseng/.

20.
Ecological Indicators, 2007, 7(2): 254-276
A multivariate index assessing community stress in marine benthic count data is compared to other multivariate methods. Both ordinations and clustering methods are included in the comparison and so is AMBI (AZTI marine biotic index). The community disturbance index (CDI) requires the data to contain undisturbed reference samples. If this requirement is met, the calculations are carried out in two steps. Firstly, the reference samples are separated and used to build a model representing the natural variation. Subsequently, the remaining samples are compared to the model and the respective disturbance indices are calculated. The CDI has the same sensitivity as the traditional multivariate methods. Additionally, one obtains a quantification of the relative level of disturbance. The comparison is performed using data from monitoring surveys at three different oilfields in the North Sea: Troll, Ekofisk and Oseberg-Brage. The samples were collected in 1996, 1990 and 1996, respectively.
