Similar Documents
20 similar documents found (search time: 62 ms)
1.
Mark rate, or the proportion of the population with unique, identifiable marks, must be determined in order to estimate population size from photographic identification data. In this study we address field sampling protocols and estimation methods for robust estimation of mark rate and its uncertainty in cetacean populations. We present two alternatives for estimating the variance of mark rate: (1) a variance estimator for clusters of unequal sizes (SRCS) and (2) a hierarchical Bayesian model (SRCS-Bayes), and compare them to the simple random sampling (SRS) variance estimator. We tested these variance estimators using a simulation to see how they perform at varying mark rates, number of groups sampled, photos per group, and mean group sizes. The hierarchical Bayesian model outperformed the frequentist variance estimators, with the true mark rate of the population held in its 95% HDI 91.9% of the time (compared with coverage of 79% for the SRS method and 76.3% for the SRCS-Cochran method). The simulation results suggest that, ideally, mark rate and its precision should be quantified using hierarchical Bayesian modeling, and researchers should attempt to sample as many unique groups as possible to improve accuracy and precision.  相似文献   
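The cluster-based (SRCS-type) variance calculation described above can be sketched as a ratio estimator over groups of unequal size. The function and the simulated data below are illustrative assumptions, not the study's actual estimator or data:

```python
import random

def mark_rate_cluster_variance(marked, totals):
    """Ratio estimator of mark rate and its variance when groups
    (clusters) of unequal size are sampled.
    marked[i] = distinctly marked animals photographed in group i
    totals[i] = total animals photographed in group i
    """
    k = len(totals)
    p_hat = sum(marked) / sum(totals)          # pooled (ratio) mark rate
    n_bar = sum(totals) / k                    # mean group size
    # Ratio-estimator variance for clusters of unequal size
    resid_ss = sum((m - p_hat * n) ** 2 for m, n in zip(marked, totals))
    var_p = resid_ss / (k * (k - 1) * n_bar ** 2)
    return p_hat, var_p

# Illustrative simulation: 40 photographed groups, true mark rate 0.6
random.seed(1)
groups = [random.randint(5, 25) for _ in range(40)]
marked = [sum(random.random() < 0.6 for _ in range(n)) for n in groups]
p_hat, var_p = mark_rate_cluster_variance(marked, groups)
```

Unlike the naive SRS binomial variance, this estimator inflates the uncertainty when mark rate varies among groups, which is why the cluster-aware methods achieve better coverage.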

2.
Chen Z  Wang YG 《Biometrics》2004,60(4):997-1004
This article is motivated by a lung cancer study where a regression model is involved and the response variable is too expensive to measure, but the predictor variable can be measured easily at relatively negligible cost. This situation occurs quite often in medical studies, quantitative genetics, and ecological and environmental studies. In this article, using the idea of ranked-set sampling (RSS), we develop sampling strategies that can reduce cost and increase the efficiency of regression analysis in the above-mentioned situation. The developed method is applied retrospectively to a lung cancer study, where the interest is in the association between smoking status and three biomarkers: polyphenol DNA adducts, micronuclei, and sister chromatid exchanges. Optimal sampling schemes under different optimality criteria such as A-, D-, and integrated mean square error (IMSE)-optimality are considered in the application. With set size 10 in RSS, the improvement of the optimal schemes over simple random sampling (SRS) is substantial. For instance, under the IMSE-optimal scheme, the IMSEs of the estimated regression functions for the three biomarkers are reduced to about half of those incurred under SRS.
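The basic RSS mechanism that the authors build on can be sketched as follows: rank small sets by the cheap covariate, keep one order statistic per set, and measure the expensive response only on the kept units. This is a minimal sketch of plain balanced RSS for estimating a mean, not the paper's optimal A-/D-/IMSE-designed schemes; the population and correlation structure are invented for illustration:

```python
import random

def rss_sample(population, set_size, rank_key):
    """One balanced ranked-set sample: draw `set_size` sets of
    `set_size` units, rank each set by the cheap covariate
    `rank_key`, keep the i-th ranked unit of set i."""
    kept = []
    for i in range(set_size):
        s = random.sample(population, set_size)
        s.sort(key=rank_key)
        kept.append(s[i])
    return kept

random.seed(7)
# Units are (cheap covariate x, expensive response y), strongly correlated
pop = [(x, 2 * x + random.gauss(0, 1))
       for x in (random.gauss(0, 1) for _ in range(5000))]

def mean_y(sample):
    return sum(y for _, y in sample) / len(sample)

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Monte Carlo comparison of estimator variance, RSS vs SRS, n = 5
rss_means = [mean_y(rss_sample(pop, 5, rank_key=lambda u: u[0]))
             for _ in range(2000)]
srs_means = [mean_y(random.sample(pop, 5)) for _ in range(2000)]
```

With a covariate this informative, the RSS mean is noticeably less variable than the SRS mean at the same number of expensive measurements, which is the efficiency gain the abstract exploits.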

3.
Evaluating the classification accuracy of a candidate biomarker signaling the onset of disease or disease status is essential for medical decision making. A good biomarker would accurately identify the patients who are likely to progress or die at a particular time in the future or who are in urgent need of active treatment. To assess the performance of a candidate biomarker, the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) are commonly used. In many cases, the standard simple random sampling (SRS) design used for biomarker validation studies is costly and inefficient. In order to improve efficiency and reduce the cost of biomarker validation, marker‐dependent sampling (MDS) may be used. In an MDS design, the selection of patients for whom true survival time is assessed depends on the result of a biomarker assay. In this article, we introduce a nonparametric estimator for time‐dependent AUC under an MDS design. The consistency and asymptotic normality of the proposed estimator are established. Simulations show the unbiasedness of the proposed estimator and a significant efficiency gain of the MDS design over the SRS design.

4.
Dorazio RM  Jelks HL  Jordan F 《Biometrics》2005,61(4):1093-1101
A statistical modeling framework is described for estimating the abundances of spatially distinct subpopulations of animals surveyed using removal sampling. To illustrate this framework, hierarchical models are developed using the Poisson and negative-binomial distributions to model variation in abundance among subpopulations and using the beta distribution to model variation in capture probabilities. These models are fitted to the removal counts observed in a survey of a federally endangered fish species. The resulting estimates of abundance have similar or better precision than those computed using the conventional approach of analyzing the removal counts of each subpopulation separately. Extension of the hierarchical models to include spatial covariates of abundance is straightforward and may be used to identify important features of an animal's habitat or to predict the abundance of animals at unsampled locations.  相似文献   
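The per-site building block that the hierarchical model generalizes is the classical two-pass removal estimator. This sketch shows that closed form for a single subpopulation; the counts are illustrative, not from the endangered-fish survey:

```python
def two_pass_removal(c1, c2):
    """Moran-Zippin two-pass removal estimates for one subpopulation:
    N_hat = c1**2 / (c1 - c2),  p_hat = (c1 - c2) / c1.
    Valid only when the first-pass catch exceeds the second (c1 > c2);
    the hierarchical model in the abstract instead pools information
    across subpopulations, which stabilizes exactly these cases."""
    if c1 <= c2:
        raise ValueError("removal estimator needs c1 > c2")
    p_hat = (c1 - c2) / c1           # estimated capture probability
    n_hat = c1 ** 2 / (c1 - c2)      # estimated abundance
    return n_hat, p_hat

# Example: 60 fish removed on pass 1, 24 on pass 2
n_hat, p_hat = two_pass_removal(60, 24)   # → N_hat = 100.0, p_hat = 0.6
```

When c1 and c2 are close (low capture probability), this site-by-site estimator becomes unstable, which motivates borrowing strength across sites as the abstract describes.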

5.
Constant precision sampling plans for the white apple leafhopper, Typhlocyba pomaria McAtee, were developed so that the species could be used as an indicator of system stability as new integrated pest management programs without broad-spectrum pesticides are developed. Taylor's power law was used to model the relationship between the mean and the variance, and Green's constant precision sequential sampling equation was used to develop the plans. Bootstrap simulations of the sampling plans showed greater precision (D = 0.25) than the desired precision (D0 = 0.3), particularly at low mean population densities. We found that by adjusting the D0 value in Green's equation to 0.4, we were able to reduce the average sample number by 25% while providing an average D = 0.31. The sampling plan described allows T. pomaria to be used as a reasonable indicator species of agroecosystem stability in Washington apple orchards.
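The combination of Taylor's power law (s² = a·m^b) with Green's constant-precision stop line can be sketched directly: setting D0 = SE/mean and solving gives a cumulative-count threshold C_n at which sampling can stop after n units. The Taylor coefficients below are illustrative, not the fitted leafhopper values:

```python
def green_stop_line(n, a, b, d0):
    """Cumulative-count stop line of Green's constant-precision
    sequential plan under Taylor's power law s**2 = a * m**b.
    Setting d0 = sqrt(a * m**b / n) / m and solving for m gives
    m = (a / (d0**2 * n)) ** (1 / (2 - b)); stop once the running
    insect total over n sample units reaches C_n = n * m."""
    m_required = (a / (d0 ** 2 * n)) ** (1.0 / (2.0 - b))
    return n * m_required

# Hypothetical Taylor coefficients (illustrative only)
a, b = 2.5, 1.4
thresholds = [green_stop_line(n, a, b, d0=0.3) for n in (10, 20, 30)]
```

Relaxing the nominal precision from D0 = 0.3 to D0 = 0.4 lowers the stop line, which is the mechanism behind the 25% reduction in average sample number reported above.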

6.
A strategy is proposed for constructing core collections by least distance stepwise sampling (LDSS) based on genotypic values. At each clustering step, sampling is performed within the subgroup having the least distance in the dendrogram. Mean difference percentage (MD), variance difference percentage (VD), coincidence rate of range (CR) and variable rate of coefficient of variation (VR) were used to evaluate the representativeness of the resulting core collections. A cotton germplasm collection of 1,547 accessions with 18 quantitative traits was used to construct core collections; genotypic values of all quantitative traits were predicted without bias using a mixed linear model approach. At three sampling percentages (10, 20 and 30%), four genetic distances (city block, Euclidean, standardized Euclidean and Mahalanobis) combined with four hierarchical cluster methods (nearest distance, furthest distance, unweighted pair-group average and Ward's method) were used to evaluate the strategy. Simulations were conducted to obtain consistent, stable and reproducible results, and principal components analysis was performed to validate the strategy. The results showed that core collections constructed by the LDSS strategy represented the initial collection well, and were more representative than those from the control strategy (stepwise clustering with random sampling). Under LDSS the choice of cluster method is immaterial, since all hierarchical cluster methods gave identical results. The results also suggested that standardized Euclidean distance is an appropriate genetic distance for constructing core collections under this strategy.
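Two of the evaluation metrics named above have simple closed forms that can be sketched directly: CR% averages, over traits, the ratio of the core collection's trait range to the initial collection's, and VR% does the same for coefficients of variation. The data below are invented for illustration:

```python
import statistics

def cv(xs):
    """Coefficient of variation: stdev / mean."""
    return statistics.stdev(xs) / statistics.fmean(xs)

def evaluation_stats(core, initial):
    """CR% (coincidence rate of range) and VR% (variable rate of the
    coefficient of variation), averaged over traits.  `core` and
    `initial` each hold one list of trait values per trait."""
    range_ratios = [(max(c) - min(c)) / (max(f) - min(f))
                    for c, f in zip(core, initial)]
    cv_ratios = [cv(c) / cv(f) for c, f in zip(core, initial)]
    m = len(core)
    return 100 * sum(range_ratios) / m, 100 * sum(cv_ratios) / m

# One illustrative trait: the core keeps both extremes, so CR% = 100
full = [[float(v) for v in range(1, 11)]]
core = [[1.0, 4.0, 10.0]]
cr_pct, vr_pct = evaluation_stats(core, full)
```

A VR% above 100 indicates the core retains proportionally more variability than the initial collection, which is the desired behavior for a representative core.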

7.
Adaptive sampling designs are becoming increasingly popular in environmental science, particularly for surveying rare and aggregated populations. An adaptive sample is one in which the survey design is modified, or adapted, in some way on the basis of information gained during the survey. There are many different adaptive survey designs that can be used to estimate animal and plant abundance. In adaptive cluster sampling, additional sample effort is allocated during the survey to the immediate neighborhood in which the species is found. In adaptive stratified sampling, additional sample effort is allocated during the survey to strata of high abundance. The appealing feature of these adaptive designs is that the field biologist gets to do what innately seems sensible when working with rare and aggregated populations—field effort is targeted around where the species is observed in the first wave of the survey. However, there are logistical challenges of applying this principle of targeted field effort while remaining in the framework of probability-based sampling. We propose a simplified adaptive survey design that incorporates both targeting field effort and being logistically feasible. We show with a case study population of rockfish that complete allocation stratified sampling is a very efficient design.  相似文献   

8.
Ranked set sampling (RSS) is a sampling procedure that can be considerably more efficient than simple random sampling (SRS). When the variable of interest is binary, ranking of the sample observations can be implemented using the estimated probabilities of success obtained from a logistic regression model developed for the binary variable. The main objective of this study is to use substantial data sets to investigate the application of RSS to estimation of a proportion for a population that is different from the one that provides the logistic regression. Our results indicate that precision in estimation of a population proportion is improved through the use of logistic regression to carry out the RSS ranking and, hence, the sample size required to achieve a desired precision is reduced. Further, the choice and the distribution of covariates in the logistic regression model are not overly crucial for the performance of a balanced RSS procedure.  相似文献   

9.
Lack of funds is a major issue in ecology, particularly at the local scale. Sustainable management of a natural population requires a good understanding of its functioning, which in turn depends on a good long-term monitoring program. Such programs are usually very difficult to implement, especially for resources whose distribution shows high spatio-temporal variation, forcing a trade-off between efficiency and cost. Today, thanks to rapidly evolving statistical theory, new survey designs have been developed, some of which spatially balance the sample across the study area. This paper aims to demonstrate that these advanced sampling designs outperform the usual ones for long-term monitoring of local resources, with the added benefits of saving money and increasing the accuracy of results. As a test case, chosen for the high spatio-temporal variation of its distribution, we use the monitoring of the Manila clam stock in Arcachon Bay. This stock is under high scrutiny, yet recent campaigns could not be carried out for lack of funding (at least 50,000 €/survey). We use a simulation study based on real data to assess and compare the performance of new and older sampling designs for this survey. Three sampling designs are tested against data from the six past monitoring campaigns, and we estimate the cost of applying each in the field. The selected designs are: (1) simple random sampling (SRS, the design used in past years of this monitoring program); (2) generalized random tessellation stratified sampling (GRTS, a recent spatially balanced design known for its high performance); and (3) balanced acceptance sampling (BAS, a newly developed spatially balanced design not previously tested on a real population). We first confirm that the two spatially balanced sampling designs outperform simple random sampling. The two advanced designs perform equally well and achieve the same accuracy as SRS with almost half the sampling intensity, making them so cost-effective that 30% of each campaign's cost could be saved if they were used. Moreover, all three designs require a constant sample size through the years to achieve a fixed accuracy, which allows a single sample size to be fixed for all future campaigns despite the spatial and temporal variation in the clams' distribution.
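The BAS design mentioned above draws sites from a quasi-random Halton sequence mapped onto the study area, accepting points that fall inside it. This is a minimal sketch over a rectangle with a fixed starting index; a real BAS implementation randomizes the start and handles irregular polygons, so treat `bas_points` as illustrative:

```python
def halton(i, base):
    """i-th term of the van der Corput sequence in the given base."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def bas_points(n, x0, y0, x1, y1, inside=lambda x, y: True, start=1):
    """Sketch of Balanced Acceptance Sampling: stream 2-D Halton
    points (bases 2 and 3) over the bounding rectangle and accept
    those inside the study area until n sites are selected."""
    pts, i = [], start
    while len(pts) < n:
        x = x0 + (x1 - x0) * halton(i, 2)
        y = y0 + (y1 - y0) * halton(i, 3)
        if inside(x, y):          # reject points outside the study area
            pts.append((x, y))
        i += 1
    return pts

# 50 sites over a 10 km x 5 km bay-shaped rectangle (illustrative)
sites = bas_points(50, 0.0, 0.0, 10.0, 5.0)
```

Because the Halton stream fills space evenly, roughly half the sites land in each half of the region at every sample size, which is the spatial balance that lets these designs match SRS accuracy at lower sampling intensity.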

10.
Plant pathologists need to manage plant diseases at low incidence levels, and to do so efficiently in terms of precision, cost and time, because most plant diseases spread rapidly to other plants. Adaptive cluster sampling with a data‐driven stopping rule (ACS*) was proposed to control the final sample size and improve the efficiency of ordinary adaptive cluster sampling (ACS) when prior knowledge of the population structure is unavailable. This study applies the ACS* design to plant diseases at various levels of clustering and incidence. Results from a simulation study show that ACS* is as efficient as ordinary ACS at low levels of disease incidence with highly clustered diseased plants, and is an efficient design compared with simple random sampling (SRS) and ordinary ACS for highly to less clustered diseased plants at moderate to high levels of disease incidence.

11.
Hiby L  Krishna MB 《Biometrics》2001,57(3):727-731
Cutting straight line transects through dense forest is time consuming and expensive when large areas need to be surveyed for rare or highly clustered species. We argue that existing paths or game trails may be suitable as transects for line transect sampling even though they will not, in general, run straight. Formulas and software currently used to estimate local density using perpendicular distance data can be used with closest approach distances measured from curving transects. Suitable paths or trails are those for which the minimum radius of curvature is rarely less than the width of the shoulder in the detection probability function. The use of existing paths carries the risk of bias resulting from unrepresentative sampling of available habitats, and this must be weighed against the increase in coverage available.  相似文献   
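The density formulas referred to above can be sketched for the common half-normal detection function g(x) = exp(-x²/2σ²): σ is estimated by maximum likelihood from the perpendicular (or, per this paper, closest-approach) distances, and density is detections divided by twice the effective strip half-width times transect length. The simulated distances are illustrative:

```python
import math
import random

def halfnormal_density(distances, line_length):
    """Line-transect density estimate with a half-normal detection
    function.  sigma^2 has MLE sum(x_i^2)/n, and the effective strip
    half-width is ESW = sigma * sqrt(pi/2), so D = n / (2 * L * ESW)."""
    n = len(distances)
    sigma2 = sum(x * x for x in distances) / n      # half-normal MLE
    esw = math.sqrt(sigma2) * math.sqrt(math.pi / 2)
    return n / (2.0 * line_length * esw)            # animals per unit area

# 40 detections along a 5-km trail, distances in km (simulated)
random.seed(3)
dists = [abs(random.gauss(0, 0.05)) for _ in range(40)]
d_hat = halfnormal_density(dists, line_length=5.0)
```

The paper's argument is that these same formulas accept closest-approach distances from curving trails, provided the trail's radius of curvature rarely drops below the shoulder width of the detection function.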

12.
Population geneticists and community ecologists have long recognized the importance of sampling design for uncovering patterns of diversity within and among populations and in communities. Invasion ecologists increasingly have utilized phylogeographical patterns of mitochondrial or chloroplast DNA sequence variation to link introduced populations with putative source populations. However, many studies have ignored lessons from population genetics and community ecology and are vulnerable to sampling errors owing to insufficient field collections. A review of published invasion studies that utilized mitochondrial or chloroplast DNA markers reveals that insufficient sampling could strongly influence results and interpretations. Sixty per cent of studies sampled an average of less than six individuals per source population, vs. only 45% for introduced populations. Typically, far fewer introduced than source populations were surveyed, although they were sampled more intensively. Simulations based on published data forming a comprehensive mtDNA haplotype data set highlight and quantify the impact of the number of individuals surveyed per source population and number of putative source populations surveyed for accurate assignment of introduced individuals. Errors associated with sampling a low number of individuals are most acute when rare source haplotypes are dominant or fixed in the introduced population. Accuracy of assignment of introduced individuals is also directly related to the number of source populations surveyed and to the degree of genetic differentiation among them (F_ST). Incorrect interpretations resulting from sampling errors can be avoided if sampling design is considered before field collections are made.
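The cost of undersampling is easy to quantify: a haplotype at frequency p escapes a sample of n individuals with probability (1-p)^n, so the sample size needed to detect it with a given probability follows directly. The 20% frequency and 95% detection power below are illustrative choices:

```python
import math

def sample_size_for_detection(freq, power=0.95):
    """Smallest number of individuals per source population needed to
    observe at least one copy of a haplotype at frequency `freq` with
    probability `power`: solve (1 - freq)**n <= 1 - power, i.e.
    n >= log(1 - power) / log(1 - freq)."""
    return math.ceil(math.log(1.0 - power) / math.log(1.0 - freq))

# A haplotype at 20% frequency is missed by a 6-individual sample
# about a quarter of the time:
p_missed = (1 - 0.20) ** 6                    # ≈ 0.26
n_needed = sample_size_for_detection(0.20)    # → 14
```

This is why the six-individuals-per-population average reported above is so problematic: even common source haplotypes routinely go unobserved, distorting assignment of introduced individuals.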

13.
The compilation of all available taxonomic and distributional information on the species present in a territory frequently generates a biased picture of the distribution of biodiversity, owing to the uneven distribution of sampling effort. Thus, quality assessment protocols such as that proposed by Hortal et al. (Conservation Biology 21:853–863, 2007) must be applied before this kind of information is used for basic and applied purposes. Discriminating localities that can be considered relatively well surveyed from those insufficiently surveyed is a key first step in this protocol, and can be attained by first defining a sampling-effort surrogate and then calculating survey completeness with different estimators. Recently it has been suggested that records from exhaustive databases can be used as a sampling-effort surrogate to recognize probably well-surveyed localities. In this paper, we use an Iberian dung beetle database to identify the 50 × 50 km UTM cells that appear to be reliably inventoried, using both data derived from standardized sampling protocols and database records as surrogates for sampling effort. Observed and predicted species richness values in the shared cells identified as well surveyed by both methods suggest that the use of database records yields higher species richness values, which are proportionally greater in the richest localities owing to the inclusion of rare species.
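One common way to score survey completeness for a cell, as a sketch of the kind of estimator usable in this protocol, is the nonparametric Chao1 richness estimate: observed richness divided by estimated true richness, where the estimate is driven by the counts of singleton and doubleton species. The abundance data below are invented for illustration:

```python
from collections import Counter

def chao1_completeness(abundances):
    """Survey completeness as S_obs / S_chao1, where
    S_chao1 = S_obs + f1**2 / (2 * f2), with f1 and f2 the numbers of
    species recorded exactly once and exactly twice."""
    freq = Counter(abundances)        # abundance value -> number of species
    s_obs = len(abundances)
    f1, f2 = freq.get(1, 0), freq.get(2, 0)
    if f2 > 0:
        s_chao = s_obs + f1 * f1 / (2 * f2)
    else:                             # bias-corrected form when f2 = 0
        s_chao = s_obs + f1 * (f1 - 1) / 2
    return s_obs / s_chao

# Abundances of the 10 species recorded in one grid cell (illustrative)
cell = [12, 8, 5, 5, 3, 2, 2, 1, 1, 1]
completeness = chao1_completeness(cell)   # 10 / 12.25 ≈ 0.816
```

Cells whose completeness exceeds a chosen threshold would be treated as reliably inventoried; many singletons relative to doubletons pull the score down, flagging an under-surveyed cell.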

14.
Barabesi L  Pisani C 《Biometrics》2002,58(3):586-592
In practical ecological sampling studies, a certain design (such as plot sampling or line-intercept sampling) is usually replicated more than once. For each replication, the Horvitz-Thompson estimation of the objective parameter is considered. Finally, an overall estimator is achieved by averaging the single Horvitz-Thompson estimators. Because the design replications are drawn independently and under the same conditions, the overall estimator is simply the sample mean of the Horvitz-Thompson estimators under simple random sampling. This procedure may be wisely improved by using ranked set sampling. Hence, we propose the replicated protocol under ranked set sampling, which gives rise to a more accurate estimation than the replicated protocol under simple random sampling.  相似文献   
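The replicated protocol described above reduces, under SRS, to averaging independent Horvitz-Thompson estimates. A minimal sketch with invented inclusion probabilities and observations:

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Horvitz-Thompson estimator of a population total for one design
    replication: each observed value is weighted by the inverse of its
    first-order inclusion probability."""
    return sum(y / p for y, p in zip(values, inclusion_probs))

# Three independent replications of the same design (illustrative data);
# the overall estimate is the plain mean of the per-replication
# HT estimators, as in the replicated protocol under SRS.
reps = [
    ([4.0, 7.0], [0.2, 0.5]),
    ([3.0, 9.0], [0.2, 0.5]),
    ([5.0, 6.0], [0.2, 0.5]),
]
estimates = [horvitz_thompson_total(v, p) for v, p in reps]
overall = sum(estimates) / len(estimates)
```

The paper's proposal keeps this structure but selects which replications to measure via ranked set sampling, so the averaged estimator is more accurate than the plain SRS mean of HT estimates shown here.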

15.
Despite the availability of newer approaches, traditional hierarchical clustering remains very popular in genetic diversity studies in plants. However, little is known about its suitability for molecular marker data. We studied the performance of traditional hierarchical clustering techniques using real and simulated molecular marker data, and compared it with model-based clustering (STRUCTURE). We showed that the cophenetic correlation coefficient is directly related to subgroup differentiation and can thus be used as an indicator of the presence of genetically distinct subgroups in germplasm collections. Whereas UPGMA performed well in preserving distances between accessions, Ward excelled in recovering groups. Our results also showed a close similarity between clusters obtained by Ward and by STRUCTURE. Traditional cluster analysis can provide an easy and effective way of determining structure in germplasm collections using molecular marker data, and the output can be used for sampling core collections or for association studies.

16.
A genetic model with genotype×environment (GE) interactions for controlling systematic errors in the field can be used for predicting genotypic values by an adjusted unbiased prediction (AUP) method. Mahalanobis distance, calculated from the genotypic values, is then applied to measure the genetic distance among accessions. The unweighted pair-group average, Ward's and complete linkage methods of hierarchical clustering, combined with three sampling strategies, are proposed for constructing core collections in a stepwise clustering procedure. A homogeneity test and t-tests are suggested for testing variances and means, respectively. The coincidence rate (CR%) for range and the variable rate (VR%) for the coefficient of variation are designed to evaluate the properties of core collections. A worked example of constructing core collections in cotton with 21 traits was conducted. Random sampling can represent the genetic diversity structure of the initial collection. Preferred sampling can keep the accessions with special or valuable characteristics in the initial collection. Deviation sampling can retain the larger genetic variability of the initial collection. For better representation of the core collection, cluster methods should be combined with different sampling strategies. The core collections based on genotypic values retained larger genetic variability and were more representative than those based on phenotypic values. Received: 15 October 1999 / Accepted: 24 November 1999

17.
The birth-death process is widely used in phylogenetics to model speciation and extinction. Recent studies have shown that the inferred rates are sensitive to assumptions about the sampling probability of lineages. Here, we examine the effect of the method used to sample lineages. Whereas previous studies have assumed random sampling (RS), we consider two extreme cases of biased sampling: "diversified sampling" (DS), where tips are selected to maximize diversity, and "cluster sampling" (CS), where sample diversity is minimized. DS appears to be standard practice, for example, in analyses of higher taxa, whereas CS may occur under special circumstances, for example, in studies of geographically defined floras or faunas. Using both simulations and analyses of empirical data, we show that inferred rates may be heavily biased if the sampling strategy is not modeled correctly. In particular, when a diversified sample is treated as if it were a random or complete sample, the extinction rate is severely underestimated, often close to 0. Such dramatic errors may lead to serious consequences, for example, if estimated rates are used in assessing the vulnerability of threatened species to extinction. Using Bayesian model testing across 18 empirical data sets, we show that DS is commonly a better fit to the data than complete, random, or cluster sampling. Inappropriate modeling of the sampling method may at least partly explain anomalous results that have previously been attributed to variation over time in birth and death rates.

18.
Lot quality assurance sampling (LQAS) surveys are commonly used for monitoring and evaluation in resource-limited settings. Recently several methods have been proposed to combine LQAS with cluster sampling for more timely and cost-effective data collection. For some of these methods, the standard binomial model can be used for constructing decision rules as the clustering can be ignored. For other designs, considered here, clustering is accommodated in the design phase. In this paper, we compare these latter cluster LQAS methodologies and provide recommendations for choosing a cluster LQAS design. We compare technical differences in the three methods and determine situations in which the choice of method results in a substantively different design. We consider two different aspects of the methods: the distributional assumptions and the clustering parameterization. Further, we provide software tools for implementing each method and clarify misconceptions about these designs in the literature. We illustrate the differences in these methods using vaccination and nutrition cluster LQAS surveys as example designs. The cluster methods are not sensitive to the distributional assumptions but can result in substantially different designs (sample sizes) depending on the clustering parameterization. However, none of the clustering parameterizations used in the existing methods appears to be consistent with the observed data, and, consequently, choice between the cluster LQAS methods is not straightforward. Further research should attempt to characterize clustering patterns in specific applications and provide suggestions for best-practice cluster LQAS designs on a setting-specific basis.  相似文献   
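The standard (non-clustered) binomial decision rule that the cluster methods build on can be sketched directly: search for the smallest sample size n and decision rule d such that both classification risks stay below their targets. The 50%/80% coverage thresholds and 10% risks below are illustrative parameters, not a recommendation from the paper:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def lqas_design(p_low, p_high, alpha=0.10, beta=0.10, n_max=200):
    """Smallest standard LQAS design (n, d): classify a lot as
    acceptable when more than d successes are seen among n subjects.
    Risks: P(accept | true coverage p_low) <= alpha and
           P(reject | true coverage p_high) <= beta."""
    for n in range(1, n_max + 1):
        for d in range(n + 1):
            accept_bad = 1.0 - binom_cdf(d, n, p_low)
            reject_good = binom_cdf(d, n, p_high)
            if accept_bad <= alpha and reject_good <= beta:
                return n, d
    return None

# e.g. vaccination coverage: reject below 50%, accept above 80%
n, d = lqas_design(0.50, 0.80)
```

The cluster LQAS methods compared in the paper replace this binomial model with designs that inflate n according to a clustering parameterization, which is why the choice of that parameterization can change the required sample size substantially.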

19.
Ranked set sampling (RSS) as suggested by McIntyre (1952) may be modified to introduce a new sampling method, called pair ranked set sampling (PRSS), which might be used in some areas of application in place of RSS to increase the efficiency of estimators relative to the simple random sampling (SRS) method. Estimators of the population mean are considered, and an example using real data is presented to illustrate the computations.

20.
Empirical Bayes Gibbs sampling
The wide applicability of Gibbs sampling has increased the use of more complex and multi-level hierarchical models. To use these models entails dealing with hyperparameters in the deeper levels of a hierarchy. There are three typical methods for dealing with these hyperparameters: specify them, estimate them, or use a 'flat' prior. Each of these strategies has its own associated problems. In this paper, using an empirical Bayes approach, we show how the hyperparameters can be estimated in a way that is both computationally feasible and statistically valid.  相似文献   
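The empirical Bayes idea can be sketched on the simplest hierarchy, the normal means model y_i ~ N(θ_i, s²), θ_i ~ N(μ, τ²): estimate the hyperparameters μ and τ² from the marginal distribution of the data, then run the sampler with those estimates plugged in. This is a minimal illustration of the plug-in step, not the paper's general algorithm; the data and the method-of-moments estimate of τ² are illustrative, and with the hyperparameters fixed the θ full conditionals here happen to be independent, so one Gibbs scan per iteration suffices:

```python
import random
import statistics

random.seed(11)
s2 = 1.0                                  # known sampling variance
y = [2.1, 0.4, -1.3, 3.0, 0.8, -0.2, 1.5, 2.4]

# Empirical Bayes step: marginally y_i ~ N(mu, tau2 + s2), so estimate
# the hyperparameters from the data instead of giving them a flat prior.
mu_hat = statistics.fmean(y)
tau2_hat = max(statistics.variance(y) - s2, 1e-6)   # method of moments

def gibbs_theta(n_iter=2000):
    """Sample each theta_i from its full conditional
    N(B*mu_hat + (1-B)*y_i, (1-B)*s2), with B = s2/(s2 + tau2_hat)."""
    b = s2 / (s2 + tau2_hat)
    draws = [[] for _ in y]
    for _ in range(n_iter):
        for i, yi in enumerate(y):
            mean = b * mu_hat + (1 - b) * yi
            draws[i].append(random.gauss(mean, ((1 - b) * s2) ** 0.5))
    return [statistics.fmean(d) for d in draws]

post_means = gibbs_theta()
```

The posterior means shrink each observation toward the estimated grand mean, with the amount of shrinkage set by the data-driven τ̂² rather than by a specified or flat hyperprior.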


Copyright©北京勤云科技发展有限公司  京ICP备09084417号