Similar articles
 20 similar articles found (search time: 15 ms)
1.
We discuss design and analysis of longitudinal studies after case–control sampling, wherein interest is in the relationship between a longitudinal binary response that is related to the sampling (case–control) variable, and a set of covariates. We propose a semiparametric modeling framework based on a marginal longitudinal binary response model and an ancillary model for subjects' case–control status. In this approach, the analyst must posit the population prevalence of being a case, which is then used to compute an offset term in the ancillary model. Parameter estimates from this model are used to compute offsets for the longitudinal response model. Examining the impact of population prevalence and ancillary model misspecification, we show that time‐invariant covariate parameter estimates, other than the intercept, are reasonably robust, but intercept and time‐varying covariate parameter estimates can be sensitive to such misspecification. We study design and analysis issues impacting study efficiency, namely: choice of sampling variable and the strength of its relationship to the response, sample stratification, choice of working covariance weighting, and degree of flexibility of the ancillary model. The research is motivated by a longitudinal study following case–control sampling of the time course of attention deficit hyperactivity disorder (ADHD) symptoms.

2.
Estimating the size of hidden populations is essential to understand the magnitude of social and healthcare needs, risk behaviors, and disease burden. However, due to the hidden nature of these populations, they are difficult to survey, and there are no gold standard size estimation methods. Many different methods and variations exist, and diagnostic tools are needed to help researchers assess method-specific assumptions as well as compare between methods. Further, because many necessary mathematical assumptions are unrealistic for real survey implementation, assessment of how robust methods are to deviations from the stated assumptions is essential. We describe diagnostics and assess the performance of a new population size estimation method, capture–recapture with successive sampling population size estimation (CR-SS-PSE), which we apply to data from 3 years of studies from three cities and three hidden populations in Armenia. CR-SS-PSE relies on data from two sequential respondent-driven sampling surveys and extends the successive sampling population size estimation (SS-PSE) framework by using the number of individuals in the overlap between the two surveys and a model for the successive sampling process to estimate population size. We demonstrate that CR-SS-PSE is more robust to violations of successive sampling assumptions than SS-PSE. Further, we compare the CR-SS-PSE estimates to population size estimations using other common methods, including unique object and service multipliers, wisdom of the crowd, and two-source capture–recapture to illustrate volatility across estimation methods.
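For orientation, the two-source capture–recapture comparison mentioned in this abstract can be illustrated with the classical Lincoln–Petersen estimator and Chapman's bias-corrected variant. This is a minimal sketch of those textbook estimators only; it is not the CR-SS-PSE method, and the function names and survey counts are hypothetical.

```python
def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Classical two-source estimate: N = n1 * n2 / m.

    n1, n2  -- number of individuals seen in the first and second survey
    overlap -- number of individuals seen in both surveys (m)
    """
    if overlap <= 0:
        raise ValueError("the estimator is undefined without any overlap")
    return n1 * n2 / overlap


def chapman(n1: int, n2: int, overlap: int) -> float:
    """Chapman's variant, which stays finite even when the overlap is zero."""
    return (n1 + 1) * (n2 + 1) / (overlap + 1) - 1


# Hypothetical counts, for illustration only.
print(lincoln_petersen(200, 150, 30))
print(round(chapman(200, 150, 30), 2))
```

Both estimators assume a closed population and equal capture probability within each survey, which is exactly the kind of assumption the abstract says is hard to meet for hidden populations.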

3.
Wang YG 《Biometrics》1999,55(3):984-989
Troxel, Lipsitz, and Brennan (1997, Biometrics 53, 857-869) considered parameter estimation from survey data with nonignorable nonresponse and proposed weighted estimating equations to remove the biases in the complete-case analysis that ignores missing observations. This paper suggests two alternative modifications for unbiased estimation of regression parameters when a binary outcome is potentially observed at successive time points. The weighting approach of Robins, Rotnitzky, and Zhao (1995, Journal of the American Statistical Association 90, 106-121) is also modified to obtain unbiased estimating functions. The suggested estimating functions are unbiased only when the missingness probability is correctly specified, and misspecification of the missingness model will result in biases in the estimates. Simulation studies are carried out to assess the performance of different methods when the covariate is binary or normal. For the simulation models used, the relative efficiency of the two new methods to the weighting methods is about 3.0 for the slope parameter and about 2.0 for the intercept parameter when the covariate is continuous and the missingness probability is correctly specified. All methods produce substantial biases in the estimates when the missingness model is misspecified or underspecified. Analysis of data from a medical survey illustrates the use and possible differences of these estimating functions.

4.
Estimating recombination rates from population genetic data.   Cited by: 21 (self-citations: 0, citations by others: 21)
P Fearnhead  P Donnelly 《Genetics》2001,159(3):1299-1318
We introduce a new method for estimating recombination rates from population genetic data. The method uses a computationally intensive statistical procedure (importance sampling) to calculate the likelihood under a coalescent-based model. Detailed comparisons of the new algorithm with two existing methods (the importance sampling method of Griffiths and Marjoram and the MCMC method of Kuhner and colleagues) show it to be substantially more efficient. (The improvement over the existing importance sampling scheme is typically by four orders of magnitude.) The existing approaches not infrequently led to misleading results on the problems we investigated. We also performed a simulation study to look at the properties of the maximum-likelihood estimator of the recombination rate and its robustness to misspecification of the demographic model.

5.
Although many studies have reported human polymorphism data, there has been no analysis of the effect of sampling design on the patterns of variability recovered. Here, we consider which factors affect a summary of the allele-frequency spectrum. The most important variable to emerge from our analysis is the number of ethnicities sampled: studies that sequence individuals from more ethnicities recover more rare alleles. These observations are consistent with fine-scale geographic differentiation as well as population growth. They suggest that the geographic sampling strategy should be considered carefully, especially when the aim is to infer the demographic history of humans.

6.
Good–Turing frequency estimation (Good, 1953) is a simple, effective method for predicting detection probabilities of objects of both observed and unobserved classes based on observed frequencies of classes in a sample. The method has been used widely in several disciplines, such as information retrieval, computational linguistics, text recognition, and ecological diversity estimation. Nevertheless, existing studies assume sampling with replacement or sampling from an infinite population, which might be inappropriate for many practical applications. In light of this limitation, this article presents a modification of the Good–Turing estimation method to account for finite population sampling. We provide three practical extensions of the modified method, and we examine performance of the modified method and its extensions in simulation experiments.
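As background for this abstract, the classical (unsmoothed, infinite-population) Good–Turing quantities can be sketched as follows. This is not the finite-population modification the article proposes; the function names and the toy sample are hypothetical.

```python
from collections import Counter


def unseen_mass(sample) -> float:
    """Estimated total probability of classes not yet observed: N1 / N,
    where N1 is the number of classes seen exactly once and N the sample size."""
    counts = Counter(sample)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("sample must not be empty")
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


def adjusted_count(sample, r: int) -> float:
    """Good-Turing adjusted count r* = (r + 1) * N_{r+1} / N_r,
    where N_r is the number of classes observed exactly r times."""
    freq_of_freq = Counter(Counter(sample).values())
    if freq_of_freq[r] == 0:
        raise ValueError(f"no class was observed exactly {r} times")
    return (r + 1) * freq_of_freq[r + 1] / freq_of_freq[r]


# Toy sample: a x3, b x2, c x1, d x2, e x1.
toy = list("aaabbcdde")
print(unseen_mass(toy))        # 2 singletons out of 9 observations
print(adjusted_count(toy, 1))  # (1 + 1) * N2 / N1
```

In practice the raw frequency-of-frequency counts are usually smoothed before use, because N_{r+1} is often zero for large r; the sketch omits that step.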

7.
The purpose of many wildlife population studies is to estimate density, movement, or demographic parameters. Linking these parameters to covariates, such as habitat features, provides additional ecological insight and can be used to make predictions for management purposes. Line‐transect surveys, combined with distance sampling methods, are often used to estimate density at discrete points in time, whereas capture–recapture methods are used to estimate movement and other demographic parameters. Recently, open population spatial capture–recapture models have been developed, which simultaneously estimate density and demographic parameters, but have been made available only for data collected from a fixed array of detectors and have not incorporated the effects of habitat covariates. We developed a spatial capture–recapture model that can be applied to line‐transect survey data by modeling detection probability in a manner analogous to distance sampling. We extend this model to a) estimate demographic parameters using an open population framework and b) model variation in density and space use as a function of habitat covariates. The model is illustrated using simulated data and aerial line‐transect survey data for North Atlantic right whales in the southeastern United States, which also demonstrates the ability to integrate data from multiple survey platforms and accommodate differences between strata or demographic groups. When individuals detected from line‐transect surveys can be uniquely identified, our model can be used to simultaneously make inference on factors that influence spatial and temporal variation in density, movement, and population dynamics.

8.
Leblois R  Rousset F  Estoup A 《Genetics》2004,166(2):1081-1092
Drift and migration disequilibrium are very common in animal and plant populations. Yet their impact on methods of estimation of demographic parameters has rarely been evaluated, especially in complex realistic population models. The effect of such disequilibria on the estimation of demographic parameters depends on the population model, the statistics, and the genetic markers used. Here we considered the estimation of the product Dσ² from individual microsatellite data, where D is the density of adults and σ² the average squared axial parent-offspring distance in a continuous population evolving under isolation by distance. A coalescence-based simulation algorithm was used to study the effect on Dσ² estimation of temporal and spatial fluctuations of demographic parameters. Estimation of present-time Dσ² values was found to be robust to temporal changes in dispersal, to density reduction, and to spatial expansions with constant density, even for relatively recent changes (i.e., a few tens of generations ago). By contrast, density increase in the recent past gave Dσ² estimations biased largely toward past demographic parameter values. The method was also robust to spatial heterogeneity in density and estimated local demographic parameters when the density is homogeneous around the sampling area (e.g., on a surface that equals four times the sampling area). Hence, in the limit of the situations studied in this article, and with the exception of the case of density increase, temporal and spatial fluctuations of demographic parameters appear to have a limited influence on the estimation of local and present-time demographic parameters with the method studied.

9.
Researchers interested in studying populations that are difficult to reach through traditional survey methods can now draw on a range of methods to access these populations. Yet many of these methods are more expensive and difficult to implement than studies using conventional sampling frames and trusted sampling methods. The network scale-up method (NSUM) provides a middle ground for researchers who wish to estimate the size of a hidden population, but lack the resources to conduct a more specialized hidden population study. Through this method it is possible to generate population estimates for a wide variety of groups that are perhaps unwilling to self-identify as such (for example, users of illegal drugs or other stigmatized populations) via traditional survey tools such as telephone or mail surveys, by asking a representative sample to estimate the number of people they know who are members of such a “hidden” subpopulation. The original estimator is formulated to minimize the weight a single scaling variable can exert upon the estimates. We argue that this introduces hidden and difficult-to-predict biases, and instead propose a series of methodological advances on the traditional scale-up estimation procedure, including a new estimator. Additionally, we formalize the incorporation of sample weights into the network scale-up estimation process, and propose a recursive process of back estimation “trimming” to identify and remove poorly performing predictors from the estimation process. To demonstrate these suggestions we use data from a network scale-up mail survey conducted in Nebraska during 2014. We find that using the new estimator and recursive trimming process provides more accurate estimates, especially when used in conjunction with sampling weights.
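The basic idea behind network scale-up can be sketched with the commonly cited ratio form: each respondent's personal network size is scaled up from how many people they know in subpopulations of known size, and the hidden population is then estimated from the share of hidden-population members among all reported contacts. The sketch below shows that basic form only, without sample weights or trimming; it is not the new estimator the abstract proposes, and all names and numbers are hypothetical.

```python
def estimate_degree(known_counts, known_sizes, total_population):
    """Personal network size for one respondent.

    known_counts -- how many people the respondent knows in each known subpopulation
    known_sizes  -- the true sizes of those subpopulations
    """
    return total_population * sum(known_counts) / sum(known_sizes)


def scale_up_estimate(hidden_counts, degrees, total_population):
    """Hidden population size: N * (sum of reported hidden contacts) / (sum of degrees)."""
    return total_population * sum(hidden_counts) / sum(degrees)


# Hypothetical survey: three respondents in a population of one million.
N = 1_000_000
degree = estimate_degree([2, 3], [10_000, 15_000], N)
print(degree)
print(scale_up_estimate([1, 0, 2], [200, 300, 100], N))
```

Pooling the sums across respondents, rather than averaging per-respondent ratios, keeps respondents with very small estimated networks from dominating the result.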

10.
Plant population responses are key to understanding the effects of threats such as climate change and invasions. However, we lack demographic data for most species, and the data we have are often geographically aggregated. We determined to what extent existing data can be extrapolated to predict population performance across larger sets of species and spatial areas. We used 550 matrix models, across 210 species, sourced from the COMPADRE Plant Matrix Database, to model how climate, geographic proximity and phylogeny predicted population performance. Models including only geographic proximity and phylogeny explained 5–40% of the variation in four key metrics of population performance. However, there was poor extrapolation between species and extrapolation was limited to geographic scales smaller than those at which landscape scale threats typically occur. Thus, demographic information should only be extrapolated with caution. Capturing demography at scales relevant to landscape level threats will require more geographically extensive sampling.

11.
We propose a likelihood ratio test to assess that sampling has been completed in closed population size estimation studies. More precisely, we assess if the expected number of subjects that have never been sampled is below a user-specified threshold. The likelihood ratio test statistic has a nonstandard distribution under the null hypothesis. Critical values can be easily approximated and tabulated, and they do not depend on model specification. We illustrate in a simulation study and three real data examples, one of which involves ascertainment bias of amyotrophic lateral sclerosis in Gulf War veterans.

12.
An increasing number of health services researchers are using multilevel analysis for evaluating health care performance. This method has the distinct advantage of accounting for within-provider correlation among patients. Alternatively, in a similar manner, estimators based on cluster sampling can also adjust for within-provider correlation. Cluster sampling methods do not require assumptions about error distribution as multilevel analysis does. To our knowledge, no comparison has been made between multilevel analysis and cluster sampling estimators in evaluating health care performance using either a simulated or real dataset. In this paper, we compare the cluster sampling estimators to multilevel estimators in evaluating screening mammography performance using Medicare claims data. We also discuss the strengths and limitations of multilevel analysis in profiling health care providers with small caseloads.

13.
The composite-likelihood estimator (CLE) of the population recombination rate considers only sites with exactly two alleles under a finite-sites mutation model (McVean, G. A. T., P. Awadalla, and P. Fearnhead. 2002. A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160:1231-1241). While in such a model the identity of alleles is not considered, the CLE has been shown to be robust to minor misspecification of the underlying mutational model. However, there are many situations where the putative mutation and demographic history can be quite complex. One good example is rapidly evolving pathogens, like HIV-1. First we evaluated the performance of the CLE and the likelihood permutation test (LPT) under more complex, realistic models, including a general time reversible (GTR) substitution model, rate heterogeneity among sites (Gamma), positive selection, population growth, population structure, and noncontemporaneous sampling. Second, we relaxed some of the assumptions of the CLE allowing for a four-allele, GTR + Gamma model in an attempt to use the data more efficiently. Through simulations and the analysis of real data, we concluded that the CLE is robust to severe misspecifications of the substitution model, but underestimates the recombination rate in the presence of exponential growth, population mixture, selection, or noncontemporaneous sampling. In such cases, the use of more complex models slightly increases performance on some occasions, especially in the case of the LPT. Thus, our results provide for a more robust application of the estimation of recombination rates.

14.
Flexible multilevel models are proposed to allow for cluster-specific smooth estimation of growth curves in a mixed-effects modeling format that includes subject-specific random effects on the growth parameters. Attention is then focused on models that examine between-cluster comparisons of the effects of an ecologic covariate of interest (e.g. air pollution) on nonlinear functionals of growth curves (e.g. maximum rate of growth). A Gibbs sampling approach is used to get posterior mean estimates of nonlinear functionals along with their uncertainty estimates. A second-stage ecologic random-effects model is used to examine the association between a covariate of interest (e.g. air pollution) and the nonlinear functionals. A unified estimation procedure is presented along with its computational and theoretical details. The models are motivated by, and illustrated with, lung function and air pollution data from the Southern California Children's Health Study.

15.
This paper compares the distribution, sampling and estimation of abundance for two animal species in an African ecosystem by means of an intensive simulation of the sampling process under a geographical information system (GIS) environment. It focuses on systematic and random sampling designs, commonly used in wildlife surveys, comparing their performance to an adaptive design at three increasing sampling intensities, using the root mean square errors (RMSE). It further assesses the impact of sampling designs and intensities on estimates of population parameters. The simulation is based on data collected during a prior survey, in which geographical locations of all observed animals were recorded. This provides more detailed data than that usually available from transect surveys. The results show precision of estimates to increase with increasing sampling intensity, while no significant differences are observed between estimates obtained under random and systematic designs. An increase in precision is observed for the adaptive design, thereby validating the use of this design for sampling clustered populations. The study illustrates the benefits of combining statistical methods with GIS techniques to increase insight into wildlife population dynamics.

16.
Obtaining useful estimates of wildlife abundance or density requires thoughtful attention to potential sources of bias and precision, and it is widely understood that addressing incomplete detection is critical to appropriate inference. When the underlying assumptions of sampling approaches are violated, both increased bias and reduced precision of the population estimator may result. Bear (Ursus spp.) populations can be difficult to sample and are often monitored using mark‐recapture distance sampling (MRDS) methods, although obtaining adequate sample sizes can be cost prohibitive. With the goal of improving inference, we examined the underlying methodological assumptions and estimator efficiency of three datasets collected under an MRDS protocol designed specifically for bears. We analyzed these data using MRDS, conventional distance sampling (CDS), and open‐distance sampling approaches to evaluate the apparent bias‐precision tradeoff relative to the assumptions inherent under each approach. We also evaluated the incorporation of informative priors on detection parameters within a Bayesian context. We found that the CDS estimator had low apparent bias and was more efficient than the more complex MRDS estimator. When combined with informative priors on the detection process, precision was increased by >50% compared to the MRDS approach with little apparent bias. In addition, open‐distance sampling models revealed a serious violation of the assumption that all bears were available to be sampled. Inference is directly related to the underlying assumptions of the survey design and the analytical tools employed. We show that for aerial surveys of bears, avoidance of unnecessary model complexity, use of prior information, and the application of open population models can be used to greatly improve estimator performance and simplify field protocols. Although we focused on distance sampling‐based aerial surveys for bears, the general concepts we addressed apply to a variety of wildlife survey contexts.

17.
Austerlitz F  Kalaydjieva L  Heyer E 《Genetics》2003,165(3):1579-1586
The frequency of a rare mutant allele and the level of allelic association between this allele and one or several closely linked markers are frequently measured in genetic epidemiology. Both quantities are related to the time elapsed since the appearance of the mutation in the population and the intrinsic growth rate of the mutation (which may be different from the average population growth rate). Here, we develop a method that uses these two kinds of genetic data to perform a joint estimation of the age of the mutation and the minimum growth rate that is compatible with its present frequency. In the absence of demographic data, it provides a useful estimate of population growth rate. When such data are available, contrasts among estimates from several loci allow demographic processes, affecting all loci similarly, to be distinguished from selection, affecting loci differently. Testing these estimates on populations for which data are available for several disorders shows good congruence with demographic data in some cases whereas in others higher growth rates are obtained, which may be the result of selection or hidden demographic processes.

18.
A model is developed that treats migration rates among populations as a function of the geographic distance between them and the sizes of both the source and recipient populations. Specifically, $m_{ij}/m_{jj} = a (N_i/N_j)^{p} e^{-bd}$, where $m_{ij}/m_{jj}$ is the relative migration rate into population $j$ from population $i$, $N_i$ is the size of the source population, $N_j$ is the size of the recipient population, $d$ is the geographic distance between populations $i$ and $j$, $p$ is a measure of differential density-dependence, $b$ is a measure of distance decay, and $a$ is an adjustment parameter with little demographic meaning. Methods of parameter estimation and hypothesis testing using maximum likelihood are outlined. These methods are applied to migration matrix data from 13 samples obtained from the literature representing a wide range of ecological settings. All samples show a significant effect of geographic distance on migration, and all but one show a significant effect of differential population size. All but one sample show an overall tendency for migration to be negative density-dependent; that is, the relative migration rate is greater from larger populations to smaller populations than the reverse.
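To make the functional form concrete, the relative migration rate a · (N_i / N_j)^p · e^(−b·d) stated in this abstract can be evaluated directly. This is only an illustrative sketch: the function name and every parameter value are hypothetical, and the maximum-likelihood fitting described in the abstract is not shown.

```python
import math


def relative_migration_rate(a, n_source, n_recipient, p, b, distance):
    """Relative migration rate into the recipient population from the source:
    a * (n_source / n_recipient) ** p * exp(-b * distance)."""
    if n_source <= 0 or n_recipient <= 0:
        raise ValueError("population sizes must be positive")
    return a * (n_source / n_recipient) ** p * math.exp(-b * distance)


# Hypothetical parameter values, chosen only to show the distance decay.
for d in (0, 10, 20):
    rate = relative_migration_rate(a=2.0, n_source=100, n_recipient=100,
                                   p=0.5, b=0.1, distance=d)
    print(d, round(rate, 4))
```

With equal population sizes the size term equals 1, so the printed values isolate the exponential decay with distance governed by b.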

19.
Pybus OG  Rambaut A  Harvey PH 《Genetics》2000,155(3):1429-1437
We describe a unified set of methods for the inference of demographic history using genealogies reconstructed from gene sequence data. We introduce the skyline plot, a graphical, nonparametric estimate of demographic history. We discuss both maximum-likelihood parameter estimation and demographic hypothesis testing. Simulations are carried out to investigate the statistical properties of maximum-likelihood estimates of demographic parameters. The simulations reveal that (i) the performance of exponential growth model estimates is determined by a simple function of the true parameter values and (ii) under some conditions, estimates from reconstructed trees perform as well as estimates from perfect trees. We apply our methods to HIV-1 sequence data and find strong evidence that subtypes A and B have different demographic histories. We also provide the first (albeit tentative) genetic evidence for a recent decrease in the growth rate of subtype B.

20.
Many recent studies have demonstrated a negative effect of small population size on single plant traits. However, not much is known about the actual consequences of reduced plant performance on the long-term prospect of species survival. I studied the effect of population size on population growth rate and survival probability in the rare perennial herb Scorzonera hispanica occurring in fragmented grasslands. Its performance was measured using several traits related to reproduction in 21 populations ranging in size from 3 to 2475 plants. These data were then connected with data on full demography of the species from three of the studied populations. Two different matrix models differing in the number of transitions based on measurements in the populations differing in size were used to explore the relationship between population size and population growth rate. Both matrix models showed that despite the decline in seed production in small populations, population growth rate is never significantly different from one, and the populations could thus be expected to survive in the long run. Calculations of extinction probabilities that take into account demographic and environmental stochasticity, however, showed that populations below 100 flowering individuals have a high probability to become extinct. This demonstrates that demographic and environmental stochasticity is an important driver of the fate of small populations in this system. This study demonstrates that estimation of population growth rate can provide new insights into the effect of population size on population growth and survival. It also shows how matrix models enable the combination of various pieces of information about the single populations into one overall measure, and may provide a useful tool for the standardization of studies on the effects of population size on population performance.
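In matrix population models of the kind this abstract refers to, the asymptotic population growth rate is the dominant eigenvalue of the stage-transition matrix. A minimal sketch using power iteration is given below; the two-stage matrix is invented for illustration and is not taken from the study, and the function name is hypothetical.

```python
def dominant_growth_rate(matrix, iterations=200):
    """Approximate the dominant eigenvalue of a non-negative projection matrix.

    The stage vector is kept normalized to sum to 1, so the total of the
    projected vector is the one-step growth factor of the whole population.
    """
    n = len(matrix)
    stage = [1.0 / n] * n
    growth = 0.0
    for _ in range(iterations):
        projected = [sum(matrix[i][j] * stage[j] for j in range(n))
                     for i in range(n)]
        growth = sum(projected)
        if growth == 0.0:
            return 0.0  # the population has died out
        stage = [x / growth for x in projected]
    return growth


# Hypothetical two-stage matrix (rows: contributions to stage 1 and stage 2).
toy_matrix = [[0.5, 2.0],
              [0.25, 0.0]]
print(round(dominant_growth_rate(toy_matrix), 6))
```

A growth rate near one, as in this toy matrix, corresponds to a population that is neither growing nor declining in the deterministic model; the abstract's point is that stochastic extinction risk can still be high for small populations in that situation.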
