Similar articles
 20 similar articles found
1.
2.
Most species are structured and influenced by processes that either increased or reduced gene flow between populations. However, most population genetic inference methods assume panmixia and reconstruct a history characterized by population size changes. This is potentially problematic as population structure can generate spurious signals of population size change through time. Moreover, when the model assumed for demographic inference is misspecified, genomic data will likely increase the precision of misleading if not meaningless parameters. For instance, if data were generated under an n-island model (characterized by the number of islands and migrants exchanged) inference based on a model of population size change would produce precise estimates of a bottleneck that would be meaningless. In addition, archaeological or climatic events around the bottleneck's timing might provide a reasonable but potentially misleading scenario. In a context of model uncertainty (panmixia versus structure) genomic data may thus not necessarily lead to improved statistical inference. We consider two haploid genomes and develop a theory that explains why any demographic model with structure will necessarily be interpreted as a series of changes in population size by inference methods ignoring structure. We formalize a parameter, the inverse instantaneous coalescence rate, and show that it is equivalent to a population size only in panmictic models, and is mostly misleading for structured models. We argue that this issue affects all population genetics methods ignoring population structure which may thus infer population size changes that never took place. We apply our approach to human genomic data.
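
The inverse instantaneous coalescence rate (IICR) described in this abstract can be illustrated numerically. The sketch below is not the authors' code. It assumes a symmetric n-island model with two haploid genomes sampled from the same island, time scaled so that two lineages in the same island coalesce at rate 1, and each lineage migrating at rate M/2. The function name `iicr_n_island` and the parameter values are hypothetical. The IICR at time t is taken as P(T > t) / f(t), where T is the coalescence time and f its density.

```python
import numpy as np

def iicr_n_island(times, n_islands, migration_rate):
    """IICR for two lineages sampled in the same island (assumed scaling)."""
    # Rate at which two lineages in different islands end up in the same one.
    a = migration_rate / (n_islands - 1)
    # Sub-generator over the non-coalesced states: [same island, different islands].
    # From "same": coalesce at rate 1, separate at rate M.
    Q = np.array([[-(1.0 + migration_rate), migration_rate],
                  [a, -a]])
    eigvals, eigvecs = np.linalg.eig(Q)
    inv_vecs = np.linalg.inv(eigvecs)
    start = np.array([1.0, 0.0])  # both lineages start in the same island
    values = []
    for t in times:
        # exp(Q t) via the eigendecomposition V diag(exp(lambda t)) V^-1.
        transition = (eigvecs * np.exp(eigvals * t)) @ inv_vecs
        p_same, p_diff = start @ transition
        survival = p_same + p_diff   # P(T > t)
        density = p_same * 1.0       # coalescence only happens in state "same"
        values.append(survival / density)
    return np.array(values)

times = np.linspace(0.0, 20.0, 50)
iicr = iicr_n_island(times, n_islands=10, migration_rate=1.0)
print(iicr[0], iicr[-1])
```

Under these assumptions the IICR equals the local island size at t = 0 and rises toward a plateau further back in time, even though no population size ever changed. A method that assumes panmixia would read that curve as a decline in population size toward the present.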

3.
This work extends the methods of demographic inference based on the distribution of pairwise genetic differences between individuals (mismatch distribution) to the case of linked microsatellite data. Population genetics theory describes the distribution of mutations among a sample of genes under different demographic scenarios. However, the actual number of mutations can rarely be deduced from DNA polymorphisms. The inclusion of mutation models in theoretical predictions can improve the performance of statistical methods. We have developed a maximum-pseudolikelihood estimator for the parameters that characterize a demographic expansion for a series of linked loci evolving under a stepwise mutation model. Those loci would correspond to DNA polymorphisms of linked microsatellites (such as those found on the Y chromosome or the chloroplast genome). The proposed method was evaluated with simulated data sets and with a data set of chloroplast microsatellites that showed signal for demographic expansion in a previous study. The results show that inclusion of a mutational model in the analysis improves the estimates of the age of expansion in the case of older expansions.
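
As a rough illustration of an observed mismatch distribution for linked microsatellites, the sketch below counts pairwise differences between haplotypes. It is an assumption-laden toy, not the estimator from the abstract: the distance between two haplotypes is taken as the summed absolute difference in repeat number across loci, and the function name `mismatch_distribution` and the example data are hypothetical.

```python
from itertools import combinations

import numpy as np

def mismatch_distribution(haplotypes):
    """Histogram of pairwise distances between microsatellite haplotypes.

    haplotypes: array of shape (n_individuals, n_loci) holding repeat counts.
    Distance (assumed here): sum over loci of |difference in repeat number|.
    """
    haplotypes = np.asarray(haplotypes)
    distances = [
        int(np.abs(haplotypes[i] - haplotypes[j]).sum())
        for i, j in combinations(range(len(haplotypes)), 2)
    ]
    # Entry k is the number of pairs that differ by k repeat units in total.
    return np.bincount(distances)

# Three hypothetical haplotypes typed at two linked loci.
example = [[10, 12], [11, 12], [10, 14]]
distribution = mismatch_distribution(example)
print(distribution)
```

Under a stepwise mutation model the difference in repeat number is only a proxy for the number of mutations, because successive steps can cancel. That gap is why the abstract argues for including the mutation model in the inference.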

4.
We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with , the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.
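
The site frequency spectrum that this framework takes as input is a simple summary of the data. The sketch below shows one way to compute an unfolded single-population SFS from a 0/1 haplotype matrix. It assumes the derived allele is coded as 1, and the function name and the example matrix are hypothetical. It does not reproduce the composite-likelihood inference itself.

```python
import numpy as np

def unfolded_sfs(haplotypes):
    """Unfolded SFS from a (n_haplotypes, n_sites) matrix of 0/1 alleles.

    Returns an array of length n_haplotypes - 1 whose entry k - 1 is the number
    of sites where the derived allele (coded 1) appears in exactly k haplotypes.
    """
    haplotypes = np.asarray(haplotypes)
    n_haplotypes = haplotypes.shape[0]
    derived_counts = haplotypes.sum(axis=0)
    full = np.bincount(derived_counts, minlength=n_haplotypes + 1)
    # Drop the monomorphic classes (0 copies and n copies of the derived allele).
    return full[1:n_haplotypes]

# Three hypothetical haplotypes typed at four sites.
example = [[0, 1, 1, 0],
           [0, 1, 0, 0],
           [1, 1, 0, 0]]
sfs = unfolded_sfs(example)
print(sfs)
```

In this example two sites carry the derived allele once, one site is fixed for it, and one site is monomorphic for the ancestral allele, so only the two singletons enter the spectrum.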

5.
As the practice of using population models for wildlife risk assessment has become more common, so has the practice of using surrogate data, typically taken from the published scientific literature, as inputs for demographic models. This practice clearly exposes the user to inferential errors. However, it is likely to continue because demographic data are expensive to gather. We review potential errors associated with the use of previously published demographic data and how those errors propagate into the endpoints of demographic projection models. We suggest methods for inferring bias in model endpoints when multiple and opposing biases are present in the demographic input data. We provide an example using Eastern Meadowlarks (Sturnella magna), a common songbird in Midwestern grasslands and agro-ecosystems. We conclude with a brief review of methods that could improve inference made using published demographic data, including methods from life-history theory, meta-analysis, and Bayesian statistics.

6.
Complex demography and selection at linked sites can generate spurious signatures of divergent selection. Unfortunately, many attempts at demographic inference consider overly simple models and neglect the effect of selection at linked sites. In this issue, Rougemont and Bernatchez (2018) applied an approximate Bayesian computation (ABC) framework that accounts for indirect selection to reveal a complex history of secondary contacts in Atlantic salmon (Salmo salar) that might explain a high rate of latitudinal clines in this species.

7.
Coalescent theory is routinely used to estimate past population dynamics and demographic parameters from genealogies. While early work in coalescent theory only considered simple demographic models, advances in theory have allowed for increasingly complex demographic scenarios to be considered. The success of this approach has led to coalescent-based inference methods being applied to populations with rapidly changing population dynamics, including pathogens like RNA viruses. However, fitting epidemiological models to genealogies via coalescent models remains a challenging task, because pathogen populations often exhibit complex, nonlinear dynamics and are structured by multiple factors. Moreover, it often becomes necessary to consider stochastic variation in population dynamics when fitting such complex models to real data. Using recently developed structured coalescent models that accommodate complex population dynamics and population structure, we develop a statistical framework for fitting stochastic epidemiological models to genealogies. By combining particle filtering methods with Bayesian Markov chain Monte Carlo methods, we are able to fit a wide class of stochastic, nonlinear epidemiological models with different forms of population structure to genealogies. We demonstrate our framework using two structured epidemiological models: a model with disease progression between multiple stages of infection and a two-population model reflecting spatial structure. We apply the multi-stage model to HIV genealogies and show that the proposed method can be used to estimate the stage-specific transmission rates and prevalence of HIV. Finally, using the two-population model we explore how much information about population structure is contained in genealogies and what sample sizes are necessary to reliably infer parameters like migration rates.

8.
Consider a sample of animal abundances collected from one sampling occasion. Our focus is on estimating the number of species in a closed population. In order to conduct a noninformative Bayesian inference when modeling these data, we derive Jeffreys and reference priors from the full likelihood. We assume that the species' abundances are randomly distributed according to a distribution indexed by a finite‐dimensional parameter. We consider two specific cases which assume that the mean abundances are constant or exponentially distributed. The Jeffreys and reference priors are functions of the Fisher information for the model parameters; the information is calculated in part using the linear difference score for integer parameter models (Lindsay & Roeder 1987). The Jeffreys and reference priors perform similarly in a data example we consider. The posteriors based on the Jeffreys and reference priors are proper. (© 2008 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

9.
Although phylogenetic inference of protein-coding sequences continues to dominate the literature, few analyses incorporate evolutionary models that consider the genetic code. This problem is exacerbated by the exclusion of codon-based models from commonly employed model selection techniques, presumably due to the computational cost associated with codon models. We investigated an efficient alternative to standard nucleotide substitution models, in which codon position (CP) is incorporated into the model. We determined the most appropriate model for alignments of 177 RNA virus genes and 106 yeast genes, using 11 substitution models including one codon model and four CP models. The majority of analyzed gene alignments are best described by CP substitution models, rather than by standard nucleotide models, and without the computational cost of full codon models. These results have significant implications for phylogenetic inference of coding sequences as they make it clear that substitution models incorporating CPs not only are a computationally realistic alternative to standard models but may also frequently be statistically superior.

10.
Guedj J, Thiébaut R, Commenges D. Biometrics 2007, 63(4): 1198-1206
The study of dynamical models of HIV infection, based on a system of nonlinear ordinary differential equations (ODE), has considerably improved the knowledge of its pathogenesis. While the first models used simplified ODE systems and analyzed each patient separately, recent works dealt with inference in non-simplified models borrowing strength from the whole sample. The complexity of these models leads to great difficulties for inference, and only the Bayesian approach has been attempted so far. We propose a full likelihood inference, adapting a Newton-like algorithm for these particular models. We consider a relatively complex ODE model for HIV infection and a model for the observations including the issue of detection limits. We apply this approach to the analysis of a clinical trial of antiretroviral therapy (ALBI ANRS 070) and we show that the whole algorithm works well in a simulation study.

11.
We consider inference for the treatment-arm mean difference of an outcome that would have been measured at the end of a randomized follow-up study if, during the course of the study, patients had not initiated a nonrandomized therapy or dropped out. We argue that the treatment-arm mean difference is not identified unless unverifiable assumptions are made. We describe identifying assumptions that are tantamount to postulating relationships between the components of a pattern-mixture model but that can also be interpreted as imposing restrictions on the cause-specific censoring probabilities of a selection model. We then argue that, although sufficient for identification, these assumptions are insufficient for inference due to the curse of dimensionality. We propose reducing dimensionality by specifying semiparametric cause-specific selection models. These models are useful for conducting a sensitivity analysis to examine how inference for the treatment-arm mean difference changes as one varies the magnitude of the cause-specific selection bias over a plausible range. We provide methodology for conducting such sensitivity analysis and illustrate our methods with an analysis of data from the AIDS Clinical Trial Group (ACTG) study 002.

12.
Bayesian Inference in Semiparametric Mixed Models for Longitudinal Data   Total citations: 1 (self-citations: 0, citations by others: 1)
Summary. We consider Bayesian inference in semiparametric mixed models (SPMMs) for longitudinal data. SPMMs are a class of models that use a nonparametric function to model a time effect, a parametric function to model other covariate effects, and parametric or nonparametric random effects to account for the within-subject correlation. We model the nonparametric function using a Bayesian formulation of a cubic smoothing spline, and the random effect distribution using a normal distribution and alternatively a nonparametric Dirichlet process (DP) prior. When the random effect distribution is assumed to be normal, we propose a uniform shrinkage prior (USP) for the variance components and the smoothing parameter. When the random effect distribution is modeled nonparametrically, we use a DP prior with a normal base measure and propose a USP for the hyperparameters of the DP base measure. We argue that the commonly assumed DP prior implies a nonzero mean of the random effect distribution, even when a base measure with mean zero is specified. This implies weak identifiability for the fixed effects, and can therefore lead to biased estimators and poor inference for the regression coefficients and the spline estimator of the nonparametric function. We propose an adjustment using a postprocessing technique. We show that under mild conditions the posterior is proper under the proposed USP, a flat prior for the fixed effect parameters, and an improper prior for the residual variance. We illustrate the proposed approach using a longitudinal hormone dataset, and carry out extensive simulation studies to compare its finite sample performance with existing methods.

13.
The sensitivity and specificity of markers for event times   Total citations: 1 (self-citations: 0, citations by others: 1)
The statistical literature on assessing the accuracy of risk factors or disease markers as diagnostic tests deals almost exclusively with settings where the test, Y, is measured concurrently with disease status D. In practice, however, disease status may vary over time and there is often a time lag between when the marker is measured and the occurrence of disease. One example concerns the Framingham risk score (FR-score) as a marker for the future risk of cardiovascular events, events that occur after the score is ascertained. To evaluate such a marker, one needs to take the time lag into account since the predictive accuracy may be higher when the marker is measured closer to the time of disease occurrence. We therefore consider inference for sensitivity and specificity functions that are defined as functions of time. Semiparametric regression models are proposed. Data from a cohort study are used to estimate model parameters. One issue that arises in practice is that event times may be censored. In this research, we extend in several respects the work by Leisenring et al. (1997) that dealt only with parametric models for binary tests and uncensored data. We propose semiparametric models that accommodate continuous tests and censoring. Asymptotic distribution theory for parameter estimates is developed and procedures for making statistical inference are evaluated with simulation studies. We illustrate our methods with data from the Cardiovascular Health Study, relating the FR-score measured at enrollment to subsequent risk of cardiovascular events.

14.
As the field of phylogeography has continued to move in the model‐based direction, researchers continue struggling to construct useful models for inference. These models must be both simple enough to be tractable yet contain enough of the complexity of the natural world to make meaningful inference. Beyond constructing such models for inference, researchers explore model space and test competing models with the data on hand, with the goal of improving the understanding of the natural world and the processes underlying natural biological communities. Approximate Bayesian computation (ABC) has increased in recent popularity as a tool for evaluating alternative historical demographic models given population genetic samples. As a thorough demonstration, Pelletier & Carstens (2014) use ABC to test 143 phylogeographic submodels given geographically widespread genetic samples from the salamander species Plethodon idahoensis (Carstens et al. 2004) and, in so doing, demonstrate how the results of the ABC model choice procedure are dependent on the model set one chooses to evaluate.

15.
Exact inference for growth curves with intraclass correlation structure   Total citations: 2 (self-citations: 0, citations by others: 2)
Weerahandi S, Berger VW. Biometrics 1999, 55(3): 921-924
We consider repeated observations taken over time for each of several subjects. For example, one might consider the growth curve of a cohort of babies over time. We assume a simple linear growth curve model. Exact results based on sufficient statistics (exact tests of the null hypothesis that a coefficient is zero, or exact confidence intervals for coefficients) are not available to make inference on regression coefficients when an intraclass correlation structure is assumed. This paper will demonstrate that such exact inference is possible using generalized inference.

16.
Current procedures for inferring population history generally assume complete neutrality: that is, they neglect both direct selection and the effects of selection on linked sites. We here examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increase, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the distribution of fitness effects as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.

17.
Inferring the demographic history of species and their populations is crucial to understand their contemporary distribution, abundance and adaptations. The high computational overhead of likelihood‐based inference approaches severely restricts their applicability to large data sets or complex models. In response to these restrictions, approximate Bayesian computation (ABC) methods have been developed to infer the demographic past of populations and species. Here, we present the results of an evaluation of the ABC‐based approach implemented in the popular software package diyabc using simulated data sets (mitochondrial DNA sequences, microsatellite genotypes and single nucleotide polymorphisms). We simulated population genetic data under five different simple, single‐population models to assess the model recovery rates as well as the bias and error of the parameter estimates. The ability of diyabc to recover the correct model was relatively low (0.49): 0.6 for the simplest models and 0.3 for the more complex models. The recovery rate improved significantly when reducing the number of candidate models from five to three (from 0.57 to 0.71). Among the parameters of interest, the effective population size was estimated at a higher accuracy compared to the timing of events. Increased amounts of genetic data did not significantly improve the accuracy of the parameter estimates. Some gains in accuracy and decreases in error were observed for scaled parameters (e.g., Neμ) compared to unscaled parameters (e.g., Ne and μ). We concluded that diyabc‐based assessments are not suited to capture a detailed demographic history, but might be efficient at capturing simple, major demographic changes.
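
The general logic of approximate Bayesian computation referred to in this abstract (draw parameters from a prior, simulate data, keep the draws whose simulated summary is close to the observed one) can be sketched with a rejection sampler. This is a toy under stated assumptions, not the diyabc algorithm. The model is two haploid genomes at independent loci, with the coalescence time drawn from an exponential distribution and the number of differences Poisson distributed with mean theta times that time. The prior, tolerance, function names, and the seed are all hypothetical choices.

```python
import numpy as np

def simulate_summary(thetas, n_loci, rng):
    """Mean number of pairwise differences per locus, one value per theta."""
    # Scaled coalescence time of two lineages at each locus (assumed Exp(1)).
    coal_times = rng.exponential(1.0, size=(len(thetas), n_loci))
    # Mutations accumulate as a Poisson process along the genealogy.
    differences = rng.poisson(thetas[:, None] * coal_times)
    return differences.mean(axis=1)

def abc_rejection(observed, n_loci, n_draws, tolerance, rng, prior_max=10.0):
    """Keep prior draws whose simulated summary lies within the tolerance."""
    thetas = rng.uniform(0.0, prior_max, size=n_draws)
    summaries = simulate_summary(thetas, n_loci, rng)
    accepted = np.abs(summaries - observed) < tolerance
    return thetas[accepted]

rng = np.random.default_rng(1)
true_theta = 2.0
# Pseudo-observed data simulated under the same toy model.
observed = simulate_summary(np.array([true_theta]), 200, rng)[0]
posterior = abc_rejection(observed, n_loci=200, n_draws=20000,
                          tolerance=0.1, rng=rng)
print(len(posterior), posterior.mean())
```

The accepted draws approximate the posterior of theta given the summary statistic. Only the scaled parameter theta is identifiable from this summary, which fits the abstract's remark that scaled parameters were estimated with less error than unscaled ones.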

18.
Defining computable analytical measures of the effects of selection in populations with demographic and environmental stochasticity is a long-standing problem. We derive an analytical measure which takes into account all consequences of the discrete nature of deme size. Expressions of this measure are detailed for infinite island models of population structure. As an illustration we consider the evolution of dispersal in populations made of small demes with environmental and demographic stochasticity. We confirm some results obtained from the analysis of models based on deterministic approximations. In particular, when there is an Allee effect, we show that evolution of the dispersal rate may lead the metapopulation to extinction. Thus, selection on the dispersal rate could restrict the distribution of species subject to Allee effects. This selection-driven extinction is prevented by kin selection when the environmental extinction rate is small.

19.
Average human behavior in cue combination tasks is well predicted by Bayesian inference models. As this capability is acquired over developmental timescales, the question arises of how it is learned. Here we investigated whether reward-dependent learning, which is well established at the computational, behavioral, and neuronal levels, could contribute to this development. It is shown that a model-free reinforcement learning algorithm can indeed learn to do cue integration, i.e. weight uncertain cues according to their respective reliabilities, and even do so if reliabilities are changing. We also consider the case of causal inference, where multimodal signals can originate from one or multiple separate objects and should not always be integrated. In this case, the learner is shown to develop a behavior that is closest to Bayesian model averaging. We conclude that reward-mediated learning could be a driving force for the development of cue integration and causal inference.
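
The Bayesian benchmark for cue integration mentioned in this abstract, weighting cues by their reliabilities, has a standard closed form for independent Gaussian cues: each cue is weighted by its inverse variance. The sketch below shows that benchmark only, not the reinforcement learning model from the abstract. The function name and the numerical values are hypothetical.

```python
import numpy as np

def combine_cues(estimates, variances):
    """Reliability-weighted combination of independent Gaussian cues.

    Reliability is defined as the inverse variance of a cue. Returns the
    combined estimate, its variance, and the weight given to each cue.
    """
    estimates = np.asarray(estimates, dtype=float)
    reliabilities = 1.0 / np.asarray(variances, dtype=float)
    weights = reliabilities / reliabilities.sum()
    combined_estimate = float(weights @ estimates)
    combined_variance = float(1.0 / reliabilities.sum())
    return combined_estimate, combined_variance, weights

# Two hypothetical cues: the first is three times as reliable as the second.
estimate, variance, weights = combine_cues([10.0, 14.0], [1.0, 3.0])
print(estimate, variance, weights)
```

The combined variance is smaller than the variance of either cue alone, which is the signature of optimal integration that cue combination experiments typically test for.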

20.
Co‐occurring species are rarely considered as a factor influencing habitat selection. However, niche theory predicts that sharing resources, predators, and other interspecific interactions can limit the environmental conditions under which a species may exist. How does the spatial distribution of one species affect that of another within shared landscapes? We tested whether sympatric marten Martes americana and fishers M. pennanti in a mountain landscape in Alberta, Canada exhibit local‐scale spatial segregation, beyond differential habitat selection. We modelled marten and fisher distribution in relation to remotely‐sensed habitat data and species co‐occurrence, using generalized linear models and information‐theoretic model selection. Marten and fishers selected different habitat types and showed different responses to habitat fragmentation. Even after accounting for these differences, the absence of one species significantly explained the occurrence of the other. We conclude that the spatial distribution of marten and fishers influences habitat selection by each other at landscape scales, and hypothesize that this pattern may result from competition in a spatially heterogeneous environment. Species‐habitat models that consider only resources may fail to capture key predictors of species’ occurrence. Reliable prediction and inference require that ecologists expand from landscapes to also include species‐scapes: a spatial plane of species interactions that combines with resources to drive species’ distributions.
