Found 20 similar articles; search took 15 ms.
1.
Bayesian shrinkage analysis is arguably the state-of-the-art technique for large-scale multiple quantitative trait locus (QTL) mapping. However, when the shrinkage model does not involve indicator variables for marker inclusion, QTL detection remains heavily dependent on significance thresholds derived from phenotype permutation under the null hypothesis of no phenotype-to-genotype association. This approach is computationally intensive and, more importantly, the hypothetical data generation at the heart of the permutation-based method violates the Bayesian philosophy. Here we propose a fully Bayesian decision rule for QTL detection under the recently introduced extended Bayesian LASSO for QTL mapping. Our new decision rule is free of any hypothetical data generation and relies on the well-established Bayes factors for evaluating the evidence for QTL presence at any locus. Simulation results demonstrate the remarkable performance of our decision rule. An application to real-world data is considered as well.
2.
BETH GARDNER J. ANDREW ROYLE MICHAEL T. WEGAN RAYMOND E. RAINBOLT PAUL D. CURTIS 《The Journal of wildlife management》2010,74(2):318-325
DNA-based mark-recapture has become a methodological cornerstone of research focused on bear species. The objective of such studies is often to estimate population size; however, doing so is frequently complicated by movement of individual bears. Movement affects the probability of detection and the assumption of closure of the population required in most models. To mitigate the bias caused by movement of individuals, population size and density estimates are often adjusted using ad hoc methods, including buffering the minimum polygon of the trapping array. We used a hierarchical, spatial capture-recapture model that contains explicit components for the spatial-point process that governs the distribution of individuals and their exposure to (via movement), and detection by, traps. We modeled detection probability as a function of each individual's distance to the trap and an indicator variable for previous capture to account for possible behavioral responses. We applied our model to a 2006 hair-snare study of a black bear (Ursus americanus) population in northern New York, USA. Based on the microsatellite marker analysis of collected hair samples, 47 individuals were identified. We estimated mean density at 0.20 bears/km². A positive estimate of the indicator variable suggests that bears are attracted to baited sites; therefore, including a trap-dependence covariate is important when using bait to attract individuals. Bayesian analysis of the model was implemented in WinBUGS, and we provide the model specification. The model can be applied to any spatially organized trapping array (hair snares, camera traps, mist nets, etc.) to estimate density and can also account for heterogeneity and covariate information at the trap or individual level.
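The detection component described in the abstract above (probability of detection depending on distance to the trap and on a previous-capture indicator) can be illustrated with a minimal Python sketch. The logistic form, the use of squared distance, and the coefficient names and values (`alpha0`, `alpha1`, `alpha2`) are assumptions chosen for illustration; the abstract does not give the authors' exact specification.

```python
import math


def detection_probability(distance_km, previously_captured,
                          alpha0=-2.0, alpha1=-1.0, alpha2=0.5):
    """Hypothetical logistic detection model for one individual at one trap.

    alpha0: baseline log-odds of detection at distance zero (assumed value)
    alpha1: effect of squared distance; negative so detection decays (assumed)
    alpha2: behavioral response to previous capture; positive means the
            animal is more likely to be detected again (assumed value)
    """
    indicator = 1.0 if previously_captured else 0.0
    linear = alpha0 + alpha1 * distance_km ** 2 + alpha2 * indicator
    return 1.0 / (1.0 + math.exp(-linear))


if __name__ == "__main__":
    for d in (0.0, 1.0, 2.0):
        print(d, detection_probability(d, False), detection_probability(d, True))
```

With a positive `alpha2`, a previously captured animal has a higher detection probability at every distance, which matches the trap-attraction interpretation reported in the abstract.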
3.
Most modern population genetics inference methods are based on the coalescence framework. Methods that allow estimating parameters of structured populations commonly insert migration events into the genealogies. For these methods the calculation of the coalescence probability density of a genealogy requires a product over all time periods between events. Data sets that contain populations with high rates of gene flow among them require an enormous number of calculations. A new method, transition probability-structured coalescence (TPSC), replaces the discrete migration events with probability statements. Because the speed of calculation is independent of the amount of gene flow, this method allows calculating the coalescence densities efficiently. The current implementation of TPSC uses an approximation simplifying the interaction among lineages. Simulations and coverage comparisons of TPSC vs. MIGRATE show that TPSC allows estimation of high migration rates more precisely, but because of the approximation the estimation of low migration rates is biased. The implementation of TPSC into programs that calculate quantities on phylogenetic tree structures is straightforward, so the TPSC approach will facilitate more general inferences in many computer programs.
4.
Transmission events are the fundamental building blocks of the dynamics of any infectious disease. Much about the epidemiology of a disease can be learned when these individual transmission events are known or can be estimated. Such estimations are difficult and generally feasible only when detailed epidemiological data are available. The genealogy estimated from genetic sequences of sampled pathogens is another rich source of information on transmission history. Optimal inference of transmission events calls for the combination of genetic data and epidemiological data into one joint analysis. A key difficulty is that the transmission tree, which describes the transmission events between infected hosts, differs from the phylogenetic tree, which describes the ancestral relationships between pathogens sampled from these hosts. The trees differ both in timing of the internal nodes and in topology. These differences become more pronounced when a higher fraction of infected hosts is sampled. We show how the phylogenetic tree of sampled pathogens is related to the transmission tree of an outbreak of an infectious disease, by the within-host dynamics of pathogens. We provide a statistical framework to infer key epidemiological and mutational parameters by simultaneously estimating the phylogenetic tree and the transmission tree. We test the approach using simulations and illustrate its use on an outbreak of foot-and-mouth disease. The approach unifies existing methods in the emerging field of phylodynamics with transmission tree reconstruction methods that are used in infectious disease epidemiology.
5.
JAY E. HOWELL JAMES T. PETERSON MICHAEL J. CONROY 《The Journal of wildlife management》2008,72(1):168-178
To predict the distributions of breeding birds in the state of Georgia, USA, we built hierarchical models consisting of 4 levels of nested mapping units of decreasing area: 90,000 ha, 3,600 ha, 144 ha, and 5.76 ha. We used the Partners in Flight database of point counts to generate presence and absence data at locations across the state of Georgia for 9 avian species: Acadian flycatcher (Empidonax virescens), brown-headed nuthatch (Sitta pusilla), Carolina wren (Thryothorus ludovicianus), indigo bunting (Passerina cyanea), northern cardinal (Cardinalis cardinalis), prairie warbler (Dendroica discolor), yellow-billed cuckoo (Coccyzus americanus), white-eyed vireo (Vireo griseus), and wood thrush (Hylocichla mustelina). At each location, we estimated hierarchical-level-specific habitat measurements using the Georgia GAP Analysis 18-class land cover and other Geographic Information System sources. We created candidate, species-specific occupancy models based on previously reported relationships, and fit these using Markov chain Monte Carlo procedures implemented in OpenBUGS. We then created a confidence model set for each species based on Akaike's Information Criterion. We found hierarchical habitat relationships for all species. Three-fold cross-validation estimates of model accuracy indicated an average overall correct classification rate of 60.5%. Comparisons with existing Georgia GAP Analysis models indicated that our models were more accurate overall. Our results provide guidance to wildlife scientists and managers seeking to predict avian occurrence as a function of local and landscape-level habitat attributes.
6.
For time series of count data, correlated measurements, clustering as well as excessive zeros occur simultaneously in biomedical applications. Ignoring such effects might contribute to misleading treatment outcomes. A generalized mixture Poisson geometric process (GMPGP) model and a zero‐altered mixture Poisson geometric process (ZMPGP) model are developed from the geometric process model, which was originally developed for modelling positive continuous data and was extended to handle count data. These models are motivated by evaluating the trend development of new tumour counts for bladder cancer patients as well as by identifying useful covariates which affect the count level. The models are implemented using a Bayesian method with Markov chain Monte Carlo (MCMC) algorithms and are assessed using the deviance information criterion (DIC).
7.
Bayesian partitioning for estimating disease risk (cited by 7; self-citations: 0; citations by others: 7)
This paper presents a Bayesian nonlinear approach for the analysis of spatial count data. It extends the Bayesian partition methodology of Holmes, Denison, and Mallick (1999, Bayesian partitioning for classification and regression, Technical Report, Imperial College, London) to handle data that involve counts. A demonstration involving incidence rates of leukemia in New York state is used to highlight the methodology. The model allows us to make probability statements on the incidence rates around point sources without making any parametric assumptions about the nature of the influence between the sources and the surrounding location.
8.
Estimation of extreme quantal-response statistics, such as the concentration required to kill 99.9% of test subjects (LC99.9), remains a challenge in the presence of multiple covariates and complex study designs. Accurate and precise estimates of the LC99.9 for mixtures of toxicants are critical to ongoing control of a parasitic invasive species, the sea lamprey, in the Laurentian Great Lakes of North America. The toxicity of those chemicals is affected by local and temporal variations in water chemistry, which must be incorporated into the modeling. We develop multilevel empirical Bayes models for data from multiple laboratory studies. Our approach yields more accurate and precise estimation of the LC99.9 compared to alternative models considered. This study demonstrates that properly incorporating hierarchical structure in laboratory data yields better estimates of LC99.9 stream treatment values that are critical to larvae control in the field. In addition, out-of-sample prediction of the results of in situ tests reveals the presence of a latent seasonal effect not manifest in the laboratory studies, suggesting avenues for future study and illustrating the importance of dual consideration of both experimental and observational data.
9.
Bayesian analyses of spatial data often use a conditionally autoregressive (CAR) prior, which can be written as the kernel of an improper density that depends on a precision parameter tau that is typically unknown. To include tau in the Bayesian analysis, the kernel must be multiplied by tau^k for some k. This article rigorously derives k = (n - I)/2 for the L2 norm CAR prior (also called a Gaussian Markov random field model) and k = n - I for the L1 norm CAR prior, where n is the number of regions and I the number of "islands" (disconnected groups of regions) in the spatial map. Since I = 1 for a spatial structure defining a connected graph, this supports Knorr-Held's (2002, in Highly Structured Stochastic Systems, 260-264) suggestion that k = (n - 1)/2 in the L2 norm case, instead of the more common k = n/2. We illustrate the practical significance of our results using a periodontal example.
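For orientation, the two priors discussed in the abstract above can be written out as follows. This is a sketch in a commonly used pairwise-difference notation, not a transcription from the article: the symbols $\theta_i$ (spatial effect of region $i$) and $i \sim j$ (regions $i$ and $j$ are neighbors), and the constants inside the exponents, are assumptions. Only the exponents of $\tau$ are taken from the abstract.

```latex
% L2 norm CAR prior (Gaussian Markov random field), exponent k = (n - I)/2
p(\theta \mid \tau) \;\propto\; \tau^{(n - I)/2}
  \exp\!\Big( -\frac{\tau}{2} \sum_{i \sim j} (\theta_i - \theta_j)^2 \Big)

% L1 norm CAR prior, exponent k = n - I
p(\theta \mid \tau) \;\propto\; \tau^{\,n - I}
  \exp\!\Big( -\tau \sum_{i \sim j} \lvert \theta_i - \theta_j \rvert \Big)
```

For a connected map, $I = 1$, so the L2 exponent reduces to $(n - 1)/2$, the value the abstract contrasts with the more common $n/2$.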
10.
A flexible cure rate model for spatially correlated survival data based on generalized extreme value distribution and Gaussian process priors
Our present work proposes a new survival model in a Bayesian context to analyze right‐censored survival data for populations with a surviving fraction, assuming that the log failure time follows a generalized extreme value distribution. Many applications require a more flexible modeling of covariate information than a simple linear or parametric form for all covariate effects. It is also necessary to include the spatial variation in the model, since it is sometimes unexplained by the covariates considered in the analysis. Therefore, the nonlinear covariate effects and the spatial effects are incorporated into the systematic component of our model. Gaussian processes (GPs) provide a natural framework for modeling potentially nonlinear relationships and have recently become extremely powerful in nonlinear regression. Our proposed model adopts a semiparametric Bayesian approach by imposing a GP prior on the nonlinear structure of continuous covariates. With the consideration of data availability and computational complexity, the conditionally autoregressive distribution is placed on the region‐specific frailties to handle spatial correlation. The flexibility and gains of our proposed model are illustrated through analyses of simulated data examples as well as a dataset involving a colon cancer clinical trial from the state of Iowa.
11.
A two-component model for counts of infectious diseases (cited by 1; self-citations: 0; citations by others: 1)
We propose a stochastic model for the analysis of time series of disease counts as collected in typical surveillance systems on notifiable infectious diseases. The model is based on a Poisson or negative binomial observation model with two components: a parameter-driven component relates the disease incidence to latent parameters describing endemic seasonal patterns, which are typical for infectious disease surveillance data. An observation-driven or epidemic component is modeled with an autoregression on the number of cases at the previous time points. The autoregressive parameter is allowed to change over time according to a Bayesian changepoint model with unknown number of changepoints. Parameter estimates are obtained through the Bayesian model averaging using Markov chain Monte Carlo techniques. We illustrate our approach through analysis of simulated data and real notification data obtained from the German infectious disease surveillance system, administered by the Robert Koch Institute in Berlin. Software to fit the proposed model can be obtained from http://www.statistik.lmu.de/~mhofmann/twins.
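The two-component structure described in the abstract above can be illustrated by a small simulation. This is a minimal sketch, not the authors' software: it assumes a Poisson observation model, a single sine term for the endemic seasonal pattern, and a constant autoregressive parameter `lam` (the paper lets this parameter change at unknown changepoints). All function names, parameter names and default values are hypothetical.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def simulate_counts(n_weeks, lam=0.4, gamma0=1.0, gamma1=0.5,
                    period=52, seed=0):
    """Simulate weekly counts with mean = endemic(t) + lam * previous count."""
    rng = random.Random(seed)
    counts = []
    previous = 0
    for t in range(n_weeks):
        # Endemic (parameter-driven) component: log-linear seasonal pattern.
        endemic = math.exp(gamma0 + gamma1 * math.sin(2 * math.pi * t / period))
        # Epidemic (observation-driven) component: autoregression on y[t-1].
        mean = endemic + lam * previous
        y = poisson_draw(rng, mean)
        counts.append(y)
        previous = y
    return counts


if __name__ == "__main__":
    print(simulate_counts(20))
```

Keeping `lam` below 1 keeps the simulated series from growing without bound; a changepoint version would replace the constant with a piecewise-constant sequence indexed by `t`.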
12.
Bayesian estimation of the risk of a disease around a known point source of exposure is considered. The minimal requirements for data are that cases and populations at risk are known for a fixed set of concentric annuli around the point source, and each annulus has a uniquely defined distance from the source. The conventional Poisson likelihood is assumed for the counts of disease cases in each annular zone with zone‐specific relative risk parameters and, conditional on the risks, the counts are considered to be independent. The prior for the relative risk parameters is assumed to be piecewise constant in the distance, with a known number of components. This prior is the well‐known change‐point model. Monte Carlo sampling from the posterior results in zone‐specific posterior summaries, which can be applied for the calculation of a smooth curve describing the variation in disease risk as a function of the distance from the putative source. In addition, the posterior can be used in the calculation of posterior probabilities for interesting hypotheses. The suggested model is suitable for use in geographical information systems (GIS) aimed at monitoring disease risks. As an application, a case study on the incidence of lung cancer around a former asbestos mine in eastern Finland is presented. Further extensions of the model are discussed.
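The likelihood structure described in the abstract above (independent Poisson counts per annular zone, with relative risks that are piecewise constant in distance) can be sketched in a few lines of Python. The function names, the representation of change points as zone indices, and the use of expected counts `expected` (population at risk times a baseline rate) are assumptions made for illustration.

```python
import math


def zone_risks(n_zones, change_points, levels):
    """Expand a piecewise-constant risk function into one risk per zone.

    change_points: sorted zone indices at which a new level starts (assumed
                   representation); len(levels) must be len(change_points) + 1.
    """
    risks = []
    segment = 0
    for zone in range(n_zones):
        while segment < len(change_points) and zone >= change_points[segment]:
            segment += 1
        risks.append(levels[segment])
    return risks


def poisson_log_likelihood(cases, expected, risks):
    """Sum of independent Poisson log-probabilities, mean = expected * risk."""
    total = 0.0
    for y, e, r in zip(cases, expected, risks):
        mu = e * r
        total += y * math.log(mu) - mu - math.lgamma(y + 1)
    return total


if __name__ == "__main__":
    risks = zone_risks(5, [2], [3.0, 1.0])   # elevated risk in the two inner zones
    print(risks)
    print(poisson_log_likelihood([6, 5, 2, 1, 3], [2.0, 2.0, 2.0, 2.0, 2.0], risks))
```

A sampler for the posterior would evaluate this log-likelihood for proposed change points and levels and combine it with the prior; that step is omitted here.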
13.
Bayesian predictive information criterion for the evaluation of hierarchical Bayesian and empirical Bayes models (cited by 3; self-citations: 0; citations by others: 3)
The problem of evaluating the goodness of the predictive distributions of hierarchical Bayesian and empirical Bayes models is investigated. A Bayesian predictive information criterion is proposed as an estimator of the posterior mean of the expected loglikelihood of the predictive distribution when the specified family of probability distributions does not contain the true distribution. The proposed criterion is developed by correcting the asymptotic bias of the posterior mean of the loglikelihood as an estimator of its expected loglikelihood. In the evaluation of hierarchical Bayesian models with random effects, regardless of our parametric focus, the proposed criterion considers the bias correction of the posterior mean of the marginal loglikelihood because it requires a consistent parameter estimator. The use of the bootstrap in model evaluation is also discussed.
14.
Gene flow and recombination in admixed populations produce genomes that are mosaic combinations of chromosome segments inherited from different source populations, that is, chromosome segments with different genetic ancestries. The statistical problem of estimating genetic ancestry from DNA sequence data has been widely studied, and analyses of genetic ancestry have facilitated research in molecular ecology and ecological genetics. In this review, we describe and compare different model‐based statistical methods used to infer genetic ancestry. We describe the conceptual and mathematical structure of these models and highlight some of their key differences and shared features. We then discuss recent empirical studies that use estimates of genetic ancestry to analyse population histories, the nature and genetic basis of species boundaries, and the genetic architecture of traits. These diverse studies demonstrate the breadth of applications that rely on genetic ancestry estimates and typify the genomics‐enabled research that is becoming increasingly common in molecular ecology. We conclude by identifying key research areas where future studies might further advance this field.
15.
Generalized hierarchical multivariate CAR models for areal data (cited by 5; self-citations: 0; citations by others: 5)
In the fields of medicine and public health, a common application of areal data models is the study of geographical patterns of disease. When we have several measurements recorded at each spatial location (for example, information on p ≥ 2 diseases from the same population groups or regions), we need to consider multivariate areal data models in order to handle the dependence among the multivariate components as well as the spatial dependence between sites. In this article, we propose a flexible new class of generalized multivariate conditionally autoregressive (GMCAR) models for areal data, and show how it enriches the MCAR class. Our approach differs from earlier ones in that it directly specifies the joint distribution for a multivariate Markov random field (MRF) through the specification of simpler conditional and marginal models. This in turn leads to a significant reduction in the computational burden in hierarchical spatial random effect modeling, where posterior summaries are computed using Markov chain Monte Carlo (MCMC). We compare our approach with existing MCAR models in the literature via simulation, using average mean square error (AMSE) and a convenient hierarchical model selection criterion, the deviance information criterion (DIC; Spiegelhalter et al., 2002, Journal of the Royal Statistical Society, Series B 64, 583-639). Finally, we offer a real-data application of our proposed GMCAR approach that models lung and esophagus cancer death rates during 1991-1998 in Minnesota counties.
16.
Generalized spatial structural equation models (cited by 1; self-citations: 0; citations by others: 1)
It is common in public health research to have high-dimensional, multivariate, spatially referenced data representing summaries of geographic regions. Often, it is desirable to examine relationships among these variables both within and across regions. An existing modeling technique called spatial factor analysis has been used and assumes that a common spatial factor underlies all the variables and causes them to be related to one another. An extension of this technique considers that there may be more than one underlying factor, and that relationships among the underlying latent variables are of primary interest. However, due to the complicated nature of the covariance structure of this type of data, existing methods are not satisfactory. We thus propose a generalized spatial structural equation model. In the first level of the model, we assume that the observed variables are related to particular underlying factors. In the second level of the model, we use the structural equation method to model the relationship among the underlying factors and use parametric spatial distributions on the covariance structure of the underlying factors. We apply the model to county-level cancer mortality and census summary data for Minnesota, including socioeconomic status and access to public utilities.
17.
A simple population genetic model is presented for a hermaphrodite annual species, allowing both selfing and outcrossing. Those male gametes (pollen) responsible for outcrossing are assumed to disperse much further than seeds. Under this model, the pedigree of a sample from a single locality is loop-free. A novel Markov chain Monte Carlo strategy is presented for sampling from the joint posterior distribution of the pedigree of such a sample and the parameters of the population genetic model (including the selfing rate) given the genotypes of the sampled individuals at unlinked marker loci. The computational costs of this Markov chain Monte Carlo strategy scale well with the number of individuals in the sample, and the number of marker loci, but increase exponentially with the age (time since colonisation from the source population) of the local population. Consequently, this strategy is particularly suited to situations where the sample has been collected from a population which is the result of a recent colonisation process.
18.
19.
G. Thaller I. Hoeschele 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》1996,93(7):1161-1166
A Bayesian approach to the statistical mapping of Quantitative Trait Loci (QTLs) using single markers was implemented via Markov Chain Monte Carlo (MCMC) algorithms for parameter estimation and hypothesis testing. Parameter estimators were marginal posterior means computed using a Gibbs sampler with data augmentation. Variables sampled included the augmented data (marker-QTL genotypes, polygenic effects), an indicator variable for linkage, and the parameters (allele frequency, QTL substitution effect, recombination rate, polygenic and residual variances). Several MCMC algorithms were derived for computing Bayesian tests of linkage, which consisted of the marginal posterior probability of linkage and the marginal likelihood of the QTL variance associated with the marker.
20.
In this paper, we analyze infant mortality in Nigeria based on the data set from the 1999 Nigeria Demographic and Health Survey (NDHS). We investigate spatial patterns at a highly disaggregated level of Nigerian states and consider non-linear effects of mother's age at birth. Time to the occurrence of a child's death can intuitively be considered to be categorical in nature and the determinants of a child's death may differ in different age groups. Thus, it may be desirable to investigate separately the death of a child in the first month and in the remaining 11 months of the first year of life. To avoid selection bias, the data set used for this case study is based on information on children who were born 12 months preceding the survey. Inference is Bayesian and is based on Markov chain Monte Carlo (MCMC) techniques. We find that spatial variation and the determinants of death indeed differ considerably for the two age groups considered.