Similar articles
 20 similar articles found (search time: 0 ms)
1.
Leeyoung Park  Ju H. Kim 《Genetics》2015,199(4):1007-1016
Causal models including genetic factors are important for understanding the presentation mechanisms of complex diseases. Familial aggregation and segregation analyses based on polygenic threshold models have been the primary approach to fitting genetic models to the family data of complex diseases. In the current study, an advanced approach to obtaining appropriate causal models for complex diseases based on the sufficient component cause (SCC) model involving combinations of traditional genetics principles was proposed. The probabilities for the entire population, i.e., normal–normal, normal–disease, and disease–disease, were considered for each model for the appropriate handling of common complex diseases. The causal model in the current study included the genetic effects from single genes involving epistasis, complementary gene interactions, gene–environment interactions, and environmental effects. Bayesian inference using a Markov chain Monte Carlo (MCMC) algorithm was used to assess the proportions of each component for a given population lifetime incidence. This approach is flexible, allowing both common and rare variants within a gene and across multiple genes. An application to schizophrenia data confirmed the complexity of the causal factors. An analysis of diabetes data demonstrated that environmental factors and gene–environment interactions are the main causal factors for type II diabetes. The proposed method is effective and useful for identifying causal models, which can accelerate the development of efficient strategies for identifying causal factors of complex diseases.

2.
Estimation of extreme quantal-response statistics, such as the concentration required to kill 99.9% of test subjects (LC99.9), remains a challenge in the presence of multiple covariates and complex study designs. Accurate and precise estimates of the LC99.9 for mixtures of toxicants are critical to ongoing control of a parasitic invasive species, the sea lamprey, in the Laurentian Great Lakes of North America. The toxicity of those chemicals is affected by local and temporal variations in water chemistry, which must be incorporated into the modeling. We develop multilevel empirical Bayes models for data from multiple laboratory studies. Our approach yields more accurate and precise estimation of the LC99.9 compared to alternative models considered. This study demonstrates that properly incorporating hierarchical structure in laboratory data yields better estimates of LC99.9 stream treatment values that are critical to larvae control in the field. In addition, out-of-sample prediction of the results of in situ tests reveals the presence of a latent seasonal effect not manifest in the laboratory studies, suggesting avenues for future study and illustrating the importance of dual consideration of both experimental and observational data.

3.
4.
Various simple mathematical models have been used to investigate dengue transmission. Some of these models explicitly model the mosquito population, while others model the mosquitoes implicitly in the transmission term. We study the impact of modeling assumptions on the dynamics of dengue in Thailand by fitting simple vector–host and SIR models to dengue hemorrhagic fever (DHF) data using Bayesian Markov chain Monte Carlo estimation. The parameter estimates obtained for both models were consistent with previous studies. Most importantly, model selection found that the SIR model was substantially better than the vector–host model for the DHF data from Thailand. Therefore, explicitly incorporating the mosquito population may not be necessary in modeling dengue transmission for some populations.
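As a rough illustration of what an SIR model with the mosquitoes left implicit in the transmission term can look like, the sketch below integrates the standard SIR equations with forward Euler steps. It is not the model fitted in the study: the closed population (no births or deaths), the names `simulate_sir`, `beta`, and `gamma`, and the parameter values in the usage note are all assumptions made for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate a closed-population SIR model with forward Euler steps.

    beta  : transmission rate per day (assumed; mosquitoes are implicit here)
    gamma : recovery rate per day (assumed)
    Returns a list of (S, I, R) tuples, one per day, starting with day 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, constant in this sketch
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory
```

For example, `simulate_sir(0.5, 0.1, 990, 10, 0, days=50)` returns 51 daily states whose components always sum to the initial population of 1000, up to floating-point rounding.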

5.
Most modern population genetics inference methods are based on the coalescence framework. Methods that allow estimating parameters of structured populations commonly insert migration events into the genealogies. For these methods the calculation of the coalescence probability density of a genealogy requires a product over all time periods between events. Data sets that contain populations with high rates of gene flow among them require an enormous number of calculations. A new method, transition probability-structured coalescence (TPSC), replaces the discrete migration events with probability statements. Because the speed of calculation is independent of the amount of gene flow, this method allows calculating the coalescence densities efficiently. The current implementation of TPSC uses an approximation simplifying the interaction among lineages. Simulations and coverage comparisons of TPSC vs. MIGRATE show that TPSC allows estimation of high migration rates more precisely, but because of the approximation the estimation of low migration rates is biased. The implementation of TPSC into programs that calculate quantities on phylogenetic tree structures is straightforward, so the TPSC approach will facilitate more general inferences in many computer programs.

6.
Cheon S  Liang F 《Bio Systems》2008,91(1):94-107
Monte Carlo methods have received much attention recently in the literature of phylogenetic tree construction. However, they often suffer from two difficulties, the curse of dimensionality and the local-trap problem. The former arises because the number of possible phylogenetic trees increases at a super-exponential rate as the number of taxa increases. The latter arises because the phylogenetic tree often has a rugged energy landscape. In this paper, we propose a new phylogenetic tree construction method, which attempts to alleviate these two difficulties simultaneously by making use of the sequential structure of phylogenetic trees in conjunction with stochastic approximation Monte Carlo (SAMC) simulations. The use of the sequential structure of the problem substantially helps to reduce the curse of dimensionality in simulations, and SAMC effectively prevents the system from getting trapped in local energy minima. The new method is compared with a variety of existing Bayesian and non-Bayesian methods on simulated and real datasets. Numerical results favor the new method in terms of the quality of the resulting phylogenetic trees.

7.
In this paper, we analyze infant mortality in Nigeria based on the data set from the 1999 Nigeria Demographic and Health Survey (NDHS). We investigate spatial patterns at a highly disaggregated level of Nigerian states and consider non-linear effects of mother's age at birth. Time to the occurrence of a child's death can intuitively be considered to be categorical in nature and the determinants of a child's death may differ in different age groups. Thus, it may be desirable to investigate separately the death of a child in the first month and in the remaining 11 months of the first year of life. To avoid selection bias, the data set used for this case study is based on information on children who were born 12 months preceding the survey. Inference is Bayesian and is based on Markov chain Monte Carlo (MCMC) techniques. We find that spatial variation and the determinants of death indeed differ considerably for the two age groups considered.

8.
Monte Carlo methods have received much attention in the recent literature of phylogeny analysis. However, the conventional Markov chain Monte Carlo algorithms, such as the Metropolis–Hastings algorithm, tend to get trapped in a local mode in simulating from the posterior distribution of phylogenetic trees, rendering the inference ineffective. In this paper, we apply an advanced Monte Carlo algorithm, the stochastic approximation Monte Carlo (SAMC) algorithm, to Bayesian phylogeny analysis. Our method is compared with two popular Bayesian phylogeny programs, BAMBE and MrBayes, on simulated and real datasets. The numerical results indicate that our method outperforms BAMBE and MrBayes. Among the three methods, SAMC produces the consensus trees that have the highest similarity to the true trees and the model parameter estimates that have the smallest mean square errors, while costing the least CPU time.

9.
Our present work proposes a new survival model in a Bayesian context to analyze right‐censored survival data for populations with a surviving fraction, assuming that the log failure time follows a generalized extreme value distribution. Many applications require a more flexible modeling of covariate information than a simple linear or parametric form for all covariate effects. It is also necessary to include the spatial variation in the model, since it is sometimes unexplained by the covariates considered in the analysis. Therefore, the nonlinear covariate effects and the spatial effects are incorporated into the systematic component of our model. Gaussian processes (GPs) provide a natural framework for modeling potentially nonlinear relationships and have recently become extremely powerful in nonlinear regression. Our proposed model adopts a semiparametric Bayesian approach by imposing a GP prior on the nonlinear structure of a continuous covariate. With the consideration of data availability and computational complexity, the conditionally autoregressive distribution is placed on the region‐specific frailties to handle spatial correlation. The flexibility and gains of our proposed model are illustrated through analyses of simulated data examples as well as a dataset involving a colon cancer clinical trial from the state of Iowa.

10.
Bayesian estimation of the risk of a disease around a known point source of exposure is considered. The minimal requirements for data are that cases and populations at risk are known for a fixed set of concentric annuli around the point source, and each annulus has a uniquely defined distance from the source. The conventional Poisson likelihood is assumed for the counts of disease cases in each annular zone with zone‐specific relative risk parameters and, conditional on the risks, the counts are considered to be independent. The prior for the relative risk parameters is assumed to be piecewise constant in the distance, with a known number of components. This prior is the well‐known change‐point model. Monte Carlo sampling from the posterior results in zone‐specific posterior summaries, which can be applied for the calculation of a smooth curve describing the variation in disease risk as a function of the distance from the putative source. In addition, the posterior can be used in the calculation of posterior probabilities for hypotheses of interest. The suggested model is suitable for use in geographical information systems (GIS) aimed at monitoring disease risks. As an application, a case study on the incidence of lung cancer around a former asbestos mine in eastern Finland is presented. Further extensions of the model are discussed.
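The likelihood described in this abstract can be sketched in a few lines. The version below is only an illustration under stated assumptions: it takes the expected case count in each annulus as given (in practice it would be derived from the population at risk), and the names `piecewise_risk`, `poisson_loglik`, `change_points`, and `levels` are hypothetical, not taken from the paper.

```python
import math
from bisect import bisect_right


def piecewise_risk(distance, change_points, levels):
    """Relative risk that is constant between sorted change points.

    levels must have len(change_points) + 1 entries; levels[0] applies
    to distances below the first change point.
    """
    return levels[bisect_right(change_points, distance)]


def poisson_loglik(cases, expected, distances, change_points, levels):
    """Sum of independent Poisson log-likelihoods over annular zones.

    Zone i has mean relative_risk(distance_i) * expected_i (assumed form).
    """
    total = 0.0
    for y, e, d in zip(cases, expected, distances):
        mu = piecewise_risk(d, change_points, levels) * e
        total += y * math.log(mu) - mu - math.lgamma(y + 1)
    return total
```

A sampler for the change-point model would evaluate `poisson_loglik` repeatedly while proposing new `change_points` and `levels`; the sampler itself is not shown here.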

11.
There are often two types of correlations in multivariate spatial data: correlations between variables measured at the same locations, and correlations of each variable across the locations. We hypothesize that these two types of correlations are caused by a common spatially correlated underlying factor. Under this hypothesis, we propose a generalized common spatial factor model. The parameters are estimated using the Bayesian method and a Markov chain Monte Carlo computing technique. Our main goals are to determine which observed variables share a common underlying spatial factor and also to predict the common spatial factor. The model is applied to county-level cancer mortality data in Minnesota to find whether there exists a common spatial factor underlying the cancer mortality throughout the state.

12.
Bayesian shrinkage analysis is arguably the state-of-the-art technique for large-scale multiple quantitative trait locus (QTL) mapping. However, when the shrinkage model does not involve indicator variables for marker inclusion, QTL detection remains heavily dependent on significance thresholds derived from phenotype permutation under the null hypothesis of no phenotype-to-genotype association. This approach is computationally intensive and, more importantly, the hypothetical data generation at the heart of the permutation-based method violates the Bayesian philosophy. Here we propose a fully Bayesian decision rule for QTL detection under the recently introduced extended Bayesian LASSO for QTL mapping. Our new decision rule is free of any hypothetical data generation and relies on the well-established Bayes factors for evaluating the evidence for QTL presence at any locus. Simulation results demonstrate the remarkable performance of our decision rule. An application to real-world data is considered as well.

13.
A Bayesian CART algorithm   Cited by: 3 (self-citations: 0, by others: 3)

14.
Hodges JS  Carlin BP  Fan Q 《Biometrics》2003,59(2):317-322
Bayesian analyses of spatial data often use a conditionally autoregressive (CAR) prior, which can be written as the kernel of an improper density that depends on a precision parameter tau that is typically unknown. To include tau in the Bayesian analysis, the kernel must be multiplied by tau^k for some k. This article rigorously derives k = (n - I)/2 for the L2 norm CAR prior (also called a Gaussian Markov random field model) and k = n - I for the L1 norm CAR prior, where n is the number of regions and I the number of "islands" (disconnected groups of regions) in the spatial map. Since I = 1 for a spatial structure defining a connected graph, this supports Knorr-Held's (2002, in Highly Structured Stochastic Systems, 260-264) suggestion that k = (n - 1)/2 in the L2 norm case, instead of the more common k = n/2. We illustrate the practical significance of our results using a periodontal example.
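The exponent stated in this abstract depends only on the number of regions n and the number of islands I, so it can be computed directly from a map's adjacency structure. The sketch below assumes the map is supplied as a list of neighbor pairs over regions numbered 0 to n - 1; the function names `count_islands` and `car_tau_exponent` are hypothetical and the code is an illustration of the stated formulas, not the authors' software.

```python
def count_islands(n, edges):
    """Count disconnected groups ("islands") among regions 0..n-1.

    edges is an iterable of (a, b) pairs of neighboring regions.
    Uses a simple union-find with path halving.
    """
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    return len({find(i) for i in range(n)})


def car_tau_exponent(n, edges, norm="L2"):
    """Exponent k of tau as stated in the abstract.

    k = (n - I) / 2 for the L2 norm CAR prior, k = n - I for the L1 norm.
    """
    islands = count_islands(n, edges)
    if norm == "L2":
        return (n - islands) / 2
    if norm == "L1":
        return n - islands
    raise ValueError("norm must be 'L2' or 'L1'")
```

For a connected map of four regions in a row, `car_tau_exponent(4, [(0, 1), (1, 2), (2, 3)])` gives (4 - 1)/2 = 1.5, matching the k = (n - 1)/2 case discussed in the abstract.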

15.
DNA-based mark-recapture has become a methodological cornerstone of research focused on bear species. The objective of such studies is often to estimate population size; however, doing so is frequently complicated by movement of individual bears. Movement affects the probability of detection and the assumption of closure of the population required in most models. To mitigate the bias caused by movement of individuals, population size and density estimates are often adjusted using ad hoc methods, including buffering the minimum polygon of the trapping array. We used a hierarchical, spatial capture-recapture model that contains explicit components for the spatial-point process that governs the distribution of individuals and their exposure to (via movement), and detection by, traps. We modeled detection probability as a function of each individual's distance to the trap and an indicator variable for previous capture to account for possible behavioral responses. We applied our model to a 2006 hair-snare study of a black bear (Ursus americanus) population in northern New York, USA. Based on the microsatellite marker analysis of collected hair samples, 47 individuals were identified. We estimated mean density at 0.20 bears/km^2. A positive estimate of the indicator variable suggests that bears are attracted to baited sites; therefore, including a trap-dependence covariate is important when using bait to attract individuals. Bayesian analysis of the model was implemented in WinBUGS, and we provide the model specification. The model can be applied to any spatially organized trapping array (hair snares, camera traps, mist nets, etc.) to estimate density and can also account for heterogeneity and covariate information at the trap or individual level.

16.
In epidemics of infectious diseases such as influenza, an individual may have one of four possible final states: prior immune, escaped from infection, infected with symptoms, and infected asymptomatically. The exact state is often not observed. In addition, the unobserved transmission times of asymptomatic infections further complicate analysis. Under the assumption of missing at random, data‐augmentation techniques can be used to integrate out such uncertainties. We adapt an importance‐sampling‐based Monte Carlo Expectation‐Maximization (MCEM) algorithm to the setting of an infectious disease transmitted in close contact groups. Assuming independence between close contact groups, we propose a hybrid EM‐MCEM algorithm that applies the MCEM or the traditional EM algorithm to each close contact group depending on the dimension of missing data in that group, and discuss the variance estimation for this practice. In addition, we propose a bootstrap approach to assess the total Monte Carlo error and factor that error into the variance estimation. The proposed methods are evaluated using simulation studies. We use the hybrid EM‐MCEM algorithm to analyze two influenza epidemics in the late 1970s to assess the effects of age and preseason antibody levels on the transmissibility and pathogenicity of the viruses.

17.
Suitability of trees as hosts for epiphytic lichens is studied in a forest stand of size 25 ha. Suitability is measured as occupation probabilities, which are modelled using a hierarchical Bayesian approach. These probabilities are useful for an ecologist. They give a smoothed spatial distribution map of suitability for each of the species and can be used in detecting high‐ and low‐probability areas. In addition, suitability is explained by tree‐level covariates. Spatial dependence, which is due to unobserved spatially structured covariates, is modelled through an unobserved Markov random field. A Markov chain Monte Carlo method has been applied in Bayesian computation. The extensive spatial data consist of the occurrences of eight lichen species and one bryophyte on all of the 1253 potential host trees. In addition, coordinates of the trees and several tree characteristics have been recorded. The data have been analysed for the four most abundant species: Lobaria pulmonaria, Nephroma bellum, Nephroma parile and Peltigera praetextata. The tree‐level parameters, subject to estimation, consist of the occurrence probabilities for each tree and for each lichen species. Model validation is discussed in detail and, in addition to Bayesian validation tools, the autologistic model and case‐control design based on logistic regression have been suggested for validation of covariate effects. As a result we present suitability maps for the four lichen species. We observed that, among the observed tree covariates, the diameter at breast height (DBH) correlates with lichen occurrence. Our modelling approach has close connections to disease mapping in spatial epidemiology.

18.
Tensor regression analysis is finding vast emerging applications in a variety of clinical settings, including neuroimaging, genomics, and dental medicine. The motivation for this paper is a study of periodontal disease (PD) with an order-3 tensor response: multiple biomarkers measured at prespecified tooth–sites within each tooth, for each participant. A careful investigation would reveal considerable skewness in the responses, in addition to response missingness. To mitigate the shortcomings of existing analysis tools, we propose a new Bayesian tensor response regression method that facilitates interpretation of covariate effects on both marginal and joint distributions of highly skewed tensor responses, and accommodates missing-at-random responses under a closure property of our tensor model. Furthermore, we present a prudent evaluation of the overall covariate effects while identifying their possible variations on only a sparse subset of the tensor components. Our method promises Markov chain Monte Carlo (MCMC) tools that are readily implementable. We illustrate substantial advantages of our proposal over existing methods via simulation studies and application to a real data set derived from a clinical study of PD. The R package BSTN, available on GitHub, implements our model.

19.
Bayesian model–based clustering provides a powerful and flexible tool that can be incorporated into regression models to better understand the grouping of observations. Using data from the Seychelles Child Development Study, we explore the effect of prenatal methylmercury exposure on 20 neurodevelopmental outcomes measured in 9-year-old children. Rather than cluster individual subjects, we cluster the outcomes within a multiple outcomes model. By using information in the data to nest the outcomes into groups called domains, the model more accurately reflects the shared characteristics of neurodevelopmental domains and improves estimation of the overall and outcome-specific exposure effects by shrinking effects within and between domains selected by the data. The Bayesian paradigm allows for sampling from the posterior distribution of the grouping parameters; thus, inference can be made about group membership and their defining characteristics. We avoid the often difficult and highly subjective requirement of a priori identification of the total number of groups by incorporating a Dirichlet process prior to form a fully Bayesian multiple outcomes model.

20.
Generalized spatial structural equation models   Cited by: 1 (self-citations: 0, by others: 1)
It is common in public health research to have high-dimensional, multivariate, spatially referenced data representing summaries of geographic regions. Often, it is desirable to examine relationships among these variables both within and across regions. An existing modeling technique called spatial factor analysis has been used and assumes that a common spatial factor underlies all the variables and causes them to be related to one another. An extension of this technique considers that there may be more than one underlying factor, and that relationships among the underlying latent variables are of primary interest. However, due to the complicated nature of the covariance structure of this type of data, existing methods are not satisfactory. We thus propose a generalized spatial structural equation model. In the first level of the model, we assume that the observed variables are related to particular underlying factors. In the second level of the model, we use the structural equation method to model the relationship among the underlying factors and use parametric spatial distributions on the covariance structure of the underlying factors. We apply the model to county-level cancer mortality and census summary data for Minnesota, including socioeconomic status and access to public utilities.
