20 similar articles found (search time: 0 ms)
1.
Background
Recently, mixed linear models have been used to address the issue of “missing” heritability in traditional genome-wide association studies (GWAS). These models assume that all single-nucleotide polymorphisms (SNPs) are associated with the phenotypes of interest. However, it is more common that only a small proportion of SNPs have significant effects on the phenotypes, while most SNPs have no or very small effects. To incorporate this feature, we propose an efficient Hierarchical Bayesian Model (HBM) that extends the existing mixed models to enforce automatic selection of significant SNPs. The HBM models the SNP effects using a mixture distribution of a point mass at zero and a normal distribution, where the point mass corresponds to the non-associated SNPs.
Results
We estimate the HBM using Gibbs sampling. The estimation performance of our method is first demonstrated through two simulation studies. We make the simulation setups realistic by using parameters fitted on the Framingham Heart Study (FHS) data. The simulation studies show that our method can accurately estimate the proportion of SNPs associated with the simulated phenotype and identify these SNPs, and that it adapts better to certain model mis-specification than the standard mixed models. In addition, we analyze data from the FHS and the Health and Retirement Study (HRS) to study the association between Body Mass Index (BMI) and SNPs on Chromosome 16, and replicate the identified genetic associations. The analysis of the FHS data identifies 0.3% of SNPs on Chromosome 16 as affecting BMI, including rs9939609 and rs9939973 on the FTO gene. These two SNPs are in strong linkage disequilibrium with rs1558902 (Rsq = 0.901 for rs9939609 and Rsq = 0.905 for rs9939973), which has been reported to be linked with obesity in previous GWAS. We then replicate the findings using the HRS data: the analysis finds 0.4% of SNPs associated with BMI on Chromosome 16. Furthermore, around 25% of the genes that are identified to be associated with BMI are common between the two studies.
Conclusions
The results demonstrate that the HBM and the associated estimation algorithm offer a powerful tool for identifying significant genetic associations with phenotypes of interest, among a large number of SNPs that are common in modern genetics studies.
2.
We consider Bayesian inference and model selection for prevalence estimation using a longitudinal two-phase design in which subjects initially receive a low-cost screening test followed by an expensive diagnostic test conducted on several occasions. The change in the subject's diagnostic probability over time is described using four mixed-effects probit models in which the subject-specific effects are captured by latent variables. The computations are performed using Markov chain Monte Carlo methods. These models are then compared using the deviance information criterion. The methodology is illustrated with an analysis of alcohol and drug use in adolescents using data from the Great Smoky Mountains Study.
3.
Bayesian (via Gibbs sampling) and empirical BLUP (EBLUP) estimation of fixed effects and breeding values were compared by simulation. Combinations of two simulation models (with or without effect of contemporary group (CG)), three selection schemes (random, phenotypic and BLUP selection), two levels of heritability (0.20 and 0.50) and two levels of pedigree information (0% and 15% randomly missing) were considered. Populations consisted of 450 animals spread over six discrete generations. An infinitesimal additive genetic animal model was assumed while simulating data. EBLUP and Bayesian estimates of CG effects and breeding values were, in all situations, essentially the same with respect to Spearman's rank correlation between true and estimated values. Bias and mean square error (MSE) of EBLUP and Bayesian estimates of CG effects and breeding values showed the same pattern over the range of simulated scenarios. Methods were not biased by phenotypic and BLUP selection when pedigree information was complete, albeit MSE of estimated breeding values increased for situations where CG effects were present. Estimation of breeding values by Bayesian and EBLUP was similarly affected by joint effect of phenotypic or BLUP selection and randomly missing pedigree information. For both methods, bias and MSE of estimated breeding values and CG effects substantially increased across generations.
4.
Multiple shrinkage and subset selection in wavelets (total citations: 6; self-citations: 0; citations by others: 6)
5.
C. S. Wang D. Gianola D. A. Sorensen J. Jensen A. Christensen J. J. Rutledge 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》1994,88(2):220-230
A replicated selection experiment aimed at increasing litter size (total number of pigs born per litter) in Danish Landrace pigs was conducted from 1984 to 1991. The experiment included two selection and two control lines. In each generation, 30 and 14 first litters were produced in selection and control lines, respectively, and dams produced two litters. Each replicate, consisting of one selection and one control line, was founded from 60 families chosen randomly from the population at large. Family selection was practiced, and the criterion was the predicted breeding value for litter size computed using a repeatability animal model, and taking into account all available information. The data consisted of 947 records from 523 dams (424 dams had two litters) representing five cycles of selection of increased litter size. Data were analyzed from a Bayesian perspective, based on marginal posterior distributions of genetic parameters of interest. Marginalization was achieved using Gibbs sampling, with a single chain length of 1 205 000. After discarding the first 5 000 iterations, a sample was drawn every ten iterations, so 120 000 samples in total were saved. Densities were estimated and plotted, and summary statistics were computed from the estimated densities. The posterior means (± standard error) of heritability and repeatability were 0.22 ± 0.06 and 0.32 ± 0.05, respectively. These point estimates of genetic parameters were within the range of literature values, although on the high side. The posterior mean (± standard error) of genetic response to selection, defined as the difference between the mean breeding values of the selected lines and that of the base population, was 1.37 ± 0.43 pigs after five cycles of selection. The regression (through the origin) of breeding values in the selected lines on generation was 0.25 ± 0.08 pigs. 
Several informative priors constructed from information obtained with field data in this population were used to examine their influence on inferences. The priors were influential because of the relatively small scale of the experiment. An analysis excluding data from one of the control lines gave smaller genetic variance and heritability, and a smaller response to selection. However, it appears that selection for litter size is effective, but that the true rate of response is probably smaller than data from this experiment suggest.
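The burn-in and thinning figures quoted in this abstract (a single chain of 1 205 000 iterations, the first 5 000 discarded, one draw kept every ten iterations) can be checked with a short Python sketch. The function name `thinned_sample_count` is hypothetical and used only for illustration; it is not taken from the paper.

```python
def thinned_sample_count(chain_length: int, burn_in: int, thin: int) -> int:
    """Count the draws kept after dropping burn-in and keeping every thin-th draw.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    if burn_in > chain_length:
        raise ValueError("burn_in cannot exceed chain_length")
    if thin < 1:
        raise ValueError("thin must be a positive integer")
    # Indices burn_in, burn_in + thin, ... below chain_length are retained.
    return len(range(burn_in, chain_length, thin))


# Figures quoted in the abstract: 1 205 000 iterations, 5 000 burn-in, thin by 10.
kept = thinned_sample_count(1_205_000, 5_000, 10)
print(kept)  # 120000
```

The count agrees with the 120 000 saved samples reported in the abstract.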
6.
We describe a Bayesian quantile regression model that uses a confirmatory factor structure for part of the design matrix. This model is appropriate when the covariates are indicators of scientifically determined latent factors, and it is these latent factors that analysts seek to include as predictors in the quantile regression. We apply the model to a study of birth weights in which the effects of latent variables representing psychosocial health and actual tobacco usage on the lower quantiles of the response distribution are of interest. The models can be fit using an R package called factorQR.
7.
David Manner John W. Seaman Dean M. Young 《Biometrical journal. Biometrische Zeitschrift》2004,46(6):750-759
If a dependent variable in a regression analysis is exceptionally expensive or hard to obtain, the overall sample size used to fit the model may be limited. To avoid this, one may use a cheaper or more easily collected “surrogate” variable to supplement the expensive variable. The regression analysis will be enhanced to the degree the surrogate is associated with the costly dependent variable. We develop a Bayesian approach incorporating surrogate variables in regression based on a two‐stage experiment. Illustrative examples are given, along with comparisons to an existing frequentist method. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
8.
Summary. Two-agent combination trials have recently attracted enormous attention in oncology research. There are several strong motivations for combining different agents in a treatment: to induce the synergistic treatment effect, to increase the dose intensity with nonoverlapping toxicities, and to target different tumor cell susceptibilities. To accommodate this growing trend in clinical trials, we propose a Bayesian adaptive design for dose finding based on latent 2 × 2 tables. In the search for the maximum tolerated dose combination, we continuously update the posterior estimates for the unknown parameters associated with marginal probabilities and the correlation parameter based on the data from successive patients. By reordering the dose toxicity probabilities in the two-dimensional space, we assign each coming cohort of patients to the most appropriate dose combination. We conduct extensive simulation studies to examine the operating characteristics of the proposed method under various practical scenarios. Finally, we illustrate our dose-finding procedure with a clinical trial of agent combinations at M. D. Anderson Cancer Center.
9.
This paper considers the use of a multivariate binomial probit model for the analysis of correlated exchangeable binary data. The model can naturally accommodate both cluster and individual level covariates, while keeping a fairly flexible intracluster association structure. We discuss Bayesian estimation when a sample of independent clusters of varying sizes is available, and show how Gibbs sampling may be used to derive the posterior densities of parameters. The methodology is illustrated with two examples: the first involves epidemiological data from a study of familial disease aggregation; the second uses teratological data from a developmental toxicity application.
10.
A Bayesian adaptive design is proposed for dose-finding in phase I/II clinical trials to incorporate the bivariate outcomes, toxicity and efficacy, of a new treatment. Without specifying any parametric functional form for the drug dose-response curve, we jointly model the bivariate binary data to account for the correlation between toxicity and efficacy. After observing all the responses of each cohort of patients, the dosage for the next cohort is escalated, deescalated, or unchanged according to the proposed odds ratio criteria constructed from the posterior toxicity and efficacy probabilities. A novel class of prior distributions is proposed through logit transformations which implicitly imposes a monotonic constraint on dose toxicity probabilities and correlates the probabilities of the bivariate outcomes. We conduct simulation studies to evaluate the operating characteristics of the proposed method. Under various scenarios, the new Bayesian design based on the toxicity-efficacy odds ratio trade-offs exhibits good properties and treats most patients at the desirable dose levels. The method is illustrated with a real trial design for a breast medical oncology study.
11.
D. P. Gwaze J. A. Woolliams 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2001,103(1):63-69
The use of Gibbs sampling in making decisions about the optimal selection environment was demonstrated. Marginal posterior distributions of the efficiency of selection across sites were obtained using the Gibbs sampler, a Bayesian method, from which the probability that the efficiency of selection lay between specified values and the variance of the distribution were computed, providing a lot of information on which to make decisions regarding the location of genetic tests. The heritability, genetic correlations and efficiencies of selection estimated using REML and Gibbs sampling were similar. However, the latter approach showed that the point estimates of the efficiencies of selection were subject to substantial error. The decision regarding selection at maturity was consistent with that obtained using point estimates from REML, but Gibbs sampling allowed the efficiencies of selection to be interpreted with more confidence. The decision regarding early selection differed from that based on REML point estimates. Generally, the decisions to make early selections at site B for planting at both site B and A, and to make selections at maturity at each individual site, were robust to different priors in the Gibbs sampling.
Received: 19 June 2000 / Accepted: 18 October 2000
12.
Clarke SC McAllister MK Milner-Gulland EJ Kirkwood GP Michielsens CG Agnew DJ Pikitch EK Nakano H Shivji MS 《Ecology letters》2006,9(10):1115-1126
Despite growing concerns about overexploitation of sharks, lack of accurate, species-specific harvest data often hampers quantitative stock assessment. In such cases, trade studies can provide insights into exploitation unavailable from traditional monitoring. We applied Bayesian statistical methods to trade data in combination with genetic identification to estimate, by species, the annual number of globally traded shark fins, the most commercially valuable product from a group of species often unrecorded in harvest statistics. Our results provide the first fishery-independent estimate of the scale of shark catches worldwide and indicate that shark biomass in the fin trade is three to four times higher than shark catch figures reported in the only global database. Comparison of our estimates to approximated stock assessment reference points for one of the most commonly traded species, blue shark, suggests that current trade volumes in numbers of sharks are close to or possibly exceeding the maximum sustainable yield levels.
13.
Ning Li Michael J. Daniels Gang Li Robert M. Elashoff 《Biometrical journal. Biometrische Zeitschrift》2013,55(1):17-37
We explore a Bayesian approach to selection of variables that represent fixed and random effects in modeling of longitudinal binary outcomes with missing data caused by dropouts. We show via analytic results for a simple example that nonignorable missing data lead to biased parameter estimates. This bias results in selection of wrong effects asymptotically, which we can confirm via simulations for more complex settings. By jointly modeling the longitudinal binary data with the dropout process that possibly leads to nonignorable missing data, we are able to correct the bias in estimation and selection. Mixture priors with a point mass at zero are used to facilitate variable selection. We illustrate the proposed approach using a clinical trial for acute ischemic stroke.
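A mixture prior with a point mass at zero, as mentioned in this abstract, can be illustrated by drawing effects from such a prior. The following is a minimal Python sketch under stated assumptions: the function name `draw_spike_and_slab`, the normal "slab" component, and the parameter values (inclusion probability 0.1, slab standard deviation 1.0) are illustrative choices, not details taken from the paper.

```python
import random


def draw_spike_and_slab(n_effects, inclusion_prob, slab_sd, seed=0):
    """Draw effects from a point-mass-at-zero / normal mixture prior.

    Hypothetical helper for illustration only. Each effect is exactly zero
    with probability 1 - inclusion_prob, and otherwise drawn from a
    Normal(0, slab_sd) distribution (the assumed "slab").
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(n_effects):
        if rng.random() < inclusion_prob:
            effects.append(rng.gauss(0.0, slab_sd))  # selected: nonzero effect
        else:
            effects.append(0.0)  # excluded: point mass at zero
    return effects


# Assumed illustrative settings: 10 000 candidate effects, 10% selected a priori.
effects = draw_spike_and_slab(10_000, 0.1, 1.0)
share_nonzero = sum(e != 0.0 for e in effects) / len(effects)
print(f"share of nonzero effects: {share_nonzero:.3f}")
```

Under such a prior, variable selection amounts to asking, for each effect, how much posterior probability remains on the exact-zero component.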
14.
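Research progress in ecotoxicology and risk assessment of low-dose mixed pollution (total citations: 3; self-citations: 0; citations by others: 3)
Chemicals in the environment often occur as low-dose mixtures. Findings from ecotoxicity studies of single chemicals under high-dose exposure are difficult to apply to diagnosing the ecotoxicological effects of low-dose mixtures in the environment and to assessing their risks. This paper reviews research progress in the ecotoxicology and risk assessment of low-dose chemical mixture pollution, mainly covering molecular toxicological research methods for diagnosing low-dose chemical mixture pollution and risk assessment methods, and introduces risk assessment schemes for simple and complex mixtures. It offers views on the development trends of research on the ecotoxicology and risk assessment of low-dose mixed pollution, pointing out that research on low-dose chemical mixtures needs to identify sensitive endpoints, introduce multidisciplinary approaches, accumulate more data, and establish a complete and unified assessment system.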
15.
This paper discusses random effects in censored ordinal regression and presents a Gibbs sampling approach to fit the regression model. A latent structure and its corresponding Bayesian formulation are introduced to effectively deal with heterogeneous and censored ordinal observations. This work is motivated by the need to analyze interval-censored ordinal data from multiple studies in toxicological risk assessment. Application of our methodology to the data offers further support to the conclusions developed earlier using GEE methods yet provides additional insight into the uncertainty levels of the risk estimates.
16.
Accurate and fast estimation of genetic parameters that underlie quantitative traits using mixed linear models with additive and dominance effects is of great importance in both natural and breeding populations. Here, we propose a new fast adaptive Markov chain Monte Carlo (MCMC) sampling algorithm for the estimation of genetic parameters in the linear mixed model with several random effects. In the learning phase of our algorithm, we use the hybrid Gibbs sampler to learn the covariance structure of the variance components. In the second phase of the algorithm, we use this covariance structure to formulate an effective proposal distribution for a Metropolis-Hastings algorithm, which uses a likelihood function in which the random effects have been integrated out. Compared with the hybrid Gibbs sampler, the new algorithm had better mixing properties and was approximately twice as fast to run. Our new algorithm was able to detect different modes in the posterior distribution. In addition, the posterior mode estimates from the adaptive MCMC method were close to the REML (residual maximum likelihood) estimates. Moreover, our exponential prior for inverse variance components was vague and enabled the estimated mode of the posterior variance to be practically zero, which was in agreement with the support from the likelihood (in the case of no dominance). The method performance is illustrated using simulated data sets with replicates and field data in barley.
17.
Waldmann P 《Evolution; international journal of organic evolution》2004,58(2):238-244
The concept of developmental instability (DI) is frequently used in evolutionary biology, and a range of definitions has been proposed. Moreover, numerous different statistical methods have been used for estimation of DI. The common basis for all methods is that measures need to be obtained from repeated structures within organisms. In the case of fluctuating asymmetry, mirror images could be interpreted as the repeats of each other. All repeats of a trait on one organism should, from a quantitative perspective, have the same genetic foundation. Most previous methods have not accounted for the genetics of the underlying trait. It is here shown how a statistical method from quantitative genetics (the repeated records animal model) can be used for assessment of DI, based on estimation of the variance due to the permanent environment. Moreover, Gibbs sampling is used for inference of the parameters, which provides a Bayesian framework where posterior distributions can easily be calculated from any functions of the variance components. The method is applied to a real dataset from two populations of the plant Scabiosa canescens, and the results show that it works well under realistic situations.
18.
Across multiply imputed data sets, variable selection methods such as stepwise regression and other criterion-based strategies that include or exclude particular variables typically result in models with different selected predictors, thus presenting a problem for combining the results from separate complete-data analyses. Here, drawing on a Bayesian framework, we propose two alternative strategies to address the problem of choosing among linear regression models when there are missing covariates. One approach, which we call "impute, then select" (ITS) involves initially performing multiple imputation and then applying Bayesian variable selection to the multiply imputed data sets. A second strategy is to conduct Bayesian variable selection and missing data imputation simultaneously within one Gibbs sampling process, which we call "simultaneously impute and select" (SIAS). The methods are implemented and evaluated using the Bayesian procedure known as stochastic search variable selection for multivariate normal data sets, but both strategies offer general frameworks within which different Bayesian variable selection algorithms could be used for other types of data sets. A study of mental health services utilization among children in foster care programs is used to illustrate the techniques. Simulation studies show that both ITS and SIAS outperform complete-case analysis with stepwise variable selection and that SIAS slightly outperforms ITS.
19.
Simulated data were used to determine the properties of multivariate prediction of breeding values for categorical and continuous traits using phenotypic, molecular genetic and pedigree information by mixed linear-threshold animal models via Gibbs sampling. Simulation parameters were chosen such that the data resembled situations encountered in Warmblood horse populations. Genetic evaluation was performed in the context of the radiographic findings in the equine limbs. The simulated pedigree comprised seven generations and 40 000 animals per generation. The simulated data included additive genetic values, residuals and fixed effects for one continuous trait and liabilities of four binary traits. For one of the binary traits, quantitative trait locus (QTL) effects and genetic markers were simulated, with three different scenarios with respect to recombination rate (r) between genetic markers and QTL and polymorphism information content (PIC) of genetic markers being studied: r = 0.00 and PIC = 0.90 (r0p9), r = 0.01 and PIC = 0.90 (r1p9), and r = 0.00 and PIC = 0.70 (r0p7). For each scenario, 10 replicates were sampled from the simulated horse population, and six different data sets were generated per replicate. Data sets differed in number and distribution of animals with trait records and the availability of genetic marker information. Breeding values were predicted via Gibbs sampling using a Bayesian mixed linear-threshold animal model with residual covariances fixed to zero and a proper prior for the genetic covariance matrix. Relative breeding values were used to investigate expected response to multi- and single-trait selection. In the sires with 10 or more offspring with trait information, correlations between true and predicted breeding values ranged between 0.89 and 0.94 for the continuous traits and between 0.39 and 0.77 for the binary traits. 
Proportions of successful identification of sires of average, favourable and unfavourable genetic value were 81% to 86% for the continuous trait and 57% to 74% for the binary traits in these sires. Expected decrease of prevalence of the QTL trait was 3% to 12% after multi-trait selection for all binary traits and 9% to 17% after single-trait selection for the QTL trait. The combined use of phenotype and genotype data was superior to the use of phenotype data alone. It was concluded that information on phenotypes and highly informative genetic markers should be used for prediction of breeding values in mixed linear-threshold animal models via Gibbs sampling to achieve maximum reduction in prevalences of binary traits.
20.
Getachew A. Dagne 《Biometrical journal. Biometrische Zeitschrift》2004,46(6):653-663
This article presents two‐component hierarchical Bayesian models which incorporate both overdispersion and excess zeros. The components may be resultants of some intervention (treatment) that changes the rare event generating process. The models are also expanded to take into account any heterogeneity that may exist in the data. Details of the model fitting, checking and selecting alternative models from a Bayesian perspective are also presented. The proposed methods are applied to count data on the assessment of the efficacy of pesticides in controlling the reproduction of whitefly. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
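One common two-component construction for counts with excess zeros is the zero-inflated Poisson, in which a count is a structural zero with some probability and otherwise follows a Poisson distribution. The Python sketch below illustrates that general idea only; it is not claimed to be the model used in the article, and the function name `zip_pmf` and the parameter values are assumptions made for illustration.

```python
import math


def zip_pmf(k, zero_prob, rate):
    """Probability of count k under a zero-inflated Poisson distribution.

    Hypothetical illustration: with probability zero_prob the count is a
    structural zero; otherwise it is drawn from Poisson(rate).
    """
    poisson = math.exp(-rate) * rate**k / math.factorial(k)
    if k == 0:
        return zero_prob + (1.0 - zero_prob) * poisson
    return (1.0 - zero_prob) * poisson


# Assumed illustrative values: 30% structural zeros, Poisson rate 2.0.
zero_prob, rate = 0.3, 2.0
probs = [zip_pmf(k, zero_prob, rate) for k in range(60)]
mean = sum(k * p for k, p in enumerate(probs))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
print(f"P(0) = {probs[0]:.4f}, mean = {mean:.4f}, variance = {variance:.4f}")
```

For this distribution the mean is (1 - zero_prob) * rate and the variance is (1 - zero_prob) * rate * (1 + zero_prob * rate), so the variance exceeds the mean whenever zero_prob is positive: the same mechanism that adds zeros also produces overdispersion relative to a plain Poisson.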