Similar Articles
20 similar articles found.
1.
The main causes of numerical chromosomal anomalies, including trisomies, arise from an error in chromosomal segregation during the meiotic process, termed a non-disjunction. One of the techniques most used nowadays to analyze chromosomal anomalies is the polymerase chain reaction (PCR), in which the number of peaks or alleles at a polymorphic microsatellite locus is counted. It was shown in previous works that the number of peaks has a multinomial distribution whose probabilities depend on the non-disjunction fraction F. In this work, we propose a Bayesian approach for estimating the meiosis I non-disjunction fraction F in the absence of parental information. Since samples of trisomic patients are, in general, small, the Bayesian approach can be a good alternative for solving this problem. We consider the sampling/importance resampling technique and the Simpson rule to extract information from the posterior distribution of F. Bayes and maximum likelihood estimators are compared through a Monte Carlo simulation, focusing on the influence of different sample sizes and prior specifications on the estimates. We apply the proposed method to estimate F for patients with trisomy of chromosome 21, providing a sensitivity analysis for the method. The results obtained show that Bayes estimators are better in almost all situations.
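The sampling/importance resampling (SIR) step described above can be sketched in a few lines. This is a toy illustration only: the `cell_probs` mapping from F to the multinomial peak-count probabilities is a hypothetical stand-in, not the genetic model of the paper.

```python
import random
import math

def sir_posterior_F(counts, n_draws=20000, n_resample=2000, seed=1):
    """Sampling/importance resampling sketch for the non-disjunction
    fraction F. counts[i] = number of patients showing i+1 peaks."""
    rng = random.Random(seed)

    def cell_probs(F):
        # HYPOTHETICAL illustration only: larger F shifts mass from the
        # 1-peak category to the 3-peak category; probabilities sum to 1.
        return [0.5 * (1 - F), 0.5, 0.5 * F]

    def log_lik(F):
        p = cell_probs(F)
        return sum(c * math.log(max(q, 1e-300)) for c, q in zip(counts, p))

    draws = [rng.random() for _ in range(n_draws)]      # Uniform(0,1) prior
    logw = [log_lik(F) for F in draws]
    m = max(logw)
    w = [math.exp(l - m) for l in logw]                 # stabilised weights
    resampled = rng.choices(draws, weights=w, k=n_resample)
    return sum(resampled) / n_resample                  # posterior mean of F

est = sir_posterior_F(counts=[10, 40, 30])
```

With these illustrative counts the importance weights reduce to a Beta(31, 11) kernel, so the resampled mean lands near 0.74.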

2.
In this paper we introduce a misclassification model for the meiosis I non‐disjunction fraction in numerical chromosomal anomalies named trisomies. We obtain posteriors, and their moments, for the probability that a non‐disjunction occurs in the first division of meiosis and for the misclassification errors. We also extend previous works by providing the exact posterior, and its moments, for the probability that a non‐disjunction occurs in the first division of meiosis assuming the model proposed in the literature which does not consider that data are subject to misclassification. We perform Monte Carlo studies in order to compare Bayes estimates obtained by using both models. An application to Down Syndrome data is also presented. (© 2008 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

3.
To study lifetimes of certain engineering processes, a lifetime model which can accommodate the nature of such processes is desired. Mixture models of underlying lifetime distributions are intuitively more appropriate and appealing for modeling the heterogeneous nature of a process than simple models. This paper studies a 3-component mixture of Rayleigh distributions from a Bayesian perspective. The censored sampling environment is considered due to its popularity in reliability theory and survival analysis. The expressions for the Bayes estimators and their posterior risks are derived under different scenarios. For the case that no or little prior information is available, an elicitation of hyperparameters is given. To examine, numerically, the performance of the Bayes estimators using non-informative and informative priors under different loss functions, we have simulated their statistical properties for different sample sizes and test termination times. In addition, to highlight the practical significance, an illustrative example based on real-life engineering data is also given.

4.
We consider that observations come from a general normal linear model and that it is desirable to test a simplifying null hypothesis about the parameters. We approach this problem from an objective Bayesian, model-selection perspective. Crucial ingredients for this approach are ‘proper objective priors’ to be used for deriving the Bayes factors. Jeffreys-Zellner-Siow priors have good properties for testing null hypotheses defined by specific values of the parameters in full-rank linear models. We extend these priors to deal with general hypotheses in general linear models, not necessarily of full rank. The resulting priors, which we call ‘conventional priors’, are expressed as a generalization of recently introduced ‘partially informative distributions’. The corresponding Bayes factors are fully automatic, easily computed and very reasonable. The methodology is illustrated for the change-point problem and the equality of treatments effects problem. We compare the conventional priors derived for these problems with other objective Bayesian proposals like the intrinsic priors. It is concluded that both priors behave similarly although interesting subtle differences arise. We adapt the conventional priors to deal with nonnested model selection as well as multiple-model comparison. Finally, we briefly address a generalization of conventional priors to nonnormal scenarios.

5.
The discovery of rare genetic variants through next generation sequencing is a very challenging issue in the field of human genetics. We propose a novel region‐based statistical approach based on a Bayes factor (BF) to assess evidence of association between a set of rare variants (RVs) located on the same genomic region and a disease outcome in the context of a case‐control design. Marginal likelihoods are computed under the null and alternative hypotheses assuming a binomial distribution for the RV count in the region and a beta or mixture of Dirac and beta prior distribution for the probability of RV. We derive the theoretical null distribution of the BF under our prior setting and show that a Bayesian control of the False Discovery Rate can be obtained for genome‐wide inference. Informative priors are introduced using prior evidence of association from a Kolmogorov‐Smirnov test statistic. We use our simulation program, sim1000G, to generate RV data similar to the 1000 Genomes sequencing project. Our simulation studies showed that the new BF statistic outperforms standard methods (SKAT, SKAT‐O, Burden test) in case‐control studies with moderate sample sizes and is equivalent to them under large sample size scenarios. Our real data application to a lung cancer case‐control study found enrichment for RVs in known and novel cancer genes. It also suggests that using the BF with informative prior improves the overall gene discovery compared to the BF with noninformative prior.
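The core of such a region-based Bayes factor can be sketched with a plain beta prior on the carrier probability (the published statistic also allows a Dirac/beta mixture; that refinement is omitted here, and the counts and `p0` below are illustrative assumptions).

```python
from math import lgamma, log

def log_beta(a, b):
    # log of the Beta function via log-gamma
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_bf_rare_variants(k, n, p0, a=1.0, b=1.0):
    """Hedged sketch of a region-based Bayes factor: k rare-variant
    carriers out of n chromosomes. H0 fixes the carrier probability at
    p0; H1 places a Beta(a, b) prior on it, giving a beta-binomial
    marginal likelihood. Returns the log BF in favour of H1."""
    log_m0 = k * log(p0) + (n - k) * log(1 - p0)          # binomial kernel, H0
    log_m1 = log_beta(a + k, b + n - k) - log_beta(a, b)  # beta-binomial, H1
    return log_m1 - log_m0

lbf = log_bf_rare_variants(k=30, n=200, p0=0.05)          # excess of carriers
```

When the observed carrier count sits well above the null rate the log BF is positive; when it matches the null rate the BF favours H0.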

6.
We consider the problem of estimating a population size by removal sampling when the sampling rate is unknown. Bayesian methods are now widespread and allow prior knowledge to be included in the analysis. However, we show that Bayes estimates based on default improper priors lead to improper posteriors or infinite estimates. Similarly, weakly informative priors give unstable estimators that are sensitive to the choice of hyperparameters. By examining the likelihood, we show that population size estimates can be stabilized by penalizing small values of the sampling rate or large values of the population size. Based on theoretical results and simulation studies, we propose some recommendations on the choice of the prior. Finally, we apply our results to real datasets.
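The stabilization idea can be sketched as a penalized likelihood over a grid: a penalty on small sampling rates blocks the degenerate small-rate/huge-population solutions. The two-pass removal data, penalty form, and grid bounds below are illustrative assumptions, not the paper's settings.

```python
import math

def removal_estimate(catches, N_max=600, pen=1.0):
    """Penalized-likelihood sketch for removal sampling with unknown
    sampling rate p. Each pass removes a binomial catch from the
    remaining animals; pen * log(p) penalizes small sampling rates."""
    total = sum(catches)
    best = (None, None, -math.inf)
    for N in range(total, N_max + 1):
        for p in [i / 200 for i in range(1, 200)]:   # grid over p
            ll = pen * math.log(p)                   # penalty on small p
            remaining = N
            for c in catches:
                # binomial log-likelihood for this removal pass
                ll += (math.lgamma(remaining + 1) - math.lgamma(c + 1)
                       - math.lgamma(remaining - c + 1)
                       + c * math.log(p) + (remaining - c) * math.log(1 - p))
                remaining -= c
            if ll > best[2]:
                best = (N, p, ll)
    return best[:2]

N_hat, p_hat = removal_estimate([150, 75])   # illustrative two-pass catches
```

For these catches the unpenalized estimate is the classical two-pass value (N = 300, p = 0.5); the penalty nudges p up and N down slightly.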

7.
OBJECTIVE: We continue statistical development of the posterior probability of linkage (PPL). We present a two-point PPL allowing for unequal male and female recombination fractions, thetaM and thetaF, and consider alternative priors on thetaM, thetaF. METHODS: We compare the sex-averaged PPL (PPLSA), assuming thetaM = thetaF, to the sex-specific PPL (PPLSS) in (thetaM, thetaF), in a series of simulations; we also compute the PPLSS using alternative priors on (thetaM, thetaF). RESULTS: The PPLSS based on a prior that ignores prior genomic information on sex-specific recombination rates performs essentially identically to the PPLSA, even in the presence of large thetaM, thetaF differences. Moreover, adaptively skewing the prior, to incorporate (correct) genomic information on thetaM, thetaF differences, actually worsens performance of the PPLSS. We demonstrate that this has little to do with the PPLSS per se, but is rather due to extremely high levels of variability in the location of the maximum likelihood estimates of (thetaM, thetaF) in realistic data sets. CONCLUSIONS: Incorporating (correct) prior genomic information is not always helpful. We recommend that the PPLSA be used as the standard form of the PPL regardless of the sex-specific recombination rates in the region of the marker in question.

8.
The measurement of biallelic pair-wise association, called linkage disequilibrium (LD), is an important issue for understanding genomic architecture. A plethora of measures of association in two-by-two tables have been proposed in the literature. Besides the problem of choosing an appropriate measure, the problem of their estimation has been neglected in the literature. It needs to be emphasized that the definition of a measure and the choice of an estimator function for it are conceptually unrelated tasks. In this paper, we compare the performance of various estimators for the three popular LD measures D', r and Y in a simulation study for small to moderate sample sizes (N<=500). The usual frequency plug-in estimators can lead to unreliable or undefined estimates. Estimators based on the computationally expensive volume measures have been proposed recently as a remedy to this well-known problem. We confirm that volume estimators have better expected mean square error than the naive plug-in estimators. But they are outperformed by estimators that plug easy-to-calculate non-informative Bayesian probability estimates into the theoretical formulae for the measures. Fully Bayesian estimators with non-informative Dirichlet priors have comparable accuracy but are computationally more expensive. We recommend the use of non-informative Bayesian plug-in estimators based on Jeffreys' prior, in particular when dealing with SNP array data, where the occurrence of small table entries and table margins is likely.
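The recommended Jeffreys-prior plug-in idea is easy to sketch for the LD measure r: add 1/2 to each haplotype count (the Dirichlet(1/2,...,1/2) posterior-mean adjustment) before plugging the probabilities into the theoretical formula. The counts below are illustrative.

```python
import math

def r_jeffreys(n11, n12, n21, n22):
    """Jeffreys-prior plug-in estimator for the LD measure r.
    Unlike the frequency plug-in estimator, it is always defined,
    even when a table entry or margin is zero."""
    a, b, c, d = n11 + 0.5, n12 + 0.5, n21 + 0.5, n22 + 0.5
    n = a + b + c + d
    p11, p12, p21, p22 = a / n, b / n, c / n, d / n
    num = p11 * p22 - p12 * p21                     # haplotype covariance
    den = math.sqrt((p11 + p12) * (p21 + p22) *     # product of margins
                    (p11 + p21) * (p12 + p22))
    return num / den

r = r_jeffreys(40, 10, 10, 40)
```

With two empty cells, e.g. `r_jeffreys(10, 0, 0, 10)`, the frequency plug-in would report perfect LD (r = 1); the Jeffreys version returns a finite value strictly below 1, illustrating the shrinkage the abstract favours.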

9.
Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production, with a huge distribution area in southern China. Accurate estimation of biomass is required for accounting and monitoring Chinese forest carbon stocking. In this study, allometric equations were used to analyze tree biomass of Chinese fir. The common methods for estimating allometric models take the classical approach based on the frequency interpretation of probability. However, many different biotic and abiotic factors introduce variability into the Chinese fir biomass model, suggesting that its parameters are better represented by probability distributions than by the fixed values of the classical method. To deal with this problem, a Bayesian method was used for estimating the Chinese fir biomass model. In the Bayesian framework, two kinds of priors were introduced: non-informative priors and informative priors. For the informative priors, 32 biomass equations of Chinese fir were collected from the published literature, and the parameter distributions from that literature were used as prior distributions in the Bayesian model. The Bayesian method with informative priors performed better than both the non-informative priors and the classical method, providing a reasonable approach for estimating Chinese fir biomass.
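The idea of combining field data with literature-based priors can be sketched as a conjugate update of the slope in the log-log allometric model ln(biomass) = a + b·ln(diameter). The tree data, prior settings, and error variance below are illustrative assumptions, not values from the paper.

```python
import math

def posterior_slope(x, y, prior_mean, prior_var, sigma2=0.05):
    """Conjugate sketch: informative normal prior on the allometric
    slope b, combined with the least-squares slope from (centred)
    log-log data by precision weighting."""
    n = len(x)
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (w - my) for u, w in zip(lx, ly))
    b_ols = sxy / sxx                          # classical (frequentist) slope
    prec = sxx / sigma2 + 1.0 / prior_var      # data + prior precision
    b_post = (sxx / sigma2 * b_ols + prior_mean / prior_var) / prec
    return b_ols, b_post

diam = [8.0, 12.0, 16.0, 20.0, 25.0]           # illustrative diameters (cm)
mass = [12.0, 35.0, 70.0, 120.0, 210.0]        # illustrative biomass (kg)
b_ols, b_post = posterior_slope(diam, mass, prior_mean=2.4, prior_var=0.01)
```

The posterior slope is a precision-weighted average, so it always lies between the literature prior mean and the data-only estimate.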

10.
We propose a Bayesian method for testing molecular clock hypotheses for use with aligned sequence data from multiple taxa. Our method utilizes a nonreversible nucleotide substitution model to avoid the necessity of specifying either a known tree relating the taxa or an outgroup for rooting the tree. We employ reversible jump Markov chain Monte Carlo to sample from the posterior distribution of the phylogenetic model parameters and conduct hypothesis testing using Bayes factors, the ratio of the posterior to prior odds of competing models. Here, the Bayes factors reflect the relative support of the sequence data for equal rates of evolutionary change between taxa versus unequal rates, averaged over all possible phylogenetic parameters, including the tree and root position. As the molecular clock model is a restriction of the more general unequal rates model, we use the Savage-Dickey ratio to estimate the Bayes factors. The Savage-Dickey ratio provides a convenient approach to calculating Bayes factors in favor of sharp hypotheses. Critical to calculating the Savage-Dickey ratio is a determination of the prior induced on the modeling restrictions. We demonstrate our method on a well-studied mtDNA sequence data set consisting of nine primates. We find strong support against a global molecular clock, but do find support for a local clock among the anthropoids. We provide mathematical derivations of the induced priors on branch length restrictions assuming equally likely trees. These derivations also have more general applicability to the examination of prior assumptions in Bayesian phylogenetics.
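The Savage-Dickey ratio itself is simple to demonstrate outside the phylogenetic setting: for a sharp hypothesis nested in a continuous model, the Bayes factor in favour of the restriction equals the posterior density at the restricted value divided by the prior density there. The conjugate normal toy case below is an illustration, not the paper's model.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def savage_dickey_bf01(xbar, n, sigma2=1.0, tau2=1.0, theta0=0.0):
    """Savage-Dickey sketch: H0 fixes a normal mean at theta0, H1 gives
    it a N(0, tau2) prior; data are n observations with known variance
    sigma2 and sample mean xbar. Returns BF in favour of H0."""
    post_var = 1.0 / (n / sigma2 + 1.0 / tau2)     # conjugate posterior
    post_mean = post_var * (n * xbar / sigma2)
    return normal_pdf(theta0, post_mean, post_var) / normal_pdf(theta0, 0.0, tau2)

bf01 = savage_dickey_bf01(xbar=0.05, n=50)   # data consistent with theta0 = 0
```

Data near the restricted value concentrate posterior mass there, so the ratio exceeds 1; data far from it drive the ratio toward 0.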

11.
BACKGROUND: An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biological information concerning the variables of interest. Pathway and network maps are one example of a source of such information. However, although ancillary information is increasingly available, it is not always clear how it should be used nor how it should be weighted in relation to primary data. RESULTS: We put forward an approach in which biological knowledge is incorporated using informative prior distributions over variable subsets, with prior information selected and weighted in an automated, objective manner using an empirical Bayes formulation. We employ continuous, linear models with interaction terms and exploit biochemically-motivated sparsity constraints to permit exact inference. We show an example of priors for pathway- and network-based information and illustrate our proposed method on both synthetic response data and by an application to cancer drug response data. Comparisons are also made to alternative Bayesian and frequentist penalised-likelihood methods for incorporating network-based information. CONCLUSIONS: The empirical Bayes method proposed here can aid prior elicitation for Bayesian variable selection studies and help to guard against mis-specification of priors. Empirical Bayes, together with the proposed pathway-based priors, results in an approach with a competitive variable selection performance. In addition, the overall procedure is fast, deterministic, and has very few user-set parameters, yet is capable of capturing interplay between molecular players. The approach presented is general and readily applicable in any setting with multiple sources of biological prior knowledge.

12.
Nathan P. Lemoine, Oikos (2019) 128(7): 912-928
Throughout the last two decades, Bayesian statistical methods have proliferated throughout ecology and evolution. Numerous previous references established both philosophical and computational guidelines for implementing Bayesian methods. However, protocols for incorporating prior information, the defining characteristic of Bayesian philosophy, are nearly nonexistent in the ecological literature. Here, I hope to encourage the use of weakly informative priors in ecology and evolution by providing a ‘consumer's guide’ to weakly informative priors. The first section outlines three reasons why ecologists should abandon noninformative priors: 1) common flat priors are not always noninformative, 2) noninformative priors provide the same result as simpler frequentist methods, and 3) noninformative priors suffer from the same high type I and type M error rates as frequentist methods. The second section provides a guide for implementing informative priors, wherein I detail convenient ‘reference’ prior distributions for common statistical models (i.e. regression, ANOVA, hierarchical models). I then use simulations to visually demonstrate how informative priors influence posterior parameter estimates. With the guidelines provided here, I hope to encourage the use of weakly informative priors for Bayesian analyses in ecology. Ecologists can and should debate the appropriate form of prior information, but should consider weakly informative priors as the new ‘default’ prior for any Bayesian model.  
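The abstract's central point (a flat prior just reproduces the frequentist answer, while a weakly informative prior shrinks noisy small-sample estimates and curbs exaggerated type M effect sizes) can be shown in the simplest conjugate case. The numbers are illustrative.

```python
def posterior_mean(xbar, n, sigma2, prior_var=None):
    """Posterior mean of a normal mean with known variance sigma2.
    prior_var=None denotes a flat ('noninformative') prior, which
    returns the frequentist estimate unchanged; a weakly informative
    N(0, prior_var) prior shrinks the estimate toward zero."""
    if prior_var is None:
        return xbar                              # flat prior -> sample mean
    w = (n / sigma2) / (n / sigma2 + 1.0 / prior_var)
    return w * xbar                              # precision-weighted shrinkage

flat = posterior_mean(xbar=2.0, n=5, sigma2=4.0)
weak = posterior_mean(xbar=2.0, n=5, sigma2=4.0, prior_var=1.0)
```

With only five noisy observations the weakly informative prior pulls the estimate well below the raw sample mean; as n grows the weight w approaches 1 and the two answers converge.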

13.
This paper gives an approximate Bayes procedure for the estimation of the reliability function of a two-parameter Cauchy distribution using Jeffreys' non-informative prior with a squared-error loss function, and with a log-odds ratio squared-error loss function. Based on a Monte Carlo simulation study, two such Bayes estimators of the reliability are compared with the maximum likelihood estimator.
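The maximum likelihood comparator in this comparison can be sketched directly: grid-search the Cauchy(mu, sigma) likelihood, then plug the estimates into the reliability R(t) = P(T > t) = 1/2 - arctan((t - mu)/sigma)/pi. The data and grid resolution are illustrative; the paper's approximate Bayes estimators would replace the plug-in step with posterior integration under Jeffreys' prior.

```python
import math

def cauchy_reliability_mle(data, t, grid=60):
    """Grid-search MLE for the two-parameter Cauchy, followed by
    plug-in estimation of the reliability at time t."""
    lo, hi = min(data), max(data)
    best = (None, None, -math.inf)
    for i in range(grid + 1):
        mu = lo + (hi - lo) * i / grid
        for j in range(1, grid + 1):
            s = (hi - lo) * j / grid
            # Cauchy log-likelihood: -sum log(pi * s * (1 + z^2))
            ll = sum(-math.log(math.pi * s * (1 + ((x - mu) / s) ** 2))
                     for x in data)
            if ll > best[2]:
                best = (mu, s, ll)
    mu, s, _ = best
    return 0.5 - math.atan((t - mu) / s) / math.pi

R = cauchy_reliability_mle([1.2, 0.8, 1.5, 1.1, 0.9, 1.3, 5.0, 1.0], t=2.0)
```

Note how the heavy-tailed Cauchy likelihood keeps the location estimate near the bulk of the data despite the outlying observation.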

14.
Meiosis is the process which produces haploid gametes from diploid precursor cells. This reduction of chromosome number is achieved by two successive divisions. Whereas homologs segregate during meiosis I, sister chromatids segregate during meiosis II. To identify novel proteins required for proper segregation of chromosomes during meiosis, we applied a high-throughput knockout technique to delete 87 S. pombe genes whose expression is upregulated during meiosis and analyzed the mutant phenotypes. Using this approach, we identified a new protein, Dil1, which is required to prevent meiosis I homolog non-disjunction. We show that Dil1 acts in the dynein pathway to promote oscillatory nuclear movement during meiosis.

15.
An alternative to frequentist approaches to multiple comparisons is Duncan's k-ratio Bayes rule approach. The purpose of this paper is to compile key results on k-ratio Bayes rules for a number of multiple comparison problems that heretofore have only been available in separate papers or doctoral dissertations. Among other problems, multiple comparisons for means in one-way, two-way, and treatments-vs.-control structures will be reviewed. In the k-ratio approach, the optimal joint rule for a multiple comparisons problem is derived under the assumptions of additive losses and prior exchangeability for the component comparisons. In the component loss function for a comparison, a balance is achieved between the decision losses due to Type I and Type II errors by assuming that their ratio is k. The component loss is also linear in the magnitude of the error. Under the assumption of additive losses, the joint Bayes rule for the component comparisons applies to each comparison the Bayes test for that comparison considered alone. That is, a comparisonwise approach is optimal. However, under prior exchangeability of the comparisons, the component test critical regions adapt to omnibus patterns in the data. For example, for a balanced one-way array of normally distributed means, the Bayes critical t value for a difference between means is inversely related to the F ratio measuring heterogeneity among the means, resembling a continuous version of Fisher's F-protected least significant difference rule. For more complicated treatment structures, the Bayes critical t value for a difference depends intuitively on multiple F ratios and marginal difference(s) (if applicable), such that the critical t value warranted for the difference can range from being as conservative as that given by a familywise rule to actually being anti-conservative relative to that given by the unadjusted 5%-level Student's t test.

16.

Background

In quantitative trait mapping and genomic prediction, Bayesian variable selection methods have gained popularity in conjunction with the increase in marker data and computational resources. Whereas shrinkage-inducing methods are common tools in genomic prediction, rigorous decision making in mapping studies using such models is not well established and the robustness of posterior results is subject to misspecified assumptions because of weak biological prior evidence.

Methods

Here, we evaluate the impact of prior specifications in a shrinkage-based Bayesian variable selection method which is based on a mixture of uniform priors applied to genetic marker effects that we presented in a previous study. Unlike most other shrinkage approaches, the use of a mixture of uniform priors provides a coherent framework for inference based on Bayes factors. To evaluate the robustness of genetic association under varying prior specifications, Bayes factors are compared as signals of positive marker association, whereas genomic estimated breeding values are considered for genomic selection. The impact of specific prior specifications is reduced by calculation of combined estimates from multiple specifications. A Gibbs sampler is used to perform Markov chain Monte Carlo estimation (MCMC) and a generalized expectation-maximization algorithm as a faster alternative for maximum a posteriori point estimation. The performance of the method is evaluated by using two publicly available data examples: the simulated QTLMAS XII data set and a real data set from a population of pigs.

Results

Combined estimates of Bayes factors were very successful in identifying quantitative trait loci, and the ranking of Bayes factors was fairly stable among markers with positive signals of association under varying prior assumptions, but their magnitudes varied considerably. Genomic estimated breeding values using the mixture of uniform priors compared well to other approaches for both data sets and loss of accuracy with the generalized expectation-maximization algorithm was small as compared to that with MCMC.

Conclusions

Since no error-free method to specify priors is available for complex biological phenomena, exploring a wide variety of prior specifications and combining results provides some solution to this problem. For this purpose, the mixture of uniform priors approach is especially suitable, because it comprises a wide and flexible family of distributions and computationally intensive estimation can be carried out in a reasonable amount of time.

17.
Array-based technologies have been used to detect chromosomal copy number changes (aneuploidies) in the human genome. Recent studies identified numerous copy number variants (CNV) and some are common polymorphisms that may contribute to disease susceptibility. We developed, and experimentally validated, a novel computational framework (QuantiSNP) for detecting regions of copy number variation from BeadArray SNP genotyping data using an Objective Bayes Hidden-Markov Model (OB-HMM). Objective Bayes measures are used to set certain hyperparameters in the priors using a novel re-sampling framework to calibrate the model to a fixed Type I (false positive) error rate. Other parameters are set via maximum marginal likelihood to prior training data of known structure. QuantiSNP provides probabilistic quantification of state classifications and significantly improves the accuracy of segmental aneuploidy identification and mapping, relative to existing analytical tools (Beadstudio, Illumina), as demonstrated by validation of breakpoint boundaries. QuantiSNP identified both novel and validated CNVs. QuantiSNP was developed using BeadArray SNP data but it can be adapted to other platforms and we believe that the OB-HMM framework has widespread applicability in genomic research. In conclusion, QuantiSNP is a novel algorithm for high-resolution CNV/aneuploidy detection with application to clinical genetics, cancer and disease association studies.

18.
While Bayesian analysis has become common in phylogenetics, the effects of topological prior probabilities on tree inference have not been investigated. In Bayesian analyses, the prior probability of topologies is almost always considered equal for all possible trees, and clade support is calculated from the majority rule consensus of the approximated posterior distribution of topologies. These uniform priors on tree topologies imply non-uniform prior probabilities of clades, which are dependent on the number of taxa in a clade as well as the number of taxa in the analysis. As such, uniform topological priors do not model ignorance with respect to clades. Here, we demonstrate that Bayesian clade support, bootstrap support, and jackknife support from 17 empirical studies are significantly and positively correlated with non-uniform clade priors resulting from uniform topological priors. Further, we demonstrate that this effect disappears for bootstrap and jackknife when data sets are free from character conflict, but remains pronounced for Bayesian clade supports, regardless of tree shape. Finally, we propose the use of a Bayes factor to account for the fact that uniform topological priors do not model ignorance with respect to clade probability.
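The induced clade prior described above can be computed directly from standard tree counts: there are (2n-5)!! unrooted binary trees on n taxa, and (assuming the standard combinatorial result) the number containing a given split of k versus n-k taxa is the product of rooted-tree counts (2k-3)!!(2(n-k)-3)!!.

```python
def double_fact(m):
    """Double factorial m!! (defined as 1 for m <= 1)."""
    return 1 if m <= 1 else m * double_fact(m - 2)

def clade_prior(n, k):
    """Prior probability that a specific clade (split) of k taxa appears
    in an unrooted binary tree drawn uniformly from all trees on n taxa.
    A uniform prior on topologies thus induces a decidedly non-uniform
    prior on clades, depending on both k and n."""
    rooted = lambda m: double_fact(2 * m - 3)    # rooted binary trees on m taxa
    return rooted(k) * rooted(n - k) / double_fact(2 * n - 5)

p_small = clade_prior(n=10, k=2)    # a two-taxon clade
p_big = clade_prior(n=10, k=5)      # a five-taxon clade
```

For ten taxa, a specific two-taxon clade has more than ten times the prior probability of a specific five-taxon clade, which is exactly the size dependence the abstract highlights.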

19.
Wolfinger RD, Kass RE, Biometrics (2000) 56(3): 768-774
We consider the usual normal linear mixed model for variance components from a Bayesian viewpoint. With conjugate priors and balanced data, Gibbs sampling is easy to implement; however, simulating from full conditionals can become difficult for the analysis of unbalanced data with possibly nonconjugate priors, thus leading one to consider alternative Markov chain Monte Carlo schemes. We propose and investigate a method for posterior simulation based on an independence chain. The method is customized to exploit the structure of the variance component model, and it works with arbitrary prior distributions. As a default reference prior, we use a version of Jeffreys' prior based on the integrated (restricted) likelihood. We demonstrate the ease of application and flexibility of this approach in familiar settings involving both balanced and unbalanced data.
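An independence-chain sampler of the kind described can be sketched for the simplest case: the posterior of a single variance under a Jeffreys-type 1/sigma^2 prior (the paper handles full variance-component models; this toy case, its data, and the log-normal proposal are illustrative assumptions).

```python
import math
import random

def indep_chain_var(data, n_iter=5000, seed=7):
    """Independence-chain Metropolis-Hastings for the posterior of a
    variance sigma^2 given normal data, prior proportional to
    1/sigma^2. The proposal is a fixed log-normal centred at the sample
    variance, so it does not depend on the current state."""
    rng = random.Random(seed)
    n = len(data)
    xbar = sum(data) / n
    ss = sum((x - xbar) ** 2 for x in data)
    s2 = ss / (n - 1)

    def log_post(v):                     # log posterior up to a constant
        return -(n / 2 + 1) * math.log(v) - ss / (2 * v)

    def log_prop(v):                     # log-normal proposal density (kernel)
        z = math.log(v) - math.log(s2)
        return -math.log(v) - z * z / 2.0

    cur, draws = s2, []
    for _ in range(n_iter):
        cand = math.exp(math.log(s2) + rng.gauss(0, 1))
        log_a = (log_post(cand) + log_prop(cur)) - (log_post(cur) + log_prop(cand))
        if math.log(max(rng.random(), 1e-300)) < log_a:
            cur = cand                   # accept the independent proposal
        draws.append(cur)
    return sum(draws[n_iter // 2:]) / (n_iter - n_iter // 2)

post_mean = indep_chain_var([2.1, 1.9, 3.2, 2.8, 2.0, 3.0, 2.5, 2.4])
```

Because the target here is a known inverse-gamma, the chain's second-half average can be checked against the analytic posterior mean ss/(n-2).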

20.
Much forensic inference based upon DNA evidence is made assuming that the Hardy-Weinberg equilibrium (HWE) is valid for the genetic loci being used. Several statistical tests to detect and measure deviation from HWE have been devised, each having advantages and limitations. The limitations become more obvious when testing for deviation within multiallelic DNA loci is attempted. Here we present an exact test for HWE in the biallelic case, based on the ratio of weighted likelihoods under the null and alternative hypotheses, the Bayes factor. This test does not depend on asymptotic results and minimizes a linear combination of type I and type II errors. By ordering the sample space using the Bayes factor, we also define a significance (evidence) index, the P value, using the weighted likelihood under the null hypothesis. We compare it to the conditional exact test for the case of sample size n = 10. Using the idea underlying the chi-square partition method, the test is applied sequentially to test equilibrium in the multiple-allele case and then applied to two short tandem repeat loci, using a real Caucasian data bank, showing its usefulness.
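The biallelic HWE Bayes factor can be sketched with uniform priors (the paper's weighting may differ, and its P-value construction from the ordered sample space is omitted here): under H0 the genotype probabilities are (p^2, 2pq, q^2) with a Uniform(0,1) prior on the allele frequency p, while H1 places an unrestricted Dirichlet(1,1,1) prior on the three genotype probabilities. The multinomial coefficient is common to both marginal likelihoods and cancels.

```python
from math import lgamma, log

def log_bf_hwe(n_aa, n_ab, n_bb):
    """Log Bayes factor in favour of Hardy-Weinberg equilibrium for
    genotype counts (n_aa, n_ab, n_bb) at a biallelic locus."""
    # H0: integrate 2^n_ab * p^(2*n_aa + n_ab) * q^(2*n_bb + n_ab) over p
    a1 = 2 * n_aa + n_ab + 1
    b1 = 2 * n_bb + n_ab + 1
    log_m0 = n_ab * log(2) + lgamma(a1) + lgamma(b1) - lgamma(a1 + b1)
    # H1: Dirichlet(1,1,1) marginal (prior density is 2 on the simplex)
    log_m1 = (log(2) + lgamma(n_aa + 1) + lgamma(n_ab + 1)
              + lgamma(n_bb + 1) - lgamma(n_aa + n_ab + n_bb + 3))
    return log_m0 - log_m1

lbf_eq = log_bf_hwe(25, 50, 25)    # counts close to HWE proportions
lbf_dev = log_bf_hwe(50, 0, 50)    # extreme heterozygote deficit
```

Counts near the expected p^2 : 2pq : q^2 proportions yield a positive log BF (evidence for equilibrium), while a complete heterozygote deficit yields a strongly negative one.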
