Similar Literature
Found 20 similar documents.
1.
A conjugate Wishart prior is used to present a simple and rapid procedure for computing the analytic posterior (mode and uncertainty) of the precision matrix elements of a Gaussian distribution. An interpretation of covariance estimates in terms of eigenvalues is presented, along with a simple decision-rule step to improve the performance of the estimation of sparse precision matrices and associated graphs: elements of the estimated precision matrix that are zero or near zero are detected and shrunk to zero. Simulated data sets are used to compare posterior estimation with the decision rule against two other Wishart-based approaches and against the graphical lasso. Furthermore, an empirical Bayes procedure is used to select prior hyperparameters in high-dimensional cases, with an extension to sparsity.
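As a minimal illustration of the conjugate update described above (a sketch with our own variable names and an arbitrary threshold, not the authors' code): for zero-mean Gaussian data with precision matrix K and prior K ~ Wishart(nu0, V0), the posterior is Wishart(nu0 + n, (V0^-1 + X^T X)^-1), whose mode gives the analytic point estimate; a simple threshold then shrinks near-zero elements to zero.

```python
import numpy as np

# Sketch of the conjugate Wishart update for the precision matrix K of a
# zero-mean Gaussian (illustrative names, not the authors' code).
def wishart_posterior_mode(X, nu0, V0):
    n, p = X.shape
    nu_n = nu0 + n
    V_n = np.linalg.inv(np.linalg.inv(V0) + X.T @ X)
    # The mode of a Wishart(nu, V) distribution is (nu - p - 1) * V, nu > p + 1.
    return (nu_n - p - 1) * V_n

rng = np.random.default_rng(0)
p = 4
K_true = np.eye(p)
K_true[0, 1] = K_true[1, 0] = 0.4            # one true edge in the graph
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(K_true), size=500)
K_hat = wishart_posterior_mode(X, nu0=p + 2, V0=np.eye(p))
# Decision-rule step: shrink near-zero elements to exactly zero
# (the 0.1 threshold is an arbitrary illustration).
K_sparse = np.where(np.abs(K_hat) < 0.1, 0.0, K_hat)
```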

2.
Ronald A. Fisher, the founder of maximum likelihood (ML) estimation, criticized Bayes estimation with a uniform prior distribution because such estimates can be made arbitrary by changing the transformation applied before the analysis. Thus, Bayes estimates lack scientific objectivity, especially when the amount of data is small. However, Bayes estimates can serve as an approximation to the objective ML estimates if an appropriate transformation is used that makes the posterior distribution close to a normal distribution. A one-to-one correspondence exists between a uniform prior distribution on a transformed scale and a non-uniform prior distribution on the original scale. For this reason, Bayes estimation that approximates ML estimation is essentially identical to estimation using the Jeffreys prior.
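The transformation argument can be made concrete with the standard binomial example (our illustration, not from the abstract):

```latex
% Jeffreys prior: the square root of the Fisher information,
% invariant under reparameterization.
\[
  \pi(\theta) \propto \sqrt{I(\theta)}, \qquad
  I(\theta) = -\mathbb{E}\!\left[\frac{\partial^2 \log p(x \mid \theta)}{\partial \theta^2}\right].
\]
% For X ~ Bin(n, p), I(p) = n / (p(1 - p)), so
\[
  \pi(p) \propto p^{-1/2}(1 - p)^{-1/2},
  \quad \text{i.e. } p \sim \mathrm{Beta}(\tfrac{1}{2}, \tfrac{1}{2}),
\]
% which corresponds to a uniform prior on the transformed scale
% phi = arcsin(sqrt(p)), where I(phi) = 4n is constant and the posterior
% is closer to normal.
```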

3.
This article is concerned with the Bayesian estimation of stochastic rate constants in the context of dynamic models of intracellular processes. The underlying discrete stochastic kinetic model is replaced by a diffusion approximation (or stochastic differential equation approach) where a white noise term models stochastic behavior, and the model is identified using equispaced time course data. The estimation framework involves the introduction of m − 1 latent data points between every pair of observations. MCMC methods are then used to sample the posterior distribution of the latent process and the model parameters. The methodology is applied to the estimation of parameters in a prokaryotic autoregulatory gene network.
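A minimal sketch of the data-augmentation idea (illustrative one-dimensional model and names, not the authors' code): on the fine grid obtained by inserting the m − 1 latent points, the Euler-Maruyama scheme gives a tractable Gaussian transition density; an MCMC sweep then alternates between updating the latent path and updating the parameters given the completed path.

```python
import numpy as np

# On the fine grid, x_{t+dt} | x_t is approximately Gaussian with mean
# x_t + drift(x_t) * dt and variance diffusion(x_t) * dt. This complete-data
# log-likelihood is what the sampler evaluates when proposing new latent
# points or parameters (illustrative mean-reverting 1-D model).
def euler_loglik(path, dt, theta):
    x, x_next = path[:-1], path[1:]
    mean = x + theta[0] * (theta[1] - x) * dt      # drift term
    var = theta[2] * dt                            # white-noise (diffusion) term
    return np.sum(-0.5 * np.log(2 * np.pi * var)
                  - (x_next - mean) ** 2 / (2 * var))

# Simulate a fine-grained path and evaluate the likelihood.
rng = np.random.default_rng(0)
dt, theta = 0.1, (0.5, 10.0, 1.0)
path = np.empty(51)
path[0] = 10.0
for k in range(50):
    path[k + 1] = (path[k] + theta[0] * (theta[1] - path[k]) * dt
                   + np.sqrt(theta[2] * dt) * rng.normal())
print(euler_loglik(path, dt, theta))
```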

4.
There have been various attempts to improve the reconstruction of gene regulatory networks from microarray data by the systematic integration of biological prior knowledge. Our approach is based on pioneering work by Imoto et al., where the prior knowledge is expressed in terms of energy functions, from which a prior distribution over network structures is obtained in the form of a Gibbs distribution. The hyperparameters of this distribution represent the weights associated with the prior knowledge relative to the data. We have derived and tested a Markov chain Monte Carlo (MCMC) scheme for sampling networks and hyperparameters simultaneously from the posterior distribution, thereby automatically learning how to trade off information from the prior knowledge and the data. We have extended this approach to a Bayesian coupling scheme for learning gene regulatory networks from a combination of related data sets, which were obtained under different experimental conditions and are therefore potentially associated with different active subpathways. The proposed coupling scheme is a compromise between (1) learning networks from the different subsets separately, whereby no information between the different experiments is shared; and (2) learning networks from a monolithic fusion of the individual data sets, which does not provide any mechanism for uncovering differences between the network structures associated with the different experimental conditions. We have assessed the viability of all proposed methods on data related to the Raf signaling pathway, generated both synthetically and in cytometry experiments.
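In symbols (our notation for the general scheme described above), the prior over network structures G takes the Gibbs form:

```latex
\[
  P(G \mid \beta) = \frac{\exp\{-\beta\, E(G)\}}{Z(\beta)}, \qquad
  Z(\beta) = \sum_{G'} \exp\{-\beta\, E(G')\},
\]
% where the energy E(G) measures the mismatch between a network G and the
% prior knowledge, and the hyperparameter beta weights the prior relative
% to the data. The MCMC scheme samples networks and hyperparameters jointly
% from the posterior
\[
  P(G, \beta \mid D) \propto P(D \mid G)\, P(G \mid \beta)\, P(\beta).
\]
```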

5.
Parameter estimation in dynamic systems finds applications in various disciplines, including systems biology. The well-known expectation-maximization (EM) algorithm is a popular method and has been widely used to solve system identification and parameter estimation problems. However, the conventional EM algorithm cannot exploit sparsity, whereas in gene regulatory network inference problems the parameters to be estimated often exhibit a sparse structure. In this paper, a regularized expectation-maximization (rEM) algorithm for sparse parameter estimation in nonlinear dynamic systems is proposed; it is based on maximum a posteriori (MAP) estimation and can incorporate a sparsity-inducing prior. The expectation step involves forward Gaussian approximation filtering and backward Gaussian approximation smoothing. The maximization step employs a re-weighted iterative thresholding method. The proposed algorithm is then applied to gene regulatory network inference. Results based on both synthetic and real data show the effectiveness of the proposed algorithm.
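The flavor of the M-step can be conveyed by a generic iterative soft-thresholding (ISTA-style) sketch for a sparse MAP update in a linear surrogate model; this illustrates the thresholding technique, not the paper's exact re-weighted update.

```python
import numpy as np

# Soft-thresholding operator: the proximal map of the l1 (Laplace-prior) penalty.
def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Generic ISTA for min 0.5 * ||y - A @ theta||^2 + lam * ||theta||_1.
def ista(A, y, lam, n_iter=500):
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    theta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ theta - y)
        theta = soft_threshold(theta - grad / L, lam / L)
    return theta

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 100))
theta_true = np.zeros(100)
theta_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
y = A @ theta_true + 0.05 * rng.normal(size=50)
theta_hat = ista(A, y, lam=5.0)            # mass concentrates on the true support
```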

6.
F. Perron, K. Mengersen. Biometrics, 2001, 57(2): 518-528
Nonparametric modeling is an indispensable tool in many applications, and its formulation in a hierarchical Bayesian context, using the entire posterior distribution rather than particular expectations, increases its flexibility. In this article, the focus is on nonparametric estimation through a mixture of triangular distributions. The optimality of this methodology is addressed and bounds on the accuracy of this approximation are derived. Although our approach is more widely applicable, we focus for simplicity on estimation of a monotone nondecreasing regression on [0, 1] with additive error, effectively approximating the function of interest by a function having a piecewise linear derivative. Computationally accessible methods of estimation are described through an amalgamation of existing Markov chain Monte Carlo algorithms. Simulations and examples illustrate the approach.

7.
Summary: The statistical interpretation of the histogram representation of NMR spectra is described, leading to an estimation of the probability density function of the noise. The white-noise and Gaussian hypotheses are discussed, and a new estimator of the noise standard deviation is derived from the histogram strategy. The Bayesian approach to NMR signal detection is presented. This approach homogeneously combines prior knowledge, obtained from the histogram strategy, together with the posterior information resulting from the test of presence of a set of reference shapes in the neighbourhood of each data point. This scheme leads to a new strategy in the local detection of NMR signals in 2D and 3D spectra, which is illustrated by a complete peak-picking algorithm.

8.
Summary: We examine situations where interest lies in the conditional association between outcome and exposure variables, given potential confounding variables. Concern arises that some potential confounders may not be measured accurately, whereas others may not be measured at all. Some form of sensitivity analysis might be employed, to assess how this limitation in available data impacts inference. A Bayesian approach to sensitivity analysis is straightforward in concept: a prior distribution is formed to encapsulate plausible relationships between unobserved and observed variables, and posterior inference about the conditional exposure–disease relationship then follows. In practice, though, it can be challenging to form such a prior distribution in both a realistic and simple manner. Moreover, it can be difficult to develop an attendant Markov chain Monte Carlo (MCMC) algorithm that will work effectively on a posterior distribution arising from a highly nonidentified model. In this article, a simple prior distribution for acknowledging both poorly measured and unmeasured confounding variables is developed. It requires that only a small number of hyperparameters be set by the user. Moreover, a particular computational approach for posterior inference is developed, because application of MCMC in a standard manner is seen to be ineffective in this problem.

9.
Inferring speciation times under an episodic molecular clock
We extend our recently developed Markov chain Monte Carlo algorithm for Bayesian estimation of species divergence times to allow variable evolutionary rates among lineages. The method can use heterogeneous data from multiple gene loci and accommodate multiple fossil calibrations. Uncertainties in fossil calibrations are described using flexible statistical distributions. The prior for divergence times for nodes lacking fossil calibrations is specified by use of a birth-death process with species sampling. The prior for lineage-specific substitution rates is specified using either a model with autocorrelated rates among adjacent lineages (based on a geometric Brownian motion model of rate drift) or a model with independent rates among lineages specified by a log-normal probability distribution. We develop an infinite-sites theory, which predicts that when the amount of sequence data approaches infinity, the width of the posterior credibility interval and the posterior mean of divergence times form a perfect linear relationship, with the slope indicating uncertainties in time estimates that cannot be reduced by sequence data alone. Simulations are used to study the influence of among-lineage rate variation and the number of loci sampled on the uncertainty of divergence time estimates. The analysis suggests that posterior time estimates typically involve considerable uncertainties even with an infinite amount of sequence data, and that the reliability and precision of fossil calibrations are critically important to divergence time estimation. We apply our new algorithms to two empirical data sets and compare the results with those obtained in previous Bayesian and likelihood analyses. The results demonstrate the utility of our new algorithms.

10.
In classification, prior knowledge is incorporated in a Bayesian framework by assuming that the feature-label distribution belongs to an uncertainty class of feature-label distributions governed by a prior distribution. A posterior distribution is then derived from the prior and the sample data. An optimal Bayesian classifier (OBC) minimizes the expected misclassification error relative to the posterior distribution. From an application perspective, prior construction is critical. The prior distribution is formed by mapping a set of mathematical relations among the features and labels, the prior knowledge, into a distribution governing the probability mass across the uncertainty class. In this paper, we consider prior knowledge in the form of stochastic differential equations (SDEs). We consider a vector SDE in integral form involving a drift vector and dispersion matrix. Having constructed the prior, we develop the optimal Bayesian classifier between two models and examine, via synthetic experiments, the effects of uncertainty in the drift vector and dispersion matrix. We apply the theory to a set of SDEs for the purpose of differentiating the evolutionary history between two species.
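The vector SDE in integral form referred to above can be written, in standard notation, as:

```latex
\[
  \mathbf{X}(t) = \mathbf{X}(0)
    + \int_0^t \boldsymbol{\mu}\bigl(\mathbf{X}(s)\bigr)\, ds
    + \int_0^t \boldsymbol{\sigma}\bigl(\mathbf{X}(s)\bigr)\, d\mathbf{W}(s),
\]
% where mu is the drift vector, sigma the dispersion matrix, and W a standard
% vector Brownian motion; uncertainty in (mu, sigma) is what induces the prior
% over the uncertainty class of feature-label distributions.
```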

11.
One key problem in computational neuroscience and neural engineering is the identification and modeling of functional connectivity in the brain using spike train data. To reduce model complexity, alleviate overfitting, and thus facilitate model interpretation, sparse representation and estimation of functional connectivity is needed. Sparsities include global sparsity, which captures the sparse connectivities between neurons, and local sparsity, which reflects the active temporal ranges of the input-output dynamical interactions. In this paper, we formulate a generalized functional additive model (GFAM) and develop the associated penalized likelihood estimation methods for such a modeling problem. A GFAM consists of a set of basis functions convolving the input signals, and a link function generating the firing probability of the output neuron from the summation of the convolutions weighted by the sought model coefficients. Model sparsities are achieved by using various penalized likelihood estimations and basis functions. Specifically, we introduce two variations of the GFAM: one using a global basis (e.g., Laguerre basis) with group LASSO estimation, and one using a local basis (e.g., B-spline basis) with group bridge estimation. We further develop an optimization method based on quadratic approximation of the likelihood function for the estimation of these models. Simulation and experimental results show that both group-LASSO-Laguerre and group-bridge-B-spline can faithfully capture the global sparsities, while the latter can accurately replicate both global and local sparsities simultaneously. The sparse models outperform the full models estimated with the standard maximum likelihood method in out-of-sample predictions.
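The two penalized estimators can be summarized in their standard forms (the group weights shown are the usual choices, not necessarily the paper's):

```latex
\[
  \hat{\mathbf{c}}_{\text{gLASSO}}
    = \arg\min_{\mathbf{c}} \Bigl\{ -\ell(\mathbf{c})
      + \lambda \sum_{g} \sqrt{p_g}\, \lVert \mathbf{c}_g \rVert_2 \Bigr\},
  \qquad
  \hat{\mathbf{c}}_{\text{gBridge}}
    = \arg\min_{\mathbf{c}} \Bigl\{ -\ell(\mathbf{c})
      + \lambda \sum_{g} \lVert \mathbf{c}_g \rVert_1^{\gamma} \Bigr\},
  \quad 0 < \gamma < 1,
\]
% where l(c) is the log-likelihood of the output spike train, c_g collects the
% coefficients of group g (one group per basis function/input), and p_g is the
% group size. The group LASSO zeroes out whole groups (global sparsity), while
% the group bridge can zero out both whole groups and individual coefficients
% within a group (global and local sparsity).
```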

12.
Roy J, Daniels MJ. Biometrics, 2008, 64(2): 538-545
Summary: In this article we consider the problem of fitting pattern mixture models to longitudinal data when there are many unique dropout times. We propose a marginally specified latent class pattern mixture model. The marginal mean is assumed to follow a generalized linear model, whereas the mean conditional on the latent class and random effects is specified separately. Because the dimension of the parameter vector of interest (the marginal regression coefficients) does not depend on the assumed number of latent classes, we propose to treat the number of latent classes as a random variable. We specify a prior distribution for the number of classes, and calculate (approximate) posterior model probabilities. In order to avoid the complications with implementing a fully Bayesian model, we propose a simple approximation to these posterior probabilities. The ideas are illustrated using data from a longitudinal study of depression in HIV-infected women.

13.
The model of a neuron examined in this paper is a binary decision element described by a decision rule that depends on a weight vector w. The environment of the element is described by an unknown, stationary distribution p(x). The input signals x[n] of the element appear at each step n independently, in accordance with the distribution p(x). During an unsupervised learning process the weight vector w[n] is changed on the basis of the input vector x[n]. The paper considers two self-learning algorithms of the stochastic approximation type. For both algorithms the same rule for neglecting past experience (a rule of weight decrease) is introduced; the two algorithms differ in their rules of weight increase. It is proved that only one of these algorithms always leads to the same decision rule in a given environment p(x). This work was done during a stay by Dr. L. Bobrowski at the University of Salerno within the framework of the Polish-Italian Agreement on Scientific Cooperation.

14.
A stochastic approximation algorithm is proposed for recursive estimation of the hyperparameters characterizing, in a population, the probability density function of the parameters of a statistical model. For a given population model defined by a parametric model of a biological process, an error model, and a class of densities on the set of the individual parameters, this algorithm provides a sequence of estimates from a sequence of individuals' observation vectors. Convergence conditions are verified for a class of population models including usual pharmacokinetic applications. This method is implemented for estimation of pharmacokinetic population parameters from drug multiple-dosing data. Its estimation capabilities are evaluated and compared to a classical method in population pharmacokinetics, the first-order method (NONMEM), on simulated data.
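A toy Robbins-Monro recursion conveys the stochastic-approximation principle (an illustration of the technique, not the paper's population algorithm):

```python
import numpy as np

# Robbins-Monro recursion: update a running estimate from one individual's
# observation at a time, with step sizes gamma_n = 1/n, which satisfy the
# usual convergence conditions (sum gamma_n = inf, sum gamma_n^2 < inf).
rng = np.random.default_rng(1)
eta = 0.0                                   # running estimate
for n in range(1, 5001):
    y_n = rng.normal(loc=2.5, scale=1.0)    # one individual's observation
    eta += (y_n - eta) / n                  # stochastic approximation step
print(eta)                                  # approaches the population mean 2.5
```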

15.
MOTIVATION: Gene selection algorithms for cancer classification, based on the expression of a small number of biomarker genes, have been the subject of considerable research in recent years. Shevade and Keerthi propose a gene selection algorithm based on sparse logistic regression (SLogReg) incorporating a Laplace prior to promote sparsity in the model parameters, and provide a simple but efficient training procedure. The degree of sparsity obtained is determined by the value of a regularization parameter, which must be carefully tuned in order to optimize performance. This normally involves a model selection stage, based on a computationally intensive search for the minimizer of the cross-validation error. In this paper, we demonstrate that a simple Bayesian approach can be taken to eliminate this regularization parameter entirely, by integrating it out analytically using an uninformative Jeffreys prior. The improved algorithm (BLogReg) is then typically two or three orders of magnitude faster than the original algorithm, as there is no longer a need for a model selection step. The BLogReg algorithm is also free from selection bias in performance estimation, a common pitfall in the application of machine learning algorithms in cancer classification. RESULTS: The SLogReg, BLogReg and Relevance Vector Machine (RVM) gene selection algorithms are evaluated over the well-studied colon cancer and leukaemia benchmark datasets. The leave-one-out estimates of the probability of test error and cross-entropy of the BLogReg and SLogReg algorithms are very similar; however, the BLogReg algorithm is found to be considerably faster than the original SLogReg algorithm. Using nested cross-validation to avoid selection bias, performance estimation for SLogReg on the leukaemia dataset takes almost 48 h, whereas the corresponding result for BLogReg is obtained in only 1 min 24 s, making BLogReg by far the more practical algorithm. BLogReg also demonstrates better estimates of conditional probability than the RVM, which are of great importance in medical applications, at similar computational expense. AVAILABILITY: A MATLAB implementation of the sparse logistic regression algorithm with Bayesian regularization (BLogReg) is available from http://theoval.cmp.uea.ac.uk/~gcc/cbl/blogreg/
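For orientation, the underlying model family, L1-penalized (Laplace-prior MAP) logistic regression, can be run with off-the-shelf tools; the sketch below is generic and is not BLogReg itself, since BLogReg's point is precisely that the regularization parameter C below is integrated out rather than tuned.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Generic L1-penalized logistic regression (the MAP estimate under a Laplace
# prior). Unlike BLogReg, the regularization strength C must still be tuned.
rng = np.random.default_rng(0)
X = rng.normal(size=(62, 2000))             # e.g., 62 samples, 2000 genes
w = np.zeros(2000)
w[:5] = 2.0                                 # only 5 informative genes
y = (X @ w + rng.normal(size=62) > 0).astype(int)

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)
selected = np.flatnonzero(clf.coef_)        # indices of the selected genes
```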

16.
This paper outlines the mathematical theory required for eliciting the hyperparameters of a subjective conjugate distribution for the exponential survival model with censoring. The technique involves the quantification of expert knowledge based on determination by the expert of expected fractiles of a survival distribution in a particular clinical trial setting. Once the prior predictive distribution is determined and the fractiles elicited one can proceed, using iterative techniques, to solve for the hyperparameters. The restrictions and constraints of the hyperparameters as well as the fractiles are studied. The theory is then applied in a clinical trial setting.
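A minimal numerical sketch of the elicitation step (illustrative fractiles and names, not the paper's worked example): with exponential lifetimes T | lambda ~ Exp(lambda) and a conjugate Gamma(alpha, beta) prior on the rate, the prior predictive survival function is P(T > t) = (beta / (beta + t))^alpha, so two elicited fractiles determine the two hyperparameters.

```python
import numpy as np
from scipy.optimize import fsolve

# Solve (beta / (beta + t_i))**alpha = p_i, i = 1, 2, for the hyperparameters.
# A log-parameterization keeps alpha and beta positive during the iteration.
def fractile_equations(log_params, t1, p1, t2, p2):
    a, b = np.exp(log_params)
    return [a * np.log(b / (b + t1)) - np.log(p1),
            a * np.log(b / (b + t2)) - np.log(p2)]

# Expert elicitation (illustrative numbers): 50% of patients survive past
# 12 months, 20% survive past 36 months.
sol = fsolve(fractile_equations, x0=[0.0, np.log(10.0)],
             args=(12.0, 0.5, 36.0, 0.2))
alpha_hat, beta_hat = np.exp(sol)
```

Not every pair of fractiles is feasible under the conjugate form, which is one source of the restrictions on hyperparameters and fractiles that the paper studies.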

17.
Constraint-based structure learning algorithms generally perform well on sparse graphs. Although sparsity is not uncommon, there are some domains where the underlying graph has dense regions; one such domain is gene regulatory networks, which is the main motivation for the study described in this paper. We propose a new constraint-based algorithm that can both increase the quality of the output and decrease the computational requirements for learning the structure of gene regulatory networks. The algorithm is based on, and extends, the PC algorithm. Two different types of information are derived from the prior knowledge: one is the probability of existence of edges, and the other is the set of nodes that appear to depend on a large number of nodes compared to other nodes in the graph. A new method based on Gene Ontology for validating gene regulatory networks is also proposed. We demonstrate the applicability and effectiveness of the proposed algorithms on both synthetic and real data sets.

18.
The influence of heterogeneous parameters, stochastic uncertain factors, and pollutant particles from industrial effluents on a water system is investigated using the advection-dispersion equation (ADE) and Bayesian approximation. The decay coefficient is decomposed into an exact part and a deviation part; this decomposition is used to quantify errors and deviations in decay during the flow of pollutants. Two Bayesian models are developed to analyze the posterior distributions and to compute the Bayes factor for the stochastic covariance estimation. The Bayesian calibration captures the characteristics of both the mechanistic and the statistical approximation. The efficiency and accuracy of the developed models are checked against confidence intervals on the results. Markov chain Monte Carlo simulation is used to reach convergence of the parameters for the posterior estimation. The stochastic covariance, or white noise, represents the effect of random factors on the river system. The analysis revealed that the rate of decay depends on the duration and distance traveled by the pollutants. The combination of the ADE and Bayesian approximation supports water-quality management and environmental modeling.
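In one spatial dimension, the ADE with first-order decay underlying this kind of analysis takes the standard textbook form (not necessarily the exact model used in the paper):

```latex
\[
  \frac{\partial C}{\partial t}
    = D \frac{\partial^2 C}{\partial x^2}
    - u \frac{\partial C}{\partial x}
    - k C,
\]
% where C(x, t) is the pollutant concentration, D the dispersion coefficient,
% u the flow velocity, and k the decay coefficient. In the setting described
% above, k is decomposed into an exact part and a deviation part, and a
% white-noise (stochastic covariance) term represents random influences on
% the river system.
```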

19.
Humans have been shown to combine noisy sensory information with previous experience (priors), in qualitative and sometimes quantitative agreement with the statistically optimal predictions of Bayesian integration. However, when the prior distribution becomes more complex than a simple Gaussian, such as skewed or bimodal, training takes much longer and performance appears suboptimal. It is unclear whether such suboptimality arises from an imprecise internal representation of the complex prior, or from additional constraints in performing probabilistic computations on complex distributions, even when accurately represented. Here we probe the sources of suboptimality in probabilistic inference using a novel estimation task in which subjects are exposed to an explicitly provided distribution, thereby removing the need to remember the prior. Subjects had to estimate the location of a target given a noisy cue and a visual representation of the prior probability density over locations, which changed on each trial. Different classes of priors were examined (Gaussian, unimodal, bimodal). Subjects' performance was in qualitative agreement with the predictions of Bayesian Decision Theory although generally suboptimal. The degree of suboptimality was modulated by statistical features of the priors but was largely independent of the class of the prior and level of noise in the cue, suggesting that suboptimality in dealing with complex statistical features, such as bimodality, may be due to a problem of acquiring the priors rather than computing with them. We performed a factorial model comparison across a large set of Bayesian observer models to identify additional sources of noise and suboptimality. Our analysis rejects several models of stochastic behavior, including probability matching and sample-averaging strategies. Instead we show that subjects' response variability was mainly driven by a combination of a noisy estimation of the parameters of the priors, and by variability in the decision process, which we represent as a noisy or stochastic posterior.
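For reference, in the simple Gaussian case against which such performance is usually benchmarked (a standard result, not the paper's full observer model), the Bayes-optimal estimate is a precision-weighted average of the cue and the prior mean:

```latex
\[
  \hat{s} = \frac{\sigma_c^{-2} x_c + \sigma_p^{-2} \mu_p}
                 {\sigma_c^{-2} + \sigma_p^{-2}},
\]
% where x_c is the noisy cue with variance sigma_c^2 and the prior over target
% locations is N(mu_p, sigma_p^2). For skewed or bimodal priors no such closed
% form exists and the posterior must be handled numerically, which is the
% regime in which the suboptimality discussed above was observed.
```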

20.
Bayesian Inference in Semiparametric Mixed Models for Longitudinal Data
Summary: We consider Bayesian inference in semiparametric mixed models (SPMMs) for longitudinal data. SPMMs are a class of models that use a nonparametric function to model a time effect, a parametric function to model other covariate effects, and parametric or nonparametric random effects to account for the within-subject correlation. We model the nonparametric function using a Bayesian formulation of a cubic smoothing spline, and the random effect distribution using a normal distribution and alternatively a nonparametric Dirichlet process (DP) prior. When the random effect distribution is assumed to be normal, we propose a uniform shrinkage prior (USP) for the variance components and the smoothing parameter. When the random effect distribution is modeled nonparametrically, we use a DP prior with a normal base measure and propose a USP for the hyperparameters of the DP base measure. We argue that the commonly assumed DP prior implies a nonzero mean of the random effect distribution, even when a base measure with mean zero is specified. This implies weak identifiability for the fixed effects, and can therefore lead to biased estimators and poor inference for the regression coefficients and the spline estimator of the nonparametric function. We propose an adjustment using a postprocessing technique. We show that under mild conditions the posterior is proper under the proposed USP, a flat prior for the fixed effect parameters, and an improper prior for the residual variance. We illustrate the proposed approach using a longitudinal hormone dataset, and carry out extensive simulation studies to compare its finite sample performance with existing methods.
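A generic SPMM of the kind described can be written as (our notation):

```latex
\[
  y_{ij} = \mathbf{x}_{ij}^{\top} \boldsymbol{\beta}
         + f(t_{ij})
         + \mathbf{z}_{ij}^{\top} \mathbf{b}_i
         + \varepsilon_{ij},
  \qquad
  \varepsilon_{ij} \sim N(0, \sigma^2),
\]
% where f is the nonparametric time effect (here a Bayesian cubic smoothing
% spline), beta collects the parametric covariate effects, and b_i are the
% subject-level random effects, modeled either as normal or with a Dirichlet
% process prior whose base-measure hyperparameters receive the proposed
% uniform shrinkage prior.
```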
