Similar Documents
20 similar documents found
1.
Murphy EF, Gilmour SG, Crabbe MJ. FEBS Letters 2004, 556(1-3):193-198.
Details of the parameters of kinetic systems are crucial for progress in both medical and industrial research, including drug development, clinical diagnosis, and biotechnology applications. Such details must be collected through a series of kinetic experiments, and the correct design of those experiments is essential to collecting data suitable for analysis, modelling, and deriving the correct information. We have developed a systematic, iterative Bayesian method, together with sets of rules, for the design of enzyme kinetic experiments. Our method selects the optimum design for collecting data suitable for accurate modelling and analysis, and minimises the error in the estimated parameters. The rules select features of the design such as the substrate range and the number of measurements. We show here that this method can be applied directly to the study of other important kinetic systems, including drug transport, receptor binding, microbial culture, and cell transport kinetics. It is possible to reduce the errors in the estimated parameters and, most importantly, to increase efficiency and cost-effectiveness by reducing the number of experiments and data points required.
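The abstract does not spell out the paper's design rules, so the following is a minimal sketch of the underlying idea in its simplest non-Bayesian form: a locally D-optimal choice of two substrate concentrations for Michaelis-Menten kinetics. The kinetic values and substrate range are invented for illustration.

```python
import numpy as np
from itertools import combinations

# Locally D-optimal two-point design for Michaelis-Menten kinetics,
# v = Vmax * S / (Km + S), under i.i.d. Gaussian measurement error.
# Parameter guesses are hypothetical; an iterative Bayesian scheme
# would refine them as data accumulate.
Vmax, Km = 10.0, 2.0

def fisher_info(S):
    """Fisher information for (Vmax, Km) at substrate levels S (unit variance)."""
    S = np.asarray(S, dtype=float)
    d_vmax = S / (Km + S)                 # sensitivity dv/dVmax
    d_km = -Vmax * S / (Km + S) ** 2      # sensitivity dv/dKm
    J = np.stack([d_vmax, d_km], axis=1)  # n x 2 Jacobian
    return J.T @ J

candidates = np.linspace(0.1, 20.0, 200)  # feasible substrate range (mM)
best = max(combinations(candidates, 2),
           key=lambda pair: np.linalg.det(fisher_info(pair)))
print("D-optimal substrate pair (mM):", np.round(best, 2))
```

Maximising the determinant of the Fisher information minimises the volume of the parameter confidence ellipsoid, which is the design-selection criterion in its crudest form; the paper's iterative Bayesian rules refine it as measurements arrive.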

2.
Integrating physical knowledge and machine learning is a critical aspect of developing industrially focused digital twins for the monitoring, optimisation, and design of microalgal and cyanobacterial photo-production processes. However, identifying the correct model structure to quantify complex biological mechanisms poses a severe challenge for the construction of kinetic models, while the scarcity of data, owing to time-consuming experiments, greatly impedes the application of most data-driven models. This study proposes an innovative hybrid modelling approach consisting of a simple kinetic model that governs the overall process trajectory and a data-driven model that estimates the mismatch between the kinetic equations and the real process. An advanced automatic model structure identification strategy is adopted to simultaneously identify the most physically probable kinetic model structure and the minimum number of data-driven model parameters that can accurately represent multiple data sets over a broad spectrum of process operating conditions. Through this hybrid modelling and automatic structure identification framework, a highly accurate mathematical model was constructed to simulate and optimise an algal lutein production process. The performance of this hybrid model for long-term predictive modelling, optimisation, and online self-calibration is demonstrated and thoroughly discussed, indicating its significant potential for future industrial application.
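As an illustration of the serial hybrid structure described above (kinetic backbone plus data-driven mismatch model), here is a minimal sketch under strong simplifying assumptions: logistic growth stands in for the photo-production kinetics, the "real" process and its unmodelled dynamics are synthetic, and a ridge regressor replaces the paper's automatically identified data-driven component.

```python
import numpy as np
from scipy.integrate import solve_ivp
from sklearn.linear_model import Ridge

# Kinetic backbone: logistic growth (a stand-in for the photo-production
# kinetics); mu_max and X_max are invented.
mu_max, X_max = 0.5, 5.0
def kinetic(t, X):
    return mu_max * X * (1.0 - X / X_max)

t_obs = np.linspace(0.0, 20.0, 30)
X_kin = solve_ivp(kinetic, (0.0, 20.0), [0.1], t_eval=t_obs).y[0]

# Synthetic "real" process: the kinetic trajectory distorted by an
# unmodelled oscillatory effect, plus measurement noise.
rng = np.random.default_rng(0)
X_true = X_kin * (1.0 - 0.1 * np.sin(0.4 * t_obs)) + rng.normal(0, 0.05, 30)

# Data-driven component: a ridge regressor fitted to the model-process
# mismatch (the paper identifies this component's structure automatically).
features = np.column_stack([t_obs, np.sin(0.4 * t_obs), np.cos(0.4 * t_obs)])
mismatch = Ridge(alpha=1.0).fit(features, X_true - X_kin)
X_hybrid = X_kin + mismatch.predict(features)

print("RMSE, kinetic only:", np.sqrt(np.mean((X_true - X_kin) ** 2)).round(4))
print("RMSE, hybrid      :", np.sqrt(np.mean((X_true - X_hybrid) ** 2)).round(4))
```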

3.
A common concern in Bayesian data analysis is that an inappropriately informative prior may unduly influence posterior inferences. In the context of Bayesian clinical trial design, well-chosen priors are important to ensure that posterior-based decision rules have good frequentist properties. However, it is difficult to quantify prior information in all but the most stylized models. This issue may be addressed by quantifying the prior information in terms of a number of hypothetical patients, i.e., a prior effective sample size (ESS). The prior ESS provides a useful tool for understanding the impact of prior assumptions: for example, it may be used to guide the calibration of prior variances and other hyperprior parameters. In this paper, we discuss such prior sensitivity analyses using a recently proposed method to compute a prior ESS. We apply this in several typical settings of Bayesian biomedical data analysis and clinical trial design. The data analyses include cross-tabulated counts, multiple correlated diagnostic tests, and ordinal outcomes using a proportional-odds model. The study designs include a phase I trial with late-onset toxicities, a phase II trial that monitors event times, and a phase I/II trial with dose-finding based on efficacy and toxicity.
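For intuition, the prior ESS idea has a closed form in the conjugate Beta-binomial case, where a Beta(a, b) prior is worth roughly a + b hypothetical patients; the paper's method extends this to models where no such closed form exists. A small sketch with illustrative numbers:

```python
from scipy.stats import beta

# A Beta(a, b) prior on a response probability carries the information of
# roughly a + b hypothetical patients.
a, b = 3.0, 7.0
print(f"Beta({a:g},{b:g}): mean = {a / (a + b):.2f}, prior ESS = {a + b:g}")

# Calibration in reverse: pick (a, b) to hit a target mean and target ESS,
# which is one way to tune a prior variance to a desired prior weight.
target_mean, target_ess = 0.30, 5.0
a2, b2 = target_mean * target_ess, (1.0 - target_mean) * target_ess
print(f"calibrated prior Beta({a2:.1f}, {b2:.1f}): sd = {beta.std(a2, b2):.3f}")
```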

4.
5.
Basket trials simultaneously evaluate the effect of one or more drugs on a defined biomarker, genetic alteration, or molecular target in a variety of disease subtypes, often called strata. A conventional approach to analyzing such trials is an independent analysis of each stratum. This analysis is inefficient, as it lacks the power to detect drug effects within each stratum. To address this issue, various designs for basket trials have been proposed, centering on designs that use Bayesian hierarchical models. In this article, we propose a novel Bayesian basket trial design that incorporates predictive sample size determination, early termination for inefficacy and efficacy, and the borrowing of information across strata. The borrowing of information is based on the similarity between the posterior distributions of the response probability. In general, Bayesian hierarchical models carry many distributional assumptions along with multiple parameters. By contrast, our method requires only prior distributions for the response probabilities and two parameters for the similarity of distributions. The proposed design is easier to implement and less computationally demanding than other Bayesian basket designs. Through simulations over various scenarios, our proposed design is compared with other designs, including one that does not borrow information and one that uses a Bayesian hierarchical model.
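The paper's exact similarity measure and its two tuning parameters are not given in the abstract; the sketch below illustrates the general idea with an assumed choice, the closed-form Hellinger affinity between the Beta posteriors of two strata, used as a borrowing weight.

```python
import numpy as np
from scipy.special import betaln

def affinity(a1, b1, a2, b2):
    """Closed-form Bhattacharyya (Hellinger) affinity of two Beta densities."""
    return np.exp(betaln((a1 + a2) / 2.0, (b1 + b2) / 2.0)
                  - 0.5 * (betaln(a1, b1) + betaln(a2, b2)))

# Stratum 1 observed 8/20 responses; compare a similar and a dissimilar
# second stratum. Beta(1, 1) priors; all counts are invented.
post1 = (1 + 8, 1 + 12)
for y, n in [(7, 18), (2, 19)]:
    post2 = (1 + y, 1 + n - y)
    w = affinity(*post1, *post2)   # in (0, 1]; 1 means identical posteriors
    print(f"stratum with {y}/{n} responses: borrowing weight ~ {w:.2f}")
```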

6.
Segmentation aims to separate homogeneous areas from sequential data and plays a central role in data mining. It has applications ranging from finance to molecular biology, where bioinformatics tasks such as genome data analysis are active application fields. In this paper, we present a novel application of segmentation to locating genomic regions with coexpressed genes. We aim at automated discovery of such regions without requiring user-given parameters. To perform the segmentation within a reasonable time, we use heuristics. Most heuristic segmentation algorithms require some decision on the number of segments, usually made with asymptotic model selection methods such as the Bayesian information criterion. Such methods rest on simplifications that can limit their usage. In this paper, we propose a Bayesian model selection procedure for choosing the most appropriate result from heuristic segmentation. Our Bayesian model uses a simple prior over segmentation solutions with varying numbers of segments and a modified Dirichlet prior for modeling multinomial data. We show with various artificial data sets in our benchmark system that our model selection criterion has the best overall performance. The application of our method to yeast cell-cycle gene expression data reveals potentially active and passive regions of the genome.

7.
This paper develops Bayesian sample size formulae for experiments comparing two groups, where relevant preexperimental information from multiple sources can be incorporated in a robust prior to support both the design and analysis. We use commensurate predictive priors for borrowing of information and further place Gamma mixture priors on the precisions to account for preliminary belief about the pairwise (in)commensurability between parameters that underpin the historical and new experiments. Averaged over the probability space of the new experimental data, appropriate sample sizes are found according to criteria that control certain aspects of the posterior distribution, such as the coverage probability or length of a defined density region. Our Bayesian methodology can be applied to circumstances that compare two normal means, proportions, or event times. When nuisance parameters (such as variance) in the new experiment are unknown, a prior distribution can further be specified based on preexperimental data. Exact solutions are available based on most of the criteria considered for Bayesian sample size determination, while a search procedure is described in cases for which there are no closed-form expressions. We illustrate the application of our sample size formulae in the design of clinical trials, where pretrial information is available to be leveraged. Hypothetical data examples, motivated by a rare-disease trial with an elicited expert prior opinion, and a comprehensive performance evaluation of the proposed methodology are presented.
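As a minimal illustration of sample size determination from a posterior-length criterion (one of the classes of criteria mentioned above), consider a normal mean with known variance and a single normal prior; the mixture and commensurate machinery of the paper is omitted, and all numbers are invented.

```python
from scipy.stats import norm

# Normal mean, known sigma, N(mu0, tau^2) prior: the posterior sd depends
# only on n, so the smallest n whose 95% credible interval is no longer
# than L can be found directly. sigma, tau, and L are illustrative.
sigma, tau, L = 2.0, 1.5, 1.0
z = norm.ppf(0.975)

def posterior_sd(n):
    return (1.0 / tau**2 + n / sigma**2) ** -0.5

n = next(n for n in range(1, 100_000) if 2 * z * posterior_sd(n) <= L)
print(f"required n = {n}; interval length = {2 * z * posterior_sd(n):.3f}")
```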

8.
Bayesian methods allow borrowing of historical information through prior distributions. The concept of prior effective sample size (prior ESS) facilitates quantification and communication of such prior information by equating it to a sample size. Prior information can arise from historical observations; thus, the traditional approach identifies the ESS with such a historical sample size. However, this measure is independent of newly observed data, and thus would not capture an actual “loss of information” induced by the prior in case of prior-data conflict. We build on recent work to relate prior impact to the number of (virtual) samples from the current data model and introduce the effective current sample size (ECSS) of a prior, tailored to application in Bayesian clinical trial designs. Special emphasis is put on robust mixture, power, and commensurate priors. We apply the approach to an adaptive design in which the number of recruited patients is adjusted depending on the effective sample size at an interim analysis. We argue that the ECSS is the appropriate measure in this case, as the aim is to save current (as opposed to historical) patients from recruitment. Furthermore, the ECSS can help overcome the lack of consensus in ESS assessment for mixture priors and can, more broadly, provide further insight into the impact of priors. An R package accompanies the paper.

9.

Background

Translating a known metabolic network into a dynamic model requires reasonable guesses of all enzyme parameters. In Bayesian parameter estimation, model parameters are described by a posterior probability distribution, which scores the potential parameter sets, showing how well each of them agrees with the data and with the prior assumptions made.

Results

We compute posterior distributions of kinetic parameters within a Bayesian framework, based on the integration of kinetic, thermodynamic, metabolic, and proteomic data. The structure of the metabolic system (i.e., stoichiometries and enzyme regulation) needs to be known, and the reactions are modelled by convenience kinetics with thermodynamically independent parameters. The parameter posterior is computed in two separate steps: a first posterior summarises the available data on enzyme kinetic parameters; an improved second posterior is obtained by integrating metabolic fluxes, concentrations, and enzyme concentrations for one or more steady states. The data can be heterogeneous, incomplete, and uncertain, and the posterior is approximated by a multivariate log-normal distribution. We apply the method to a model of the threonine synthesis pathway: the integration of metabolic data has little effect on the marginal posterior distributions of individual model parameters. Nevertheless, it leads to strong correlations between the parameters in the joint posterior distribution, which greatly improve the model predictions obtained in subsequent Monte-Carlo simulations.

Conclusion

We present a standardised method to translate metabolic networks into dynamic models. To determine the model parameters, evidence from various experimental data is combined and weighted using Bayesian parameter estimation. The resulting posterior parameter distribution describes a statistical ensemble of parameter sets; the parameter variances and correlations can account for missing knowledge, measurement uncertainties, or biological variability. The posterior distribution can be used to sample model instances and to obtain probabilistic statements about the model's dynamic behaviour.
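A minimal sketch of the kind of update involved, under strong simplifying assumptions: a single positive kinetic constant with a log-normal prior combined with one log-normally distributed measurement, so the posterior is a conjugate normal-normal update on the log scale. The numbers are illustrative, not taken from the threonine model.

```python
import numpy as np

# Prior for a kinetic constant (e.g. from a kinetics database) and one
# noisy measurement, both log-normal; the posterior is a conjugate
# normal-normal update on the log scale. All numbers are invented.
prior_mean_log, prior_sd_log = np.log(0.8), 0.7   # prior: Km around 0.8 mM
meas_km, meas_sd_log = 1.5, 0.3                   # measurement and log-error

w_prior, w_meas = prior_sd_log ** -2, meas_sd_log ** -2
post_var = 1.0 / (w_prior + w_meas)
post_mean = post_var * (w_prior * prior_mean_log + w_meas * np.log(meas_km))

lo, hi = np.exp(post_mean - post_var ** 0.5), np.exp(post_mean + post_var ** 0.5)
print(f"posterior median Km = {np.exp(post_mean):.2f} mM, "
      f"68% range [{lo:.2f}, {hi:.2f}] mM")
```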

10.
Bayesian phylogenetic methods require the selection of prior probability distributions for all parameters of the model of evolution. These distributions allow one to incorporate prior information into a Bayesian analysis, but even in the absence of meaningful prior information, a prior distribution must be chosen. In such situations, researchers typically seek to choose a prior that will have little effect on the posterior estimates produced by an analysis, allowing the data to dominate. Sometimes a prior that is uniform (assigning equal prior probability density to all points within some range) is chosen for this purpose. In reality, the appropriate prior depends on the parameterization chosen for the model of evolution, a choice that is largely arbitrary. There is an extensive Bayesian literature on appropriate prior choice, and it has long been appreciated that there are parameterizations for which uniform priors can have a strong influence on posterior estimates. We here discuss the relationship between model parameterization and prior specification, using the general time-reversible model of nucleotide evolution as an example. We present Bayesian analyses of 10 simulated data sets obtained using a variety of prior distributions and parameterizations of the general time-reversible model. Uniform priors can produce biased parameter estimates under realistic conditions, and a variety of alternative priors avoid this bias.
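A quick Monte Carlo demonstration of the core point, that "uniform" depends on the parameterization: a prior flat on the proportion p = r/(1+r) of two exchangeability rates is far from flat on the ratio r. This toy transformation is purely for illustration and is not the paper's GTR simulation setup.

```python
import numpy as np

# A prior flat on the proportion p = r / (1 + r) of two exchangeability
# rates induces a decidedly non-flat prior on the ratio r itself.
rng = np.random.default_rng(1)
p = rng.uniform(0.0, 1.0, 1_000_000)  # "uninformative" on the p scale
r = p / (1.0 - p)                     # implied prior on the r scale

print("P(r < 1)  =", np.mean(r < 1.0))    # 0.5 by symmetry
print("P(r > 10) =", np.mean(r > 10.0))   # ~0.09: a heavy right tail
print("median r  =", np.round(np.median(r), 3))
# The sample mean of r is unstable (its theoretical value is infinite),
# another sign that "uniform" here is anything but uninformative.
```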

11.
MOTIVATION: Cells continuously reprogram their gene expression network as they move through the cell cycle or sense changes in their environment. In order to understand the regulation of cells, time series expression profiles provide a more complete picture than single time point expression profiles. Few analysis techniques, however, are well suited to modelling such time series data. RESULTS: We describe an approach that naturally handles time series data with the capabilities of modelling causality, feedback loops, and environmental or hidden variables using a Dynamic Bayesian network. We also present a novel way of combining prior biological knowledge and current observations to improve the quality of analysis and to model interactions between sets of genes rather than individual genes. Our approach is evaluated on time series expression data measured in response to physiological changes that affect tryptophan metabolism in E. coli. Results indicate that this approach is capable of finding correlations between sets of related genes.

12.
Summary. A Bayesian method was developed for identifying genetic markers linked to quantitative trait loci (QTL) by analyzing data from daughter or granddaughter designs with single markers or marker pairs. Traditional methods may yield unrealistic results because linkage tests depend on the number of markers, and QTL gene effects associated with selected markers are overestimated. The Bayesian, or posterior, probability of linkage combines information from a daughter or granddaughter design with the prior probability of linkage between a marker locus and a QTL. If the posterior probability exceeds a certain threshold, linkage is declared. Upon accepting linkage, Bayesian estimates of the marker-QTL recombination rate and of QTL gene effects and frequencies are obtained. The Bayesian estimates of QTL gene effects account for differing amounts of information by shrinking the information from the data toward the mean or mode of a prior exponential distribution of gene effects. Computation of the Bayesian analysis is feasible. Exact results are given for biallelic QTL, and extensions to multiallelic QTL are suggested.
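The core posterior update is simply Bayes' rule applied to the linkage hypothesis; the sketch below assumes an illustrative prior and Bayes factor, whereas the paper derives these quantities from the daughter/granddaughter-design likelihood.

```python
# Prior probability of linkage and a Bayes factor from the design's
# likelihood are combined by Bayes' rule; both values here are invented.
prior_linked = 0.10
bayes_factor = 25.0   # P(data | linked) / P(data | unlinked)

prior_odds = prior_linked / (1.0 - prior_linked)
post_odds = bayes_factor * prior_odds
posterior_linked = post_odds / (1.0 + post_odds)

threshold = 0.95      # declare linkage only above this posterior probability
print(f"posterior P(linkage) = {posterior_linked:.3f}; "
      f"declare linkage: {posterior_linked > threshold}")
```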

13.
14.
Cook AR, Gibson GJ, Gilligan CA. Biometrics 2008, 64(3):860-868.
Summary. This article describes a method for choosing observation times for stochastic processes to maximise the expected information about their parameters. Two commonly used models for epidemiological processes are considered: a simple death process and a susceptible-infected (SI) epidemic process with dual sources of infection, spreading within and from outwith the population. The search for the optimal design uses Bayesian computational methods to explore the joint parameter-data-design space, combined with a method known as moment closure to approximate the likelihood and make the acceptance step efficient. For the processes considered, a small number of optimally chosen observations is shown to yield almost as much information as the much more intensively observed schemes commonly used in epidemiological experiments. Analysis of the simple death process allows a comparison between the full Bayesian approach and locally optimal designs around a point estimate from the prior, based on asymptotic results. The robustness of the approach to misspecified priors is demonstrated for the SI epidemic process, for which the computational intractability of the likelihood precludes locally optimal designs. We show that optimal designs derived by the Bayesian approach are similar for observational studies of a single epidemic and for studies involving replicated epidemics in independent subpopulations. Different optima result, however, when the objective is to maximise the gain in information based on informative and non-informative priors: this has implications when an experiment is designed to convince a naïve or sceptical observer rather than to consolidate the belief of an informed observer. Some extensions to the methods, including the selection of information criteria and extension to other epidemic processes with transition probabilities, are briefly addressed.

15.
This paper presents a new statistical technique, Bayesian Generalized Associative Functional Networks (GAFNs), for modelling the dynamic plant growth process of greenhouse crops. GAFNs are able to incorporate domain knowledge and data to model complex ecosystems. Through the functional networks and the Bayesian framework, prior knowledge can be naturally embedded into the model, and the functional relationship between inputs and outputs can be learned during training. Our main interest is in Generalized Associative Functional Networks, which are appropriate for modelling multiple-variable processes. Three main advantages result from applying Bayesian GAFN methods to modelling the dynamic process of plant growth. First, the approach provides a powerful tool for revealing useful relationships between greenhouse environmental factors and plant growth parameters. Second, Bayesian GAFNs can model Multiple-Input Multiple-Output (MIMO) systems from the given data, and a single final model showed good generalization capability, successfully fitting all 12 data sets from five years of field experiments. Third, the Bayesian GAFN method can also serve as an optimization tool for estimating parameters of interest in the agro-ecosystem. In this work, two algorithms are proposed for statistical inference of the parameters in GAFNs. Both are based on variational inference, also called variational Bayes (VB), which can provide probabilistic interpretations of the built models. VB-based learning methods yield estimates of the full posterior distributions of the model parameters. Synthetic and real-world examples confirm the validity of the proposed methods.

16.
A Bayesian model-based clustering approach is proposed for identifying differentially expressed genes in meta-analysis. A Bayesian hierarchical model is used as a scientific tool for combining information from different studies, and a mixture prior is used to separate differentially expressed genes from non-differentially expressed genes. Posterior estimation of the parameters and missing observations is done using a simple Markov chain Monte Carlo method. From the estimated mixture model, useful measures of the significance of a test, such as the Bayesian false discovery rate (FDR), the local FDR (Efron et al., 2001), and the integration-driven discovery rate (IDR; Choi et al., 2003), can be computed easily. The model-based approach is also compared with commonly used permutation methods, and it is shown to be superior when under-expressed genes greatly outnumber over-expressed genes, or vice versa. The proposed method is applied to four publicly available prostate cancer gene expression data sets and to simulated data sets.
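For intuition, the sketch below computes the local FDR and a Bayesian FDR from a two-component normal mixture with fixed, invented parameters; in the paper the mixture is estimated by Markov chain Monte Carlo rather than fixed.

```python
import numpy as np
from scipy.stats import norm

# z-scores from pi0 * N(0,1) (nulls) + (1 - pi0) * N(mu1, s1) (changed
# genes); mixture parameters are fixed here but estimated in the paper.
pi0, mu1, s1 = 0.9, 2.5, 1.0

def local_fdr(z):
    f0, f1 = norm.pdf(z), norm.pdf(z, mu1, s1)
    return pi0 * f0 / (pi0 * f0 + (1.0 - pi0) * f1)

print("local FDR:", np.round(local_fdr(np.array([1.0, 2.0, 3.0, 4.0])), 3))

# Bayesian FDR of the rule "call z > 2.5 significant": the marginal-density
# weighted average of the local FDR over the rejection region.
zs = np.linspace(2.5, 6.0, 500)
marginal = pi0 * norm.pdf(zs) + (1.0 - pi0) * norm.pdf(zs, mu1, s1)
print("Bayesian FDR (z > 2.5):",
      round(np.average(local_fdr(zs), weights=marginal), 3))
```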

17.
Microarray technology is a powerful tool for animal functional genomics studies, with applications spanning from gene identification and mapping to the function and control of gene expression. Microarray assays, however, are complex and costly, and hence are generally performed with relatively small numbers of animals. Nevertheless, they generate data sets of unprecedented complexity and dimensionality. Such trials therefore require careful planning and experimental design, in addition to tailored statistical and computational tools for appropriate data mining. In this review, we discuss experimental design and data analysis strategies that incorporate prior genomic and biological knowledge, such as genotypes and gene function and pathway membership. We focus the discussion on the design of genetical genomics studies and on significance testing for the detection of differential expression. It is shown that the use of prior biological information can improve the efficiency of microarray experiments.

18.
Pilot studies are often used to help design ecological studies. Ideally the pilot data are incorporated into the full-scale study data, but if the pilot study's results indicate a need for major changes to the experimental design, then pooling pilot and full-scale study data is difficult. The default position is to disregard the preliminary data. But ignoring pilot study data after a more comprehensive study has been completed either forgoes statistical power or costs more, by requiring additional samples equivalent to the pilot study's sample size. With Bayesian methods, pilot study data can instead be used as an informative prior for a model built from the full-scale study dataset. We demonstrate a Bayesian method for recovering information from otherwise unusable pilot study data with a case study on eucalypt seedling mortality. A pilot study of eucalypt tree seedling mortality was conducted in southeastern Australia in 2005. A larger study with a modified design was conducted the following year. The two datasets differed substantially, so they could not easily be combined. Posterior estimates of the pilot dataset's model parameters were used to inform a model for the second, larger dataset. Model checking indicated that incorporating prior information maintained the predictive capacity of the model with respect to the training data. Importantly, adding prior information improved model accuracy in predicting a validation dataset, and it increased the precision and the effective sample size for estimating the average mortality rate. We recommend that practitioners move away from the default position of discarding pilot study data when they are incompatible with the form of their full-scale studies. More generally, we recommend that ecologists use informative priors more frequently to reap the benefits of the additional data.
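A minimal sketch of the pilot-as-prior mechanics in the conjugate Beta-binomial case, with invented counts; the paper's actual seedling-mortality models are more elaborate, but the update pattern (the pilot posterior becomes the full study's prior) is the same.

```python
from scipy.stats import beta

# Invented counts: a small 2005 pilot and a larger 2006 full-scale study.
deaths_pilot, n_pilot = 12, 40
deaths_full, n_full = 55, 200

# Vague Beta(1, 1) prior -> pilot posterior -> prior for the full study.
a, b = 1 + deaths_pilot, 1 + (n_pilot - deaths_pilot)

flat = beta(1 + deaths_full, 1 + n_full - deaths_full)      # pilot discarded
informed = beta(a + deaths_full, b + n_full - deaths_full)  # pilot recovered

print(f"mortality, pilot discarded: {flat.mean():.3f} +/- {flat.std():.3f}")
print(f"mortality, pilot as prior : {informed.mean():.3f} +/- "
      f"{informed.std():.3f} (effective n gains {n_pilot})")
```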

19.
Bayesian clinical trial designs offer the possibility of a substantially reduced sample size, increased statistical power, and reductions in cost and ethical hazard. However, when prior and current information conflict, Bayesian methods can lead to higher than expected type I error, as well as the possibility of a costlier and lengthier trial. This motivates an investigation of hierarchical Bayesian methods for incorporating historical data that are adaptively robust to prior information that reveals itself to be inconsistent with the accumulating experimental data. In this article, we present several models that allow the commensurability of the information in the historical and current data to determine how much historical information is used. A primary tool is an elaboration of the traditional power prior approach, based upon a measure of commensurability for Gaussian data. We compare the frequentist performance of several methods using simulations, and close with an example of a colon cancer trial that illustrates a linear-models extension of our adaptive borrowing approach. Our proposed methods produce more precise estimates of the model parameters and, in particular, confer statistical significance on the observed reduction in tumor size for the experimental regimen as compared with the control regimen.
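The power prior mentioned above has a simple conjugate form for binomial data when the discounting parameter a0 is fixed; the paper's contribution is to let a commensurability measure adapt this discounting, which the fixed-a0 sketch below (with invented counts) does not do.

```python
from scipy.stats import beta

# Historical and current control-arm counts (invented). The historical
# likelihood enters raised to a0 in [0, 1], so a0 = 0 discards and a0 = 1
# fully pools the historical data; with binomial data and a Beta(1, 1)
# prior the posterior stays conjugate.
y_hist, n_hist = 30, 100
y_curr, n_curr = 12, 50

for a0 in (0.0, 0.5, 1.0):
    post = beta(1 + y_curr + a0 * y_hist,
                1 + (n_curr - y_curr) + a0 * (n_hist - y_hist))
    print(f"a0 = {a0:.1f}: posterior mean = {post.mean():.3f}, "
          f"sd = {post.std():.3f}")
```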

20.
A Bayesian network classification methodology for gene expression data.
We present new techniques for the application of a Bayesian network learning framework to the problem of classifying gene expression data. The focus on classification permits us to develop techniques that address in several ways the complexities of learning Bayesian nets. Our classification model reduces the Bayesian network learning problem to the problem of learning multiple subnetworks, each consisting of a class label node and its set of parent genes. We argue that this classification model is more appropriate for the gene expression domain than are other structurally similar Bayesian network classification models, such as Naive Bayes and Tree Augmented Naive Bayes (TAN), because our model is consistent with prior domain experience suggesting that a relatively small number of genes, taken in different combinations, is required to predict most clinical classes of interest. Within this framework, we consider two different approaches to identifying parent sets which are supported by the gene expression observations and any other currently available evidence. One approach employs a simple greedy algorithm to search the universe of all genes; the second approach develops and applies a gene selection algorithm whose results are incorporated as a prior to enable an exhaustive search for parent sets over a restricted universe of genes. Two other significant contributions are the construction of classifiers from multiple, competing Bayesian network hypotheses and algorithmic methods for normalizing and binning gene expression data in the absence of prior expert knowledge. Our classifiers are developed under a cross validation regimen and then validated on corresponding out-of-sample test sets. The classifiers attain a classification rate in excess of 90% on out-of-sample test sets for two publicly available datasets. We present an extensive compilation of results reported in the literature for other classification methods run against these same two datasets. Our results are comparable to, or better than, any we have found reported for these two sets, when a train-test protocol as stringent as ours is followed.

