Similar literature
20 similar articles found
1.
Upstream bioprocess characterization and optimization are time‐ and resource‐intensive tasks. In the biopharmaceutical industry, statistical design of experiments (DoE) is regularly used in combination with response surface models (RSMs), neglecting the process trajectories and dynamics. Generating process understanding with time‐resolved, dynamic process models makes it possible to assess the impact of temporal deviations and production dynamics, and provides a better understanding of the process variations that stem from the biological subsystem. The authors propose to use DoE studies in combination with hybrid modeling for process characterization. This approach is showcased on Escherichia coli fed‐batch cultivations at the 20 L scale, evaluating the impact of three critical process parameters. The performance of a hybrid model is compared to a pure data‐driven model and the widely adopted RSM of the process endpoints. Further, the performance of the time‐resolved models to simultaneously predict biomass and titer is evaluated. The superior behavior of the hybrid model compared to the pure black‐box approaches for process characterization is presented. The evaluation considers important criteria, such as the prediction accuracy of the biomass and titer endpoints as well as the time‐resolved trajectories. This showcases the high potential of hybrid models for soft‐sensing and model predictive control.

2.
Yuan Y, Yin G. Biometrics, 2011, 67(4): 1543-1554
In the estimation of a dose-response curve, parametric models are straightforward and efficient but subject to model misspecifications; nonparametric methods are robust but less efficient. As a compromise, we propose a semiparametric approach that combines the advantages of parametric and nonparametric curve estimates. In a mixture form, our estimator takes a weighted average of the parametric and nonparametric curve estimates, in which a higher weight is assigned to the estimate with a better model fit. When the parametric model assumption holds, the semiparametric curve estimate converges to the parametric estimate and thus achieves high efficiency; when the parametric model is misspecified, the semiparametric estimate converges to the nonparametric estimate and remains consistent. We also consider an adaptive weighting scheme to allow the weight to vary according to the local fit of the models. We conduct extensive simulation studies to investigate the performance of the proposed methods and illustrate them with two real examples.
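
The weighted-average idea in this abstract can be illustrated with a small sketch. It is not the authors' estimator: the straight-line parametric model, the Gaussian-kernel smoother, the in-sample residual-sum-of-squares weighting rule, and the names `kernel_smooth` and `mixture_curve` are all assumptions chosen only to show how a better-fitting component can receive a larger weight.

```python
import numpy as np


def kernel_smooth(x_train, y_train, x_eval, bandwidth):
    """Gaussian-kernel (Nadaraya-Watson) smoother, used as the nonparametric part."""
    scaled = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * scaled**2)
    return (weights @ y_train) / weights.sum(axis=1)


def mixture_curve(x, y, x_eval, bandwidth=0.5):
    """Weighted average of a parametric and a nonparametric curve estimate.

    Hypothetical weighting rule: the parametric weight is the nonparametric
    share of the total in-sample residual sum of squares, so the component
    with the smaller error gets the larger weight.
    """
    # Parametric part: a straight line (an assumption for this sketch).
    slope, intercept = np.polyfit(x, y, deg=1)
    par_train = intercept + slope * x
    par_eval = intercept + slope * x_eval

    # Nonparametric part: kernel smoother.
    non_train = kernel_smooth(x, y, x, bandwidth)
    non_eval = kernel_smooth(x, y, x_eval, bandwidth)

    rss_par = np.sum((y - par_train) ** 2)
    rss_non = np.sum((y - non_train) ** 2)
    w_par = rss_non / (rss_par + rss_non)

    return w_par * par_eval + (1.0 - w_par) * non_eval, w_par


x = np.linspace(0.0, 5.0, 30)
y = 2.0 * x + 1.0  # data that follow the parametric model exactly
estimate, w_par = mixture_curve(x, y, x)
print(round(float(w_par), 3))
```

When the data follow the straight line exactly, the parametric weight is essentially one and the mixture reproduces the parametric fit, which mirrors the efficiency argument in the abstract. A fairer comparison of fits would use cross-validated errors rather than in-sample ones.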

3.
Cao J, Wang L, Xu J. Biometrics, 2011, 67(4): 1305-1313
Applied scientists often like to use ordinary differential equations (ODEs) to model complex dynamic processes that arise in biology, engineering, medicine, and many other areas. It is interesting but challenging to estimate ODE parameters from noisy data, especially when the data have some outliers. We propose a robust method to address this problem. The dynamic process is represented with a nonparametric function, which is a linear combination of basis functions. The nonparametric function is estimated by a robust penalized smoothing method. The penalty term is defined with the parametric ODE model, which controls the roughness of the nonparametric function and maintains the fidelity of the nonparametric function to the ODE model. The basis coefficients and ODE parameters are estimated in two nested levels of optimization. The coefficient estimates are treated as an implicit function of ODE parameters, which enables one to derive the analytic gradients for optimization using the implicit function theorem. Simulation studies show that the robust method gives satisfactory estimates for the ODE parameters from noisy data with outliers. The robust method is demonstrated by estimating a predator-prey ODE model from real ecological data.

4.
Bayesian Inference in Semiparametric Mixed Models for Longitudinal Data   Total citations: 1 (self-citations: 0, citations by others: 1)
Summary. We consider Bayesian inference in semiparametric mixed models (SPMMs) for longitudinal data. SPMMs are a class of models that use a nonparametric function to model a time effect, a parametric function to model other covariate effects, and parametric or nonparametric random effects to account for the within-subject correlation. We model the nonparametric function using a Bayesian formulation of a cubic smoothing spline, and the random effect distribution using a normal distribution and alternatively a nonparametric Dirichlet process (DP) prior. When the random effect distribution is assumed to be normal, we propose a uniform shrinkage prior (USP) for the variance components and the smoothing parameter. When the random effect distribution is modeled nonparametrically, we use a DP prior with a normal base measure and propose a USP for the hyperparameters of the DP base measure. We argue that the commonly assumed DP prior implies a nonzero mean of the random effect distribution, even when a base measure with mean zero is specified. This implies weak identifiability for the fixed effects, and can therefore lead to biased estimators and poor inference for the regression coefficients and the spline estimator of the nonparametric function. We propose an adjustment using a postprocessing technique. We show that under mild conditions the posterior is proper under the proposed USP, a flat prior for the fixed effect parameters, and an improper prior for the residual variance. We illustrate the proposed approach using a longitudinal hormone dataset, and carry out extensive simulation studies to compare its finite sample performance with existing methods.

5.
Market impact cost is the most significant portion of implicit transaction costs, and reducing it can lower the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models (neural networks, Bayesian neural networks, Gaussian processes, and support vector regression) to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data from the US stock market via the Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model such as the I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving prediction performance.

6.
Process understanding is emphasized in the process analytical technology initiative and the quality by design paradigm to be essential for manufacturing biopharmaceutical products with consistent high quality. A typical approach to developing a process understanding is applying a combination of design of experiments with statistical data analysis. Hybrid semi-parametric modeling is investigated as an alternative method to pure statistical data analysis. The hybrid model framework provides flexibility to select model complexity based on available data and knowledge. Here, a parametric dynamic bioreactor model is integrated with a nonparametric artificial neural network that describes biomass and product formation rates as a function of varied fed-batch fermentation conditions for high cell density heterologous protein production with E. coli. Our model can accurately describe biomass growth and product formation across variations in induction temperature, pH and feed rates. The model indicates that while product expression rate is a function of early induction phase conditions, it is negatively impacted as productivity increases. This could correspond with physiological changes due to cytoplasmic product accumulation. Due to the dynamic nature of the model, rational process timing decisions can be made and the impact of temporal variations in process parameters on product formation and process performance can be assessed, which is central for process understanding.
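
One way to picture the hybrid structure described in this abstract is a set of fed-batch material balances (the parametric part) whose specific rates are supplied by a neural network (the nonparametric part). The equations below are a generic sketch and not the authors' model: the symbols $X$ and $P$ (biomass and product concentrations), $V$ (volume), $F$ (feed rate), $\mu$ and $q_P$ (specific growth and product formation rates), and the network $f_{\mathrm{NN}}$ with weights $w$ and its input list are all assumptions introduced here.

```latex
\begin{aligned}
\frac{dX}{dt} &= \mu\,X - \frac{F}{V}\,X, \\
\frac{dP}{dt} &= q_P\,X - \frac{F}{V}\,P, \\
\frac{dV}{dt} &= F, \\
(\mu,\; q_P) &= f_{\mathrm{NN}}\bigl(T,\ \mathrm{pH},\ F,\ X,\ P;\ w\bigr).
\end{aligned}
```

In such a formulation the balances enforce dilution and conservation, while only the rate expressions have to be learned from the experimental data.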

7.
Nonparametric feature selection for high-dimensional data is an important and challenging problem in the fields of statistics and machine learning. Most of the existing methods for feature selection focus on parametric or additive models which may suffer from model misspecification. In this paper, we propose a new framework to perform nonparametric feature selection for both regression and classification problems. Under this framework, we learn prediction functions through empirical risk minimization over a reproducing kernel Hilbert space. The space is generated by a novel tensor product kernel, which depends on a set of parameters that determines the importance of the features. Computationally, we minimize the empirical risk with a penalty to estimate the prediction and kernel parameters simultaneously. The solution can be obtained by iteratively solving convex optimization problems. We study the theoretical property of the kernel feature space and prove the oracle selection property and Fisher consistency of our proposed method. Finally, we demonstrate the superior performance of our approach compared to existing methods via extensive simulation studies and applications to two real studies.

8.
9.
We consider testing whether the nonparametric function in a semiparametric additive mixed model is a simple fixed degree polynomial, for example, a simple linear function. This test provides a goodness-of-fit test for checking parametric models against nonparametric models. It is based on the mixed-model representation of the smoothing spline estimator of the nonparametric function and the variance component score test by treating the inverse of the smoothing parameter as an extra variance component. We also consider testing the equivalence of two nonparametric functions in semiparametric additive mixed models for two groups, such as treatment and placebo groups. The proposed tests are applied to data from an epidemiological study and a clinical trial and their performance is evaluated through simulations.

10.
Li R, Nie L. Biometrics, 2008, 64(3): 904-911
Summary. Motivated by an analysis of a real data set in ecology, we consider a class of partially nonlinear models where both a nonparametric component and a parametric component are present. We develop two new estimation procedures to estimate the parameters in the parametric component. Consistency and asymptotic normality of the resulting estimators are established. We further propose an estimation procedure and a generalized F-test procedure for the nonparametric component in the partially nonlinear models. Asymptotic properties of the newly proposed estimation procedure and the test statistic are derived. Finite sample performance of the proposed inference procedures is assessed by Monte Carlo simulation studies. An application in ecology is used to illustrate the proposed methods.

11.
Using the method of generalized threshold models, the problem of evaluating the parametric stability of a model of the gene subnetwork controlling the early ontogenesis of the fruit fly Drosophila melanogaster is formulated and solved. Computer experiments have been performed to test the parametric stability of the model. Quantitative evaluations have been obtained for the parametric stability of the Drosophila gene subnetwork in nuclei along the embryo's anterior-posterior axis. The results of the computer experiments have been compared with previous research data on the "sensitivity" of functioning regimes to random changes of the parameters in models of prokaryotic and eukaryotic systems, namely the system controlling lambda-phage development and the subsystem controlling the flower morphogenesis of Arabidopsis thaliana. The obtained results confirm the high parametric stability of gene networks that control the development of organisms.

12.
The problem of evaluating the parametric stability of three models of pro- and eukaryotic gene networks controlling ontogenetic processes has been defined and solved. Experimental plans of testing gene networks for parametric stability based on the method of generalized threshold models were developed and realized as a software application. We examined the "sensitivity" of the functioning modes to random variations of the parameters in the three model systems: the system of developmental control of phage lambda, the subsystem of morphogenetic control of Arabidopsis thaliana flower, and the gene subnetwork controlling early ontogeny in Drosophila melanogaster. The parametric stability was quantitatively assessed for these models.

13.
Population multiple components is a statistical tool useful for the analysis of time-dependent hybrid data. With a small number of parameters, it is possible to model and to predict the periodic behavior of a population. In this article, we propose two methods to compare among populations rhythmometric parameters obtained by multiple component analysis. The first is a parametric method based on the usual statistical techniques for comparison of mean vectors in multivariate normal populations. The method, through MANOVA analysis, allows comparison of the MESOR and amplitude-acrophase pair of each component among two or more populations. The second is a nonparametric method, based on bootstrap techniques, to compare parameters from two populations. This test allows one to compare the MESOR, the amplitude, and the acrophase of each fitted component, as well as the global amplitude, orthophase, and bathyphase estimated when all fitted components are harmonics of a fundamental period. The idea is to calculate a confidence interval for the difference of the parameters of interest. If this interval does not contain zero, it can be concluded that the parameters from the two models are different with high probability. An estimate of the p-value for the corresponding test can also be calculated. Both methods are illustrated with an example, based on clinical data. The nonparametric test can also be applied to paired data, a special situation of great interest in practice. By the use of similar bootstrap techniques, we illustrate how to construct confidence intervals for any rhythmometric parameter estimated from population multiple components models, including the orthophase, bathyphase, and global amplitude. These tests for comparison of parameters among populations are a needed tool when modeling the nonsinusoidal rhythmic behavior of hybrid data by population multiple component analysis.
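
The bootstrap comparison outlined in this abstract (build a confidence interval for the difference of a parameter between two populations and check whether it contains zero) might look roughly like the following. This is a simplified sketch rather than the published procedure: it fits a single cosine component instead of multiple components, uses a plain percentile interval, and the function names `fit_cosinor` and `bootstrap_diff_ci` as well as the example amplitudes are hypothetical.

```python
import numpy as np


def fit_cosinor(t, y, period=24.0):
    """Least-squares fit of y = M + b*cos(wt) + g*sin(wt); returns (MESOR, amplitude)."""
    omega = 2.0 * np.pi / period
    design = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    mesor, beta, gamma = np.linalg.lstsq(design, y, rcond=None)[0]
    return mesor, np.hypot(beta, gamma)


def bootstrap_diff_ci(params_a, params_b, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the difference of population mean parameters.

    params_a and params_b hold one estimate per individual (for example the
    amplitude returned by fit_cosinor); individuals are resampled with replacement.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(params_a, dtype=float)
    b = np.asarray(params_b, dtype=float)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        diffs[i] = rng.choice(a, size=a.size).mean() - rng.choice(b, size=b.size).mean()
    alpha = 1.0 - level
    low, high = np.quantile(diffs, [alpha / 2.0, 1.0 - alpha / 2.0])
    return low, high


# Hypothetical per-subject amplitudes for two populations.
amp_a = [5.0, 5.1, 4.9, 5.2, 5.0, 4.8]
amp_b = [3.0, 3.1, 2.9, 3.2, 3.0, 2.8]
low, high = bootstrap_diff_ci(amp_a, amp_b)
print(low > 0.0)  # interval excludes zero, so the amplitudes differ
```

Resampling whole individuals rather than single time points keeps the within-subject structure intact, which is the usual choice for population-level rhythm parameters.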

14.
We present a new algorithm to estimate the hemodynamic response function (HRF) and drift components of fMRI data in the wavelet domain. The HRF is modeled by both parametric and nonparametric models. The functional magnetic resonance imaging (fMRI) noise is modeled as a fractional Brownian motion (fBm). The HRF parameters are estimated in the wavelet domain by exploiting the property that wavelet transforms with a sufficient number of vanishing moments decorrelate an fBm process. Using this property, the noise covariance matrix in the wavelet domain can be assumed to be diagonal, with entries estimated using the sample variance estimator at each scale. We study the influence of the sampling rate of the fMRI time series and the shape assumption of the HRF on the estimation performance. Results are presented by adding synthetic HRFs to simulated and null fMRI data. We also compare these methods with an existing method (1), where correlated fMRI noise is modeled by second-order polynomial functions.

16.
Wu B, Guan Z, Zhao H. Biometrics, 2006, 62(3): 735-744
Nonparametric and parametric approaches have been proposed to estimate false discovery rate under the independent hypothesis testing assumption. The parametric approach has been shown to have better performance than the nonparametric approaches. In this article, we study the nonparametric approaches and quantify the underlying relations between parametric and nonparametric approaches. Our study reveals the conservative nature of the nonparametric approaches, and establishes the connections between the empirical Bayes method and p-value-based nonparametric methods. Based on our results, we advocate using the parametric approach, or directly modeling the test statistics using the empirical Bayes method.
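
To make "p-value-based nonparametric" estimation concrete, here is a minimal sketch of one widely used estimator of that type, in the style of Storey's method. The abstract does not say which estimators the authors analyse, so treating this one as representative is an assumption, as are the function name `estimate_fdr` and the tuning value `lam=0.5`.

```python
import numpy as np


def estimate_fdr(p_values, threshold, lam=0.5):
    """Storey-type plug-in estimate of the false discovery rate at a p-value cutoff.

    The null proportion is estimated from the p-values above `lam`, where
    mostly true nulls are expected; any non-null p-values that land there
    push the estimate up, which is one source of conservativeness.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    pi0 = min(1.0, float(np.mean(p > lam)) / (1.0 - lam))
    n_rejected = max(int(np.sum(p <= threshold)), 1)
    return min(1.0, pi0 * m * threshold / n_rejected)


p_vals = [0.001, 0.002, 0.003, 0.004, 0.6, 0.7, 0.8, 0.9]
print(estimate_fdr(p_vals, threshold=0.01))
```

In this toy example the true null proportion is plausibly about one half, yet the estimate of the null proportion is capped at one, which illustrates the upward bias that the abstract describes as conservative.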

17.
The enzyme cellulase, a multienzyme complex made up of several proteins, catalyzes the conversion of cellulose to glucose in an enzymatic hydrolysis-based biomass-to-ethanol process. Production of cellulase enzyme proteins in large quantities using the fungus Trichoderma reesei requires understanding the dynamics of growth and enzyme production. The method of neural network parameter function modeling, which combines the approximation capabilities of neural networks with fundamental process knowledge, is utilized to develop a mathematical model of this dynamic system. In addition, kinetic models are also developed. Laboratory data from bench-scale fermentations involving growth and protein production by T. reesei on lactose and xylose are used to estimate the parameters in these models. The relative performances of the various models and the results of optimizing these models on two different performance measures are presented. An approximately 33% lower root-mean-squared error (RMSE) in protein predictions and about 40% lower total RMSE is obtained with the neural network-based model as opposed to kinetic models. Using the neural network-based model, the RMSE in predicting optimal conditions for two performance indices is about 67% and 40% lower, respectively, when compared with the kinetic models. Thus, both model predictions and optimization results from the neural network-based model are found to be closer to the experimental data than those from the kinetic models developed in this work. It is shown that the neural network parameter function modeling method can be useful as a "macromodeling" technique to rapidly develop dynamic models of a process.

18.
This paper reviews a general framework for the modelling of longitudinal data with random measurement times based on marked point processes and presents a worked example. We construct quite general regression models for longitudinal data, which may in particular include censoring that depends only on the past and on outside random variation, and dependencies between measurement times and measurements. The modelling also generalises statistical counting process models. We review a nonparametric Nadaraya-Watson kernel estimator of the regression function, and a parametric analysis that is based on a conditional least squares (CLS) criterion. The parametric analysis presented is a conditional version of the generalised estimating equations of Liang and Zeger (1986). We conclude that the usual nonparametric and parametric regression modelling can be applied to this general set-up, with some modifications. The presented framework provides an easily implemented and powerful tool for model building for repeated measurements.
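
For orientation, the Nadaraya-Watson estimator named in this abstract has the following textbook form for observations $(x_i, y_i)$, $i = 1, \dots, n$. The kernel $K$ and bandwidth $h$ are generic symbols assumed here, and the version adapted to marked point processes in the paper may differ in its details.

```latex
\hat{m}_h(x) \;=\;
\frac{\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) y_i}
     {\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)}
```

The estimate at $x$ is a locally weighted average of the responses, with weights that decay as $x_i$ moves away from $x$ at a rate set by $h$.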

19.
Tang L, Emerson SS, Zhou XH. Biometrics, 2008, 64(4): 1137-1145
Summary. Comparison of the accuracy of two diagnostic tests using their receiver operating characteristic (ROC) curves has typically been conducted using fixed sample designs. On the other hand, the human experimentation inherent in a comparison of diagnostic modalities argues for periodic monitoring of the accruing data to address many issues related to the ethics and efficiency of the medical study. To date, very little research has been done on the use of sequential sampling plans for comparative ROC studies, even when these studies may use expensive and unsafe diagnostic procedures. In this article we propose a nonparametric group sequential design plan. The nonparametric sequential method adapts a nonparametric family of weighted area under the ROC curve statistics (Wieand et al., 1989, Biometrika 76, 585-592) and a group sequential sampling plan. We illustrate the implementation of this nonparametric approach for sequentially comparing ROC curves in the context of diagnostic screening for non-small-cell lung cancer. We also describe a semiparametric sequential method based on proportional hazards models. We compare the statistical properties of the nonparametric approach with alternative semiparametric and parametric analyses in simulation studies. The results show the nonparametric approach is robust to model misspecification and has excellent finite-sample performance.
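
The weighted area-under-the-curve statistics mentioned in this abstract build on the ordinary nonparametric AUC, which can be computed directly from pairwise comparisons of test results. The sketch below shows only that unweighted building block (the Mann-Whitney form), without the weighting or the group sequential monitoring; the function name `empirical_auc` and the example scores are hypothetical.

```python
import numpy as np


def empirical_auc(diseased, healthy):
    """Nonparametric AUC: share of (diseased, healthy) pairs ranked correctly, ties count 1/2."""
    d = np.asarray(diseased, dtype=float)[:, None]
    h = np.asarray(healthy, dtype=float)[None, :]
    return float(np.mean((d > h) + 0.5 * (d == h)))


# Hypothetical scores from two diagnostic tests on the same subjects.
auc_test_1 = empirical_auc([3, 4, 5], [1, 2, 3])
auc_test_2 = empirical_auc([2, 3, 5], [1, 3, 4])
print(auc_test_1 - auc_test_2)  # difference in accuracy between the two tests
```

In a sequential design, a difference of this kind would be recomputed at each interim look and compared with stopping boundaries, a step this sketch leaves out.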

20.
An estimator of the hazard rate function from discrete failure time data is obtained by semiparametric smoothing of the (nonsmooth) maximum likelihood estimator, which is achieved by repeated multiplication of a Markov chain transition-type matrix. This matrix is constructed so as to have a given standard discrete parametric hazard rate model, termed the vehicle model, as its stationary hazard rate. As with the discrete density estimation case, the proposed estimator gives improved performance when the vehicle model is a good one and otherwise provides a nonparametric method comparable to the only purely nonparametric smoother discussed in the literature. The proposed semiparametric smoothing approach is then extended to hazard models with covariates and is illustrated by applications to simulated and real data sets.
