Similar literature
Found 20 similar documents (search time: 46 ms)
1.
Dielectric spectroscopy has frequently been investigated as an on-line cell culture monitoring tool; however, it still requires supporting data and experience to become a robust technique. In this study, dielectric spectroscopy was used to predict viable cell density (VCD) at industrially relevant high levels in concentrated fed-batch culture of Chinese hamster ovary cells producing a monoclonal antibody for pharmaceutical purposes. For on-line dielectric spectroscopy measurements, capacitance was scanned over a wide range of frequencies (100–19,490 kHz) in six parallel cell cultivation batches. Prior to detailed mathematical analysis of the collected data, principal component analysis (PCA) was applied to compare the dielectric behavior of the cultivations, and it revealed measurement disturbances. Using the measured spectroscopic data, partial least squares (PLS) regression, Cole–Cole, and linear modeling were applied and compared for predicting VCD. The Cole–Cole and PLS models provided reliable prediction over the entire cultivation, including both the early and decline phases of cell growth, while the linear model failed to estimate VCD in the later, declining cultivation phase. The three modeling approaches differed markedly in their sensitivity to measurement error. In the runs with measurement disturbances, VCD prediction accuracy could be improved by first-derivative pre-treatment in PLS and by parameter optimization in the Cole–Cole modeling.
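
For orientation, the Cole–Cole model referred to in this abstract is commonly written as below. The abstract does not state which parameterization the authors fitted, so the symbols are assumptions: \varepsilon_\infty is the high-frequency permittivity, \Delta\varepsilon the dielectric increment (the quantity usually related to viable biomass), f_c the characteristic frequency with \tau = 1/(2\pi f_c), and \alpha the distribution parameter.

```latex
% Complex permittivity (standard Cole–Cole form)
\varepsilon^{*}(\omega) = \varepsilon_{\infty}
  + \frac{\Delta\varepsilon}{1 + (j\omega\tau)^{1-\alpha}},
  \qquad \omega = 2\pi f

% Real part, the form typically fitted to a capacitance scan
\varepsilon'(f) = \varepsilon_{\infty}
  + \Delta\varepsilon\,
    \frac{1 + (f/f_c)^{1-\alpha}\sin(\pi\alpha/2)}
         {1 + 2\,(f/f_c)^{1-\alpha}\sin(\pi\alpha/2) + (f/f_c)^{2(1-\alpha)}}
```

For \alpha = 0 the expression reduces to a single Debye relaxation.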

2.
Aims: We examine the relationships between the distribution of British ground beetle species and climatic and altitude variables with a view to developing models for evaluating the impact of climate change. Location: Data from 1684 10‐km squares in Britain were used to model species–climate/altitude relationships. A validation data set was composed of data from 326 British 10‐km squares not used in the model data set. Methods: The relationships between incidence and climate and altitude variables for 137 ground beetle species were investigated using logistic regression. The models produced were subjected to a validation exercise using the Kappa statistic with a second data set of 30 species. Distribution patterns for four species were predicted for Britain using the regression equations generated. Results: As many as 136 ground beetle species showed significant relationships with one or more of the altitude and climatic variables but the amount of variation explained by the models was generally poor. Models explaining 20% or more of the variation in species incidence were generated for only 10 species. Mean summer temperature and mean annual temperature were the best predictors for eight and six of these 10 species respectively. Few models based on altitude, annual precipitation and mean winter temperature were good predictors of ground beetle species distribution. The results of the validation exercise were mixed, with models for four species showing good or moderate fits whilst the remainder were poor. Main conclusions: Whilst there were many significant relationships between British ground beetle species distributions and altitude and climatic variables, these variables did not appear to be good predictors of ground beetle species distribution. The poor model performance appears to be related to the coarse nature of the response and predictor data sets and the absence of key predictors from the models.
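
The Kappa statistic used in the validation exercise measures agreement between predicted and recorded presence/absence after correcting for chance agreement. A minimal sketch follows, assuming binary 0/1 labels per grid square; the function name and the toy data are illustrative and do not come from the study.

```python
def cohen_kappa(observed, predicted):
    """Cohen's kappa for two equal-length sequences of 0/1 labels."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    # Observed agreement: fraction of squares where prediction matches the record.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    # Chance agreement from the marginal presence frequencies of each sequence.
    obs_pos = sum(observed) / n
    pred_pos = sum(predicted) / n
    p_e = obs_pos * pred_pos + (1 - obs_pos) * (1 - pred_pos)
    if p_e == 1.0:
        # Both sequences are constant and identical; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical presence (1) / absence (0) records for four grid squares.
recorded = [1, 1, 0, 0]
modelled = [1, 0, 0, 0]
kappa = cohen_kappa(recorded, modelled)
```

Values near 1 indicate agreement well beyond chance, while values near 0 indicate agreement no better than chance.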

3.
I evaluated the predictive ability of statistical models obtained by applying seven methods of variable selection to 12 ecological and environmental data sets. Cross-validation, involving repeated splits of each data set into training and validation subsets, was used to obtain honest estimates of predictive ability that could be fairly compared among methods. There was surprisingly little difference in predictive ability among five methods based on multiple linear regression. Stepwise methods performed similarly to exhaustive algorithms for subset selection, and the choice of criterion for comparing models (Akaike's information criterion, Schwarz's Bayesian information criterion or F statistics) had little effect on predictive ability. For most of the data sets, two methods based on regression trees yielded models with substantially lower predictive ability. I argue that there is no 'best' method of variable selection and that any of the regression-based approaches discussed here is capable of yielding useful predictive models.
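
The repeated-split cross-validation described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: a single predictor, ordinary least squares, and mean squared error on the validation subset. The split fraction, number of splits, and all names are hypothetical and are not taken from the paper.

```python
import random


def fit_simple_ols(xs, ys):
    """Least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def repeated_split_mse(xs, ys, n_splits=50, train_frac=0.7, seed=0):
    """Average validation MSE over repeated random training/validation splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_train = int(train_frac * len(indices))
    errors = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        train, valid = indices[:n_train], indices[n_train:]
        # Fit on the training subset only, so the validation error is honest.
        slope, intercept = fit_simple_ols([xs[i] for i in train],
                                          [ys[i] for i in train])
        mse = sum((ys[i] - (intercept + slope * xs[i])) ** 2
                  for i in valid) / len(valid)
        errors.append(mse)
    return sum(errors) / len(errors)
```

Running the same splits for each candidate model is what makes the resulting error estimates comparable among methods.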

4.
This study evaluated a new LED‐based 2D‐fluorescence spectrometer for in‐line bioprocess monitoring of Chinese hamster ovary (CHO) cell culture processes. The new spectrometer used selected excitation wavelengths of 280, 365, and 455 nm to collect spectral data from six 10‐L fed‐batch processes. The technique provides data on various fluorescent compounds from the cultivation medium as well as from cell metabolism. In addition, scattered light offers information about the cultivation status. Multivariate data analysis tools were applied to analyze the large data sets of the collected fluorescence spectra. First, principal component analysis was used to obtain an overview of all spectral data from all six CHO cultivations. Partial least squares regression models were then developed to correlate 2D‐fluorescence spectral data with selected critical process variables as offline reference values. A separate independent fed‐batch process was used for model validation and prediction. Almost continuous in‐line bioprocess monitoring was achieved because 2D‐fluorescence spectra were collected every 10 min during the whole cultivation. The new 2D‐fluorescence device demonstrates significant potential for accurate prediction of the total cell count, viable cell count, and cell viability. The results strongly indicate that the technique is particularly well suited to distinguishing between different cell statuses inside the bioreactor. In addition, spectral data provided information about the lactate metabolism shift and cellular respiration during the cultivation process. Overall, the 2D‐fluorescence device is a highly sensitive tool for process analytical technology applications in mammalian cell cultures.

5.
This study explores the ability of regression models, with no knowledge of the underlying physiology, to estimate physiological parameters relevant for metabolism and endocrinology. Four regression models were compared: multiple linear regression (MLR), principal component regression (PCR), partial least-squares regression (PLS) and regression using artificial neural networks (ANN). The pathway of mammalian gluconeogenesis was analyzed using [U−13C]glucose as tracer. A set of data was simulated by randomly selecting physiologically appropriate metabolic fluxes for the 9 steps of this pathway as independent variables. The isotope labeling patterns of key intermediates in the pathway were then calculated for each set of fluxes, yielding 29 dependent variables. Two thousand sets were created, allowing independent training and test data. Regression models were asked to predict the nine fluxes, given only the 29 isotopomers. For large training sets (>50) the artificial neural network model was superior, capturing 95% of the variability in the gluconeogenic flux, whereas the three linear models captured only 75%. This reflects the ability of neural networks to capture the inherent non-linearities of the metabolic system. The effect of error in the variables and the addition of random variables to the data set was considered. Model sensitivities were used to find the isotopomers that most influenced the predicted flux values. These studies provide the first test of multivariate regression models for the analysis of isotopomer flux data. They provide insight for metabolomics and the future of isotopic tracers in metabolic research where the underlying physiology is complex or unknown. We acknowledge the support of NIH Grant DK58533 and the DuPont-MIT Alliance.

6.
Comparisons between skin colorimetry reports have been hampered by the common use of two different types of portable reflectometers, which sample reflectances at different wavelengths. In an attempt to provide direct comparability between the two machines, multiple linear regression equations were derived from reflectance spectrophotometry readings on 308 Black Caribs and 175 Creoles in Belize, Central America, using both machines. Cross validation tests show the coefficients presented are applicable to independent data sets and generally applicable to other heavily pigmented populations. Comparisons with previously published conversion formulae, which were from very small samples using simple linear regression, show a definite improvement in predictive accuracy when using multiple regression equations based on a large sample.

7.
8.
Multi-wavelength fluorescence spectroscopy was evaluated as a tool for on-line monitoring of recombinant Escherichia coli cultivations expressing human basic fibroblast growth factor (hFGF-2). The data sets for the various combinations of the excitation and emission spectra from batch cultivations were analyzed using principal component analysis. Chemometric models (the partial least squares method) were developed for correlating the fluorescence data and the experimentally measured variables such as the biomass and glucose concentrations as well as the carbon dioxide production rate. Excellent correlations were obtained for these variables for the calibration cultivations. The predictability of these models was further tested in batch and fed-batch cultivations. The batch cultivations were well predicted by the PLS models for biomass, glucose concentrations and carbon dioxide production rate (RMSEPs were 5%, 7% and 9%, respectively). When tested for biomass concentrations in fed-batch cultivations (with final biomass three times higher than the highest calibration value), the models still showed good predictability at high growth rates (RMSEPs were 3% and 4% for uninduced and induced fed-batch cultivations, respectively), which was as good as for the batch cultivations used for developing the models (RMSEPs were 3% and 5% for uninduced and induced batch cultivations, respectively). The fed-batch cultivations performed at low growth rates exhibited much higher fluorescence for fluorophores such as flavin and NAD(P)H than fed-batch cultivations at high growth rates. Therefore, the PLS models tended to over-predict the biomass concentrations at low growth rates. Evidently, the cells changed their concentration of biogenic fluorophores depending on the growth rate. Although multi-wavelength fluorescence spectroscopy is a valuable tool for on-line monitoring of bioprocesses, care must be taken to re-calibrate the PLS models at different growth rates to improve the accuracy of predictions.

9.
In situ near-infrared (NIR) spectroscopy and in-line electronic nose (EN) mapping were used to monitor and control a cholera-toxin producing Vibrio cholerae fed-batch cultivation carried out with a laboratory method as well as with a production method. Prediction models for biomass, glucose and acetate using NIR spectroscopy were developed based on spectral identification and partial least squares (PLS) regression, resulting in high correlation to reference data (standard errors of prediction for biomass, glucose and acetate were 0.20 g l⁻¹, 0.26 g l⁻¹ and 0.28 g l⁻¹). A compensation algorithm for aerated bioreactor disturbances was integrated in the model computation, which in particular improved the prediction by the biomass model. First, the NIR data were applied together with EN in-line data selected by principal component analysis (PCA) for generating a trajectory representation of the fed-batch cultivation. A correlation between the culture progression and EN signals was demonstrated, which proved to be beneficial in monitoring the culture quality. It was shown that a deviation from normal cultivation behavior could easily be recognized and that the trajectory was able to signal a bacterial contamination. Second, the NIR data indicated the potential of predicting the concentration of formed cholera toxin with a model prediction error of 0.020 g l⁻¹. Third, the on-line biomass prediction based on the NIR model was used to control acetate formation by overflow metabolism in the V. cholerae culture. The controller compared the actual specific growth rate, as estimated from the prediction, with the critical growth rate for acetate formation, and adjusted the glucose feed rate according to the difference.

10.
Increasing concern over the implications of climate change for biodiversity has led to the use of species–climate envelope models to project species extinction risk under climate‐change scenarios. However, recent studies have demonstrated significant variability in model predictions and there remains a pressing need to validate models and to reduce uncertainties. Model validation is problematic as predictions are made for events that have not yet occurred. Resubstitution and data partitioning of present‐day data sets are, therefore, commonly used to test the predictive performance of models. However, these approaches suffer from the problems of spatial and temporal autocorrelation in the calibration and validation sets. Using observed distribution shifts among 116 British breeding‐bird species over the past ~20 years, we are able to provide a first independent validation of four envelope modelling techniques under climate change. Results showed good to fair predictive performance on independent validation, although rules used to assess model performance are difficult to interpret in a decision‐planning context. We also showed that measures of performance on nonindependent data provided optimistic estimates of models' predictive ability on independent data. Artificial neural networks and generalized additive models provided generally more accurate predictions of species range shifts than generalized linear models or classification tree analysis. Data for independent model validation and replication of this study are rare and we argue that perfect validation may not in fact be conceptually possible. We also note that usefulness of models is contingent on both the questions being asked and the techniques used. Implementations of species–climate envelope models for testing hypotheses and predicting future events may prove wrong, while being potentially useful if put into appropriate context.

11.
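The Yangtze River estuary is the largest estuary in the western Pacific, and assessing the distribution of its fish community diversity can provide a scientific basis for restoring and managing the estuarine ecosystem. Based on fishery monitoring data from the Yangtze River estuary for 2012–2014, this study used a generalized additive model (GAM) and a boosted regression tree (BRT) model to relate fish community diversity indices at each sampling station to environmental and spatio-temporal factors. Combined with linear regression equations, cross-validation was used to evaluate the predictive ability and goodness of fit of the models, and spatial distribution maps of the fish community diversity index and richness index in the estuary were drawn for 2014. The results showed that salinity, pH and chlorophyll a contributed most to the diversity index, while pH, dissolved oxygen and chlorophyll a were the environmental factors contributing most to the richness index. The BRT model gave better fitting and prediction results than the GAM model for both the diversity index and the richness index. The spatial predictions showed that, compared with the GAM model, the BRT model distinguished fish community diversity among small areas of the estuary more clearly; fish community diversity was markedly higher in waters outside the estuary than inside it, and higher in the north branch than in the south branch.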

12.
In model building and model evaluation, cross‐validation is a frequently used resampling method. Unfortunately, this method can be quite time consuming. In this article, we discuss an approximation method that is much faster and can be used in generalized linear models and Cox's proportional hazards model with a ridge penalty term. Our approximation method is based on a Taylor expansion around the estimate of the full model. In this way, all cross‐validated estimates are approximated without refitting the model. The tuning parameter can now be chosen based on these approximations and can be optimized in less time. The method is most accurate when approximating leave‐one‐out cross‐validation results for large data sets, which is otherwise the most computationally demanding situation. To demonstrate the method's performance, it is applied to several microarray data sets. An R package, penalized, which implements the method, is available on CRAN.
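
To illustrate why leave‐one‐out results can be obtained without refitting, the sketch below uses the linear special case, where the shortcut is exact rather than approximate. It is not the Taylor‐expansion method of the article, and it does not use the penalized package; it assumes a single predictor, no intercept, and a fixed ridge penalty lam > 0, and all function names are hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Ridge estimate for y = beta * x with penalty lam (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def loo_residuals_shortcut(xs, ys, lam):
    """Leave-one-out residuals from the full fit: e_i / (1 - h_ii)."""
    denom = sum(x * x for x in xs) + lam
    beta = ridge_slope(xs, ys, lam)
    # h_ii = x_i^2 / (sum of x^2 + lam) is the leverage of observation i.
    return [(y - x * beta) / (1.0 - x * x / denom) for x, y in zip(xs, ys)]


def loo_residuals_refit(xs, ys, lam):
    """Reference implementation: refit the model n times, once per left-out point."""
    residuals = []
    for i in range(len(xs)):
        beta_i = ridge_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], lam)
        residuals.append(ys[i] - xs[i] * beta_i)
    return residuals
```

The shortcut needs one fit instead of n. For generalized linear and Cox models no such exact identity exists, which is the gap the approximation in the article addresses.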

13.
14.
As proteomic data sets increase in size and complexity, the need grows for database‐centric software systems able to organize, compare, and visualize all the proteomic experiments in a lab. We recently developed an integrated platform called high‐throughput autonomous proteomic pipeline (HTAPP) for the automated acquisition and processing of quantitative proteomic data, and integration of proteomic results with existing external protein information resources within a lab‐based relational database called PeptideDepot. Here, we introduce the peptide validation software component of this system, which combines relational database‐integrated electronic manual spectral annotation in Java with a new software tool in the R programming language for the generation of logistic regression spectral models from user‐supplied validated data sets and flexible application of these user‐generated models in automated proteomic workflows. This logistic regression spectral model uses both variables computed directly from SEQUEST output and deterministic variables based on expert manual validation criteria of spectral quality. In the case of linear quadrupole ion trap (LTQ) or LTQ‐FTICR LC/MS data, our logistic spectral model outperformed both XCorr (242% more peptides identified on average) and the X!Tandem E‐value (87% more peptides identified on average) at a 1% false discovery rate estimated by a decoy database approach.

15.
A number of a priori warfarin dosing algorithms, derived using linear regression methods, have been proposed. Although these dosing algorithms may have been validated using patients derived from the same centre, rarely have they been validated using a patient cohort recruited from another centre. In order to undertake external validation, two cohorts were utilised. One cohort was formed by patients from a prospective trial and the second by patients in the control arm of the EU-PACT trial. Of these, 641 patients were identified as having attained stable dosing and formed the dataset used for validation. Predicted maintenance doses from six criterion-fulfilling regression models were then compared to individual patient stable warfarin dose. Predictive ability was assessed with reference to several statistics including the R-squared and mean absolute error. The six regression models explained different amounts of variability in the stable maintenance warfarin dose requirements of the patients in the two validation cohorts; adjusted R-squared values ranged from 24.2% to 68.6%. An overview of the summary statistics demonstrated that no one dosing algorithm could be considered optimal. The larger validation cohort from the prospective trial produced more consistent statistics across the six dosing algorithms. The study found that all the regression models performed worse in the validation cohort when compared to the derivation cohort. Further, there was little difference between regression models that contained pharmacogenetic coefficients and algorithms containing just non-pharmacogenetic coefficients. The inconsistency of results between the validation cohorts suggests that unaccounted population-specific factors cause variability in dosing algorithm performance. Better methods for dosing that take into account inter- and intra-individual variability, at the initiation and maintenance phases of warfarin treatment, are needed.
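
Two of the summary statistics named in this abstract, mean absolute error and R-squared, can be computed from observed and predicted doses as sketched below. The function names and the example numbers are illustrative only; they are not data or code from the study, and the adjusted R-squared reported there additionally corrects for the number of predictors.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Proportion of variance in the observed values explained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical stable doses and algorithm predictions (arbitrary units).
stable_dose = [2.0, 4.0, 6.0]
predicted_dose = [2.0, 5.0, 6.0]
mae = mean_absolute_error(stable_dose, predicted_dose)
r2 = r_squared(stable_dose, predicted_dose)
```

When computed on an external cohort, R-squared can be much lower than in the derivation cohort, and can even be negative if the predictions are worse than simply predicting the cohort mean.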

16.
Multi-wavelength fluorescence was applied for on-line monitoring of cell mass and the antibiotic polymyxin B in Bacillus polymyxa cultivations. By varying the phosphate and nitrogen content of the medium, different polymyxin-cell mass ratios could be obtained. Using this strategy, it was possible to investigate whether multi-wavelength fluorescence is able to give independent prediction of the two parameters. Partial least squares (PLS) regression was applied to establish mathematical relationships between off-line determined cell mass and polymyxin concentrations and on-line collected fluorescence data. For polymyxin, one universal PLS model, with a correlation of 0.95 and a root mean square error of cross validation (RMSECV) of 35 mg l⁻¹, could be constructed. However, a correlation between fluorescence and cell mass dry weight could not be established when data from all three types of cultivations were included. For data from each type of cultivation, separate models with high correlation and low RMSECV values were built. A large variation in cellular composition, resulting from the different levels of nitrogen and phosphorus in the cultivations, was the probable reason why three models were needed. The results of the present investigation indicate that production of polymyxin is concomitantly regulated by phosphate and nitrogen, as the highest polymyxin yield on cell mass, 0.17 ± 0.01 g g⁻¹, was reached in the cultivations where both nitrogen and phosphate concentrations were kept low.

17.
The main objective of the present study was to investigate the use of in situ 2D fluorometry for monitoring key bioprocess variables in mammalian cell cultures, namely the concentration of viable cells and the concentration of recombinant proteins. All studies were conducted using a recombinant Baby Hamster Kidney (BHK) cell line expressing a fusion glycoprotein IgG1-IL2 cultured in batch and fed-batch modes. It was observed that the intensity of fluorescence signals in the excitation/emission wavelength range of amino acids, vitamins and NAD(P)H changed over culture time, although the dynamics of single fluorophores could not be correlated with the dynamics of the target state variables. Therefore, multivariate chemometric modeling was adopted as a calibration methodology. 2D fluorometry produced large volumes of redundant spectral data, which were first filtered by principal component analysis (PCA). Then, a partial least squares (PLS) regression was applied to correlate the reduced fluorescence maps with the target state variables. Two validation strategies were used to evaluate the predictive capacity of the developed PLS models. Accurate estimations of viable cell density (r² = 0.95 and 99.2% of variance captured in the training set; r² = 0.91 and 97.7% of variance captured in the validation set) and of glycoprotein concentration (r² = 0.99 and 99.7% of variance captured in the training set; r² = 0.99 and 99.3% of variance captured in the validation set) were obtained over a wide range of reactor operation conditions. The results presented herein confirm that 2D fluorometry constitutes a reliable methodology for on-line monitoring of viable cells and recombinant protein concentrations in mammalian cell cultures.

18.
Binary regression models for spatial data are commonly used in disciplines such as epidemiology and ecology. Many spatially referenced binary data sets suffer from location error, which occurs when the recorded location of an observation differs from its true location. When location error occurs, values of the covariates associated with the true spatial locations of the observations cannot be obtained. We show how a change of support (COS) can be applied to regression models for binary data to provide coefficient estimates when the true values of the covariates are unavailable, but the unknown locations of the observations are contained within nonoverlapping arbitrarily shaped polygons. The COS accommodates spatial and nonspatial covariates and preserves the convenient interpretation of methods such as logistic and probit regression. Using a simulation experiment, we compare binary regression models with a COS to naive approaches that ignore location error. We illustrate the flexibility of the COS by modeling individual-level disease risk in a population using a binary data set where the locations of the observations are unknown but contained within administrative units. Our simulation experiment and data illustration corroborate that conventional regression models for binary data that ignore location error are unreliable, but that the COS can be used to eliminate bias while preserving model choice.

19.
A combined predictive and feedback control algorithm based on on-line measurements of glucose concentration has been developed to control fed-batch fermentations of Escherichia coli. The predictive control algorithm was based on the on-line calculation of glucose demand by the culture and extrapolating a linear regression to the next datum point to obtain a predicted glucose demand. This provided a predictive "coarse" control for the glucose-based nutrient feed. A direct feedback control using a proportional controller, based on glucose measurements every 2 min, fine-tuned the feed rate. These combined control schemes were used to maintain glucose concentrations in fed-batch fermentations as tight as 0.49 ± 0.04 g/liter during growth of E. coli to high cell densities.
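
The two-layer scheme described in this abstract, a coarse predictive term from a regression on recent glucose demand plus a fine proportional correction, might be sketched as follows. This is an illustration under assumptions: the abstract gives no equations, so the way the two terms are combined, the gain, the units, and every name below are hypothetical.

```python
def fit_line(times, values):
    """Least-squares slope and intercept of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def predicted_demand(times, demands, next_time):
    """Coarse control: extend the regression line to the next sampling time."""
    slope, intercept = fit_line(times, demands)
    return intercept + slope * next_time


def feed_rate(times, demands, next_time, measured_glucose, setpoint, gain):
    """Predicted demand plus a proportional correction, never negative."""
    coarse = predicted_demand(times, demands, next_time)
    # Fine control: feed more when glucose is below the setpoint, less when above.
    fine = gain * (setpoint - measured_glucose)
    return max(0.0, coarse + fine)


# Hypothetical history: demand sampled every 2 min, next sample due at t = 6 min.
rate = feed_rate([0.0, 2.0, 4.0], [1.0, 2.0, 3.0], 6.0,
                 measured_glucose=0.4, setpoint=0.5, gain=2.0)
```

The predictive term carries most of the feed as demand grows with cell density, so the proportional term only has to correct small deviations from the setpoint.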

