Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
The results of a cluster analysis of fermentation data are used for supervision and on-line state estimation. The results of the classification are presented as the average over all fermentation runs belonging to a class, together with the standard deviation. With the help of this class information, the on-line fermentation is assigned to the best-fitting class. Faults in the data, such as spikes or total sensor failure, are detected because the class information automatically supplies tolerance regions for the measurements. In case of a fault, a reliable extrapolation for the duration of the fault can be calculated. The approach is implemented in the real-time expert system tool G2 and is applied to data on the carbon dioxide evolution rate (CER) of an industrial antibiotic fermentation process.
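As a minimal sketch of how class averages and standard deviations could supply tolerance regions, the Python below flags samples that fall outside mean ± k·sd (or are missing) and substitutes the class average for them. The function names, the factor k = 3 and the toy data are assumptions for illustration; the abstract does not state the tolerance rule or the G2 implementation.

```python
from statistics import mean, stdev

def class_profile(runs):
    """Pointwise mean and standard deviation over runs of equal length."""
    columns = list(zip(*runs))
    return [mean(c) for c in columns], [stdev(c) for c in columns]

def check_online(values, avg, sd, k=3.0):
    """Flag samples outside avg +/- k*sd (or missing) and replace them
    with the class average. k is an assumed tolerance factor."""
    flags, cleaned = [], []
    for v, m, s in zip(values, avg, sd):
        faulty = v is None or abs(v - m) > k * s
        flags.append(faulty)
        cleaned.append(m if faulty else v)
    return flags, cleaned

# Toy CER-like trajectories for three historical runs of one class.
runs = [[1.0, 2.0, 3.0], [1.2, 2.2, 3.2], [0.8, 1.8, 2.8]]
avg, sd = class_profile(runs)

# On-line data with a spike (9.0) and a sensor dropout (None).
flags, cleaned = check_online([1.1, 9.0, None], avg, sd)
print(flags, cleaned)
```

Substituting the class average is only the simplest possible stand-in for the extrapolation mentioned in the abstract.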

2.

Background  

Researchers are increasingly turning to haplotype analysis as a tool in population studies, the investigation of linkage disequilibrium, and candidate gene analysis. When the phase of the data is unknown, computational methods, in particular those employing the Expectation-Maximisation (EM) algorithm, are frequently used for estimating the phase and frequency of the underlying haplotypes. These methods have proved very successful, predicting the phase-known frequencies with a high degree of accuracy from data for which the phase is unknown. Recently there has been much speculation about the effect of unknown or missing allelic data (a common phenomenon even with modern automated DNA analysis techniques) on the performance of EM-based methods. To this end, an EM-based program, modified to accommodate missing data, has been developed, incorporating non-parametric bootstrapping for the calculation of accurate confidence intervals.
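To illustrate the kind of EM iteration referred to above, here is a minimal sketch for the textbook case of two biallelic loci, where only double heterozygotes have ambiguous phase. It is not the program described in the abstract: it handles neither missing alleles nor bootstrapping, and the function name, genotype coding and toy data are assumptions.

```python
def em_haplotype_freqs(genotypes, n_iter=200):
    """Estimate frequencies of haplotypes '00', '01', '10', '11'.

    genotypes: list of (a, b), where a and b in {0, 1, 2} count the
    copies of allele 1 at locus A and locus B (phase unknown).
    """
    haps = ["00", "01", "10", "11"]
    freq = {h: 0.25 for h in haps}
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        for a, b in genotypes:
            if a == 1 and b == 1:
                # E-step: a double heterozygote is either 00/11 or 01/10.
                w_cis = freq["00"] * freq["11"]
                w_trans = freq["01"] * freq["10"]
                total = w_cis + w_trans
                p_cis = w_cis / total if total > 0 else 0.5
                counts["00"] += p_cis
                counts["11"] += p_cis
                counts["01"] += 1.0 - p_cis
                counts["10"] += 1.0 - p_cis
            else:
                # Phase is fully determined by the genotype.
                h1 = ("1" if a >= 1 else "0") + ("1" if b >= 1 else "0")
                h2 = ("1" if a == 2 else "0") + ("1" if b == 2 else "0")
                counts[h1] += 1.0
                counts[h2] += 1.0
        # M-step: expected haplotype counts become the new frequencies.
        freq = {h: counts[h] / n_chrom for h in haps}
    return freq

# Toy sample: homozygotes for 00 and 11 plus two double heterozygotes.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
freqs = em_haplotype_freqs(sample)
print(freqs)
```

In this toy sample the unambiguous genotypes contain only haplotypes 00 and 11, so the iteration resolves the double heterozygotes as 00/11.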

3.
4.
5.
GEE with Gaussian estimation of the correlations when data are incomplete   Total citations: 4 (self-citations: 0; citations by others: 4)
This paper considers a modification of generalized estimating equations (GEE) for handling missing binary response data. The proposed method uses Gaussian estimation of the correlation parameters, i.e., the estimating function that yields an estimate of the correlation parameters is obtained from the multivariate normal likelihood. The proposed method yields consistent estimates of the regression parameters when data are missing completely at random (MCAR). However, when data are missing at random (MAR), consistency may not hold. In a simulation study with repeated binary outcomes that are missing at random, the magnitude of the potential bias that can arise is examined. The results of the simulation study indicate that, when the working correlation matrix is correctly specified, the bias is almost negligible for the modified GEE. In the simulation study, the proposed modification of GEE is also compared to the standard GEE, multiple imputation, and weighted estimating equations approaches. Finally, the proposed method is illustrated using data from a longitudinal clinical trial comparing two therapeutic treatments, zidovudine (AZT) and didanosine (ddI), in patients with HIV.

6.
7.
The mass balance of a glacier is an accepted measure of how much mass a glacier gains or loses. In theory, it is typically computed as an integral functional; empirically, it is approximated by an arithmetic mean. However, the variability of such an approach has not yet been studied satisfactorily. In this paper we provide a dynamical system of mass balance measurements under the constraints of a second-order model with exponentially decreasing covariance. We also provide locations of optimal measurements, so-called designs. We study Ornstein–Uhlenbeck (OU) processes and sheets with linear drifts and introduce K-optimal designs in the correlated processes setup. We provide a thorough comparison of equidistant, Latin Hypercube Sample (LHS), and factorial designs for D- and K-optimality as well as the variance. We show differences between these criteria and discuss the role of equidistant designs for the correlated process. In particular, an application to the estimation of the mass balance of the Olivares Alfa and Beta glaciers in Chile is investigated, showing that simple application of a full raster design and kriging based on inter- and extrapolation of points can lead to increased variance. We also show how the removal of certain measurement points may increase the quality of the melting assessment while decreasing costs. Blow-ups of solutions of dynamical systems underline the empirically observed fact that, in a homogeneous glacier, around 11 well-positioned stakes suffice for mass balance measurement.
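To make the point about the variability of the arithmetic mean concrete, the sketch below computes the variance of the mean of observations whose covariance decays exponentially with distance, Cov(Y(s), Y(t)) = σ² exp(−θ|s − t|), in one dimension. The parameter values, function names and the unit interval are assumptions for illustration; this is not the paper's design-optimality computation.

```python
import math

def variance_of_mean(points, sigma2=1.0, theta=1.0):
    """Variance of the arithmetic mean of observations taken at `points`
    under Cov(Y(s), Y(t)) = sigma2 * exp(-theta * |s - t|)."""
    n = len(points)
    total = sum(
        sigma2 * math.exp(-theta * abs(s - t))
        for s in points
        for t in points
    )
    return total / n ** 2

def equidistant(n, lo=0.0, hi=1.0):
    """n equally spaced measurement locations on [lo, hi], n >= 2."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

# Compare equidistant designs of increasing size on the unit interval.
for n in (2, 5, 11, 50):
    print(n, variance_of_mean(equidistant(n), theta=5.0))
```

Because neighbouring observations are positively correlated, adding points on a fixed interval reduces the variance of the mean far more slowly than the 1/n rate expected for independent measurements.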

8.
High-throughput protein analysis by tandem mass spectrometry produces anywhere from thousands to millions of spectra that are used for peptide and protein identification. Though each spectrum corresponds to only one charged peptide (ion) state, repetitive database searches of multiple charge states are typically conducted, since the resolution of many common mass spectrometers is not sufficient to determine the charge state. The resulting database searches are both error-prone and time-consuming. We describe a straightforward, accurate approach to charge state estimation (CHASTE). CHASTE relies on fragment ion peak distributions and, by using reliable logistic regression models, combines different measurements to improve its accuracy. CHASTE's performance has been validated on data sets composed of known peptide dissociation spectra, obtained by replicate analyses of our previously developed protein standard mixture using ion trap mass spectrometers at different laboratories. CHASTE was able to reduce the number of database searches needed by at least 60% and the number of redundant searches by at least 90%, with virtually no loss of information. This greatly alleviates one of the major bottlenecks in high-throughput peptide and protein identification. Thresholds and parameter estimates can be tailored to specific analysis situations, pipelines, and instrumentation. CHASTE was implemented in Java with GUI-based and command-line interfaces.

9.
10.
Mass spectrometric (MS) isotopomer analysis has become a standard tool for investigating biological systems using stable isotopes. In particular, metabolic flux analysis uses mass isotopomers of metabolic products, typically formed from 13C-labeled substrates, to quantitate intracellular pathway fluxes. In the current work, we describe a model-driven method of numerical bias estimation for MS isotopomer analysis. Correct bias estimation is crucial for assessing the statistical quality of measurements and obtaining reliable fluxes. The model we developed for bias estimation corrects a priori unknown systematic errors unique to each individual mass isotopomer peak. For validation, we carried out both computational simulations and experimental measurements. From stochastic simulations, it was observed that carbon mass isotopomer distributions and measurement noise can be determined much more precisely when signals are corrected for possible systematic errors. By removing the estimated background signals, the residuals between experimental measurement and model expectation became consistent with normality, experimental variability was reduced, and data consistency was improved. The method is useful for obtaining data free of systematic error from 13C tracer experiments and can also be extended to other stable isotopes. As a result, the reliability of metabolic fluxes that are typically computed from mass isotopomer measurements is increased.

11.
In this paper we propose a methodology to determine the structure of the pseudo-stoichiometric coefficient matrix kappa in a macroscopic mass balance based model. The first step consists in estimating the minimal number of reactions that must be taken into account to represent the main mass transfer within the bioreactor. This provides the dimension of kappa. Then we discuss the identifiability of the components of kappa and we propose a method to estimate their values. Finally, we present a method to select, among a set of possible macroscopic reaction networks, those which are in agreement with the available measurements. These methods are illustrated with three examples: real data on the growth and biotransformation of the filamentous fungus Pycnoporus cinnabarinus, real data from an anaerobic digester involving a bacterial consortium degrading a mixture of organic substrates, and a process of lipase production from olive oil by Candida rugosa.

12.
13.
Dupuis JA, Joachim J. Biometrics 2006, 62(3):706-712
We consider the problem of estimating the number of species of an animal community. It is assumed that it is possible to draw up a list of species liable to be present in this community. Data are collected from quadrat sampling. Models considered in this article separate the assumptions related to the experimental protocol and those related to the spatial distribution of species in the quadrats. Our parameterization enables us to incorporate prior information on the presence, detectability, and spatial density of species. Moreover, we elaborate procedures to build the prior distributions on these parameters from information furnished by external data. A simulation study is carried out to examine the influence of different priors on the performance of our estimator. We illustrate our approach by estimating the number of nesting bird species in a forest.

14.
15.
16.
17.
Leveraging information in aggregate data from external sources to improve estimation efficiency and prediction accuracy with smaller scale studies has drawn a great deal of attention in recent years. Yet, conventional methods often either ignore uncertainty in the external information or fail to account for the heterogeneity between internal and external studies. This article proposes an empirical likelihood-based framework to improve the estimation of the semiparametric transformation models by incorporating information about the t-year subgroup survival probability from external sources. The proposed estimation procedure incorporates an additional likelihood component to account for uncertainty in the external information and employs a density ratio model to characterize population heterogeneity. We establish the consistency and asymptotic normality of the proposed estimator and show that it is more efficient than the conventional pseudopartial likelihood estimator without combining information. Simulation studies show that the proposed estimator yields little bias and outperforms the conventional approach even in the presence of information uncertainty and heterogeneity. The proposed methodologies are illustrated with an analysis of a pancreatic cancer study.

18.
Ghosh D, Lin DY. Biometrics 2003, 59(4):877-885
Dependent censoring occurs in longitudinal studies of recurrent events when the censoring time depends on the potentially unobserved recurrent event times. To perform regression analysis in this setting, we propose a semiparametric joint model that formulates the marginal distributions of the recurrent event process and dependent censoring time through scale-change models, while leaving the distributional form and dependence structure unspecified. We derive consistent and asymptotically normal estimators for the regression parameters. We also develop graphical and numerical methods for assessing the adequacy of the proposed model. The finite-sample behavior of the new inference procedures is evaluated through simulation studies. An application to recurrent hospitalization data taken from a study of intravenous drug users is provided.

19.
Environmental DNA (eDNA) metabarcoding is increasingly used to study present and past biodiversity. eDNA analyses often rely on amplification of very small quantities of DNA or of degraded DNA. To avoid missing taxa that are actually present (false negatives), multiple extractions and amplifications of the same samples are often performed. However, the level of replication needed for reliable estimates of presence/absence patterns remains an unaddressed topic. Furthermore, degraded DNA and PCR/sequencing errors might produce false positives. We used simulations and empirical data to evaluate the level of replication required for accurate detection of targeted taxa in different contexts and to assess the performance of methods used to reduce the risk of false detections. Furthermore, we evaluated whether statistical approaches developed to estimate occupancy in the presence of observational errors can successfully estimate true prevalence, detection probability and false-positive rates. Replication reduced the rate of false negatives; the optimal level of replication was strongly dependent on the detection probability of taxa. Occupancy models successfully estimated true prevalence, detection probability and false-positive rates, but their performance increased with the number of replicates. At least eight PCR replicates should be performed if detection probability is not high, such as in ancient DNA studies. Multiple DNA extractions from the same sample yielded consistent results; in some cases, collecting multiple samples from the same locality allowed the detection of more species. The optimal level of replication for accurate species detection varies strongly among studies and could be explicitly estimated to improve the reliability of results.
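The dependence of the required replication on detection probability can be sketched with a simple calculation: if each replicate detects a taxon that is present independently with probability p, the chance of at least one detection in n replicates is 1 − (1 − p)^n. Independence, the 95% target and the function names are assumptions made here; the sketch ignores false positives and is not the occupancy modelling used in the study.

```python
def prob_detected(p, n):
    """Probability of at least one detection in n independent replicates,
    each detecting a present taxon with probability p."""
    return 1.0 - (1.0 - p) ** n

def replicates_needed(p, target=0.95):
    """Smallest n for which prob_detected(p, n) reaches `target`."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    n = 1
    while prob_detected(p, n) < target:
        n += 1
    return n

# Required replication for a few per-replicate detection probabilities.
for p in (0.9, 0.5, 0.3):
    print(p, replicates_needed(p))
```

Under these assumptions the required number of replicates rises quickly as the per-replicate detection probability falls, which is the qualitative pattern the abstract reports.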

20.
Multilist population estimation with incomplete and partial stratification   Total citations: 2 (self-citations: 0; citations by others: 2)
Multilist capture-recapture methods are commonly used to estimate the size of elusive populations. In many situations, lists are stratified by distinguishing features, such as age or sex. Stratification has often been used to reduce biases caused by heterogeneity in the probability of list membership among members of the population; however, it is increasingly common to find lists that are structurally not active in all strata. We develop a general method to deal with cases when not all lists are active in all strata using an expectation maximization (EM) algorithm. We use a flexible log-linear modeling framework that allows for list dependencies and differential probabilities of ascertainment in each list. Finally, we apply our method of estimating population size to two examples.
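For orientation only, the simplest capture-recapture setting can be sketched with the two-list Chapman estimator applied within each stratum and summed. This is a deliberately simpler technique than the log-linear EM method of the abstract: it assumes exactly two independent lists that are both active in every stratum, and the function names and counts below are made up for illustration.

```python
def chapman_estimate(n1, n2, m):
    """Chapman's two-list estimate of population size.

    n1, n2: number of individuals on list 1 and on list 2
    m:      number of individuals appearing on both lists
    """
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

def stratified_estimate(strata):
    """Sum of per-stratum Chapman estimates; `strata` is a list of
    (n1, n2, m) tuples, one per stratum (for example, per age group)."""
    return sum(chapman_estimate(n1, n2, m) for n1, n2, m in strata)

# Hypothetical counts for two strata.
strata = [(100, 80, 20), (50, 40, 10)]
total = stratified_estimate(strata)
print(round(total, 1))
```

When a list is not active in some stratum, the per-stratum overlap is undefined and this shortcut no longer applies, which is the situation the abstract's method is designed for.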
