Similar articles
20 similar articles found (search time: 31 ms)
1.
The analysis of nonlinear function-valued characters is very important in genetic studies, especially for growth traits of agricultural and laboratory species. Inference in nonlinear mixed effects models is, however, quite complex and is usually based on likelihood approximations or Bayesian methods. The aim of this paper was to present an efficient stochastic EM procedure, namely the SAEM algorithm, which converges much faster than the classical Monte Carlo EM algorithm and Bayesian estimation procedures, does not require specification of prior distributions, and is quite robust to the choice of starting values. The key idea is to recycle the simulated values from one iteration to the next in the EM algorithm, which considerably accelerates convergence. A simulation study is presented which confirms the advantages of this estimation procedure in the case of a genetic analysis. The SAEM algorithm was applied to real data sets on growth measurements in beef cattle and in chickens. The proposed estimation procedure, like the classical Monte Carlo EM algorithm, provides significance tests on the parameters and likelihood-based model comparison criteria to compare the nonlinear models with other longitudinal methods.
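The recycling idea mentioned in this abstract can be illustrated with the stochastic approximation step alone. The following is a minimal sketch, not the authors' implementation: the names `sa_update` and `run_sa` and the step-size schedule `1/k` are assumptions made for illustration, and the simulated statistics are passed in as a plain list instead of being drawn from a conditional distribution.

```python
def sa_update(s_prev, stat, gamma):
    """One stochastic approximation step: blend the previous value with a new statistic."""
    return s_prev + gamma * (stat - s_prev)


def run_sa(stats, s0=0.0):
    """Apply sa_update with step size 1/k to a sequence of simulated statistics.

    With step size 1/k the result equals the running mean of `stats`,
    so every earlier simulation keeps contributing to the current value
    (hypothetical schedule, chosen only to make the recycling visible).
    """
    s = s0
    for k, stat in enumerate(stats, start=1):
        s = sa_update(s, stat, 1.0 / k)
    return s
```

For example, `run_sa([2.0, 4.0, 6.0])` averages the three statistics, whereas a plain Monte Carlo EM step would discard the earlier draws at each iteration.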

2.
HIV dynamics studies, based on differential equations, have significantly improved knowledge of HIV infection. While the first studies used simplified short‐term dynamic models, recent works considered more complex long‐term models combined with a global analysis of whole patient data based on nonlinear mixed models, increasing the accuracy of the HIV dynamic analysis. However, statistical issues remain, given the complexity of the problem. We proposed to use the SAEM (stochastic approximation expectation‐maximization) algorithm, a powerful maximum likelihood estimation algorithm, to analyze simultaneously the HIV viral load decrease and the CD4 increase in patients using a long‐term HIV dynamic system. We applied the proposed methodology to the prospective COPHAR2–ANRS 111 trial. Very satisfactory results were obtained with a model with latent CD4 cells defined by five differential equations. One parameter was fixed; the 10 remaining parameters (eight with between‐patient variability) of this model were well estimated. We showed that the efficacy of nelfinavir was reduced compared to indinavir and lopinavir.

3.
Nonlinear mixed effects models are now widely used in biometrical studies, especially in pharmacokinetic research or for the analysis of growth traits for agricultural and laboratory species. Most of these studies, however, are based on ML estimation procedures, which are known to be biased downwards. A few REML extensions have been proposed, but only for approximate methods. The aim of this paper is to present a REML implementation for nonlinear mixed effects models within an exact estimation scheme, based on an integration of the fixed effects and a stochastic estimation procedure. This method was implemented via a stochastic EM, namely the SAEM algorithm. The simulation study showed that the proposed REML estimation procedure considerably reduced the bias observed with the ML estimation, as well as the residual mean squared error of the variance parameter estimates, especially in the unbalanced cases. ML and REML based estimators of fixed effects were also compared via simulation. Although the two kinds of estimates were very close in terms of bias and mean square error, predictions of individual profiles were clearly improved when using REML vs. ML. An application of this estimation procedure is presented for the modelling of growth in lines of chickens.

4.
Multistate Markov models are frequently used to characterize disease processes, but their estimation from longitudinal data is often hampered by complex patterns of incompleteness. Two algorithms for estimating Markov chain models in the case of intermittent missing data in longitudinal studies, a stochastic EM algorithm and the Gibbs sampler, are described. The first can be viewed as a random perturbation of the EM algorithm and is appropriate when the M step is straightforward but the E step is computationally burdensome. It leads to a good approximation of the maximum likelihood estimates. The Gibbs sampler is used for a full Bayesian inference. The performances of the two algorithms are illustrated on two simulated data sets. A motivating example concerned with the modelling of the evolution of parasitemia by Plasmodium falciparum (malaria) in a cohort of 105 young children in Cameroon is described and briefly analyzed.

5.
We consider the estimation of a nonparametric smooth function of some event time in a semiparametric mixed effects model from repeatedly measured data when the event time is subject to right censoring. The within-subject correlation is captured by both cross-sectional and time-dependent random effects, where the latter are modeled by a nonhomogeneous Ornstein–Uhlenbeck stochastic process. When the censoring probability depends on other variables in the model, which often happens in practice, the event time data are not missing completely at random. Hence, a complete case analysis that eliminates all the censored observations may yield biased estimates of the regression parameters, including the smooth function of the event time, and is less efficient. To remedy this, we derive the likelihood function for the observed data by modeling the event time distribution given other covariates. We propose a two-stage pseudo-likelihood approach for the estimation of model parameters by first plugging an estimator of the conditional event time distribution into the likelihood and then maximizing the resulting pseudo-likelihood function. Empirical evaluation shows that the proposed method yields negligible bias while significantly reducing the estimation variability. This research is motivated by the project of hormone profile estimation around age at the final menstrual period for the cohort of women in the Michigan Bone Health and Metabolism Study.

6.
Friedl H, Kauermann G. Biometrics, 2000, 56(3): 761-767
A procedure is derived for computing standard errors of EM estimates in generalized linear models with random effects. Quadrature formulas are used to approximate the integrals in the EM algorithm, where two different approaches are pursued, i.e., Gauss-Hermite quadrature in the case of Gaussian random effects and nonparametric maximum likelihood estimation for an unspecified random effect distribution. An approximation of the expected Fisher information matrix is derived from an expansion of the EM estimating equations. This allows for inferential arguments based on EM estimates, as demonstrated by an example and simulations.
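The Gauss-Hermite step for Gaussian random effects can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the paper: it hard-codes the standard three-point rule, and the name `gaussian_expectation` and the choice of a scalar random effect with mean zero are hypothetical.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x^2).
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (
    math.sqrt(math.pi) / 6,
    2 * math.sqrt(math.pi) / 3,
    math.sqrt(math.pi) / 6,
)


def gaussian_expectation(g, sigma):
    """Approximate E[g(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature.

    The substitution b = sqrt(2) * sigma * x maps the normal density
    onto the exp(-x^2) weight, leaving a factor 1 / sqrt(pi).
    """
    total = sum(w * g(math.sqrt(2) * sigma * x) for x, w in zip(NODES, WEIGHTS))
    return total / math.sqrt(math.pi)
```

A three-point rule is exact for polynomials up to degree five, so `gaussian_expectation(lambda b: b * b, sigma)` recovers the variance of the random effect; likelihood integrands in practice usually need more nodes.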

7.
Using the theory of random point processes, a method is presented whereby functional relationships between neurons can be detected and modeled. The method is based on a point process characterization involving stochastic intensities and an additive rate function model. Estimates are based on the maximum likelihood (ML) principle and asymptotic properties are examined in the absence of a stationarity assumption. An iterative algorithm that computes the ML estimates is presented. It is based on the expectation/maximization (EM) procedure of Dempster et al. (1977) and makes ML identification accessible to models requiring many parameters. Examples illustrating the use of the method are also presented. These examples are derived from simulations of simple neural systems that cannot be identified using correlation techniques. It is shown that the ML method correctly identifies each of these systems.

8.
Nonlinear mixed effects models allow investigating individual differences in drug concentration profiles (pharmacokinetics) and responses. Pharmacogenetics focuses on the genetic component of this variability. Two tests often used to detect a gene effect on a pharmacokinetic parameter are (1) the Wald test, assessing whether estimates for the gene effect are significantly different from 0, and (2) the likelihood ratio test, comparing models with and without the genetic effect. Because those asymptotic tests show inflated type I error with small sample sizes and/or unevenly distributed genotypes, we develop two alternatives and evaluate them by means of a simulation study. First, we assess the performance of the permutation test using the Wald and the likelihood ratio statistics. Second, for the Wald test we propose the use of the F-distribution with four different values for the denominator degrees of freedom. We also explore the influence of the estimation algorithm using both the first-order conditional estimation with interaction linearization-based algorithm and the stochastic approximation expectation maximization algorithm. We apply these methods to the analysis of the pharmacogenetics of indinavir in HIV patients recruited in the COPHAR2-ANRS 111 trial. Results of the simulation study show that the permutation test seems appropriate but at the cost of an additional computational burden. One of the four F-distribution-based approaches provides a correct type I error estimate for the Wald test and should be further investigated.

9.
Proportional hazard models with multivariate random effects (frailties) acting multiplicatively on the baseline hazard have recently become a topic of intensive research. One of the main practical problems related to these models is the estimation of parameters. To this end, several approaches based on the EM algorithm have been proposed. The major difference between these approaches is the method of computing the conditional expectations required at the E-step. In this paper an alternative implementation of the EM algorithm is proposed, in which the expected values are computed with the use of the Laplace approximation. The method is computationally less demanding than the approaches developed previously. Its performance is assessed based on a simulation study and compared to a non-EM based estimation approach proposed by Ripatti and Palmgren (2000).

10.
Aitkin M. Biometrics, 1999, 55(1): 117-128
This paper describes an EM algorithm for nonparametric maximum likelihood (ML) estimation in generalized linear models with variance component structure. The algorithm provides an alternative analysis to approximate MQL and PQL analyses (McGilchrist and Aisbett, 1991, Biometrical Journal 33, 131-141; Breslow and Clayton, 1993, Journal of the American Statistical Association 88, 9-25; McGilchrist, 1994, Journal of the Royal Statistical Society, Series B 56, 61-69; Goldstein, 1995, Multilevel Statistical Models) and to GEE analyses (Liang and Zeger, 1986, Biometrika 73, 13-22). The algorithm, first given by Hinde and Wood (1987, in Longitudinal Data Analysis, 110-126), is a generalization of that for random effect models for overdispersion in generalized linear models, described in Aitkin (1996, Statistics and Computing 6, 251-262). The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully nonparametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters can be sensitive to the specification of a parametric form for the mixing distribution. The nonparametric analysis can be extended straightforwardly to general random parameter models, with full NPML estimation of the joint distribution of the random parameters. This can produce substantial computational savings compared with full numerical integration over a specified parametric distribution for the random parameters. A simple method is described for obtaining correct standard errors for parameter estimates when using the EM algorithm. Several examples are discussed involving simple variance component and longitudinal models, and small-area estimation.

11.
Two-part joint models for a longitudinal semicontinuous biomarker and a terminal event have been recently introduced based on frequentist estimation. The biomarker distribution is decomposed into a probability of positive value and the expected value among positive values. Shared random effects can represent the association structure between the biomarker and the terminal event. The computational burden increases compared to standard joint models with a single regression model for the biomarker. In this context, the frequentist estimation implemented in the R package frailtypack can be challenging for complex models (i.e., a large number of parameters and dimension of the random effects). As an alternative, we propose a Bayesian estimation of two-part joint models based on the Integrated Nested Laplace Approximation (INLA) algorithm to alleviate the computational burden and fit more complex models. Our simulation studies confirm that INLA provides an accurate approximation of posterior estimates and reduces computation time and variability of estimates compared to frailtypack in the situations considered. We contrast the Bayesian and frequentist approaches in the analysis of two randomized cancer clinical trials (GERCOR and PRIME studies), where INLA has a reduced variability for the association between the biomarker and the risk of event. Moreover, the Bayesian approach was able to characterize subgroups of patients associated with different responses to treatment in the PRIME study. Our study suggests that the Bayesian approach using the INLA algorithm makes it possible to fit complex joint models that might be of interest in a wide range of clinical applications.

12.
Recently, the Bayesian least absolute shrinkage and selection operator (LASSO) has been successfully applied to multiple quantitative trait loci (QTL) mapping. It assigns the double-exponential prior and the Student’s t prior to QTL effects, which leads to shrinkage estimates of those effects. However, as reported by many researchers, the Bayesian LASSO usually cannot effectively shrink the effects of zero-effect QTL very close to zero. In this study, the double-exponential prior and Student’s t prior are modified so that the estimate of the effect for zero-effect QTL can be effectively shrunk toward zero. It is also found that the Student’s t prior is virtually the same as the Jeffreys’ prior, since both the shape and scale parameters of the scaled inverse Chi-square prior involved in the Student’s t prior are estimated very close to zero. Besides the two modified Bayesian Markov chain Monte Carlo (MCMC) algorithms, an expectation–maximization (EM) algorithm using the modified double-exponential prior is also adapted. The results show that the three new methods perform similarly on true positive rate and false positive rate for QTL detection, and all of them outperform the Bayesian LASSO.

13.
Using evolutionary Expectation Maximization to estimate indel rates (total citations: 4; self-citations: 0; citations by others: 4)
MOTIVATION: The Expectation Maximization (EM) algorithm, in the form of the Baum-Welch algorithm (for hidden Markov models) or the Inside-Outside algorithm (for stochastic context-free grammars), is a powerful way to estimate the parameters of stochastic grammars for biological sequence analysis. To use this algorithm for multiple-sequence evolutionary modelling, it would be useful to apply the EM algorithm to estimate not only the probability parameters of the stochastic grammar, but also the instantaneous mutation rates of the underlying evolutionary model (to facilitate the development of stochastic grammars based on phylogenetic trees, also known as Statistical Alignment). Recently, we showed how to do this for the point substitution component of the evolutionary process; here, we extend these results to the indel process. RESULTS: We present an algorithm for maximum-likelihood estimation of insertion and deletion rates from multiple sequence alignments, using EM, under the single-residue indel model owing to Thorne, Kishino and Felsenstein (the 'TKF91' model). The algorithm converges extremely rapidly, gives accurate results on simulated data that are an improvement over parsimonious estimates (which are shown to underestimate the true indel rate), and gives plausible results on experimental data (coronavirus envelope domains). Owing to the algorithm's close similarity to the Baum-Welch algorithm for training hidden Markov models, it can be used in an 'unsupervised' fashion to estimate rates for unaligned sequences, or estimate several sets of rates for sequences with heterogeneous rates. AVAILABILITY: Software implementing the algorithm and the benchmark is available under GPL from http://www.biowiki.org/

14.
Challier L, Orr P, Robin JP. Oecologia, 2006, 150(1): 17-28
A new approach is presented here to better take into account inter-individual growth variability in age-structured models used for stock assessment. Cohort analysis requires knowledge of the age structure of the catch, generally derived from an age–length key and length-structure information. Age distribution at length is estimated by applying conditional quantile regression to a data set of lengths and ages estimated from calcareous parts. A “stochastic” age–length key that describes the probability of age-at-length is applied to the English Channel squid Loligo forbesi. Age distribution at length from quantile regression proved to be considerably less biased than that resulting from both polymodal decomposition (PD) and two separate slicing methods. Both catch-at-age and stock size were underestimated using classical methods. Estimates of fishing mortality from classical methods were higher, causing underestimation in yield simulations. Quantile regression offers a more complete statistical analysis of the stochastic relationships among random variables than mean regression and PD.

15.
The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate-based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis of such sequence data and apply these techniques to pyrosequencing data obtained from HIV populations within patients harboring drug-resistant virus strains. Our main result is the estimation of the population structure of the sample from the pyrosequencing reads. This inference is based on a statistical approach to error correction, followed by a combinatorial algorithm for constructing a minimal set of haplotypes that explain the data. Using this set of explaining haplotypes, we apply a statistical model to infer the frequencies of the haplotypes in the population via an expectation–maximization (EM) algorithm. We demonstrate that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations. Thus, pyrosequencing can be used for cost-effective estimation of the structure of virus populations, promising new insights into viral evolutionary dynamics and disease control strategies.
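The frequency-inference step described in this abstract follows the usual EM pattern for mixture proportions. The sketch below is a simplified illustration under assumptions that are not taken from the paper: reads are error-free, each read is represented only by the set of haplotype indices it is compatible with, and the name `em_frequencies` is hypothetical.

```python
def em_frequencies(compatible, n_haplotypes, n_iter=100):
    """Estimate haplotype frequencies from reads by EM.

    `compatible[i]` is the set of haplotype indices that read i could come from.
    """
    freqs = [1.0 / n_haplotypes] * n_haplotypes
    for _ in range(n_iter):
        counts = [0.0] * n_haplotypes
        # E-step: split each read across its compatible haplotypes
        # in proportion to the current frequency estimates.
        for haps in compatible:
            norm = sum(freqs[h] for h in haps)
            for h in haps:
                counts[h] += freqs[h] / norm
        # M-step: new frequencies are the normalized expected counts.
        freqs = [c / len(compatible) for c in counts]
    return freqs
```

With reads `[{0}, {0}, {1}, {0, 1}]`, the ambiguous fourth read is shared between the two haplotypes according to the evidence from the unambiguous reads, and the estimates settle at the fixed point of the update.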

16.
We consider an extension of linear mixed models by assuming a multivariate skew t distribution for the random effects and a multivariate t distribution for the error terms. The proposed model provides flexibility in capturing the effects of skewness and heavy tails simultaneously among continuous longitudinal data. We present an efficient alternating expectation‐conditional maximization (AECM) algorithm for the computation of maximum likelihood estimates of parameters on the basis of two convenient hierarchical formulations. The techniques for the prediction of random effects and intermittent missing values under this model are also investigated. Our methodologies are illustrated through an application to schizophrenia data.

17.
We live in a time when climate models predict future increases in environmental variability and biological invasions are becoming increasingly frequent. A key to developing effective responses to biological invasions in increasingly variable environments will be estimates of their rates of spatial spread and the associated uncertainty of these estimates. Using stochastic, stage-structured, integrodifference equation models, we show analytically that invasion speeds are asymptotically normally distributed with a variance that decreases in time. We apply our methods to a simple juvenile–adult model with stochastic variation in reproduction and an illustrative example with published data for the perennial herb, Calathea ovandensis. These examples, buttressed by additional analysis, reveal that increased variability in vital rates simultaneously slows down invasions yet generates greater uncertainty about rates of spatial spread. Moreover, while temporal autocorrelations in vital rates inflate variability in invasion speeds, the effect of these autocorrelations on the average invasion speed can be positive or negative depending on life history traits and how well vital rates “remember” the past.

18.
Huiping Xu, Bruce A. Craig. Biometrics, 2009, 65(4): 1145-1155
Traditional latent class modeling has been widely applied to assess the accuracy of dichotomous diagnostic tests. These models, however, assume that the tests are independent conditional on the true disease status, which is rarely valid in practice. Alternative models using probit analysis have been proposed to incorporate dependence among tests, but these models consider restricted correlation structures. In this article, we propose a probit latent class (PLC) model that allows a general correlation structure. When combined with some helpful diagnostics, this model provides a more flexible framework from which to evaluate the correlation structure and model fit. Our model encompasses several other PLC models but uses a parameter‐expanded Monte Carlo EM algorithm to obtain the maximum‐likelihood estimates. The parameter‐expanded EM algorithm was designed to accelerate the convergence rate of the EM algorithm by expanding the complete‐data model to include a larger set of parameters, and it ensures a simple solution in fitting the PLC model. We demonstrate our estimation and model selection methods using a simulation study and two published medical studies.

19.
In statistical modelling, the effects of single-nucleotide polymorphisms (SNPs) are often regarded as time-independent. However, for traits recorded repeatedly, it is very interesting to investigate the behaviour of gene effects over time. In the analysis, simulated data from the 13th QTL-MAS Workshop (Wageningen, The Netherlands, April 2009) were used, and the major goal was the modelling of genetic effects as time-dependent. For this purpose, a mixed model is fitted which describes each effect using third-order Legendre orthogonal polynomials, in order to account for the correlation between consecutive measurements. In this model, SNPs are modelled as fixed effects, while the environment is modelled as a random effect. The maximum likelihood estimates of model parameters are obtained by the expectation–maximisation (EM) algorithm, and the significance of the additive SNP effects is based on the likelihood ratio test, with p-values corrected for multiple testing. For each significant SNP, the percentage of the total variance contributed by this SNP is calculated. Moreover, by using a model which simultaneously incorporates the effects of all of the SNPs, the prediction of future yields is conducted. As a result, 179 of the 453 SNPs, covering 16 out of 18 true quantitative trait loci (QTL), were selected. The correlation between predicted and true breeding values was 0.73 for the data set with all SNPs and 0.84 for the data set with selected SNPs. In conclusion, we showed that a longitudinal approach allows for estimating changes of the variance contributed by each SNP over time and demonstrated that, for prediction, the pre-selection of SNPs plays an important role.
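A time-dependent effect built on third-order Legendre polynomials can be sketched as below. This is an illustration only: the function names, the coefficient values, and the time range are hypothetical, and the unnormalized polynomials are used here, whereas the abstract does not say which normalization the authors chose.

```python
def legendre_basis(t, t_min, t_max):
    """Return [P0, P1, P2, P3] evaluated at time t rescaled to [-1, 1]."""
    x = 2.0 * (t - t_min) / (t_max - t_min) - 1.0
    return [1.0, x, (3.0 * x**2 - 1.0) / 2.0, (5.0 * x**3 - 3.0 * x) / 2.0]


def effect_at(t, coeffs, t_min, t_max):
    """Time-dependent effect: linear combination of the four basis values."""
    return sum(c * p for c, p in zip(coeffs, legendre_basis(t, t_min, t_max)))
```

Each effect is then described by four coefficients instead of one, and `effect_at` traces how the effect changes across the recording period.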

20.
Dewanji A, Sengupta D. Biometrics, 2003, 59(4): 1063-1070
In competing risks data, missing failure types (causes) are a very common phenomenon. In this work, we consider a general missing pattern in which, if a failure type is not observed, one observes a set of possible types containing the true type, along with the failure time. We first consider maximum likelihood estimation under the missing-at-random assumption via the expectation maximization (EM) algorithm. We then propose a Nelson-Aalen type estimator for situations when certain information on the conditional probability of the true type given a set of possible failure types is available from the experimentalists. This is based on a least-squares type method using the relationships between hazards for different types and hazards for different combinations of missing types. We conduct a simulation study to investigate the performance of this method, which indicates that bias may be small, even for a high proportion of missing data, for a sufficiently large number of observations. The estimates are somewhat sensitive to misspecification of the conditional probabilities of the true types when the missing proportion is high. We also consider an example from an animal experiment to illustrate our methodology.
