Similar Documents
20 similar documents found.
1.
Maximum likelihood Jukes-Cantor triplets: analytic solutions
Maximum likelihood (ML) is a popular method for inferring a phylogenetic tree of the evolutionary relationship of a set of taxa from observed homologous aligned genetic sequences of the taxa. Generally, the computation of the ML tree is based on numerical methods, which in a few cases are known to converge to a local, suboptimal maximum on a tree. The extent of this problem is unknown; one approach is to derive algebraic equations for the likelihood and find the maximum points analytically. This approach has so far been successful only in the very simplest cases: three or four taxa under the Neyman model of evolution of two-state characters. In this paper we extend this approach, for the first time, to four-state characters: the Jukes-Cantor model under a molecular clock, on a tree T of three taxa (a rooted triple). We employ spectral methods (Hadamard conjugation) to express the likelihood function parameterized by the path-length spectrum. Taking partial derivatives, we derive a set of polynomial equations whose simultaneous solution contains all critical points of the likelihood function. Using tools of algebraic geometry (the resultant of two polynomials) in a computer algebra package (Maple), we are able to find all turning points analytically. We then apply this method to real sequence data and obtain realistic results on the primate-rodent divergence time.
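The resultant computation this abstract leans on is easy to reproduce in a general computer algebra system. Below is a minimal sketch in sympy on a toy pair of stationarity equations (not the paper's actual Jukes-Cantor system, which was worked in Maple): eliminating one variable yields a univariate polynomial whose roots carry the critical points.

```python
# A minimal illustration of the algebraic tool the abstract relies on: the
# resultant of two polynomials, used to eliminate one variable from a pair of
# stationarity equations. The system below is a toy example, not the actual
# Jukes-Cantor likelihood equations from the paper.
import sympy as sp

x, y = sp.symbols('x y')

# Toy "partial derivative" equations dL/dx = 0 and dL/dy = 0.
f = x**2 + x*y - 1
g = x*y - y**2 + 1

# Eliminating x leaves a univariate polynomial in y; its roots contain the
# y-coordinates of all common solutions (i.e., all critical points).
res = sp.resultant(f, g, x)
print(sp.factor(res))
print(sp.solve(res, y))
```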

2.
Zero‐truncated data arise in various disciplines where counts are observed but the zero-count category cannot be observed during sampling. Maximum likelihood estimation can be used to model these data; however, due to its nonstandard form it cannot be easily implemented using well‐known software packages, and additional programming is often required. Motivated by the Rao–Blackwell theorem, we develop a weighted partial likelihood approach to estimate model parameters for zero‐truncated binomial and Poisson data. The resulting estimating function is equivalent to a weighted score function for standard count-data models, and allows readily available software to be applied. We evaluate the efficiency of this new approach and show that it performs almost as well as maximum likelihood estimation. The weighted partial likelihood approach is then extended to regression modelling and variable selection. We examine the performance of the proposed methods through simulation and present two case studies using real data.
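To see the "nonstandard form" the abstract refers to, here is a minimal sketch of direct maximum likelihood for a zero-truncated Poisson mean, where the usual Poisson log-likelihood picks up a -log(1 - exp(-mu)) truncation term. This is plain MLE on hypothetical counts, not the paper's weighted partial likelihood.

```python
# Direct maximum likelihood for a zero-truncated Poisson mean -- the
# "nonstandard form" the abstract refers to, since P(Y = 0) is unobservable.
# This is plain MLE, not the paper's weighted partial likelihood.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

y = np.array([1, 1, 2, 1, 3, 2, 1, 4, 2, 1])  # hypothetical counts, all >= 1

def neg_loglik(mu):
    # log P(Y = y | Y >= 1) = y log mu - mu - log(y!) - log(1 - exp(-mu))
    return -np.sum(y * np.log(mu) - mu - gammaln(y + 1)
                   - np.log1p(-np.exp(-mu)))

fit = minimize_scalar(neg_loglik, bounds=(1e-6, 50), method='bounded')
print(f"zero-truncated MLE of mu: {fit.x:.4f} (naive sample mean: {y.mean():.4f})")
```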

3.
We investigate the use of a partial likelihood for estimation of the parameters of interest in spatio‐temporal point-process models. We identify an important distinction between spatially discrete and spatially continuous models. We focus our attention on the spatially continuous case, which has not previously been considered. We use an inhomogeneous Poisson process and an infectious disease process, for which maximum‐likelihood estimation is tractable, to assess the relative efficiency of partial versus full likelihood, and to illustrate the relative ease of implementation of the former. We apply the partial‐likelihood method to a study of the nesting pattern of common terns in the Ebro Delta Natural Park, Spain.
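For the spatially continuous case, a partial likelihood of the kind described can be sketched as follows: conditioning on the event times, each event contributes its intensity relative to the intensity integrated over the study region. The log-linear intensity, grid approximation, and data below are illustrative assumptions, not the tern analysis.

```python
# A minimal sketch of a partial likelihood for a spatially continuous
# inhomogeneous Poisson process: each event's contribution is its intensity
# relative to the intensity integrated over the region (grid-approximated).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
events = rng.uniform(0, 1, size=(50, 2))     # hypothetical event locations in the unit square
gx, gy = np.meshgrid(np.linspace(0, 1, 40), np.linspace(0, 1, 40))
grid = np.column_stack([gx.ravel(), gy.ravel()])
cell = 1.0 / grid.shape[0]                   # area weight of each grid cell

def neg_partial_loglik(beta):
    # log-linear intensity lambda(x) = exp(beta . x); each event contributes
    # log lambda(x_i) minus the log of the integrated intensity
    log_lam_ev = events @ beta
    lam_grid = np.exp(grid @ beta)
    return -np.sum(log_lam_ev - np.log(np.sum(lam_grid) * cell))

fit = minimize(neg_partial_loglik, x0=np.zeros(2), method='BFGS')
print("estimated intensity-gradient coefficients:", fit.x.round(3))
```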

4.
Goetghebeur E, Ryan L. Biometrics 2000, 56(4): 1139-1144
We propose a semiparametric approach to the proportional hazards regression analysis of interval-censored data. An EM algorithm based on an approximate likelihood leads to an M-step that involves maximizing a standard Cox partial likelihood to estimate regression coefficients and then using the Breslow estimator for the unknown baseline hazards. The E-step takes a particularly simple form because all incomplete data appear as linear terms in the complete-data log likelihood. The algorithm of Turnbull (1976, Journal of the Royal Statistical Society, Series B 38, 290-295) is used to determine times at which the hazard can take positive mass. We found multiple imputation to yield an easily computed variance estimate that appears to be more reliable than asymptotic methods with small to moderately sized data sets. In the right-censored survival setting, the approach reduces to the standard Cox proportional hazards analysis, while the algorithm reduces to the one suggested by Clayton and Cuzick (1985, Applied Statistics 34, 148-156). The method is illustrated on data from the breast cancer cosmetics trial, previously analyzed by Finkelstein (1986, Biometrics 42, 845-854) and several subsequent authors.
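The multiple-imputation variance estimate mentioned at the end is typically assembled with the standard combining rules; a minimal sketch with hypothetical per-imputation fits (not necessarily the exact pooling used in the paper):

```python
# The standard multiple-imputation combining rules (Rubin's rules) that make
# a variance estimate easy to compute: pool the per-imputation coefficient
# estimates and their variances. Numbers are hypothetical stand-ins for m
# fitted Cox models.
import numpy as np

betas = np.array([0.52, 0.48, 0.55, 0.50, 0.47])       # estimate from each imputed data set
vars_ = np.array([0.040, 0.042, 0.039, 0.041, 0.043])  # its estimated variance

m = len(betas)
beta_bar = betas.mean()                 # pooled point estimate
W = vars_.mean()                        # within-imputation variance
B = betas.var(ddof=1)                   # between-imputation variance
T = W + (1 + 1 / m) * B                 # total variance
print(f"pooled beta = {beta_bar:.3f}, se = {np.sqrt(T):.3f}")
```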

5.
Huang J, Harrington D. Biometrics 2002, 58(4): 781-791
The Cox proportional hazards model is often used for estimating the association between covariates and a potentially censored failure time, and the corresponding partial likelihood estimators are used for the estimation and prediction of relative risk of failure. However, partial likelihood estimators are unstable and have large variance when collinearity exists among the explanatory variables or when the number of failures is not much greater than the number of covariates of interest. A penalized (log) partial likelihood is proposed to give more accurate relative risk estimators. We show that asymptotically there always exists a penalty parameter for the penalized partial likelihood that reduces mean squared estimation error for log relative risk, and we propose a resampling method to choose the penalty parameter. Simulations and an example show that the bootstrap-selected penalized partial likelihood estimators can, in some instances, have smaller bias than the partial likelihood estimators and have smaller mean squared estimation and prediction errors of log relative risk. These methods are illustrated with a data set in multiple myeloma from the Eastern Cooperative Oncology Group.
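A minimal sketch of the idea using lifelines' ridge-style penalizer and a resampling-based choice of the penalty. The concordance criterion and the Rossi data set are stand-ins; the paper selects the penalty by bootstrapping the mean squared error of the log relative risk.

```python
# Ridge-penalized Cox regression with a resampling-based penalty choice,
# in the spirit of the abstract (lifelines' penalizer, not the paper's own
# procedure). The Rossi recidivism data ship with lifelines.
import numpy as np
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi
from lifelines.utils import concordance_index

df = load_rossi()
rng = np.random.default_rng(1)

def resampled_score(penalty, n_rep=20):
    """Average out-of-sample concordance over random train/test splits."""
    scores = []
    for _ in range(n_rep):
        train = df.sample(frac=0.7, random_state=int(rng.integers(1_000_000)))
        test = df.drop(train.index)
        cph = CoxPHFitter(penalizer=penalty).fit(train, 'week', 'arrest')
        risk = cph.predict_partial_hazard(test)   # higher risk = earlier failure
        scores.append(concordance_index(test['week'], -risk, test['arrest']))
    return float(np.mean(scores))

penalties = [0.0, 0.01, 0.1, 1.0]
best = max(penalties, key=resampled_score)
print('selected penalizer:', best)
print(CoxPHFitter(penalizer=best).fit(df, 'week', 'arrest').params_)
```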

6.
We study bias-reduced estimators of exponentially transformed parameters in generalized linear models (GLMs) and show how they can be used to obtain bias-reduced conditional (or unconditional) odds ratios in matched case-control studies. Two options are considered and compared: the explicit approach and the implicit approach. The implicit approach is based on the modified score function, where bias-reduced estimates are obtained by using iterative procedures to solve the modified score equations. The explicit approach is shown to be a one-step approximation of this iterative procedure. To apply these approaches to the conditional analysis of matched case-control studies, with potentially unmatched confounding and with several exposures, we utilize the relation between the conditional likelihood and the likelihood of the unconditional logit binomial GLM for matched pairs, and Cox partial likelihood for matched sets with appropriately set-up data. The properties of the estimators are evaluated in a large Monte Carlo simulation study and illustrated on a real dataset. Researchers reporting results on the exponentiated scale should use bias-reduced estimators, since otherwise the effects can be under- or overestimated, and the magnitude of the bias is especially large in studies with smaller sample sizes.
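A minimal sketch of the "implicit" approach for an ordinary (unconditional) logistic GLM: Newton iterations on Firth's modified score U*(b) = X'(y - p + h(1/2 - p)), where h are the weighted hat-matrix leverages. This shows the generic modified-score machinery on simulated data, not the paper's matched-set construction.

```python
# Bias-reduced logistic regression via iterative solution of the modified
# score equations (the "implicit" approach, here in Firth's classical form
# for a plain logistic GLM). Data are simulated.
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one exposure
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 0.8 * X[:, 1]))))

beta = np.zeros(X.shape[1])
for _ in range(25):                       # Newton iterations on the modified score
    p = 1 / (1 + np.exp(-X @ beta))
    W = p * (1 - p)
    XtWX = X.T @ (W[:, None] * X)
    # leverages h_i of the weighted hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2)
    h = np.einsum('ij,jk,ik->i', X, np.linalg.inv(XtWX), X) * W
    score = X.T @ (y - p + h * (0.5 - p))
    step = np.linalg.solve(XtWX, score)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break

print("bias-reduced coefficients:", beta.round(4))
```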

7.
We propose a Cox regression model that is robust to outliers. The model is fit by trimming the smallest contributions to the partial likelihood. To do so, we implement a Metropolis-type maximization routine and show its convergence to a global optimum. We discuss the global robustness properties of the approach, which is illustrated and compared through simulations. Finally, we fit the model to an original data set and to a benchmark data set.
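A minimal sketch of a trimmed partial likelihood: compute each event's contribution to the Cox log partial likelihood and drop the k smallest before maximizing. A generic optimizer stands in for the paper's Metropolis-type routine, and the data are simulated.

```python
# Trimmed Cox partial likelihood: maximize the sum of the largest (d - k)
# per-event contributions. A simplification of the paper's approach (which
# uses a Metropolis-type maximization routine).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, k = 100, 5                                   # sample size, events to trim
x = rng.normal(size=n)
t = rng.exponential(np.exp(-0.7 * x))           # hazard increases with x; no censoring
order = np.argsort(t)
x, t = x[order], t[order]

def trimmed_neg_loglik(beta):
    eta = beta[0] * x
    # with times sorted ascending, the risk set of event i is indices i..n-1
    log_risk = np.log(np.cumsum(np.exp(eta)[::-1])[::-1])
    contrib = eta - log_risk                    # per-event contributions
    kept = np.sort(contrib)[k:]                 # trim the k smallest
    return -np.sum(kept)

fit = minimize(trimmed_neg_loglik, x0=np.array([0.0]), method='Nelder-Mead')
print("trimmed partial-likelihood estimate of beta:", fit.x.round(3))
```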

8.
Frailty models are useful for measuring unobserved heterogeneity in the risk of failure across clusters and for providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. To obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243), in which the latent frailties are treated as "parameters" and estimated jointly with the other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low; however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates a complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to penalized likelihood estimators.

9.
A branching stochastic process proposed earlier to model oligodendrocyte generation by O-2A progenitor cells under in vitro conditions does not allow maximum likelihood techniques to be invoked for estimation purposes. To overcome this difficulty, we propose a partial likelihood function based on an embedded random-walk model of clonal growth and differentiation of O-2A progenitor cells. Under certain conditions, the partial likelihood function yields consistent estimates of the model parameters. The usefulness of this approach is illustrated with computer simulations and data analyses.

10.
There are two polar contemporary approaches to the constitutive modeling of the arterial wall with anisotropy induced by collagen fibers. The first is based on the angular integration (AI) of the strain energy on a unit sphere for an analytically defined fiber dispersion. The second is based on the introduction of generalized structure tensors (GST). The AI approach is computationally very involved, while the GST approach requires a somewhat complicated procedure for the exclusion of compressed fibers. We present some middle-ground models based on the use of 16 and 8 structure tensors. These models are moderately involved computationally, and they allow compressed fibers to be excluded easily. We use the proposed models to study the role of fiber dispersion in the constitutive modeling of the arterial wall. In particular, we study the auxetic effect that can appear in anisotropic materials: thickening of the tissue in the direction perpendicular to its stretching. Such an effect has not been observed in experiments, although some simple anisotropic models do predict it. We show that a more accurate account of the fiber dispersion suppresses the auxetic effect, in qualitative agreement with experimental observations.
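For orientation, a generalized structure tensor H = integral of rho(n) n(x)n over the unit sphere can be computed numerically on a sphere grid. The rotationally symmetric density below is an illustrative assumption, not the paper's 16- or 8-tensor construction.

```python
# Numerical sketch of a generalized structure tensor H = \int rho(n) (n x n) dA
# over the unit sphere, for an illustrative pi-periodic fiber density
# concentrated around the mean direction e3.
import numpy as np

kappa = 5.0                               # hypothetical concentration parameter
th = np.linspace(0, np.pi, 400)           # polar angle
ph = np.linspace(0, 2 * np.pi, 800)       # azimuth
TH, PH = np.meshgrid(th, ph, indexing='ij')

# unit directions and area element sin(theta) dtheta dphi
n = np.stack([np.sin(TH) * np.cos(PH), np.sin(TH) * np.sin(PH), np.cos(TH)])
rho = np.exp(kappa * np.cos(TH) ** 2)     # pi-periodic density around e3
rho /= np.sum(rho * np.sin(TH)) * (th[1] - th[0]) * (ph[1] - ph[0])  # normalize

dA = np.sin(TH) * (th[1] - th[0]) * (ph[1] - ph[0])
H = np.einsum('iab,jab,ab->ij', n, n, rho * dA)
print(np.round(H, 4))                     # nearly diagonal; trace ~ 1
```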

11.
Proportional hazards model with covariates subject to measurement error.
T Nakamura. Biometrics 1992, 48(3): 829-838
When covariates of a proportional hazards model are subject to measurement error, the maximum likelihood estimates of regression coefficients based on the partial likelihood are asymptotically biased. Prentice (1982, Biometrika 69, 331-342) presents an example of such bias and suggests a modified partial likelihood. This paper applies the corrected score function method (Nakamura, 1990, Biometrika 77, 127-137) to the proportional hazards model when measurement errors are additive and normally distributed. The result allows a simple correction to the ordinary partial likelihood that yields asymptotically unbiased estimates; the validity of the correction is confirmed via a limited simulation study.
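The bias in question is easy to reproduce by simulation: fitting the ordinary partial likelihood with an error-prone covariate attenuates the coefficient toward zero. The sketch below (using lifelines) shows only the naive fits; the corrected score that repairs this is in the paper.

```python
# Simulating the attenuation bias the abstract describes: a Cox fit on a
# covariate measured with additive normal error is biased toward zero,
# while a fit on the true covariate recovers the truth.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(4)
n, beta_true, err_sd = 2000, 1.0, 0.8
x = rng.normal(size=n)                       # true covariate
w = x + rng.normal(scale=err_sd, size=n)     # error-prone measurement
t = rng.exponential(np.exp(-beta_true * x))  # exponential baseline hazard
df = pd.DataFrame({'t': t, 'e': 1, 'x': x, 'w': w})

for cov in ['x', 'w']:
    cph = CoxPHFitter().fit(df[['t', 'e', cov]], 't', 'e')
    print(f"covariate {cov}: beta_hat = {cph.params_[cov]:.3f}")
# expected: ~1.0 with x, noticeably attenuated with w
```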

12.
Tsai WY. Biometrika 2009, 96(3): 601-615
We obtain a pseudo-partial likelihood for proportional hazards models with biased-sampling data by embedding the biased-sampling data into left-truncated data. The log pseudo-partial likelihood of the biased-sampling data is the expectation of the log partial likelihood of the left-truncated data conditioned on the observed data. In addition, asymptotic properties of the estimator that maximizes the pseudo-partial likelihood are derived. Applications to length-biased data, biased samples with right censoring, and proportional hazards models with missing covariates are discussed.
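The embedding itself is mechanical once software supports delayed entry. A minimal sketch with simulated length-biased data, analyzed as left-truncated via lifelines' entry_col; this demonstrates the left-truncation machinery, not the paper's pseudo-partial likelihood derivation.

```python
# Length-biased observations analyzed as left-truncated data: subjects enter
# the risk set at their (cross-sectional) sampling time. Data are simulated;
# entry_col is lifelines' delayed-entry mechanism.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(5)
m = 20000
x = rng.normal(size=m)
t = rng.exponential(np.exp(-0.5 * x))        # true event times, beta = 0.5
a = rng.uniform(0, t.max(), size=m)          # cross-sectional sampling times
keep = t > a                                 # only subjects alive at sampling are seen
df = pd.DataFrame({'entry': a[keep], 't': t[keep], 'e': 1, 'x': x[keep]})

cph = CoxPHFitter().fit(df, duration_col='t', event_col='e', entry_col='entry')
print(cph.params_)                           # approximately 0.5
```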

13.
Pan W. Biometrics 2000, 56(1): 199-203
We propose a general semiparametric method based on multiple imputation for Cox regression with interval-censored data. The method consists of iterating the following two steps. First, from finite-interval-censored (but not right-censored) data, exact failure times are imputed using Tanner and Wei's poor man's or asymptotic normal data augmentation scheme based on the current estimates of the regression coefficient and the baseline survival curve. Second, a standard statistical procedure for right-censored data, such as the Cox partial likelihood method, is applied to imputed data to update the estimates. Through simulation, we demonstrate that the resulting estimate of the regression coefficient and its associated standard error provide a promising alternative to the nonparametric maximum likelihood estimate. Our proposal is easily implemented by taking advantage of existing computer programs for right-censored data.
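A minimal sketch of the iterate-impute-refit loop on simulated data. For simplicity the imputation step draws uniformly within each censoring interval, a crude stand-in for the Tanner-Wei data augmentation schemes conditioned on current estimates.

```python
# Impute exact failure times inside the observed censoring intervals, then
# reuse standard right-censored Cox software -- the two-step iteration the
# abstract describes, with a deliberately crude uniform imputation step.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(6)
n = 300
x = rng.normal(size=n)
t_true = rng.exponential(np.exp(-0.6 * x))   # true beta = 0.6
left = np.floor(t_true / 0.25) * 0.25        # observed only as (left, left + 0.25]
right = left + 0.25

for it in range(5):
    # crude imputation: uniform within the interval (the paper instead draws
    # from Tanner and Wei's augmentation schemes at the current estimates)
    df = pd.DataFrame({'t': rng.uniform(left, right), 'e': 1, 'x': x})
    cph = CoxPHFitter().fit(df, 't', 'e')    # standard right-censored machinery
    print(f"iteration {it}: beta = {cph.params_['x']:.3f}")
```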

14.
Hidden Markov modeling (HMM) provides an effective approach for modeling single-channel kinetics. Standard HMM estimation is based on Baum's reestimation. As applied to single-channel currents, that algorithm cannot optimize the rate constants directly. We present here an alternative approach that treats the problem as a general optimization problem. The quasi-Newton method is used to search the likelihood surface. The analytical derivatives of the likelihood function are derived, thereby maximizing the efficiency of the optimization. Because the rate constants are optimized directly, the approach has advantages such as allowing model constraints and the ability to simultaneously fit multiple data sets obtained under different experimental conditions. Numerical examples are presented to illustrate the performance of the algorithm. Comparisons with Baum's reestimation suggest that the approach has a superior convergence speed when the likelihood surface is poorly defined due to, for example, a low signal-to-noise ratio or the aggregation of multiple states having identical conductances.
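A minimal sketch of the approach on a simulated two-state channel: the transition matrix is built from the rate constants via the matrix exponential, the likelihood is evaluated with a scaled forward recursion, and a quasi-Newton optimizer (L-BFGS-B) maximizes it over log-rates. Numerical rather than analytical derivatives are used here, and the model and noise level are illustrative.

```python
# Direct optimization of channel rate constants: forward-algorithm likelihood
# of a noisy two-state (closed <-> open) current trace, maximized with a
# quasi-Newton method instead of Baum's reestimation.
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(7)
dt, T = 0.001, 5000                       # sample interval (s), number of samples
k_co, k_oc = 50.0, 100.0                  # true opening/closing rates (1/s)

# simulate a discretized two-state Markov chain and noisy current levels
Q = np.array([[-k_co, k_co], [k_oc, -k_oc]])
A_true = expm(Q * dt)
s = np.zeros(T, dtype=int)
for i in range(1, T):
    s[i] = rng.choice(2, p=A_true[s[i - 1]])
y = np.where(s == 1, 1.0, 0.0) + rng.normal(scale=0.5, size=T)

def neg_loglik(log_rates):
    kco, koc = np.exp(log_rates)
    A = expm(np.array([[-kco, kco], [koc, -koc]]) * dt)
    b = np.column_stack([norm.pdf(y, 0.0, 0.5), norm.pdf(y, 1.0, 0.5)])
    alpha = np.array([koc, kco]) / (kco + koc) * b[0]   # stationary start
    ll = 0.0
    for i in range(1, T):                 # scaled forward recursion
        c = alpha.sum(); ll += np.log(c); alpha /= c
        alpha = (alpha @ A) * b[i]
    return -(ll + np.log(alpha.sum()))

fit = minimize(neg_loglik, x0=np.log([10.0, 10.0]), method='L-BFGS-B')
print("estimated rates (1/s):", np.exp(fit.x).round(1))
```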

15.
We present an estimator of the average regression effect under a non-proportional hazards model, where the regression effect of the covariates on the log hazard ratio changes with time. In the absence of censoring, the new estimate coincides with the usual partial likelihood estimate, both estimates being consistent for a parameter having an interpretation as an average population regression effect. In the presence of an independent censorship, the new estimate is still consistent for this same population parameter, whereas the partial likelihood estimate converges to a different quantity that depends on censoring. We give an approximation of the population average effect as ∫β(t)dF(t). The new estimate is easy to compute, requiring only minor modifications to existing software. We illustrate the use of the average effect estimate on a breast cancer dataset from the Institut Curie. The behavior of the estimator, its comparison with the partial likelihood estimate, and the approximation by ∫β(t)dF(t) are studied via simulation.
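The population quantity ∫β(t)dF(t) is straightforward to evaluate numerically; a small illustration with a hypothetical decaying effect β(t) and an exponential failure-time distribution (the analytic value here is 0.2/0.5 = 0.4):

```python
# Numeric approximation of the average regression effect integral
# \int beta(t) dF(t): the time-varying effect averaged over the
# failure-time distribution. beta(t) and F are illustrative choices.
import numpy as np

t = np.linspace(0, 40, 4001)
beta_t = 1.0 * np.exp(-0.3 * t)           # hypothetical decaying effect beta(t)
f_t = 0.2 * np.exp(-0.2 * t)              # failure-time density, F ~ Exp(0.2)

y = beta_t * f_t
avg_effect = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))   # trapezoid rule
print(f"average regression effect ~ {avg_effect:.4f}")     # analytic: 0.4
```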

16.
Approximate likelihood ratios for general estimating functions
The method of estimating functions (Godambe, 1991) is commonly used when one desires to conduct inference about some parameters of interest but the full distribution of the observations is unknown. However, this approach may have limited utility, due to multiple roots for the estimating function, a poorly behaved Wald test, or lack of a goodness-of-fit test. This paper presents approximate likelihood ratios that can be used along with estimating functions when any of these three problems occurs. We show that the approximate likelihood ratio provides correct large sample inference under very general circumstances, including clustered data and misspecified weights in the estimating function. Two methods of constructing the approximate likelihood ratio, one based on the quasi-likelihood approach and the other based on the linear projection approach, are compared and shown to be closely related. In particular we show that quasi-likelihood is the limit of the projection approach. We illustrate the technique with two applications.

17.
A common approach for identifying loci influenced by positive selection involves scanning large portions of the genome for regions that are inconsistent with the neutral equilibrium model or represent outliers relative to the empirical distribution of some aspect of the data. Once identified, partial sequence is generated spanning this more localized region in order to quantify the site-frequency spectrum and evaluate the data with tests of neutrality and selection. This method is widely used, as partial sequencing is less expensive with regard to both time and money. Here, we demonstrate that this approach can lead to biased maximum likelihood estimates of selection parameters and reduced rejection rates, with some parameter combinations resulting in clearly misleading results. Most significantly, for a commonly used sample size in Drosophila population genetics (i.e., n = 12), the estimate of the target of selection has a large mean square error, and the strength of selection is severely underestimated when the true selected site has not been sampled. We propose sequencing approaches that are much more likely to accurately localize the target and estimate the strength of selection. Additionally, we examine the performance of a commonly used test of selection under a variety of recurrent and single-sweep models.
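As one concrete instance of the "tests of neutrality" applied to site-frequency data, here is a minimal Tajima's D computation from the number of segregating sites S and the average pairwise differences pi, with the standard constants from Tajima (1989); the summary statistics are hypothetical.

```python
# Tajima's D from sample size n, segregating sites S, and average pairwise
# differences pi -- one standard neutrality test of the kind the abstract
# evaluates (constants per Tajima, 1989).
import numpy as np

def tajimas_d(n, S, pi):
    i = np.arange(1, n)
    a1, a2 = np.sum(1.0 / i), np.sum(1.0 / i**2)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1, e2 = c1 / a1, c2 / (a1**2 + a2)
    return (pi - S / a1) / np.sqrt(e1 * S + e2 * S * (S - 1))

# hypothetical summary statistics for a sample of n = 12 sequences
print(f"Tajima's D = {tajimas_d(12, 30, 5.1):.3f}")
```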

18.
Brownian motions on coalescent structures have biological relevance, either as an approximation of the stepwise mutation model for microsatellites, or as a model of spatial evolution that considers the locations of individuals at successive generations. We discuss estimation procedures for the dispersal parameter of a Brownian motion defined on coalescent trees. First, we consider the mean-square-distance unbiased estimator and compute its variance. In a second approach, we introduce a phylogenetic estimator: given the UPGMA topology, the likelihood of the parameter is computed using a new dynamic programming method, and by a proper correction an unbiased estimator is derived from the pseudomaximum of the likelihood. The last approach consists of computing the likelihood by a Markov chain Monte Carlo sampling method. For the one-dimensional Brownian motion, this last method seems less reliable than pseudomaximum likelihood.
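A minimal sketch of the mean-square-distance idea: for Brownian motion with variance sigma^2 per unit time along a tree, E[(X_i - X_j)^2] = sigma^2 * t_ij for tips at path distance t_ij, so each pair yields an unbiased estimate d^2/t. The balanced four-tip tree below is an illustrative assumption.

```python
# Mean-square-distance estimation of the Brownian dispersal parameter on a
# fixed balanced tree ((A:1,B:1):1,(C:1,D:1):1): simulate one BM increment per
# branch, then average (squared tip distance)/(path length) over tip pairs.
import numpy as np

rng = np.random.default_rng(8)
sigma2_true, reps = 2.0, 2000
ests = []
for _ in range(reps):
    u = rng.normal(scale=np.sqrt(sigma2_true), size=6)  # one increment per branch
    A, B = u[0] + u[2], u[0] + u[3]       # u[0]: branch to the (A,B) ancestor
    C, D = u[1] + u[4], u[1] + u[5]       # u[1]: branch to the (C,D) ancestor
    # path distances: A-B and C-D are 2; all cross pairs are 4
    pairs = [(A, B, 2), (C, D, 2), (A, C, 4), (A, D, 4), (B, C, 4), (B, D, 4)]
    ests.append(np.mean([(p - q) ** 2 / tij for p, q, tij in pairs]))
print(f"mean-square-distance estimate: {np.mean(ests):.3f} (true {sigma2_true})")
```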

19.
The genetic length of a genome, in units of Morgans or centimorgans, is a fundamental characteristic of an organism. We propose a maximum likelihood method for estimating this quantity from counts of recombinants and nonrecombinants between marker locus pairs studied from a backcross linkage experiment, assuming no interference and equal chromosome lengths. This method allows the calculation of the standard deviation of the estimate and a confidence interval containing the estimate. Computer simulations have been performed to evaluate and compare the accuracy of the maximum likelihood method and a previously suggested method-of-moments estimator. Specifically, we have investigated the effects of the number of meioses, the number of marker loci, and variation in the genetic lengths of individual chromosomes on the estimate. The effect of missing data, obtained when the results of two separate linkage studies with a fraction of marker loci in common are pooled, is also investigated. The maximum likelihood estimator, in contrast to the method-of-moments estimator, is relatively insensitive to violation of the assumptions made during analysis and is the method of choice. The various methods are compared by application to partial linkage data from Xiphophorus.
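A per-pair building block for this setting, assuming no interference: the recombination fraction between two markers is estimated from backcross counts as theta_hat = r/n and converted to map distance with Haldane's mapping function. The genome-length likelihood in the paper aggregates over many such pairs; the counts here are hypothetical.

```python
# Per-pair MLE of the recombination fraction from backcross counts, converted
# to map distance (Morgans) via Haldane's no-interference mapping function.
import numpy as np

r, n = 18, 100                      # recombinants among n backcross progeny
theta = r / n                       # binomial MLE of the recombination fraction
d = -0.5 * np.log(1 - 2 * theta)    # Haldane map distance, valid for theta < 0.5
se_theta = np.sqrt(theta * (1 - theta) / n)
print(f"theta = {theta:.3f} +/- {se_theta:.3f}, map distance = {100 * d:.1f} cM")
```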

20.
《动物分类学报》 (Acta Zootaxonomica Sinica) 2017, (1): 46-58
Distinguishing species or populations using morphometric data is generally carried out through multivariate analyses, in particular discriminant analysis. We explored another approach based on the maximum likelihood method. Simple statistics, based on the assumption of a normal distribution for a single variable, allow one to compute the probability of observing a particular value (or sample) in a given reference group. When data are described by more than one variable, the maximum likelihood (MLi) approach combines these probabilities to find the best fit for the data. Such an approach assumes independence between variables. The assumptions of normal distribution of variables and independence between them are frequently not met in morphometrics, but improvements may be obtained after some mathematical transformations. Provided there is strict anatomical correspondence of variables between the unknown and reference data, MLi classification produces consistent classifications. We explored this approach using various input data and compared validated classification scores with those obtained from Mahalanobis distance-based classification. The simplicity of the method, its fast computation, performance, and versatility make it an interesting complement to other classification techniques.
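A minimal sketch of the MLi classification described: per-variable normal likelihoods within each reference group, combined on the log scale under the independence assumption, with the observation assigned to the highest-scoring group. The two "species" samples are simulated stand-ins.

```python
# MLi-style classification: assume each variable is normal and independent
# within a reference group, sum per-variable log-likelihoods, and assign the
# observation to the group with the highest combined likelihood.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(9)
groups = {                                    # two hypothetical reference groups
    'sp1': rng.normal([10.0, 4.0], [1.0, 0.5], size=(50, 2)),
    'sp2': rng.normal([12.0, 3.2], [1.2, 0.4], size=(50, 2)),
}

def classify(obs):
    scores = {}
    for name, ref in groups.items():
        mu, sd = ref.mean(axis=0), ref.std(axis=0, ddof=1)
        # independence assumption: the combined log-likelihood is a sum
        scores[name] = np.sum(norm.logpdf(obs, mu, sd))
    return max(scores, key=scores.get), scores

label, scores = classify(np.array([11.5, 3.4]))
print(label, {k: round(v, 2) for k, v in scores.items()})
```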
