Similar literature (20 results)
1.
Parameter expanded and standard expectation maximisation algorithms are described for reduced rank estimation of covariance matrices by restricted maximum likelihood, fitting the leading principal components only. Convergence behaviour of these algorithms is examined for several examples and contrasted to that of the average information algorithm, and implications for practical analyses are discussed. It is shown that expectation maximisation type algorithms are readily adapted to reduced rank estimation and converge reliably. However, as is well known for the full rank case, the convergence is linear and thus slow. Hence, these algorithms are most useful in combination with the quadratically convergent average information algorithm, in particular in the initial stages of an iterative solution scheme.

2.
We explore the estimation of uncertainty in evolutionary parameters using a recently devised approach for resampling entire additive genetic variance–covariance matrices (G). Large‐sample theory shows that maximum‐likelihood estimates (including restricted maximum likelihood, REML) asymptotically have a multivariate normal distribution, with covariance matrix derived from the inverse of the information matrix, and mean equal to the estimated G. This suggests that sampling estimates of G from this distribution can be used to assess the variability of estimates of G, and of functions of G. We refer to this as the REML‐MVN method. This has been implemented in the mixed‐model program WOMBAT. Estimates of sampling variances from REML‐MVN were compared to those from the parametric bootstrap and from a Bayesian Markov chain Monte Carlo (MCMC) approach (implemented in the R package MCMCglmm). We apply each approach to evolvability statistics previously estimated for a large, 20‐dimensional data set for Drosophila wings. REML‐MVN and MCMC sampling variances are close to those estimated with the parametric bootstrap. Both slightly underestimate the error in the best‐estimated aspects of the G matrix. REML analysis supports the previous conclusion that the G matrix for this population is full rank. REML‐MVN is computationally very efficient, making it an attractive alternative to both data resampling and MCMC approaches to assessing confidence in parameters of evolutionary interest.
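The sampling idea described in this abstract can be illustrated with a minimal sketch. It is not the WOMBAT implementation: the function names, the two-trait example, and all numeric values below are hypothetical. The sketch assumes that the estimate of G (stored as the vector g11, g12, g22) and its sampling covariance matrix (the inverse of the information matrix) are already available, draws parameter vectors from the corresponding multivariate normal distribution, and summarises a function of G over the draws.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_mvn(mean, chol, rng):
    """One draw from N(mean, chol * chol^T) using standard normal deviates."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


def total_genetic_variance(theta):
    """Example function of G: the trace g11 + g22 (theta = [g11, g12, g22])."""
    g11, _g12, g22 = theta
    return g11 + g22


def reml_mvn_variance(theta_hat, sampling_cov, func, n_samples=10000, seed=1):
    """Mean and sampling variance of func(G) over multivariate normal draws."""
    rng = random.Random(seed)
    chol = cholesky(sampling_cov)
    values = [func(sample_mvn(theta_hat, chol, rng)) for _ in range(n_samples)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, var


# Hypothetical estimate of G and hypothetical inverse information matrix.
THETA_HAT = [1.0, 0.5, 2.0]
SAMPLING_COV = [
    [0.010, 0.002, 0.005],
    [0.002, 0.005, 0.003],
    [0.005, 0.003, 0.040],
]

if __name__ == "__main__":
    print(reml_mvn_variance(THETA_HAT, SAMPLING_COV, total_genetic_variance))
```

The trace is used here only because it is linear in the parameters, so the simulated variance can be checked against var(g11) + var(g22) + 2 cov(g11, g22). Any other function of G, such as an evolvability statistic, could be passed as `func` in the same way.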

3.
Robust estimation of multivariate covariance components (cited 1 time: 0 self-citations, 1 by others)
Dueck A, Lohr S. Biometrics, 2005, 61(1): 162-169
In many settings, such as interlaboratory testing, small area estimation in sample surveys, and heritability studies, investigators are interested in estimating covariance components for multivariate measurements. However, the presence of outliers can seriously distort estimates obtained using standard procedures such as maximum likelihood. We propose a procedure based on M-estimation for robustly estimating multivariate covariance components in the presence of outliers; the procedure applies to balanced and unbalanced data. We present an algorithm for computing the robust estimates and examine the performance of the estimator through a simulation study. The estimator is used to find covariance components and identify outliers in a study of variability of egg length and breadth measurements of American coots.

4.
5.
Aitkin M. Biometrics, 1999, 55(1): 117-128
This paper describes an EM algorithm for nonparametric maximum likelihood (ML) estimation in generalized linear models with variance component structure. The algorithm provides an alternative analysis to approximate MQL and PQL analyses (McGilchrist and Aisbett, 1991, Biometrical Journal 33, 131-141; Breslow and Clayton, 1993, Journal of the American Statistical Association 88, 9-25; McGilchrist, 1994, Journal of the Royal Statistical Society, Series B 56, 61-69; Goldstein, 1995, Multilevel Statistical Models) and to GEE analyses (Liang and Zeger, 1986, Biometrika 73, 13-22). The algorithm, first given by Hinde and Wood (1987, in Longitudinal Data Analysis, 110-126), is a generalization of that for random effect models for overdispersion in generalized linear models, described in Aitkin (1996, Statistics and Computing 6, 251-262). The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully nonparametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters can be sensitive to the specification of a parametric form for the mixing distribution. The nonparametric analysis can be extended straightforwardly to general random parameter models, with full NPML estimation of the joint distribution of the random parameters. This can produce substantial computational saving compared with full numerical integration over a specified parametric distribution for the random parameters. A simple method is described for obtaining correct standard errors for parameter estimates when using the EM algorithm. Several examples are discussed involving simple variance component and longitudinal models, and small-area estimation.

6.
Anderson J. A., Blair V. Biometrika, 1982, 69(1): 123-136

7.
8.

Background

Estimation of genetic covariance matrices for multivariate problems comprising more than a few traits is inherently problematic, since sampling variation increases dramatically with the number of traits. This paper investigates the efficacy of regularized estimation of covariance components in a maximum likelihood framework, imposing a penalty on the likelihood designed to reduce sampling variation. In particular, penalties that "borrow strength" from the phenotypic covariance matrix are considered.

Methods

An extensive simulation study was carried out to investigate the reduction in average "loss", i.e. the deviation in estimated matrices from the population values, and the accompanying bias for a range of parameter values and sample sizes. A number of penalties are examined, penalizing either the canonical eigenvalues or the genetic covariance or correlation matrices. In addition, several strategies to determine the amount of penalization to be applied, i.e. to estimate the appropriate tuning factor, are explored.

Results

It is shown that substantial reductions in loss for estimates of genetic covariance can be achieved for small to moderate sample sizes. While no penalty performed best overall, penalizing the variance among the estimated canonical eigenvalues on the logarithmic scale or shrinking the genetic towards the phenotypic correlation matrix appeared most advantageous. Estimating the tuning factor using cross-validation resulted in a loss reduction 10 to 15% less than that obtained if population values were known. Applying a mild penalty, chosen so that the deviation in likelihood from the maximum was non-significant, performed as well as, if not better than, cross-validation and can be recommended as a pragmatic strategy.

Conclusions

Penalized maximum likelihood estimation provides the means to "make the most" of limited and precious data and facilitates more stable estimation for multi-dimensional analyses. It should become part of our everyday toolkit for multivariate estimation in quantitative genetics.
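The pragmatic strategy reported in the Results (a mild penalty whose deviation in likelihood from the maximum is non-significant) might be sketched as follows. The abstract does not state the test or the threshold, so the criterion used here is an assumption: a likelihood-ratio style comparison against 3.84, the 5% critical value of a chi-square distribution with one degree of freedom. The function name and the example numbers are hypothetical.

```python
def pick_mild_tuning_factor(tuning_factors, log_likelihoods,
                            max_log_likelihood, threshold=3.84):
    """Return the largest tuning factor whose likelihood drop is non-significant.

    log_likelihoods[i] is the unpenalized log-likelihood evaluated at the
    penalized estimates obtained with tuning_factors[i]. The threshold is an
    assumption (chi-square, 1 degree of freedom, 5% level), not a value taken
    from the abstract.
    """
    chosen = 0.0  # a factor of zero means no penalty
    for psi, logl in sorted(zip(tuning_factors, log_likelihoods)):
        if 2.0 * (max_log_likelihood - logl) < threshold:
            chosen = psi
        else:
            # Assumes the likelihood keeps falling as the penalty grows.
            break
    return chosen


# Hypothetical grid of tuning factors and log-likelihood values.
FACTORS = [0.0, 1.0, 2.0, 5.0, 10.0]
LOG_LIKELIHOODS = [-100.0, -100.5, -101.2, -102.5, -105.0]

if __name__ == "__main__":
    print(pick_mild_tuning_factor(FACTORS, LOG_LIKELIHOODS, -100.0))
```

With these made-up values, twice the likelihood drop is 0, 1.0, 2.4, 5.0 and 10.0 across the grid, so the search stops at the factor 2.0.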

9.
Researchers in observational survival analysis are interested not only in estimating the survival curve nonparametrically but also in obtaining statistical inference for the parameter. We consider right-censored failure time data where we observe n independent and identically distributed observations of a vector random variable consisting of baseline covariates, a binary treatment at baseline, a survival time subject to right censoring, and the censoring indicator. We assume the baseline covariates are allowed to affect the treatment and censoring so that an estimator that ignores covariate information would be inconsistent. The goal is to use these data to estimate the counterfactual average survival curve of the population if all subjects are assigned the same treatment at baseline. Existing observational survival analysis methods do not result in monotone survival curve estimators, which is undesirable and may lose efficiency by not constraining the shape of the estimator using the prior knowledge of the estimand. In this paper, we present a one-step Targeted Maximum Likelihood Estimator (TMLE) for estimating the counterfactual average survival curve. We show that this new TMLE can be executed via recursion in small local updates. We demonstrate the finite sample performance of this one-step TMLE in simulations and an application to monoclonal gammopathy data.

10.
In nutritional epidemiology, dietary intake assessed with a food frequency questionnaire is prone to measurement error. Ignoring the measurement error in covariates causes estimates to be biased and leads to a loss of power. In this paper, we consider an additive error model according to the characteristics of the European Prospective Investigation into Cancer and Nutrition (EPIC)‐InterAct Study data, and derive an approximate maximum likelihood estimation (AMLE) for covariates with measurement error under logistic regression. This method can be regarded as an adjusted version of regression calibration and can provide an approximately consistent estimator. Asymptotic normality of this estimator is established under regularity conditions, and simulation studies are conducted to empirically examine the finite sample performance of the proposed method. We apply AMLE to deal with measurement errors in some nutrients of interest in the EPIC‐InterAct Study under a sensitivity analysis framework.

11.
Spectral models for covariance matrices (cited 1 time: 0 self-citations, 1 by others)
Boik, Robert J. Biometrika, 2002, 89(1): 159-182

12.
Genetic data are useful for estimating the genealogical relationship or relatedness between individuals of unknown ancestry. We present a computer program, ML‐Relate, that calculates maximum likelihood estimates of relatedness and relationship. ML‐Relate is designed for microsatellite data and can accommodate null alleles. It uses simulation to determine which relationships are consistent with genotype data and to compare putative relationships with alternatives. ML‐Relate runs on the Microsoft Windows operating system and is available from http://www.montana.edu/kalinowski.

13.
Two methods of computing Monte Carlo estimators of variance components using restricted maximum likelihood via the expectation-maximisation algorithm are reviewed. A third approach is suggested and the performance of the methods is compared using simulated data.

14.
There has recently been increased interest in the use of Markov Chain Monte Carlo (MCMC)-based Bayesian methods for estimating genetic maps. The advantage of these methods is that they can deal accurately with missing data and genotyping errors. Here we present an extension of the previous methods that makes the Bayesian method applicable to large data sets. We present an extensive simulation study examining the statistical properties of the method and comparing it with the likelihood method implemented in Mapmaker. We show that the Maximum A Posteriori (MAP) estimator of the genetic distances, corresponding to the maximum likelihood estimator, performs better than estimators based on the posterior expectation. We also show that while the performance is similar between Mapmaker and the MCMC-based method in the absence of genotyping errors, the MCMC-based method has a distinct advantage in the presence of genotyping errors. A similar advantage of the Bayesian method was not observed for missing data. We also re-analyse a recently published set of data from the eggplant and show that the use of the MCMC-based method leads to smaller estimates of genetic distances.

15.
In this work, we fit pattern-mixture models to data sets with responses that are potentially missing not at random (MNAR, Little and Rubin, 1987). In estimating the regression parameters that are identifiable, we use the pseudo maximum likelihood method based on exponential families. This procedure provides consistent estimators when the mean structure is correctly specified for each pattern, with further information on the variance structure giving an efficient estimator. The proposed method can be used to handle a variety of continuous and discrete outcomes. A test built on this approach is also developed for model simplification in order to improve efficiency. Simulations are carried out to compare the proposed estimation procedure with other methods. In combination with sensitivity analysis, our approach can be used to fit parsimonious semi-parametric pattern-mixture models to outcomes that are potentially MNAR. We apply the proposed method to an epidemiologic cohort study to examine cognitive decline among the elderly.

16.
17.
Do KA, Kirk K. Biometrics, 1999, 55(1): 174-181
Principal component analysis enhanced by the use of smoothing is used in conjunction with discriminant analysis techniques to devise a statistical classification method for the analysis of event-related potential data. A training set of premedication potentials collected from adolescents with attention-deficit hyperactive disorder (ADHD) is used to predict whether adolescents from an independent subject group will respond to long-term medication. Comparison of outcome prediction rates demonstrates that this method, which uses information from the whole ERP curve, is superior to the classification technique currently used by clinicians, which is based on a single ERP curve feature. The need to administer an initial dose of medication to classify patients is also eliminated.

18.
We derive the nonparametric maximum likelihood estimate (NPMLE) of the cumulative incidence functions for competing risks survival data subject to interval censoring and truncation. Since the cumulative incidence function NPMLEs give rise to an estimate of the survival distribution which can be undefined over a potentially larger set of regions than the NPMLE of the survival function obtained ignoring failure type, we consider an alternative pseudolikelihood estimator. The methods are then applied to data from a cohort of injecting drug users in Thailand susceptible to infection from HIV-1 subtypes B and E.

19.
Chen J, Lin D, Hochner H. Biometrics, 2012, 68(3): 869-877
Summary: The case-control mother-child pair design offers a unique advantage for dissecting genetic susceptibility of complex traits because it allows the assessment of both maternal and offspring genetic compositions. This design has been widely adopted in studies of obstetric complications and neonatal outcomes. In this work, we developed an efficient statistical method for evaluating joint genetic and environmental effects on a binary phenotype. Using a logistic regression model to describe the relationship between the phenotype and maternal and offspring genetic and environmental risk factors, we developed a semiparametric maximum likelihood method for the estimation of odds ratio association parameters. Our method is novel because it exploits two unique features of the study data for the parameter estimation. First, the correlation between maternal and offspring SNP genotypes can be specified under the assumptions of random mating, Hardy-Weinberg equilibrium, and Mendelian inheritance. Second, environmental exposures are often not affected by offspring genes conditional on maternal genes. Our method yields more efficient estimates compared with the standard prospective method for fitting logistic regression models to case-control data. We demonstrated the performance of our method through extensive simulation studies and the analysis of data from the Jerusalem Perinatal Study.

20.
The pool adjacent violator algorithm (Ayer et al., 1955, The Annals of Mathematical Statistics, 26, 641-647) has long been known to give the maximum likelihood estimator of a series of ordered binomial parameters, based on an independent observation from each distribution (see Barlow et al., 1972, Statistical Inference under Order Restrictions, Wiley, New York). This result has immediate application to estimation of a survival distribution based on current survival status at a set of monitoring times. This paper considers an extended problem of maximum likelihood estimation of a series of 'ordered' multinomial parameters p(i) = (p(1i), p(2i), ..., p(mi)) for 1
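The classical binomial case mentioned at the start of this abstract can be illustrated with a minimal sketch of pool adjacent violators. It covers only ordered binomial parameters constrained to be non-decreasing, not the multinomial extension the paper goes on to consider, and the function name and data layout are hypothetical.

```python
def pava_binomial(successes, trials):
    """Non-decreasing maximum likelihood estimates of binomial proportions.

    successes[i] and trials[i] describe the i-th group, listed in the order in
    which the underlying proportions are assumed to be non-decreasing.
    """
    # Each block holds [pooled successes, pooled trials, number of groups pooled].
    blocks = []
    for s, n in zip(successes, trials):
        blocks.append([s, n, 1])
        # Pool while the previous block's proportion exceeds the last block's.
        # Cross-multiplication avoids division: s_prev/n_prev > s_last/n_last.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s_last, n_last, count_last = blocks.pop()
            blocks[-1][0] += s_last
            blocks[-1][1] += n_last
            blocks[-1][2] += count_last
    estimates = []
    for s, n, count in blocks:
        estimates.extend([s / n] * count)
    return estimates


if __name__ == "__main__":
    # Hypothetical data: the second and third groups violate the ordering.
    print(pava_binomial([1, 3, 2, 4], [10, 10, 10, 10]))
```

In the example call, the raw proportions 0.1, 0.3, 0.2, 0.4 are not ordered, so the two middle groups are pooled and both receive the pooled proportion 5/20.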
