Similar literature
 20 similar articles found.
1.
A class of generalized linear mixed models can be obtained by introducing random effects in the linear predictor of a generalized linear model, e.g. a split plot model for binary data or count data. Maximum likelihood estimation, for normally distributed random effects, involves high-dimensional numerical integration, with severe limitations on the number and structure of the additional random effects. An alternative estimation procedure based on an extension of the iterative re-weighted least squares procedure for generalized linear models will be illustrated on a practical data set involving carcass classification of cattle. The data are analysed as overdispersed binomial proportions with fixed and random effects and associated components of variance on the logit scale. Estimates are obtained with standard software for normal data mixed models. Numerical restrictions pertain to the size of matrices to be inverted. This can be dealt with by absorption techniques familiar from e.g. mixed models in animal breeding. The final model fitted to the classification data includes four components of variance and a multiplicative overdispersion factor. Basically the estimation procedure is a combination of iterated least squares procedures and no full distributional assumptions are needed. A simulation study based on the classification data is presented. This includes a study of procedures for constructing confidence intervals and significance tests for fixed effects and components of variance. The simulation results increase confidence in the usefulness of the estimation procedure.
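The abstract above builds on iterative re-weighted least squares (IRLS) for generalized linear models. As a rough illustration only, the sketch below runs plain IRLS for a logistic regression with one covariate and no random effects; it is not the mixed-model extension the paper describes. The function name `irls_logistic` and the demo data are assumptions made for this example.

```python
import math

def irls_logistic(x, y, n_iter=25):
    """Fit logit P(y=1) = b0 + b1*x by IRLS (fixed effects only; illustrative)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Accumulate the weighted normal equations X'WX b = X'Wz.
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi                 # linear predictor
            mu = 1.0 / (1.0 + math.exp(-eta))  # fitted probability
            w = mu * (1.0 - mu)                # IRLS weight for the logit link
            z = eta + (yi - mu) / w            # working (adjusted) response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 system by Cramer's rule.
        det = s_w * s_wxx - s_wx * s_wx
        b0, b1 = ((s_wxx * s_wz - s_wx * s_wxz) / det,
                  (s_w * s_wxz - s_wx * s_wz) / det)
    return b0, b1

def score(x, y, b0, b1):
    """Likelihood score (gradient); both components vanish at the ML estimate."""
    res = [yi - 1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi, yi in zip(x, y)]
    return sum(res), sum(xi * ri for xi, ri in zip(x, res))

# Hypothetical, non-separable demo data.
x_demo = [0, 1, 2, 3, 4, 5, 6, 7]
y_demo = [0, 0, 1, 0, 1, 1, 0, 1]
b0, b1 = irls_logistic(x_demo, y_demo)
s0, s1 = score(x_demo, y_demo, b0, b1)
```

Each pass is an ordinary weighted least squares fit to the working response `z`, which is why, as the abstract notes, software for normal-data models can be reused once random effects are added to the linear predictor.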

2.
The purpose of this paper is to present a procedure for obtaining approximate maximum likelihood estimates for compound binary response models. The extra-binomial variation is incorporated into the model by adding random effects to the fixed effects on the probit (or logit) scale. Numerical integration techniques are used to arrive at a solution of the likelihood equations. The paper also presents an illustrative numerical example based on a large toxicological data set. The computations are carried out within the GLIM statistical package.

3.
Quantitative traits measured in human families can be analyzed to partition the total population variance into genetic and environmental components, or to elucidate the genetic mechanism involved. We review the estimation of variance components directly from human pedigree data, or in the form of path coefficients from correlations between pairs of relatives. To elucidate genetic mechanisms, a mixed model that allows for segregation at a major locus, a polygenic effect and a sibling environmental correlation is described for nuclear families. In each case appropriate likelihoods are derived as a basis, using numerical maximum likelihood methods, for parameter estimation and hypothesis testing. A general model is then described that allows for several familial sources of environmental variation, assortative mating, and both major gene and polygenic effects; and an algorithm for calculating the likelihood of a pedigree under this model is indicated. Finally, some of the remaining problems in this area of biometric analysis are pointed out.

4.
Targeted maximum likelihood estimation of a parameter of a data generating distribution, known to be an element of a semi-parametric model, involves constructing a parametric model through an initial density estimator with parameter ε representing an amount of fluctuation of the initial density estimator, where the score of this fluctuation model at ε = 0 equals the efficient influence curve/canonical gradient. The latter constraint can be satisfied by many parametric fluctuation models since it represents only a local constraint of its behavior at zero fluctuation. However, it is very important that the fluctuations stay within the semi-parametric model for the observed data distribution, even if the parameter can be defined on fluctuations that fall outside the assumed observed data model. In particular, in the context of sparse data, by which we mean situations where the Fisher information is low, a violation of this property can heavily affect the performance of the estimator. This paper presents a fluctuation approach that guarantees the fluctuated density estimator remains inside the bounds of the data model. We demonstrate this in the context of estimation of a causal effect of a binary treatment on a continuous outcome that is bounded. It results in a targeted maximum likelihood estimator that inherently respects known bounds, and consequently is more robust in sparse data situations than the targeted MLE using a naive fluctuation model. When an estimation procedure incorporates weights, observations having large weights relative to the rest heavily influence the point estimate and inflate the variance. Truncating these weights is a common approach to reducing the variance, but it can also introduce bias into the estimate. We present an alternative targeted maximum likelihood estimation (TMLE) approach that dampens the effect of these heavily weighted observations. As a substitution estimator, TMLE respects the global constraints of the observed data model.
For example, when outcomes are binary, a fluctuation of an initial density estimate on the logit scale constrains predicted probabilities to be between 0 and 1. This inherent enforcement of bounds has been extended to continuous outcomes. Simulation study results indicate that this approach is on a par with, and many times superior to, fluctuating on the linear scale, and in particular is more robust when there is sparsity in the data.
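The contrast between fluctuating on the logit scale and on the linear scale can be sketched numerically. The snippet below is a minimal illustration under stated assumptions, not the authors' estimator: the outcome is assumed to lie in a known interval (a, b), `h` is a hypothetical covariate value multiplying the fluctuation parameter, and the value of `eps` is picked by hand rather than estimated.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(v):
    return 1.0 / (1.0 + math.exp(-v))

def fluctuate_logit(q_init, h, eps, a, b):
    """Fluctuate an initial prediction q_init in (a, b) on the logit scale."""
    q_star = (q_init - a) / (b - a)         # rescale to (0, 1)
    q_new = expit(logit(q_star) + eps * h)  # logistic fluctuation stays in (0, 1)
    return a + (b - a) * q_new              # map back to (a, b)

def fluctuate_linear(q_init, h, eps):
    """Naive linear-scale fluctuation; nothing keeps it inside (a, b)."""
    return q_init + eps * h

# Hypothetical numbers: outcome bounded in (0, 10); the large h mimics
# a heavily weighted observation in sparse data.
a, b = 0.0, 10.0
q_init, h, eps = 9.5, 20.0, 0.1
bounded = fluctuate_logit(q_init, h, eps, a, b)
naive = fluctuate_linear(q_init, h, eps)
```

With these inputs the linear update leaves the interval (9.5 + 0.1 × 20 = 11.5), while the logit-scale update remains strictly below the upper bound of 10, which is the behaviour the abstract attributes to the bounded fluctuation.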

5.
Longitudinal data usually consist of a number of short time series. A group of subjects or groups of subjects are followed over time and observations are often taken at unequally spaced time points, and may be at different times for different subjects. When the errors and random effects are Gaussian, the likelihood of these unbalanced linear mixed models can be directly calculated, and nonlinear optimization used to obtain maximum likelihood estimates of the fixed regression coefficients and parameters in the variance components. For binary longitudinal data, a two state, non-homogeneous continuous time Markov process approach is used to model serial correlation within subjects. Formulating the model as a continuous time Markov process allows the observations to be equally or unequally spaced. Fixed and time varying covariates can be included in the model, and the continuous time model allows the estimation of the odds ratio for an exposure variable based on the steady state distribution. Exact likelihoods can be calculated. The initial probability distribution on the first observation on each subject is estimated using logistic regression that can involve covariates, and this estimation is embedded in the overall estimation. These models are applied to an intervention study designed to reduce children's sun exposure.

6.
Gene diversity is an important measure of genetic variability in inbred populations. The survival of species in changing environments depends on, among other factors, the genetic variability of the population. In this communication, I have derived the uniformly minimum variance unbiased estimator of gene diversity. The proposed estimator of gene diversity does not assume that the inbreeding coefficient is known. I have also provided the approximate variance of this estimator according to Fisher's method. In addition, I have developed a numerical resampling-based method for obtaining variances and confidence intervals based on the maximum likelihood estimator and the uniformly minimum variance unbiased estimator. Efficiency in estimation of the gene diversity based on these two estimators is discussed. In accordance with the simulation results, I found that the uniformly minimum variance estimator developed in this report is more accurate for estimation of gene diversity than the maximum likelihood estimator.

7.
This article investigates maximum likelihood estimation with saturated and unsaturated models for correlated exchangeable binary data, when a sample of independent clusters of varying sizes is available. We discuss various parameterizations of these models, and propose using the EM algorithm to obtain maximum likelihood estimates. The methodology is illustrated by applications to a study of familial disease aggregation and to the design of a proposed group randomized cancer prevention trial.

8.
A mixed-model procedure for analysis of censored data assuming a multivariate normal distribution is described. A Bayesian framework is adopted which allows for estimation of fixed effects and variance components and prediction of random effects when records are left-censored. The procedure can be extended to right- and two-tailed censoring. The model employed is a generalized linear model, and the estimation equations resemble those arising in analysis of multivariate normal or categorical data with threshold models. Estimates of variance components are obtained using expressions similar to those employed in the EM algorithm for restricted maximum likelihood (REML) estimation under normality.

9.
Estimation of variance components in linear mixed models is important in clinical trial and longitudinal data analysis. It is also important in animal and plant breeding for accurately partitioning total phenotypic variance into genetic and environmental variances. The restricted maximum likelihood (REML) method is often preferred over the maximum likelihood (ML) method for variance component estimation because REML takes into account the degrees of freedom lost in estimating the fixed effects. The original restricted likelihood function involves a linear transformation of the original response variable (a collection of error contrasts). Harville's final form of the restricted likelihood function does not involve the transformation and thus is much easier to manipulate than the original restricted likelihood function. There are several different ways to show that the two forms of the restricted likelihood are equivalent. In this study, I present a much simpler way to derive Harville's restricted likelihood function. I first treat the fixed effects as random effects and call such a mixed model a pseudo random model (PDRM). I then construct a likelihood function for the PDRM. Finally, I let the variance of the pseudo random effects be infinity and show that the limit of the likelihood function of the PDRM is the restricted likelihood function.
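The limiting argument in this abstract can be sketched as follows. This is a reconstruction under assumed notation, not necessarily the author's own derivation: $y$ is the $n$-vector of responses, $X$ is the $n \times p$ fixed-effect design matrix of full column rank, $V = V(\theta)$ is the covariance matrix of $y$ given the fixed effects, and the pseudo random effects are taken as $\beta \sim N(0, \tau^2 I_p)$, so that marginally $y \sim N(0, V + \tau^2 X X^\top)$.

```latex
\begin{align*}
-2\,\ell_\tau(\theta)
  &= \log\left|V + \tau^2 X X^\top\right|
     + y^\top \left(V + \tau^2 X X^\top\right)^{-1} y + n \log 2\pi, \\
\left|V + \tau^2 X X^\top\right|
  &= |V| \,\tau^{2p}\, \left|\tau^{-2} I_p + X^\top V^{-1} X\right|
  && \text{(matrix determinant lemma)}, \\
\left(V + \tau^2 X X^\top\right)^{-1}
  &= V^{-1} - V^{-1} X \left(\tau^{-2} I_p + X^\top V^{-1} X\right)^{-1} X^\top V^{-1}
  && \text{(Woodbury identity)}, \\
\lim_{\tau^2 \to \infty} \left[-2\,\ell_\tau(\theta) - p \log \tau^2\right]
  &= \log|V| + \log\left|X^\top V^{-1} X\right| + y^\top P y + n \log 2\pi, \\
P &= V^{-1} - V^{-1} X \left(X^\top V^{-1} X\right)^{-1} X^\top V^{-1}.
\end{align*}
```

Because $y^\top P y = (y - X\hat\beta)^\top V^{-1} (y - X\hat\beta)$ with $\hat\beta = (X^\top V^{-1} X)^{-1} X^\top V^{-1} y$, the limit matches the restricted log-likelihood in Harville's form up to an additive constant that does not depend on $\theta$; the term $p \log \tau^2$ that is subtracted is likewise free of $\theta$.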

10.
Xu S, Yonash N, Vallejo RL, Cheng HH. Genetica 1998, 104(2): 171-178
A typical problem in mapping quantitative trait loci (QTLs) comes from missing QTL genotypes. A routine method for parameter estimation involving missing data is the mixture model maximum likelihood method. We developed an alternative QTL mapping method that describes a mixture of several distributions by a single model with a heterogeneous residual variance. The two methods produce similar results, but the heterogeneous residual variance method is computationally much faster than the mixture model approach. In addition, the new method can automatically generate sampling variances of the estimated parameters. We derive the new method in the context of QTL mapping for binary traits in an F2 population. Using the heterogeneous residual variance model, we identified a QTL on chromosome IV that controls Marek's disease susceptibility in chickens. The QTL alone explains 7.2% of the total disease variation. This revised version was published online in July 2006 with corrections to the Cover Date.

11.
MIXED MODEL APPROACHES FOR ESTIMATING GENETIC VARIANCES AND COVARIANCES   Total citations: 62, self-citations: 4, citations by others: 58
The limitations of analysis of variance (ANOVA) methods in estimating genetic variances are discussed. Among the three methods for mixed linear models (maximum likelihood, ML; restricted maximum likelihood, REML; and minimum norm quadratic unbiased estimation, MINQUE), the MINQUE method is presented with formulae for estimating variance components and covariance components and for predicting genetic effects. Several genetic models that cannot be appropriately analyzed by ANOVA methods are introduced in the form of mixed linear models. Genetic models with independent random effects can be analyzed by the MINQUE(1) method, which is a MINQUE method with all prior values set to 1. The MINQUE(1) method can give unbiased estimates of variance components and covariance components, and linear unbiased prediction (LUP) of genetic effects. There are more complicated genetic models for plant seeds which involve correlated random effects. The MINQUE(0/1) method, which is a MINQUE method with all prior covariances set to 0 and all prior variances set to 1, is suitable for estimating variance and covariance components in these models. Mixed model approaches have an advantage over ANOVA methods in their capacity to analyze unbalanced data and complicated models. Some problems concerning estimation and hypothesis testing by the MINQUE method are discussed.

12.
Masatoshi Nei, Fumio Tajima. Genetics 1983, 105(1): 207-217
A simple method of the maximum likelihood estimation of the number of nucleotide substitutions is presented for the case where restriction sites data from many different restriction enzymes are available. An iteration method, based on nucleotide counting, is also developed. This method is simpler than the maximum likelihood method but gives the same estimate. A formula for computing the variance of a maximum likelihood estimate is also presented.

13.
Dai JY, LeBlanc M, Kooperberg C. Biometrics 2009, 65(1): 178-187
Summary. Recent results for case–control sampling suggest that when the covariate distribution is constrained by gene-environment independence, semiparametric estimation exploiting such independence yields a great deal of efficiency gain. We consider the efficient estimation of the treatment–biomarker interaction in two-phase sampling nested within randomized clinical trials, incorporating the independence between a randomized treatment and the baseline markers. We develop a Newton–Raphson algorithm based on the profile likelihood to compute the semiparametric maximum likelihood estimate (SPMLE). Our algorithm accommodates both continuous phase-one outcomes and continuous phase-two biomarkers. The profile information matrix is computed explicitly via numerical differentiation. In certain situations where computing the SPMLE is slow, we propose a maximum estimated likelihood estimator (MELE), which is also capable of incorporating the covariate independence. This estimated likelihood approach uses a one-step empirical covariate distribution, and thus is straightforward to maximize. It offers a closed-form variance estimate with limited increase in variance relative to the fully efficient SPMLE. Our results suggest that exploiting the covariate independence in two-phase sampling increases the efficiency substantially, particularly for estimating treatment–biomarker interactions.

14.
E A Thompson, R G Shaw. Biometrics 1990, 46(2): 399-413
Recent developments in the animal breeding literature facilitate estimation of the variance components in quantitative genetic models. However, computation remains intensive, and many of the procedures are restricted to specialized designs and models, unsuited to data arising from studies of natural populations. We develop algorithms that allow maximum likelihood estimation of variance components for data on arbitrary pedigree structures. The proposed methods can be implemented on microcomputers, since no intensive matrix computations or manipulations are involved. Although parts of our procedures have been previously presented, we unify these into an overall scheme whose intuitive justification clarifies the approach. Two examples are analyzed: one of data on a natural population of Salvia lyrata and the other of simulated data on an extended pedigree.

15.
Residual maximum likelihood has proved to be a successful approach to the estimation of variance components. In this paper, its counterpart in testing, the residual likelihood ratio test, is applied to testing the ratio of two variance components. The test is compared with the Wald test and the locally most powerful invariant test.

16.
It is shown that maximum likelihood estimation of variance components from twin data can be parameterized in the framework of linear mixed models. Standard statistical packages can be used to analyze univariate or multivariate data for simple models such as the ACE and CE models. Furthermore, specialized variance component estimation software that can handle pedigree data and user-defined covariance structures can be used to analyze multivariate data for simple and complex models, including those where dominance and/or QTL effects are fitted. The linear mixed model framework is particularly useful for analyzing multiple traits in extended (twin) families with a large number of random effects.

17.
Computer simulation was used to compare minimum variance quadratic estimation (MIVQUE), minimum norm quadratic unbiased estimation (MINQUE), restricted maximum likelihood (REML), maximum likelihood (ML), and Henderson's Method 3 (HM3) on the basis of variance among estimates, mean square error (MSE), bias and probability of nearness for estimation of both individual variance components and three ratios of variance components. The investigation also compared three procedures for dealing with negative estimates and included the use of both individual observations and plot means as the experimental unit of the analysis. The structure of data simulated (field design, mating designs, genetic architecture and imbalance) represented typical analysis problems in quantitative forest genetics. Results of comparing the estimation techniques demonstrated that: estimates of probability of nearness did not discriminate among techniques; bias was discriminatory among procedures for dealing with negative estimates but not among estimation techniques (except ML); sampling variance among estimates was discriminatory among procedures for dealing with negative estimates, estimation techniques and unit of observation; and MSE provided no additional information to variance of the estimates. HM3 and REML were the closest competitors under these criteria; however, REML demonstrated greater robustness to imbalance. Of the three negative estimate procedures, two are of practical significance and guidelines for their application are presented. Estimates from individual observations were always preferable to those from plot means over the experimental levels of this study. This is Journal Series No. R-03768 of the Institute of Food and Agricultural Sciences.

18.
Yau KK. Biometrics 2001, 57(1): 96-102
A method for modeling survival data with multilevel clustering is described. The Cox partial likelihood is incorporated into the generalized linear mixed model (GLMM) methodology. Parameter estimation is achieved by maximizing a log likelihood analogous to the likelihood associated with the best linear unbiased prediction (BLUP) at the initial step of estimation and is extended to obtain residual maximum likelihood (REML) estimators of the variance component. Estimating equations for a three-level hierarchical survival model are developed in detail, and such a model is applied to analyze a set of chronic granulomatous disease (CGD) data on recurrent infections as an illustration with both hospital and patient effects being considered as random. Only the latter gives a significant contribution. A simulation study is carried out to evaluate the performance of the REML estimators. Further extension of the estimation procedure to models with an arbitrary number of levels is also discussed.

19.
Ghosh D, Banerjee M, Biswas P. Biometrics 2008, 64(4): 1009-1017
SUMMARY: In order to develop better treatment and screening programs for cancer prevention, it is important to be able to understand the natural history of the disease and what factors affect its progression. We focus on a particular framework first outlined by Kimmel and Flehinger (1991, Biometrics, 47, 987-1004) and in particular one of their limiting scenarios for analysis. Using an equivalence with a binary regression model, we characterize the nonparametric maximum likelihood estimation procedure for estimation of the tumor size distribution function and give associated asymptotic results. Extensions to semiparametric models and missing data are also described. Application to data from two cancer studies is used to illustrate the finite-sample behavior of the procedure.

20.
Marginal regression analysis of a multivariate binary response   Total citations: 2, self-citations: 0, citations by others: 2
We propose the use of the mean parameter for regression analysis of a multivariate binary response. We model the association using dependence ratios defined in terms of the mean parameter, the components of which are the joint success probabilities of all orders. This permits flexible modelling of higher-order associations, using maximum likelihood estimation. We reanalyse two data sets, one with variable cluster size and the other a longitudinal data set with constant cluster size.
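For orientation, a second-order dependence ratio can be written out as below. The notation is assumed for this illustration and may differ from the paper's: $Y_1, Y_2$ are two binary responses in a cluster, $\mu_j = \Pr(Y_j = 1)$ are the marginal success probabilities, and $\mu_{12} = \Pr(Y_1 = 1, Y_2 = 1)$ is the joint success probability.

```latex
\[
\tau_{12} \;=\; \frac{\Pr(Y_1 = 1,\, Y_2 = 1)}{\Pr(Y_1 = 1)\,\Pr(Y_2 = 1)}
          \;=\; \frac{\mu_{12}}{\mu_1\,\mu_2}
\]
```

Under this reading $\tau_{12} = 1$ corresponds to independence of the pair. As a made-up numerical case, $\mu_1 = \mu_2 = 0.2$ with $\mu_{12} = 0.06$ gives $\tau_{12} = 0.06 / 0.04 = 1.5$, meaning joint success is 1.5 times as likely as it would be under independence.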
