Similar articles
 20 similar articles found (search time: 15 ms)
1.
    
Most models for incomplete data are formulated within the selection model framework. This paper studies similarities and differences of modeling incomplete data within both selection and pattern-mixture settings. The focus is on missing at random mechanisms and on categorical data. Point and interval estimation is discussed. The two approaches are compared using data on side effects in a psychiatric study.

2.
3.
    
In many clinical trials both repeated measures data and event history data are simultaneously observed from the same subject. These two types of responses are usually correlated, because they are from the same subject. In this article, we propose a joint model for the combined analysis of repeated measures data and event history data in the framework of hierarchical generalized linear models. The correlation between repeated measures and event time is modelled by introducing a shared random effect. The model parameters are estimated using the hierarchical‐likelihood approach. The proposed model is illustrated using a real data set from renal transplant patients.

4.
    
Summary. Cancer registry records contain valuable data on provision of adjuvant therapies for cancer patients. Previous studies, however, have shown that these therapies are underreported in registry systems. Hence direct use of the registry data may lead to invalid analysis results. We propose first to impute correct treatment status, borrowing information from an additional source such as medical records data collected in a validation sample, and then to analyze the multiply imputed data, as in Yucel and Zaslavsky (2005, Journal of the American Statistical Association 100, 1123–1132). We extend their models to multiple therapies using multivariate probit models with random effects. Our model takes into account the associations among different therapies in both administration and probability of reporting, as well as the multilevel structure (patients clustered within hospitals) of registry data. We use Gibbs sampling to estimate model parameters and impute treatment status. The proposed methodology is applied to the data from the Quality of Cancer Care project, in which stage II or III colorectal cancer patients were eligible to receive adjuvant chemotherapy and radiation therapy.

5.
GOLDSTEIN  H. 《Biometrika》1986,73(1):43-56

6.
Models of nucleotide substitution were constructed for combined analyses of heterogeneous sequence data (such as those of multiple genes) from the same set of species. The models account for different aspects of the heterogeneity in the evolutionary process of different genes, such as differences in nucleotide frequencies, in substitution rate bias (for example, the transition/transversion rate bias), and in the extent of rate variation across sites. Model parameters were estimated by maximum likelihood and the likelihood ratio test was used to test hypotheses concerning sequence evolution, such as rate constancy among lineages (the assumption of a molecular clock) and proportionality of branch lengths for different genes. The example data from a segment of the mitochondrial genome of six hominoid species (human, common and pygmy chimpanzees, gorilla, orangutan, and siamang) were analyzed. Nucleotides at the three codon positions in the protein-coding regions and from the tRNA-coding regions were considered heterogeneous data sets. Statistical tests showed that the amount of evolution in the sequence data reflected in the estimated branch lengths can be explained by the codon-position effect and lineage effect of substitution rates. The assumption of a molecular clock could not be rejected when the data were analyzed separately or when the rate variation among sites was ignored. However, significant differences in substitution rate among lineages were found when the data sets were combined and when the rate variation among sites was accounted for in the models. Under the assumption that the orangutan and African apes diverged 13 million years ago, the combined analysis of the sequence data estimated the times for the human-chimpanzee separation and for the separation of the gorilla as 4.3 and 6.8 million years ago, respectively.

7.
    
We propose a mixed-effect linear model, as a particular case of the two-level regression model, for analyzing repeated measures made at completely irregular time points. The model allows for subject-level covariates, so as to study the trend and the variability of the individual growth curves. Application of this model is illustrated on a published data set.
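For orientation, one common way to write a two-level growth-curve model of the kind described in this abstract is sketched below. The notation is an assumption chosen for illustration and is not taken from the paper: $y_{ij}$ is the $j$-th measurement on subject $i$ at time $t_{ij}$ (times may differ freely between subjects), $z_i$ is a hypothetical subject-level covariate, and $u_{0i}, u_{1i}$ are subject-specific random intercept and slope.

```latex
\begin{align*}
y_{ij} &= (\beta_0 + \beta_1 z_i + u_{0i}) + (\gamma_0 + \gamma_1 z_i + u_{1i})\, t_{ij} + e_{ij}, \\
(u_{0i}, u_{1i})^{\top} &\sim N(\mathbf{0}, \Omega), \qquad e_{ij} \sim N(0, \sigma^2).
\end{align*}
```

In this form the fixed coefficients describe the average trend and how it shifts with $z_i$, while $\Omega$ captures the variability of individual growth curves around that trend.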

8.
    
Missing data are ubiquitous in clinical and social research, and multiple imputation (MI) is increasingly the methodology of choice for practitioners. Two principal strategies for imputation have been proposed in the literature: joint modelling multiple imputation (JM‐MI) and full conditional specification multiple imputation (FCS‐MI). While JM‐MI is arguably a preferable approach, because it involves specification of an explicit imputation model, FCS‐MI is pragmatically appealing, because of its flexibility in handling different types of variables. JM‐MI has developed from the multivariate normal model, and latent normal variables have been proposed as a natural way to extend this model to handle categorical variables. In this article, we evaluate the latent normal model through an extensive simulation study and an application on data from the German Breast Cancer Study Group, comparing the results with FCS‐MI. We divide our investigation into four sections, focusing on (i) binary, (ii) categorical, (iii) ordinal, and (iv) count data. Using data simulated from both the latent normal model and the general location model, we find that in all but one extreme general location model setting JM‐MI works very well, and sometimes outperforms FCS‐MI. We conclude that the latent normal model, implemented in the R package jomo, can be used with confidence by researchers, both for single and multilevel multiple imputation.

9.
    
It is not uncommon for biological anthropologists to analyze incomplete bioarcheological or forensic skeleton specimens. As many quantitative multivariate analyses cannot handle incomplete data, missing data imputation or estimation is a common preprocessing practice for such data. Using William W. Howells' Craniometric Data Set and the Goldman Osteometric Data Set, we evaluated the performance of multiple popular statistical methods for imputing missing metric measurements. Results indicated that multiple imputation methods outperformed single imputation methods, such as Bayesian principal component analysis (BPCA). Multiple imputation with Bayesian linear regression implemented in the R package norm2, the Expectation–Maximization (EM) with Bootstrapping algorithm implemented in Amelia, and the Predictive Mean Matching (PMM) method and several of the derivative linear regression models implemented in mice, perform well regarding accuracy, robustness, and speed. Based on the findings of this study, we suggest a practical procedure for choosing appropriate imputation methods.
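To make the Predictive Mean Matching idea named in this abstract concrete, here is a minimal Python sketch. It is a deliberately simplified illustration and not the implementation in mice: it assumes a single fully observed predictor `x`, uses ordinary least-squares point estimates instead of a posterior draw of the coefficients, and the function names `fit_line` and `pmm_impute` and the donor count `k` are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def pmm_impute(x, y, k=3, rng=None):
    """Return a copy of y in which None entries are filled by simplified PMM.

    Assumption: x is fully observed; missing y values are marked with None.
    """
    rng = rng or random.Random(0)
    observed = [i for i, yi in enumerate(y) if yi is not None]
    # Fit the regression on complete cases only.
    a, b = fit_line([x[i] for i in observed], [y[i] for i in observed])
    predicted = [a + b * xi for xi in x]
    filled = list(y)
    for i, yi in enumerate(y):
        if yi is None:
            # Donors: the k observed cases whose predicted means are closest.
            donors = sorted(observed, key=lambda j: abs(predicted[j] - predicted[i]))[:k]
            # Impute with the *observed* value of a randomly chosen donor.
            filled[i] = y[rng.choice(donors)]
    return filled


x = [1, 2, 3, 4, 5, 6]
y = [1.1, 2.0, None, 4.2, 5.1, None]
imputed = pmm_impute(x, y, k=2)
```

Because every imputed value is copied from an observed donor, the filled-in values always stay within the range of the measured data, which is one reason the method is often considered robust for metric measurements.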

10.
    
Wang CY  Wang N  Wang S 《Biometrics》2000,56(2):487-495
We consider regression analysis when covariate variables are the underlying regression coefficients of another linear mixed model. A naive approach is to use each subject's repeated measurements, which are assumed to follow a linear mixed model, and obtain subject-specific estimated coefficients to replace the covariate variables. However, directly replacing the unobserved covariates in the primary regression by these estimated coefficients may result in a significantly biased estimator. The aforementioned problem can be evaluated as a generalization of the classical additive error model where repeated measures are considered as replicates. To correct for these biases, we investigate a pseudo-expected estimating equation (EEE) estimator, a regression calibration (RC) estimator, and a refined version of the RC estimator. For linear regression, the first two estimators are identical under certain conditions. However, when the primary regression model is a nonlinear model, the RC estimator is usually biased. We thus consider a refined regression calibration estimator whose performance is close to that of the pseudo-EEE estimator but does not require numerical integration. The RC estimator is also extended to the proportional hazards regression model. In addition to the distribution theory, we evaluate the methods through simulation studies. The methods are applied to analyze a real dataset from a child growth study.

11.
Hierarchical likelihood approach for frailty models   (Total citations: 5; self-citations: 0; citations by others: 5)

12.
Robust estimation of multivariate covariance components   (Total citations: 1; self-citations: 0; citations by others: 1)
Dueck A  Lohr S 《Biometrics》2005,61(1):162-169
In many settings, such as interlaboratory testing, small area estimation in sample surveys, and heritability studies, investigators are interested in estimating covariance components for multivariate measurements. However, the presence of outliers can seriously distort estimates obtained using standard procedures such as maximum likelihood. We propose a procedure based on M-estimation for robustly estimating multivariate covariance components in the presence of outliers; the procedure applies to balanced and unbalanced data. We present an algorithm for computing the robust estimates and examine the performance of the estimator through a simulation study. The estimator is used to find covariance components and identify outliers in a study of variability of egg length and breadth measurements of American coots.

13.
14.
    
Evidence synthesis, both qualitatively and quantitatively through meta-analysis, is central to the development of evidence-based medicine. Unfortunately, meta-analysis is often complicated by the suspicion that the available studies represent a biased subset of the evidence, possibly due to publication bias or other systematically different effects in small studies. A number of statistical methods have been proposed to address this, among which the trim-and-fill method and the Copas selection model are two of the most widely discussed. However, both methods have drawbacks: the trim-and-fill method is based on strong assumptions about the symmetry of the funnel plot; the Copas selection model is less accessible to systematic reviewers, and sometimes encounters estimation problems. In this article, we adopt a logistic selection model, and show how treatment effects can be rapidly estimated via multiple imputation. Specifically, we impute studies under a missing at random assumption, and then reweight to obtain estimates under nonrandom selection. Our proposal is computationally straightforward. It allows users to increase selection while monitoring the extent of remaining funnel plot asymmetry, and also visualize the results using the funnel plot. We illustrate our approach using a small meta-analysis of benign prostatic hyperplasia.

15.
SHIBATA  RITEI 《Biometrika》1976,63(1):117-126

16.
17.
Reiter  Jerome P. 《Biometrika》2007,94(2):502-508
When performing multi-component significance tests with multiply-imputed datasets, analysts can use a Wald-like test statistic and a reference F-distribution. The currently employed degrees of freedom in the denominator of this F-distribution are derived assuming an infinite sample size. For modest complete-data sample sizes, this degrees of freedom can be unrealistic; for example, it may exceed the complete-data degrees of freedom. This paper presents an alternative denominator degrees of freedom that is always less than or equal to the complete-data denominator degrees of freedom, and equals the currently employed denominator degrees of freedom for infinite sample sizes. Its advantages over the currently employed degrees of freedom are illustrated with a simulation.

18.
    
In cluster randomized trials (CRTs), identifiable clusters rather than individuals are randomized to study groups. Resulting data often consist of a small number of clusters with correlated observations within a treatment group. Missing data often present a problem in the analysis of such trials, and multiple imputation (MI) has been used to create complete data sets, enabling subsequent analysis with well-established analysis methods for CRTs. We discuss strategies for accounting for clustering when multiply imputing a missing continuous outcome, focusing on estimation of the variance of group means as used in an adjusted t-test or ANOVA. These analysis procedures are congenial to (can be derived from) a mixed effects imputation model; however, this imputation procedure is not yet available in commercial statistical software. An alternative approach that is readily available and has been used in recent studies is to include fixed effects for cluster, but the impact of using this convenient method has not been studied. We show that under this imputation model the MI variance estimator is positively biased and that smaller intraclass correlations (ICCs) lead to larger overestimation of the MI variance. Analytical expressions for the bias of the variance estimator are derived in the case of data missing completely at random, and cases in which data are missing at random are illustrated through simulation. Finally, various imputation methods are applied to data from the Detroit Middle School Asthma Project, a recent school-based CRT, and differences in inference are compared.
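The "MI variance estimator" discussed in this abstract is conventionally formed with Rubin's combining rules, which add the average within-imputation variance to an inflated between-imputation variance. The following minimal Python sketch shows that pooling step for a scalar estimate. The function name `pool_rubin` and the example numbers are illustrative assumptions, not taken from the paper.

```python
import statistics


def pool_rubin(estimates, variances):
    """Pool m completed-data results with Rubin's rules.

    estimates: point estimates from each imputed dataset
    variances: their estimated sampling variances
    Returns (pooled estimate, within variance, between variance, total variance).
    """
    m = len(estimates)
    q_bar = statistics.fmean(estimates)      # pooled point estimate
    within = statistics.fmean(variances)     # average within-imputation variance
    between = statistics.variance(estimates) # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between   # total MI variance
    return q_bar, within, between, total


# Hypothetical results from m = 3 imputed datasets.
q_bar, within, between, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

If the imputation model misstates the clustering, the between- and within-imputation components no longer reflect the true uncertainty, so the pooled total variance can be biased even though the pooling arithmetic itself is unchanged.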

19.
20.
    
Longitudinal data often encounter missingness with monotone and/or intermittent missing patterns. Multiple imputation (MI) has been popularly employed for analysis of missing longitudinal data. In particular, the MI‐GEE method has been proposed for inference of generalized estimating equations (GEE) when missing data are imputed via MI. However, little is known about how to perform model selection with multiply imputed longitudinal data. In this work, we extend the existing GEE model selection criteria, including the “quasi‐likelihood under the independence model criterion” (QIC) and the “missing longitudinal information criterion” (MLIC), to accommodate multiply imputed datasets for selection of the MI‐GEE mean model. According to real data analyses from a schizophrenia study and an AIDS study, as well as simulations under nonmonotone missingness with moderate proportion of missing observations, we conclude that: (i) more than a few imputed datasets are required for stable and reliable model selection in MI‐GEE analysis; (ii) the MI‐based GEE model selection methods with a suitable number of imputations generally perform well, while the naive application of existing model selection methods by simply ignoring missing observations may lead to very poor performance; (iii) the model selection criteria based on improper (frequentist) multiple imputation generally perform better than their analogues based on proper (Bayesian) multiple imputation.
