Similar Articles
20 similar articles found.
1.
2.
The decisiveness of a data set has been defined as the degree to which all possible dichotomous trees for that data set differ in length, and the DD statistic (the data decisiveness index) has been proposed to measure this degree. In this paper, we first discuss an exact nonrecursive formula for the length of indecisive data sets (DD = 0) that consist of informative binary characters in which no missing entries are allowed. Next, the concept of indecisive data sets is extended to data sets in which missing entries may be present. Last, indecisive data sets with missing entries are used as an aid to construct hypothetical data sets that single out some of the factors that influence the DD statistic. On the basis of these examples, it is concluded that the concept of data decisiveness is too elusive to be captured in a single and simple index such as DD.
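For orientation, the DD statistic is usually written as DD = (S̄ − S)/(S̄ − M), where S is the length of the most parsimonious tree for the data, S̄ is the mean length over all possible dichotomous trees, and M is the minimum conceivable length; indecisive data have S = S̄ and hence DD = 0. The minimal sketch below only evaluates this ratio from the three summary lengths (the variable names are illustrative, and obtaining S, S̄, and M for a real matrix requires a parsimony program):

```python
def data_decisiveness(best_length, mean_length, min_length):
    """Data decisiveness index DD = (S_bar - S) / (S_bar - M).

    best_length : S, length of the most parsimonious tree for the data
    mean_length : S_bar, mean length over all possible dichotomous trees
    min_length  : M, minimum conceivable length (every character free of homoplasy)

    DD = 0 when the data are indecisive (S equals the mean length) and
    DD = 1 when they are perfectly decisive (S equals the minimum length).
    """
    if mean_length == min_length:
        raise ValueError("DD is undefined when the mean and minimum lengths coincide")
    return (mean_length - best_length) / (mean_length - min_length)

# e.g. data_decisiveness(best_length=120, mean_length=180, min_length=100) -> 0.75
```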

3.
Summary. In individually matched case–control studies, when some covariates are incomplete, an analysis based on the complete data may result in a large loss of information, both in the missing and in the completely observed variables. This usually leads to bias and a loss of efficiency. In this article, we propose a new method for handling missing covariate data based on a missing-data-induced intensity approach when the missingness mechanism does not depend on case–control status, and show that this leads to a generalization of the missing-indicator method. We derive the asymptotic properties of the estimates from the proposed method and, using an extensive simulation study, assess the finite-sample performance in terms of bias, efficiency, and 95% confidence coverage under several missing-data scenarios. We also make comparisons with complete-case analysis (CCA) and some missing-data methods that have been proposed previously. Our results indicate that, under the assumption of predictable missingness, the suggested method provides valid estimation of parameters, is more efficient than CCA, and is competitive with other, more complex methods of analysis. A case–control study of multiple myeloma risk and a polymorphism in the Interleukin-6 receptor (IL-6-α) is used to illustrate our findings.
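As a point of reference for the generalization described above, here is a minimal sketch of the classical missing-indicator method in a matched design: missing covariate values are set to zero, a missingness indicator is added as an extra covariate, and a conditional logistic regression is fitted within matched sets. The data-frame columns (case, x, z, set_id) are hypothetical, and the ConditionalLogit class is assumed to come from statsmodels; the exact call may need adjusting to the installed version.

```python
import numpy as np
import pandas as pd
from statsmodels.discrete.conditional_models import ConditionalLogit

def missing_indicator_fit(df: pd.DataFrame):
    """Classical missing-indicator analysis of a matched case-control study.

    Hypothetical column names:
      case   : 1 for cases, 0 for matched controls
      x      : completely observed covariate
      z      : covariate with missing values (np.nan)
      set_id : identifier of the matched set
    """
    d = df.copy()
    d["z_missing"] = d["z"].isna().astype(float)   # missingness indicator
    d["z_filled"] = d["z"].fillna(0.0)             # missing values set to zero
    X = d[["x", "z_filled", "z_missing"]].to_numpy()
    model = ConditionalLogit(d["case"].to_numpy(), X, groups=d["set_id"].to_numpy())
    return model.fit()
```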

4.
5.
6.
7.
8.
9.
In this article, we propose a class of semiparametric transformation rate models for recurrent event data subject to right censoring and potentially stopped by a terminating event (e.g., death). These transformation models include both the additive rates model and the proportional rates model as special cases. Respecting the property that no recurrent events can occur after the terminating event, we model the conditional recurrent event rate given survival. Weighted estimating equations are constructed to estimate the regression coefficients and the baseline rate function. In particular, the baseline rate function is approximated by wavelet functions. Asymptotic properties of the proposed estimators are derived, and a data-dependent criterion is proposed for selecting the most suitable transformation. Simulation studies show that the proposed estimators perform well for practical sample sizes. The proposed methods are applied to two real-data examples: a randomized trial of rhDNase and a community trial of vitamin A.

10.
Liang Li, Bo Hu, Tom Greene. Biometrics, 2009, 65(3): 737-745
Summary. In many longitudinal clinical studies, the level and progression rate of repeatedly measured biomarkers on each subject quantify the severity of the disease and that subject's susceptibility to progression of the disease. It is of scientific and clinical interest to relate such quantities to a later time-to-event clinical endpoint such as patient survival. This is usually done with a shared parameter model. In such models, the longitudinal biomarker data and the survival outcome of each subject are assumed to be conditionally independent given subject-level severity or susceptibility (also called frailty in statistical terms). In this article, we study the case where the conditional distribution of longitudinal data is modeled by a linear mixed-effect model, and the conditional distribution of the survival data is given by a Cox proportional hazard model. We allow unknown regression coefficients and time-dependent covariates in both models. The proposed estimators are maximizers of an exact correction to the joint log likelihood with the frailties eliminated as nuisance parameters, an idea that originated from correction of covariate measurement error in measurement error models. The corrected joint log likelihood is shown to be asymptotically concave and leads to consistent and asymptotically normal estimators. Unlike most published methods for joint modeling, the proposed estimation procedure does not rely on distributional assumptions of the frailties. The proposed method was studied in simulations and applied to a data set from the Hemodialysis Study.

11.
Liugen Xue, Lixing Zhu. Biometrika, 2007, 94(4): 921-937
A semiparametric regression model for longitudinal data is considered. The empirical likelihood method is used to estimate the regression coefficients and the baseline function, and to construct confidence regions and intervals. It is proved that the maximum empirical likelihood estimator of the regression coefficients achieves asymptotic efficiency and the estimator of the baseline function attains asymptotic normality when a bias correction is made. Two calibrated empirical likelihood approaches to inference for the baseline function are developed. We propose a groupwise empirical likelihood procedure to handle the inter-series dependence for the longitudinal semiparametric regression model, and employ bias correction to construct the empirical likelihood ratio functions for the parameters of interest. This leads us to prove a nonparametric version of Wilks' theorem. Compared with methods based on normal approximations, the empirical likelihood does not require consistent estimators for the asymptotic variance and bias. A simulation compares the empirical likelihood and normal-based methods in terms of coverage accuracies and average areas/lengths of confidence regions/intervals.
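To illustrate the basic computation behind empirical likelihood inference, the sketch below evaluates the empirical log-likelihood ratio for a scalar mean (Owen's original setting, not the longitudinal estimator of this paper); the function name and the Newton solver are illustrative choices.

```python
import numpy as np

def el_log_ratio(x, mu, tol=1e-10, max_iter=100):
    """Empirical log-likelihood ratio statistic -2 log R(mu) for a scalar mean.

    Solves sum_i (x_i - mu) / (1 + lam * (x_i - mu)) = 0 for the Lagrange
    multiplier lam by Newton's method, then returns
    -2 * sum_i log(n * p_i) = 2 * sum_i log(1 + lam * (x_i - mu)).
    """
    z = np.asarray(x, float) - mu
    if z.min() >= 0 or z.max() <= 0:
        return np.inf                      # mu outside the convex hull of the data
    lam = 0.0
    for _ in range(max_iter):
        denom = 1.0 + lam * z
        g = np.sum(z / denom)              # derivative of the log EL in lam
        h = -np.sum(z ** 2 / denom ** 2)   # second derivative, always negative
        step = g / h
        new_lam = lam - step
        while np.any(1.0 + new_lam * z <= 0):   # keep all implied weights positive
            step *= 0.5
            new_lam = lam - step
        lam = new_lam
        if abs(g) < tol:
            break
    return 2.0 * np.sum(np.log1p(lam * z))
```

Inverting el_log_ratio against the 0.95 quantile of the chi-squared distribution with one degree of freedom (about 3.84) gives a confidence interval for the mean without estimating an asymptotic variance, which is the practical appeal noted in the abstract.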

12.
Sangbum Choi, Xuelin Huang. Biometrics, 2012, 68(4): 1126-1135
Summary. We propose a semiparametrically efficient estimation of a broad class of transformation regression models for nonproportional hazards data. Classical transformation models are viewed from a frailty-model paradigm, and the proposed method provides a unified approach that is valid for both continuous and discrete frailty models. The proposed models are shown to be flexible enough to model long-term follow-up survival data when the treatment effect diminishes over time, a case in which the proportional hazards (PH) or proportional odds assumption is violated, or a situation in which a substantial proportion of patients remains cured after treatment. Estimation of the link parameter in the frailty distribution, considered to be unknown and possibly dependent on time-independent covariates, is automatically included in the proposed methods. The observed information matrix is computed to evaluate the variances of all the parameter estimates. Our likelihood-based approach provides a natural way to construct simple statistics for testing the PH and proportional odds assumptions for usual survival data, or for testing the short- and long-term effects for survival data with a cure fraction. Simulation studies demonstrate that the proposed inference procedures perform well in realistic settings. Applications to two medical studies are provided.
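As a concrete illustration of how a frailty distribution induces a transformation, the sketch below shows the logarithmic transformation family that arises from a gamma frailty with variance r: the limit r -> 0 recovers proportional hazards and r = 1 gives proportional odds. The symbols Lambda0 and beta in the comment are generic placeholders, not quantities defined in the paper.

```python
import numpy as np

def log_transform(x, r):
    """Logarithmic transformation family G_r(x) = log(1 + r * x) / r.

    Arises from a gamma frailty with variance r: r -> 0 recovers the
    proportional hazards model, r = 1 gives proportional odds.
    """
    x = np.asarray(x, float)
    if r == 0.0:
        return x                     # proportional hazards limit
    return np.log1p(r * x) / r

# Cumulative hazard under such a transformation model (placeholder symbols):
#   H(t | Z) = G_r( Lambda0(t) * exp(beta' Z) )
```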

13.
Summary. For analysis of genomic data, e.g., microarray data from gene expression profiling experiments, the two-component mixture model has been widely used in practice to detect differentially expressed genes. However, it naïvely imposes strong exchangeability assumptions across genes and does not make active use of a priori information about intergene relationships that is currently available, e.g., gene annotations through the Gene Ontology (GO) project. We propose a general strategy that first generates a set of covariates that summarizes the intergene information and then extends the two-component mixture model into a hierarchical semiparametric model utilizing the generated covariates through latent nonparametric regression. Simulations and analysis of real microarray data show that our method can outperform the naïve two-component mixture model.
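For reference, the naïve two-component mixture that the hierarchical model extends can be fitted with a few lines of EM; the sketch below takes the null component to be N(0, 1) on gene-level z-scores and estimates the null proportion and the alternative component. The parametrization, starting values, and variable names are illustrative assumptions, not the covariate-extended model of the paper.

```python
import numpy as np
from scipy.stats import norm

def two_component_em(z, n_iter=200, tol=1e-8):
    """EM for the naive two-component mixture on gene-level z-scores:

        z_i ~ pi0 * N(0, 1) + (1 - pi0) * N(mu1, sig1^2)

    Returns pi0, mu1, sig1 and the posterior probability that each gene
    is differentially expressed.
    """
    z = np.asarray(z, float)
    pi0, mu1, sig1 = 0.9, 2.0, 2.0            # rough starting values
    ll_old = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability that gene i comes from the non-null component
        f0 = pi0 * norm.pdf(z, 0.0, 1.0)
        f1 = (1.0 - pi0) * norm.pdf(z, mu1, sig1)
        w = f1 / (f0 + f1)
        # M-step: update the mixing proportion and the non-null component
        pi0 = 1.0 - w.mean()
        mu1 = np.sum(w * z) / np.sum(w)
        sig1 = np.sqrt(np.sum(w * (z - mu1) ** 2) / np.sum(w))
        ll = np.sum(np.log(f0 + f1))          # log-likelihood at the previous parameters
        if ll - ll_old < tol:
            break
        ll_old = ll
    return pi0, mu1, sig1, w
```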

14.
15.
Most statistical solutions to the problem of statistical inference with missing data involve integration or expectation. This can be done in many ways: directly or indirectly, analytically or numerically, deterministically or stochastically. Missing-data problems can be formulated in terms of latent random variables, so that hierarchical likelihood methods of Lee & Nelder (1996) can be applied to missing-value problems to provide one solution to the problem of integration of the likelihood. The resulting methods effectively use a Laplace approximation to the marginal likelihood with an additional adjustment to the measures of precision to accommodate the estimation of the fixed-effects parameters. We first consider missing at random cases, where problems are simpler to handle because the integration does not need to involve the missing-value mechanism, and then consider missing not at random cases. We also study tobit regression and refit the missing not at random selection model to the antidepressant trial data analyzed in Diggle & Kenward (1994).
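To make the integration step concrete, the sketch below applies a first-order Laplace approximation to the log marginal likelihood contribution of a single scalar latent quantity (for instance a missing value, with h the complete-data log-likelihood as a function of that value). The function name, search bracket, and finite-difference step are illustrative choices, not the adjustment described in the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def laplace_marginal_loglik(h, bounds=(-10.0, 10.0), eps=1e-5):
    """First-order Laplace approximation to log integral of exp(h(u)) du
    for a scalar latent variable u.

    h : callable giving the joint log-likelihood contribution as a function
        of u, with the observed data and parameters held fixed.
    """
    # locate the mode u_hat of h
    res = minimize_scalar(lambda u: -h(u), bounds=bounds, method="bounded")
    u_hat = res.x
    # numerical second derivative of h at the mode (negative at a maximum)
    h2 = (h(u_hat + eps) - 2.0 * h(u_hat) + h(u_hat - eps)) / eps ** 2
    return h(u_hat) + 0.5 * np.log(2.0 * np.pi) - 0.5 * np.log(-h2)
```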

16.
Biomedical studies often collect multivariate event time data from multiple clusters (either subjects or groups) within each of which event times for individuals are correlated, and the correlation may vary in different classes. In such survival analyses, heterogeneity among clusters for shared and specific classes can be accommodated by incorporating parametric frailty terms into the model. In this article, we propose a Bayesian approach to relax the parametric distribution assumption for shared and specific-class frailties by using a Dirichlet process prior while also allowing for the uncertainty of heterogeneity for different classes. Multiple cluster-specific frailty selections rely on variable selection-type mixture priors by applying mixtures of point masses at zero and inverse gamma distributions to the variance of log frailties. This selection allows frailties with zero variance to effectively drop out of the model. A reparameterization of log-frailty terms is performed to reduce the potential bias of fixed effects due to variation of the random distribution and dependence among the parameters, resulting in easy interpretation and faster Markov chain Monte Carlo convergence. Simulated data examples and an application to a lung cancer clinical trial are used for illustration.

17.
18.
To address the problem of missing values in gene-chip (microarray) data, we exploit the intrinsic connection between protein–protein interactions and gene expression and propose a method that uses protein–protein interaction information to improve the accuracy of missing-value estimation. The interaction relationships between proteins are combined with the distances between gene expression profiles to compute an expression similarity between genes, and this new similarity measure is used to select a more suitable set of genes for estimating the missing values of a gene with incomplete data. Combining the new similarity measure with the traditional KNNimpute and LLSimpute methods yields the corresponding improved algorithms PPI-KNNimpute and PPI-LLSimpute. Tests on real data sets show that protein–protein interaction information can effectively improve the accuracy of missing-value estimation.
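A minimal sketch of the idea follows: an expression-based similarity is blended with a protein–protein interaction (PPI) bonus before choosing the k most similar genes used to fill in a missing entry. The blending weight alpha and the specific similarity formula are illustrative assumptions, not the weighting used in the paper.

```python
import numpy as np

def ppi_knn_impute(X, ppi, k=10, alpha=0.5):
    """Illustrative PPI-weighted KNN imputation for a genes-by-samples matrix.

    X     : 2-D array, genes in rows, samples in columns, np.nan marks missing entries
    ppi   : boolean genes-by-genes matrix, True where the two gene products interact
    alpha : weight of the PPI bonus when blending it with expression similarity
            (an illustrative choice, not the paper's weighting)
    """
    X = np.array(X, float)
    n_genes = X.shape[0]
    row_means = np.nanmean(X, axis=1, keepdims=True)
    filled = np.where(np.isnan(X), row_means, X)       # rough pre-fill for distances
    out = X.copy()
    for i in range(n_genes):
        miss = np.isnan(X[i])
        if not miss.any() or miss.all():
            continue
        obs = ~miss
        # Euclidean distance to gene i on the columns where gene i is observed
        d = np.sqrt(np.mean((filled[:, obs] - filled[i, obs]) ** 2, axis=1))
        sim = 1.0 / (1.0 + d)                           # expression similarity in (0, 1]
        sim = (1.0 - alpha) * sim + alpha * ppi[i].astype(float)  # blend in PPI bonus
        sim[i] = -np.inf                                # never pick the gene itself
        neighbors = np.argsort(sim)[::-1][:k]           # k most similar genes
        w = np.clip(sim[neighbors], 0.0, None)
        w = w / w.sum() if w.sum() > 0 else np.full(len(neighbors), 1.0 / len(neighbors))
        out[i, miss] = w @ filled[neighbors][:, miss]   # weighted average of the neighbours
    return out
```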

19.
The bootstrap is a time-honoured distribution-free approach for attaching a standard error to any statistic of interest, but it has not received much attention for data with missing values, especially when imputation techniques are used to replace the missing values. We propose a proportional bootstrap method that allows effective use of imputation techniques for all bootstrap samples. Five deterministic imputation techniques are examined, and particular emphasis is placed on the estimation of the standard error of the correlation coefficient. Some real data examples are presented. Other possible applications of the proposed bootstrap method are discussed.
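For context, the sketch below shows the generic bootstrap-then-impute recipe for the standard error of a correlation coefficient, using mean imputation inside each resample. It is a plain nonparametric bootstrap, not the proportional bootstrap proposed in the paper, and the function name and defaults are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_se_correlation(x, y, n_boot=1000):
    """Plain nonparametric bootstrap SE of the correlation coefficient,
    with mean imputation applied inside each bootstrap sample.

    x, y : 1-D arrays of equal length; np.nan marks missing values.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)              # resample pairs with replacement
        xb, yb = x[idx].copy(), y[idx].copy()
        xb[np.isnan(xb)] = np.nanmean(xb)        # deterministic (mean) imputation
        yb[np.isnan(yb)] = np.nanmean(yb)
        stats.append(np.corrcoef(xb, yb)[0, 1])  # correlation on the completed resample
    return np.std(stats, ddof=1)
```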

20.
Summary. We consider semiparametric transition measurement error models for longitudinal data, where one of the covariates is measured with error in transition models, and no distributional assumption is made for the underlying unobserved covariate. An estimating equation approach based on the pseudo conditional score method is proposed. We show the resulting estimators of the regression coefficients are consistent and asymptotically normal. We also discuss the issue of efficiency loss. Simulation studies are conducted to examine the finite-sample performance of our estimators. The longitudinal AIDS Costs and Services Utilization Survey data are analyzed for illustration.
