Similar literature
20 similar articles found (search time: 772 ms)
1.
Mixed case interval-censored data arise when the event of interest is known only to occur within an interval induced by a sequence of random examination times. Such data are commonly encountered in disease research with longitudinal follow-up. Furthermore, medical treatment has progressed over the last decade, with an increasing proportion of patients being cured of many types of diseases. Thus, interest has grown in cure models for survival data, which hypothesize that a certain proportion of subjects in the population are not expected to experience the event of interest. In this article, we consider a two-component mixture cure model for regression analysis of mixed case interval-censored data. The first component is a logistic regression model that describes the cure rate, and the second component is a semiparametric transformation model that describes the distribution of event time for the uncured subjects. We propose semiparametric maximum likelihood estimation for the considered model. We develop an EM-type algorithm for obtaining the semiparametric maximum likelihood estimators (SPMLE) of regression parameters and establish their consistency, efficiency, and asymptotic normality. Extensive simulation studies indicate that the SPMLE performs satisfactorily in a wide variety of settings. The proposed method is illustrated by the analysis of the hypobaric decompression sickness data from the National Aeronautics and Space Administration.
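
The two-component structure described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the logistic part is parametrized here as the probability of being uncured (so the cure rate is one minus that value), and an exponential survival function stands in for the semiparametric transformation model. The function names, coefficients, and the rate are hypothetical and do not come from the article.

```python
import math


def uncured_probability(z, gamma):
    """Logistic model for the probability that a subject is uncured (assumed parametrization)."""
    linear_predictor = sum(g * v for g, v in zip(gamma, z))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def population_survival(t, z, gamma, uncured_survival):
    """Mixture cure survival: cured subjects never fail, uncured subjects follow uncured_survival."""
    p = uncured_probability(z, gamma)
    return (1.0 - p) + p * uncured_survival(t)


def exp_survival(t, rate=0.5):
    """Hypothetical stand-in for the uncured survival function (not the article's model)."""
    return math.exp(-rate * t)


# Intercept-only logistic part with coefficient 0, so half the population is cured.
print(population_survival(0.0, [1.0], [0.0], exp_survival))
print(population_survival(100.0, [1.0], [0.0], exp_survival))
```

With these placeholder values the population survival starts at 1 and levels off at the cured fraction instead of dropping to 0, which is the defining feature of a cure model.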

2.
Zhang D, Lin X, Sowers M. Biometrics, 2000, 56(1): 31-39
We consider semiparametric regression for periodic longitudinal data. Parametric fixed effects are used to model the covariate effects and a periodic nonparametric smooth function is used to model the time effect. The within-subject correlation is modeled using subject-specific random effects and a random stochastic process with a periodic variance function. We use maximum penalized likelihood to estimate the regression coefficients and the periodic nonparametric time function, whose estimator is shown to be a periodic cubic smoothing spline. We use restricted maximum likelihood to simultaneously estimate the smoothing parameter and the variance components. We show that all model parameters can be easily obtained by fitting a linear mixed model. A common problem in the analysis of longitudinal data is to compare the time profiles of two groups, e.g., between treatment and placebo. We develop a scaled chi-squared test for the equality of two nonparametric time functions. The proposed model and the test are illustrated by analyzing hormone data collected during two consecutive menstrual cycles and their performance is evaluated through simulations.

3.
Weibin Zhong, Guoqing Diao. Biometrics, 2023, 79(3): 1959-1971
Two-phase studies such as case-cohort and nested case-control studies are widely used cost-effective sampling strategies. In the first phase, the observed failure/censoring time and inexpensive exposures are collected. In the second phase, a subgroup of subjects is selected for measurements of expensive exposures based on the information from the first phase. One challenging issue is how to utilize all the available information to conduct efficient regression analyses of the two-phase study data. This paper proposes a joint semiparametric modeling of the survival outcome and the expensive exposures. Specifically, we assume a class of semiparametric transformation models and a semiparametric density ratio model for the survival outcome and the expensive exposures, respectively. The class of semiparametric transformation models includes the proportional hazards model and the proportional odds model as special cases. The density ratio model is flexible in modeling multivariate mixed-type data. We develop efficient likelihood-based estimation and inference procedures and establish the large sample properties of the nonparametric maximum likelihood estimators. Extensive numerical studies reveal that the proposed methods perform well under practical settings. The proposed methods also appear to be reasonably robust under various model mis-specifications. An application to the National Wilms Tumor Study is provided.
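
For readers unfamiliar with how one model class can contain both the proportional hazards and proportional odds models, a common textbook formulation for time-independent covariates is sketched below. The notation ($G$, $\Lambda_0$, $\beta$, $X$) is assumed here for illustration and is not taken from the article, which may use a different parametrization.

```latex
% Cumulative hazard under a semiparametric transformation model (assumed notation):
\Lambda(t \mid X) = G\bigl\{ \Lambda_0(t)\, e^{\beta^\top X} \bigr\}
% G(x) = x gives the proportional hazards model:
\Lambda(t \mid X) = \Lambda_0(t)\, e^{\beta^\top X}
% G(x) = \log(1 + x) gives the proportional odds model, since then
S(t \mid X) = \frac{1}{1 + \Lambda_0(t)\, e^{\beta^\top X}},
\qquad
\frac{1 - S(t \mid X)}{S(t \mid X)} = \Lambda_0(t)\, e^{\beta^\top X}
```

Here $G$ is a known increasing transformation, $\Lambda_0$ is an unspecified baseline function, and the odds of failure in the last line are proportional across covariate values.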

4.
Yu Z, Lin X, Tu W. Biometrics, 2012, 68(2): 429-436
We consider frailty models with additive semiparametric covariate effects for clustered failure time data. We propose a doubly penalized partial likelihood (DPPL) procedure to estimate the nonparametric functions using smoothing splines. We show that the DPPL estimators can be obtained by fitting an augmented working frailty model with parametric covariate effects, with the nonparametric functions estimated as linear combinations of fixed and random effects and the smoothing parameters estimated as extra variance components. This approach allows us to conveniently estimate all model components within a unified frailty model framework. We evaluate the finite sample performance of the proposed method via a simulation study, and apply the method to analyze data from a study of sexually transmitted infections (STI).

5.
Huang Y, Gilbert PB. Biometrics, 2011, 67(4): 1442-1451
Recently a new definition of surrogate endpoint, the "principal surrogate," was proposed based on causal associations between treatment effects on the biomarker and on the clinical endpoint. Despite its appealing interpretation, limited research has been conducted to evaluate principal surrogates, and existing methods focus on risk models that consider a single biomarker. How to compare principal surrogate value of biomarkers or general risk models that consider multiple biomarkers remains an open research question. We propose to characterize a marker or risk model's principal surrogate value based on the distribution of risk difference between interventions. In addition, we propose a novel summary measure (the standardized total gain) that can be used to compare markers and to assess the incremental value of a new marker. We develop a semiparametric estimated-likelihood method to estimate the joint surrogate value of multiple biomarkers. This method accommodates two-phase sampling of biomarkers and is more widely applicable than existing nonparametric methods by incorporating continuous baseline covariates to predict the biomarker(s), and is more robust than existing parametric methods by leaving the error distribution of markers unspecified. The methodology is illustrated using a simulated example set and a real data set in the context of HIV vaccine trials.

6.
Auxiliary covariate data are often collected in biomedical studies when the primary exposure variable is only assessed on a subset of the study subjects. In this study, we investigate semiparametric estimated likelihood estimation for generalized linear mixed models (GLMM) in the presence of a continuous auxiliary variable. We use a kernel smoother to handle continuous auxiliary data. The method can be used to deal with missing or mismeasured covariate data problems in a variety of applications when an auxiliary variable is available and cluster sizes are not too small. Simulation study results show that the proposed method performs better than an approach that ignores the random effects in GLMM and an approach that uses only the data in the validation data set. We illustrate the proposed method with a real data set from a recent environmental epidemiology study on the maternal serum 1,1-dichloro-2,2-bis(p-chlorophenyl) ethylene level in relationship to preterm births.

7.
Moming Li, Guoqing Diao, Jing Qin. Biometrics, 2020, 76(4): 1216-1228
We consider a two-sample problem where data come from symmetric distributions. Usual two-sample data with only magnitudes recorded, arising from case-control studies or logistic discriminant analyses, may constitute a symmetric two-sample problem. We propose a semiparametric model such that, in addition to symmetry, the log ratio of two unknown density functions is modeled in a known parametric form. The new semiparametric model, tailor-made for symmetric two-sample data, can also be viewed as a biased sampling model subject to a symmetry constraint. A maximum empirical likelihood estimation approach is adopted to estimate the unknown model parameters, and the corresponding profile empirical likelihood ratio test is utilized to perform hypothesis testing regarding the two population distributions. Symmetry, however, comes with irregularity. It is shown that, under the null hypothesis of equal symmetric distributions, the maximum empirical likelihood estimator has degenerate Fisher information, and the test statistic has a mixture of χ²-type asymptotic distribution. Extensive simulation studies have been conducted to demonstrate promising statistical powers under correct and misspecified models. We apply the proposed methods to two real examples.

8.
Wen CC, Lin CT. Biometrics, 2011, 67(3): 760-769
Statistical inference based on right-censored data for the proportional hazards (PH) model with missing covariates has received considerable attention, but interval-censored or current status data with missing covariates have not yet been investigated. Our study is partly motivated by the analysis of fracture data from the 2005 National Health Interview Survey Original Database in Taiwan, where the occurrence of fractures was interval censored and the covariate osteoporosis was not reported for all residents. We assume that the data are realized from a PH model. A semiparametric maximum likelihood estimate implemented by a hybrid algorithm is proposed to analyze current status data with missing covariates. A comparison of the performance of our method with full-cohort analysis, complete-case analysis, and surrogate analysis is made via simulation with moderate sample sizes. The fracture data are then analyzed.

9.
Gray RJ. Biometrics, 2000, 56(2): 571-576
An estimator of the regression parameters in a semiparametric transformed linear survival model is examined. This estimator consists of a single Newton-like update of the solution to a rank-based estimating equation from an initial consistent estimator. An automated penalized likelihood algorithm is proposed for estimating the optimal weight function for the estimating equations and the error hazard function that is needed in the variance estimator. In simulations, the estimated optimal weights are found to give reasonably efficient estimators of the regression parameters, and the variance estimators are found to perform well. The methodology is applied to an analysis of prognostic factors in non-Hodgkin's lymphoma.

10.
Recent technological advances continue to provide noninvasive and more accurate biomarkers for evaluating disease status. One standard tool for assessing the accuracy of diagnostic tests is the receiver operating characteristic (ROC) curve. Few statistical methods exist to accommodate multiple continuous-scale biomarkers in the framework of ROC analysis. In this paper, we propose a method to integrate continuous-scale biomarkers to optimize classification accuracy. Specifically, we develop semiparametric transformation models for multiple biomarkers. We assume that unknown and marker-specific transformations of biomarkers follow a multivariate normal distribution. Our models accommodate biomarkers subject to limits of detection and account for the dependence among biomarkers by including a subject-specific random effect. We also propose a diagnostic measure using an optimal linear combination of the transformed biomarkers. Our diagnostic rule does not depend on any monotone transformation of biomarkers and is not sensitive to extreme biomarker values. Nonparametric maximum likelihood estimation (NPMLE) is used for inference. We show that the parameter estimators are asymptotically normal and efficient. We illustrate our semiparametric approach using data from the Endometriosis, Natural History, Diagnosis, and Outcomes (ENDO) study.
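
As background for the ROC terminology in this abstract, the area under the empirical ROC curve for a single marker can be computed as the proportion of diseased/healthy pairs that the marker ranks correctly. The sketch below is a generic rank-based calculation, not the article's semiparametric method; the function name and the toy marker values are made up for illustration.

```python
import math


def empirical_auc(diseased, healthy):
    """Empirical AUC: share of (diseased, healthy) pairs ranked correctly, ties counted as 1/2."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


# Hypothetical marker values for illustration only.
diseased = [2.1, 3.4, 1.7, 4.0]
healthy = [1.2, 2.5, 0.9, 1.7]

auc_raw = empirical_auc(diseased, healthy)
# Because the calculation uses ranks only, a strictly increasing
# transformation of the marker (here exp) leaves the AUC unchanged.
auc_transformed = empirical_auc([math.exp(x) for x in diseased],
                                [math.exp(x) for x in healthy])
print(auc_raw, auc_transformed)
```

The invariance shown in the last step holds for a single marker under any strictly increasing transformation, since only the ordering of the values enters the calculation.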

11.
Heritability is a population parameter of importance in evolution, plant and animal breeding, and human medical genetics. It can be estimated using pedigree designs and, more recently, using relationships estimated from markers. We derive the sampling variance of the estimate of heritability for a wide range of experimental designs, assuming that estimation is by maximum likelihood and that the resemblance between relatives is solely due to additive genetic variation. We show that well-known results for balanced designs are special cases of a more general unified framework. For pedigree designs, the sampling variance is inversely proportional to the variance of relationship in the pedigree and it is proportional to 1/N, whereas for population samples it is approximately proportional to 1/N², where N is the sample size. Variation in relatedness is a key parameter in the quantification of the sampling variance of heritability. Consequently, the sampling variance is high for populations with large recent effective population size (e.g., humans) because this causes low variation in relationship. However, even using human population samples, low sampling variance is possible with high N.

12.
Liu D, Lin X, Ghosh D. Biometrics, 2007, 63(4): 1079-1088
We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least-squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function and allows for the possibility that each gene expression effect might be nonlinear and the genes within the same pathway are likely to interact with each other in a complicated way. This semiparametric model also makes it possible to test for the overall genetic pathway effect. We show that the LSKM semiparametric regression can be formulated using a linear mixed model. Estimation and inference hence can proceed within the linear mixed model framework using standard mixed model software. Both the regression coefficients of the covariate effects and the LSKM estimator of the genetic pathway effect can be obtained using the best linear unbiased predictor in the corresponding linear mixed model formulation. The smoothing parameter and the kernel parameter can be estimated as variance components using restricted maximum likelihood. A score test is developed to test for the genetic pathway effect. Model/variable selection within the LSKM framework is discussed. The methods are illustrated using a prostate cancer data set and evaluated using simulations.

13.
Outcome-dependent sampling (ODS) schemes can be a cost-effective way to enhance study efficiency. The case-control design has been widely used in epidemiologic studies. However, when the outcome is measured on a continuous scale, dichotomizing the outcome could lead to a loss of efficiency. Recent epidemiologic studies have used ODS schemes where, in addition to an overall random sample, there are also a number of supplemental samples that are collected based on a continuous outcome variable. We consider a semiparametric empirical likelihood inference procedure in which the underlying distribution of covariates is treated as a nuisance parameter and is left unspecified. The proposed estimator is asymptotically normal. The likelihood ratio statistic using the semiparametric empirical likelihood function has Wilks-type properties in that, under the null, it follows a chi-square distribution asymptotically and is independent of the nuisance parameters. Our simulation results indicate that, for data obtained using an ODS design, the semiparametric empirical likelihood estimator is more efficient than conditional likelihood and probability weighted pseudolikelihood estimators and that ODS designs (along with the proposed estimator) can produce more efficient estimates than simple random sample designs of the same size. We apply the proposed method to analyze a data set from the Collaborative Perinatal Project (CPP), an ongoing environmental epidemiologic study, to assess the relationship between maternal polychlorinated biphenyl (PCB) level and children's IQ test performance.

14.
Identification of novel biomarkers for risk assessment is important for both effective disease prevention and optimal treatment recommendation. Discovery relies on the precious yet limited resource of stored biological samples from large prospective cohort studies. Case-cohort sampling design provides a cost-effective tool in the context of biomarker evaluation, especially when the clinical condition of interest is rare. Existing statistical methods focus on making efficient inference on relative hazard parameters from the Cox regression model. Drawing on recent theoretical developments on the weighted likelihood for semiparametric models under two-phase studies (Breslow and Wellner, 2007), we propose statistical methods to evaluate the accuracy and predictiveness of a risk prediction biomarker, with censored time-to-event outcome under stratified case-cohort sampling. We consider nonparametric methods and a semiparametric method. We derive large sample properties of the proposed estimators and evaluate their finite sample performance using numerical studies. We illustrate the new procedures using data from the Framingham Offspring Study to evaluate the accuracy of a recently developed risk score incorporating biomarker information for predicting cardiovascular disease.

15.
The two-stage design is a well-known cost-effective way of conducting biomedical studies when the exposure variable is expensive or difficult to measure. Recent research has further allowed one or both stages of the two-stage design to depend on a continuous outcome variable. This outcome-dependent sampling feature enables further efficiency gain in parameter estimation and overall cost reduction of the study (e.g. Wang, X. and Zhou, H., 2010. Design and inference for cancer biomarker study with an outcome and auxiliary-dependent subsampling. Biometrics 66, 502-511; Zhou, H., Song, R., Wu, Y. and Qin, J., 2011. Statistical inference for a two-stage outcome-dependent sampling design with a continuous outcome. Biometrics 67, 194-202). In this paper, we develop a semiparametric mixed effect regression model for data from a two-stage design where the second-stage data are sampled with an outcome-auxiliary-dependent sample (OADS) scheme. Our method allows the cluster- or center-effects of the study subjects to be accounted for. We propose an estimated likelihood function to estimate the regression parameters. A simulation study indicates that greater study efficiency gains can be achieved under the proposed two-stage OADS design with center-effects when compared with alternative sampling schemes. We illustrate the proposed method by analyzing a dataset from the Collaborative Perinatal Project.

16.
Chen Q, Ibrahim JG. Biometrics, 2006, 62(1): 177-184
We consider a class of semiparametric models for the covariate distribution and missing data mechanism for missing covariate and/or response data for general classes of regression models including generalized linear models and generalized linear mixed models. Ignorable and nonignorable missing covariate and/or response data are considered. The proposed semiparametric model can be viewed as a sensitivity analysis for model misspecification of the missing covariate distribution and/or missing data mechanism. The semiparametric model consists of a generalized additive model (GAM) for the covariate distribution and/or missing data mechanism. Penalized regression splines are used to express the GAMs as a generalized linear mixed effects model, in which the variance of the corresponding random effects provides an intuitive index for choosing between the semiparametric and parametric model. Maximum likelihood estimates are then obtained via the EM algorithm. Simulations are given to demonstrate the methodology, and a real data set from a melanoma cancer clinical trial is analyzed using the proposed methods.

17.
The case-cohort study involves two-phase sampling: simple random sampling from an infinite superpopulation at phase one and stratified random sampling from a finite cohort at phase two. Standard analyses of case-cohort data involve solution of inverse probability weighted (IPW) estimating equations, with weights determined by the known phase two sampling fractions. The variance of parameter estimates in (semi)parametric models, including the Cox model, is the sum of two terms: (i) the model-based variance of the usual estimates that would be calculated if full data were available for the entire cohort; and (ii) the design-based variance from IPW estimation of the unknown cohort total of the efficient influence function (IF) contributions. This second variance component may be reduced by adjusting the sampling weights, either by calibration to known cohort totals of auxiliary variables correlated with the IF contributions or by their estimation using these same auxiliary variables. Both adjustment methods are implemented in the R survey package. We derive the limit laws of coefficients estimated using adjusted weights. The asymptotic results suggest practical methods for construction of auxiliary variables that are evaluated by simulation of case-cohort samples from the National Wilms Tumor Study and by log-linear modeling of case-cohort data from the Atherosclerosis Risk in Communities Study. Although not semiparametric efficient, estimators based on adjusted weights may come close to achieving full efficiency within the class of augmented IPW estimators.
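
The role of the known phase-two sampling fractions can be illustrated with the simplest IPW quantity, an estimated cohort total. The sketch below is a generic Horvitz-Thompson-style calculation and not the calibrated or estimated-weight estimators studied in the article; the function name, the values, and the sampling fractions are hypothetical.

```python
def ipw_total(values, sampled, sampling_prob):
    """IPW estimate of a cohort total: each sampled value is weighted by 1 / sampling probability."""
    return sum(v / p for v, s, p in zip(values, sampled, sampling_prob) if s)


# Hypothetical cohort of four subjects. The first is a case (always sampled,
# probability 1.0); the others are non-cases sampled with probability 0.5.
values = [2.0, 4.0, 6.0, 8.0]
sampled = [True, False, True, True]
sampling_prob = [1.0, 0.5, 0.5, 0.5]

estimate = ipw_total(values, sampled, sampling_prob)
print(estimate)
```

In a single realized sample the estimate differs from the true cohort total (20 in this toy cohort); the estimator is unbiased only on average over repeated phase-two sampling, and that design-based variability is the second variance term the abstract refers to.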

18.
Zhang J, Peng Y, Zhao O. Biometrics, 2011, 67(4): 1352-1360
The accelerated hazard model has been proposed for more than a decade. However, its application is still very limited, partly due to the complexity of the existing semiparametric estimation method. We propose a new semiparametric estimation method based on a kernel-smoothed approximation to the limit of a profile likelihood function of the model. The method leads to smooth estimating equations and is easy to use. The estimates from the method are proved to be consistent and asymptotically normal. Our numerical study shows that the new method is more efficient than the existing method. The proposed method is employed to reanalyze the data from a brain tumor treatment study.

19.
A generalized case-control (GCC) study, like the standard case-control study, leverages outcome-dependent sampling (ODS) to extend to nonbinary responses. We develop a novel, unifying approach for analyzing GCC study data using the recently developed semiparametric extension of the generalized linear model (GLM), which is substantially more robust to model misspecification than existing approaches based on parametric GLMs. For valid estimation and inference, we use a conditional likelihood to account for the biased sampling design. We describe analysis procedures for estimation and inference for the semiparametric GLM under a conditional likelihood, and we discuss problems with estimation and inference under a conditional likelihood when the response distribution is misspecified. We demonstrate the flexibility of our approach over existing ones through extensive simulation studies, and we apply the methodology to an analysis of the Asset and Health Dynamics Among the Oldest Old study, which motivates our research. The proposed approach yields a simple yet versatile solution for handling ODS in a wide variety of possible response distributions and sampling schemes encountered in practice.

20.
Several maximum likelihood and distance matrix methods for estimating phylogenetic trees from homologous DNA sequences were compared when substitution rates at sites were assumed to follow a gamma distribution. Computer simulations were performed to estimate the probabilities that various tree estimation methods recover the true tree topology. The case of four species was considered, and a few combinations of parameters were examined. Attention was paid to discriminating among different sources of error in tree reconstruction, i.e., the inconsistency of the tree estimation method, the sampling error in the estimated tree due to limited sequence length, and the sampling error in the estimated probability due to the number of simulations being limited. Compared to the least squares method based on pairwise distance estimates, the joint likelihood analysis is found to be more robust when rate variation over sites is present but ignored and an assumption is thus violated. With limited data, the likelihood method has a much higher probability of recovering the true tree and is therefore more efficient than the least squares method. The concept of statistical consistency of a tree estimation method and its implications were explored, and it is suggested that, while the efficiency (or sampling error) of a tree estimation method is a very important property, statistical consistency of the method over a wide range of, if not all, parameter values is a prerequisite.
