Leveraging information in aggregate data from external sources to improve estimation efficiency and prediction accuracy with smaller scale studies has drawn a great deal of attention in recent years. Yet, conventional methods often either ignore uncertainty in the external information or fail to account for the heterogeneity between internal and external studies. This article proposes an empirical likelihood-based framework to improve the estimation of the semiparametric transformation models by incorporating information about the t-year subgroup survival probability from external sources. The proposed estimation procedure incorporates an additional likelihood component to account for uncertainty in the external information and employs a density ratio model to characterize population heterogeneity. We establish the consistency and asymptotic normality of the proposed estimator and show that it is more efficient than the conventional pseudopartial likelihood estimator without combining information. Simulation studies show that the proposed estimator yields little bias and outperforms the conventional approach even in the presence of information uncertainty and heterogeneity. The proposed methodologies are illustrated with an analysis of a pancreatic cancer study.
Large sample theory of semiparametric models based on maximum likelihood estimation (MLE) with shape constraint on the nonparametric component is well studied. Relatively less attention has been paid to the computational aspect of semiparametric MLE. The computation of semiparametric MLE based on existing approaches such as the expectation‐maximization (EM) algorithm can be computationally prohibitive when the missing rate is high. In this paper, we propose a computational framework for semiparametric MLE based on an inexact block coordinate ascent (BCA) algorithm. We show theoretically that the proposed algorithm converges. This computational framework can be applied to a wide range of data with different structures, such as panel count data, interval‐censored data, and degradation data, among others. Simulation studies demonstrate favorable performance compared with existing algorithms in terms of accuracy and speed. Two data sets are used to illustrate the proposed computational method. We further implement the proposed computational method in R package BCA1SG, available at CRAN.
AIM: Our objective was to identify the distribution of the endangered golden-cheeked warbler (Setophaga chrysoparia) in fragmented oak-juniper woodlands by applying a geoadditive semiparametric occupancy model to better assist decision-makers in identifying suitable habitat across the species breeding range on which conservation or mitigation activities can be focused and thus prioritize management and conservation planning. LOCATION: Texas, USA. METHODS: We used repeated double-observer detection/non-detection surveys of randomly selected (n = 287) patches of potential habitat to evaluate warbler patch-scale presence across the species breeding range. We used a geoadditive semiparametric occupancy model with remotely sensed habitat metrics (patch size and landscape composition) to predict patch-scale occupancy of golden-cheeked warblers in the fragmented oak-juniper woodlands of central Texas, USA. RESULTS: Our spatially explicit model indicated that golden-cheeked warbler patch occupancy declined from south to north within the breeding range concomitant with reductions in the availability of large habitat patches. We found that 59% of woodland patches, primarily in the northern and central portions of the warbler's range, were predicted to have occupancy probabilities ≤0.10 with only 3% of patches predicted to have occupancy probabilities >0.90. Our model exhibited high prediction accuracy (area under curve = 0.91) when validated using independently collected warbler occurrence data. MAIN CONCLUSIONS: We have identified a distinct spatial occurrence gradient for golden-cheeked warblers as well as a relationship between two measurable landscape characteristics. Because habitat-occupancy relationships were key drivers of our model, our results can be used to identify potential areas where conservation actions supporting habitat mitigation can occur and identify areas where conservation of future potential habitat is possible. 
Additionally, our results can be used to focus resources on maintenance and creation of patches that are more likely to harbour viable local warbler populations.
Hjort & Claeskens (2003) developed an asymptotic theory for model selection, model averaging and subsequent inference using likelihood methods in parametric models, along with associated confidence statements. In this article, we consider a semiparametric version of this problem, wherein the likelihood depends on parameters and an unknown function, and model selection/averaging is to be applied to the parametric parts of the model. We show that all the results of Hjort & Claeskens hold in the semiparametric context, if the Fisher information matrix for parametric models is replaced by the semiparametric information bound for semiparametric models, and if maximum likelihood estimators for parametric models are replaced by semiparametric efficient profile estimators. Our methods of proof employ Le Cam's contiguity lemmas, leading to transparent results. The results also describe the behaviour of semiparametric model estimators when the parametric component is misspecified, and also have implications for pointwise-consistent model selectors.
Data on doe longevity in a rabbit population were analysed using a semiparametric log-Normal animal frailty model. Longevity was defined as the time from the first positive pregnancy test to death or culling due to pathological problems. Does culled for other reasons had right censored records of longevity. The model included time dependent covariates associated with year by season, the interaction between physiological state and the number of young born alive, and between order of positive pregnancy test and physiological state. The model also included an additive genetic effect and a residual in log frailty. Properties of marginal posterior distributions of specific parameters were inferred from a full Bayesian analysis using Gibbs sampling. All of the fully conditional posterior distributions defining a Gibbs sampler were easy to sample from, either directly or using adaptive rejection sampling. The marginal posterior mean estimates of the additive genetic variance and of the residual variance in log frailty were 0.247 and 0.690.
Kaitlyn Cook  Wenbin Lu  Rui Wang 《Biometrics》2023,79(3):1670-1685
The Botswana Combination Prevention Project was a cluster-randomized HIV prevention trial whose follow-up period coincided with Botswana's national adoption of a universal test and treat strategy for HIV management. Of interest is whether, and to what extent, this change in policy modified the preventative effects of the study intervention. To address such questions, we adopt a stratified proportional hazards model for clustered interval-censored data with time-dependent covariates and develop a composite expectation maximization algorithm that facilitates estimation of model parameters without placing parametric assumptions on either the baseline hazard functions or the within-cluster dependence structure. We show that the resulting estimators for the regression parameters are consistent and asymptotically normal. We also propose and provide theoretical justification for the use of the profile composite likelihood function to construct a robust sandwich estimator for the variance. We characterize the finite-sample performance and robustness of these estimators through extensive simulation studies. Finally, we conclude by applying this stratified proportional hazards model to a re-analysis of the Botswana Combination Prevention Project, with the national adoption of a universal test and treat strategy now modeled as a time-dependent covariate.
This article develops hypothesis testing procedures for the stratified mark‐specific proportional hazards model with missing covariates where the baseline functions may vary with strata. The mark‐specific proportional hazards model has been studied to evaluate mark‐specific relative risks where the mark is the genetic distance of an infecting HIV sequence to an HIV sequence represented inside the vaccine. This research is motivated by analyzing the RV144 phase 3 HIV vaccine efficacy trial, to understand associations of immune response biomarkers with the mark‐specific hazard of HIV infection, where the biomarkers are sampled via a two‐phase sampling nested case‐control design. We test whether the mark‐specific relative risks are unity and how they change with the mark. The developed procedures enable assessment of whether the risk of HIV infection with HIV variants close to or far from the vaccine sequence is modified by immune responses induced by the HIV vaccine; this question is interesting because vaccine protection occurs through immune responses directed at specific HIV sequences. The test statistics are constructed based on augmented inverse probability weighted complete‐case estimators. The asymptotic properties and finite‐sample performances of the testing procedures are investigated, demonstrating double‐robustness and effectiveness of the predictive auxiliaries to recover efficiency. The finite‐sample performance of the proposed tests is examined through a comprehensive simulation study. The methods are applied to the RV144 trial.
Marginal structural models for time‐fixed treatments fit using inverse‐probability weighted estimating equations are increasingly popular. Nonetheless, the resulting effect estimates are subject to finite‐sample bias when data are sparse, as is typical for large‐sample procedures. Here we propose a semi‐Bayes estimation approach which penalizes or shrinks the estimated model parameters to improve finite‐sample performance. This approach uses simple symmetric data‐augmentation priors. Limited simulation experiments indicate that the proposed approach reduces finite‐sample bias and improves confidence‐interval coverage when the true values lie within the central “hill” of the prior distribution. We illustrate the approach with data from a nonexperimental study of HIV treatments.
Semiparametric smoothing methods are usually used to model longitudinal data, and the interest is to improve efficiency for regression coefficients. This paper is concerned with the estimation in semiparametric varying‐coefficient models (SVCMs) for longitudinal data. By the orthogonal projection method, local linear technique, quasi‐score estimation, and quasi‐maximum likelihood estimation, we propose a two‐stage orthogonality‐based method to estimate the parameter vector, the coefficient function vector, and the covariance function. The developed procedures can be implemented separately and the resulting estimators do not affect each other. Under some mild conditions, asymptotic properties of the resulting estimators are established explicitly. In particular, the asymptotic behavior of the estimator of the coefficient function vector at the boundaries is examined. Further, the finite sample performance of the proposed procedures is assessed by Monte Carlo simulation experiments. Finally, the proposed methodology is illustrated with an analysis of an acquired immune deficiency syndrome (AIDS) dataset.
