Similar Documents (20 results)
1.
Balshaw RF, Dean CB. Biometrics 2002, 58(2):324-331
In many longitudinal studies, interest focuses on the occurrence rate of some phenomenon for the subjects in the study. When the phenomenon is nonterminating and possibly recurring, the result is a recurrent-event data set. Examples include epileptic seizures and recurrent cancers. When the recurring event is detectable only by an expensive or invasive examination, only the number of events occurring between follow-up times may be available. This article presents a semiparametric model for such data, based on a multiplicative intensity model paired with a fully flexible nonparametric baseline intensity function. A random subject-specific effect is included in the intensity model to account for the overdispersion frequently displayed in count data. Estimators are determined from quasi-likelihood estimating functions. Because only first- and second-moment assumptions are required for quasi-likelihood, the method is more robust than those based on the specification of a full parametric likelihood. Consistency of the estimators depends only on the assumption of the proportional intensity model. The semiparametric estimators are shown to be highly efficient compared with the usual parametric estimators. As with semiparametric methods in survival analysis, the method provides useful diagnostics for specific parametric models, including a quasi-score statistic for testing specific baseline intensity functions. The techniques are used to analyze cancer recurrences and a pheromone-based mating disruption experiment in moths. A simulation study confirms that, for many practical situations, the estimators possess appropriate small-sample characteristics.
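The overdispersion that motivates the subject-specific random effect above can be diagnosed with a simple moment estimate: the Pearson statistic divided by its degrees of freedom, which exceeds 1 when counts vary more than a Poisson model allows. A minimal sketch (this is a standard diagnostic, not the authors' estimating functions; the counts are invented):

```python
def pearson_dispersion(counts, fitted, n_params):
    """Moment estimate of the dispersion parameter for count data:
    sum((y - mu)^2 / mu) / (n - p). Values well above 1 indicate
    overdispersion relative to the Poisson assumption var = mean."""
    n = len(counts)
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted))
    return chi2 / (n - n_params)

# Hypothetical recurrence counts and fitted means from a mean-only model
y = [0, 5, 1, 9, 0, 7, 2, 8]
mu = [4.0] * len(y)              # common fitted mean, one parameter
print(round(pearson_dispersion(y, mu, 1), 2))  # well above 1: overdispersed
```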

2.
Ishiguro, Sakamoto, and Kitagawa (1997, Annals of the Institute of Statistical Mathematics 49, 411-434) proposed EIC as an extension of the Akaike information criterion (AIC); the idea behind EIC is to correct the bias of the log-likelihood, considered as an estimator of the Kullback-Leibler information, using the bootstrap. We develop this criterion for use in multivariate semiparametric situations, and argue that it can be used for choosing among parametric and semiparametric estimators. A simulation study based on a regression model shows that EIC is better than its competitors, although likelihood cross-validation performs nearly as well except for small sample sizes. Its use is illustrated by estimating the mean evolution of viral RNA levels in a group of infants infected by HIV.

3.
An accelerated failure time (AFT) model assuming a log-linear relationship between failure time and a set of covariates can be either parametric or semiparametric, depending on the distributional assumption for the error term. Both classes of AFT models have been popular in the analysis of censored failure time data. The semiparametric AFT model is more flexible and robust to departures from the distributional assumption than its parametric counterpart. However, the semiparametric AFT model is subject to producing biased results for estimating any quantities involving an intercept. Estimating an intercept requires a separate procedure. Moreover, a consistent estimation of the intercept requires stringent conditions. Thus, essential quantities such as mean failure times might not be reliably estimated using semiparametric AFT models, which can be naturally done in the framework of parametric AFT models. Meanwhile, parametric AFT models can be severely impaired by misspecifications. To overcome this, we propose a new type of AFT model using a nonparametric Gaussian-scale mixture distribution. We also provide feasible algorithms to estimate the parameters and mixing distribution. The finite sample properties of the proposed estimators are investigated via an extensive simulation study. The proposed estimators are illustrated using a real dataset.

4.
Weibin Zhong, Guoqing Diao. Biometrics 2023, 79(3):1959-1971
Two-phase studies such as case-cohort and nested case-control studies are widely used cost-effective sampling strategies. In the first phase, the observed failure/censoring time and inexpensive exposures are collected. In the second phase, a subgroup of subjects is selected for measurements of expensive exposures based on the information from the first phase. One challenging issue is how to utilize all the available information to conduct efficient regression analyses of the two-phase study data. This paper proposes a joint semiparametric modeling of the survival outcome and the expensive exposures. Specifically, we assume a class of semiparametric transformation models and a semiparametric density ratio model for the survival outcome and the expensive exposures, respectively. The class of semiparametric transformation models includes the proportional hazards model and the proportional odds model as special cases. The density ratio model is flexible in modeling multivariate mixed-type data. We develop efficient likelihood-based estimation and inference procedures and establish the large sample properties of the nonparametric maximum likelihood estimators. Extensive numerical studies reveal that the proposed methods perform well under practical settings. The proposed methods also appear to be reasonably robust under various model misspecifications. An application to the National Wilms Tumor Study is provided.

5.
Inferring pH from diatoms: a comparison of old and new calibration methods
Two new methods for inferring pH from diatoms are presented. Both are based on the observation that the relationships between diatom taxa and pH are often unimodal. The first method is maximum likelihood calibration based on Gaussian logit response curves of taxa against pH. The second is weighted averaging: in a lake with a particular pH, taxa with an optimum close to the lake pH will be most abundant, so an intuitively reasonable estimate of the lake pH is a weighted average of the pH optima of the species present. Optima and tolerances of diatom taxa were estimated from contemporary pH and proportional diatom counts in littoral-zone samples from 97 pristine soft-water lakes and pools in Western Europe. The optima showed a strong relation with Hustedt's pH preference groups. The two new methods were then compared with existing calibration methods on the basis of differences between inferred and observed pH in a test set of 62 additional samples taken between 1918 and 1983. The methods ranked in order of performance as follows (standard error of inferred pH, in pH units, in parentheses): maximum likelihood (0.63) > weighted averaging (0.71) = multiple regression using pH groups (0.71) = the Gasse & Tekaia method (0.71) > Renberg & Hellberg's Index B (0.83) » multiple regression using taxa (2.2). These standard errors are larger than those usually obtained from surface-sediment samples; the relatively large standard errors may be due to seasonal variation and to the effects of other factors such as humus content. The maximum likelihood method is statistically rigorous and can in principle be extended to allow for additional environmental factors; however, it is computer-intensive. The weighted-averaging approach is a good approximation to the maximum likelihood method and is recommended as a practical and robust alternative.
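The weighted-averaging calibration described above reduces to a one-line formula: the inferred lake pH is the abundance-weighted mean of the taxon pH optima. A minimal sketch (the counts and optima are invented, not from the 97-lake training set):

```python
def wa_infer_ph(abundances, optima):
    """Weighted-averaging calibration: infer lake pH as the
    abundance-weighted mean of the pH optima of the taxa present."""
    total = sum(abundances)
    if total == 0:
        raise ValueError("no taxa counted")
    return sum(a * u for a, u in zip(abundances, optima)) / total

# Hypothetical proportional counts of three taxa and their pH optima
counts = [40, 25, 10]
optima = [5.2, 6.1, 7.0]
print(round(wa_infer_ph(counts, optima), 2))  # → 5.74
```

Because abundant taxa dominate the average, the estimate is pulled toward the optima of the taxa that thrive at the true lake pH.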

6.
There has been growing interest in the likelihood paradigm of statistics, where statistical evidence is represented by the likelihood function and its strength is measured by likelihood ratios. The available literature in this area has so far focused on parametric likelihood functions, though in some cases a parametric likelihood can be robustified. This focused discussion on parametric models, while insightful and productive, may have left the impression that the likelihood paradigm is best suited to parametric situations. This article discusses the use of empirical likelihood functions, a well-developed methodology in the frequentist paradigm, to interpret statistical evidence in nonparametric and semiparametric situations. A comparative review of the literature shows that, while an empirical likelihood is not a true probability density, it has the essential properties, namely consistency and local asymptotic normality, that unify and justify the various parametric likelihood methods for evidential analysis. Real examples are presented to illustrate and compare the empirical likelihood method and the parametric likelihood methods. These methods are also compared in terms of asymptotic efficiency by combining relevant results from different areas. It is seen that a parametric likelihood based on a correctly specified model is generally more efficient than an empirical likelihood for the same parameter. However, when the working model fails, a parametric likelihood either breaks down or, if a robust version exists, becomes less efficient than the corresponding empirical likelihood.
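As a concrete instance of the empirical likelihood machinery discussed above, the empirical log-likelihood ratio for a population mean can be profiled with a one-dimensional Lagrange multiplier. A minimal pure-Python sketch (the data are invented; a production implementation would use a dedicated package):

```python
import math

def el_log_ratio(x, mu):
    """Empirical log-likelihood ratio statistic -2*log R(mu) for the mean.
    Solves sum((xi - mu) / (1 + lam*(xi - mu))) = 0 for the Lagrange
    multiplier lam by bisection, then returns 2*sum(log(1 + lam*(xi - mu)))."""
    d = [xi - mu for xi in x]
    if min(d) >= 0 or max(d) <= 0:
        return float("inf")            # mu outside the convex hull of the data
    # lam must keep all implied weights positive: 1 + lam*di > 0 for all i
    lo = -1.0 / max(d) + 1e-10
    hi = -1.0 / min(d) - 1e-10
    def g(lam):
        return sum(di / (1.0 + lam * di) for di in d)
    for _ in range(200):               # bisection; g is strictly decreasing
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * di) for di in d)

x = [1, 2, 3, 4, 5]
print(el_log_ratio(x, 3.0))            # at the sample mean: essentially 0
print(el_log_ratio(x, 2.0) > 0.5)      # away from the mean: positive evidence
```

Under the conditions the article reviews, this statistic is asymptotically chi-squared with one degree of freedom, which is what licenses likelihood-ratio-style evidential comparisons without a parametric model.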

7.
Liang Hua; Wu Hulin; Zou Guohua. Biometrika 2008, 95(3):773-778
The conventional model selection criterion, the Akaike information criterion, AIC, has been applied to choose candidate models in mixed-effects models by the consideration of marginal likelihood. Vaida & Blanchard (2005) demonstrated that such a marginal AIC and its small sample correction are inappropriate when the research focus is on clusters. Correspondingly, these authors suggested the use of conditional AIC. Their conditional AIC is derived under the assumption that the variance-covariance matrix or scaled variance-covariance matrix of random effects is known. This note provides a general conditional AIC but without these strong assumptions. Simulation studies show that the proposed method is promising.
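For reference, Vaida & Blanchard's conditional AIC, in the known-variance case that this note relaxes, takes the form (sketched here from the standard definition):

```latex
\mathrm{cAIC} \;=\; -2\,\log f\bigl(y \mid \hat\beta, \hat b\bigr) \;+\; 2\rho,
\qquad \rho = \operatorname{tr}(H),
```

where \(f(y \mid \hat\beta, \hat b)\) is the likelihood conditional on the estimated random effects \(\hat b\), and \(H\) is the hat matrix mapping the observations to the fitted values, so \(\rho\) plays the role of effective degrees of freedom. When the variance components must also be estimated the penalty changes, which is the situation the general criterion in this note is designed to handle.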

8.
We propose parametric regression analysis of cumulative incidence function with competing risks data. A simple form of Gompertz distribution is used for the improper baseline subdistribution of the event of interest. Maximum likelihood inferences on regression parameters and associated cumulative incidence function are developed for parametric models, including a flexible generalized odds rate model. Estimation of the long-term proportion of patients with cause-specific events is straightforward in the parametric setting. Simple goodness-of-fit tests are discussed for evaluating a fixed odds rate assumption. The parametric regression methods are compared with an existing semiparametric regression analysis on a breast cancer data set where the cumulative incidence of recurrence is of interest. The results demonstrate that the likelihood-based parametric analyses for the cumulative incidence function are a practically useful alternative to the semiparametric analyses.
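The improper Gompertz baseline makes the long-term proportion available in closed form. A sketch under one common parameterization, with subdistribution hazard b·exp(a·t) and a < 0 (the paper's exact parameterization may differ; the values are invented):

```python
import math

def gompertz_cif(t, a, b):
    """Cumulative incidence F(t) = 1 - exp(-(b/a) * (exp(a*t) - 1))
    from a Gompertz subdistribution hazard b*exp(a*t).
    With a < 0 the distribution is improper: F(t) plateaus below 1."""
    return 1.0 - math.exp(-(b / a) * (math.exp(a * t) - 1.0))

def long_term_proportion(a, b):
    """Limit of F(t) as t -> infinity (requires a < 0)."""
    return 1.0 - math.exp(b / a)

a, b = -0.5, 0.3                               # hypothetical shape and rate
print(round(long_term_proportion(a, b), 3))    # → 0.451, the CIF plateau
```

The plateau is exactly the "long-term proportion of patients with cause-specific events" that the abstract notes is straightforward to estimate in the parametric setting.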

9.
Moming Li, Guoqing Diao, Jing Qin. Biometrics 2020, 76(4):1216-1228
We consider a two-sample problem where data come from symmetric distributions. Usual two-sample data with only magnitudes recorded, arising from case-control studies or logistic discriminant analyses, may constitute a symmetric two-sample problem. We propose a semiparametric model such that, in addition to symmetry, the log ratio of two unknown density functions is modeled in a known parametric form. The new semiparametric model, tailor-made for symmetric two-sample data, can also be viewed as a biased sampling model subject to a symmetry constraint. A maximum empirical likelihood estimation approach is adopted to estimate the unknown model parameters, and the corresponding profile empirical likelihood ratio test is utilized to perform hypothesis testing regarding the two population distributions. Symmetry, however, comes with irregularity. It is shown that, under the null hypothesis of equal symmetric distributions, the maximum empirical likelihood estimator has degenerate Fisher information, and the test statistic has a mixture of χ2-type asymptotic distributions. Extensive simulation studies have been conducted to demonstrate promising statistical powers under correct and misspecified models. We apply the proposed methods to two real examples.

10.
Malka Gorfine, Li Hsu. Biometrics 2011, 67(2):415-426
In this work, we provide a new class of frailty-based competing risks models for clustered failure times data. This class is based on expanding the competing risks model of Prentice et al. (1978, Biometrics 34, 541–554) to incorporate frailty variates, with the use of cause-specific proportional hazards frailty models for all the causes. Parametric and nonparametric maximum likelihood estimators are proposed. The main advantages of the proposed class of models, in contrast to the existing models, are: (1) the inclusion of covariates; (2) the flexible structure of the dependency among the various types of failure times within a cluster; and (3) the unspecified within-subject dependency structure. The proposed estimation procedures produce the most efficient parametric and semiparametric estimators and are easy to implement. Simulation studies show that the proposed methods perform very well in practical situations.

11.
Longitudinal data are common in clinical trials and observational studies, where missing outcomes due to dropout are frequently encountered. In this context, under the assumption of missing at random, the weighted generalized estimating equation (WGEE) approach is widely adopted for marginal analysis. Model selection on the marginal mean regression is a crucial aspect of data analysis, and identifying an appropriate correlation structure for model fitting may also be of interest and importance. However, the existing information criteria for model selection in WGEE have limitations, such as separate criteria for the selection of marginal mean and correlation structures, and unsatisfactory selection performance in small-sample setups. In particular, few studies have developed joint information criteria for the selection of both marginal mean and correlation structures. In this work, by embedding empirical likelihood into the WGEE framework, we propose two new information criteria, a joint empirical Akaike information criterion and a joint empirical Bayesian information criterion, which can simultaneously select the variables for the marginal mean regression and the correlation structure. In extensive simulation studies, these empirical-likelihood-based criteria exhibit robustness, flexibility, and outperformance compared with other criteria, including the weighted quasi-likelihood under the independence model criterion, the missing longitudinal information criterion, and the joint longitudinal information criterion. In addition, we provide a theoretical justification of the proposed criteria, and present two real data examples for further illustration.

12.
Often there is substantial uncertainty in the selection of confounders when estimating the association between an exposure and health. We define this type of uncertainty as "adjustment uncertainty". We propose a general statistical framework for handling adjustment uncertainty in exposure effect estimation for a large number of confounders, we describe a specific implementation, and we develop associated visualization tools. Theoretical results and simulation studies show that the proposed method provides consistent estimators of the exposure effect and its variance. We also show that, when the goal is to estimate an exposure effect accounting for adjustment uncertainty, Bayesian model averaging with posterior model probabilities approximated using information criteria can fail to estimate the exposure effect and can over- or underestimate its variance. We compare our approach to Bayesian model averaging using time series data on levels of fine particulate matter and mortality.

13.
Schafer DW. Biometrics 2001, 57(1):53-61
This paper presents an EM algorithm for semiparametric likelihood analysis of linear, generalized linear, and nonlinear regression models with measurement errors in explanatory variables. A structural model is used in which probability distributions are specified for (a) the response and (b) the measurement error. A distribution is also assumed for the true explanatory variable but is left unspecified and is estimated by nonparametric maximum likelihood. For various types of extra information about the measurement error distribution, the proposed algorithm makes use of available routines that would be appropriate for likelihood analysis of (a) and (b) if the true x were available. Simulations suggest that the semiparametric maximum likelihood estimator retains a high degree of efficiency relative to the structural maximum likelihood estimator based on correct distributional assumptions and can outperform maximum likelihood based on an incorrect distributional assumption. The approach is illustrated on three examples with a variety of structures and types of extra information about the measurement error distribution.

14.
Given a large number of t-statistics, we consider the problem of approximating the distribution of noncentrality parameters (NCPs) by a continuous density. This problem is closely related to the control of false discovery rates (FDR) in massive hypothesis testing applications, e.g., microarray gene expression analysis. Our methodology is similar to, but improves upon, the existing approach by Ruppert, Nettleton, and Hwang (2007, Biometrics 63, 483–495). We provide parametric, nonparametric, and semiparametric estimators for the distribution of NCPs, as well as estimates of the FDR and local FDR. In the parametric situation, we assume that the NCPs follow a distribution that leads to an analytically available marginal distribution for the test statistics. In the nonparametric situation, we use convex combinations of basis density functions to estimate the density of the NCPs. A sequential quadratic programming procedure is developed to maximize the penalized likelihood. The smoothing parameter is selected with the approximate network information criterion. A semiparametric estimator is also developed to combine both parametric and nonparametric fits. Simulations show that, under a variety of situations, our density estimates are closer to the underlying truth and our FDR estimates are improved compared with alternative methods. Data-based simulations and the analyses of two microarray datasets are used to evaluate the performance in realistic situations.
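The local FDR mentioned above follows the standard two-group mixture definition fdr(t) = π₀f₀(t)/f(t). A sketch with normal densities standing in for the fitted components (π₀ and the alternative density here are invented placeholders, not the authors' estimates):

```python
import math

def norm_pdf(t, mean=0.0, sd=1.0):
    """Density of a normal distribution."""
    z = (t - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def local_fdr(t, pi0, f0, f1):
    """Local false discovery rate under the two-group mixture
    f(t) = pi0*f0(t) + (1 - pi0)*f1(t): fdr(t) = pi0*f0(t) / f(t)."""
    ft = pi0 * f0(t) + (1.0 - pi0) * f1(t)
    return pi0 * f0(t) / ft

f0 = norm_pdf                              # null component: centered at 0
f1 = lambda t: norm_pdf(t, mean=3.0)       # alternative: shifted by an NCP of 3
print(local_fdr(0.0, 0.9, f0, f1))         # near the null mode: close to 1
print(local_fdr(5.0, 0.9, f0, f1))         # far in the tail: near 0
```

Estimating the NCP density well, as the paper does, matters precisely because the alternative component f₁ in this ratio is the marginal density induced by the NCP distribution.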

15.
Yuan Y, Yin G. Biometrics 2011, 67(4):1543-1554
In the estimation of a dose-response curve, parametric models are straightforward and efficient but subject to model misspecifications; nonparametric methods are robust but less efficient. As a compromise, we propose a semiparametric approach that combines the advantages of parametric and nonparametric curve estimates. In a mixture form, our estimator takes a weighted average of the parametric and nonparametric curve estimates, in which a higher weight is assigned to the estimate with a better model fit. When the parametric model assumption holds, the semiparametric curve estimate converges to the parametric estimate and thus achieves high efficiency; when the parametric model is misspecified, the semiparametric estimate converges to the nonparametric estimate and remains consistent. We also consider an adaptive weighting scheme to allow the weight to vary according to the local fit of the models. We conduct extensive simulation studies to investigate the performance of the proposed methods and illustrate them with two real examples.
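The mixture form of the estimator above is a pointwise weighted average of the two curve fits. A minimal sketch (the curve values and weight are invented; in the paper the weight is driven by model fit rather than chosen by hand):

```python
def mixture_estimate(f_par, f_np, weight):
    """Semiparametric compromise: pointwise weighted average of a parametric
    and a nonparametric dose-response estimate at the same dose points.
    weight in [0, 1] is the confidence placed in the parametric fit."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * p + (1.0 - weight) * n for p, n in zip(f_par, f_np)]

# Hypothetical estimates of a dose-response curve at four doses
parametric    = [0.10, 0.30, 0.60, 0.80]
nonparametric = [0.12, 0.25, 0.65, 0.78]
print(mixture_estimate(parametric, nonparametric, 0.7))
```

Setting the weight to 1 recovers the parametric fit and 0 recovers the nonparametric one, which is why a data-driven weight interpolates between efficiency and robustness.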

16.
Specification of an appropriate model is critical to valid statistical inference. Given that the "true model" for the data is unknown, the goal of model selection is to select a plausible approximating model that balances model bias and sampling variance. Model selection based on information criteria such as AIC or its variant AICc, or criteria like CAIC, has proven useful in a variety of contexts including the analysis of open-population capture-recapture data. These criteria have not been intensively evaluated for closed-population capture-recapture models, which are integer parameter models used to estimate population size (N), and there is concern that they will not perform well. To address this concern, we evaluated AIC, AICc, and CAIC model selection for closed-population capture-recapture models by empirically assessing the quality of inference for the population size parameter N. We found that AIC-, AICc-, and CAIC-selected models had smaller relative mean squared errors than randomly selected models, but that confidence interval coverage on N was poor unless unconditional variance estimates (which incorporate model uncertainty) were used to compute confidence intervals. Overall, AIC and AICc outperformed CAIC, and are preferred to CAIC for selection among the closed-population capture-recapture models we investigated. A model averaging approach to estimation, using AIC, AICc, or CAIC to estimate weights, was also investigated and proved superior to estimation using AIC-, AICc-, or CAIC-selected models. Our results suggested that, for model averaging, AIC or AICc should be favored over CAIC for estimating weights.
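The model-averaging step above uses the standard Akaike-weight construction: each model's weight is exp(−Δᵢ/2) normalized over the candidate set, where Δᵢ is its AIC (or AICc/CAIC) difference from the best model. A sketch with invented N-hat values and criterion scores:

```python
import math

def akaike_weights(criterion_values):
    """Model-averaging weights from AIC/AICc/CAIC values:
    w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j),
    where delta_i = value_i - min(values)."""
    best = min(criterion_values)
    rel = [math.exp(-0.5 * (v - best)) for v in criterion_values]
    total = sum(rel)
    return [r / total for r in rel]

def model_averaged(estimates, criterion_values):
    """Criterion-weighted average of a parameter (e.g. population size N)
    across candidate closed-population models."""
    w = akaike_weights(criterion_values)
    return sum(wi * est for wi, est in zip(w, estimates))

n_hats = [120.0, 135.0, 128.0]   # hypothetical N-hat from three models
aics   = [210.3, 212.1, 214.8]   # hypothetical AIC scores
print(round(model_averaged(n_hats, aics), 1))
```

Because the weights depend only on criterion differences, substituting AICc or CAIC values changes the weights but not the averaging machinery, which is how the comparison in the abstract is set up.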

17.
Xue Liugen; Zhu Lixing. Biometrika 2007, 94(4):921-937
A semiparametric regression model for longitudinal data is considered. The empirical likelihood method is used to estimate the regression coefficients and the baseline function, and to construct confidence regions and intervals. It is proved that the maximum empirical likelihood estimator of the regression coefficients achieves asymptotic efficiency and the estimator of the baseline function attains asymptotic normality when a bias correction is made. Two calibrated empirical likelihood approaches to inference for the baseline function are developed. We propose a groupwise empirical likelihood procedure to handle the inter-series dependence for the longitudinal semiparametric regression model, and employ bias correction to construct the empirical likelihood ratio functions for the parameters of interest. This leads us to prove a nonparametric version of Wilks' theorem. Compared with methods based on normal approximations, the empirical likelihood does not require consistent estimators for the asymptotic variance and bias. A simulation compares the empirical likelihood and normal-based methods in terms of coverage accuracies and average areas/lengths of confidence regions/intervals.

18.
Parametric and semiparametric cure models have been proposed for cure proportion estimation in cancer clinical research. In this paper, several parametric and semiparametric models are compared, and their estimation methods are discussed within the framework of the EM algorithm. We show that the semiparametric PH cure model can achieve efficiency levels similar to those of parametric cure models, provided that the failure time distribution is well specified and uncured patients have an increasing hazard rate. Therefore, the semiparametric model is a viable alternative to parametric cure models. When the hazard rate of uncured patients is rapidly decreasing, the estimates from the semiparametric cure model tend to have large variations and biases. However, all other models also tend to have large variations and biases in this case.
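Cure models of the kind compared above typically share the mixture form S(t) = π + (1 − π)·Sᵤ(t), where π is the cure proportion and Sᵤ is the survival function of the uncured. A sketch using an exponential Sᵤ as an illustrative choice (not any particular model from the paper; the values are invented):

```python
import math

def cure_survival(t, pi_cure, rate):
    """Mixture cure model: S(t) = pi + (1 - pi) * S_u(t), where S_u is the
    survival function of uncured patients -- here exponential with the
    given rate, an illustrative stand-in for the fitted latency model."""
    return pi_cure + (1.0 - pi_cure) * math.exp(-rate * t)

# With 30% cured, the survival curve plateaus at 0.3 instead of dropping to 0
print(round(cure_survival(50.0, 0.3, 0.5), 3))
```

The plateau at π is the signature of cure models: the population survival curve levels off at the cure proportion rather than decaying to zero.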

19.
We consider the efficient estimation of a regression parameter in a partially linear additive nonparametric regression model from repeated measures data when the covariates are multivariate. To date, while there is some literature in the scalar covariate case, the problem has not been addressed in the multivariate additive model case. Ours represents a first contribution in this direction. As part of this work, we first describe the behavior of nonparametric estimators for additive models with repeated measures when the underlying model is not additive. These results are critical when one considers variants of the basic additive model. We apply them to the partially linear additive repeated-measures model, deriving an explicit consistent estimator of the parametric component; if the errors are in addition Gaussian, the estimator is semiparametric efficient. We also apply our basic methods to a unique testing problem that arises in genetic epidemiology; in combination with a projection argument we develop an efficient and easily computed testing scheme. Simulations and an empirical example from nutritional epidemiology illustrate our methods.

20.
In this paper, a semiparametric bivariate linear regression model for survival and quality-adjusted survival is investigated. Even with a parametric specification for the joint distribution, maximum likelihood is not applicable because of induced informative censoring. We propose inference procedures based on estimating functions. The estimators are consistent and asymptotically normal. Hypothesis tests and confidence intervals may be constructed with easy-to-implement resampling techniques. Simultaneous regression modeling of survival and quality-adjusted survival has not been studied formally. Our methodology gives parameter estimates that are highly interpretable in the context of a cost-effectiveness analysis. The usefulness of the proposal is illustrated with a breast cancer dataset.

