1.

An information ratio-based goodness-of-fit test for copula models on censored data

Tao Sun Yu Cheng Ying Ding 《Biometrics》2023,79(3):1713-1725

Copula is a popular method for modeling the dependence among marginal distributions in multivariate censored data. As many copula models are available, it is essential to check if the chosen copula model fits the data well for analysis. Existing approaches to testing the fitness of copula models are mainly for complete or right-censored data. No formal goodness-of-fit (GOF) test exists for interval-censored or recurrent events data. We develop a general GOF test for copula-based survival models using the information ratio (IR) to address this research gap. It can be applied to any copula family with a parametric form, such as the frequently used Archimedean, Gaussian, and D-vine families. The test statistic is easy to calculate, and the test procedure is straightforward to implement. We establish the asymptotic properties of the test statistic. The simulation results show that the proposed test controls the type-I error well and achieves adequate power when the dependence strength is moderate to high. Finally, we apply our method to test various copula models in analyzing multiple real datasets. Our method consistently separates different copula models for all these datasets in terms of model fitness.
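
The abstract does not state the test statistic. As a hedged sketch of the idea behind information-ratio tests in general, the display below uses notation that is assumed here for illustration and is not taken from the paper: $\ell$ is the log-likelihood contribution of one observation, $p$ is the dimension of $\theta$, and $S$ and $V$ are the sensitivity and variability matrices. Under a correctly specified model the information matrix equality makes the two matrices coincide, so a trace-type ratio equals one; the paper's exact statistic may differ.

```latex
S(\theta) = -\mathrm{E}\!\left[\frac{\partial^2 \ell(\theta)}{\partial\theta\,\partial\theta^{\top}}\right],
\qquad
V(\theta) = \mathrm{E}\!\left[\frac{\partial \ell(\theta)}{\partial\theta}\,
                               \frac{\partial \ell(\theta)}{\partial\theta^{\top}}\right],
\qquad
S(\theta_0) = V(\theta_0)
\;\Longrightarrow\;
\frac{1}{p}\,\operatorname{tr}\!\left\{ S(\theta_0)^{-1} V(\theta_0) \right\} = 1 .
```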

2.

Regression modeling of semicompeting risks data (cited by: 1; self-citations: 0; citations by others: 1)

Peng L Fine JP 《Biometrics》2007,63(1):96-108

Semicompeting risks data are often encountered in clinical trials with intermediate endpoints subject to dependent censoring from informative dropout. Unlike with competing risks data, dropout may not be dependently censored by the intermediate event. There has recently been increased attention to these data, in particular inferences about the marginal distribution of the intermediate event without covariates. In this article, we incorporate covariates and formulate their effects on the survival function of the intermediate event via a functional regression model. To accommodate informative censoring, a time-dependent copula model is proposed in the observable region of the data which is more flexible than standard parametric copula models for the dependence between the events. The model permits estimation of the marginal distribution under weaker assumptions than in previous work on competing risks data. New nonparametric estimators for the marginal and dependence models are derived from nonlinear estimating equations and are shown to be uniformly consistent and to converge weakly to Gaussian processes. Graphical model checking techniques are presented for the assumed models. Nonparametric tests are developed accordingly, as are inferences for parametric submodels for the time-varying covariate effects and copula parameters. A novel time-varying sensitivity analysis is developed using the estimation procedures. Simulations and an AIDS data analysis demonstrate the practical utility of the methodology.

3.

Copula‐based semiparametric models for spatiotemporal data

Yanlin Tang Huixia J. Wang Ying Sun Amanda S. Hering 《Biometrics》2019,75(4):1156-1167

The joint analysis of spatial and temporal processes poses computational challenges due to the data's high dimensionality. Furthermore, such data are commonly non‐Gaussian. In this paper, we introduce a copula‐based spatiotemporal model for analyzing spatiotemporal data and propose a semiparametric estimator. The proposed algorithm is computationally simple, since it models the marginal distribution and the spatiotemporal dependence separately. Instead of assuming a parametric distribution, the proposed method models the marginal distributions nonparametrically and thus offers more flexibility. The method also provides a convenient way to construct both point and interval predictions at new times and locations, based on the estimated conditional quantiles. Through a simulation study and an analysis of wind speeds observed along the border between Oregon and Washington, we show that our method produces more accurate point and interval predictions for skewed data than those based on normality assumptions.

4.

A class of goodness of fit tests for a copula based on bivariate right-censored data

Andersen PK Ekstrøm CT Klein JP Shu Y Zhang MJ 《Biometrical journal. Biometrische Zeitschrift》2005,47(6):815-824

The copula of a bivariate distribution, constructed by making marginal transformations of each component, captures all the information in the bivariate distribution about the dependence between two variables. For frailty models for bivariate data the choice of a family of distributions for the random frailty corresponds to the choice of a parametric family for the copula. A class of tests of the hypothesis that the copula is in a given parametric family, with unspecified association parameter, based on bivariate right censored data is proposed. These tests are based on first making marginal Kaplan-Meier transformations of the data and then comparing a non-parametric estimate of the copula to an estimate based on the assumed family of models. A number of options are available for choosing the scale and the distance measure for this comparison. Significance levels of the test are found by a modified bootstrap procedure. The procedure is used to check the appropriateness of a gamma or a positive stable frailty model in a set of survival data on Danish twins.
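
The first step described above, a marginal Kaplan-Meier transformation, can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `kaplan_meier` and `km_transform` are hypothetical, and only the standard product-limit estimator is shown, not the copula comparison or the bootstrap.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : observed times (event or censoring)
    events : 1 if the time is an observed event, 0 if right-censored
    Returns a list of (t, S(t)) pairs at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def km_transform(t, curve):
    """Evaluate the right-continuous step function S(t) from kaplan_meier."""
    value = 1.0
    for event_time, surv in curve:
        if event_time <= t:
            value = surv
        else:
            break
    return value


curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
print(km_transform(2.5, curve))
```

Applying `km_transform` to each margin maps the observed times to an approximately uniform scale, which is the scale on which copula estimates are compared.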

5.

Modeling spatial survival data using semiparametric frailty models

Li Y Ryan L 《Biometrics》2002,58(2):287-297

We propose a new class of semiparametric frailty models for spatially correlated survival data. Specifically, we extend the ordinary frailty models by allowing random effects accommodating spatial correlations to enter into the baseline hazard function multiplicatively. We prove identifiability of the models and give sufficient regularity conditions. We propose drawing inference based on a marginal rank likelihood. No parametric forms of the baseline hazard need to be assumed in this semiparametric approach. Monte Carlo simulations and the Laplace approach are used to tackle the intractable integral in the likelihood function. Different spatial covariance structures are explored in simulations and the proposed methods are applied to the East Boston Asthma Study to detect prognostic factors leading to childhood asthma.

6.

A Gaussian Copula Model for Multivariate Survival Data

Othus M Li Y 《Statistics in biosciences》2010,2(2):154-179

We consider a Gaussian copula model for multivariate survival times. Estimation of the copula association parameter is easily implemented with existing software using a two-stage estimation procedure. Using the Gaussian copula, we are able to test whether the association parameter is equal to zero. When the association term is positive, the model can be extended to incorporate cluster-level frailty terms. Asymptotic properties are derived under the two-stage estimation scheme. Simulation studies verify finite sample utility. We apply the method to a Children’s Oncology Group multicenter study of acute lymphoblastic leukemia. The analysis estimates marginal treatment effects and examines potential clustering within treatment institution.
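
The two-stage idea (margins first, association second) can be illustrated with a deliberately simplified sketch. It assumes complete, uncensored data and rank-based margins, so it is not the estimator of the paper, which handles censored survival times; the names `normal_scores` and `gaussian_copula_rho` are hypothetical.

```python
from statistics import NormalDist


def normal_scores(x):
    """Stage 1: rank-transform a sample and map the ranks to normal scores."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    nd = NormalDist()
    # Dividing by n + 1 keeps the pseudo-observations strictly inside (0, 1).
    return [nd.inv_cdf(r / (n + 1)) for r in ranks]


def gaussian_copula_rho(x, y):
    """Stage 2: correlation of the normal scores as the association estimate."""
    zx, zy = normal_scores(x), normal_scores(y)
    # The scores are symmetric about zero, so no centring is needed.
    sxy = sum(a * b for a, b in zip(zx, zy))
    sxx = sum(a * a for a in zx)
    syy = sum(b * b for b in zy)
    return sxy / (sxx * syy) ** 0.5


print(gaussian_copula_rho([1, 2, 3, 4, 5], [2, 4, 8, 16, 32]))
```

Because only ranks enter the second stage, any monotone transformation of a margin leaves the estimate unchanged.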

7.

Accelerated failure time modeling via nonparametric mixtures

Byungtae Seo Sangwook Kang 《Biometrics》2023,79(1):165-177

An accelerated failure time (AFT) model assuming a log-linear relationship between failure time and a set of covariates can be either parametric or semiparametric, depending on the distributional assumption for the error term. Both classes of AFT models have been popular in the analysis of censored failure time data. The semiparametric AFT model is more flexible and robust to departures from the distributional assumption than its parametric counterpart. However, the semiparametric AFT model is subject to producing biased results for estimating any quantities involving an intercept. Estimating an intercept requires a separate procedure. Moreover, a consistent estimation of the intercept requires stringent conditions. Thus, essential quantities such as mean failure times might not be reliably estimated using semiparametric AFT models, which can be naturally done in the framework of parametric AFT models. Meanwhile, parametric AFT models can be severely impaired by misspecifications. To overcome this, we propose a new type of AFT model using a nonparametric Gaussian-scale mixture distribution. We also provide feasible algorithms to estimate the parameters and mixing distribution. The finite sample properties of the proposed estimators are investigated via an extensive simulation study. The proposed estimators are illustrated using a real dataset.

8.

Residuals for proportional hazards models with interval-censored survival data

Farrington CP 《Biometrics》2000,56(2):473-482

We develop diagnostic tools for use with proportional hazards models for interval-censored survival data. We propose counterparts to the Cox-Snell, Lagakos (or martingale), deviance, and Schoenfeld residuals. Many of the properties of these residuals carry over to the interval-censored case. In particular, the interval-censored versions of the Lagakos and Schoenfeld residuals may be derived as components of suitable score statistics. The Lagakos residuals may be used to check regression relationships, while the Schoenfeld residuals can help to detect nonproportional hazards in semiparametric models. The methods apply to parametric models and to the semiparametric model with discrete observation times.

9.

Dose-response curve estimation: a semiparametric mixture approach

Yuan Y Yin G 《Biometrics》2011,67(4):1543-1554

In the estimation of a dose-response curve, parametric models are straightforward and efficient but subject to model misspecifications; nonparametric methods are robust but less efficient. As a compromise, we propose a semiparametric approach that combines the advantages of parametric and nonparametric curve estimates. In a mixture form, our estimator takes a weighted average of the parametric and nonparametric curve estimates, in which a higher weight is assigned to the estimate with a better model fit. When the parametric model assumption holds, the semiparametric curve estimate converges to the parametric estimate and thus achieves high efficiency; when the parametric model is misspecified, the semiparametric estimate converges to the nonparametric estimate and remains consistent. We also consider an adaptive weighting scheme to allow the weight to vary according to the local fit of the models. We conduct extensive simulation studies to investigate the performance of the proposed methods and illustrate them with two real examples.
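
The mixture form described above, a weighted average that favours the better-fitting estimate, can be sketched as below. The abstract does not give the weighting rule, so the residual-sum-of-squares weight in `fit_based_weight` is purely an assumption for illustration, and both function names are hypothetical.

```python
def mixture_curve(parametric, nonparametric, weight):
    """Pointwise weighted average of two curve estimates.

    weight is the share given to the parametric estimate, in [0, 1].
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * p + (1.0 - weight) * q
            for p, q in zip(parametric, nonparametric)]


def fit_based_weight(observed, parametric, nonparametric):
    """Illustrative weight (an assumption, not the paper's rule).

    The parametric share grows as its residual sum of squares shrinks
    relative to that of the nonparametric estimate.
    """
    rss_p = sum((o - p) ** 2 for o, p in zip(observed, parametric))
    rss_n = sum((o - q) ** 2 for o, q in zip(observed, nonparametric))
    total = rss_p + rss_n
    if total == 0.0:
        return 0.5  # both fits are perfect, so split evenly
    return rss_n / total


observed = [1.0, 2.0, 3.0]
parametric = [1.0, 2.0, 3.0]
nonparametric = [1.0, 2.0, 4.0]
w = fit_based_weight(observed, parametric, nonparametric)
print(w, mixture_curve(parametric, nonparametric, w))
```

An adaptive version in the spirit of the abstract would compute a separate weight at each dose from the local fit instead of a single global weight.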

10.

Multivariate continuation ratio models: connections and caveats

Heagerty PJ Zeger SL 《Biometrics》2000,56(3):719-732

We develop semiparametric estimation methods for a pair of regressions that characterize the first and second moments of clustered discrete survival times. In the first regression, we represent discrete survival times through univariate continuation indicators whose expectations are modeled using a generalized linear model. In the second regression, we model the marginal pairwise association of survival times using the Clayton-Oakes cross-product ratio (Clayton, 1978, Biometrika 65, 141-151; Oakes, 1989, Journal of the American Statistical Association 84, 487-493). These models have recently been proposed by Shih (1998, Biometrics 54, 1115-1128). We relate the discrete survival models to multivariate multinomial models presented in Heagerty and Zeger (1996, Journal of the American Statistical Association 91, 1024-1036) and derive a paired estimating equations procedure that is computationally feasible for moderate and large clusters. We extend the work of Guo and Lin (1994, Biometrics 50, 632-639) and Shih (1998) to allow covariance weighted estimating equations and investigate the impact of weighting in terms of asymptotic relative efficiency. We demonstrate that the multinomial structure must be acknowledged when adopting weighted estimating equations and show that a naive use of GEE methods can lead to inconsistent parameter estimates. Finally, we illustrate the proposed methodology by analyzing psychological testing data previously summarized by TenHave and Uttal (1994, Applied Statistics 43, 371-384) and Guo and Lin (1994).

11.

Bayesian semiparametric copula estimation with application to psychiatric genetics

Ori Rosen Wesley K. Thompson 《Biometrical journal. Biometrische Zeitschrift》2015,57(3):468-484

This paper proposes a semiparametric methodology for modeling multivariate and conditional distributions. We first build a multivariate distribution whose dependence structure is induced by a Gaussian copula and whose marginal distributions are estimated nonparametrically via mixtures of B‐spline densities. The conditional distribution of a given variable is obtained in closed form from this multivariate distribution. We take a Bayesian approach, using Markov chain Monte Carlo methods for inference. We study the frequentist properties of the proposed methodology via simulation and apply the method to estimation of conditional densities of summary statistics, used for computing conditional local false discovery rates, from genetic association studies of schizophrenia and cardiovascular disease risk factors.

12.

A semiparametric joint model for cluster size and subunit-specific interval-censored outcomes

Chun Yin Lee Kin Yau Wong Kwok Fai Lam Dipankar Bandyopadhyay 《Biometrics》2023,79(3):2010-2022

Clustered data frequently arise in biomedical studies, where observations, or subunits, measured within a cluster are associated. The cluster size is said to be informative, if the outcome variable is associated with the number of subunits in a cluster. In most existing work, the informative cluster size issue is handled by marginal approaches based on within-cluster resampling, or cluster-weighted generalized estimating equations. Although these approaches yield consistent estimation of the marginal models, they do not allow estimation of within-cluster associations and are generally inefficient. In this paper, we propose a semiparametric joint model for clustered interval-censored event time data with informative cluster size. We use a random effect to account for the association among event times of the same cluster as well as the association between event times and the cluster size. For estimation, we propose a sieve maximum likelihood approach and devise a computationally-efficient expectation-maximization algorithm for implementation. The estimators are shown to be strongly consistent, with the Euclidean components being asymptotically normal and achieving semiparametric efficiency. Extensive simulation studies are conducted to evaluate the finite-sample performance, efficiency and robustness of the proposed method. We also illustrate our method via application to a motivating periodontal disease dataset.

13.

A semiparametric model for the analysis of recurrent-event panel data

Balshaw RF Dean CB 《Biometrics》2002,58(2):324-331

In many longitudinal studies, interest focuses on the occurrence rate of some phenomenon for the subjects in the study. When the phenomenon is nonterminating and possibly recurring, the result is a recurrent-event data set. Examples include epileptic seizures and recurrent cancers. When the recurring event is detectable only by an expensive or invasive examination, only the number of events occurring between follow-up times may be available. This article presents a semiparametric model for such data, based on a multiplicative intensity model paired with a fully flexible nonparametric baseline intensity function. A random subject-specific effect is included in the intensity model to account for the overdispersion frequently displayed in count data. Estimators are determined from quasi-likelihood estimating functions. Because only first- and second-moment assumptions are required for quasi-likelihood, the method is more robust than those based on the specification of a full parametric likelihood. Consistency of the estimators depends only on the assumption of the proportional intensity model. The semiparametric estimators are shown to be highly efficient compared with the usual parametric estimators. As with semiparametric methods in survival analysis, the method provides useful diagnostics for specific parametric models, including a quasi-score statistic for testing specific baseline intensity functions. The techniques are used to analyze cancer recurrences and a pheromone-based mating disruption experiment in moths. A simulation study confirms that, for many practical situations, the estimators possess appropriate small-sample characteristics.

14.

Modeling multivariate survival data by a semiparametric random effects proportional odds model

Lam KF Lee YW Leung TL 《Biometrics》2002,58(2):316-323

In this article, the focus is on the analysis of multivariate survival time data with various types of dependence structures. Examples of multivariate survival data include clustered data and repeated measurements from the same subject, such as the interrecurrence times of cancer tumors. A random effect semiparametric proportional odds model is proposed as an alternative to the proportional hazards model. The distribution of the random effects is assumed to be multivariate normal and the random effect is assumed to act additively to the baseline log-odds function. This class of models, which includes the usual shared random effects model, the additive variance components model, and the dynamic random effects model as special cases, is highly flexible and is capable of modeling a wide range of multivariate survival data. A unified estimation procedure is proposed to estimate the regression and dependence parameters simultaneously by means of a marginal-likelihood approach. Unlike the fully parametric case, the regression parameter estimate is not sensitive to the choice of correlation structure of the random effects. The marginal likelihood is approximated by the Monte Carlo method. Simulation studies are carried out to investigate the performance of the proposed method. The proposed method is applied to two well-known data sets, including clustered data and recurrent event times data.

15.

A pathway for multivariate analysis of ecological communities using copulas

Marti J. Anderson Perry de Valpine Andrew Punnett Arden E. Miller 《Ecology and evolution》2019,9(6):3276-3294

We describe a new pathway for multivariate analysis of data consisting of counts of species abundances that includes two key components: copulas, to provide a flexible joint model of individual species, and dissimilarity‐based methods, to integrate information across species and provide a holistic view of the community. Individual species are characterized using suitable (marginal) statistical distributions, with the mean, the degree of over‐dispersion, and/or zero‐inflation being allowed to vary among a priori groups of sampling units. Associations among species are then modeled using copulas, which allow any pair of disparate types of variables to be coupled through their cumulative distribution function, while maintaining entirely the separate individual marginal distributions appropriate for each species. A Gaussian copula smoothly captures changes in an index of association that excludes joint absences in the space of the original species variables. A permutation‐based filter with exact family‐wise error can optionally be used a priori to reduce the dimensionality of the copula estimation problem. We describe in detail a Monte Carlo expectation maximization algorithm for efficient estimation of the copula correlation matrix with discrete marginal distributions (counts). The resulting fully parameterized copula models can be used to simulate realistic ecological community data under fully specified null or alternative hypotheses. Distributions of community centroids derived from simulated data can then be visualized in ordinations of ecologically meaningful dissimilarity spaces. Multinomial mixtures of data drawn from copula models also yield smooth power curves in dissimilarity‐based settings. Our proposed analysis pathway provides new opportunities to combine model‐based approaches with dissimilarity‐based methods to enhance understanding of ecological systems. 
We demonstrate implementation of the pathway through an ecological example, where associations among fish species were found to increase after the establishment of a marine reserve.

16.

Ranking prognosis markers in cancer genomic studies

Ma S Song X 《Briefings in bioinformatics》2011,12(1):33-40

In cancer research, high-throughput genomic studies have been extensively conducted, searching for markers associated with cancer diagnosis, prognosis and variation in response to treatment. In this article, we analyze cancer prognosis studies and investigate ranking markers based on their marginal prognosis power. To avoid ambiguity, we focus on microarray gene expression studies where genes are the markers, but note that the methodology and results are applicable to other high-throughput studies. The objectives of this study are 2-fold. First, we investigate ranking markers under three commonly adopted semiparametric models, namely the Cox, accelerated failure time and additive risk models. Data analysis shows that the ranking may vary significantly under different models. Second, we describe a nonparametric concordance measure, which has roots in the time-dependent ROC (receiver operating characteristic) framework and relies on much weaker assumptions than the semiparametric models. In simulation, it is shown that ranking using the concordance measure is not sensitive to model specification whereas ranking under the semiparametric models is. In data analysis, the concordance measure generates rankings significantly different from those under the semiparametric models.

17.

An Empirical Comparison of Parametric and Semiparametric Cure Models

Yingwei Peng K.C. Carriere 《Biometrical journal. Biometrische Zeitschrift》2002,44(8):1002-1014

Parametric and semiparametric cure models have been proposed for cure proportion estimation in cancer clinical research. In this paper, several parametric and semiparametric models are compared, and their estimation methods are discussed within the framework of the EM algorithm. We show that the semiparametric PH cure model can achieve efficiency levels similar to those of parametric cure models, provided that the failure time distribution is well specified and uncured patients have an increasing hazard rate. Therefore the semiparametric model is a viable alternative to parametric cure models. When the hazard rate of uncured patients is rapidly decreasing, the estimates from the semiparametric cure model tend to have large variations and biases. However, all other models also tend to have large variations and biases in this case.

18.

Resampling-based multiple testing methods with covariate adjustment: application to investigation of antiretroviral drug susceptibility

Yang Y Degruttola V 《Biometrics》2008,64(2):329-336

Summary. Identifying genetic mutations that cause clinical resistance to antiretroviral drugs requires adjustment for potential confounders, such as the number of active drugs in an HIV-infected patient's regimen other than the one of interest. Motivated by this problem, we investigated resampling-based methods to test equal mean response across multiple groups defined by HIV genotype, after adjustment for covariates. We consider construction of test statistics and their null distributions under two types of model: parametric and semiparametric. The covariate function is explicitly specified in the parametric but not in the semiparametric approach. The parametric approach is more precise when models are correctly specified, but suffers from bias when they are not; the semiparametric approach is more robust to model misspecification, but may be less efficient. To help preserve type I error while also improving power in both approaches, we propose resampling approaches based on matching of observations with similar covariate values. Matching reduces the impact of model misspecification as well as imprecision in estimation. These methods are evaluated via simulation studies and applied to a data set that combines results from a variety of clinical studies of salvage regimens. Our focus is on relating HIV genotype to viral susceptibility to abacavir after adjustment for the number of active antiretroviral drugs (excluding abacavir) in the patient's regimen.

19.

Comparison of Linear, Nonlinear and Semiparametric Mixed‐effects Models for Estimating HIV Dynamic Parameters

Hulin Wu Caixia Zhao Hua Liang 《Biometrical journal. Biometrische Zeitschrift》2004,46(2):233-245

The potency of antiretroviral agents in AIDS clinical trials can be assessed on the basis of an early viral response such as viral decay rate or change in viral load (number of copies of HIV RNA) of the plasma. Linear, parametric nonlinear, and semiparametric nonlinear mixed‐effects models have been proposed to estimate viral decay rates in viral dynamic models. However, before applying these models to clinical data, a critical question that remains to be addressed is whether these models produce coherent estimates of viral decay rates, and if not, which model is appropriate and should be used in practice. In this paper, we applied these models to data from an AIDS clinical trial of potent antiviral treatments and found significant incongruity in the estimated rates of reduction in viral load. Simulation studies indicated that reliable estimates of viral decay rate were obtained by using the parametric and semiparametric nonlinear mixed‐effects models. Our analysis also indicated that the decay rates estimated by using linear mixed‐effects models should be interpreted differently from those estimated by using nonlinear mixed‐effects models. The semiparametric nonlinear mixed‐effects model is preferred to other models because arbitrary data truncation is not needed. Based on real data analysis and simulation studies, we provide guidelines for estimating viral decay rates from clinical data. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

20.

Flexible maximum likelihood methods for bivariate proportional hazards models

He W Lawless JF 《Biometrics》2003,59(4):837-848

This article presents methodology for multivariate proportional hazards (PH) regression models. The methods employ flexible piecewise constant or spline specifications for baseline hazard functions in either marginal or conditional PH models, along with assumptions about the association among lifetimes. Because the models are parametric, ordinary maximum likelihood can be applied; it is able to deal easily with such data features as interval censoring or sequentially observed lifetimes, unlike existing semiparametric methods. A bivariate Clayton model (1978, Biometrika 65, 141-151) is used to illustrate the approach taken. Because a parametric assumption about association is made, efficiency and robustness comparisons are made between estimation based on the bivariate Clayton model and "working independence" methods that specify only marginal distributions for each lifetime variable.
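
For readers unfamiliar with the bivariate Clayton model mentioned above, here is a minimal sketch of its standard copula form, C(u, v) = (u^(-theta) + v^(-theta) - 1)^(-1/theta) for theta > 0, together with the implied Kendall's tau, theta / (theta + 2). The function names and the exponential margins in the usage lines are assumptions for illustration; the paper itself uses piecewise constant or spline baseline hazards.

```python
import math


def clayton_copula(u, v, theta):
    """Clayton copula C(u, v) for u, v in (0, 1] and theta > 0."""
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def clayton_kendall_tau(theta):
    """Kendall's tau implied by the Clayton parameter theta."""
    return theta / (theta + 2.0)


def joint_survival(t1, t2, rate1, rate2, theta):
    """Joint survival S(t1, t2) = C(S1(t1), S2(t2)).

    Exponential margins are a hypothetical choice made only to keep
    the example short.
    """
    s1 = math.exp(-rate1 * t1)
    s2 = math.exp(-rate2 * t2)
    return clayton_copula(s1, s2, theta)


print(clayton_copula(0.5, 0.5, 1.0))
print(clayton_kendall_tau(2.0))
print(joint_survival(1.0, 2.0, 0.5, 0.25, 1.0))
```

Larger values of theta give stronger positive dependence between the two lifetimes, and as theta approaches zero the copula approaches independence.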