Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Spatial models for disease mapping should ideally account for covariates measured both at individual and area levels. The newly available “indiCAR” model fits the popular conditional autoregressive (CAR) model by accommodating both individual and group level covariates while adjusting for spatial correlation in the disease rates. This algorithm has been shown to be effective but assumes log‐linear associations between individual level covariates and outcome. In many studies, the relationship between individual level covariates and the outcome may be non‐log‐linear, and methods to track such nonlinearity between individual level covariate and outcome in spatial regression modeling are not well developed. In this paper, we propose a new algorithm, smooth‐indiCAR, to fit an extension to the popular conditional autoregressive model that can accommodate both linear and nonlinear individual level covariate effects while adjusting for group level covariates and spatial correlation in the disease rates. In this formulation, the effect of a continuous individual level covariate is accommodated via penalized splines. We describe a two‐step estimation procedure to obtain reliable estimates of individual and group level covariate effects where both individual and group level covariate effects are estimated separately. This distributed computing framework enhances its application in the Big Data domain with a large number of individual/group level covariates. We evaluate the performance of smooth‐indiCAR through simulation. Our results indicate that the smooth‐indiCAR method provides reliable estimates of all regression and random effect parameters. We illustrate our proposed methodology with an analysis of data on neutropenia admissions in New South Wales (NSW), Australia.

2.
Summary: In this article, we propose a positive stable shared frailty Cox model for clustered failure time data where the frailty distribution varies with cluster‐level covariates. The proposed model accounts for covariate‐dependent intracluster correlation and permits both conditional and marginal inferences. We obtain marginal inference directly from a marginal model, then use a stratified Cox‐type pseudo‐partial likelihood approach to estimate the regression coefficient for the frailty parameter. The proposed estimators are consistent and asymptotically normal and a consistent estimator of the covariance matrix is provided. Simulation studies show that the proposed estimation procedure is appropriate for practical use with a realistic number of clusters. Finally, we present an application of the proposed method to kidney transplantation data from the Scientific Registry of Transplant Recipients.

3.
Ecochard R, Clayton DG. Biometrics 2000, 56(4): 1023-1029
Delay until conception is generally described by a mixture of geometric distributions. Weinberg and Gladen (1986, Biometrics 42, 547-560) proposed a regression generalization of the beta-geometric mixture model where covariate effects were expressed in terms of contrasts of marginal hazards. Scheike and Jensen (1997, Biometrics 53, 318-329) developed a frailty model for discrete event-time data based on discrete-time analogues of Hougaard's results (1984, Biometrika 71, 75-83). This paper presents a generalization to a three-parameter family of distributions and an extension to multivariate cases. The model allows the introduction of explanatory variables, including time-dependent variables at the subject-specific level, together with a choice from a flexible family of random effect distributions. This makes it possible, in the context of medically assisted conception, to include data sources with multiple pregnancies (or attempts at pregnancy) per couple.
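As a rough illustration of the beta-geometric mixture mentioned in the abstract above (not the three-parameter model the paper proposes), the sketch below assumes each couple has a constant per-cycle conception probability p and that p follows a Beta(a, b) distribution across couples. Under that assumption the marginal per-cycle hazard is a / (a + b + t - 1), which falls with the cycle number t because more fecund couples conceive first. The function names and the example values of a and b are hypothetical.

```python
def beta_geometric_hazard(a: float, b: float, cycle: int) -> float:
    """Marginal P(conception in `cycle` | none in earlier cycles), with p ~ Beta(a, b)."""
    if cycle < 1:
        raise ValueError("cycle numbers start at 1")
    return a / (a + b + cycle - 1)


def beta_geometric_pmf(a: float, b: float, cycle: int) -> float:
    """Marginal P(first conception occurs exactly in `cycle`)."""
    survival = 1.0  # probability of no conception before `cycle`
    for k in range(1, cycle):
        survival *= 1.0 - beta_geometric_hazard(a, b, k)
    return survival * beta_geometric_hazard(a, b, cycle)


if __name__ == "__main__":
    # Illustrative (assumed) parameters: mean per-cycle probability a / (a + b) = 0.4
    for t in range(1, 7):
        print(t, round(beta_geometric_hazard(2.0, 3.0, t), 4))
```

The declining hazard arises purely from heterogeneity between couples, which is the feature that frailty models for discrete event times are designed to capture.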

4.
In randomized clinical trials where the times to event of two treatment groups are compared under a proportional hazards assumption, it has been established that omitting prognostic factors from the model entails an underestimation of the hazards ratio. Heterogeneity due to unobserved covariates in cancer patient populations is a concern since genomic investigations have revealed molecular and clinical heterogeneity in these populations. In HIV prevention trials, heterogeneity is unavoidable and has been shown to decrease the treatment effect over time. This article assesses the influence of trial duration on the bias of the estimated hazards ratio resulting from omitting covariates from the Cox analysis. The true model is defined by including an unobserved random frailty term in the individual hazard that reflects the omitted covariate. Three frailty distributions are investigated: gamma, log‐normal, and binary, and the asymptotic bias of the hazards ratio estimator is calculated. We show that the attenuation of the treatment effect resulting from unobserved heterogeneity strongly increases with trial duration, especially for continuous frailties that are likely to reflect omitted covariates, as they are often encountered in practice. The possibility of interpreting the long‐term decrease in treatment effects as a bias induced by heterogeneity and trial duration is illustrated by a trial in oncology where adjuvant chemotherapy in stage 1B NSCLC was investigated.
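The attenuation described in the abstract above can be illustrated with a standard textbook result for the gamma case; this is a sketch of the population-level (marginal) hazard ratio, not the asymptotic bias of the Cox estimator that the article derives. It assumes a conditional hazard Z * lambda0(t) * exp(beta * x) with Z gamma-distributed with mean 1 and variance theta, for which the marginal hazard ratio between arms is exp(beta) * (1 + theta * L) / (1 + theta * L * exp(beta)), where L is the cumulative baseline hazard at time t. All names and numerical values below are hypothetical.

```python
import math


def marginal_hazard_ratio(beta: float, theta: float, cum_baseline_hazard: float) -> float:
    """Marginal hazard ratio (x = 1 vs x = 0) under a gamma frailty with mean 1, variance theta."""
    if theta < 0 or cum_baseline_hazard < 0:
        raise ValueError("theta and cum_baseline_hazard must be non-negative")
    conditional_hr = math.exp(beta)
    numerator = conditional_hr * (1.0 + theta * cum_baseline_hazard)
    denominator = 1.0 + theta * cum_baseline_hazard * conditional_hr
    return numerator / denominator


if __name__ == "__main__":
    beta = math.log(0.5)  # assumed conditional (subject-level) hazard ratio of 0.5
    for cum_hazard in (0.0, 0.5, 1.0, 2.0, 5.0):  # grows with follow-up time
        print(cum_hazard, round(marginal_hazard_ratio(beta, 1.0, cum_hazard), 4))
```

As the cumulative baseline hazard grows with follow-up, the marginal ratio drifts from exp(beta) toward 1, which is consistent with the qualitative claim that longer trials show a more attenuated treatment effect when heterogeneity is ignored.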

5.
King R, Brooks SP, Coulson T. Biometrics 2008, 64(4): 1187-1195
SUMMARY: We consider the issue of analyzing complex ecological data in the presence of covariate information and model uncertainty. Several issues can arise when analyzing such data, not least the need to take into account where there are missing covariate values. This is most acutely observed in the presence of time-varying covariates. We consider mark-recapture-recovery data, where the corresponding recapture probabilities are less than unity, so that individuals are not always observed at each capture event. This often leads to a large amount of missing time-varying individual covariate information, because the covariate cannot usually be recorded if an individual is not observed. In addition, we address the problem of model selection over these covariates with missing data. We consider a Bayesian approach, where we are able to deal with large amounts of missing data, by essentially treating the missing values as auxiliary variables. This approach also allows a quantitative comparison of different models via posterior model probabilities, obtained via the reversible jump Markov chain Monte Carlo algorithm. To demonstrate this approach we analyze data relating to Soay sheep, which pose several statistical challenges in fully describing the intricacies of the system.

6.
Clustered interval‐censored data commonly arise in many studies of biomedical research where the failure time of interest is subject to interval‐censoring and subjects are correlated for being in the same cluster. A new semiparametric frailty probit regression model is proposed to study covariate effects on the failure time by accounting for the intracluster dependence. Under the proposed normal frailty probit model, the marginal distribution of the failure time is a semiparametric probit model, the regression parameters can be interpreted as both the conditional covariate effects given frailty and the marginal covariate effects up to a multiplicative constant, and the intracluster association can be summarized by two nonparametric measures in simple and explicit form. A fully Bayesian estimation approach is developed based on the use of monotone splines for the unknown nondecreasing function and a data augmentation using normal latent variables. The proposed Gibbs sampler is straightforward to implement since all unknowns have standard form in their full conditional distributions. The proposed method performs very well in estimating the regression parameters as well as the intracluster association, and the method is robust to frailty distribution misspecifications as shown in our simulation studies. Two real‐life data sets are analyzed for illustration.

7.
Summary: Time varying, individual covariates are problematic in experiments with marked animals because the covariate can typically only be observed when each animal is captured. We examine three methods to incorporate time varying, individual covariates of the survival probabilities into the analysis of data from mark‐recapture‐recovery experiments: deterministic imputation, a Bayesian imputation approach based on modeling the joint distribution of the covariate and the capture history, and a conditional approach considering only the events for which the associated covariate data are completely observed (the trinomial model). After describing the three methods, we compare results from their application to the analysis of the effect of body mass on the survival of Soay sheep (Ovis aries) on the Isle of Hirta, Scotland. Simulations based on these results are then used to make further comparisons. We conclude that both the trinomial model and Bayesian imputation method perform best in different situations. If the capture and recovery probabilities are all high, then the trinomial model produces precise, unbiased estimators that do not depend on any assumptions regarding the distribution of the covariate. In contrast, the Bayesian imputation method performs substantially better when capture and recovery probabilities are low, provided that the specified model of the covariate is a good approximation to the true data‐generating mechanism.

8.
Summary: Several statistical methods for detecting associations between quantitative traits and candidate genes in structured populations have been developed for fully observed phenotypes. However, many experiments are concerned with failure‐time phenotypes, which are usually subject to censoring. In this article, we propose statistical methods for detecting associations between a censored quantitative trait and candidate genes in structured populations with complex multiple levels of genetic relatedness among sampled individuals. The proposed methods correct for continuous population stratification using both population structure variables as covariates and the frailty terms attributable to kinship. The relationship between the time‐at‐onset data and genotypic scores at a candidate marker is modeled via a parametric Weibull frailty accelerated failure time (AFT) model as well as a semiparametric frailty AFT model, where the baseline survival function is flexibly modeled as a mixture of Polya trees centered around a family of Weibull distributions. For both parametric and semiparametric models, the frailties are modeled via an intrinsic Gaussian conditional autoregressive prior distribution with the kinship matrix being the adjacency matrix connecting subjects. Simulation studies and applications to the Arabidopsis thaliana line flowering time data sets demonstrated the advantage of the new proposals over existing approaches.

9.
Zhang H, Merikangas K. Biometrics 2000, 56(3): 815-823
Coupled with environmental factors, genes contribute to numerous human diseases and traits. While there are many epidemiological methods to assess the familial clustering of traits, few are flexible enough to accommodate interactions between covariates and familial factors. In this paper, we propose and develop a frailty model that establishes an integrated framework to evaluate familial transmission of a disease by controlling for covariate effects and conveniently testing the interactions between covariates and familial factors. We also present a peeling algorithm that dramatically reduces the computational burden. This frailty model is employed to examine the familial transmission of major subtypes of alcoholism, namely, alcohol abuse and dependence. We conclude that alcohol dependence is strongly familial whereas alcohol abuse expresses a marginally significant pattern of familial transmission. Moreover, females manifest alcoholism at a lower threshold, and there is no sex-specific familial transmission of alcoholism after adjustment for the threshold effect.

10.
We present a model that describes the distribution of recurrence times of a disease in the presence of covariate effects. After a first occurrence of the disease in an individual, the time intervals between successive cases are assumed to be independent and to follow a mixture of two distributions according to the outcome of the previous treatment. Both sub‐distributions of the model and the mixture proportion are allowed to involve covariates. Parametric inference is considered and we illustrate the methods with data on a recurrent disease and with simulations, using piecewise constant baseline hazard functions.

11.
We present a parametric family of regression models for interval-censored event-time (survival) data that accommodates both fixed (e.g. baseline) and time-dependent covariates. The model employs a three-parameter family of survival distributions that includes the Weibull, negative binomial, and log-logistic distributions as special cases, and can be applied to data with left, right, interval, or non-censored event times. Standard methods, such as Newton-Raphson, can be employed to estimate the model and the resulting estimates have an asymptotically normal distribution about the true values with a covariance matrix that is consistently estimated by the information function. The deviance function is described to assess model fit and a robust sandwich estimate of the covariance may also be employed to provide asymptotically robust inferences when the model assumptions do not apply. Spline functions may also be employed to allow for non-linear covariates. The model is applied to data from a long-term study of type 1 diabetes to describe the effects of longitudinal measures of glycemia (HbA1c) over time (the time-dependent covariate) on the risk of progression of diabetic retinopathy (eye disease), an interval-censored event-time outcome.

12.
Stubbendick AL, Ibrahim JG. Biometrics 2003, 59(4): 1140-1150
This article analyzes quality of life (QOL) data from an Eastern Cooperative Oncology Group (ECOG) melanoma trial that compared treatment with ganglioside vaccination to treatment with high-dose interferon. The analysis of this data set is challenging due to several difficulties, namely, nonignorable missing longitudinal responses and baseline covariates. Hence, we propose a selection model for estimating parameters in the normal random effects model with nonignorable missing responses and covariates. Parameters are estimated via maximum likelihood using the Gibbs sampler and a Monte Carlo expectation maximization (EM) algorithm. Standard errors are calculated using the bootstrap. The method allows for nonmonotone patterns of missing data in both the response variable and the covariates. We model the missing data mechanism and the missing covariate distribution via a sequence of one-dimensional conditional distributions, allowing the missing covariates to be either categorical or continuous, as well as time-varying. We apply the proposed approach to the ECOG quality-of-life data and conduct a small simulation study evaluating the performance of the maximum likelihood estimates. Our results indicate that a patient treated with the vaccine has a higher QOL score on average at a given time point than a patient treated with high-dose interferon.

13.
Chen B, Zhou XH. Biometrics 2011, 67(3): 830-842
Longitudinal studies often feature incomplete response and covariate data. Likelihood-based methods such as the expectation-maximization algorithm give consistent estimators for model parameters when data are missing at random (MAR) provided that the response model and the missing covariate model are correctly specified; however, we do not need to specify the missing data mechanism. An alternative method is the weighted estimating equation, which gives consistent estimators if the missing data and response models are correctly specified; however, we do not need to specify the distribution of the covariates that have missing values. In this article, we develop a doubly robust estimation method for longitudinal data with missing response and missing covariate when data are MAR. This method is appealing in that it can provide consistent estimators if either the missing data model or the missing covariate model is correctly specified. Simulation studies demonstrate that this method performs well in a variety of situations.

14.
Georeferencing error is prevalent in datasets used to model species distributions, inducing uncertainty in covariate values associated with species occurrences that results in biased probability of occurrence estimates. Traditionally, this error has been dealt with at the data‐level by using only records with an acceptable level of error (filtering) or by summarizing covariates at sampling units by using measures of central tendency (averaging). Here we compare those previous approaches to a novel implementation of a Bayesian logistic regression with measurement error (ME), a seldom used method in species distribution modeling. We show that the ME model outperforms data‐level approaches 1) on specialist species and 2) when sample sizes are small, the georeferencing error is large, or all georeferenced occurrences have a fixed level of error. Thus, for certain types of species and datasets the ME model is an effective method to reduce biases in probability of occurrence estimates and account for the uncertainty generated by georeferencing error. Our approach may be expanded for use with presence‐only data as well as to include other sources of uncertainty in species distribution models.

15.
Summary: Recurrent event data analyses are usually conducted under the assumption that the censoring time is independent of the recurrent event process. In many applications the censoring time can be informative about the underlying recurrent event process, especially in situations where a correlated failure event could potentially terminate the observation of recurrent events. In this article, we consider a semiparametric model of recurrent event data that allows correlations between censoring times and recurrent event process via frailty. This flexible framework incorporates both time-dependent and time-independent covariates in the formulation, while leaving the distributions of frailty and censoring times unspecified. We propose a novel semiparametric inference procedure that depends on neither the frailty nor the censoring time distribution. Large sample properties of the regression parameter estimates and the estimated baseline cumulative intensity functions are studied. Numerical studies demonstrate that the proposed methodology performs well for realistic sample sizes. An analysis of hospitalization data for patients in an AIDS cohort study is presented to illustrate the proposed method.

16.
In this paper, we consider selection based on the best predictor of animal additive genetic values in Gaussian linear mixed models, threshold models, Poisson mixed models, and log normal frailty models for survival data (including models with time-dependent covariates with associated fixed or random effects). In the different models, expressions are given (when these can be found – otherwise unbiased estimates are given) for prediction error variance, accuracy of selection and expected response to selection on the additive genetic scale and on the observed scale. The expressions given for non-Gaussian traits are generalisations of the well-known formulas for Gaussian traits – and reflect, for Poisson mixed models and frailty models for survival data, the hierarchical structure of the models. In general the ratio of the additive genetic variance to the total variance in the Gaussian part of the model (heritability on the normally distributed level of the model) or a generalised version of heritability plays a central role in these formulas.

17.
Association-based linkage disequilibrium (LD) mapping is an increasingly important tool for localizing genes that show potential influence on human aging and longevity. As haplotypes contain more LD information than single markers, a haplotype-based LD approach can have increased power in detecting associations as well as increased robustness in statistical testing. In this paper, we develop a new statistical model to estimate haplotype relative risks (HRRs) on human survival using unphased multilocus genotype data from unrelated individuals in cross-sectional studies. Based on the proportional hazard assumption, the model can estimate haplotype risk and frequency parameters, incorporate observed covariates, assess interactions between haplotypes and the covariates, and investigate the modes of gene function. By introducing population survival information available from population statistics, we are able to develop a procedure that carries out the parameter estimation using a nonparametric baseline hazard function and estimates sex-specific HRRs to infer gene-sex interaction. We also evaluate the haplotype effects on human survival while taking into account individual heterogeneity in the unobserved genetic and nongenetic factors or frailty by introducing the gamma-distributed frailty into the survival function. After model validation by computer simulation, we apply our method to an empirical data set to measure haplotype effects on human survival and to estimate haplotype frequencies at birth and over the observed ages. Results from both simulation and model application indicate that our survival analysis model is an efficient method for inferring haplotype effects on human survival in population-based association studies.

18.
Incomplete covariate data are a common occurrence in studies in which the outcome is survival time. Further, studies in the health sciences often give rise to correlated, possibly censored, survival data. With no missing covariate data, if the marginal distributions of the correlated survival times follow a given parametric model, then the estimates using the maximum likelihood estimating equations, naively treating the correlated survival times as independent, give consistent estimates of the relative risk parameters (Lipsitz et al. 1994, 50, 842-846). Now, suppose that some observations within a cluster have some missing covariates. We show in this paper that if one naively treats observations within a cluster as independent, one can still use the maximum likelihood estimating equations to obtain consistent estimates of the relative risk parameters. This method requires the estimation of the parameters of the distribution of the covariates. We present results from a clinical trial (Lipsitz and Ibrahim 1996b, 2, 5-14) with five covariates, four of which have some missing values. In the trial, the clusters are the hospitals in which the patients were treated.

19.
Multivariate recurrent event data are usually encountered in many clinical and longitudinal studies in which each study subject may experience multiple recurrent events. For the analysis of such data, most existing approaches have been proposed under the assumption that the censoring times are noninformative, which may not be true especially when the observation of recurrent events is terminated by a failure event. In this article, we consider regression analysis of multivariate recurrent event data with both time‐dependent and time‐independent covariates where the censoring times and the recurrent event process are allowed to be correlated via a frailty. The proposed joint model is flexible where both the distributions of censoring and frailty variables are left unspecified. We propose a pairwise pseudolikelihood approach and an estimating equation‐based approach for estimating coefficients of time‐dependent and time‐independent covariates, respectively. The large sample properties of the proposed estimates are established, while the finite‐sample properties are demonstrated by simulation studies. The proposed methods are applied to the analysis of a set of bivariate recurrent event data from a study of platelet transfusion reactions.

20.
In this paper we develop a Bayesian approach to parameter estimation in a stochastic spatio-temporal model of the spread of invasive species across a landscape. To date, statistical techniques, such as logistic and autologistic regression, have outstripped stochastic spatio-temporal models in their ability to handle large numbers of covariates. Here we seek to address this problem by making use of a range of covariates describing the bio-geographical features of the landscape. Relative to regression techniques, stochastic spatio-temporal models are more transparent in their representation of biological processes. They also explicitly model temporal change, and therefore do not require the assumption that the species' distribution (or other spatial pattern) has already reached equilibrium as is often the case with standard statistical approaches. In order to illustrate the use of such techniques we apply them to the analysis of data detailing the spread of an invasive plant, Heracleum mantegazzianum, across Britain in the 20th Century using geo-referenced covariate information describing local temperature, elevation and habitat type. The use of Markov chain Monte Carlo sampling within a Bayesian framework facilitates statistical assessments of differences in the suitability of different habitat classes for H. mantegazzianum, and enables predictions of future spread to account for parametric uncertainty and system variability. Our results show that ignoring such covariate information may lead to biased estimates of key processes and implausible predictions of future distributions.
