Similar articles
Found 20 similar articles (search time: 31 ms)
1.
Ma S, Kosorok MR, Fine JP. Biometrics 2006, 62(1):202-210
As a useful alternative to Cox's proportional hazards model, the additive risk model assumes that the hazard function is the sum of the baseline hazard function and the regression function of covariates. This article is concerned with estimation and prediction for the additive risk models with right censored survival data, especially when the dimension of the covariates is comparable to or larger than the sample size. Principal component regression is proposed to give unique and numerically stable estimators. Asymptotic properties of the proposed estimators, component selection based on the weighted bootstrap, and model evaluation techniques are discussed. This approach is illustrated with analysis of the primary biliary cirrhosis clinical data and the diffuse large B-cell lymphoma genomic data. It is shown that this methodology is numerically stable and effective in dimension reduction, while still being able to provide satisfactory prediction and classification results.

2.
Regression modeling of competing crude failure probabilities   Total citations: 2 (self-citations: 0, by others: 2)
In a randomized trial of tamoxifen therapy for breast cancer, women can experience tumor recurrence or die from competing causes. One goal of analysis is to describe the effect of tamoxifen on the probabilities of recurrence or death from other causes. To this end, we propose a semi-parametric transformation model for the crude failure probabilities of a competing risk, conditional on covariates. The model is developed as an extension of the standard approach to survival data with independent right censoring. Estimation of the regression coefficients is achieved with a rank-based least squares criterion. Simulations show that the procedure works well with practical sample sizes. A separate estimating function is developed for the baseline parameter. Prediction of covariate-adjusted failure probabilities is considered. The methodology is motivated and illustrated with data from the tamoxifen trial.

3.
Summary: Clinicians are often interested in the effect of covariates on survival probabilities at prespecified study times. Because different factors can be associated with the risk of short‐ and long‐term failure, a flexible modeling strategy is pursued. Given a set of multiple candidate working models, an objective methodology is proposed that aims to construct consistent and asymptotically normal estimators of regression coefficients and average prediction error for each working model, that are free from the nuisance censoring variable. It requires the conditional distribution of censoring given covariates to be modeled. The model selection strategy uses step-up or step-down multiple hypothesis testing procedures that control either the proportion of false positives or the generalized familywise error rate when comparing models based on estimates of average prediction error. The context can actually be cast as a missing data problem, where augmented inverse probability weighted complete case estimators of regression coefficients and prediction error can be used (Tsiatis, 2006, Semiparametric Theory and Missing Data). A simulation study and an interesting analysis of a recent AIDS trial are provided.

4.
A nonproportional hazards Weibull accelerated failure time regression model   Total citations: 1 (self-citations: 0, by others: 1)
K M Anderson. Biometrics 1991, 47(1):281-288
We present a study of risk factors measured in men before age 50 and subsequent incidence of heart disease over 32 years of follow-up. The data are from the Framingham Heart Study. The standard accelerated failure time model assumes the logarithm of time until an event has a constant dispersion parameter and a location parameter that is a linear function of covariates. Parameters are estimated by maximum likelihood. We reject a standard Weibull model for these data in favor of a model with the dispersion parameter depending on the location parameter. This model suggests that the cumulative hazard ratio for two individuals shrinks towards unity over the follow-up period. Thus, not only the standard Weibull, but also the semiparametric proportional hazards (Cox) model is inadequate for these data. The model improvement appears particularly valuable when estimating the difference in predicted outcome probabilities for two individuals.
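The contrast between the standard and extended Weibull models in the abstract above can be illustrated with a small sketch. It assumes the usual Weibull accelerated failure time form, in which the cumulative hazard is H(t) = exp((log t - mu) / sigma). The function `dispersion`, its coefficients, and the two location values are hypothetical and chosen only for illustration; they are not the fitted Framingham model, and the sketch does not try to reproduce the direction of the effect reported there.

```python
import math

def cumulative_hazard(t, mu, sigma):
    """Weibull AFT cumulative hazard: H(t) = exp((log t - mu) / sigma)."""
    return math.exp((math.log(t) - mu) / sigma)

def dispersion(mu, theta0=-0.5, theta1=0.3):
    """Hypothetical log-linear link from location to dispersion (illustration only)."""
    return math.exp(theta0 + theta1 * mu)

mu_a, mu_b = 3.0, 3.5          # made-up location parameters for two individuals
times = [1.0, 10.0, 30.0]      # follow-up times in arbitrary units

# Standard model: one shared sigma, so the cumulative hazard ratio is constant in t.
sigma = 0.6
ratio_standard = [
    cumulative_hazard(t, mu_a, sigma) / cumulative_hazard(t, mu_b, sigma)
    for t in times
]

# Extended model: sigma depends on mu, so the ratio changes with t.
ratio_extended = [
    cumulative_hazard(t, mu_a, dispersion(mu_a))
    / cumulative_hazard(t, mu_b, dispersion(mu_b))
    for t in times
]

print(ratio_standard)
print(ratio_extended)
```

With a shared dispersion the ratio reduces to exp((mu_b - mu_a) / sigma), which is why the standard Weibull model is also a proportional hazards model; once the dispersion varies between individuals that property is lost.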

5.
Markov models for covariate dependence of binary sequences   Total citations: 3 (self-citations: 1, by others: 2)
Suppose that a heterogeneous group of individuals is followed over time and that each individual can be in state 0 or state 1 at each time point. The sequence of states is assumed to follow a binary Markov chain. In this paper we model the transition probabilities for the 0 to 0 and 1 to 0 transitions by two logistic regressions, thus showing how the covariates relate to changes in state. With p covariates, there are 2(p + 1) parameters including intercepts, which we estimate by maximum likelihood. We show how to use transition probability estimates to test hypotheses about the probability of occupying state 0 at time i (i = 2, ..., T) and the equilibrium probability of state 0. These probabilities depend on the covariates. A recursive algorithm is suggested to estimate regression coefficients when some responses are missing. Extensions of the basic model which allow time-dependent covariates and nonstationary or second-order Markov chains are presented. An example shows the model applied to a study of the psychological impact of breast cancer in which women did or did not manifest distress at four time points in the year following surgery.
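The quantities named in the abstract above can be sketched as follows: two logistic regressions give the 0 to 0 and 1 to 0 transition probabilities, a one-step recursion gives the probability of occupying state 0 at each time point, and the equilibrium probability follows from the stationarity condition. The coefficient and covariate values are made up, and the function names are illustrative rather than taken from the paper.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def transition_probs(x, beta_from0, beta_from1):
    """Return (p00, p10): probabilities of moving to state 0 from state 0 and from state 1.

    Each beta holds an intercept followed by p slopes, so the model has 2(p + 1) parameters.
    """
    def linear_predictor(beta):
        return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return expit(linear_predictor(beta_from0)), expit(linear_predictor(beta_from1))

def occupancy_state0(p_initial, p00, p10, T):
    """P(state 0 at time i), i = 1..T, via P_i = P_{i-1} * p00 + (1 - P_{i-1}) * p10."""
    probs = [p_initial]
    for _ in range(T - 1):
        prev = probs[-1]
        probs.append(prev * p00 + (1.0 - prev) * p10)
    return probs

def equilibrium_state0(p00, p10):
    """Stationary probability of state 0: p10 / (p10 + p01), where p01 = 1 - p00."""
    return p10 / (p10 + (1.0 - p00))

x = [1.0, 0.0]                   # p = 2 made-up covariate values
beta_from0 = [1.0, 0.5, -0.2]    # made-up coefficients for the 0 -> 0 regression
beta_from1 = [-0.5, 0.3, 0.1]    # made-up coefficients for the 1 -> 0 regression

p00, p10 = transition_probs(x, beta_from0, beta_from1)
path = occupancy_state0(0.5, p00, p10, T=4)
pi0 = equilibrium_state0(p00, p10)
print(path, pi0)
```

Because both transition probabilities depend on x, the occupancy and equilibrium probabilities are covariate-specific, which is the point made in the abstract.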

6.
In the last thirty years, there has been considerable interest in finding better models to fit data for probabilities of conception. An important early model was proposed by Barrett and Marshall (1969) and extended by Schwartz, MacDonald and Heuchel (1980). Recently, researchers have further extended these models by adding covariates. However, the increasingly complicated models are challenging to analyze with frequentist methods such as the EM algorithm. Bayesian models are more feasible, and the computation can be done via Markov chain Monte Carlo (MCMC). We consider a Bayesian model with an effect for protected intercourse to analyze data from the California Women's Reproductive Health Study and assess the effects of water contaminants and hormones. There are two main contributions in the paper. (1) For protected intercourse, we propose modeling the ratios of daily conception probabilities with protected intercourse to corresponding daily conception probabilities with unprotected intercourse. Due to the small sample size of our data set, we assume the ratios are the same for each day but unknown. (2) We consider Bayesian analysis under a unimodality assumption where the probabilities of conception increase before ovulation and decrease after ovulation. Gibbs sampling is used for finding the Bayesian estimates. There is some evidence that the two covariates affect fecundability.

7.
Bayesian multimodel inference for geostatistical regression models   Total citations: 2 (self-citations: 0, by others: 2)
Johnson DS, Hoeting JA. PLoS ONE 2011, 6(11):e25677
The problem of simultaneous covariate selection and parameter inference for spatial regression models is considered. Previous research has shown that failure to take spatial correlation into account can influence the outcome of standard model selection methods. A Markov chain Monte Carlo (MCMC) method is investigated for the calculation of parameter estimates and posterior model probabilities for spatial regression models. The method can accommodate normal and non-normal response data and a large number of covariates. Thus the method is very flexible and can be used to fit spatial linear models, spatial linear mixed models, and spatial generalized linear mixed models (GLMMs). The Bayesian MCMC method also allows a priori unequal weighting of covariates, which is not possible with many model selection methods such as Akaike's information criterion (AIC). The proposed method is demonstrated on two data sets. The first is the whiptail lizard data set which has been previously analyzed by other researchers investigating model selection methods. Our results confirmed the previous analysis suggesting that sandy soil and ant abundance were strongly associated with lizard abundance. The second data set concerned pollution-tolerant fish abundance in relation to several environmental factors. Results indicate that abundance is positively related to Strahler stream order and a habitat quality index. Abundance is negatively related to percent watershed disturbance.

8.
The importance of multispecies models for understanding complex ecological processes and interactions is beginning to be realized. Recent developments, such as those by Lahoz‐Monfort et al. (2011), have enabled synchrony in demographic parameters across multiple species to be explored. Species in a similar environment would be expected to be subject to similar exogenous factors, although their response to each of these factors may be quite different. The ability to group species together according to how they respond to a particular measured covariate may be of particular interest to ecologists. We fit a multispecies model to two sets of similar species of garden bird monitored under the British Trust for Ornithology's Garden Bird Feeding Survey. Posterior model probabilities were estimated using the reversible jump algorithm to compare posterior support for competing models with different species sharing different subsets of regression coefficients. There was frequently good agreement between species with small asynchronous random‐effect components and those with posterior support for models with shared regression coefficients; however, this was not always the case. When groups of species were less correlated, greater uncertainty was found in whether regression coefficients should be shared or not. The methods outlined in this study can test additional hypotheses about the similarities or synchrony across multiple species that share the same environment. Through the use of posterior model probabilities, estimated using the reversible jump algorithm, we can detect multispecies responses in relation to measured covariates across any combination of species and covariates under consideration. The method can account for synchrony across species in relation to measured covariates, as well as unexplained variation accounted for using random effects. For more flexible, multiparameter distributions, the support for species‐specific parameters can also be measured.

9.
Summary: Time-varying individual covariates are problematic in experiments with marked animals because the covariate can typically only be observed when each animal is captured. We examine three methods to incorporate time-varying individual covariates of the survival probabilities into the analysis of data from mark‐recapture‐recovery experiments: deterministic imputation, a Bayesian imputation approach based on modeling the joint distribution of the covariate and the capture history, and a conditional approach considering only the events for which the associated covariate data are completely observed (the trinomial model). After describing the three methods, we compare results from their application to the analysis of the effect of body mass on the survival of Soay sheep (Ovis aries) on the Isle of Hirta, Scotland. Simulations based on these results are then used to make further comparisons. We conclude that the trinomial model and the Bayesian imputation method each perform best in different situations. If the capture and recovery probabilities are all high, then the trinomial model produces precise, unbiased estimators that do not depend on any assumptions regarding the distribution of the covariate. In contrast, the Bayesian imputation method performs substantially better when capture and recovery probabilities are low, provided that the specified model of the covariate is a good approximation to the true data‐generating mechanism.

10.
Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques, however, are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be 'kernelized'. The kernelized proportional hazards model, however, yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, depending only on a small fraction of the training data. We propose two methods. One is based on a geometric idea, where, akin to support vector classification, the margin between the failed observation and the observations currently at risk is maximised. The other approach is based on obtaining a sparse model by adding observations one after another, akin to the Import Vector Machine (IVM). Data examples studied suggest that both methods can outperform competing approaches. AVAILABILITY: Software is available under the GNU Public License as an R package and can be obtained from the first author's website http://www.maths.bris.ac.uk/~maxle/software.html.

11.
1. Organisms balance current reproduction against future survival and reproduction, which results in life-history trade-offs. These trade-offs are also known as reproductive costs and may represent significant factors shaping life-history strategy for many species. 2. Using multistate mark-resight models and 26 years of mark-resight data (1979-2004), we estimated the costs of reproduction to survival and reproductive probabilities for Weddell seals in Erebus Bay, Antarctica and evaluated whether this species either conformed to the 'prudent parent' reproductive strategy predicted by life-history theory for long-lived mammals or alternatively, incurred costs to survival in order to reproduce in a variable environment (flexible-strategy hypothesis). 3. Results strongly supported the presence of reproductive costs to survival (mean annual survival probability was 0.91 for breeders vs. 0.94 for nonbreeders), a notable difference for a long-lived mammal, demonstrating that investment in reproduction does result in a cost to survival for Weddell seals, contrary to the prudent parent hypothesis. 4. Reproductive costs to subsequent reproductive probabilities were also present for first-time breeders (mean probability of breeding the next year was 31.3% lower for first-time breeders than for experienced breeders), thus supporting our prediction of the influence of breeding experience. 5. We detected substantial annual variation in survival and breeding probabilities. Breeding probabilities were negatively influenced by summer sea-ice extent, whereas weak evidence suggested that survival probabilities were affected more by winter sea-ice extent, and the direction of this effect was negative. However, a model with annual variation unrelated to any of our climate or sea-ice covariates performed best, indicating that further study will be needed to determine the appropriate mechanism or combination of mechanisms underlying this annual variation.

12.
The suitability of trees as hosts for epiphytic lichens is studied in a forest stand of 25 ha. Suitability is measured as occupation probabilities, which are modelled using a hierarchical Bayesian approach. These probabilities are useful for an ecologist. They give a smoothed spatial distribution map of suitability for each of the species and can be used in detecting high‐ and low‐probability areas. In addition, suitability is explained by tree‐level covariates. Spatial dependence, which is due to unobserved spatially structured covariates, is modelled through an unobserved Markov random field. A Markov chain Monte Carlo method has been applied in Bayesian computation. The extensive spatial data consist of the occurrences of eight lichen species and one bryophyte on all of the 1253 potential host trees. In addition, coordinates of the trees and several tree characteristics have been recorded. The data have been analysed for the four most abundant species: Lobaria pulmonaria, Nephroma bellum, Nephroma parile and Peltigera praetextata. The tree-level parameters, subject to estimation, consist of the occurrence probabilities for each tree and for each lichen species. Model validation is discussed in detail and, in addition to Bayesian validation tools, the autologistic model and a case‐control design based on logistic regression have been suggested for validation of covariate effects. As a result we present suitability maps for the four lichen species. We observed that, among the observed tree covariates, the diameter at breast height (DBH) correlates with lichen occurrence. Our modelling approach has close connections to disease mapping in spatial epidemiology.

13.
The risk difference is an intelligible measure for comparing disease incidence in two exposure or treatment groups. Despite its convenience in interpretation, it is less prevalent in epidemiological and clinical areas where regression models are required in order to adjust for confounding. One major barrier to its popularity is that standard linear binomial or Poisson regression models can provide estimated probabilities out of the range of (0,1), resulting in possible convergence issues. For estimating adjusted risk differences, we propose a general framework covering various constraint approaches based on binomial and Poisson regression models. The proposed methods span the areas of ordinary least squares, maximum likelihood estimation, and Bayesian inference. Compared to existing approaches, our methods prevent estimates and confidence intervals of predicted probabilities from falling out of the valid range. Through extensive simulation studies, we demonstrate that the proposed methods solve the issue of having estimates or confidence limits of predicted probabilities out of (0,1), while offering performance comparable to that of their alternatives in terms of the bias, variability, and coverage rates in point and interval estimation of the risk difference. An application study is performed using data from the Prospective Registry Evaluating Myocardial Infarction: Event and Recovery (PREMIER) study.
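The barrier described in the abstract above can be seen in a few lines. The sketch below computes a crude risk difference from two groups and then evaluates an identity-link (linear) risk model, in which the treatment coefficient is the adjusted risk difference. All counts and coefficients are made up for illustration, and the code shows only the out-of-range problem, not the constrained estimators the abstract proposes.

```python
def risk_difference(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Crude risk difference between two groups."""
    return events_exposed / n_exposed - events_unexposed / n_unexposed

def linear_risk(x, intercept, slopes):
    """Identity-link risk model: risk = intercept + sum(slope * covariate)."""
    return intercept + sum(b * xi for b, xi in zip(slopes, x))

# Made-up counts: 30/100 events in the exposed group, 18/100 in the unexposed group.
rd = risk_difference(30, 100, 18, 100)

# Made-up coefficients: treatment (the adjusted risk difference) and age in decades.
intercept, slopes = 0.05, [0.12, 0.15]
risk_young = linear_risk([1, 2.0], intercept, slopes)   # treated, age 20
risk_old = linear_risk([1, 6.0], intercept, slopes)     # treated, age 60

print(rd, risk_young, risk_old)
```

Nothing in the linear predictor keeps the fitted risk inside (0,1), so a covariate pattern far from the bulk of the data (here the older patient) can receive a predicted probability above 1.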

14.
Paulino CD, Soares P, Neuhaus J. Biometrics 2003, 59(3):670-675
Motivated by a study of human papillomavirus infection in women, we present a Bayesian binomial regression analysis in which the response is subject to an unconstrained misclassification process. Our iterative approach provides inferences for the parameters that describe the relationships of the covariates with the response and for the misclassification probabilities. Furthermore, our approach applies to any meaningful generalized linear model, making model selection possible. Finally, it is straightforward to extend it to multinomial settings.

15.
Peng Y, Dear KB. Biometrics 2000, 56(1):237-243
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.
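The structure of a mixture cure model like the one in the abstract above can be sketched as a population survival function that mixes a cured fraction with a proportional hazards survival function for uncured patients. Two parts of the sketch are assumptions made only so that it runs: the logistic form for the probability of being uncured, and the Weibull baseline (the abstract's model leaves the baseline nonparametric). All names and numbers are illustrative.

```python
import math

def uncured_probability(z, gamma):
    """Assumed logistic model for the probability of NOT being cured."""
    eta = gamma[0] + sum(g * zi for g, zi in zip(gamma[1:], z))
    return 1.0 / (1.0 + math.exp(-eta))

def baseline_survival(t, shape=1.2, scale=5.0):
    """Illustrative Weibull baseline survival for uncured patients."""
    return math.exp(-((t / scale) ** shape))

def population_survival(t, x, z, beta, gamma):
    """S(t) = (1 - pi) + pi * S0(t) ** exp(beta . x), with pi = P(uncured)."""
    pi = uncured_probability(z, gamma)
    hazard_ratio = math.exp(sum(b * xi for b, xi in zip(beta, x)))
    return (1.0 - pi) + pi * baseline_survival(t) ** hazard_ratio

x = z = [1.0]                  # one made-up covariate used in both model parts
beta, gamma = [0.4], [0.2, -0.7]

for t in (0.0, 2.0, 10.0, 1000.0):
    print(t, population_survival(t, x, z, beta, gamma))
```

The survival curve levels off at the cured proportion 1 - pi rather than falling to zero, which is the feature that distinguishes this model from an ordinary Cox model.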

16.
Separate Cox analyses of all cause-specific hazards are the standard technique of choice to study the effect of a covariate in competing risks, but a synopsis of these results in terms of cumulative event probabilities is challenging. This difficulty has led to the development of the proportional subdistribution hazards model. If the covariate is known at baseline, the model allows for a summarizing assessment in terms of the cumulative incidence function. Mathematically, the model also allows for including random time-dependent covariates, but practical implementation has remained unclear due to a certain risk set peculiarity. We use the intimate relationship of discrete covariates and multistate models to naturally treat time-dependent covariates within the subdistribution hazards framework. The methodology then straightforwardly translates to real-valued time-dependent covariates. As with classical survival analysis, including time-dependent covariates does not result in a model for probability functions anymore. Nevertheless, the proposed methodology provides a useful synthesis of separate cause-specific hazards analyses. We illustrate this with hospital infection data, where time-dependent covariates and competing risks are essential to the subject research question.

17.
Competing events concerning individual subjects are of interest in many medical studies. For example, leukemia-free patients surviving a bone marrow transplant are at risk of developing acute or chronic graft-versus-host disease, or they might develop infections. In this situation, competing risks models provide a natural framework to describe the disease. When incorporating covariates influencing the transition intensities, an obvious approach is to use Cox's proportional hazards model for each of the transitions separately. A practical problem then is how to deal with the abundance of regression parameters. Our objective is to describe the competing risks model in fewer parameters, both in order to avoid imprecise estimation in transitions with rare events and in order to facilitate interpretation of these estimates. Suppose that the regression parameters are gathered into a $p \times K$ matrix $B$, with $p$ and $K$ as the number of covariates and transitions, respectively. We propose the use of reduced rank models, where $B$ is required to be of lower rank $R$, smaller than both $p$ and $K$. One way to achieve this is to write $B = A\Gamma^{\top}$ with $A$ and $\Gamma$ matrices of dimensions $p \times R$ and $K \times R$, respectively. We shall outline an algorithm to obtain estimates and their standard errors in a reduced rank proportional hazards model for competing risks and illustrate the approach on a competing risks model applied to 8966 leukemia patients from the European Group for Blood and Marrow Transplantation.
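The factorization described in the abstract above can be sketched with plain nested lists. The example builds the coefficient matrix as the product of a p x R matrix and the transpose of a K x R matrix, and compares parameter counts using the dimension of the set of rank-R matrices, R(p + K - R); that count is a standard linear algebra fact and is not quoted from the abstract. The numerical values and function names are made up.

```python
def reduced_rank_coefficients(A, Gamma):
    """Return B = A Gamma^T for A (p x R) and Gamma (K x R), as nested lists."""
    return [
        [sum(a_r * g_r for a_r, g_r in zip(a_row, g_row)) for g_row in Gamma]
        for a_row in A
    ]

def n_free_parameters(p, K, R):
    """Dimension of the set of p x K matrices of rank R: R * (p + K - R)."""
    return R * (p + K - R)

A = [[0.5], [-0.2], [0.1], [0.8]]   # p = 4 covariates, R = 1 (made-up values)
Gamma = [[1.0], [0.5], [-1.0]]      # K = 3 transitions (made-up values)

B = reduced_rank_coefficients(A, Gamma)
p, K, R = len(A), len(Gamma), len(A[0])

print(B)
print("full model parameters:", p * K)
print("rank-%d model parameters:" % R, n_free_parameters(p, K, R))
```

With rank 1, every transition shares the same covariate profile up to a transition-specific scale factor, which is what makes the estimates easier to interpret when some transitions have few events.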

18.
Habitats in the Wadden Sea, a world heritage area, are affected by land subsidence resulting from natural gas extraction and by sea level rise. Here we describe a method to monitor changes in habitat types by producing sequential maps based on point information followed by mapping using a multinomial logit regression model with abiotic variables of which maps are available as predictors. In a 70 ha study area a total of 904 vegetation samples has been collected in seven sampling rounds with an interval of 2–3 years. Half of the vegetation plots were permanent, violating the assumption of independent data in multinomial logistic regression. This paper shows how this dependency can be accounted for by adding a random effect to the multinomial logit (MNL) model, thus becoming a mixed multinomial logit (MMNL) model. In principle all regression coefficients can be taken as random, but in this study only the intercepts are treated as location-specific random variables (random intercepts model). With six habitat types we have five intercepts, so that the number of extra model parameters becomes 15: 5 variances and 10 covariances. The likelihood ratio test showed that the MMNL model fitted significantly better than the MNL model with the same fixed effects. The McFadden R² for the MMNL model was 0.467, versus 0.395 for the MNL model. The estimated coefficients of the MMNL and MNL model were comparable; those of altitude, the most important predictor, differed most. The MMNL model accounts for pseudo-replication at the permanent plots, which explains the larger standard errors of the MMNL coefficients. The habitat type at a given location-year combination was predicted by the habitat type with the largest predicted probability. The series of maps shows local trends in habitat types most likely driven by sea-level rise, soil subsidence, and a restoration project. We conclude that in environmental modeling of categorical variables using panel data, dependency of repeated observations at permanent plots should be accounted for. This will affect the estimated probabilities of the categories and, even more strongly, the standard errors of the regression coefficients.
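Three pieces of arithmetic in the abstract above can be checked with a short sketch: the count of random-intercept variance and covariance parameters, the McFadden R² computed from two log-likelihoods, and prediction of the habitat type with the largest multinomial logit probability. The log-likelihood values and linear predictors below are made up (they are not the study's values), and the function names are illustrative.

```python
import math

def n_random_effect_parameters(n_categories):
    """Variances and covariances of the random intercepts (one per non-reference category)."""
    k = n_categories - 1
    n_variances = k
    n_covariances = k * (k - 1) // 2
    return n_variances, n_covariances, n_variances + n_covariances

def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo R-squared: 1 - lnL(model) / lnL(null)."""
    return 1.0 - loglik_model / loglik_null

def predict_category(linear_predictors):
    """Multinomial logit probabilities with category 0 as reference (predictor fixed at 0).

    Returns the index of the most probable category and the full probability vector.
    """
    scores = [0.0] + list(linear_predictors)
    top = max(scores)                       # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    probs = [w / total for w in weights]
    return probs.index(max(probs)), probs

print(n_random_effect_parameters(6))
print(mcfadden_r2(-53.3, -100.0))           # made-up log-likelihoods
print(predict_category([0.2, 1.5, -0.3, 0.0, -1.0]))
```

For six habitat types the first call reproduces the 5 variances, 10 covariances, and 15 extra parameters stated in the abstract.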

19.
Many flexible extensions of the Cox proportional hazards model incorporate time-dependent (TD) and/or nonlinear (NL) effects of time-invariant covariates. In contrast, little attention has been given to the assessment of such effects for continuous time-varying covariates (TVCs). We propose a flexible regression B-spline–based model for TD and NL effects of a TVC. To account for sparse TVC measurements, we added to this model the effect of time elapsed since last observation (TEL), which acts as an effect modifier. TD, NL, and TEL effects are estimated with the iterative alternative conditional estimation algorithm. Furthermore, a simulation extrapolation (SIMEX)-like procedure was adapted to correct the estimated effects for random measurement errors in the observed TVC values. In simulations, TD and NL estimates were unbiased if the TVC was measured with a high frequency. With sparse measurements, the strength of the effects was underestimated but the TEL estimate helped reduce the bias, whereas SIMEX helped further to correct for bias toward the null due to “white noise” measurement errors. We reassessed the effects of systolic blood pressure (SBP) and total cholesterol, measured at two-year intervals, on cardiovascular risks in women participating in the Framingham Heart Study. Accounting for TD effects of SBP, cholesterol and age, the NL effect of cholesterol, and the TEL effect of SBP substantially improved the model's fit to data. Flexible estimates yielded clinically important insights regarding the role of these risk factors. These results illustrate the advantages of flexible modeling of TVC effects.

20.
Peng L, Fine JP. Biometrics 2008, 64(4):1080-1089
SUMMARY: In clinical trials and observational studies, it is often of scientific interest to evaluate the effects of covariates on complex multistate event probabilities. With discrete covariates, nonparametric tests may be constructed using estimates of the relevant quantities. With continuous covariates, a common approach is to arbitrarily discretize the covariates, which may lead to substantial information loss. Another strategy is to formulate the covariate effects in a regression model. Model-based tests may have either low power or be biased under misspecification. We propose nonparametric tests not requiring arbitrary discretization. The tests involve integrals of estimates continuously indexed by dichotomizations of the covariates. General asymptotic results are derived under null and alternative hypotheses, and verified using empirical process theory in several special cases. The tests are consistent under stochastic ordering, which arises naturally with multistate data. A novel nonparametric measure of covariate effect is studied as a natural byproduct of the testing procedure. Simulation studies and two real data analyses demonstrate the gains of the new testing procedure over those based either on categorization or on regression models.
