Similar Articles (20 results)
1.
Dementia, Alzheimer's disease in particular, is one of the major causes of disability and decreased quality of life among the elderly and a leading obstacle to successful aging. Given the profound impact on public health, much research has focused on the age-specific risk of developing dementia and the impact on survival. Early work has discussed various methods of estimating age-specific incidence of dementia, among which the illness-death model is popular for modeling disease progression. In this article we use multiple imputation to fit multi-state models for survival data with interval censoring and left truncation. This approach allows semi-Markov models in which survival after dementia depends on onset age. Such models can be used to estimate the cumulative risk of developing dementia in the presence of the competing risk of dementia-free death. Simulations are carried out to examine the performance of the proposed method. Data from the Honolulu Asia Aging Study are analyzed to estimate the age-specific and cumulative risks of dementia and to examine the effect of major risk factors on dementia onset and death.

2.
The illness-death model is the simplest multistate model where the transition from the initial state 0 to the absorbing state 2 may involve an intermediate state 1 (e.g., disease relapse). The impact of the transition into state 1 on the subsequent transition hazard to state 2 enables insight to be gained into the disease evolution. The standard approach of analysis is modeling the transition hazards from 0 to 2 and from 1 to 2, including time to illness as a time-varying covariate and measuring time from origin even after transition into state 1. The hazard from 1 to 2 can be also modeled separately using only patients in state 1, measuring time from illness and including time to illness as a fixed covariate. A recently proposed approach is a model where time after the transition into state 1 is measured in both scales and time to illness is included as a time-varying covariate. Another possibility is a model where time after transition into state 1 is measured only from illness and time to illness is included as a fixed covariate. Through theoretical reasoning and simulation protocols, we discuss the use of these models and we develop a practical strategy aiming to (a) validate the properties of the illness-death process, (b) estimate the impact of time to illness on the hazard from state 1 to 2, and (c) quantify the impact that the transition into state 1 has on the hazard of the absorbing state. The strategy is also applied to a literature dataset on diabetes.

3.
Semi-Markov and modulated renewal processes provide a large class of multi-state models which can be used for the analysis of longitudinal failure time data. In biomedical applications, models of this kind are often used to describe the evolution of a disease and assume that a patient may move among a finite number of states representing different phases in the disease progression. Several authors have proposed extensions of the proportional hazards model for regression analysis of these processes. In this paper, we consider a general class of censored semi-Markov and modulated renewal processes and propose the use of transformation models for their analysis. Special cases include modulated renewal processes with interarrival times specified using transformation models, and semi-Markov processes with one-step transition probabilities defined using copula-transformation models. We discuss estimation of finite and infinite dimensional parameters and develop an extension of the Gaussian multiplier method for setting confidence bands for transition probabilities and related parameters. A transplant outcome data set from the Center for International Blood and Marrow Transplant Research is used for illustrative purposes.

4.
1. The predictive modelling approach to bioassessment estimates the macroinvertebrate assemblage expected at a stream site if it were in a minimally disturbed reference condition. The difference between expected and observed assemblages then measures the departure of the site from reference condition. 2. Most predictive models employ site classification, followed by discriminant function (DF) modelling, to predict the expected assemblage from a suite of environmental variables. Stepwise DF analysis is normally used to choose a single subset of DF predictor variables with a high accuracy for classifying sites. An alternative is to screen all possible combinations of predictor variables, in order to identify several 'best' subsets that yield good overall performance of the predictive model. 3. We applied best-subsets DF analysis to assemblage and environmental data from 199 reference sites in Oregon, U.S.A. Two sets of 66 best DF models containing between one and 14 predictor variables (that is, having model orders from one to 14) were developed, for five-group and 11-group site classifications. 4. Resubstitution classification accuracy of the DF models increased consistently with model order, but cross-validated classification accuracy did not improve beyond seventh- or eighth-order models, suggesting that the larger models were overfitted. 5. Overall predictive model performance at model training sites, measured by the root-mean-squared error of the observed/expected species richness ratio, also improved steadily with DF model order. But high-order DF models usually performed poorly at an independent set of validation sites, another sign of model overfitting. 6. Models selected by stepwise DF analysis showed evidence of overfitting and were outperformed by several of the best-subsets models. 7. The group separation strength of a DF model, as measured by Wilks' Λ, was more strongly correlated with overall predictive model performance at training sites than was DF classification accuracy. 8. Our results suggest improved strategies for developing reliable, parsimonious predictive models. We emphasise the value of independent validation data for obtaining a realistic picture of model performance. We also recommend assessing not just one or two, but several, candidate models based on their overall performance as well as the performance of their DF component. 9. We provide links to our free software for stepwise and best-subsets DF analysis.
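The Wilks' Λ statistic used above as a measure of group separation is simple to compute: it is the ratio of the determinant of the pooled within-group scatter matrix to the determinant of the total scatter matrix, with values near 0 indicating strong separation. A minimal sketch in Python with synthetic two-group data (the group layout and variable counts are illustrative, not from the study):

```python
import numpy as np

def wilks_lambda(X, groups):
    """Wilks' lambda = det(W) / det(T), where W is the pooled
    within-group scatter matrix and T the total scatter matrix.
    Values near 0 indicate strong group separation; 1 means none."""
    X = np.asarray(X, dtype=float)
    grand_mean = X.mean(axis=0)
    T = (X - grand_mean).T @ (X - grand_mean)        # total scatter
    W = np.zeros_like(T)
    for g in np.unique(groups):
        Xg = X[groups == g]
        cg = Xg - Xg.mean(axis=0)
        W += cg.T @ cg                               # within-group scatter
    return np.linalg.det(W) / np.linalg.det(T)

rng = np.random.default_rng(0)
# Two well-separated site groups in two environmental variables
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
groups = np.repeat([0, 1], 50)
lam_separated = wilks_lambda(X, groups)

# Same spread but no mean difference: lambda should be near 1
X0 = rng.normal(0, 1, (100, 2))
lam_none = wilks_lambda(X0, groups)
print(round(lam_separated, 3), round(lam_none, 3))
```

Because T = W + B with B positive semi-definite, Λ always lies in (0, 1], which makes it a convenient dimensionless index for comparing candidate DF models.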

5.
Liang Li, Bo Hu, Tom Greene. Biometrics, 2009, 65(3): 737-745
In many longitudinal clinical studies, the level and progression rate of repeatedly measured biomarkers on each subject quantify the severity of the disease and that subject's susceptibility to progression of the disease. It is of scientific and clinical interest to relate such quantities to a later time-to-event clinical endpoint such as patient survival. This is usually done with a shared parameter model. In such models, the longitudinal biomarker data and the survival outcome of each subject are assumed to be conditionally independent given subject-level severity or susceptibility (also called frailty in statistical terms). In this article, we study the case where the conditional distribution of longitudinal data is modeled by a linear mixed-effect model, and the conditional distribution of the survival data is given by a Cox proportional hazard model. We allow unknown regression coefficients and time-dependent covariates in both models. The proposed estimators are maximizers of an exact correction to the joint log likelihood with the frailties eliminated as nuisance parameters, an idea that originated from correction of covariate measurement error in measurement error models. The corrected joint log likelihood is shown to be asymptotically concave and leads to consistent and asymptotically normal estimators. Unlike most published methods for joint modeling, the proposed estimation procedure does not rely on distributional assumptions of the frailties. The proposed method was studied in simulations and applied to a data set from the Hemodialysis Study.

6.
Epidemiologic models used for cancer risk prediction, such as the Gail model, are validated for populations undergoing regular screening but often have suboptimal individual predictive accuracy. Risk biomarkers may be employed to improve predictive accuracy based on the Gail or other epidemiologic models and, to the extent that they are reversible, may be used to assess response in phase I–II prevention trials. Risk biomarkers used as intermediate response endpoints include high mammographic breast density, intra-epithelial neoplasia, and cytomorphology with associated molecular markers such as Ki-67. At the present time these biomarkers may not be used to predict or monitor individual response to standard prevention interventions but are used in early phase clinical trials as preliminary indicators of efficacy.

7.
Ensemble forecasting is advocated as a way of reducing uncertainty in species distribution modeling (SDM), because it is expected to balance the accuracy and robustness of SDM models. However, few data are available on the spatial similarity of the combined distribution maps generated by different consensus approaches. Here, using eight niche-based models, nine split-sample calibration bouts (i.e., nine random model-training subsets), and nine climate change scenarios, the distributions of 32 forest tree species in China were simulated under current and future climate conditions. The forecasting ensembles were combined into final consensual prediction maps for each target species using three simple consensus approaches (average, frequency, and median [PCA]). Species' geographic ranges changed (in area and in shifting distance) in response to climate change; the three consensual projections did not differ significantly in the magnitude or direction of these changes, but they did differ in the spatial similarity of the resulting maps. Incongruent areas were observed primarily at the edges of species' ranges. Multiple stepwise regression models showed three factors (niche marginality, niche specialization, and niche model accuracy) to be related to the observed variation in consensual prediction maps among consensus approaches. Spatial correspondence among prediction maps was highest when niche model accuracy was high and marginality and specialization were low. The differences among spatial predictions suggest that more attention should be paid to the range of spatial uncertainty before any decisions regarding specialist species are made based on map outputs. The niche properties and single-model predictive performance provide promising insights that may further the understanding of uncertainties in SDM.

8.
MOTIVATION: An important application of microarray technology is to relate gene expression profiles to various clinical phenotypes of patients. Success has been demonstrated in molecular classification of cancer in which the gene expression data serve as predictors and different types of cancer serve as a categorical outcome variable. However, there has been less research in linking gene expression profiles to censored survival data such as patients' overall survival time or time to cancer relapse. It would be desirable to have models with good prediction accuracy and a parsimony property. RESULTS: We propose to use L1-penalized estimation for the Cox model to select genes that are relevant to patients' survival and to build a predictive model for future prediction. The computational difficulty associated with the estimation in the high-dimensional and low-sample-size settings can be efficiently solved by using the recently developed least-angle regression (LARS) method. Our simulation studies and application to real datasets on predicting survival after chemotherapy for patients with diffuse large B-cell lymphoma demonstrate that the proposed procedure, which we call the LARS-Cox procedure, can be used for identifying important genes that are related to time to death due to cancer and for building a parsimonious model for predicting the survival of future patients. The LARS-Cox regression gives better predictive performance than the L2-penalized regression and a few other dimension-reduction-based methods. CONCLUSIONS: We conclude that the proposed LARS-Cox procedure can be very useful in identifying genes relevant to survival phenotypes and in building a parsimonious predictive model that can be used for classifying future patients into clinically relevant high- and low-risk groups based on the gene expression profile and survival times of previous patients.
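The variable-selection behaviour of the L1-penalized Cox objective described above can be illustrated without the LARS machinery: a proximal-gradient (soft-thresholding) iteration on the negative Cox log partial likelihood also drives irrelevant coefficients exactly to zero. The following is a rough sketch on synthetic data, not the authors' LARS-Cox implementation; the optimizer, step size, penalty values, and data-generating model are all illustrative assumptions:

```python
import numpy as np

def neg_log_partial_lik_grad(beta, X, time, event):
    """Gradient of the negative Cox log partial likelihood (Breslow, no ties)."""
    eta = X @ beta
    w = np.exp(eta)
    order = np.argsort(-time)                # sort by descending time
    Xs, ws, es = X[order], w[order], event[order]
    cum_w = np.cumsum(ws)                    # risk-set sums: all subjects with t_j >= t_i
    cum_wx = np.cumsum(ws[:, None] * Xs, axis=0)
    grad = np.zeros_like(beta)
    for i in range(len(time)):
        if es[i]:
            grad -= Xs[i] - cum_wx[i] / cum_w[i]
    return grad

def l1_cox(X, time, event, lam, step=0.005, iters=2000):
    """Proximal gradient (ISTA): gradient step, then soft-threshold for the L1 penalty."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        g = neg_log_partial_lik_grad(beta, X, time, event)
        z = beta - step * g
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

rng = np.random.default_rng(1)
n, p = 100, 5
X = rng.normal(size=(n, p))
# Only the first covariate truly affects the hazard
hazard = np.exp(1.5 * X[:, 0])
time = rng.exponential(1.0 / hazard)
event = np.ones(n, dtype=bool)               # no censoring in this toy example

beta_mod = l1_cox(X, time, event, lam=2.0)   # moderate penalty
beta_huge = l1_cox(X, time, event, lam=1e4)  # extreme penalty: everything shrinks to 0
print(beta_mod.round(2), beta_huge)
```

With a moderate penalty the truly predictive covariate keeps a nonzero coefficient, while an extreme penalty zeroes out the whole vector; LARS traces this entire penalty path efficiently rather than solving one penalty level at a time.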

9.
In this paper we consider a general illness-death stochastic model in which the transition intensities all vary proportionally to a time function ϕ(t). We extend Chiang's earlier work to include processes which are both reversible and have parameters which are time varying, and obtain survival time distributions and the expectation and variance of survival times.

10.
In many clinical trials, multiple time-to-event endpoints including the primary endpoint (e.g., time to death) and secondary endpoints (e.g., progression-related endpoints) are commonly used to determine treatment efficacy. These endpoints are often biologically related. This work is motivated by a study of bone marrow transplant (BMT) for leukemia patients, who may experience the acute graft-versus-host disease (GVHD), relapse of leukemia, and death after an allogeneic BMT. The acute GVHD is associated with the relapse free survival, and both the acute GVHD and relapse of leukemia are intermediate nonterminal events subject to dependent censoring by the informative terminal event death, but not vice versa, giving rise to survival data that are subject to two sets of semi-competing risks. It is important to assess the impacts of prognostic factors on these three time-to-event endpoints. We propose a novel statistical approach that jointly models such data via a pair of copulas to account for multiple dependence structures, while the marginal distribution of each endpoint is formulated by a Cox proportional hazards model. We develop an estimation procedure based on pseudo-likelihood and carry out simulation studies to examine the performance of the proposed method in finite samples. The practical utility of the proposed method is further illustrated with data from the motivating example.
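The pair-of-copulas construction above couples marginal event times through dependence parameters. As a hedged illustration of the basic building block only (a single Clayton copula, not the authors' full semi-competing-risks model), dependent event-time pairs can be generated by conditional inversion; the margins and rates below are invented for illustration:

```python
import numpy as np

def clayton_sample(n, theta, rng):
    """Sample (u, v) from a Clayton copula by conditional inversion:
    given U = u and W ~ Uniform(0,1),
    V = ((w**(-theta/(1+theta)) - 1) * u**(-theta) + 1)**(-1/theta)."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
    return u, v

rng = np.random.default_rng(2)
u, v = clayton_sample(5000, theta=2.0, rng=rng)

# Turn the dependent uniforms into correlated event times, e.g. a nonterminal
# event (GVHD) and the terminal event (death); the exponential rates are made up
t_gvhd  = -np.log(1 - u) / 0.5
t_death = -np.log(1 - v) / 0.2
corr = np.corrcoef(u, v)[0, 1]
print(round(corr, 2))
```

Positive theta induces positive dependence (for theta = 2 the Kendall's tau is theta/(theta+2) = 0.5), which is the mechanism the joint model exploits to account for dependent censoring of the nonterminal events by death.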

11.
Chronic diseases pose a tremendous global health problem of the 21st century. Epidemiological and public health models help to gain insight into the distribution and burden of chronic diseases. Moreover, the models may help to plan appropriate interventions against risk factors. To provide accurate results, models often need to take into account three different time-scales: calendar time, age, and duration since the onset of the disease. Incidence and mortality often change with age and calendar time. In many diseases such as, for example, diabetes and dementia, the mortality of the diseased persons additionally depends on the duration of the disease. The aim of this work is to describe an algorithm and a flexible software framework for the simulation of populations moving in an illness-death model that describes the epidemiology of a chronic disease across these different time-scales. We set up a discrete event simulation in continuous time involving competing risks using the freely available statistical software R. Relevant events are birth, the onset (or diagnosis) of the disease, and death with or without the disease. The Lexis diagram keeps track of the different time-scales. Input data are birth rates, incidence rates, and mortality rates, which can be given as numerical values on a grid. The algorithm manages the complex interplay between the rates and the different time-scales. As a result, for each subject in the simulated population, the algorithm provides the calendar time of birth, the age at onset of the disease (if the subject contracts the disease), and the age at death. By this means, the impact of interventions may be estimated and compared.
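The competing-risks mechanics of such a discrete event simulation can be sketched compactly. The authors work in R; the Python sketch below uses constant (age- and calendar-time-independent) rates purely for illustration, which removes the need for the Lexis-diagram bookkeeping that the real framework handles. All rate values are invented:

```python
import numpy as np

rng = np.random.default_rng(3)

# Constant transition rates (per year) -- illustrative values, not from the paper
INCIDENCE = 0.02   # healthy -> diseased
MORT_WELL = 0.03   # healthy -> dead
MORT_ILL  = 0.10   # diseased -> dead (duration-dependent in richer models)

def simulate_person():
    """Competing exponential clocks: whichever of onset and
    disease-free death fires first determines the path."""
    t_onset = rng.exponential(1 / INCIDENCE)
    t_death_well = rng.exponential(1 / MORT_WELL)
    if t_death_well <= t_onset:
        return None, t_death_well            # died disease-free
    duration = rng.exponential(1 / MORT_ILL)
    return t_onset, t_onset + duration       # (age at onset, age at death)

people = [simulate_person() for _ in range(20000)]
diseased = [p for p in people if p[0] is not None]
frac_diseased = len(diseased) / len(people)
print(round(frac_diseased, 2))   # expected INCIDENCE/(INCIDENCE+MORT_WELL) = 0.4
```

Exactly as the abstract describes, each simulated subject ends up with an onset age (if diseased) and a death age; with time-varying rates one would instead simulate event times by inverting cumulative hazards on the input rate grid.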

12.
We consider the impact of a possible intermediate event on a terminal event in an illness-death model with states 'initial', 'intermediate' and 'terminal'. One aim is to unambiguously describe the occurrence of the intermediate event in terms of the observable data, the problem being that the intermediate event may not occur. We propose to consider a random time interval, whose length is the time spent in the intermediate state. We derive an estimator of the joint distribution of the left and right limit of the random time interval from the Aalen-Johansen estimator of the matrix of transition probabilities and study its asymptotic properties. We apply our approach to hospital infection data. Estimating the distribution of the random time interval will usually be only a first step of an analysis. We illustrate this by analysing change in length of hospital stay following an infection and derive the large sample properties of the respective estimator.

13.
BACKGROUND AND AIMS: Two previous papers in this series evaluated model fit of eight thermal-germination models parameterized from constant-temperature germination data. The previous studies determined that model formulations with the fewest shape assumptions provided the best estimates of both germination rate and germination time. The purpose of this latest study was to evaluate the accuracy and efficiency of these same models in predicting germination time and relative seedlot performance under field-variable temperature scenarios. METHODS: The seeds of four rangeland grass species were germinated under 104 variable-temperature treatments simulating six planting dates at three field sites in south-western Idaho. Measured and estimated germination times for all subpopulations were compared for all models, species and temperature treatments. KEY RESULTS: All models showed similar, and relatively high, predictive accuracy for field-temperature simulations except for the iterative-probit-optimization (IPO) model, which exhibited systematic errors as a function of subpopulation. Highest efficiency was obtained with the statistical-gridding (SG) model, which could be directly parameterized by measured subpopulation rate data. Relative seedlot response predicted by thermal time coefficients was somewhat different from that estimated from mean field-variable temperature response as a function of subpopulation. CONCLUSIONS: All germination response models tested performed relatively well in estimating field-variable temperature response. IPO caused systematic errors in predictions of germination time, and may have degraded the physiological relevance of resultant cardinal-temperature parameters. Comparative indices based on expected field performance may be more ecologically relevant than indices derived from a broader range of potential thermal conditions.

14.

Background  

Microarray technology is increasingly used to identify potential biomarkers for cancer prognostics and diagnostics. Previously, we have developed the iterative Bayesian Model Averaging (BMA) algorithm for use in classification. Here, we extend the iterative BMA algorithm for application to survival analysis on high-dimensional microarray data. The main goal in applying survival analysis to microarray data is to determine a highly predictive model of patients' time to event (such as death, relapse, or metastasis) using a small number of selected genes. Our multivariate procedure combines the effectiveness of multiple contending models by calculating the weighted average of their posterior probability distributions. Our results demonstrate that our iterative BMA algorithm for survival analysis achieves high prediction accuracy while consistently selecting a small and cost-effective number of predictor genes.
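The model-averaging step at the heart of BMA can be illustrated with the standard BIC approximation to posterior model probabilities. The toy sketch below averages ordinary linear models over all subsets of four hypothetical "genes"; it illustrates the weighting idea only, not the iterative BMA algorithm for survival data described above:

```python
import itertools
import numpy as np

rng = np.random.default_rng(4)
n, p = 80, 4
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + rng.normal(size=n)      # only "gene" 0 is truly predictive

def fit_bic(cols):
    """OLS fit on a column subset; return in-sample predictions and BIC."""
    Z = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    k = Z.shape[1]
    bic = n * np.log(resid @ resid / n) + k * np.log(n)
    return Z @ beta, bic

# All 2^p candidate models (subsets of predictors)
models = [c for r in range(p + 1) for c in itertools.combinations(range(p), r)]
preds, bics = zip(*(fit_bic(m) for m in models))
bics = np.array(bics)
w = np.exp(-0.5 * (bics - bics.min()))
w /= w.sum()                                # approximate posterior model weights
y_bma = sum(wi * pi for wi, pi in zip(w, preds))   # weight-averaged prediction
best = models[int(np.argmax(w))]
print(best, round(float(w.max()), 2))
```

The BMA prediction is the weight-averaged prediction across all contending models, so models with substantial posterior support all contribute; the iterative part of the authors' algorithm repeatedly refreshes the candidate gene pool, which this sketch omits.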

15.
Many studies have investigated the relationships between electromyography (EMG) and torque production. A few investigators have used adjusted learning algorithms and feed-forward artificial neural networks (ANNs) to estimate joint torque in the elbow. This study sought to estimate net isokinetic knee torque using ANN models. Isokinetic knee extensor and flexor torque data were measured simultaneously with agonist and antagonist EMG during concentric and eccentric contractions at joint velocities of 30°/s and 60°/s. Age, gender, height, body mass, agonist EMG, antagonist EMG, joint position and joint velocity were entered as predictive variables of net torque. A three-layer ANN model was developed and trained using an adjusted back-propagation algorithm. Accuracy results were compared against those of forward stepwise regression models. Stepwise regression models included body mass, body height and joint position as the most influential predictors, followed by agonist EMG for concentric and eccentric contractions. Estimation of eccentric torque included antagonist EMG following the agonist activation. ANN models resulted in more accurate torque estimation (R=0.96) compared to the stepwise regression models (R=0.71). ANN model accuracy increased greatly when the number of hidden units increased from 5 to 10, continuing to increase gradually with additional hidden units. The average number of training epochs necessary for solution convergence and the relative accuracy of the model indicate a strong ability of the ANN model to generalize these estimations to a broader sample. The ANN model appears to be a feasible technique for estimating joint torque in the knee.
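A three-layer feed-forward network with back-propagation, of the kind described above, fits in a few dozen lines. The sketch below trains on an invented nonlinear "torque" surface using plain batch gradient descent (a simple stand-in for the paper's adjusted back-propagation algorithm); all inputs and targets are synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic stand-ins for the study's predictors (e.g. EMG, joint position,
# joint velocity); the nonlinear target surface is invented for illustration
n = 400
Xraw = rng.uniform(-1, 1, size=(n, 3))
y = (np.sin(np.pi * Xraw[:, 0]) + 0.5 * Xraw[:, 1] * Xraw[:, 2])[:, None]

def train(X, y, hidden=10, lr=0.05, epochs=2000):
    """Three-layer net (input-hidden-output), tanh hidden units,
    trained by batch gradient descent on squared error."""
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, 1));          b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # forward pass
        out = H @ W2 + b2
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass (chain rule through the two layers)
        gW2 = H.T @ err / len(X); gb2 = err.mean(axis=0)
        dH = (err @ W2.T) * (1 - H ** 2)  # derivative of tanh is 1 - tanh^2
        gW1 = X.T @ dH / len(X);  gb1 = dH.mean(axis=0)
        W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1
    return losses

losses = train(Xraw, y)
print(round(losses[0], 3), round(losses[-1], 3))
```

The hidden layer is what lets the model capture nonlinear EMG-torque relationships that a stepwise linear regression cannot, which is consistent with the accuracy gap (R=0.96 vs R=0.71) the study reports.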

16.
The key to the successful invasion of Spartina alterniflora lies in its capacity for growth and reproduction and its adaptability to the environment. Leaf functional traits such as leaf water content, relative chlorophyll content, carbon-to-nitrogen ratio, total nitrogen, total phosphorus, and specific leaf area reflect the species' ability to use resources and adapt to the environment. Taking the coastal wetlands of Yancheng, Jiangsu as the study area, we investigated the relationships between the leaf functional traits of S. alterniflora and hyperspectral data. Principal component analysis was applied to the raw spectra and to first-derivative-transformed spectra to extract new principal components, which were used as independent variables to build four prediction models for each trait: stepwise regression, BP neural network, support vector machine, and random forest. The optimal model was selected by comparing the R2 and RMSE of the fitted models, and was then rebuilt using the sensitive wavebands identified by correlation analysis to verify its accuracy and applicability. The results showed that: (1) first-derivative spectra performed better in modeling than the raw spectral data; (2) across the different functional traits, the predictive performance of the four models ranked as random forest > support vector machine > BP neural network > stepwise regression; the random forest model was accurate and stable, clearly outperforming the other three, whereas stepwise regression performed worst and is unsuitable for hyperspectral modeling of S. alterniflora leaf functional traits; (3) random forest models built on the sensitive wavebands identified by correlation analysis achieved calibration R2 above 0.90 and validation R2 between 0.73 and 0.95, further confirming the accuracy and stability of the random forest model. These results indicate that hyperspectral data can serve as a powerful means of rapidly monitoring the growth status of S. alterniflora, and that the random forest model can serve as a high-accuracy model for estimating its different leaf functional traits.
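The preprocessing pipeline described above (first-derivative transform of the spectra followed by principal component extraction) can be sketched as follows. The spectra, the absorption feature near 670 nm, and the leaf trait are all simulated stand-ins, not the Yancheng data:

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy reflectance spectra: 60 leaves x 200 bands, with a trait-driven
# absorption feature plus a smooth baseline that varies between samples
bands = np.linspace(400, 900, 200)
trait = rng.uniform(0, 1, 60)                       # e.g. leaf water content
baseline = rng.uniform(0.8, 1.2, 60)[:, None] * np.linspace(1.0, 1.2, 200)
feature = np.exp(-((bands - 670) ** 2) / 400.0)     # absorption near 670 nm
spectra = (baseline - trait[:, None] * 0.3 * feature
           + rng.normal(0, 0.002, (60, 200)))

# First-derivative transform suppresses the smooth baseline drift
deriv = np.diff(spectra, axis=1)

def pca_scores(X, k):
    """Leading principal-component scores via SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

scores = pca_scores(deriv, 3)
# Strongest (absolute) correlation of the leading components with the trait
r = max(abs(np.corrcoef(scores[:, j], trait)[0, 1]) for j in range(3))
print(round(r, 2))
```

The resulting component scores are the kind of low-dimensional predictors the study fed into its four regression models; the random forest step itself is omitted here.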

17.
In response to environmental threats, numerous indicators have been developed to assess the impact of livestock farming systems on the environment. Some of them, notably those based on management practices, have been reported to have low accuracy. This paper reports the results of a study aimed at assessing whether accuracy can be increased at a reasonable cost by combining individual indicators into models. We focused on proxy indicators representing an alternative to direct impact measurement on two grassland bird species, the lapwing Vanellus vanellus and the redshank Tringa totanus. Models were developed using stepwise selection procedures or Bayesian model averaging (BMA). Sensitivity, specificity, and the probability of correctly ranking fields (area under the curve, AUC) were estimated for each individual indicator or model from observational data measured on 252 grazed plots over 2 years. The cost of implementing each model was computed as a function of the number and types of input variables. Among all management indicators, 50% had an AUC lower than or equal to 0.50 and thus performed no better than a random decision. Independently of the statistical procedure, models combining management indicators were always more accurate than individual indicators for lapwings only. For redshanks, models based either on BMA or on some selection procedures were non-informative. Higher accuracy could be reached, for both species, with models mixing management and habitat indicators. However, this increase in accuracy was also associated with an increase in model cost. Models derived by BMA were more expensive and slightly less accurate than those derived with selection procedures. Analysing trade-offs between the accuracy and cost of indicators opens promising application perspectives, as time-consuming and expensive indicators are likely to be of low practical utility.

18.
MOTIVATION: An important area of research in the postgenomics era is to relate high-dimensional genetic or genomic data to various clinical phenotypes of patients. Due to large variability in time to certain clinical events among patients, studying possibly censored survival phenotypes can be more informative than treating the phenotypes as categorical variables. Due to high dimensionality and censoring, building a predictive model for time to event is more difficult than the classification/linear regression problem. We propose to develop a boosting procedure using smoothing splines for estimating the general proportional hazards models. Such a procedure can potentially be used for identifying non-linear effects of genes on the risk of developing an event. RESULTS: Our empirical simulation studies showed that the procedure can indeed recover the true functional forms of the covariates and can identify important variables that are related to the risk of an event. Results from predicting survival after chemotherapy for patients with diffuse large B-cell lymphoma demonstrate that the proposed method can be used for identifying important genes that are related to time to death due to cancer and for building a parsimonious model for predicting the survival of future patients. In addition, there is clear evidence of non-linear effects of some genes on survival time.

19.

Background

Women who have experienced several consecutive failing IVF cycles constitute a critical and particular subset of patients, for whom a growing perception of irremediable failure, increasing costs, and IVF treatment-related risks necessitate appropriate decision making about whether or not to start a new cycle. Predicting the chances of live birth (LB) may constitute a useful tool for discussion between the patient and the clinician. Our essential objective was to provide a simple and accurate prediction model for use in routine medical practice. The currently available predictive models, applicable to general populations, cannot be considered accurate enough for this purpose.

Methods

Patients with at least four consecutive failing cycles (CFCs) were selected. We constructed a predictive model of LB occurrence during the last cycle using stepwise logistic regression, with all baseline patient characteristics and intermediate-stage variables from the first four cycles as candidate predictors.

Results

On a set of 151 patients, we identified five determinant predictors: the number of previous cycles with at least one gestational sac (NGS), the mean number of good-quality embryos, age, male infertility (MI) aetiology, and basal FSH. Our model showed much higher discrimination than the existing models (C-statistic = 0.76) and excellent calibration.

Conclusions

Couples having experienced multiple IVF failures need precise and appropriate information to decide whether to resume or discontinue their fertility project. Our essential objective was to provide a simple and accurate prediction model suitable for routine practice. Our model is well adapted to this purpose: it is very simple, combining five easily collected variables in a short calculation, and it is more accurate than existing models, with fair discrimination and well-calibrated predictions.
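The C-statistic used above to quantify discrimination is the probability that a randomly chosen patient with a live birth received a higher predicted probability than a randomly chosen patient without one (ties counting one half). A self-contained sketch on simulated scores and outcomes (the predictors named in the comment, such as NGS, are only alluded to; nothing here reproduces the study's model):

```python
import itertools
import numpy as np

def c_statistic(p, y):
    """Concordance: among all (success, failure) pairs, the fraction where
    the success was given the higher predicted probability (ties count 1/2)."""
    pos, neg = p[y == 1], p[y == 0]
    wins = ties = 0
    for a, b in itertools.product(pos, neg):
        if a > b:
            wins += 1
        elif a == b:
            ties += 1
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

rng = np.random.default_rng(7)
n = 151
# Hypothetical linear score standing in for the five predictors (NGS, mean
# number of good-quality embryos, age, MI aetiology, basal FSH)
x = rng.normal(size=n)
prob = 1 / (1 + np.exp(-(x - 1)))              # predicted LB probability
y = (rng.uniform(size=n) < prob).astype(int)   # outcomes drawn from the model
auc = c_statistic(prob, y)
print(round(auc, 2))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the reported C-statistic of 0.76 indicates fair rather than excellent discrimination.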

20.
Statistical models of the realized niche of species are increasingly used, but systematic comparisons of alternative methods are still limited. In particular, only few studies have explored the effect of scale in model outputs. In this paper, we investigate the predictive ability of three statistical methods (generalized linear models, generalized additive models and classification tree analysis) using species distribution data at three scales: fine (Catalonia), intermediate (Portugal) and coarse (Europe). Four Mediterranean tree species were modelled for comparison. Variables selected by models were relatively consistent across scales and the predictive accuracy of models varied only slightly. However, there were slight differences in the performance of methods. Classification tree analysis had a lower accuracy than the generalized methods, especially at finer scales. The performance of generalized linear models also increased with scale. At the fine scale GLM with linear terms showed better accuracy than GLM with quadratic and polynomial terms. This is probably because distributions at finer scales represent a linear sub-sample of entire realized niches of species. In contrast to GLM, the performance of GAM was constant across scales being more data-oriented. The predictive accuracy of GAM was always at least equal to other techniques, suggesting that this modelling approach is more robust to variations of scale because it can deal with any response shape.
