Similar literature
Found 20 similar records (search time: 15 ms)
1.
A method for fitting piecewise exponential regression models to censored survival data is described. Stratification is performed recursively, using a combination of statistical tests and residual analysis. The splitting criterion employed in cross-validation is the average squared error of the residuals. The bootstrap is employed to keep the probability of a type I error (the error of discovering two or more strata when there is only one) of the method close to a predetermined value. The proposed method can thus also serve as a formal goodness-of-fit test for the exponential regression model. Real and simulated data are used for illustration.
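
As a rough illustration of the one-stratum versus two-strata question in entry 1, the sketch below fits a censored exponential model without covariates and compares one pooled rate against two stratum-specific rates using a likelihood-ratio statistic. This is a simplification chosen for illustration, not the recursive procedure of the abstract (which relies on residual analysis and a bootstrap); the function names are hypothetical.

```python
import math


def exp_loglik_at_mle(times, events):
    """Maximised log-likelihood of a censored exponential model.

    times  : follow-up times (list of positive numbers)
    events : 1 if the event was observed, 0 if the time is censored
    The maximum likelihood rate is (number of events) / (total follow-up time).
    """
    d = sum(events)
    total_time = sum(times)
    if d == 0:
        # No events: the estimated rate is 0 and the log-likelihood is 0.
        return 0.0
    rate = d / total_time
    return d * math.log(rate) - rate * total_time


def lr_two_strata(times_a, events_a, times_b, events_b):
    """Likelihood-ratio statistic: two stratum-specific rates vs one pooled rate."""
    pooled = exp_loglik_at_mle(times_a + times_b, events_a + events_b)
    separate = (exp_loglik_at_mle(times_a, events_a)
                + exp_loglik_at_mle(times_b, events_b))
    return 2.0 * (separate - pooled)
```

For a split fixed in advance, this statistic is usually referred to a chi-squared distribution with one degree of freedom. When the split is chosen from the data, that reference distribution no longer applies, which is presumably why the abstract controls the type I error by bootstrap.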

2.
Regression modeling of semicompeting risks data   Total citations: 1 (self-citations: 0; citations by others: 1)
Peng L, Fine JP. Biometrics, 2007, 63(1): 96-108
Semicompeting risks data are often encountered in clinical trials with intermediate endpoints subject to dependent censoring from informative dropout. Unlike with competing risks data, dropout may not be dependently censored by the intermediate event. There has recently been increased attention to these data, in particular inferences about the marginal distribution of the intermediate event without covariates. In this article, we incorporate covariates and formulate their effects on the survival function of the intermediate event via a functional regression model. To accommodate informative censoring, a time-dependent copula model is proposed in the observable region of the data which is more flexible than standard parametric copula models for the dependence between the events. The model permits estimation of the marginal distribution under weaker assumptions than in previous work on competing risks data. New nonparametric estimators for the marginal and dependence models are derived from nonlinear estimating equations and are shown to be uniformly consistent and to converge weakly to Gaussian processes. Graphical model checking techniques are presented for the assumed models. Nonparametric tests are developed accordingly, as are inferences for parametric submodels for the time-varying covariate effects and copula parameters. A novel time-varying sensitivity analysis is developed using the estimation procedures. Simulations and an AIDS data analysis demonstrate the practical utility of the methodology.

3.
Fitting piecewise linear regression functions to biological responses   Total citations: 2 (self-citations: 0; citations by others: 2)
An iterative approach was developed for fitting piecewise linear functions to nonrectilinear responses of biological variables. This algorithm is used to estimate the parameters of the two (or more) regression functions and the separation point(s) (thresholds, sensitivities) by statistical approximation. Because it is often unknown whether the response of a biological variable is adequately described by one rectilinear regression function or by piecewise linear regression function(s) with separation point(s), an F test is proposed to determine whether one regression line is the optimal fitted function. A FORTRAN-77 program has been developed for estimating the optimal parameters and the coordinates of the separation point(s). A few sets of data illustrating this kind of problem in the analysis of thermoregulation, osmoregulation, and neuronal responses are discussed.
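
The idea in entry 3 can be sketched as follows. This is a minimal example assuming a plain exhaustive search over split positions and two unconstrained least-squares lines; it is not the iterative FORTRAN-77 algorithm of the abstract, and all function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def fit_two_segments(x, y, min_pts=3):
    """Try every split of the x-sorted data and keep the one with the smallest
    total residual sum of squares. Returns (rss, k, left_fit, right_fit),
    where the first k sorted points belong to the left segment."""
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for k in range(min_pts, len(xs) - min_pts + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        rss = left[2] + right[2]
        if best is None or rss < best[0]:
            best = (rss, k, left, right)
    return best


def separation_point(left, right):
    """x-coordinate where the two fitted lines intersect (slopes must differ)."""
    return (right[0] - left[0]) / (left[1] - right[1])


def f_statistic(rss_one, rss_two, n):
    """F statistic comparing one line (2 parameters) with two lines (4 parameters)."""
    return ((rss_one - rss_two) / 2.0) / (rss_two / (n - 4))
```

Because the split position is itself estimated from the data, the usual F reference distribution is only approximate here, so the statistic is best read as a rough guide.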

4.
Li W, He C, Freudenberg J. Genomics, 2011, 97(3): 186-192
We introduce a piecewise linear regression called "hockey stick regression" to model the relationship between genetic and physical lengths of chromosomes in a genome. This piecewise linear regression is an extension of the two-parameter linear regression we proposed earlier [W. Li and J. Freudenberg, Two-parameter characterization of chromosome-scale recombination rate, Genome Res., 19 (2009) 2300-2307]. We use this, as well as the one-piece regression with a fixed y-intercept, to compare the two competing hypotheses concerning the minimum number of required chiasmata for meiosis: minimum one chiasma per chromosome (PC) and per chromosome arm (PA). Using statistical model selection and testing, we show that for human genome data, the one-piece PC model (PC1) is often in a statistical tie with the two-piece PA model (PA2). If an upper bound for the segmentation point in two-piece regression is imposed, PC is usually the preferred model. This indicates that the presence of more than one chiasma is caused by the relationship between chromosome size and chiasma formation rather than by cytogenetic constraints.

5.
Maternity length of stay (LOS) is an important measure of hospital activity, but its empirical distribution is often positively skewed. A two-component gamma mixture regression model has been proposed to analyze the heterogeneous maternity LOS. The problem is that observations collected from the same hospital are often correlated, which can lead to spurious associations and misleading inferences. To account for the inherent correlation, random effects are incorporated within the linear predictors of the two-component gamma mixture regression model. An EM algorithm is developed for the residual maximum quasi-likelihood estimation of the regression coefficients and variance component parameters. The approach enables the correct identification and assessment of risk factors affecting the short-stay and long-stay patient subgroups. In addition, the predicted random effects can provide information on the inter-hospital variations after adjustment for patient characteristics and health provision factors. A simulation study shows that the estimators obtained via the EM algorithm perform well in all the settings considered. Application to a set of maternity LOS data for women having obstetrical delivery with multiple complicating diagnoses is illustrated.

6.
Mallick BK, Denison DG, Smith AF. Biometrics, 1999, 55(4): 1071-1077
A Bayesian multivariate adaptive regression spline fitting approach is used to model univariate and multivariate survival data with censoring. The possible models contain the proportional hazards model as a subclass and automatically detect departures from this. A reversible jump Markov chain Monte Carlo algorithm is described to obtain the estimate of the hazard function as well as the survival curve.

7.
Ibrahim JG, Chen MH, Lipsitz SR. Biometrics, 1999, 55(2): 591-596
We propose a method for estimating parameters for general parametric regression models with an arbitrary number of missing covariates. We allow any pattern of missing data and assume that the missing data mechanism is ignorable throughout. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). We extend this method to continuous or mixed categorical and continuous covariates, and for arbitrary parametric regression models, by adapting a Monte Carlo version of the EM algorithm as discussed by Wei and Tanner (1990, Journal of the American Statistical Association 85, 699-704). In addition, we discuss the Gibbs sampler for sampling from the conditional distribution of the missing covariates given the observed data and show that the appropriate complete conditionals are log-concave. The log-concavity property of the conditional distributions will facilitate a straightforward implementation of the Gibbs sampler via the adaptive rejection algorithm of Gilks and Wild (1992, Applied Statistics 41, 337-348). We assume the model for the response given the covariates is an arbitrary parametric regression model, such as a generalized linear model, a parametric survival model, or a nonlinear model. We model the marginal distribution of the covariates as a product of one-dimensional conditional distributions. This allows us a great deal of flexibility in modeling the distribution of the covariates and reduces the number of nuisance parameters that are introduced in the E-step. We present examples involving both simulated and real data.

8.
The Poisson regression model for the analysis of life table and follow-up data with covariates is presented. An example is presented to show how this technique can be used to construct a parsimonious model which describes a set of survival data. The model parameters and the hazard and survival functions are estimated by maximum likelihood.
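
The link between Poisson models and life-table data in entry 8 can be illustrated with the simplest special case. The sketch below assumes a piecewise-constant hazard and no covariates, in which case the maximum likelihood hazard in each interval is deaths divided by person-time; the function names and input layout are hypothetical and are not taken from the paper.

```python
import math


def interval_hazards(deaths, person_time):
    """ML estimate of a piecewise-constant hazard: deaths / person-time per interval."""
    return [d / t for d, t in zip(deaths, person_time)]


def survival_at_interval_ends(hazards, widths):
    """Survival function at the end of each interval: exp(-cumulative hazard)."""
    cumulative = 0.0
    survival = []
    for h, w in zip(hazards, widths):
        cumulative += h * w
        survival.append(math.exp(-cumulative))
    return survival


def rate_ratio(deaths_1, time_1, deaths_0, time_0):
    """With one binary covariate and a constant baseline rate, the fitted
    Poisson rate ratio equals the ratio of the two crude rates."""
    return (deaths_1 / time_1) / (deaths_0 / time_0)
```

For example, 2 deaths in 10 person-years followed by 1 death in 5 person-years gives a hazard of 0.2 per year in both intervals. With more covariates the estimates generally have no closed form and are obtained iteratively.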

9.
Summary: Genetic association studies often investigate the effect of haplotypes on an outcome of interest. Haplotypes are not observed directly, and this complicates the inclusion of such effects in survival models. We describe a new estimating equations approach for Cox's regression model to assess haplotype effects for survival data. These estimating equations are simple to implement and avoid the use of the EM algorithm, which may be slow in the context of the semiparametric Cox model with incomplete covariate information. These estimating equations also lead to easily computable, direct estimators of standard errors, and thus overcome some of the difficulty in obtaining variance estimators based on the EM algorithm in this setting. We also develop an easily implemented goodness-of-fit procedure for Cox's regression model including haplotype effects. Finally, we apply the procedures presented in this article to investigate possible haplotype effects of the PAF-receptor on cardiovascular events in patients with coronary artery disease, and compare our results to those based on the EM algorithm.

10.
In this paper, a generalization of the Poisson regression model indexed by a shape parameter is proposed for the analysis of life table and follow-up data with concomitant variables. The model is suitable for the analysis of data with extra-Poisson variation. The model is used to fit the survival data given in Holford (1980). The model parameters and the hazard and survival functions are estimated by the method of maximum likelihood. The results obtained from this study seem to be comparable to those obtained by Chen (1988). Approximate tests of the dispersion and goodness-of-fit of the data to the model are also discussed.

11.
Microarray-CGH (comparative genomic hybridization) experiments are used to detect and map chromosomal imbalances. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose representative sequences share the same relative copy number on average. Segmentation methods constitute a natural framework for the analysis, but they do not provide a biological status for the detected segments. We propose a new model for this segmentation/clustering problem, combining a segmentation model with a mixture model. We present a new hybrid algorithm called dynamic programming-expectation maximization (DP-EM) to estimate the parameters of the model by maximum likelihood. This algorithm combines DP and the EM algorithm. We also propose a model selection heuristic to select the number of clusters and the number of segments. An example of our procedure is presented, based on publicly available data sets. We compare our method to segmentation methods and to hidden Markov models, and we show that the new segmentation/clustering model is a promising alternative that can be applied in the more general context of signal processing.

12.
A statistical technique is given which can be used to estimate the parameters of the two-component model for cell survival from quantal response multifraction data. The method is a nonlinear logistic regression and relies on a mild assumption relating the probability of death to cell survival level. The method is demonstrated on mouse colon data, where more efficient estimates of the parameters are known, and the agreement is good. For some mouse lung LD50 data we also obtain estimates of the parameters, and the fit to the data is shown to be better than that of the linear-quadratic model.

13.
MOTIVATION: DNA microarrays allow the simultaneous measurement of thousands of gene expression levels in any given patient sample. Gene expression data have been shown to correlate with survival in several cancers; however, analysis of the data is difficult, since typically at most a few hundred patients are available, resulting in severely underdetermined regression or classification models. Several approaches exist to classify patients in different risk classes; however, relatively little has been done with respect to the prediction of actual survival times. We introduce CASPAR, a novel method to predict true survival times for the individual patient based on microarray measurements. CASPAR is based on a multivariate Cox regression model that is embedded in a Bayesian framework. A hierarchical prior distribution on the regression parameters is specifically designed to deal with high dimensionality (large number of genes) and low sample size settings that are typical for microarray measurements. This enables CASPAR to automatically select small, most informative subsets of genes for prediction. RESULTS: Validity of the method is demonstrated on two publicly available datasets on diffuse large B-cell lymphoma (DLBCL) and on adenocarcinoma of the lung. The method successfully identifies long and short survivors, with high sensitivity and specificity. We compare our method with two alternative methods from the literature, demonstrating superior results of our approach. In addition, we show that CASPAR can further refine predictions made using clinical scoring systems such as the International Prognostic Index (IPI) for DLBCL and clinical staging for lung cancer, thus providing an additional tool for the clinician. An analysis of the genes identified confirms previously published results, and furthermore, new candidate genes correlated with survival are identified.

14.
MOTIVATION: It is important to predict the outcome of patients with diffuse large-B-cell lymphoma after chemotherapy, since the survival rate after treatment of this common lymphoma disease is <50%. Both clinically based outcome predictors and gene expression-based molecular factors have been proposed independently in disease prognosis. However, combining the high-dimensional genomic data and the clinically relevant information to predict disease outcome is challenging. RESULTS: We describe an integrated clinicogenomic modeling approach that combines gene expression profiles and the clinically based International Prognostic Index (IPI) for personalized prediction in disease outcome. Dimension reduction methods are proposed to produce linear combinations of gene expressions, while taking into account clinical IPI information. The extracted summary measures capture all the regression information of the censored survival phenotype given both genomic and clinical data, and are employed as covariates in the subsequent survival model formulation. A case study of diffuse large-B-cell lymphoma data and Monte Carlo simulations both demonstrate that the proposed integrative modeling improves the prediction accuracy, delivering predictions more accurate than those achieved by using either clinical data or molecular predictors alone.

15.
There is an increasing need to link the large amount of genotypic data, gathered using microarrays for example, with various phenotypic data from patients. The classification problem in which gene expression data serve as predictors and a class label phenotype as the binary outcome variable has been examined extensively, but there has been less emphasis on dealing with other types of phenotypic data. In particular, patient survival times with censoring are often not used directly as a response variable due to the complications that arise from censoring. We show that the issues involving censored data can be circumvented by reformulating the problem as a standard Poisson regression problem. The procedure for solving the transformed problem is a combination of two approaches: partial least squares, a regression technique that is especially effective when there is severe collinearity due to a large number of predictors, and generalized linear regression, which extends standard linear regression to deal with various types of response variables. The linear combinations of the original variables identified by the method are highly correlated with the patient survival times and at the same time account for the variability in the covariates. The algorithm is fast, as it does not involve any matrix decompositions in the iterations. We apply our method to data sets from lung carcinoma and diffuse large B-cell lymphoma studies to verify its effectiveness.

16.
Roy J, Lin X. Biometrics, 2000, 56(4): 1047-1054
Multiple outcomes are often used to properly characterize an effect of interest. This paper proposes a latent variable model for the situation where repeated measures over time are obtained on each outcome. These outcomes are assumed to measure an underlying quantity of main interest from different perspectives. We relate the observed outcomes using regression models to a latent variable, which is then modeled as a function of covariates by a separate regression model. Random effects are used to model the correlation due to repeated measures of the observed outcomes and the latent variable. An EM algorithm is developed to obtain maximum likelihood estimates of model parameters. Unit-specific predictions of the latent variables are also calculated. This method is illustrated using data from a national panel study on changes in methadone treatment practices.

17.
A longitudinal approach is proposed to map QTL affecting function-valued traits and to estimate their effect over time. The method is based on fitting mixed random regression models. The QTL allelic effects are modelled with random coefficient parametric curves and using a gametic relationship matrix. A simulation study was conducted in order to assess the ability of the approach to fit different patterns of QTL over time. It was found that this longitudinal approach was able to adequately fit the simulated variance functions and considerably improved the power of detection of time-varying QTL effects compared to the traditional univariate model. This was confirmed by an analysis of protein yield data in dairy cattle, where the model was able to detect QTL with a high effect at either the beginning or the end of the lactation that were not detected with a simple 305-day model.

18.
A random regression model for the analysis of "repeated" records in animal breeding is described which combines a random regression approach for additive genetic and other random effects with the assumption of a parametric correlation structure for within-animal covariances. Both stationary and non-stationary correlation models involving a small number of parameters are considered. Heterogeneity in within-animal variances is modelled through polynomial variance functions. Estimation of parameters describing the dispersion structure of such a model by restricted maximum likelihood via an "average information" algorithm is outlined. An application to mature weight records of beef cows is given, and results are contrasted to those from analyses fitting sets of random regression coefficients for permanent environmental effects.

19.
Gray RJ. Biometrics, 2000, 56(2): 571-576
An estimator of the regression parameters in a semiparametric transformed linear survival model is examined. This estimator consists of a single Newton-like update of the solution to a rank-based estimating equation from an initial consistent estimator. An automated penalized likelihood algorithm is proposed for estimating the optimal weight function for the estimating equations and the error hazard function that is needed in the variance estimator. In simulations, the estimated optimal weights are found to give reasonably efficient estimators of the regression parameters, and the variance estimators are found to perform well. The methodology is applied to an analysis of prognostic factors in non-Hodgkin's lymphoma.

20.
A common problem in mapping quantitative trait loci (QTLs) is that marker data are often incomplete. This includes missing data, dominant markers, and partially informative markers, arising in outbred populations. Here we briefly present an iteratively re-weighted least squares (IRWLS) method to incorporate dominant and missing markers for mapping QTLs in four-way crosses under a heterogeneous variance model. The algorithm uses information from all markers in a linkage group to infer the QTL genotype. Monte Carlo simulations indicate that with half dominant markers, QTL detection is almost as efficient as with all co-dominant markers. However, the precision of the estimated QTL parameters generally decreases as more markers become missing or dominant. Notable differences are observed in the standard deviation of the estimated QTL position for varying levels of marker information content. The method is relatively simple, so that more complex models including multiple QTLs or fixed effects can be fitted. Finally, the method can be readily extended to QTL mapping in full-sib families. Received: 16 June 1998 / Accepted: 29 September 1998
