Regression Trees for Survival Data — an Approach to Select Discontinuous Split Points by Rank Statistics
Authors: Rainer Schlittgen
Abstract: Regression trees make it possible to search for meaningful explanatory variables that have a nonlinear impact on the dependent variable. They are often used when there are many covariates and one does not want to restrict attention to only a few of them. To grow a tree, at each stage one has to select a cut point for splitting a group into two subgroups. The selection is based on the maxima of the test statistics for the possible splits on each covariate. These maxima, or the resulting P-values, are compared as measures of importance. If covariates differ in their numbers of missing values or ties, or even in their measurement scales, they give rise to different numbers of tests. Unless the P-values are adjusted, covariates with more tests have a greater chance of achieving a smaller P-value. This can lead to erroneous splits even if the P-values are only looked at informally. Theoretical work by Miller and Siegmund (1982) and Lausen and Schumacher (1992) provides an adjustment rule. However, the asymptotics are based on a continuum of split points and may not lead to a fair splitting rule when applied to smaller data sets or to covariates with only a few distinct values. Here we develop an approach that allows P-values to be determined for any number of splits. The only approximation used is the normal approximation of the test statistics. The starting point for this investigation was a prospective study on the development of AIDS, which is presented here as the main application.
Keywords: AIDS; α-adjustment; Exact P-value; Maximally selected rank statistic; Regression tree; Survival analysis
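
The multiplicity problem described in the abstract can be illustrated with a small sketch. This is not the method developed in the paper. It makes three simplifying assumptions: the response is uncensored and has no ties, the split statistic is the standardized Wilcoxon rank-sum statistic, and the P-value of the maximum is estimated by Monte Carlo permutation instead of the normal approximation the abstract refers to. All function names are hypothetical.

```python
import math
import random


def standardized_rank_stats(x, y):
    """Standardized Wilcoxon rank-sum statistic for every cut point of x.

    Assumes y has no ties and no censoring (illustrative simplification).
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0] * n
    for r, i in enumerate(order, start=1):
        rank[i] = r
    # Cutting at the largest value would leave the right group empty.
    cuts = sorted(set(x))[:-1]
    stats = []
    for c in cuts:
        left = [rank[i] for i in range(n) if x[i] <= c]
        m = len(left)
        mean = m * (n + 1) / 2
        var = m * (n - m) * (n + 1) / 12
        stats.append((sum(left) - mean) / math.sqrt(var))
    return stats


def max_abs_stat(x, y):
    """Maximally selected statistic: largest |Z| over all cut points."""
    return max(abs(z) for z in standardized_rank_stats(x, y))


def unadjusted_p(z):
    """Two-sided normal P-value for one statistic, ignoring the selection."""
    return math.erfc(abs(z) / math.sqrt(2))


def permutation_p(x, y, n_perm=2000, seed=0):
    """Monte Carlo P-value of the maximum, accounting for all cut points."""
    rng = random.Random(seed)
    observed = max_abs_stat(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if max_abs_stat(x, y_perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Usage: a covariate with 30 distinct values offers 29 candidate splits.
rng = random.Random(1)
x_many = list(range(30))
y_null = [rng.random() for _ in x_many]  # response unrelated to x
z_max = max_abs_stat(x_many, y_null)
print("unadjusted P:", unadjusted_p(z_max))
print("adjusted P:  ", permutation_p(x_many, y_null, n_perm=1000))
```

Comparing covariates on the unadjusted P-value favours those with many candidate cut points, because the maximum is taken over more tests. Comparing them on the P-value of the maximum puts a binary covariate and a many-valued covariate on the same footing.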