Similar articles
 Found 20 similar articles (search time: 622 ms)
1.
Determining the number of clusters using the weighted gap statistic   Total citations: 3, self-citations: 0, citations by others: 3
Yan M, Ye K. Biometrics, 2007, 63(4): 1031-1037
Estimating the number of clusters in a data set is a crucial step in cluster analysis. In this article, motivated by the gap method (Tibshirani, Walther, and Hastie, 2001, Journal of the Royal Statistical Society B 63, 411-423), we propose the weighted gap and the difference of difference-weighted (DD-weighted) gap methods for estimating the number of clusters in data using the weighted within-clusters sum of errors: a measure of the within-clusters homogeneity. In addition, we propose a "multilayer" clustering approach, which is shown to be more accurate than the original gap method, particularly in detecting the nested cluster structure of the data. The methods are applicable when the input data contain continuous measurements and can be used with any clustering method. The proposed methods are compared with one another and with the original gap method in simulation studies and on real data.

2.
3.
4.
The local power of the efficient scores test statistic   Total citations: 1, self-citations: 0, citations by others: 1
Harris P, Peers HW. Biometrika, 1980, 67(3): 525-529

5.
6.
7.
8.
9.
10.
11.
For the case of Lehmann alternatives, the paper presents some new facts on the Mann-Whitney statistic U and, in particular, its variance V(p, m, n), where p = P(xi < yi). Explicit formulas for U and V are used to prove, among other things, the following propositions: For any m, n, V is a one-hump function of p, and the hump always lies in the interval ((3 - √5)/2, (√5 - 1)/2). If no restrictions are imposed on p, the boundaries of this interval are sharp. Given s = m + n, V(1/2, s/2, s/2) is maximal among all values V(p, m, n). The formulas moreover allow the improvement of the known bounds for the variance of p̂ = U/(mn).
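As a rough illustration of the quantities named in the abstract above, the following sketch computes the Mann-Whitney count U, the estimate p̂ = U/(mn), and the numerical endpoints of the interval ((3 - √5)/2, (√5 - 1)/2). The function names and sample data are hypothetical, and counting ties as one half is an assumption of this sketch, not something stated in the abstract. The variance formula V(p, m, n) from the paper is not reproduced here.

```python
import math
from itertools import product


def mann_whitney_u(xs, ys):
    """Count pairs (x, y) with x < y.

    Ties are counted as one half; this convention is an assumption
    of the sketch (with continuous data ties do not occur).
    """
    u = 0.0
    for x, y in product(xs, ys):
        if x < y:
            u += 1.0
        elif x == y:
            u += 0.5
    return u


def p_hat(xs, ys):
    """Estimate p = P(x < y) by U / (m * n)."""
    return mann_whitney_u(xs, ys) / (len(xs) * len(ys))


# Endpoints of the interval that, per the abstract, always contains
# the hump of the variance V as a function of p.
LOWER = (3 - math.sqrt(5)) / 2   # about 0.382
UPPER = (math.sqrt(5) - 1) / 2   # about 0.618

# Hypothetical sample data, for illustration only.
xs = [1.2, 3.4, 2.2]
ys = [2.5, 4.1, 0.9, 3.9]

u = mann_whitney_u(xs, ys)
estimate = p_hat(xs, ys)
print(f"U = {u}, p_hat = {estimate:.4f}")
print(f"interval = ({LOWER:.4f}, {UPPER:.4f})")
```

With these made-up samples, 8 of the 12 pairs satisfy x < y, so p̂ = 8/12. The two interval endpoints sum to 1, which reflects the symmetry of the interval around p = 1/2.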

12.
13.
14.
15.
The use of the score statistic to test whether a generalised distribution gives an improved fit over a non-generalised distribution is recommended. The score statistic for a generalised exponential family is derived. Several specific examples are given.

16.
In the evaluation of a biomarker for risk prediction, one can assess the performance of the biomarker in the population of interest by displaying the predictiveness curve. In conjunction with an assessment of the classification accuracy of a biomarker, the predictiveness curve is an important tool for assessing the usefulness of a risk prediction model. Inference for a single biomarker or for multiple biomarkers can be performed using summary measures of the predictiveness curve. We propose two partial summary measures, the partial total gain and the partial proportion of explained variation, that summarize the predictiveness curve over a restricted range of risk. The methods we describe can be used to compare two biomarkers when there are existing thresholds for risk stratification. We describe inferential tools for one and two samples that are shown to have adequate power in a simulation study. The methods are illustrated by assessing the accuracy of a risk score for predicting the onset of Alzheimer's disease.

17.
Fine JP, Tsiatis AA. Biometrics, 2000, 56(1): 145-153
During the interim stages of most large-scale clinical trials, knowledge that a patient is alive or dead is usually not up-to-date. This is due to the pattern of patient visits to hospitals as well as the administrative set-up used by the study to obtain information on vital status. In a two-armed study, if the process of ascertaining vital status is not the same in both treatment groups, then the standard method of testing based on the logrank statistic may not be applicable. Instead, an ad hoc modification to the logrank test, which artificially truncates follow-up prior to the time of analysis, is often used. These approaches have not been formally addressed in the literature. In the early stages of a clinical trial, severe bias or loss of power may result. For this situation, we propose a class of test statistics that extends the usual class of U statistics. Asymptotic normality is derived by reformulating the statistics in terms of counting processes and employing the theory of U statistics along with martingale techniques. For early interim analyses, a numerical study indicates that the new tests can be more powerful than the current practice when differential ascertainment is present. To illustrate the potential loss of information when lagging follow-up to control for ascertainment delays, we reanalyze an AIDS clinical trial with the truncated logrank and the new statistics.

18.
19.
Rank-based inference for the accelerated failure time model   Total citations: 10, self-citations: 0, citations by others: 10

20.