

Statistical estimation of cluster boundaries in gene expression profile data.
Authors: K Horimoto, H Toh
Affiliation: Laboratory of Mathematics, Saga Medical School, 5-1-1 Nabeshima, Saga, Saga 849-8501, Japan. horimoto@post.saga-med.ac.jp
Abstract: MOTIVATION: Gene expression profile data are rapidly accumulating due to advances in microarray techniques. These data are analyzed by clustering procedures to extract useful information about the genes. In such analyses, determining the boundaries of gene clusters systematically, rather than by visual inspection and biological knowledge, remains challenging. RESULTS: We propose a statistical procedure to estimate the number of clusters in the hierarchical clustering of expression profiles. After hierarchical clustering, the statistical property of the profiles at each node of the dendrogram is evaluated by a statistics-based value: the variance inflation factor in multiple regression analysis. This evaluation leads to an automatic determination of the cluster boundaries without any additional analyses or any biological knowledge of the measured genes. The performance of the procedure is demonstrated on the profiles of 2467 yeast genes, with very promising results. AVAILABILITY: A set of programs will be electronically sent upon request. CONTACT: horimoto@post.saga-med.ac.jp; toh@beri.co.jp
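The abstract names the variance inflation factor (VIF) as the quantity evaluated at each dendrogram node but does not give the computation or the cut-off rule. As a minimal sketch only, the following Python shows one standard way to obtain VIFs for a group of profiles, using the identity that the VIF of variable j, 1 / (1 - R_j^2), equals the j-th diagonal entry of the inverse correlation matrix. The function name `variance_inflation_factors` and the input layout (one row per gene, one column per condition) are assumptions for illustration, not the authors' programs.

```python
import numpy as np


def variance_inflation_factors(profiles):
    """Return the VIF of each profile when regressed on the other profiles.

    profiles: array-like of shape (n_genes, n_conditions). This layout is an
    assumption; the correlation matrix must be invertible, so profiles that
    are exactly collinear are not handled here.
    """
    X = np.asarray(profiles, dtype=float)
    # np.corrcoef treats each row as one variable (here, one gene).
    R = np.corrcoef(X)
    # VIF_j = 1 / (1 - R_j^2) is the j-th diagonal entry of inv(R).
    return np.diag(np.linalg.inv(R))


# Hypothetical toy profiles for the genes under one dendrogram node.
node_profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 5.0],
]
print(variance_inflation_factors(node_profiles))
```

A large VIF indicates that a profile is nearly a linear combination of the others in the group. How such values are turned into cluster boundaries is specific to the paper and is not reproduced here.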