A grammar-based distance metric enables fast and accurate clustering of large sets of 16S sequences |
| |
Authors: | David J Russell, Samuel F Way, Andrew K Benson, Khalid Sayood |
| |
Affiliation: | 1. Department of Electrical Engineering, University of Nebraska-Lincoln, Lincoln, USA; 2. Department of Food Science and Technology, University of Nebraska-Lincoln, Lincoln, USA; 3. Core for Applied Genomics and Ecology, University of Nebraska-Lincoln, Lincoln, USA |
| |
Abstract: | Background: We propose a sequence clustering algorithm and compare its partition quality and execution time with those of a popular existing algorithm. The proposed algorithm uses a grammar-based distance metric to partition a set of biological sequences. Clustering proceeds incrementally: each new sequence is compared with the cluster-representative sequences to determine membership, and if no suitable cluster is found, a new cluster is created. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
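
The clustering loop described in the abstract can be illustrated with a minimal sketch. The abstract does not define the grammar-based metric, so the code below swaps in an LZ78-style phrase count as a stand-in for grammar complexity; this is an assumption, not the authors' formula. The names `phrase_count`, `grammar_distance`, `greedy_cluster`, and the `threshold` parameter are hypothetical, as is the choice to assign each sequence to its closest representative and to use the first member of a cluster as that cluster's representative.

```python
from typing import List, Tuple


def phrase_count(s: str) -> int:
    """Count phrases in an LZ78-style incremental parse of s.

    Stand-in for grammar complexity (an assumption of this sketch).
    """
    seen = set()
    phrase = ""
    count = 0
    for ch in s:
        phrase += ch
        if phrase not in seen:
            # A previously unseen phrase closes here and joins the dictionary.
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:
        count += 1  # trailing, already-seen partial phrase
    return count


def grammar_distance(a: str, b: str) -> float:
    """Hypothetical distance: extra phrases needed to describe one sequence
    after the other, normalized by the larger individual phrase count."""
    ca, cb = phrase_count(a), phrase_count(b)
    denom = max(ca, cb)
    if denom == 0:
        return 0.0  # both sequences are empty
    cab, cba = phrase_count(a + b), phrase_count(b + a)
    return max(cab - ca, cba - cb) / denom


def greedy_cluster(seqs: List[str], threshold: float) -> Tuple[List[str], List[List[int]]]:
    """Compare each sequence with the cluster representatives in turn.

    Returns the representatives and, per cluster, the indices of its members.
    """
    reps: List[str] = []
    clusters: List[List[int]] = []
    for i, s in enumerate(seqs):
        best, best_d = None, threshold
        for c, r in enumerate(reps):
            d = grammar_distance(s, r)
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            # No representative is close enough: start a new cluster.
            reps.append(s)
            clusters.append([i])
        else:
            clusters[best].append(i)
    return reps, clusters
```

Calling `greedy_cluster(seqs, threshold)` on a list of sequence strings returns one representative per cluster plus the member indices. The threshold controls granularity: a lower value yields more, tighter clusters, and a suitable value for real 16S data would have to be tuned empirically.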
|