An S-Curve-Based Approach of Identifying Biological Sequences |
| |
Authors: | Lian-peng Zhao, Ying-hua Lv, Chun Li, Ming-hai Yao, Xi-zi Jin |
| |
Institution: | (1) Bohai University, 121000 Jinzhou, China; (2) College of Computer, Northeast Normal University, 130117 Changchun, China; (3) College of Computer Science and Technology, Jilin University, 130117 Changchun, China |
| |
Abstract: | The main idea of the S-curve diagram is to assign a different angle (between 0° and 180°) to each nucleotide
or to each amino acid of a protein, and then to accumulate the values cos α_j and sin α_j along the sequence to construct
an S-curve diagram that is in strict one-to-one correspondence with the biological sequence. The S-curve diagram is also
shown to be free of degeneracy, which addresses both the degeneracy problem of graphical representations and the problem
of visualizing biological sequence data. On the basis of the S-curve diagram, a new measure for comparing biological
sequences, the degree of similarity, is put forward. In detail, the least squares method is first used to fit a straight
line to the S-curve diagram; then, using the point-to-line distance formula, the average sum of squared distances between
the S-curve and the fitted line is calculated; finally, the similarity of the biological sequences is expressed by the new
standard, the degree of similarity. According to the experimental results, the S-curve diagram gives a better representation
of biological sequences (such as proteins) in a Cartesian coordinate system and of the mutation points of a sequence. The
degree of similarity therefore shows a clear advantage as a new standard. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|
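The steps summarized in the abstract (angle assignment, accumulation of cos α_j and sin α_j, least squares line fit, point-to-line distances) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the angle table `ANGLES` is hypothetical (the abstract only says the angles lie between 0° and 180°), the function names are invented here, and the line is fitted as y = slope·x + intercept. How the resulting value is turned into the final degree of similarity between two sequences is not specified in the abstract, so the sketch stops at the averaged squared distance.

```python
import math

# Hypothetical angle assignment in degrees (an assumption for illustration;
# the abstract only states that angles lie between 0 and 180 degrees).
ANGLES = {"A": 30.0, "C": 70.0, "G": 110.0, "T": 150.0}


def s_curve(seq, angles=ANGLES):
    """Return cumulative points (x_j, y_j), where x_j sums cos(angle) and
    y_j sums sin(angle) over the first j residues of seq."""
    x = y = 0.0
    points = []
    for residue in seq:
        a = math.radians(angles[residue])
        x += math.cos(a)
        y += math.sin(a)
        points.append((x, y))
    return points


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n
    sxx = sum((px - mean_x) ** 2 for px, _ in points)
    if sxx == 0.0:
        # All x values coincide (vertical curve); this form of line is undefined.
        raise ValueError("cannot fit y = slope * x + intercept to a vertical curve")
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_distance(points, slope, intercept):
    """Average squared perpendicular distance from the points to the line
    slope * x - y + intercept = 0."""
    denom = slope * slope + 1.0
    total = sum((slope * px - py + intercept) ** 2 for px, py in points)
    return total / (denom * len(points))


if __name__ == "__main__":
    for seq in ("ACGTACGT", "ACGTTCGT"):
        pts = s_curve(seq)
        slope, intercept = fit_line(pts)
        print(seq, mean_squared_distance(pts, slope, intercept))
```

Because every assumed angle lies strictly between 0° and 180°, each step adds a positive sin value, so the y coordinate increases monotonically and the curve cannot loop back onto itself; this is one informal way to see why such a diagram avoids degeneracy.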