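Operational Rules of the Digital Coding of DNA Sequences in High-Dimensional Space
Cite this article: CHEN Wei-chang, CHEN Zhi-yi, CHEN Zhi-hua, WANG Zi-qiang, QIU Hong-xia. Operational rules of the digital coding of DNA sequences in high-dimensional space[J]. Acta Biophysica Sinica, 2001, 17(3): 542-549.
Authors: CHEN Wei-chang  CHEN Zhi-yi  CHEN Zhi-hua  WANG Zi-qiang  QIU Hong-xia
Affiliations: 1. China-Japan Friendship Institute of Clinical Medical Sciences
2. Institute of Automation, Chinese Academy of Sciences
3. Department of Biochemistry and Molecular Biology, China-Japan Friendship Institute of Clinical Medical Sciences
Funding: National Natural Science Foundation of China (39770210)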
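Abstract: Beyond encoding the base structure, functional groups, base complementarity, and hydrogen-bond strength of a DNA sequence, the binary digital coding of DNA sequences in high-dimensional space also lends itself to mathematical and logical operations. The operational rules of this coding are as follows. (1) From the parity of the digital value of a DNA sequence, its correspondence with the last base can be derived: when the value of sequence S is X(S) = 4n, 4n+1, 4n+2, 4n+3 (n = 0, 1, 2, …), the last base is C, T, A, G, respectively. (2) The concepts of the visual dimension Nv, the digital dimension Nx, and the difference dimension Nd of a DNA sequence in high-dimensional space are introduced. When Nd = 0, the first base is A or G; when Nd = 2n or 2n+1 (n = 1, 2, …), the leading bases are (C)^n or (C)^nT. (3) Operational rules for point mutations of DNA sequences (single nucleotide polymorphisms, SNPs) are derived. (4) Operational rules for DNA tandem repeats are derived. (5) The concept of a DNA subsequence is introduced, and the digital value Xi and the location value Qi of a subsequence are defined together with the formulae for computing them. (6) Operational rules are derived for the elongation, truncation, deletion, insertion, translocation, transformation, and substitution of DNA sequences. (7) Bitwise addition yields the Hamming distance dh, the base distance dh′, the functional-group distance dh″, and the conjugate distance dG between DNA sequences; the meaning of these distances and the relations among them are discussed. (8) The analysis shows that digital coding of DNA sequences has clear advantages over conventional character coding for mathematical operations.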
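Rules (1) and (2) can be checked with a short sketch. The abstract does not state the two-bit code itself, so the mapping C=00, T=01, A=10, G=11 used here is an assumption inferred from rule (1), where residues 0, 1, 2, 3 modulo 4 correspond to C, T, A, G. The function names are hypothetical. Nv is taken to be two bits per base and Nx to be the bit length of X(S), which is also an assumption about how the paper defines them.

```python
# Assumed 2-bit code, inferred from rule (1): X mod 4 = 0, 1, 2, 3 -> C, T, A, G.
CODE = {"C": 0, "T": 1, "A": 2, "G": 3}
BASES = "CTAG"  # index = 2-bit value of the base


def encode(seq: str) -> int:
    """Read a DNA string as a base-4 number, first base most significant."""
    x = 0
    for base in seq:
        x = x * 4 + CODE[base]
    return x


def last_base(x: int) -> str:
    """Rule (1): the residue of X(S) modulo 4 identifies the last base."""
    return BASES[x % 4]


def difference_dimension(seq: str) -> int:
    """Rule (2): Nd = Nv - Nx (assumed: Nv = 2 bits per base, Nx = bit length of X)."""
    nv = 2 * len(seq)
    nx = encode(seq).bit_length()
    return nv - nx


if __name__ == "__main__":
    for s in ("ACGT", "CTAG", "CCAG"):
        x = encode(s)
        print(s, x, last_base(x), difference_dimension(s))
```

Under these assumptions, "ACGT" encodes to 141 with Nd = 0 (leading A), "CTAG" gives Nd = 3 (leading CT), and "CCAG" gives Nd = 4 (leading CC), in line with rule (2).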

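Keywords: digital coding of DNA sequences  odd and even numbers  visual dimension  single nucleotide polymorphism  high-dimensional space  expressed sequence tags  tandem repeat  operational rules for DNA sequences
Article ID: 1000-6737(2001)03-0542-08
Revised manuscript received: December 29, 2000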

OPERATIONAL RULES OF THE DIGITAL CODING OF DNA SEQUENCES IN HIGH DIMENSION SPACE
CHEN Wei-chang, CHEN Zhi-yi, CHEN Zhi-hua, WANG Zi-qiang, QIU Hong-xia. OPERATIONAL RULES OF THE DIGITAL CODING OF DNA SEQUENCES IN HIGH DIMENSION SPACE[J]. Acta Biophysica Sinica, 2001, 17(3): 542-549.
Authors:CHEN Wei-chang  CHEN Zhi-yi  CHEN Zhi-hua  WANG Zi-qiang  QIU Hong-xia
Institution: CHEN Wei-chang (1), CHEN Zhi-yi (2), CHEN Zhi-hua (3), WANG Zi-qiang (1), QIU Hong-xia (1)
Abstract: Digital coding of DNA sequences offers great advantages for mathematical and logical operations. (1) From the parity of the digital value of a DNA sequence, the last nucleotide base can be determined. When the digital value of a DNA sequence is X(S) = 4n, 4n+1, 4n+2, 4n+3 (n = 0, 1, 2, …), the last nucleotide base is C, T, A, G, respectively. (2) The difference between the visual dimension Nv and the digital dimension Nx is called the difference dimension Nd of a DNA sequence. When Nd = 0, the initial nucleotide is A or G; when Nd = 2n or 2n+1 (n = 1, 2, …), the initial nucleotide bases are (C)^n or (C)^nT. (3) Operation rules for three kinds of point mutation of DNA sequences (transition, transversion and transformation) are derived. (4) The digital coding of a tandem repeat $(S_p)^n$ is $X((S_p)^n) = X(S_p)\,(2^{np}-1)/(2^{p}-1)$. (5) For a DNA sequence $S_k$ with m subsequences, $X(S_k) = \sum_{i=1}^{m} X(S_i)\,Q_i$, where $X(S_i)$ and $Q_i$ are the digital value and the location value of the DNA subsequence $S_i$, respectively. (6) The formulae for the truncation, elongation, deletion, insertion, translocation, transformation and substitution operations on DNA subsequences are also deduced. (7) The Hamming value of the even bits, Vh′, of a DNA sequence is the number of purine bases, and the Hamming value of the odd bits, Vh″, is the number of keto bases. (8) The relationships among the Hamming distance dh, the base distance db, the functional group distance df and the conjugate distance dG between two DNA sequences are also discussed.
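Items (4), (7) and (8) can be illustrated with a minimal sketch. Several things here are assumptions rather than statements from the abstract: the code C=00, T=01, A=10, G=11 is inferred from item (1); p in the tandem-repeat formula is taken to be the bit length of the repeat unit (two bits per base), which makes the formula a geometric series over p-bit blocks; and "even bits" is read as the high bit of each base, "odd bits" as the low bit. All function names are hypothetical.

```python
# Assumed 2-bit code, inferred from item (1): C=00, T=01, A=10, G=11.
CODE = {"C": 0, "T": 1, "A": 2, "G": 3}


def encode(seq: str) -> int:
    """Read a DNA string as a base-4 number, first base most significant."""
    x = 0
    for base in seq:
        x = x * 4 + CODE[base]
    return x


def tandem_repeat_value(unit: str, n: int) -> int:
    """Item (4): value of the unit repeated n times, via the closed-form sum."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    p = 2 * len(unit)  # assumption: p counts bits, two per base
    # 2**(n*p) - 1 is always divisible by 2**p - 1, so // is exact.
    return encode(unit) * (2 ** (n * p) - 1) // (2 ** p - 1)


def hamming_values(seq: str) -> tuple[int, int]:
    """Item (7): (purine count, keto count) from the high and low bits."""
    purines = sum(CODE[b] >> 1 for b in seq)  # high bit set for A and G
    ketos = sum(CODE[b] & 1 for b in seq)     # low bit set for T and G
    return purines, ketos


def hamming_distance(s1: str, s2: str) -> int:
    """Item (8): number of differing bits, obtained by bitwise XOR."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have equal length")
    return bin(encode(s1) ^ encode(s2)).count("1")


if __name__ == "__main__":
    print(tandem_repeat_value("AG", 3), encode("AG" * 3))
    print(hamming_values("ACGT"))
    print(hamming_distance("ACGT", "ACGC"), hamming_distance("ACGT", "ACGA"))
```

With this reading, the closed form agrees with direct encoding of the repeated string, and a T-to-C change differs in one bit while a T-to-A change differs in two.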
Keywords: Digital coding of DNA sequence  Parity  Visual dimension  DNA subsequence  Single nucleotide polymorphism (SNP)  Expressed sequence tags (EST)  Tandem repeat  Operation rules for DNA sequences
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.