
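Usage patterns of 8-mer frequencies in genome sequences and their relationship to species evolution
Cite this article:Zhu Xiaoxian, Yang Zhen, Duan Chengyan, Lü Wenping, Li Hong. Usage patterns of 8-mer frequencies in genome sequences and their relationship to species evolution[J]. Chinese Journal of Bioinformatics (生物信息学), 2016, 14(4): 195-202.
Authors:Zhu Xiaoxian  Yang Zhen  Duan Chengyan  Lü Wenping  Li Hong
Affiliation:School of Physical Science and Technology, Inner Mongolia University, Huhhot 010021, China (all five authors)
Funding:National Natural Science Foundation of China (No. 31260219); National Undergraduate Innovation Training Program (No. 201512149).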
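Abstract:The non-random usage of k-mers in genome sequences, and the biological meaning it carries, has long attracted attention, yet no fundamental progress has been made. Taking all gene sequences of seven species as samples, this paper obtains the 8-mer frequency spectrum of each species' genome sequence. The spectra of dog and cow have three peaks, whereas those of zebrafish, medaka, Caenorhabditis elegans and Saccharomyces cerevisiae have a single peak; the shape of the chicken spectrum lies between the two. When the set of 8-mers is classified by XY dinucleotide content, only under the CG dinucleotide classification do the spectra of the three subsets (0CG, 1CG and 2CG) each form an independent unimodal distribution. Comparison with random sequences shows that 0CG motifs evolve randomly, while 1CG and 2CG motifs evolve directionally: their usage frequencies are far below the random frequencies, and this rule of independent, separated evolution holds across species. The distances between the spectra of the three CG subsets are the root cause of the unimodal or multimodal shape. After the genome sequences of the seven species are normalized to 10^9 bp, comparison shows that the 1CG and 2CG subset spectra correlate significantly with species evolution, whereas the 0CG subset spectrum shows no significant relationship. The three kinds of CG motifs can therefore be regarded as each performing a different biological function. The independent separation rule of 8-mers in genome sequences offers a new way of thinking about genome structure, genome evolution and the biological function of motifs.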

Keywords:genome sequence  8-mer spectrum  CG dinucleotide classification  independent separation rule  genome evolution
Received:2016/6/23
Revised:2016/8/24

Rules of 8-mer usage in genome sequences and its relation to genome evolution
Institution:School of Physical Science and Technology, Inner Mongolia University, Huhhot 010021, China (all five authors)
Abstract:The non-random usage of k-mers in genome sequences and its biological significance are important open problems, and the underlying mechanism is still unclear. Based on the genome sequences of seven species, the 8-mer frequency spectra were obtained. The 8-mer spectra of dog and cow are trimodal, whereas those of zebrafish, medaka, nematode and yeast are unimodal; the chicken spectrum is intermediate between the two shapes. When the 8-mer set is partitioned into three subsets according to the content of an XY dinucleotide, only the CG dinucleotide classification yields subsets (0CG, 1CG and 2CG) whose spectra each form an independent unimodal distribution. Comparison with random sequences shows that 0CG motifs are the result of random evolution, whereas 1CG and 2CG motifs are the result of directed evolution, with frequencies far lower than the random frequencies. This rule of independent separation of the three CG subsets holds across species. The distances between the three CG subset spectra are the primary reason why the overall 8-mer spectrum is unimodal in some species and multimodal in others. When the seven genome sequences are normalized to 10^9 bp, the spectra of 1CG and 2CG motifs correlate significantly with genome evolution, whereas the spectrum of 0CG motifs shows no obvious relation to it. We therefore suggest that the three kinds of CG motifs have different biological functions. The rule of independent separation of the three CG subsets offers a new perspective for studying genome structure and evolution, and a possible method for revealing functional elements in genome sequences.
Keywords:Genome sequence  8-mer spectrum  CG dinucleotide classification  Independent separation rule  Genome evolution
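
The counting and classification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and several details are assumptions the abstract does not state: k-mers are counted as overlapping windows on one strand, windows containing symbols other than A, C, G, T are skipped, the "2CG" class is taken to mean two or more CG dinucleotides, and normalization to 10^9 bp is a simple linear rescaling by sequence length.

```python
from collections import Counter

K = 8
TARGET_LENGTH = 10**9  # normalization length in bp, as given in the abstract


def count_kmers(seq, k=K):
    """Count overlapping k-mers on one strand (assumption: single strand).

    Windows containing any symbol other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts


def cg_class(kmer):
    """Return 0, 1 or 2 for the 0CG, 1CG and 2CG subsets.

    Assumption: '2CG' is read as 'two or more CG dinucleotides'.
    str.count is exact here because 'CG' cannot overlap itself.
    """
    return min(kmer.count("CG"), 2)


def split_by_cg(counts):
    """Partition a k-mer count table into the three CG subsets."""
    subsets = {0: {}, 1: {}, 2: {}}
    for kmer, c in counts.items():
        subsets[cg_class(kmer)][kmer] = c
    return subsets


def normalize(counts, total_length, target=TARGET_LENGTH):
    """Rescale counts linearly to a common sequence length (assumed scheme)."""
    scale = target / total_length
    return {kmer: c * scale for kmer, c in counts.items()}


def spectrum(counts):
    """Frequency spectrum: maps an occurrence count to the number of
    distinct k-mers observed with that count (unobserved k-mers are omitted)."""
    return Counter(counts.values())
```

For a genome string `genome`, `split_by_cg(count_kmers(genome))` gives one count table per CG subset, and applying `spectrum` to each table gives the three subset spectra whose separation the abstract discusses.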
This article is indexed in CNKI and other databases.