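Genomic sequencing analysis of Magnolia officinalis based on Pacbio's third-generation sequencing technology
Cite this article: YIN Yanpeng, DING Qiaojiao, LUO Jiawei, LIN Xinna, ZHANG Min, PENG Cheng, GAO Jihai. Genomic sequencing analysis of Magnolia officinalis based on Pacbio's third-generation sequencing technology[J]. Guihaia, 2021, 41(8): 1251-1262.
Authors: YIN Yanpeng  DING Qiaojiao  LUO Jiawei  LIN Xinna  ZHANG Min  PENG Cheng  GAO Jihai
Affiliation: Pharmacy College, Chengdu University of Traditional Chinese Medicine; State Key Laboratory of Distinctive Chinese Medicine Resources in Southwest China, Chengdu 611137, China
Funding: Supported by Sichuan Provincial Administration of Traditional Chinese Medicine Program (2018QN001, 2016ZY008); Chinese Materia Medica Innovation Team of Sichuan Science and Technology Department (2017TD0001).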
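Abstract: Magnolia officinalis is a well-known traditional medicinal plant of the genus Magnolia in the family Magnoliaceae. It is widely cultivated in China, and its bark, root bark, branch bark, leaves, flowers and fruits can all be used as medicine or food. To obtain whole-genome sequence information for M. officinalis, this study used leaf DNA as the material, built a whole-genome dataset with PacBio Sequel third-generation sequencing, and applied bioinformatic methods to assemble the resulting nucleotide sequences, annotate their functions and analyze their evolution. The results were as follows: (1) After filtering of the raw sequencing data, 140.91 Gb of third-generation data were obtained, with a read N50 of about 13 784 bp. Assembly gave a genome size of 1.68 Gb, a contig N50 of about 222 069 bp and a single-copy gene completeness of 81.0%. (2) After comparison of the assembled sequences with functional databases such as NR, KOG and KEGG, 98.40% of the genes received a functional annotation. KOG annotation showed that the protein functions of M. officinalis are concentrated in general function prediction; posttranslational modification, protein turnover and chaperones; and signal transduction mechanisms. GO classification showed that the genes are concentrated in cellular components and biological processes. KEGG analysis showed that genes involved in metabolic pathways predominate. (3) Comparison with the genomes of grape, Arabidopsis thaliana, rice, poplar, ginkgo, Amborella, tea and Cinnamomum kanehirae showed that 20 801 of the 23 424 M. officinalis genes could be classified into 12 129 families, of which 515 gene families are unique to M. officinalis. M. officinalis is closely related to Cinnamomum kanehirae (Lauraceae), and the two diverged about 122.5 million years ago (mya). This study is the first to analyze the whole genome of M. officinalis with third-generation sequencing technology. It supports further development and utilization of the species and lays a foundation for whole-genome studies of other medicinal plants.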

Keywords: Magnolia officinalis  genome  third-generation sequencing technology  gene annotation  medicinal plant
Received: 2020-03-23

Genomic sequencing analysis of Magnolia officinalis based on Pacbio's third-generation sequencing technology
YIN Yanpeng,DING Qiaojiao,LUO Jiawei,LIN Xinna,ZHANG Min,PENG Cheng,GAO Jihai.Genomic sequencing analysis of Magnolia officinalis based on Pacbio's third-generation sequencing technology[J].Guihaia,2021,41(8):1251-1262.
Authors:YIN Yanpeng  DING Qiaojiao  LUO Jiawei  LIN Xinna  ZHANG Min  PENG Cheng  GAO Jihai
Institution:Key Laboratory of Distinctive Chinese Medicine Resources in Southwest China, Pharmacy College, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
Abstract: Magnolia officinalis is a well-known traditional medicinal plant of the genus Magnolia L. in the family Magnoliaceae, and it is widely cultivated in China. Its bark, root bark, branch bark, leaves, flowers and fruits can be used as medicine or food. However, little is known about the whole genome of this species. To obtain the whole genome sequence of M. officinalis, leaf DNA was used as the material, and PacBio Sequel third-generation sequencing was used to establish its nucleotide sequence database. Genome assembly, functional annotation and evolutionary analysis were then carried out with bioinformatic methods. The results were as follows: (1) After filtering of the raw sequencing data, 140.91 Gb of third-generation data were obtained, with a read N50 of about 13 784 bp. The assembled M. officinalis genome was 1.68 Gb in size, with a contig N50 of about 222 069 bp and a single-copy gene completeness of 81.0%. (2) After comparison with functional databases such as NR, KOG and KEGG, 98.40% of the genes from the assembled sequence received a functional annotation. KOG annotation showed that the protein functions of M. officinalis were concentrated in general function prediction only; posttranslational modification, protein turnover and chaperones; and signal transduction mechanisms. GO functional classification indicated that the genes of M. officinalis were concentrated in cellular components and biological processes. KEGG analysis found that M. officinalis genes were mostly involved in metabolic pathways. (3) In the comparative genomic analysis, the M. officinalis genome was compared with the genomes of Vitis vinifera, Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Ginkgo biloba, Amborella trichopoda, Camellia sinensis and Cinnamomum kanehirae. Of the 23 424 genes in M. officinalis, 20 801 could be classified into 12 129 families, and 515 of these gene families were unique to M. officinalis. The phylogenetic tree constructed from the genomes of the selected reference species indicated that M. officinalis (Magnoliaceae) was closely related to Cinnamomum kanehirae (Lauraceae), and the divergence time between the two species was about 122.5 million years ago (mya). This study is the first to analyze the whole genome of M. officinalis with third-generation sequencing technology. It supports the further development and utilization of the species and provides information for whole-genome studies of other medicinal plants.
Keywords:Magnolia officinalis  genome  the third-generation sequencing technology  gene annotation  medicinal plant
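
The abstract reports a read N50 and a contig N50 without defining the statistic. As a minimal sketch, assuming the usual definition (the length L such that sequences of length L or longer together contain at least half of all bases), N50 can be computed as below. The function names and the toy contig lengths are illustrative only and are not taken from the paper. The two ratios at the end are derived here from figures quoted in the abstract and are not values stated by the authors; the depth estimate simply divides filtered data by assembled genome size, which gives roughly 84-fold.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Assumed definition: the largest length L such that sequences of
    length >= L together cover at least half of the total bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def approx_depth(data_gb, genome_gb):
    """Rough sequencing depth: total data divided by genome size."""
    return data_gb / genome_gb


def fraction(part, whole):
    """Share of `part` in `whole`, as a number between 0 and 1."""
    return part / whole


# Hypothetical toy assembly (not data from the paper), 32 bp in total.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print("toy N50:", n50(toy_contigs))  # 8

# Figures quoted in the abstract; the ratios themselves are derived here.
depth = approx_depth(140.91, 1.68)       # filtered data / assembled size
in_families = fraction(20801, 23424)     # genes assigned to gene families
print(f"approximate depth: {depth:.0f}x")
print(f"genes in families: {in_families:.1%}")
```

In the toy example the two 8 bp contigs already hold 16 of the 32 bases, so N50 is 8. A higher contig N50 indicates a more contiguous assembly, which is why the statistic is reported alongside genome size.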
This article is indexed in CNKI, Wanfang Data and other databases.