Found 20 similar articles (search time: 156 ms)
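Second- and third-generation sequencing technologies were each used to sequence the enoki mushroom (金针菇) monokaryotic strain "6-3", and four assembly strategies were applied for de novo genome assembly so that their results could be compared. In terms of assembly metrics, the assembly built from second-generation data alone performed worst: contigs longer than 10 kb totaled only 24.6 Mb, the contig N50 was only 23 kb, and the assembly rate was only 59.27%. The strategy of third-generation assembly followed by second-generation correction performed best: contigs longer than 10 kb totaled 38.3 Mb, the contig N50 was 2.8 Mb, and the assembly rate reached 92.16%. In terms of conserved single-copy gene recovery, the genome sequences from all four strategies were compared against the conserved single-copy genes of Basidiomycota in the BUSCO database, and gene completeness exceeded 94% in every case. In terms of assembly accuracy, PCR amplification and Sanger sequencing confirmed that the genome sequence from third-generation assembly with second-generation correction was complete and contiguous and carried the fewest SNPs and InDels. In summary, the genome sequence obtained by third-generation assembly with second-generation correction has a large contig N50, a high assembly rate, and high base accuracy, making it a fairly ideal approach for genome sequencing of edible fungi.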
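Contig N50, the main contiguity metric quoted in these abstracts, is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length. The sketch below illustrates that definition only; the function name `n50` and the toy contig lengths are hypothetical and are not taken from any of the studies listed here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly, for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because the metric depends only on the sorted lengths, a few very long contigs can raise N50 sharply, which is consistent with the jump from 23 kb to 2.8 Mb reported when long reads are used for assembly.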

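Kangfuxin liquid (康复新液) and other medicines produced from the American cockroach, Periplaneta americana, show marked clinical efficacy and are widely used. Using medicinal American cockroaches reared by 四川好医生攀西药业有限责任公司 (a pharmaceutical company in Sichuan) as material, this study carried out whole-genome sequencing for the first time on the Illumina HiSeq 2000 and PacBio SMRT platforms, followed by genome assembly, annotation, and analysis. Filtering of the raw data yielded 1.4 Tb of second-generation and 33.81 Gb of third-generation sequencing data. The assembly indicates a genome size of 3.26 Gb, second among reported insect genomes only to that of the migratory locust, Locusta migratoria. The repeat content is 62.38% and the heterozygosity is 0.635%, indicating a complex genome. The contig N50 and scaffold N50 are 28.2 kb and 315 kb, respectively, single-copy gene completeness is 88.1%, and the average mapping rate of the small-insert library reads is 99.8%, so the sequencing and assembly quality meets the requirements of downstream analysis. Three approaches (de novo prediction, homology-based prediction, and transcript-based prediction) together annotated 14,568 genes, 92.4% of which received functional annotation. This study is the first to complete whole-genome sequencing of P. americana and provides the first genome for the genus Periplaneta, laying an important foundation for genetic and evolutionary analysis of the American cockroach and for mining its medicinal gene resources.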

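Houpo (厚朴, Magnolia officinalis) is a well-known traditional medicinal plant of the genus Magnolia in the family Magnoliaceae. It is widely cultivated in China, and its bark, root bark, branch bark, leaves, flowers, and fruits can all be used as medicine or food. To obtain whole-genome sequence information for Houpo, this study used leaf DNA as material, built a whole-genome dataset with PacBio Sequel third-generation sequencing, and applied bioinformatic methods to assemble the sequences, annotate gene functions, and carry out evolutionary analysis. The results were as follows. (1) Filtering of the raw data yielded 140.91 Gb of third-generation data with a read N50 of about 13,784 bp; assembly gave a genome size of 1.68 Gb, a contig N50 of about 222,069 bp, and a single-copy gene completeness of 81.0%. (2) Comparison of the assembled sequences with functional databases such as NR, KOG, and KEGG gave functional annotations for 98.40% of the genes. KOG annotation showed that protein functions were concentrated in general function prediction; post-translational modification, protein turnover, and chaperones; and signal transduction mechanisms. GO classification showed that the genes were concentrated in cellular component and biological process. KEGG analysis showed that genes involved in metabolic pathways predominated. (3) Comparison with the genomes of grape, Arabidopsis, rice, poplar, ginkgo, Amborella, tea plant, and stout camphor tree (牛樟) showed that 20,801 of the 23,424 Houpo genes could be assigned to 12,129 gene families, of which 515 were unique to Houpo. Houpo is relatively closely related to stout camphor tree (Lauraceae), and the two diverged about 122.5 million years ago (mya). This study is the first to resolve the whole genome of Houpo with third-generation sequencing, which will support its further development and use and lays a foundation for whole-genome studies of other medicinal plants.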

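Xinjiang Ammopiptanthus (新疆沙冬青) is a representative evergreen broad-leaved plant of the desert regions of China and a Tertiary relict species. Its very strong stress tolerance has attracted wide attention from researchers, but molecular biology research has progressed slowly because no genome sequence has been available. In this study, a genome survey of Xinjiang Ammopiptanthus was carried out, yielding 65 Gb of paired-end sequencing data. K-mer analysis was combined with flow cytometry to predict genome size, heterozygosity, GC content, and other features, and the genome size was estimated at 770 to 787 Mb. Assembly of the sequencing data gave contigs with an N50 of 684 bp and a total length of 0.538 Gb; further assembly gave scaffolds with an N50 of 12.09 kb and a total length of 0.602 Gb. SSR marker prediction on the assembled data identified 151,858 SSRs. Dinucleotide repeat units were the most frequent class, at 56.39%, and among them the AT/TA motif predominated. This study is the first report of the genome characteristics of the desert plant Xinjiang Ammopiptanthus and provides a reference for subsequent genomic research.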
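The SSR prediction step can be illustrated with a small regular-expression scan for dinucleotide repeats. This is only a sketch: the abstract names neither the tool nor the thresholds used, so the function name `find_dinucleotide_ssrs`, the minimum of six repeats, and the example sequence are all assumptions.

```python
import re
from collections import Counter


def find_dinucleotide_ssrs(seq, min_repeats=6):
    """Return (start, motif, repeat_count) for each dinucleotide SSR.

    min_repeats=6 is an assumed threshold, not one given in the abstract.
    """
    # A two-base unit followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if unit[0] == unit[1]:
            continue  # AA, CC, ... are mononucleotide runs, not dinucleotide SSRs
        # Report AT and TA (same repeat, different phase) under one motif name.
        motif = min(unit, unit[1] + unit[0])
        hits.append((match.start(), motif, len(match.group(0)) // 2))
    return hits


# Hypothetical sequence: eight AT repeats and six AG repeats.
example = "GGC" + "AT" * 8 + "CCT" + "AG" * 6 + "T"
ssrs = find_dinucleotide_ssrs(example)
motif_counts = Counter(motif for _, motif, _ in ssrs)
print(ssrs)
print(motif_counts)
```

Grouping AT with TA mirrors the way the abstract reports "AT/TA" as a single composition class; reverse-complement grouping (for example AG with CT) is deliberately left out of this sketch.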

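Pepino (香瓜茄, Solanum muricatum), also known as "ginseng fruit" (人参果), has antioxidant, antitumor, antidiabetic, and other biological activities. This study aimed to enrich the genomic information available for Solanaceae crops and their evolutionary history, to obtain whole-genome sequence information for pepino, and to lay a foundation for molecular research on the species. Using pepino plant tissue as experimental material, small-insert libraries were sequenced on Illumina HiSeq to assess genome characteristics, and PacBio third-generation sequencing and Hi-C were used to build and assemble the whole-genome dataset. Bioinformatic methods were then used for genome assembly, functional annotation, and evolutionary analysis. The results were as follows: 54.11 Gb of Illumina HiSeq data were obtained; 55.08 Gb of PacBio data were obtained, with a mean read length of 14,179 bp; about 143 Gb of Hi-C data were obtained; the assembled contigs had a total length of 1.16 Gb, and the contig N50 after Hi-C error correction was 22.63 Mb; Hi-C anchoring placed 1.12 Gb of sequence on 12 chromosomes, or 97.16%; of this, the sequence whose order and orientation could be determined was 1.08 Gb, or 96.11% of the total anchored length; the genome size obtained was 1.25 Gb; 64.22% of the genome was predicted to be repetitive sequence, and 41,571 genes were predicted, 99.06% of which could be annotated in databases such as NR, GO, and KEGG; 4,360 tRNAs, 5,677 rRNAs, and 154 miRNAs were predicted; and 449 pseudogenes were identified. The divergence of pepino and potato was dated to about 12.82 MYA.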

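In recent years, with the continuous development of sequencing technology, genome sequencing has matured and has achieved growing success in animal and plant genomes, and draft and fine maps of a large number of plant genomes continue to be published. This review compares and analyzes the characteristics of the three generations of sequencing technology, gives a detailed account of research progress in pre-sequencing preparation, genome assembly, annotation, and comparative genomics, and sets out the characteristics and difficulties of plant genome research. Through whole-genome sequencing of a plant, researchers can obtain sequence information for its genome and its important functional genes, which provides a basis for studying molecular evolution, gene composition, and gene regulation at the molecular level and also serves as a valuable reference for plant genomes yet to be sequenced.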

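Strain MT1 of the morel Morchella importuna (梯棱羊肚菌) is cultivated over large areas in Liaoning Province. At present, morel cultivation in China is concentrated in the Sichuan-Chongqing region, and very few strains can be grown on a large scale in Northeast China, which limits the development of the morel industry there. A comprehensive understanding of the genetic information of morel strains that can be cultivated on a large scale in Northeast China is therefore needed. Whole-genome sequencing of strain MT1 was performed in CLR mode on the PacBio Sequel platform, followed by genome assembly, gene prediction, and functional annotation. The results showed that the MT1 genome is about 82.29 Mb in size, with a sequencing depth of 282x, an N50 of 1.028 Mb, and a GC content of 53.24%. In the GO, KEGG, KOG, Swiss-Prot, Nr, and CAZy databases, 9,144, 8,810, 2,844, 3,642, 9,818, and 529 genes were annotated, respectively. Characterizing strain MT1 at the genome level provides reference information for selecting superior morel strains suited to cultivation in Northeast China.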

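[Background] Bacillus subtilis N2-10 is a Gram-positive bacterium with strong antimicrobial activity that produces cellulase and several other hydrolases, and it has considerable potential for use in fermented feed. [Objective] To obtain the whole-genome sequence of B. subtilis N2-10, further characterize its genes for secondary metabolite biosynthesis, and use comparative genomics to analyze the differences between strain N2-10 and the type strain, providing a theoretical basis for elucidating the antimicrobial and probiotic mechanisms of N2-10. [Methods] Whole-genome sequencing of strain N2-10 was performed with second-generation Illumina NovaSeq combined with third-generation PacBio Sequel; the sequencing data were used for genome assembly, gene prediction, and functional annotation, and comparative genomics was used to analyze the differences between N2-10 and other strains. [Results] The genome of strain N2-10 is 4,036,899 bp in size with a GC content of 43.88%; it encodes 4,163 coding genes with a total length of 3,594,369 bp, so the coding regions account for 89.04% of the genome; it contains 85 tRNAs, 10 5S rRNAs, 10 16S rRNAs, and 10 23S rRNAs, as well as 2 CRISPR-Cas systems, 1 prophage, and 6 genomic islands; in the GO (gene ontology), COG (clusters of orthologous groups of...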

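Compared with the genomes of other eukaryotes, fungal genomes are structurally simple, short, and easy to sequence, assemble, and annotate, which makes them a model for studying eukaryotic genomes. To study fungal genome assembly strategies, this work sequenced the genome of Aspergillus fumigatus strain An16007 on the Illumina HiSeq platform, assembled it with each of five de novo assemblers (ABySS, SOAPdenovo, Velvet, MaSuRCA, and IDBA-UD), predicted genes with Augustus, and evaluated the assemblies with BUSCO. The five assemblers produced different assemblies. The ABySS assembly had higher completeness and accuracy than those of the other four assemblers and yielded a larger number of predicted genes, so ABySS was the most suitable assembler for the genome in this study. This work provides a technical workflow for fungal de novo sequencing, assembly, and assembly quality assessment, and offers a reference for genome studies of fungi or other organisms with genomes smaller than 100 Mb.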

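[Objective] Streptomyces sp. PRh5 is an endophytic actinomycete isolated from Dongxiang wild rice (Oryza rufipogon Griff.) that shows strong antimicrobial activity against both bacteria and fungi. To investigate the antimicrobial mechanism of strain PRh5 in depth and to mine its secondary metabolite gene resources, the genome sequence of the strain needs to be resolved. [Methods] Whole-genome sequencing of strain PRh5 was carried out with high-throughput sequencing, and the relevant software was then used for genome assembly, gene prediction and functional annotation, clusters of orthologous groups (COG) analysis, collinearity analysis, and prediction of secondary metabolite biosynthetic gene clusters. [Results] Genome assembly yielded 290 contigs; the whole genome is about 11.1 Mb in size with a GC content of 71.1%. The sequence has been submitted to GenBank under accession number JABQ00000000. In addition, 50 secondary metabolite biosynthetic gene clusters were predicted. [Conclusion] These results provide a basis for functional genomics research on Streptomyces sp. PRh5 and for studies of the biosynthetic pathways and heterologous expression of its secondary metabolites.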

A high-quality genome is of great significance for mining the biological information resources of a species. To date, the genomes of several economically important flatfishes have been well characterized. All of these fishes are left-eyed, and no high-quality genome of a right-eyed species has been reported. In this study, we applied a combined strategy involving stLFR and Hi-C technologies to generate sequencing data for constructing the chromosome-level genome of Verasper variegatus, a right-eyed member of Pleuronectidae. The genome size of V. variegatus is 556 Mb. More than 97.2% of BUSCO genes were detected, and the N50 lengths of the contigs and scaffolds reached 79.8 kb and 23.8 Mb, respectively, demonstrating the outstanding completeness and sequence continuity of the genome. A total of 22,199 protein-coding genes were predicted in the assembled genome, and more than 95% of those genes could be functionally annotated. Genomic collinearity, gene family, and phylogenetic analyses of related species in Pleuronectiformes were also carried out with respect to metamorphosis and benthic adaptation. Sex-related genes were also mapped at the chromosome level. This study presents the first chromosome-level genome of a Pleuronectidae fish (V. variegatus). The assembly constructed in this work will not only be valuable for conservation and aquaculture studies of V. variegatus but will also be of general interest in phylogenetic and taxonomic studies of Pleuronectiformes.

Sorghum is an important target of plant genomics. This cereal has unusual tolerance to adverse environments, a small genome (750 Mbp) relative to most other grasses, a diverse germplasm, and utility for comparative genomics with rice, maize and other grasses. In this study, a modified cDNA selection protocol was developed to aid the discovery and mapping of genes across an integrated genetic and physical map of the sorghum genome. BAC DNA from the sorghum genome map was isolated and covalently bound in arrayed tubes for efficient liquid handling. Amplifiable cDNA sequence tags were isolated by hybridization to individual sorghum BACs, cloned and sequenced. Analysis of a fully sequenced sorghum BAC indicated that about 80% of known or predicted genes were detected in the sequence tags, including multiple tags from different regions of individual genes. Data from cDNA selection using the fully sequenced BAC indicate that mislocated cDNA tags are very rare. Analysis of 35 BACs (5.25 Mb) from sorghum linkage group B revealed (and therefore mapped) two sorghum genes and 58 sorghum ESTs. Additionally, 31 cDNA tags that showed significant homology to genes from other species were isolated. The modified cDNA selection procedure described here will be useful for genome-wide gene discovery and EST mapping in sorghum, and for comparative genomics of sorghum, rice, maize and other grasses.

Ramie, Boehmeria nivea (L.) Gaudich, family Urticaceae, is a plant native to eastern Asia and one of the world's oldest fibre crops. It is also used as animal feed and for the phytoremediation of heavy metal-contaminated farmland. The genome sequence of ramie was therefore determined to explore the molecular basis of its fibre quality, protein content and phytoremediation capacity. To further understand the ramie genome, different paired-end and mate-pair libraries were combined to generate 134.31 Gb of raw DNA sequence using the Illumina whole-genome shotgun sequencing approach. The highly heterozygous B. nivea genome was assembled using the Platanus Genome Assembler, which is an effective tool for the assembly of highly heterozygous genome sequences. The final length of the draft genome of this species was approximately 341.9 Mb (contig N50 = 22.62 kb, scaffold N50 = 1,126.36 kb). Based on the ramie genome annotations, 30,237 protein-coding genes were predicted, and the repetitive element content was 46.3%. The completeness of the final assembly was evaluated by benchmarking universal single-copy orthologous genes (BUSCO); 90.5% of the 1,440 expected embryophytic genes were identified as complete, and 4.9% were identified as fragmented. Phylogenetic analysis based on single-copy gene families and one-to-one orthologous genes placed ramie with mulberry and cannabis, within the clade of urticalean rosids. Genome information for ramie will be a valuable resource for the conservation of endangered Boehmeria species and for future studies on the biogeography and characteristic evolution of members of Urticaceae.

He K, Ye Q, Zhu Y, Chen H, Wan QH, Fang SG. Gene, 2012, 507(1): 74-78
The Chinese alligator (Alligator sinensis) is a rare and endangered species endemic to China. To better understand the genomic structure of the Chinese alligator, a highly redundant bacterial artificial chromosome (BAC) library was constructed. This library consists of 216,238 clones with an average insert size of about 90 kb, indicating that the library contains 6.8-fold genome equivalents. Subsequently, we constructed a 516 kb contig map for the Chinese alligator olfactory receptor (OR) genes, which spans nine BAC clones, and subjected the BACs to full sequencing. The sequence analysis revealed that this contig contained 16 functional OR genes and also demonstrated that the nine BACs constituting the contig overlapped correctly, proving the usability of this genome library. As a result, this BAC library could provide a useful platform for physical mapping, genome sequencing or complex analysis of targeted genomic regions for this rare species.
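The "6.8-fold genome equivalents" figure follows from the standard library-coverage relation: coverage = (number of clones × mean insert size) / genome size. The sketch below applies that relation to the numbers in the abstract. The genome size is not stated in the abstract, so the value computed here is only what the reported figures imply, and the function names are hypothetical. The last function uses the usual Poisson approximation for the chance that a given locus is present in at least one clone, which assumes clones are sampled uniformly at random.

```python
import math


def library_coverage(n_clones, mean_insert_bp, genome_bp):
    """Genome equivalents represented by a clone library."""
    return n_clones * mean_insert_bp / genome_bp


def implied_genome_size(n_clones, mean_insert_bp, coverage):
    """Genome size (bp) implied by a reported coverage figure."""
    return n_clones * mean_insert_bp / coverage


def probability_locus_covered(coverage):
    """Poisson approximation: P(locus is in at least one clone)."""
    return 1.0 - math.exp(-coverage)


# Figures from the abstract: 216,238 clones, ~90 kb inserts, 6.8-fold coverage.
genome_bp = implied_genome_size(216_238, 90_000, 6.8)
print(f"implied genome size: {genome_bp / 1e9:.2f} Gb")
print(f"P(locus covered): {probability_locus_covered(6.8):.4f}")
```

Under these assumptions, a library at this depth would be expected to contain almost any single-copy region of interest, which fits the authors' recovery of a complete nine-clone contig across the OR gene cluster.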

The greenfin horse-faced filefish, Thamnaconus septentrionalis, is a valuable commercial fish species that is widely distributed in the Indo-West Pacific Ocean. This fish has characteristic blue-green fins, rough skin and a spine-like first dorsal fin. Thamnaconus septentrionalis is of conservation concern because its population has declined sharply, and it is an important marine aquaculture fish species in China. Genomic resources for the filefish are lacking, and no reference genome has been released. In this study, the first chromosome-level genome of T. septentrionalis was constructed using nanopore sequencing and Hi-C technology. A total of 50.95 Gb of polished nanopore sequence was generated and assembled into a 474.31-Mb genome, accounting for 96.45% of the estimated genome size of this filefish. The assembled genome contained only 242 contigs, and the achieved contig N50 was 22.46 Mb, a surprisingly high value among all sequenced fish species. Hi-C scaffolding of the genome resulted in 20 pseudochromosomes containing 99.44% of the total assembled sequences. The genome contained 67.35 Mb of repeat sequences, accounting for 14.2% of the assembly. A total of 22,067 protein-coding genes were predicted, 94.82% of which were successfully annotated with putative functions. Furthermore, a phylogenetic tree was constructed using 1,872 single-copy orthologous genes, and 67 unique gene families were identified in the filefish genome. This high-quality assembled genome will be a valuable resource for a range of future genomic, conservation and breeding studies of T. septentrionalis.

A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high-quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, together with comparison of avian and mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in the number and variety of genes of the avian immune system, where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and the genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species of agricultural, ecological, and evolutionary interest.
