Similar literature
1.
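Angelica dahurica (白芷) is a widely used species that serves as both medicine and food: it is a common clinical Chinese medicinal material and also a spice. To obtain whole-genome sequence information for it, this study used leaf DNA of Hangbaizhi (杭白芷) for the first time, applied Nanopore sequencing to build a whole-genome dataset, and used bioinformatic methods to assemble, functionally annotate and evolutionarily analyze the resulting nucleotide sequences. The results were as follows. (1) Filtering the raw data left 662 Gb of third-generation reads with a Read N50 of about 32,932 bp; assembly gave a Hangbaizhi genome of 5.6 Gb with a Contig N50 of about 806,638 bp. (2) Comparison of the assembled sequences with functional databases such as KOG, GO and KEGG assigned a function to 66.47% of the genes. KOG annotation showed that Hangbaizhi protein functions are concentrated in general function prediction; post-translational modification, protein turnover and chaperones; and signal transduction mechanisms. GO classification showed that the genes are concentrated in biological process and cellular component, and KEGG pathway annotation showed that genes involved in metabolic pathways predominate. (3) Forty-five BGLU family genes were identified in Hangbaizhi. This is the first analysis of the Hangbaizhi whole genome with third-generation sequencing, and it provides an important theoretical reference for systems-biology studies of Hangbaizhi and for follow-up functional studies of BGLU in its growth and development.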
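The Read N50 and Contig N50 figures quoted in this abstract follow the usual N50 definition: the length L such that sequences of length at least L contain half or more of all bases. A minimal sketch of that calculation is given here for orientation; the function name and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Made-up contig lengths, for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

For the toy list the total is 300 bases, so the running sum first reaches 150 at the second-longest contig.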

2.
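Aureobasidium pullulans has broad prospects for industrial application because of the diversity of its fermentation products. In this study, next-generation sequencing was used to sequence, assemble and bioinformatically analyze the whole genome of a strain with high pullulan yield (Aureobasidium pullulans CCTCC M 2012259). The genome is about 26.37 Mb long and comprises 36 scaffolds and 76 contigs (GenBank accession number: PRJNA350822). Gene prediction with GeneMark-ES yielded 10,069 protein-coding genes. Blastp comparison against all known fungal proteins in the UniProtKB database showed that 6,218 predicted proteins are highly similar to 4,925 known proteins in UniProtKB. GO functional annotation, KEGG pathway annotation and protease analysis of these proteins with the DAVID tool produced 4,444 GO terms, 1,566 KEGG pathway entries and 1,740 protease records, respectively. The sequencing and analysis lay a solid theoretical foundation for future functional gene mining and molecular genetic modification of Aureobasidium pullulans.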

3.
牟少华  李娟  李雪平  高健 《广西植物》2022,42(8):1383-1393
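Moso bamboo (毛竹) is an important economic bamboo species in China, and long-term cultivation and adaptation have produced abundant variation. To reveal the genome-wide mutation types underlying culm variants of Moso bamboo, four forms (黄皮毛竹, 金丝毛竹, 绿皮花毛竹 and 花毛竹) were used as experimental material. Whole-genome sequences were obtained by high-throughput resequencing; single-nucleotide polymorphisms (SNPs), small insertions and deletions (InDels) and structural variations (SVs) were detected and annotated; and the variant genes were functionally annotated. The 花毛竹 genome had the most variant genes (12,555) and the 金丝毛竹 sample the fewest (11,923); in each of the four samples more than 7,000 variant genes received a functional annotation. GO annotation covered 56 functional groups in the three classification systems of cellular component, molecular function and biological process. Under cellular component, 2,431 genes were related to chlorophyll synthesis; under biological process, 75 genes took part in carotenoid synthesis and 80 genes were related to the regulation of anthocyanin synthesis and to anthocyanin accumulation in tissues under ultraviolet light. COG classification assigned 369 genes to replication, recombination and repair, 291 to signal transduction mechanisms and 222 to transcription. The KEGG database was used to analyze systematically the metabolic and biosynthetic pathways, such as those of flavonoids and carotenoids, in which the variant genes take part. Further study of the regulatory pathways of these differential genes, explaining the mechanism of culm variation at the DNA level, can provide data to support deeper research on the rich intraspecific polymorphism and genetic variation of Moso bamboo and clarify the genetic basis of the different variant types, including gene families and functional genes.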

4.
《遗传》2020,(7)
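With the continued development of sequencing technology, massive amounts of genome sequencing data have been generated, greatly enriching public genetic data resources. To cope with this volume of data, genome comparison and annotation algorithms and tools are updated continually, which makes it possible to combine several annotation tools to obtain more accurate annotation of protein-coding genes. Some prokaryotic genomes in public databases were sequenced and assembled more than ten years ago and contain many predicted coding genes of unknown function. To improve the annotation quality of genomes in the National Center for Biotechnology Information (NCBI) database, this study re-annotated 1,587 bacterial and archaeal genomes by combining several prokaryotic gene-finding algorithms and programs with gene expression data. First, 33 Z-curve variables were used to identify 3,092 sequences in the original annotations of 177 genomes that had been over-annotated as protein-coding genes. Second, homology searches assigned specific functions to 4,447 protein-coding genes of unknown function in 939 genomes. Finally, missed genes were sought by jointly using ZCURVE 3.0, Glimmer 3.02 and Prodigal, three accurate and widely used gene finders that complement one another because they rest on different algorithms. In total, 2,003 missed protein-coding genes were found in 9 genomes; these genes belong to multiple clusters of orthologous groups of proteins (COG). By re-annotating early-sequenced bacterial and archaeal genomes with new tools and multi-omics data, this study offers a reference annotation method for newly sequenced strains, and the re-annotated bacterial gene sequences should also help subsequent basic research.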
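The last step of this abstract, searching for missed genes with ZCURVE 3.0, Glimmer 3.02 and Prodigal together, can be pictured as a consensus filter. The sketch below assumes that a candidate must be called by all three programs and that genes are matched by strand and stop coordinate; both choices, the function name and the toy coordinates are hypothetical, since the abstract does not state how the predictions were combined. None of the three programs is invoked here.

```python
def missed_gene_candidates(annotated, *predictions):
    """Genes called by every predictor but absent from the annotation.

    Each gene is a (strand, stop_coordinate) tuple. Matching on the stop
    coordinate is an assumption made for this sketch, chosen because
    gene finders often disagree on the start codon.
    """
    if not predictions:
        raise ValueError("need at least one prediction set")
    consensus = set.intersection(*(set(p) for p in predictions))
    return sorted(consensus - set(annotated))


# Toy coordinates, for illustration only.
annotated = {("+", 1200), ("-", 3400)}
zcurve = {("+", 1200), ("-", 3400), ("+", 5600), ("-", 7800)}
glimmer = {("+", 1200), ("+", 5600), ("-", 7800), ("+", 9100)}
prodigal = {("+", 1200), ("-", 3400), ("+", 5600)}

print(missed_gene_candidates(annotated, zcurve, glimmer, prodigal))
```

Requiring agreement among all predictors trades sensitivity for precision; a majority vote would be an equally plausible reading of "joint use".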

5.
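[Objective] Marinomonas sp. FW-1 is a strain that has been shown to yield highly active arylsulfatase. To study in depth how FW-1 produces arylsulfatase and to screen further for highly active arylsulfatase gene fragments, the whole-genome sequence of FW-1 needs to be resolved. [Methods] High-throughput sequencing was used to sequence the whole genome of FW-1, and appropriate software was used for genome assembly, gene prediction and functional annotation, and COG cluster analysis. Heterologous expression was used to analyze the arylsulfatase activity produced by different gene fragments. [Results] The genome is 3,964,876 bp long with a GC content of 44.03%; it encodes 3,590 protein genes and contains 78 tRNAs and 25 rRNA operons. Twenty-two genes with possible arylsulfatase activity were found in the genome. Heterologous expression of 4 of them showed that FW-1 contains at least 3 genes with arylsulfatase activity, all of which carry the characteristic arylsulfatase amino-acid motif C-X-P-X-R. [Conclusion] This study is the first report of the whole-genome sequence of FW-1, a strain carrying multiple arylsulfatase gene sequences; it analyzes the basic features of the genome and offers ideas for further application of arylsulfatase.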
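The C-X-P-X-R motif named in this abstract (cysteine, any residue, proline, any residue, arginine) is straightforward to scan for in a protein sequence. The following is a minimal sketch using Python's standard `re` module; the function name and the toy sequence are hypothetical and are not taken from the FW-1 genome, and the abstract does not say how the motif was actually located.

```python
import re

# C-X-P-X-R: cysteine, any residue, proline, any residue, arginine.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(C[A-Z]P[A-Z]R))")


def find_cxpxr(protein):
    """Return (1-based position, matched residues) for each motif hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(protein.upper())]


# Made-up sequence, for illustration only.
toy_protein = "MKSCTPSRLLACAPGRW"
print(find_cxpxr(toy_protein))
```

A hit only flags a candidate; the abstract itself relies on heterologous expression to confirm activity.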

7.
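Microorganisms of the genus Rhodococcus tolerate organic compounds well and degrade a broad range of them, so they adapt to many habitats and are widely used in biodesulfurization, remediation of petroleum pollution, degradation of toxic organic compounds, wastewater treatment and related fields. In this study, single-molecule PacBio sequencing was used to sequence the whole genome of an organic-solvent-tolerant strain, Rhodococcus ruber SD3, followed by bioinformatic analysis. The genome is about 5.37 Mb long with a GC content of 70.63%, and its GenBank accession number is CP029146. Prediction of rRNA and tRNA genes with Barrnap 0.4.2 and tRNAscan-SE v1.3.1 found 12 rRNA genes and 53 tRNA genes. Gene prediction with Glimmer 3.02 yielded 5,120 protein-coding genes. The predicted protein sequences were compared by Blastp against the KEGG, STRING and GO databases; in total 2,836 protein genes received a COG functional annotation, and 3,130 GO terms and 2,190 KEGG pathway entries were annotated. In addition, quantitative fluorescence PCR showed that expression of the heat shock protein DnaK in Rhodococcus ruber SD3 was up-regulated 29.87-fold under toluene stress and 3.93-fold under phenol stress. These results provide a theoretical basis for the genetic modification of Rhodococcus ruber and for revealing the mechanism of its organic solvent tolerance.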

8.
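Magnolia officinalis (厚朴) is a well-known traditional medicinal plant of the genus Magnolia, family Magnoliaceae. It is widely cultivated in China, and its bark, root bark, branch bark, leaves, flowers and fruits can all be used as medicine or food. To obtain its whole-genome sequence, this study used leaf DNA, applied PacBio Sequel third-generation sequencing to build a whole-genome dataset, and used bioinformatic methods to assemble, functionally annotate and evolutionarily analyze the resulting nucleotide sequences. The results were as follows. (1) Filtering the raw data left 140.91 Gb of third-generation reads with a Read N50 of about 13,784 bp; assembly gave a genome of 1.68 Gb with a Contig N50 of about 222,069 bp and a single-copy gene completeness of 81.0%. (2) Comparison of the assembled sequences with functional databases such as NR, KOG and KEGG assigned a function to 98.40% of the genes. KOG annotation showed that protein functions are concentrated in general function prediction; post-translational modification, protein turnover and chaperones; and signal transduction mechanisms. GO classification showed that the genes are concentrated in cellular component and biological process, and KEGG analysis showed that genes involved in metabolic pathways predominate. (3) Comparison with the genomes of grape, Arabidopsis, rice, poplar, ginkgo, Amborella, tea plant and Cinnamomum kanehirae (牛樟) showed that 20,801 of the 23,424 Magnolia officinalis genes can be assigned to 12,129 families, of which 515 gene families are unique to Magnolia officinalis. The species is relatively closely related to Cinnamomum kanehirae (Lauraceae), from which it diverged about 122.5 million years ago (mya). This is the first analysis of the Magnolia officinalis whole genome with third-generation sequencing; it supports further development and use of the species and lays a foundation for whole-genome studies of other medicinal plants.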

9.
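To mine the toxin-producing genes of Fusarium equiseti (Corda) Sacc. and clarify its evolutionary relationships, its whole genome was functionally annotated and searched for toxin-producing genes by combining BLAST with 14 databases, including GO, KEGG, COG, eggNOG and CAZy. A phylogenetic analysis was carried out, and chromatography was used to study the secretion patterns associated with the toxin-producing genes; using Bipolaris sorokiniana (麦根腐平脐蠕孢), Fusarium avenaceum (燕麦镰孢), 尖...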

10.
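The glyphosate-tolerant strain Pantoea rodasii S1536 is a Gram-negative bacterium isolated and screened from glyphosate-contaminated soil, with glyphosate resistance of up to 400 mmol/L. In this study, the whole genome of S1536 was sequenced for the first time on the PacBio RSII platform, and the reads were assembled with SMRT Portal into 10 contigs. The final high-quality, gap-free S1536 genome comprises 1 chromosome and 2 plasmids, with a total length of 5.16 Mb and a GC content of 55.16%. Gene prediction, functional annotation, and COG and GO cluster analysis were then performed on the genome sequence, and secondary-metabolite biosynthetic gene clusters and the glyphosate resistance mechanism were predicted. This yielded 4,843 coding genes, of which 4,656 proteins received a COG functional annotation, together with 2 NRPS gene clusters, 1 thiopeptide gene cluster and 1 hserlactone-arylpolyene gene cluster. Analysis of the gene for EPSP synthase, the key enzyme of the shikimate pathway, showed that the EPSP synthase of S1536 is a Class I EPSPS. A whole-genome comparative analysis was carried out with the genomes of three previously reported strains of the same species, Pantoea rodasii strain ND03, Pantoea rodasii strain DSM 26611 and Pantoea rodasii strain LMG 26273. It showed that the four strains, although they belong to the same species, have diverged considerably during evolution. These results provide baseline data for functional genomics of S1536, for research on herbicide tolerance mechanisms and for the mining of herbicide tolerance genes.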

11.
Ouzounis CA, Karp PD. Genome Biology, 2002, 3(2): comment2001.1-comment2001.6
Annotation, the process by which structural or functional information is inferred for genes or proteins, is crucial for obtaining value from genome sequences. We define the process of annotating a previously annotated genome sequence as 're-annotation', and examine the strengths and weaknesses of current manual and automatic genome-wide re-annotation approaches.

12.
13.
This article reviews the advances in molecular genetics that have led to the identification of genes and markers associated with meat quality in pigs. The growing number of annotated livestock genome sequences is a rich source of information that can be used to identify candidate genes responsible for complex traits and quantitative trait loci effects. In pigs, the large amount of information emerging from the study of the genome has helped in the acquisition of new knowledge concerning biological systems, and it is opening new opportunities for the genetic selection of this species. Among the recently developed fields of genomics, functional genomics and proteomics, which allow many genes and proteins to be considered at the same time, are very useful tools for better understanding the function and regulation of genes and how these participate in the complex networks that control the phenotypic characteristics of a trait. In particular, global gene expression profiling at the mRNA and protein level can provide a better understanding of the gene regulation that underlies the biological functions and physiology related to better pig meat quality. Moreover, integrating genomics and proteomics with bioinformatics tools is essential for fully exploiting the available molecular genetics information. The development of this knowledge will benefit scientists, industry and breeders, because the efficiency and accuracy of traditional pig selection schemes will be improved by incorporating molecular data into breeding programs.

14.
We have developed a rice (Oryza sativa) genome annotation database (Osa1) that provides structural and functional annotation for this emerging model species. Using the sequence of O. sativa subsp. japonica cv Nipponbare from the International Rice Genome Sequencing Project, pseudomolecules, or virtual contigs, of the 12 rice chromosomes were constructed. Our most recent release, version 3, represents our third build of the pseudomolecules and is composed of 98% finished sequence. Genes were identified using a series of computational methods developed for Arabidopsis (Arabidopsis thaliana) that were modified for use with the rice genome. In release 3 of our annotation, we identified 57,915 genes, of which 14,196 are related to transposable elements. Of these 43,719 non-transposable element-related genes, 18,545 (42.4%) were annotated with a putative function, 5,777 (13.2%) were annotated as encoding an expressed protein with no known function, and the remaining 19,397 (44.4%) were annotated as encoding a hypothetical protein. Multiple splice forms (5,873) were detected for 2,538 genes, resulting in a total of 61,250 gene models in the rice genome. We incorporated experimental evidence into 18,252 gene models to improve the quality of the structural annotation. A series of functional data types has been annotated for the rice genome that includes alignment with genetic markers, assignment of gene ontologies, identification of flanking sequence tags, alignment with homologs from related species, and syntenic mapping with other cereal species. All structural and functional annotation data are available through interactive search and display windows as well as through download of flat files. To integrate the data with other genome projects, the annotation data are available through a Distributed Annotation System and a Genome Browser. All data can be obtained through the project Web pages at http://rice.tigr.org.
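The counts in this abstract are internally consistent, which a few lines of arithmetic can confirm. The sketch below uses only the figures quoted above. The reading that the 5,873 splice forms replace the single models of 2,538 genes is an assumption of this sketch; it is adopted because it reproduces the reported total of 61,250 gene models.

```python
total_genes = 57_915
te_related = 14_196
non_te = total_genes - te_related  # genes not related to transposable elements

categories = {
    "putative function": 18_545,
    "expressed protein, no known function": 5_777,
    "hypothetical protein": 19_397,
}
# The three categories should partition the non-TE genes.
assert sum(categories.values()) == non_te

percentages = {name: round(100 * n / non_te, 1) for name, n in categories.items()}

# Assumed reading: 2,538 genes carry 5,873 splice forms between them,
# so they contribute 5,873 - 2,538 models beyond one model per gene.
splice_forms, spliced_genes = 5_873, 2_538
gene_models = total_genes + (splice_forms - spliced_genes)

print(non_te, percentages, gene_models)
```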

15.
16.
The European rabbit (Oryctolagus cuniculus) is relevant in a large spectrum of fields: it is a livestock species, a pet, a biomedical model and a biotechnology tool, a wild resource and a pest. The sequencing of the rabbit genome has opened new perspectives to study this lagomorph at the genome level. We herein investigated for the first time the O. cuniculus genome by array comparative genome hybridization (aCGH) and established a first copy number variation (CNV) genome map in this species comprising 155 copy number variation regions (CNVRs; 95 gains, 59 losses, 1 with both gain and loss) covering ~0.3% of the OryCun2.0 version. About 50% of the 155 CNVRs identified spanned 139 different protein coding genes, 110 genes of which were annotated or partially annotated (including Major Histocompatibility Complex genes) with 277 different gene ontology terms. Many rabbit CNVRs might have a functional relevance that should be further investigated.

17.
18.
Large-scale prokaryotic gene prediction and comparison to genome annotation
MOTIVATION: Prokaryotic genomes are sequenced and annotated at an increasing rate. The methods of annotation vary between sequencing groups. It makes genome comparison difficult and may lead to propagation of errors when questionable assignments are adapted from one genome to another. Genome comparison either on a large or small scale would be facilitated by using a single standard for annotation, which incorporates a transparency of why an open reading frame (ORF) is considered to be a gene. RESULTS: A total of 143 prokaryotic genomes were scored with an updated version of the prokaryotic genefinder EasyGene. Comparison of the GenBank and RefSeq annotations with the EasyGene predictions reveals that in some genomes up to approximately 60% of the genes may have been annotated with a wrong start codon, especially in the GC-rich genomes. The fractional difference between annotated and predicted confirms that too many short genes are annotated in numerous organisms. Furthermore, genes might be missing in the annotation of some of the genomes. We predict 41 of 143 genomes to be over-annotated by >5%, meaning that too many ORFs are annotated as genes. We also predict that 12 of 143 genomes are under-annotated. These results are based on the difference between the number of annotated genes not found by EasyGene and the number of predicted genes that are not annotated in GenBank. We argue that the average performance of our standardized and fully automated method is slightly better than the annotation.
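The over- and under-annotation measure described in this abstract is a difference between two counts: annotated genes that the gene finder does not predict, and predicted genes that the annotation lacks. One possible reading is sketched here. Dividing by the number of annotated genes, matching genes by an identifier, the function name and the toy data are all assumptions of this sketch; the abstract does not give the exact normalization, and EasyGene itself is not run.

```python
def annotation_balance(annotated, predicted):
    """Signed fraction of net over-annotation.

    Positive values mean more annotated-only genes than predicted-only
    genes (possible over-annotation); negative values mean the reverse.
    Normalizing by the annotated gene count is an assumption.
    """
    annotated, predicted = set(annotated), set(predicted)
    if not annotated:
        raise ValueError("annotation must contain at least one gene")
    only_annotated = len(annotated - predicted)
    only_predicted = len(predicted - annotated)
    return (only_annotated - only_predicted) / len(annotated)


# Toy gene identifiers: 100 annotated genes, 8 of them not predicted,
# plus 2 predicted genes that the annotation lacks.
toy_annotated = range(100)
toy_predicted = range(8, 102)

balance = annotation_balance(toy_annotated, toy_predicted)
print(f"{balance:+.2%}", "over-annotated by >5%" if balance > 0.05 else "within 5%")
```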

19.
20.
Automatic annotation of eukaryotic genes, pseudogenes and promoters
