Similar articles
Found 20 similar articles (search time: 0 ms)
1.
Novel genomes are today often annotated by small consortia or individuals without a background in bioinformatics. This audience requires tools that are easy to use. Several genome annotation tools and pipelines address this need. Visualizing the resulting annotation is a crucial step of quality control. The UCSC Genome Browser is a powerful and popular genome visualization tool. Assembly Hubs, which can be hosted on any publicly available web server, allow browsing genomes via UCSC Genome Browser servers. The steps for creating custom Assembly Hubs are well documented and the required tools are publicly available. However, the number of steps for creating a novel Assembly Hub is large. In some cases, the format of input files needs to be adapted, which is a difficult task for scientists without a programming background. Here, we describe MakeHub, a novel command line tool that generates Assembly Hubs for the UCSC Genome Browser in a fully automated fashion. The pipeline also allows extending previously created Hubs with additional tracks. MakeHub is freely available for download at https://github.com/Gaius-Augustus/MakeHub.

2.

Background

Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs).

Results

The presence of many AMs and the development of new ones create the need to (a) compare different annotations for a single genome and (b) generate an annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides a detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations by combining individual ones. To illustrate BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show with these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced.

Conclusions

We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1826-4) contains supplementary material, which is available to authorized users.
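
The idea of an extended annotation built by combining individual ones can be illustrated with a minimal sketch. This is not BEACON's actual algorithm, which the abstract does not describe; it assumes a simple rule (keep the first informative function found for each gene), and the method names, gene identifiers, and the list of uninformative labels are all hypothetical.

```python
from typing import Dict, Optional

# Labels treated as "no function assigned" (an assumption for this sketch).
UNINFORMATIVE = {"", "hypothetical protein", "unknown function"}


def is_informative(function: Optional[str]) -> bool:
    """True when an annotation method assigned a putative function."""
    return function is not None and function.strip().lower() not in UNINFORMATIVE


def extend_annotation(
    annotations: Dict[str, Dict[str, Optional[str]]]
) -> Dict[str, Optional[str]]:
    """Combine per-method annotations, keeping the first informative function per gene."""
    extended: Dict[str, Optional[str]] = {}
    # Dicts preserve insertion order, so earlier methods take priority.
    for method in annotations:
        for gene, function in annotations[method].items():
            if not is_informative(extended.get(gene)):
                extended[gene] = function if is_informative(function) else None
    return extended


def count_annotated(annotation: Dict[str, Optional[str]]) -> int:
    """Number of genes that carry a putative function."""
    return sum(is_informative(f) for f in annotation.values())


# Hypothetical annotations of three genes by two annotation methods (AMs).
example = {
    "method_A": {"g1": "DNA polymerase III", "g2": "hypothetical protein", "g3": None},
    "method_B": {"g1": "DNA polymerase", "g2": "ABC transporter permease", "g3": "hypothetical protein"},
}
extended = extend_annotation(example)
print(count_annotated(example["method_A"]), count_annotated(extended))
```

In this toy input the combined annotation assigns a function to one more gene than method_A alone, which is the kind of gain the abstract reports; a real comparison would also need to reconcile gene coordinates and conflicting function names across methods.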

3.
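The Sichuan partridge (Arborophila rufipectus) is a rare and endangered bird endemic to China. In this study, the heart, liver, and kidney of one adult male Sichuan partridge were subjected to transcriptome sequencing, assembly, and annotation. After filtering, the raw sequences yielded 5.70 G, 4.60 G, and 5.16 G of data, respectively. The 286,661 transcripts assembled with Trinity yielded 234,488 genes after removal of redundancy. BUS...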

4.
The Genome Warehouse (GWH) is a public repository housing genome assembly data for a wide range of species and delivering a series of web services for genome data submission, storage, release, and sharing. As one of the core resources in the National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB; https://ngdc.cncb.ac.cn), GWH accepts both full and partial (chloroplast, mitochondrion, and plasmid) genome sequences with different assembly levels, as well as updates of existing genome assemblies. For each assembly, GWH collects detailed genome-related metadata on the biological project, biological sample, and genome assembly, in addition to the genome sequence and annotation. To archive high-quality genome sequences and annotations, GWH is equipped with a uniform and standardized procedure for quality control. Besides basic browse and search functionalities, all released genome sequences and annotations can be visualized with JBrowse. By May 21, 2021, GWH had received 19,124 direct submissions covering 1108 species and had released 8772 of them. Collectively, GWH serves as an important resource for genome-scale data management and provides free and publicly accessible data to support research activities throughout the world. GWH is publicly accessible at https://ngdc.cncb.ac.cn/gwh.

5.
Wild castor grows in the high-altitude tropical desert of the African Plateau, a region known for high ultraviolet radiation, strong light, and extremely dry conditions. To investigate the potential genetic basis of adaptation to both highland and tropical deserts, we generated a chromosome-level genome sequence assembly of the wild castor accession WT05, with a genome size of 316 Mb, a scaffold N50 of 31.93 Mb, and a contig N50 of 8.96 Mb. Compared with cultivated castor and other Euphorbiac...
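
For readers unfamiliar with the N50 statistic quoted for scaffolds and contigs, the following minimal sketch shows the standard definition: the length L such that sequences of length at least L together cover at least half of the total assembly. The lengths in the example are made up for illustration and are not taken from the WT05 assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("need at least one sequence of positive length")
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the full sum always covers half the total")


# Hypothetical contig lengths: total 290, so half is 145.
# 80 covers 80, 80 + 70 covers 150, which reaches half, so the N50 is 70.
example_n50 = n50([80, 70, 50, 40, 30, 20])
print(example_n50)
```

A higher N50 generally indicates a more contiguous assembly, which is why scaffold and contig values are reported separately.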

6.
Next-generation sequencing (NGS) technologies generate thousands to millions of genetic variants per sample. Identification of potential disease-causal variants is labor intensive, as it relies on filtering using various annotation metrics and consideration of multiple pathogenicity prediction scores. We have developed VPOT (variant prioritization ordering tool), a Python-based command line tool that allows researchers to create a single fully customizable pathogenicity ranking score from any number of annotation values, each with a user-defined weighting. The use of VPOT can be informative when analyzing entire cohorts, as variants in a cohort can be prioritized. VPOT also provides additional functions to allow variant filtering based on a candidate gene list or by affected status in a family pedigree. VPOT outperforms similar tools in terms of efficacy, flexibility, scalability, and computational performance. VPOT is freely available for public use at GitHub (https://github.com/VCCRI/VPOT/). Documentation for installation along with a user tutorial, a default parameter file, and test data are provided.
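
The notion of a single ranking score built from several annotation values, each with a user-defined weight, can be sketched as a weighted sum. This is only an illustration of the general idea: it does not reproduce VPOT's scoring rules or parameter file format, and the annotation names, weights, and variant identifiers are hypothetical.

```python
from typing import Dict, List


def ranking_score(annotations: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of annotation values; a missing annotation contributes 0."""
    return sum(weight * annotations.get(name, 0.0) for name, weight in weights.items())


def prioritize(variants: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Order variants from highest to lowest ranking score."""
    return sorted(
        variants,
        key=lambda variant: ranking_score(variant["annotations"], weights),
        reverse=True,
    )


# Hypothetical user-defined weights and pre-scaled annotation values in [0, 1].
weights = {"cadd_scaled": 2.0, "rarity": 1.0}
variants = [
    {"id": "var1", "annotations": {"cadd_scaled": 0.5, "rarity": 0.25}},
    {"id": "var2", "annotations": {"cadd_scaled": 0.75, "rarity": 1.0}},
    {"id": "var3", "annotations": {"rarity": 0.5}},
]
ranked_ids = [variant["id"] for variant in prioritize(variants, weights)]
print(ranked_ids)
```

Treating a missing annotation as zero is a design choice made for this sketch; a real tool may handle missing values differently, for example by excluding them from the score.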

7.
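The negative-sense RNA genome of influenza A virus consists of eight segments that together encode 12 viral proteins. At the final stage of viral assembly, these genomic virion (v)RNAs are incorporated into the virion as it buds from the apical plasma membrane of the cell. Genome segmentation confers evolutionary advantages on influenza virus, but it also poses a problem: during virion assembly, at least one copy of each of the eight segments is required to produce a fully infectious virus particle. Historically there has been a debate between two views. One favours a specific packaging mechanism that ensures incorporation of a full genome complement; the other favours an alternative model in which genome segments are selected at random but packaged in sufficient numbers to ensure that a reasonable proportion of virions are viable. In recent years a consensus has emerged: most virions contain only eight segments, and specific mechanisms do operate to select one copy of each vRNA. This review summarizes the work that led to this conclusion, describes recent progress in identifying the specific packaging signals, and discusses possible mechanisms by which these RNA elements operate.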

8.
Alfalfa (Medicago sativa L.) is the most important legume forage crop worldwide, with high nutritional value and yield. For a long time, the breeding of alfalfa was hampered by the lack of reliable information on the autotetraploid genome and molecular markers linked to important agronomic traits. We herein report the de novo assembly of the allele-aware chromosome-level genome of Zhongmu-4, a cultivar widely cultivated in China, and a comprehensive database of genomic variations based on resequencing of...

9.
RNA interference (RNAi) is a widely adopted tool for loss-of-function studies but RNAi results only have biological relevance if the reagents are appropriately mapped to genes. Several groups have designed and generated RNAi reagent libraries for studies in cells or in vivo for Drosophila and other species. At first glance, matching RNAi reagents to genes appears to be a simple problem, as each reagent is typically designed to target a single gene. In practice, however, the reagent–gene relationship is complex. Although the sequences of oligonucleotides used to generate most types of RNAi reagents are static, the reference genome and gene annotations are regularly updated. Thus, at the time a researcher chooses an RNAi reagent or analyzes RNAi data, the most current interpretation of the RNAi reagent–gene relationship, as well as related information regarding specificity (e.g., predicted off-target effects), can be different from the original interpretation. Here, we describe a set of strategies and an accompanying online tool, UP-TORR (for Updated Targets of RNAi Reagents; www.flyrnai.org/up-torr), useful for accurate and up-to-date annotation of cell-based and in vivo RNAi reagents. Importantly, UP-TORR automatically synchronizes with gene annotations daily, retrieving the most current information available, and for Drosophila, also synchronizes with the major reagent collections. Thus, UP-TORR allows users to choose the most appropriate RNAi reagents at the onset of a study, as well as to perform the most appropriate analyses of results of RNAi-based studies.

10.
11.
Published genomes frequently contain erroneous gene models that represent issues associated with identification of open reading frames, start sites, splice sites, and related structural features. The source of these inconsistencies is often traced back to integration across text file formats designed to describe long read alignments and predicted gene structures. In addition, the majority of gene prediction frameworks do not provide robust downstream filtering to remove problematic gene annotations, nor do they represent these annotations in a format consistent with current file standards. These frameworks also lack consideration for functional attributes, such as the presence or absence of protein domains that can be used for gene model validation. To provide oversight to the increasing number of published genome annotations, we present a software package, the Gene Filtering, Analysis, and Conversion (gFACs), to filter, analyze, and convert predicted gene models and alignments. The software operates across a wide range of alignment, analysis, and gene prediction files with a flexible framework for defining gene models with reliable structural and functional attributes. gFACs supports common downstream applications, including genome browsers, and generates extensive details on the filtering process, including distributions that can be visualized to further assess the proposed gene space. gFACs is freely available and implemented in Perl with support from BioPerl libraries at https://gitlab.com/PlantGenomicsLab/gFACs.

12.
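Feasibility study of using proteome expression profiles for genome function prediction   Total citations: 1, self-citations: 0, citations by others: 1
Using ECO2DBASE (Edition 6) as the study material, this paper explores whether the information on dynamic cellular activity provided by proteome expression profiles can improve genome function prediction. After a fairly complete clustering scheme for cellular function clusters (CRC) had been designed, 79 proteins were examined and found to group into 4 different CRCs. The results show that functionally related proteins tend to cluster in the same CRC; for example, 9 aminoacyl-tRNA synthetases and 4 heat-shock proteins were accurately grouped into CRC2 and CRC3, respectively. These results suggest that, given sufficient proteome research data and an effective algorithm, proteome expression profiles can supply genome function prediction with very important functional information beyond sequence similarity.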

13.
The identification of oleaginous yeast species capable of simultaneously utilizing xylose and glucose as substrates to generate value-added biological products is an area of key economic interest. We have previously demonstrated that the Cutaneotrichosporon dermatis NICC30027 yeast strain is capable of simultaneously assimilating both xylose and glucose, resulting in considerable lipid accumulation. However, as no high-quality genome sequencing data or associated annotations for this strain are available at present, it remains challenging to study the metabolic mechanisms underlying this phenotype. Herein, we report a 39,305,439 bp draft genome assembly for C. dermatis NICC30027 comprised of 37 scaffolds, with 60.15% GC content. Within this genome, we identified 524 tRNAs, 142 sRNAs, 53 miRNAs, 28 snRNAs, and eight rRNA clusters. Moreover, repeat sequences totaling 1,032,129 bp in length were identified (2.63% of the genome), as were 14,238 unigenes that were 1,789.35 bp in length on average (64.82% of the genome). The NCBI non-redundant protein sequences (NR) database was employed to successfully annotate 11,795 of these unigenes, while 3,621 and 11,902 were annotated with the Swiss-Prot and TrEMBL databases, respectively. Unigenes were additionally subjected to pathway enrichment analyses using the Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Cluster of Orthologous Groups of proteins (COG), Clusters of orthologous groups for eukaryotic complete genomes (KOG), and Non-supervised Orthologous Groups (eggNOG) databases. Together, these results provide a foundation for future studies aimed at clarifying the mechanistic basis for the ability of C. dermatis NICC30027 to simultaneously utilize glucose and xylose to synthesize lipids.

14.
The Network Makeup Artist (NORMA) is a web tool for interactive network annotation visualization and topological analysis, able to handle multiple networks and annotations simultaneously. Precalculated annotations (e.g., Gene Ontology, Pathway enrichment, community detection, or clustering results) can be uploaded and visualized in a network, either as colored pie-chart nodes or as color-filled areas in a 2D/3D Venn-diagram-like style. In the case where no annotation exists, algorithms for automated community detection are offered. Users can adjust the network views using standard layout algorithms or allow NORMA to slightly modify them for visually better group separation. Once a network view is set, users can interactively select and highlight any group of interest in order to generate publication-ready figures. Briefly, with NORMA, users can encode three types of information simultaneously. These are 1) the network, 2) the communities or annotations of interest, and 3) node categories or expression values. Finally, NORMA offers basic topological analysis and direct topological comparison across any of the selected networks. NORMA service is available at http://norma.pavlopouloslab.info, whereas the code is available at https://github.com/PavlopoulosLab/NORMA.

15.

16.
Journal of Genetics and Genomics (《遗传学报》), 2022, 49(6): 547-558
Sorbus pohuashanensis (Hance) Hedl. is a potential horticultural and medicinal plant, but its genomic and genetic backgrounds remain unknown. Here, we sequence and assemble the S. pohuashanensis reference genome using PacBio long reads. Based on the new reference genome, we resequence a core collection of 22 Sorbus spp. samples, which are divided into 2 groups (G1 and G2) based on phylogenetic and PCA analyses. These phylogenetic clusters are highly consistent with their classification based on leaf shape. Natural hybridization between the G1 and G2 groups is evidenced by a sample (R21) with a highly heterozygous genotype. Nucleotide diversity (π) analysis shows that G1 has a higher diversity than G2 and that G2 originated from G1. During evolution, the gene families involved in photosynthesis pathways expanded and the gene families involved in energy consumption contracted. RNA-seq data suggest that flavonoid biosynthesis and heat-shock protein (HSP)-heat-shock factor (HSF) pathways play important roles in protection against sunburn. This study provides new insights into the evolution of Sorbus spp. genomes. In addition, the genomic resources and the identified genetic variations, especially those related to stress resistance, will help future efforts to produce and breed Sorbus spp.

17.
The DOE-JGI Microbial Annotation Pipeline (DOE-JGI MAP) supports gene prediction and/or functional annotation of microbial genomes towards comparative analysis with the Integrated Microbial Genome (IMG) system. DOE-JGI MAP annotation is applied on nucleotide sequence datasets included in the IMG-ER (Expert Review) version of IMG via the IMG ER submission site. Users can submit the sequence datasets consisting of one or more contigs in a multi-fasta file. DOE-JGI MAP annotation includes prediction of protein coding and RNA genes, as well as repeats and assignment of product names to these genes.

18.
19.
Advances in modern sequencing technologies allow us to generate sufficient data to analyze hundreds of bacterial genomes from a single machine in a single day. This potential for sequencing massive numbers of genomes calls for fully automated methods to produce high-quality assemblies and variant calls. We introduce Pilon, a fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions. Pilon works with many types of sequence data, but is particularly strong when supplied with paired-end data from two Illumina libraries with small (e.g., 180 bp) and large (e.g., 3–5 kb) inserts. Pilon significantly improves draft genome assemblies by correcting bases, fixing mis-assemblies, and filling gaps. For both haploid and diploid genomes, Pilon produces more contiguous genomes with fewer errors, enabling identification of more biologically relevant genes. Furthermore, Pilon identifies small variants with high accuracy compared with state-of-the-art tools and is unique in its ability to accurately identify large sequence variants, including duplications, and to resolve large insertions. Pilon is being used to improve the assemblies of thousands of new genomes and to identify variants from thousands of clinically relevant bacterial strains. Pilon is freely available as open source software.

20.
Even with the ubiquity of Sanger sequencing, automated assembly tools are predominantly stand-alone software packages for desktop/laptop use with very few online equivalents, thus geospatially constraining sequence analysis and assembly. With increased data output worldwide, there is also a need for automated quality checks and trimming prior to large assemblies, along with automated detection of mutations. Through web servers with expanded automation and functionalities, even smartphones/phablets can be used to perform complex analyses previously limited to desktops, especially if they can upload files from cloud storage. To facilitate such online accessible sequence assembly and analysis, we created the Yet Another Quick Assembly, Analysis and Trimming Tool web server for the automated de novo assembly of multiple .ab1 and .FASTQ sequencing reads, with automated trimming and scanning of the assembled sequences for single nucleotide polymorphisms and insertions or deletions. It requires no software installation, so it can be accessed from anywhere with Internet access and with minimal dependency on other software and web tools.
