One of the most important objects in bioinformatics is a gene product (protein or RNA). For many gene products, functional information is summarized in a set of Gene Ontology (GO) annotations. For these genes, it is reasonable to include similarity measures based on the terms found in the GO or other taxonomy. In this paper, we introduce several novel measures for computing the similarity of two gene products annotated with GO terms. The fuzzy measure similarity (FMS) has the advantage that it takes into consideration the context of both complete sets of annotation terms when computing the similarity between two gene products. When the two gene products are not annotated by common taxonomy terms, we propose a method that avoids a zero similarity result. To account for variations in annotation reliability, we propose a similarity measure based on the Choquet integral. These similarity measures provide extra tools for the biologist in search of functional information for gene products. Initial testing on a group of 194 sequences representing three protein families shows a higher correlation of the FMS and Choquet similarities to the BLAST sequence similarities than traditional similarity measures such as pairwise average or pairwise maximum.
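The two traditional baselines named in the abstract above, pairwise average and pairwise maximum, can be sketched as follows. This is a minimal illustration, not the paper's method: the term-to-term score used here (Jaccard overlap of ancestor sets) is an assumed stand-in, and the toy taxonomy, term names, and function names are all hypothetical.

```python
from itertools import product
from statistics import mean


def term_similarity(a, b, ancestors):
    """Jaccard overlap of two terms' ancestor sets (assumed stand-in measure)."""
    set_a = ancestors[a] | {a}
    set_b = ancestors[b] | {b}
    return len(set_a & set_b) / len(set_a | set_b)


def pairwise_scores(terms1, terms2, ancestors):
    """Score every term of one gene product against every term of the other."""
    return [term_similarity(a, b, ancestors) for a, b in product(terms1, terms2)]


def pairwise_average(terms1, terms2, ancestors):
    """Mean of all term-to-term scores."""
    return mean(pairwise_scores(terms1, terms2, ancestors))


def pairwise_maximum(terms1, terms2, ancestors):
    """Best single term-to-term score."""
    return max(pairwise_scores(terms1, terms2, ancestors))


# Hypothetical toy taxonomy: each term maps to the set of its ancestors.
ANCESTORS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding", "root"},
    "rna_binding": {"binding", "root"},
}

gene_a = ["dna_binding"]
gene_b = ["rna_binding", "catalysis"]

avg_score = pairwise_average(gene_a, gene_b, ANCESTORS)
max_score = pairwise_maximum(gene_a, gene_b, ANCESTORS)
```

Both baselines reduce the two annotation sets to independent term pairs, which is the limitation the abstract contrasts with measures that consider both complete annotation sets.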

8.

First insights into the giant panda (Ailuropoda melanoleuca) blood transcriptome: a resource for novel gene loci and immunogenetics

Lianming Du Wujiao Li Zhenxin Fan Fujun Shen Mingyu Yang Zili Wang Zuoyi Jian Rong Hou Bisong Yue Xiuyue Zhang 《Molecular ecology resources》2015,15(4):1001-1013

9.

The ‘TranSeq’ 3′‐end sequencing method for high‐throughput transcriptomics and gene space refinement in plant genomes

Oren Tzfadia Samuel Bocobza Jonas Defoort Efrat Almekias‐Siegl Sayantan Panda Matan Levy Veronique Storme Stephane Rombauts Diego Adhemar Jaitin Hadas Keren‐Shaul Yves Van de Peer Asaph Aharoni 《The Plant journal : for cell and molecular biology》2018,96(1):223-232

10.

Mercator: a fast and simple web server for genome scale functional annotation of plant sequence data

MARC LOHSE AXEL NAGEL THOMAS HERTER PATRICK MAY MICHAEL SCHRODA RITA ZRENNER TAKAYUKI TOHGE ALISDAIR R. FERNIE MARK STITT BJÖRN USADEL 《Plant, cell & environment》2014,37(5):1250-1258

Next‐generation technologies generate an overwhelming amount of gene sequence data. Efficient annotation tools are required to make these data amenable to functional genomics analyses. The Mercator pipeline automatically assigns functional terms to protein or nucleotide sequences. It uses the MapMan ‘BIN’ ontology, which is tailored for functional annotation of plant ‘omics’ data. The classification procedure performs parallel sequence searches against reference databases, compiles the results and computes the most likely MapMan BINs for each query. In the current version, the pipeline relies on manually curated reference classifications originating from the three reference organisms (Arabidopsis, Chlamydomonas, rice), various other plant species that have a reviewed SwissProt annotation, and more than 2000 protein domain and family profiles at InterPro, CDD and KOG. Functional annotations predicted by Mercator achieve accuracies above 90% when benchmarked against manual annotation. In addition to mapping files for direct use in the visualization software MapMan, Mercator provides graphical overview charts, detailed annotation information in a convenient web browser interface and a MapMan‐to‐GO translation table to export results as GO terms. Mercator is available free of charge via http://mapman.gabipd.org/web/guest/app/Mercator.

11.

A large‐scale proteogenomics study of apicomplexan pathogens—Toxoplasma gondii and Neospora caninum

Ritesh Krishna Dong Xia Sanya Sanderson Achchuthan Shanmugasundram Sarah Vermont Axel Bernal Gianluca Daniel‐Naguib Fawaz Ghali Brian P. Brunk David S. Roos Jonathan M. Wastling Andrew R. Jones 《Proteomics》2015,15(15):2618-2628

Proteomics data can supplement genome annotation efforts, for example being used to confirm gene models or correct gene annotation errors. Here, we present a large‐scale proteogenomics study of two important apicomplexan pathogens: Toxoplasma gondii and Neospora caninum. We queried proteomics data against a panel of official and alternate gene models generated directly from RNA‐Seq data, using several newly generated and some previously published MS datasets for this meta‐analysis. We identified a total of 201 996 and 39 953 peptide‐spectrum matches for T. gondii and N. caninum, respectively, at a 1% peptide FDR threshold. This equated to the identification of 30 494 distinct peptide sequences and 2921 proteins (matches to official gene models) for T. gondii, and 8911 peptides/1273 proteins for N. caninum following stringent protein‐level thresholding. We have also identified 289 and 140 loci for T. gondii and N. caninum, respectively, which mapped to RNA‐Seq‐derived gene models used in our analysis and were apparently absent from the official annotation (release 10 from EuPathDB) of these species. We present several examples in our study where the RNA‐Seq evidence can help in correction of the current gene model and can help in discovery of potential new genes. The findings of this study have been integrated into the EuPathDB. The data have been deposited to the ProteomeXchange with identifiers PXD000297 and PXD000298.

12.

Improving the Annotation of Arabidopsis lyrata Using RNA-Seq Data

Vimal Rawat Ahmed Abdelsamad Björn Pietzenuk Danelle K. Seymour Daniel Koenig Detlef Weigel Ales Pecinka Korbinian Schneeberger 《PloS one》2015,10(9)

13.

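Transcriptome analysis of Dunaliella viridis

Shuaiqi Zhu Yifu Gong Yuqing Hang Hao Liu Heyu Wang 《Hereditas (Beijing)》2015,37(8):828-836

To gain a deeper understanding of the gene information and functions of Dunaliella viridis, its salt-tolerance-related pathway (glycerolipid metabolism) and the key enzymes involved, this study sequenced the D. viridis transcriptome for the first time using Illumina HiSeq™ 2000 high-throughput sequencing. The data were assembled into transcripts with Trinity. All transcripts then underwent COG (Clusters of Orthologous Groups), GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) classification and functional annotation, pathway annotation, and prediction of protein-coding regions (open reading frames, ORFs), and the key enzyme genes of the glycerolipid metabolism pathway were analysed. Transcriptome sequencing yielded 81 593 transcripts, of which 77 117 contained ORFs, about 94.50% of all transcripts. COG classification assigned 16 569 transcripts to 24 categories. GO classification annotated 76 436 transcripts. Among all annotation categories, biological process contained the most transcripts, 30 678, accounting for 40.14% of the total number of transcripts. KEGG analysis showed that 317 standard pathways contained 26 428 transcripts, and the category with the most transcripts was metabolism, with 9949 (37.65%). There were 131 metabolism-related pathways, accounting for 41.32% of all annotated pathways. Only one key enzyme transcript (dihydroxyacetone kinase) was found in the glycerolipid metabolism pathway; this enzyme may be closely related to glycerol synthesis in D. viridis under salt stress. This study further improves the gene information available for D. viridis and lays a solid foundation for research on its metabolic pathways.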

14.

The DOE-JGI Standard Operating Procedure for the Annotations of Microbial Genomes

Mavromatis K Ivanova NN Chen IM Szeto E Markowitz VM Kyrpides NC 《Standards in genomic sciences》2009,1(1):63-67

The DOE-JGI Microbial Annotation Pipeline (DOE-JGI MAP) supports gene prediction and/or functional annotation of microbial genomes towards comparative analysis with the Integrated Microbial Genome (IMG) system. DOE-JGI MAP annotation is applied to nucleotide sequence datasets included in the IMG-ER (Expert Review) version of IMG via the IMG ER submission site. Users can submit sequence datasets consisting of one or more contigs in a multi-fasta file. DOE-JGI MAP annotation includes prediction of protein coding and RNA genes, as well as repeats and assignment of product names to these genes.
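For readers unfamiliar with the multi-fasta input mentioned above, the following minimal sketch shows how a file holding several contigs can be split into records. It only illustrates the general FASTA convention (a `>` header line followed by sequence lines); it is not the IMG ER submission site's parser, and the contig names and sequences are made-up examples.

```python
def read_multifasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-contig dataset.
EXAMPLE = """>contig_1
ACGTAC
GTTA
>contig_2
TTGACA
"""

contigs = dict(read_multifasta(EXAMPLE.splitlines()))
```

Here `contigs` maps each header to its joined sequence, so wrapped sequence lines belonging to one contig end up as a single string.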

15.

Comparative genomics of six Juglans species reveals disease‐associated gene family contractions

Alexander J. Trouern‐Trend Taylor Falk Sumaira Zaman Madison Caballero David B. Neale Charles H. Langley Abhaya M. Dandekar Kristian A. Stevens Jill L. Wegrzyn 《The Plant journal : for cell and molecular biology》2020,102(2):410-423

16.

CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics

Mohit Verma Vinay Kumar Ravi K. Patel Rohini Garg Mukesh Jain 《PloS one》2015,10(8)

17.

The IGS Standard Operating Procedure for Automated Prokaryotic Annotation

Galens K Orvis J Daugherty S Creasy HH Angiuoli S White O Wortman J Mahurkar A Giglio MG 《Standards in genomic sciences》2011,4(2):244-251

The Institute for Genome Sciences (IGS) has developed a prokaryotic annotation pipeline that is used for coding gene/RNA prediction and functional annotation of Bacteria and Archaea. The fully automated pipeline accepts one or many genomic sequences as input and produces output in a variety of standard formats. Functional annotation is primarily based on similarity searches and motif finding combined with a hierarchical rule-based annotation system. The output annotations can also be loaded into a relational database and accessed through visualization tools.

18.

Sma3s: A Three-Step Modular Annotator for Large Sequence Datasets

Antonio Muñoz-Mérida Enrique Viguera M. Gonzalo Claros Oswaldo Trelles Antonio J. Pérez-Pulido 《DNA research》2014,21(4):341-353

19.

Transcriptome analysis in the beet webworm, Spoladea recurvalis (Lepidoptera: Crambidae)

Srinivasan Ramasamy 《Insect Science》2018,25(1):33-44

20.

CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts

Alison C Testa James K Hane Simon R Ellwood Richard P Oliver 《BMC genomics》2015,16(1)
