The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes
Authors: Overbeek Ross, Begley Tadhg, Butler Ralph M, Choudhuri Jomuna V, Chuang Han-Yu, Cohoon Matthew, de Crécy-Lagard Valérie, Diaz Naryttza, Disz Terry, Edwards Robert, Fonstein Michael, Frank Ed D, Gerdes Svetlana, Glass Elizabeth M, Goesmann Alexander, Hanson Andrew, Iwata-Reuyl Dirk, Jensen Roy, Jamshidi Neema, Krause Lutz, Kubal Michael, Larsen Niels, Linke Burkhard, McHardy Alice C, Meyer Folker, Neuweger Heiko, Olsen Gary, Olson Robert, Osterman Andrei, Portnoy Vasiliy, Pusch Gordon D, Rodionov Dmitry A, Rückert Christian, Steiner Jason, Stevens Rick, Thiele Ines, Vassieva Olga, Ye Yuzhen, Zagnitko Olga, Vonstein Veronika
Affiliation: Fellowship for Interpretation of Genomes, 15W155 81st Street, Burr Ridge, IL 60527, USA.
Abstract: The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180,177 distinct proteins with 2,133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms.
This article is indexed in PubMed and other databases.