Statistics of N-terminal alignment as a guide for refining prokaryotic gene annotation |
| |
Authors: | Sato Naoki, Tajima Naoyuki |
| |
Affiliation: | Department of Life Sciences, Graduate School of Arts and Sciences, University of Tokyo, Meguro-ku, Tokyo, Japan. naokisat@bio.c.u-tokyo.ac.jp |
| |
Abstract: | Identifying the correct N-terminus of a protein is an important step in genome annotation. However, incorrectly annotated N-termini are sometimes encountered in genomic databases. We analyzed the statistics of surplus or missing N-terminal amino acid residues in the tentatively translated coding sequences of cyanobacterial database entries and found that, on average, about 8-9% of the aligned proteins have a putatively incorrect N-terminus, although the percentage depended on the database entry. In an attempt to find more plausible N-termini for these proteins, we were able to estimate a better-aligning N-terminus in 90% of the cases. TTG was found as the putative initiation codon in most cases of recessed N-termini. This statistical approach, which is applicable to any group of prokaryotes, will help identify a plausible translation initiation site for each protein-coding gene in newly sequenced genomes and also provides a method for refining the N-termini of proteins in already published genomes. |
| |
This article is indexed in ScienceDirect, PubMed, and other databases. |
|
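The abstract describes re-estimating a start site when an alignment suggests surplus or missing N-terminal residues. The following is only a minimal sketch of that idea, not the authors' procedure: it lists in-frame ATG, GTG, and TTG codons around an annotated start and picks the one closest to the offset implied by an alignment. The function names (`candidate_starts`, `best_start`), the `max_codons` window, and the sign convention for offsets are all assumptions made for illustration.

```python
# Hypothetical sketch: names, window size, and offset convention are
# illustrative assumptions, not taken from the paper.
START_CODONS = {"ATG", "GTG", "TTG"}   # common prokaryotic initiation codons
STOP_CODONS = {"TAA", "TAG", "TGA"}


def candidate_starts(seq, annotated_start, max_codons=50):
    """List in-frame candidate start codons around an annotated start.

    seq             -- coding-strand DNA, upper case
    annotated_start -- 0-based index of the annotated start codon
    Returns sorted (offset, codon) pairs, where offset is counted in codons:
    negative = upstream (longer protein), positive = downstream (shorter).
    """
    candidates = []

    # Walk upstream codon by codon; an in-frame stop ends the open frame.
    pos = annotated_start - 3
    steps = 0
    while pos >= 0 and steps < max_codons:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            break
        if codon in START_CODONS:
            candidates.append(((pos - annotated_start) // 3, codon))
        pos -= 3
        steps += 1

    # Walk downstream inside the annotated reading frame.
    for k in range(1, max_codons + 1):
        pos = annotated_start + 3 * k
        codon = seq[pos:pos + 3]
        if len(codon) < 3 or codon in STOP_CODONS:
            break
        if codon in START_CODONS:
            candidates.append((k, codon))

    return sorted(candidates)


def best_start(candidates, expected_offset):
    """Pick the candidate closest to the offset suggested by an alignment."""
    return min(candidates, key=lambda c: abs(c[0] - expected_offset), default=None)
```

For example, if homologs consistently extend three residues beyond the annotated N-terminus, one would call `best_start(candidate_starts(seq, start), -3)` and inspect the returned codon; a real pipeline would also need to re-align the revised protein to confirm the improvement.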