Vertebrate gene predictions and the problem of large genes |
| |
Authors: | Wang Jun, Li ShengTing, Zhang Yong, Zheng HongKun, Xu Zhao, Ye Jia, Yu Jun, Wong Gane Ka-Shu |
| |
Institution: | Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 101300, China. |
| |
Abstract: | To find unknown protein-coding genes, annotation pipelines use a combination of ab initio gene prediction and similarity to experimentally confirmed genes or proteins. Here, we show that although the ab initio predictions have an intrinsically high false-positive rate, they also have a consistently low false-negative rate. The incorporation of similarity information is meant to reduce the false-positive rate, but in doing so it increases the false-negative rate. The crucial variable is gene size (including introns): genes of the most extreme sizes, especially very large genes, are most likely to be incorrectly predicted. |
| |
This article is indexed in PubMed and other databases. |
|