Improved database searches for orthologous sequences by conditioning on outgroup sequences
Authors: Cotter Philip J, Caffrey Daniel R, Shields Denis C
Affiliation:Department of Clinical Pharmacology, Royal College of Surgeons in Ireland, 123 Stephen's Green, Dublin 2, Ireland.
Abstract: MOTIVATION: Searches of biological sequence databases are usually focussed on distinguishing significant from random matches. However, the increasing abundance of related sequences in databases presents a second challenge: to distinguish the evolutionarily most closely related sequences (often orthologues) from more distantly related homologues. This is particularly important when searching a database of partial sequences, where short orthologous sequences from a non-conserved region will score much more poorly than non-orthologous (outgroup) sequences from a conserved region. RESULTS: Such inferences are shown to be improved by conditioning the search results on the scores of an outgroup sequence: the log-odds score of the outgroup sequence is subtracted from the log-odds score of each target sequence identified in the database. A test group of Caenorhabditis elegans kinase sequences and their identified C. elegans outgroups were searched against a test database of human Expressed Sequence Tag (EST) sequences, where the sets of true target sequences were known in advance. The outgroup-conditioned method identified 58% more true positives ahead of the first false positive than the straightforward search without an outgroup. On a test dataset of 151 proteins drawn from the C. elegans genome, where the putative 'outgroup' was assigned automatically, outgroup conditioning similarly found 50% more true positives. Thus, outgroup conditioning provides a means to improve the results of database searching with little increase in search computation time.
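The score subtraction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the query and the outgroup have each been searched against the same database, giving one log-odds score per target. The function names, the `missing_outgroup_score` default, and the toy EST identifiers and scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def outgroup_conditioned_ranking(
    query_scores: Dict[str, float],
    outgroup_scores: Dict[str, float],
    missing_outgroup_score: float = 0.0,  # assumption: how to treat targets the outgroup did not hit
) -> List[Tuple[str, float]]:
    """Rank targets by (query log-odds score) minus (outgroup log-odds score)."""
    conditioned = [
        (target, q - outgroup_scores.get(target, missing_outgroup_score))
        for target, q in query_scores.items()
    ]
    # Highest conditioned score first.
    conditioned.sort(key=lambda pair: pair[1], reverse=True)
    return conditioned


def true_positives_before_first_false(
    ranked: List[Tuple[str, float]], true_targets: Set[str]
) -> int:
    """Count true targets ranked ahead of the first false positive."""
    count = 0
    for target, _ in ranked:
        if target not in true_targets:
            break
        count += 1
    return count


# Toy data (hypothetical): EST_A is a short orthologue from a non-conserved
# region; EST_B is a non-orthologous hit from a conserved region.
query = {"EST_A": 20.0, "EST_B": 60.0, "EST_C": 70.0}
outgroup = {"EST_A": 5.0, "EST_B": 58.0, "EST_C": 40.0}
orthologues = {"EST_A", "EST_C"}

raw_ranking = sorted(query.items(), key=lambda pair: pair[1], reverse=True)
conditioned_ranking = outgroup_conditioned_ranking(query, outgroup)
```

In this toy case the raw ranking places EST_B ahead of EST_A, whereas the conditioned ranking moves EST_B to the bottom because the outgroup matches it almost as well as the query does. Passing either ranking to `true_positives_before_first_false` gives the kind of count the abstract reports.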
This article is indexed in PubMed, Oxford, and other databases.