Exploring functional variant discovery in non-coding regions with SInBaD
Authors:Kjong-Van Lehmann, Ting Chen
Institution:Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
Abstract:The 1000 Genomes Project and many similar ongoing large-scale sequencing efforts require new methods for predicting functional variants in both coding and non-coding regions in order to understand genotype-phenotype relationships. We report the design of a new model, SInBaD (Sequence-Information-Based-Decision-model), which relies on nucleotide conservation information to evaluate any annotated human variant in all known exons, introns, splice junctions and promoter regions. SInBaD builds separate mathematical models for promoters, exons and introns, using the human disease mutations annotated in the Human Gene Mutation Database as the training dataset for functional variants. Ten-fold cross-validation shows high prediction accuracy. Validation on test datasets demonstrates that variants predicted as functional have a significantly higher occurrence in cancer patients. We also applied our model to variants found in four different individual human genomes to identify a set of functional variants that might be of interest for further studies. Scores for all possible variants in all annotated genes are available at http://tingchenlab.cmb.usc.edu/sinbad/. SInBaD supports the current standard format for genotype data, the Variant Call Format (VCF 4.0), making it easy to integrate into any existing next-generation sequencing pipeline. The accuracy of SNP detection is the only limitation on the use of SInBaD.
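As an illustration of the pipeline integration mentioned in the abstract, the following is a minimal sketch of how VCF records might be matched against a table of precomputed variant scores. It is not SInBaD's own code or interface: the function names, the in-memory score table keyed by (chromosome, position, reference allele, alternate allele), and all example values are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical key for one variant: (chromosome, position, ref allele, alt allele).
VariantKey = Tuple[str, int, str, str]


def parse_vcf_lines(lines: Iterable[str]) -> Iterator[VariantKey]:
    """Yield one key per alternate allele from tab-separated VCF data lines."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, meta-information lines (##...) and the header (#CHROM...).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        chrom, pos, _variant_id, ref, alt = fields[:5]
        # The ALT column may list several comma-separated alleles.
        for allele in alt.split(","):
            if allele == ".":
                continue  # no alternate allele reported
            yield (chrom, int(pos), ref, allele)


def annotate(
    lines: Iterable[str], scores: Dict[VariantKey, float]
) -> List[Tuple[VariantKey, Optional[float]]]:
    """Pair each variant with its score, or None if the table has no entry."""
    return [(key, scores.get(key)) for key in parse_vcf_lines(lines)]


# Made-up records and made-up scores, for demonstration only.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t12345\t.\tA\tG,T\t50\tPASS\t.",
    "2\t67890\t.\tC\tT\t99\tPASS\t.",
]
EXAMPLE_SCORES = {
    ("1", 12345, "A", "G"): 0.91,
    ("2", 67890, "C", "T"): 0.12,
}

if __name__ == "__main__":
    for key, score in annotate(EXAMPLE_VCF, EXAMPLE_SCORES):
        print(key, score)
```

In this sketch a variant without an entry in the score table is reported with `None` rather than dropped, so a downstream pipeline step can decide how to treat unscored positions.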