HCV genotyping using statistical classification approach |
| |
Authors: | Ping Qiu, Xiao-Yan Cai, Wei Ding, Qing Zhang, Ellie D Norris, Jonathan R Greene |
| |
Institution: | (1) Molecular Design and Informatics, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, NJ 07033, USA; (2) Biotechnology and Molecular Bioanalytics, Schering-Plough Research Institute, 1011 Morris Avenue, Union, NJ 07083, USA |
| |
Abstract: | The genotype of Hepatitis C Virus (HCV) strains is an important determinant of the severity and aggressiveness of liver infection
as well as patient response to antiviral therapy. Fast and accurate determination of viral genotype could provide direction
in the clinical management of patients with chronic HCV infections. Using publicly available HCV nucleotide sequences, we
built a global Position Weight Matrix (PWM) for the HCV genome. Based on the PWM, a set of genotype-specific nucleotide sequence
"signatures" was selected from the 5' NCR, CORE, E1, and NS5B regions of the HCV genome. We evaluated how well these signatures
predict the most common HCV genotypes and subtypes. We observed that nucleotide sequence signatures
selected from the NS5B and E1 regions generally demonstrated stronger discriminant power in differentiating major HCV genotypes
and subtypes than those from the 5' NCR and CORE regions. Two discriminant methods were used to build predictive models. In
10-fold cross-validation, both support vector machine (SVM) and random forest classification methods achieved over 99% prediction
accuracy on a dataset of 1134 sequences for NS5B and 947 sequences for E1. Prediction accuracy for each
genotype is also reported. |
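To make the notion of a position weight matrix concrete, the following is a minimal sketch of how per-column base frequencies can be estimated from aligned nucleotide sequences and then used to score a query. Everything in it is an assumption for illustration: the toy sequences, the genotype labels attached to them, the pseudocount, and the function names `build_pwm`, `log_likelihood`, and `classify` do not come from the paper. The scoring step is a plain log-likelihood comparison between one matrix per genotype; it is not the global PWM, the signature selection, or the SVM and random forest models described in the abstract.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def build_pwm(aligned_seqs, pseudocount=1.0):
    """Return one {base: probability} dict per alignment column.

    A pseudocount is added to every base so that unseen bases keep a
    non-zero probability (an illustrative choice, not from the paper).
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    pwm = []
    for col in range(length):
        # Count only unambiguous bases; gaps and IUPAC codes are skipped.
        counts = Counter(s[col] for s in aligned_seqs if s[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm


def log_likelihood(seq, pwm):
    """Sum of log probabilities of the observed base at each column."""
    return sum(math.log(pwm[i][b]) for i, b in enumerate(seq) if b in ALPHABET)


def classify(seq, pwms_by_label):
    """Pick the label whose matrix gives the query the highest score."""
    return max(pwms_by_label, key=lambda g: log_likelihood(seq, pwms_by_label[g]))


# Hypothetical toy alignment; real input would be aligned NS5B or E1 regions.
toy_training = {
    "1a": ["ACGT", "ACGA", "ACGT"],
    "3a": ["TTGT", "TTGA", "TCGT"],
}
toy_pwms = {label: build_pwm(seqs) for label, seqs in toy_training.items()}
```

Calling `classify("ACGT", toy_pwms)` scores the query against each toy matrix and returns the best-scoring label. In a pipeline like the one the abstract outlines, columns that differ strongly between genotypes would be natural candidates for signature positions, with the resulting features passed to a separate classifier.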
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|