A Model-Based Approach to Identify Binding Sites in CLIP-Seq Data |
| |
Authors: | Tao Wang, Beibei Chen, MinSoo Kim, Yang Xie, Guanghua Xiao |
| |
Institution: | 1. Quantitative Biomedical Research Center, Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America; 2. Simmons Cancer Center, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America; CSIR-Institute of Microbial Technology, India |
| |
Abstract: | Cross-linking immunoprecipitation coupled with high-throughput sequencing (CLIP-Seq) has made it possible to identify the targeting sites of RNA-binding proteins in various cell culture systems and tissue types on a genome-wide scale. Here we present a novel model-based approach (MiClip) to identify high-confidence protein-RNA binding sites from CLIP-Seq datasets. This approach assigns a probability score to each potential binding site to help prioritize subsequent validation experiments. The MiClip algorithm has been tested on both HITS-CLIP and PAR-CLIP datasets. In the HITS-CLIP dataset, the signal/noise ratios of miRNA seed motif enrichment produced by the MiClip approach are between 17% and 301% higher than those produced by the ad hoc method for the top 10 most enriched miRNAs. In the PAR-CLIP dataset, the MiClip approach can identify ∼50% more validated binding targets than the original ad hoc method and two recently published methods. To facilitate the application of the algorithm, we have released an R package, MiClip (http://cran.r-project.org/web/packages/MiClip/index.html), and a public web-based graphical user interface (http://galaxy.qbrc.org/tool_runner?tool_id=mi_clip) for customized analysis. |
| |
Keywords: | |