共查询到20条相似文献,搜索用时 328 毫秒
1.
2.
3.
Jonathan M Carlson Arijit Chakravarty Robert H Gross 《Journal of computational biology》2006,13(3):686-701
The identification of potential protein binding sites (cis-regulatory elements) in the upstream regions of genes is key to understanding the mechanisms that regulate gene expression. To this end, we present a simple, efficient algorithm, BEAM (beam-search enumerative algorithm for motif finding), aimed at the discovery of cis-regulatory elements in the DNA sequences upstream of a related group of genes. This algorithm dramatically limits the search space of expanded sequences, converting the problem from one that is exponential in the length of motifs sought to one that is linear. Unlike sampling algorithms, our algorithm converges and is capable of finding statistically overrepresented motifs with a low failure rate. Further, our algorithm is not dependent on the objective function or the organism used. Limiting the space of candidate motifs enables the algorithm to focus only on those motifs that are most likely to be biologically relevant and enables the algorithm to use direct evaluations of background frequencies instead of resorting to probabilistic estimates. In addition, limiting the space of candidate motifs makes it possible to use computationally expensive objective functions that are able to correctly identify biologically relevant motifs. 相似文献
4.
5.
The identification of cis-regulatory modules (CRMs) can greatly advance our understanding of eukaryotic regulatory mechanism. Current methods to predict CRMs from known motifs either depend on multiple alignments or can only deal with a small number of known motifs provided by users. These methods are problematic when binding sites are not well aligned in multiple alignments or when the number of input known motifs is large. We thus developed a new CRM identification method MOPAT (motif pair tree), which identifies CRMs through the identification of motif modules, groups of motifs co-occurring in multiple CRMs. It can identify 'orthologous' CRMs without multiple alignments. It can also find CRMs given a large number of known motifs. We have applied this method to mouse developmental genes, and have evaluated the predicted CRMs and motif modules by microarray expression data and known interacting motif pairs. We show that the expression profiles of the genes containing CRMs of the same motif module correlate significantly better than those of a random set of genes do. We also show that the known interacting motif pairs are significantly included in our predictions. Compared with several current methods, our method shows better performance in identifying meaningful CRMs. 相似文献
6.
7.
8.
Regulatory motif finding by logic regression 总被引:1,自引:0,他引:1
9.
10.
11.
12.
13.
14.
We developed an algorithm, Lever, that systematically maps metazoan DNA regulatory motifs or motif combinations to sets of genes. Lever assesses whether the motifs are enriched in cis-regulatory modules (CRMs), predicted by our PhylCRM algorithm, in the noncoding sequences surrounding the genes. Lever analysis allows unbiased inference of functional annotations to regulatory motifs and candidate CRMs. We used human myogenic differentiation as a model system to statistically assess greater than 25,000 pairings of gene sets and motifs or motif combinations. We assigned functional annotations to candidate regulatory motifs predicted previously and identified gene sets that are likely to be co-regulated via shared regulatory motifs. Lever allows moving beyond the identification of putative regulatory motifs in mammalian genomes, toward understanding their biological roles. This approach is general and can be applied readily to any cell type, gene expression pattern or organism of interest. 相似文献
15.
16.
17.
18.
19.