Similar literature
1.
2.
We present an efficient algorithm for detecting putative regulatory elements in the upstream DNA sequences of genes, using gene expression information obtained from microarray experiments. Based on a generalized suffix tree, our algorithm looks for motif patterns whose appearance in the upstream region is most correlated with the expression levels of the genes. We are able to find the optimal pattern in time linear in the total length of the upstream sequences. We implement our algorithm and apply it to publicly available microarray gene expression data, and show that our method is able to discover biologically significant motifs, including various motifs that have been reported previously using the same data set. We further discuss applications for which the efficiency of the method is essential, as well as possible extensions to our algorithm.
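
As a rough illustration of the idea in this abstract, the sketch below scores every short substring of the upstream sequences by how strongly its presence separates the expression values. It is a brute-force stand-in, not the linear-time generalized suffix tree method the abstract describes, and the score (absolute difference in mean expression between sequences with and without the pattern) is an assumption chosen for illustration; the function name and the `max_len` parameter are hypothetical.

```python
from statistics import mean

def best_correlated_substring(seqs, expr, max_len=8):
    """Return (pattern, score) for the substring whose presence best
    separates expression values. Brute force; the score is an assumed one."""
    # Collect every substring of length 1..max_len from every sequence.
    candidates = set()
    for s in seqs:
        for i in range(len(s)):
            for j in range(i + 1, min(i + max_len, len(s)) + 1):
                candidates.add(s[i:j])

    best, best_score = None, -1.0
    for pat in sorted(candidates):  # sorted so ties resolve deterministically
        with_pat = [e for s, e in zip(seqs, expr) if pat in s]
        without = [e for s, e in zip(seqs, expr) if pat not in s]
        if not with_pat or not without:
            continue  # a pattern present everywhere or nowhere separates nothing
        score = abs(mean(with_pat) - mean(without))
        if score > best_score:
            best, best_score = pat, score
    return best, best_score
```

A suffix tree would let the same search visit each distinct occurrence set once instead of re-scanning every sequence for every candidate, which is where the linear running time claimed in the abstract would come from.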

3.
4.
We developed an algorithm, Lever, that systematically maps metazoan DNA regulatory motifs or motif combinations to sets of genes. Lever assesses whether the motifs are enriched in cis-regulatory modules (CRMs), predicted by our PhylCRM algorithm, in the noncoding sequences surrounding the genes. Lever analysis allows unbiased inference of functional annotations to regulatory motifs and candidate CRMs. We used human myogenic differentiation as a model system to statistically assess more than 25,000 pairings of gene sets and motifs or motif combinations. We assigned functional annotations to candidate regulatory motifs predicted previously and identified gene sets that are likely to be co-regulated via shared regulatory motifs. Lever allows moving beyond the identification of putative regulatory motifs in mammalian genomes, toward understanding their biological roles. This approach is general and can be applied readily to any cell type, gene expression pattern or organism of interest.

5.
6.
The identification of potential protein binding sites (cis-regulatory elements) in the upstream regions of genes is key to understanding the mechanisms that regulate gene expression. To this end, we present a simple, efficient algorithm, BEAM (beam-search enumerative algorithm for motif finding), aimed at the discovery of cis-regulatory elements in the DNA sequences upstream of a related group of genes. This algorithm dramatically limits the search space of expanded sequences, converting the problem from one that is exponential in the length of motifs sought to one that is linear. Unlike sampling algorithms, our algorithm converges and is capable of finding statistically overrepresented motifs with a low failure rate. Further, our algorithm is not dependent on the objective function or the organism used. Limiting the space of candidate motifs enables the algorithm to focus only on those motifs that are most likely to be biologically relevant and enables the algorithm to use direct evaluations of background frequencies instead of resorting to probabilistic estimates. In addition, limiting the space of candidate motifs makes it possible to use computationally expensive objective functions that are able to correctly identify biologically relevant motifs.
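
The beam-search enumeration described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the published BEAM implementation: the objective (number of input sequences containing the candidate exactly) is an assumption, since the abstract only says the method is independent of the objective function, and the function name and `beam_width` parameter are hypothetical.

```python
def beam_search_motif(seqs, length, beam_width=4):
    """Grow candidate motifs one base at a time, keeping only the
    beam_width best candidates at each length (assumed objective below)."""
    def score(pat):
        # Assumed objective: how many sequences contain the exact pattern.
        return sum(pat in s for s in seqs)

    def top(cands):
        # Best score first; alphabetical order breaks ties deterministically.
        return sorted(cands, key=lambda p: (-score(p), p))[:beam_width]

    beam = top(["A", "C", "G", "T"])
    for _ in range(length - 1):
        # Each survivor is extended by every base: at most 4 * beam_width
        # candidates per step, so total work is linear in the motif length.
        beam = top([p + b for p in beam for b in "ACGT"])
    return beam[0], score(beam[0])
```

Full enumeration would examine 4^length candidates; pruning to a fixed-width beam after each extension is what keeps the search linear in the motif length, at the cost of possibly discarding a prefix that would have led to the best motif.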

7.
Subtle motifs: defining the limits of motif finding algorithms   (cited 4 times in total: 0 self-citations, 4 citations by others)
MOTIVATION: What constitutes a subtle motif? Intuitively, it is a motif that is almost indistinguishable, in the statistical sense, from random motifs. This question has important practical consequences: consider, for example, a biologist who is generating a sample of upstream regulatory sequences with the goal of finding a regulatory pattern that is shared by these sequences. If the sequences are too short then one risks losing some of the regulatory patterns that are located further upstream. Conversely, if the sequences are too long, the motif becomes too subtle and one is then likely to encounter random motifs which are at least as significant statistically as the regulatory pattern itself. In practical terms one would like to recognize the sequence length threshold, or the twilight zone, beyond which the motifs are in some sense too subtle. RESULTS: The paper defines the motif twilight zone where every motif finding algorithm would be exposed to random motifs which are as significant as the one which is sought. We also propose an objective tool for evaluating the performance of subtle motif finding algorithms. Finally we apply these tools to evaluate the success of our MULTIPROFILER algorithm to detect subtle motifs.
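
To make the effect of sequence length concrete, the sketch below uses a common back-of-the-envelope estimate of how many random motifs to expect in background sequence. It is not the twilight-zone definition of the paper above; it assumes a planted (l, d) motif model (a motif of length l, each occurrence with at most d mismatches), uniform independent bases, and independence between positions, and the function name is hypothetical.

```python
from math import comb

def expected_random_motifs(l, d, t, n):
    """Approximate expected number of length-l patterns that occur with at
    most d mismatches in each of t random sequences of length n."""
    # Probability that one random l-mer lies within d mismatches of a fixed pattern.
    p = sum(comb(l, i) * 0.75**i * 0.25**(l - i) for i in range(d + 1))
    # Probability that at least one of the n - l + 1 windows of a sequence matches
    # (windows are treated as independent, which is only approximately true).
    q = 1.0 - (1.0 - p) ** (n - l + 1)
    # All t sequences must contain a match; sum over the 4**l possible patterns.
    return 4**l * q**t
```

Under these assumptions the estimate grows with n, which mirrors the abstract's point: lengthening the upstream sequences raises the number of chance motifs that look as good as the real one.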

8.
9.
10.
11.
12.
INCLUSive allows automatic multistep analysis of microarray data (clustering and motif finding). The clustering algorithm (adaptive quality-based clustering) groups together genes with highly similar expression profiles. The upstream sequences of the genes belonging to a cluster are automatically retrieved from GenBank and can be fed directly into Motif Sampler, a Gibbs sampling algorithm that retrieves statistically over-represented motifs in sets of sequences, in this case upstream regions of co-expressed genes.

13.
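Studies have shown that the first intron may participate in the regulation of gene transcription. Using statistical methods, we extracted potential combinatorial transcriptional regulatory motifs from the sequences spanning the upstream region to the first intron of human housekeeping genes, analyzed features such as the distance between motifs and their regional distribution, and examined whether and how introns participate in transcriptional regulation. In total, 960 potential transcriptional regulatory motif pairs were obtained in housekeeping genes, 57% of which matched factor pairs known from experiments to interact in transcription, involving 12 groups of factor pairs. The great majority of motif pairs (80%) were biased toward the upstream region and the "upstream-intron" region, which further supports the hypothesis that introns participate in transcriptional regulation and suggests that introns and upstream sequences act cooperatively in transcription. Motifs were concentrated near the transcription start site (TSS), and the two motifs of a pair tended to lie close together: about 60% were within 200 bp of each other and, in particular, the characteristic distance of 65% of motif pairs was within 100 bp. Such short spacing favors cooperation between transcription factors. These results will contribute to a deeper understanding of the mechanisms of human gene transcriptional regulation and of intron function.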

14.
Genomics and proteomics approaches generate distinct gene expression and protein profiles, listing individual genes embedded in broad functional terms as gene ontologies. However, interpretation of gene profiles in a regulatory and functional context remains a major issue. Elucidation of regulatory mechanisms at the gene expression level via analysis of promoter regions is a prominent procedure to decipher such gene regulatory networks. We propose a novel genetic algorithm (GA) to extract joint promoter modules in a set of coexpressed genes as resulting from differential gene expression experiments. Algorithm design has focused on the following constraints: (I) identification of the major promoter modules, which are (II) characterized by a maximum number of joint motifs and (III) are found in a maximum number of coexpressed genes. The capability of the GA in detecting multiple modules was evaluated on various test data sets, analyzing the impact of the number of motifs per promoter module, the number of genes associated with a module, as well as the total number of distinct promoter modules encoded in a sequence set. In addition to the test data sets, the GA was evaluated on two biological examples, namely a muscle-specific data set and the upstream sequences of the beta-actin gene (ACTB) derived from different species, complemented by a comparison to alternative promoter module identification routines.

15.
16.
17.
18.
19.
20.
Diptericins are antibacterial polypeptides which are strongly induced in the fat body and blood cells of dipteran insects in response to septic injury. The promoter of the single-copy, intronless diptericin gene of Drosophila contains several nucleotide sequences homologous to mammalian cis-regulatory motifs involved in the control of acute phase response genes. Extending our previous studies on the expression of the diptericin gene, we now report a quantitative analysis of the contribution of various putative regulatory elements to the bacterial inducibility of this gene, based on the generation of 60 transgenic fly lines carrying different elements fused to a reporter gene. Our data definitively identify two Kappa B-related motifs in the proximal promoter as the sites conferring inducibility and tissue-specific expression to the diptericin gene. These motifs alone, however, mediate only minimal levels of expression. Additional proximal regulatory elements are necessary to attain some 20% of the full response and we suspect a role for sequences homologous to mammalian IL6 response elements and interferon-gamma responsive sites in this up-regulation. The transgenic experiments also reveal the existence of a distal regulatory element located upstream of -0.6 kb which increases the level of expression by a factor of five.
