Similar literature
1.
2.
3.
4.
5.
MOTIVATION: Discovery of regulatory motifs in unaligned DNA sequences remains a fundamental problem in computational biology. Two categories of algorithms have been developed to identify common motifs from a set of DNA sequences. The first can be called a 'multiple genes, single species' approach. It proposes that a degenerate motif is embedded in some or all of the otherwise unrelated input sequences and tries to describe a consensus motif and identify its occurrences. It is often used for co-regulated genes identified through experimental approaches. The second approach can be called 'single gene, multiple species'. It requires orthologous input sequences and tries to identify unusually well conserved regions by phylogenetic footprinting. Both approaches perform well, but each has some limitations. It is tempting to combine the knowledge of co-regulation among different genes and conservation among orthologous genes to improve our ability to identify motifs. RESULTS: Based on the Consensus algorithm previously established by our group, we introduce a new algorithm called PhyloCon (Phylogenetic Consensus) that takes into account both conservation among orthologous genes and co-regulation of genes within a species. This algorithm first aligns conserved regions of orthologous sequences into multiple sequence alignments, or profiles, then compares profiles representing non-orthologous sequences. Motifs emerge as common regions in these profiles. Here we present a novel statistic to compare profiles of DNA sequences and a greedy approach to search for common subprofiles. We demonstrate that PhyloCon performs well on both synthetic and biological data. AVAILABILITY: Software available upon request from the authors. http://ural.wustl.edu/softwares.html
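
The idea of comparing profiles to find a common region can be illustrated with a small sketch. This is not PhyloCon itself: the abstract does not give the statistic or the details of the greedy search, so the code below substitutes a symmetric Kullback-Leibler divergence between columns and an exhaustive window search. The function names, the pseudocount, and the toy sequences are all hypothetical.

```python
import math

BASES = "ACGT"

def build_profile(aligned_seqs, pseudocount=0.5):
    """Turn equal-length aligned sequences into per-column base frequencies."""
    length = len(aligned_seqs[0])
    profile = []
    for col in range(length):
        counts = {b: pseudocount for b in BASES}
        for seq in aligned_seqs:
            base = seq[col].upper()
            if base in counts:  # gaps and ambiguous characters are skipped
                counts[base] += 1
        total = sum(counts.values())
        profile.append({b: counts[b] / total for b in BASES})
    return profile

def column_distance(p, q):
    """Symmetric Kullback-Leibler divergence (a stand-in, not the paper's statistic)."""
    return sum((p[b] - q[b]) * math.log(p[b] / q[b]) for b in BASES)

def best_common_window(profile_a, profile_b, width):
    """Exhaustively find the window pair with the smallest summed column distance.

    Returns (score, start_in_a, start_in_b). The paper describes a greedy
    search; this brute-force loop is only an assumption-free baseline.
    """
    best = None
    for i in range(len(profile_a) - width + 1):
        for j in range(len(profile_b) - width + 1):
            score = sum(
                column_distance(profile_a[i + k], profile_b[j + k])
                for k in range(width)
            )
            if best is None or score < best[0]:
                best = (score, i, j)
    return best

# Toy orthologous alignments for two unrelated genes sharing the site TGACTC.
gene_a = build_profile(["AATGACTCGG", "CCTGACTCAA", "GTTGACTCTC"])
gene_b = build_profile(["TGACTCAACG", "TGACTCGGTA", "TGACTCCTAT"])
result = best_common_window(gene_a, gene_b, width=6)
```

In this toy input the shared site sits at offset 2 in the first profile and offset 0 in the second, so the lowest-scoring window pair is expected at those positions.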

6.
7.
8.
9.
10.
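DNA methylation is an important epigenetic modification that plays a key role in the transcriptional regulation of genes. Aberrant DNA methylation can lead to complex diseases such as cancer, and identifying the DNA methylation regulatory sites associated with cancer genes is important both for elucidating the mechanisms of cancer initiation and progression and for identifying new cancer markers. This study identifies cancer-gene-associated DNA methylation regulatory sites by integrating high-throughput pan-cancer methylation profiles and gene expression profiles from The Cancer Genome Atlas (TCGA). For each cancer type, the correlation between CpG site methylation and the expression of the associated gene was computed batch by batch, and the CpG sites were screened for regulation of downstream genes (classified as strongly regulating, weakly regulating, and non-regulating sites). The results show that only half of the CpG sites regulate downstream genes. Analysis of the regulatory sites shared among cancers found that the shared sites differ from one cancer to another, indicating the existence of cancer-specific methylation regulatory sites. Furthermore, functional enrichment analysis of differentially methylated and differentially expressed genes revealed that genes regulated by methylation do participate in functions related to cancer initiation and progression. The results of this study are an important supplement to the current set of methylation regulatory sites and an important resource for identifying novel molecular marker signatures of cancer.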
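
The screening step described above (correlate CpG methylation with gene expression, then sort sites into strong, weak, and non-regulating) can be sketched as follows. The abstract does not state which correlation measure or cut-offs were used, so Pearson correlation and the thresholds 0.5 and 0.3 are assumptions chosen only for illustration, as are the function names and the sample values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def classify_site(methylation, expression, strong=0.5, weak=0.3):
    """Label a CpG site by the absolute correlation with its gene's expression.

    The thresholds are hypothetical; the study's own cut-offs are not given.
    """
    r = pearson(methylation, expression)
    if abs(r) >= strong:
        return "strong", r
    if abs(r) >= weak:
        return "weak", r
    return "none", r

# Made-up values for one CpG site and its gene across five samples of one batch.
beta_values = [0.1, 0.2, 0.3, 0.4, 0.5]
expression = [10.0, 8.0, 6.0, 4.0, 2.0]
label, r = classify_site(beta_values, expression)
```

To follow the batch-wise computation mentioned in the abstract, `classify_site` would be called once per batch and per cancer type, with the per-batch results combined by whatever rule the study used.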

11.
12.
Abstract In Escherichia coli K-12 the two adjacent operons exuT and uxaCA are divergently transcribed and each possesses its own control region. We show that in vivo the expression of these two operons seems to be sensitive to catabolite repression and to be cAMP-dependent. The nucleotide sequence of 318 nucleotides, including the entire control region of uxaCA and the exuT operator, was determined and two cAMP-CRP binding sites were identified.

13.
Selection of DNA binding sites by regulatory proteins   Total citations: 15, self-citations: 0, citations by others: 15

14.
15.
The biochemical diversity in the plant kingdom is estimated to well exceed 100,000 distinct compounds (Weckwerth, 2003) and 4000 to 20,000 metabolites per species seem likely (Fernie et al., 2004). In recent years extensive progress has been made towards the identification of enzymes and regulatory genes working in a complex network to generate this large arsenal of metabolites. Genetic loci influencing quantitative traits, e.g. metabolites or biomass, may be mapped to associated molecular markers, a method called quantitative trait locus mapping (QTL mapping), which may facilitate the identification of novel genes in biochemical pathways. Arabidopsis thaliana, as a model organism for seed plants, is a suitable target for metabolic QTL (mQTL) studies due to the availability of highly developed molecular and genetic tools, and the extensive knowledge accumulated on the metabolite profile. While intensely studied, in particular since the availability of its complete sequence, the genome of Arabidopsis still comprises a large proportion of genes with only tentative function based on sequence homology. From a total number of 33,518 genes currently listed (TAIR 9, http://www.arabidopsis.org), only about 25% have direct experimental evidence for their molecular function and biological process, while for more than 30% no biological data are available. Modern metabolomics approaches together with continually extended genomic resources will facilitate the task of assigning functions to those genes. In our previous study we reported on the identification of mQTL (Lisec et al., 2008). In this paper, we summarize the current status of mQTL analyses and causal gene identification in Arabidopsis and present evidence that a candidate gene located within the confidence interval of a fumarate mQTL (AT5G50950) encoding a putative fumarase is likely to be the causal gene of this QTL. 
The total number of genes molecularly identified based on mQTL studies is still limited, but the advent of multi-parallel analysis techniques for measurement of gene expression, as well as protein and metabolite abundances and for rapid gene identification will assist in the important task of assigning enzymes and regulatory genes to the growing network of known metabolic reactions.
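
As a rough illustration of what mapping a metabolite trait to molecular markers involves, the sketch below runs a single-marker scan: for each marker, lines are split by genotype and the metabolite levels of the two groups are compared with a Welch t-statistic. This is a generic textbook simplification, not the method of Lisec et al.; the 0/1 genotype coding (two parental alleles), the marker names, and the trait values are all hypothetical.

```python
import math
from statistics import mean, variance

def marker_t_statistic(genotypes, trait):
    """Welch t-statistic comparing trait values between genotype 0 and genotype 1."""
    group0 = [t for g, t in zip(genotypes, trait) if g == 0]
    group1 = [t for g, t in zip(genotypes, trait) if g == 1]
    if len(group0) < 2 or len(group1) < 2:
        return 0.0  # too few lines in one genotype class to estimate a variance
    diff = mean(group1) - mean(group0)
    se = math.sqrt(variance(group0) / len(group0) + variance(group1) / len(group1))
    if se == 0:
        # Degenerate case: no within-group spread at all.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se

def scan_markers(marker_genotypes, trait):
    """Score every marker and return the one with the largest absolute statistic."""
    scores = {
        name: marker_t_statistic(genotypes, trait)
        for name, genotypes in marker_genotypes.items()
    }
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores

# Made-up metabolite levels for six lines and genotypes at two markers.
metabolite_level = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
markers = {
    "m1": [0, 0, 0, 1, 1, 1],
    "m2": [0, 1, 0, 1, 0, 1],
}
best_marker, marker_scores = scan_markers(markers, metabolite_level)
```

A real mQTL analysis would add interval mapping, significance thresholds, and confidence intervals around the peak; the scan above only shows the core association step.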

16.
17.
A Bayesian model-based clustering approach is proposed for identifying differentially expressed genes in meta-analysis. A Bayesian hierarchical model is used as a scientific tool for combining information from different studies, and a mixture prior is used to separate differentially expressed genes from non-differentially expressed genes. Posterior estimation of the parameters and missing observations is done using a simple Markov chain Monte Carlo method. From the estimated mixture model, useful measures of the significance of a test, such as the Bayesian false discovery rate (FDR), the local FDR (Efron et al., 2001), and the integration-driven discovery rate (IDR; Choi et al., 2003), can be easily computed. The model-based approach is also compared with commonly used permutation methods, and it is shown that the model-based approach is superior to the permutation methods when there are many more under-expressed genes than over-expressed genes, or vice versa. The proposed method is applied to four publicly available prostate cancer gene expression data sets and to simulated data sets.
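
To make the link between the mixture model and the Bayesian FDR concrete, here is a minimal sketch. It assumes the fitted model yields, for each gene, a posterior probability of being differentially expressed, and it uses the common convention that the local FDR of a gene is one minus that probability and the Bayesian FDR of a selected list is the average local FDR over the list. The abstract does not spell out its exact estimator, so treat this as an illustration; the function name and the numbers are hypothetical.

```python
def bayesian_fdr(posterior_de, cutoff):
    """Select genes with posterior probability >= cutoff and estimate their FDR.

    posterior_de: posterior probability that each gene is differentially expressed.
    Returns (indices of selected genes, estimated Bayesian FDR of that list).
    """
    selected = [i for i, p in enumerate(posterior_de) if p >= cutoff]
    if not selected:
        return selected, 0.0
    # Local FDR of a gene is taken here as 1 - posterior probability of DE.
    local_fdr = [1.0 - posterior_de[i] for i in selected]
    return selected, sum(local_fdr) / len(selected)

# Made-up posterior probabilities for four genes.
posteriors = [0.99, 0.95, 0.60, 0.10]
selected_genes, fdr = bayesian_fdr(posteriors, cutoff=0.9)
```

Lowering `cutoff` lengthens the list of selected genes and raises the estimated FDR, which is the trade-off a user of such a model would tune.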

18.
19.
The amdS gene of Aspergillus nidulans, which encodes an acetamidase enzyme, is positively regulated by the trans-acting genes amdR, facB, amdA, and areA. Sequence changes in several cis-acting mutations in the 5' region of the gene which specifically affect amdS regulation were determined. The amdI9 mutation, which results in increased facB-dependent acetate induction, is due to a single-base change at base pair -210 relative to the start point of translation. The amdI93 mutation, which abolishes amdR-dependent omega-amino acid induction, is a deletion of base pairs -181 to -151. The amdI66 mutation, which causes increased gene activation in strains carrying amdA regulatory gene mutations, is a duplication of base pairs -107 to -90. Transformation of A. nidulans can generate transformants containing multiple integrated copies of plasmid sequences. When these plasmids carry a potential binding site for a regulatory gene product, growth on substrates whose catabolism requires genes activated by that regulatory gene can be reduced, apparently because of titration of the regulatory gene product. Introduction of 5' amdS sequences via cotransformation into strains of various genotypes was used to localize sequences apparently involved in binding of the products of the amdR, amdA, and facB genes. The position of these sequences is in agreement with the positions of the specific cis-acting mutations. Consistent with these results, a transformant of A. nidulans derived from a plasmid deleted for sequences upstream from -111 was found to have lost amdR- and facB-mediated control but was still regulated by the amdA gene. In addition, amdS expression in this transformant was still dependent on the areA gene.

20.