1.

BEAM: a beam search algorithm for the identification of cis-regulatory elements in groups of genes.

Jonathan M Carlson Arijit Chakravarty Robert H Gross 《Journal of computational biology》2006,13(3):686-701

The identification of potential protein binding sites (cis-regulatory elements) in the upstream regions of genes is key to understanding the mechanisms that regulate gene expression. To this end, we present a simple, efficient algorithm, BEAM (beam-search enumerative algorithm for motif finding), aimed at the discovery of cis-regulatory elements in the DNA sequences upstream of a related group of genes. This algorithm dramatically limits the search space of expanded sequences, converting the problem from one that is exponential in the length of motifs sought to one that is linear. Unlike sampling algorithms, our algorithm converges and is capable of finding statistically overrepresented motifs with a low failure rate. Further, our algorithm is not dependent on the objective function or the organism used. Limiting the space of candidate motifs enables the algorithm to focus only on those motifs that are most likely to be biologically relevant and enables the algorithm to use direct evaluations of background frequencies instead of resorting to probabilistic estimates. In addition, limiting the space of candidate motifs makes it possible to use computationally expensive objective functions that are able to correctly identify biologically relevant motifs.
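The idea of bounding the candidate space can be illustrated with a generic beam search over motif strings. This is a minimal sketch, not the BEAM authors' implementation: the function names, the beam width, and the toy objective (the number of input sequences containing the motif exactly) are all assumptions made for illustration. Because only `beam_width` candidates survive each round, the number of scored motifs grows linearly with the motif length rather than as 4^length.

```python
from typing import Callable, List, Tuple

ALPHABET = "ACGT"


def count_sequences_with(motif: str, sequences: List[str]) -> int:
    """Toy objective (assumption): how many sequences contain the motif exactly."""
    return sum(motif in seq for seq in sequences)


def beam_search_motifs(
    sequences: List[str],
    max_len: int,
    beam_width: int,
    score: Callable[[str, List[str]], int] = count_sequences_with,
) -> List[Tuple[str, int]]:
    """Grow motifs one base at a time, keeping only the best beam_width each round."""
    beam = [""]
    for _ in range(max_len):
        # Extend every surviving motif by each base: at most beam_width * 4 candidates.
        candidates = [motif + base for motif in beam for base in ALPHABET]
        # Highest score first; ties broken alphabetically so the result is deterministic.
        candidates.sort(key=lambda m: (-score(m, sequences), m))
        beam = candidates[:beam_width]
    return [(motif, score(motif, sequences)) for motif in beam]


if __name__ == "__main__":
    upstream = ["TTGACGTCAAA", "CCGACGTCAGG", "AAGACGTCATT", "GGGTTTTTCC"]
    for motif, hits in beam_search_motifs(upstream, max_len=6, beam_width=4):
        print(motif, hits)
```

Any objective function with the same signature can be passed as `score`, which mirrors the abstract's point that the search does not depend on a particular objective.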

2.

Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes

Shaoqiang Zhang Minli Xu Shan Li Zhengchang Su 《Nucleic acids research》2009,37(10):e72

Although cis-regulatory binding sites (CRBSs) are at least as important as the coding sequences in a genome, our general understanding of them in most sequenced genomes is very limited due to the lack of efficient and accurate experimental and computational methods for their characterization, which has largely hindered our understanding of many important biological processes. In this article, we describe a novel algorithm for genome-wide de novo prediction of CRBSs with high accuracy. We designed our algorithm to circumvent three identified difficulties for CRBS prediction using comparative genomics principles based on a new method for the selection of reference genomes, a new metric for measuring the similarity of CRBSs, and a new graph clustering procedure. When operon structures are correctly predicted, our algorithm can predict 81% of known individual binding sites belonging to 94% of known cis-regulatory motifs in the Escherichia coli K12 genome, while achieving high prediction specificity. Our algorithm has also achieved similar prediction accuracy in the Bacillus subtilis genome, suggesting that it is very robust, and thus can be applied to any other sequenced prokaryotic genome. When compared with the prior state-of-the-art algorithms, our algorithm outperforms them in both prediction sensitivity and specificity.

3.

SPACER: identification of cis-regulatory elements with non-contiguous critical residues

Chakravarty A Carlson JM Khetani RS DeZiel CE Gross RH 《Bioinformatics (Oxford, England)》2007,23(8):1029-1031

4.

Bioinformatic identification of candidate cis-regulatory elements involved in human mRNA polyadenylation (total citations: 5; self-citations: 0; citations by others: 5)

Hu J Lutz CS Wilusz J Tian B 《RNA (New York, N.Y.)》2005,11(10):1485-1493

Polyadenylation is an essential step for the maturation of almost all cellular mRNAs in eukaryotes. In human cells, most poly(A) sites are flanked by the upstream AAUAAA hexamer or a close variant, and downstream U/GU-rich elements. In yeast and plants, additional cis elements have been found to be located upstream of the poly(A) site, including UGUA, UAUA, and U-rich elements. In this study, we have developed a computer program named PROBE (Polyadenylation-Related Oligonucleotide Bidimensional Enrichment) to identify cis elements that may play regulatory roles in mRNA polyadenylation. By comparing human genomic sequences surrounding frequently used poly(A) sites with those surrounding less frequently used ones, we found that cis elements occurring in yeast and plants also exist in human poly(A) regions, including the upstream U-rich elements, and UAUA and UGUA elements. In addition, several novel elements were found to be associated with human poly(A) sites, including several G-rich elements. Thus, we suggest that many cis elements are evolutionarily conserved among eukaryotes, and human poly(A) sites have an additional set of cis elements that may be involved in the regulation of mRNA polyadenylation.

5.

Prediction of cis-regulatory elements: from high-information content analysis to motif identification

Li G Lu J Olman V Xu Y 《Journal of bioinformatics and computational biology》2007,5(4):817-838

6.

Strategies for characterising cis-regulatory elements in Xenopus.

Mustafa K Khokha Gabriela G Loots 《Briefings in Functional Genomics and Prot》2005,4(1):58-68

7.

Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae (total citations: 12; self-citations: 0; citations by others: 12)

Hughes JD Estep PW Tavazoie S Church GM 《Journal of molecular biology》2000,296(5):1205-1214

8.

CIRI: an efficient and unbiased algorithm for de novo circular RNA identification (total citations: 2; self-citations: 0; citations by others: 2)

Yuan Gao Jinfeng Wang Fangqing Zhao 《Genome biology》2015,16(1)

9.

Global identification of the genetic networks and cis-regulatory elements of the cold response in zebrafish

Peng Hu Mingli Liu Dong Zhang Jinfeng Wang Hongbo Niu Yimeng Liu Zhichao Wu Bingshe Han Wanying Zhai Yu Shen Liangbiao Chen 《Nucleic acids research》2015,43(19):9198-9213

10.

A targeted-replacement system for identification of signals for de novo methylation in Neurospora crassa. (total citations: 2; self-citations: 1; citations by others: 2)

V P Miao M J Singer M R Rountree E U Selker 《Molecular and cellular biology》1994,14(11):7059-7067

Transformation of eukaryotic cells can be used to test potential signals for DNA methylation. This approach is not always reliable, however, because of chromosomal position effects and because integration of multiple and/or rearranged copies of transforming DNA can influence DNA methylation. We developed a robust system to evaluate the potential of DNA fragments to function as signals for de novo methylation in Neurospora crassa. The requirements of the system were (i) a location in the N. crassa genome that becomes methylated only in the presence of a bona fide methylation signal and (ii) an efficient gene replacement protocol. We report here that the am locus fulfills these requirements, and we demonstrate its utility with the identification of a 2.7-kb fragment from the psi 63 locus as a new portable signal for de novo methylation.

11.

Towards de novo identification of metabolites by analyzing tandem mass spectra

Böcker S Rasche F 《Bioinformatics (Oxford, England)》2008,24(16):i49-i55

12.

Unique folding of precursor microRNAs: quantitative evidence and implications for de novo identification

Ng Kwang Loong S Mishra SK 《RNA (New York, N.Y.)》2007,13(2):170-187

13.

SPIDER: software for protein identification from sequence tags with de novo sequencing error

Han Y Ma B Zhang K 《Journal of bioinformatics and computational biology》2005,3(3):697-716

For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then used to match, accounting amino acid mutations, the sequences in a protein database. If the de novo sequencing gives correct tags, the homologs of the proteins can be identified by this approach and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same masses. We developed a new efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on Internet for free public use. This paper describes the algorithms and features of the SPIDER software.
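The error type described above, where one segment is replaced by another of nearly the same mass, can be made concrete with a small mass comparison. This is an illustrative sketch only and is not taken from SPIDER: the function names and the 0.01 Da tolerance are assumptions, and the table holds standard monoisotopic residue masses.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def segment_mass(segment: str) -> float:
    """Sum of residue masses for a peptide segment."""
    return sum(RESIDUE_MASS[residue] for residue in segment)


def same_mass(segment_a: str, segment_b: str, tolerance: float = 0.01) -> bool:
    """True if the two segments are indistinguishable within the tolerance (assumed value)."""
    return abs(segment_mass(segment_a) - segment_mass(segment_b)) <= tolerance


if __name__ == "__main__":
    # Gly-Gly has essentially the same mass as Asn, so a tag may report either one.
    print(same_mass("GG", "N"))
    # Gly-Ala and Gln are likewise confusable; Gln and Lys differ by about 0.036 Da.
    print(same_mass("GA", "Q"), same_mass("Q", "K"))
```

A tag matcher that tolerates such swaps would treat `GG` in a tag and `N` in a database sequence as a possible sequencing error rather than a mutation.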

14.

E-cadherin intron 2 contains cis-regulatory elements essential for gene expression

Stemmler MP Hecht A Kemler R 《Development (Cambridge, England)》2005,132(5):965-976

15.

In silico identification and in vivo validation of a set of evolutionary conserved plant root-specific cis-regulatory elements

Aurélie Christ Ira Maegele Nati Ha Hong Ha Nguyen Martin D. Crespi Alexis Maizel 《Mechanisms of development》2013,130(1):70-81

16.

Robust accurate identification of peptides (RAId): deciphering MS2 data using a structured library search with de novo based statistics

Alves G Yu YK 《Bioinformatics (Oxford, England)》2005,21(19):3726-3732

Motivation: The key to MS-based proteomics is peptide sequencing. The major challenge in peptide sequencing, whether library search or de novo, is to better infer statistical significance and better attain noise reduction. Since the noise in a spectrum depends on experimental conditions, the instrument used and many other factors, it cannot be predicted even if the peptide sequence is known. The characteristics of the noise can only be uncovered once a spectrum is given. We wish to overcome such issues. Results: We designed RAId to identify peptides from their associated tandem mass spectrometry data. RAId performs a novel de novo sequencing followed by a search in a peptide library that we created. Through de novo sequencing, we establish the spectrum-specific background score statistics for the library search. When the database search fails to return significant hits, the top-ranking de novo sequences become potential candidates for new peptides that are not yet in the database. The use of spectrum-specific background statistics seems to enable RAId to perform well even when the spectral quality is marginal. Other important features of RAId include its potential in de novo sequencing alone and the ease of incorporating post-translational modifications. Availability: Programs implementing the methods described are available from the authors on request. Contact: yyu{at}ncbi.nlm.nih.gov Supplementary information: ftp://ftp.ncbi.nih.gov/pub/yyu/Proteomics/MSMS/RAId/MSMS_bioinfo_supp.pdf

17.

MotifCombinator: a web-based tool to search for combinations of cis-regulatory motifs

Mamoru Kato Tatsuhiko Tsunoda 《BMC bioinformatics》2007,8(1):100

18.

RAP: a new computer program for de novo identification of repeated sequences in whole genomes

Campagna D Romualdi C Vitulo N Del Favero M Lexa M Cannata N Valle G 《Bioinformatics (Oxford, England)》2005,21(5):582-588

MOTIVATION: DNA repeats are a common feature of most genomic sequences. Their de novo identification is still difficult despite being a crucial step in genomic analysis and oligonucleotides design. Several efficient algorithms based on word counting are available, but too short words decrease specificity while long words decrease sensitivity, particularly in degenerated repeats. RESULTS: The Repeat Analysis Program (RAP) is based on a new word-counting algorithm optimized for high resolution repeat identification using gapped words. Many different overlapping gapped words can be counted at the same genomic position, thus producing a better signal than the single ungapped word. This results in better specificity both in terms of low-frequency detection, being able to identify sequences repeated only once, and highly divergent detection, producing a generally high score in most intron sequences. AVAILABILITY: The program is freely available for non-profit organizations, upon request to the authors. CONTACT: giorgio.valle@unipd.it SUPPLEMENTARY INFORMATION: The program has been tested on the Caenorhabditis elegans genome using word lengths of 12, 14 and 16 bases. The full analysis has been implemented in the UCSC Genome Browser and is accessible at http://genome.cribi.unipd.it.
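Why gapped words help with degenerated repeats can be seen in a small counting example. The sketch below is not RAP's algorithm: the mask notation (`1` for a position that must match, `0` for a wildcard), the function names, and the toy sequence are assumptions chosen for illustration.

```python
from collections import Counter


def gapped_word(seq: str, start: int, mask: str) -> str:
    """Read the word at `start`, replacing positions where mask is '0' with '.'."""
    return "".join(
        seq[start + offset] if flag == "1" else "."
        for offset, flag in enumerate(mask)
    )


def count_gapped_words(seq: str, mask: str) -> Counter:
    """Count every gapped word of the mask's span along the sequence."""
    span = len(mask)
    return Counter(gapped_word(seq, i, mask) for i in range(len(seq) - span + 1))


if __name__ == "__main__":
    # Two copies of a repeat that differ at their third base: ACGTA and ACTTA.
    genome = "ACGTAGGGACTTA"
    ungapped = count_gapped_words(genome, "11111")
    gapped = count_gapped_words(genome, "11011")
    print(ungapped["ACGTA"], gapped["AC.TA"])
```

With the ungapped mask each copy is counted once under a different word, while the gapped mask maps both copies to `AC.TA`, so the divergent repeat is still detected.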

19.

Dissection of cis-regulatory elements of the Drosophila gene Serrate

André Bachmann E. Knust 《Development genes and evolution》1998,208(6):346-351

The Drosophila gene Serrate encodes a membrane spanning protein, which is expressed in a complex pattern during embryogenesis and larval stages. Loss of Serrate function leads to larval lethality, which is associated with several morphogenetic defects, including the failure to develop wings and halteres. Serrate has been suggested to act as a short-range signal during wing development. It is required for the induction of the organising centre at the dorsal/ventral compartment boundary, from which growth and patterning of the wing is controlled. In order to understand the regulatory network required to control the spatially and temporally dynamic expression of Serrate, we analysed its cis-regulatory elements by fusing various genomic fragments upstream of the reporter gene lacZ. Enhancer elements reflecting the expression pattern of endogenous Serrate in embryonic and postembryonic tissues could be confined to 26 kb of genomic DNA, including 9 kb of transcribed region. Expression in some embryonic tissues is under the control of multiple enhancers located in the 5’ region and in intron sequences. The data presented here provide the tools to unravel the genetic network which regulates Serrate during different developmental stages in diverse tissues. Received: 27 March 1998 / Accepted: 17 May 1998

20.

The influence of cis-regulatory elements on DNA methylation fidelity

Teng M Balch C Liu Y Li M Huang TH Wang Y Nephew KP Li L 《PloS one》2012,7(3):e32928
