Similar literature
1.

Background  

Understanding gene regulatory networks has become one of the central research problems in bioinformatics. More than thirty algorithms have been proposed over the past thirty years to identify DNA regulatory sites. However, the prediction accuracy of these algorithms remains quite low. Ensemble algorithms have emerged as an effective strategy in bioinformatics for improving prediction accuracy by exploiting the synergetic prediction capability of multiple algorithms.

2.
3.
We propose a new algorithm for identifying cis-regulatory modules in genomic sequences. The proposed algorithm, named RISO, uses a new data structure, called box-link, to store information about conserved regions that occur in a well-ordered and regularly spaced manner in the data set sequences. Conserved regions of this type, called structured motifs, are highly relevant to research on gene regulatory mechanisms because they can effectively represent promoter models. The complexity analysis shows a time and space gain over the best-known exact algorithms that is exponential in the spacings between binding sites. A full implementation of the algorithm was developed and made available online. Experimental results show that the algorithm is much faster than existing ones, sometimes by more than four orders of magnitude. Application of the method to biological data sets shows its ability to extract relevant consensus sequences.

4.
Detecting DNA motifs with functional relevance in real biological sequences is difficult because of a number of biological, statistical and computational issues, and because little is known about the structure of the patterns being searched for. Many algorithms are implemented as fully automated processes, which often depend on input parameters guessed by the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed of regions with different extents of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from the input sequences and clustered to produce motif core regions, which are then extended and scored to generate seeded motifs. Combining automated pattern discovery algorithms with different display tools for evaluating and selecting results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different real yeast and human datasets showed the methodology to be a promising solution for finding seeded motifs. The MOST web tool is freely available at http://telethon.bio.unipd.it/bioinfo/MOST.
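The first step described in the abstract above, extracting overrepresented exact patterns, can be illustrated with a simple k-mer count. This is only a minimal sketch and not the MOST implementation: the function names, the toy sequences, and the use of a raw count threshold in place of a statistical overrepresentation score are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every exact k-mer (overlapping windows) across all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresented_kmers(sequences, k, min_count):
    """Return (k-mer, count) pairs seen at least min_count times.

    A raw count threshold is an assumption made for this sketch; a real
    tool would compare counts against a background model.
    """
    counts = count_kmers(sequences, k)
    hits = [(kmer, n) for kmer, n in counts.items() if n >= min_count]
    # Most frequent first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Hypothetical toy input, not taken from the paper's datasets.
toy_sequences = ["ACGTACGTTT", "GGACGTAA", "TTTACGTC"]
top = overrepresented_kmers(toy_sequences, k=4, min_count=3)
print(top)
```

In a multi-step pipeline of the kind the abstract describes, the surviving patterns would then be clustered into core regions and extended; those later steps are not sketched here.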

5.
Cluster-Buster: Finding dense clusters of motifs in DNA sequences   Total citations: 13, self-citations: 2, citations by others: 13
Frith MC, Li MC, Weng Z. Nucleic Acids Research, 2003, 31(13): 3666-3668

6.
7.
8.
9.
10.
11.
12.
13.
Finding composite regulatory patterns in DNA sequences   Total citations: 1, self-citations: 0, citations by others: 1
Pattern discovery in unaligned DNA sequences is a fundamental problem in computational biology with important applications in finding regulatory signals. Current approaches to pattern discovery focus on monad patterns, which correspond to relatively short contiguous strings. However, many actual regulatory signals are composite patterns: groups of monad patterns that occur near each other. A difficulty in discovering composite patterns is that one or both of the component monad patterns in the group may be 'too weak'. Since traditional monad-based motif finding algorithms usually output one (or a few) high-scoring patterns, they often fail to find composite regulatory signals consisting of weak monad parts. In this paper, we present a MITRA (MIsmatch TRee Algorithm) approach for discovering composite signals. We demonstrate that MITRA performs well for both monad and composite patterns by presenting experiments on biological and synthetic data.

14.
15.
16.
There are no well-known properties in regulatory DNA analogous to those in coding sequences: the spatial location of regulatory elements is not regular, the consensus regulatory elements are often degenerate, and there are no understandable rules governing their evolution. This makes it difficult to recognize regulatory regions within a genome. We review developments in the statistical characterization of regulatory regions and methods for their recognition in eukaryotic genomes.

17.
To study the properties of DNA sequences, we transformed the sequences of bases into sequences of twist angles along the chain of the DNA double helix by using the Dickerson sum function. The Fourier transform and the auto-correlation function of the twist-angle sequences were used to study the periodicity and randomness of the original DNA sequences. Based on the correlation coefficient, a distance between two DNA fragments was defined and used to compare some realistic DNA sequences. It is hoped that the techniques developed here can be used to analyze more realistic DNA sequences.
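The auto-correlation analysis mentioned above can be sketched for any numeric series. The example below assumes the twist angles have already been computed; it does not reproduce the Dickerson sum function, and the angle values are hypothetical toy numbers chosen only to show how a period appears as a high auto-correlation at the matching lag.

```python
def autocorrelation(values, lag):
    """Normalized auto-correlation of a numeric series at a given lag.

    Returns 1.0 at lag 0; values near 1 at lag p suggest period p.
    """
    n = len(values)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(values)")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("series is constant; auto-correlation is undefined")
    covariance = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return covariance / variance


# Hypothetical twist angles (degrees) with an artificial period of 4.
angles = [34.0, 36.0, 38.0, 36.0] * 5

for lag in range(5):
    print(lag, round(autocorrelation(angles, lag), 3))
```

For this toy series the auto-correlation is positive at lag 4 and negative at lag 2, which is the signature of a period-4 signal.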

18.
19.
We describe an algorithm (IRSA) for the identification of common regulatory signals in samples of unaligned DNA sequences. The algorithm was tested on randomly generated sequences of fixed length containing an implanted signal of length 15 with 4 mutations, and on natural upstream regions of bacterial genes regulated by PurR, ArgR and CRP. It was then applied to upstream regions of orthologous genes from Escherichia coli and related genomes. Some new palindromic binding signals and direct-repeat signals were identified. Finally, we present a parallel version suitable for computers supporting the MPI protocol. This implementation is not strictly bounded by the number of available processors. The computation speed depends linearly on the number of processors.
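The synthetic test described above (a length-15 signal implanted with 4 mutations into random sequences) can be illustrated as follows. This is a sketch of how such benchmark data might be generated, not the authors' procedure: the sequence length of 200, the uniform base composition, the fixed random seed, and all function names are assumptions.

```python
import random


def mutate(motif, n_mutations, rng):
    """Return a copy of motif with exactly n_mutations positions changed."""
    positions = rng.sample(range(len(motif)), n_mutations)
    letters = list(motif)
    for pos in positions:
        # Pick a base different from the current one so the change is real.
        letters[pos] = rng.choice([b for b in "ACGT" if b != letters[pos]])
    return "".join(letters)


def implant(motif, seq_length, n_mutations, rng):
    """Build a random sequence and overwrite one window with a mutated motif."""
    background = "".join(rng.choice("ACGT") for _ in range(seq_length))
    instance = mutate(motif, n_mutations, rng)
    start = rng.randrange(seq_length - len(motif) + 1)
    sequence = background[:start] + instance + background[start + len(motif):]
    return sequence, start, instance


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)  # fixed seed so the example is reproducible
motif = "".join(rng.choice("ACGT") for _ in range(15))
sequence, start, instance = implant(motif, seq_length=200, n_mutations=4, rng=rng)
print(motif, instance, start, hamming(motif, instance))
```

A motif finder would be given only a set of such sequences and judged on whether it recovers the implanted positions.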

20.
MOTIVATION: The advent of genomics yields thousands of reading frames in search of function. Identification of conserved functional motifs in protein sequences can be helpful for function prediction. RESULTS: A database and a classification of reported DNA-binding protein motifs have been designed. A program ('TranScout') has been developed for the detection and evaluation of conserved motifs in prokaryotic and eukaryotic sequences of proteins with a gene regulatory function. The efficiency of the program is shown in a benchmark against a database obtained from SWISS-PROT, excluding the protein sequences used to train the program. All motifs were detected with a mean sensitivity of 0.98 and a mean specificity of 0.92. AVAILABILITY: The program is freely available for use on the internet at http://luz.uab.es/transcout/. The user can find additional information at this site.
