
1.

CMfinder--a covariance model based RNA motif finding algorithm (total citations: 5; self-citations: 0; citations by others: 5)

Yao Z Weinberg Z Ruzzo WL 《Bioinformatics (Oxford, England)》2006,22(4):445-452

2.

Li Dongdong, Wang Zhengzhi, Du Yaohua, Yan Chun 《Acta Biophysica Sinica (生物物理学报)》2005,21(2):121-129

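Pattern discovery is an important research direction in bioinformatics, but most current algorithms cannot guarantee that the optimal pattern is found. This paper derives a criterion for the similarity relationship among three sequence segments, uses it as a pruning rule, and proposes and implements a depth-first exhaustive search algorithm, the criterion search algorithm (CRISA). Theoretical analysis shows that, for the vast majority of pattern discovery problems, CRISA has polynomial time complexity and linear space complexity. Tests on simulated and real biological sequence data also show that CRISA identifies all patterns in the sequences quickly and completely, achieves a better overall evaluation than other algorithms, and can be applied to practical pattern discovery problems.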

3.

Discriminative motif finding for predicting protein subcellular localization

Lin TH Murphy RF Bar-Joseph Z 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2011,8(2):441-451

Many methods have been described to predict the subcellular location of proteins from sequence information. However, most of these methods either rely on global sequence properties or use a set of known protein targeting motifs to predict protein localization. Here, we develop and test a novel method that identifies potential targeting motifs using a discriminative approach based on hidden Markov models (discriminative HMMs). These models search for motifs that are present in a compartment but absent in other, nearby, compartments by utilizing a hierarchical structure that mimics the protein sorting mechanism. We show that both discriminative motif finding and the hierarchical structure improve localization prediction on a benchmark data set of yeast proteins. The motifs identified can be mapped to known targeting motifs and they are more conserved than the average protein sequence. Using our motif-based predictions, we can identify potential annotation errors in public databases for the location of some of the proteins. A software implementation and the data set described in this paper are available from http://murphylab.web.cmu.edu/software/2009_TCBB_motif/.

4.

DECOD: fast and accurate discriminative DNA motif finding

Huggins P Zhong S Shiff I Beckerman R Laptenko O Prives C Schulz MH Simon I Bar-Joseph Z 《Bioinformatics (Oxford, England)》2011,27(17):2361-2367

5.

A combinatorial optimization approach for diverse motif finding applications

Elena Zaslavsky Mona Singh 《Algorithms for molecular biology : AMB》2006,1(1):13-13

Background

Discovering approximately repeated patterns, or motifs, in biological sequences is an important and widely-studied problem in computational molecular biology. Most frequently, motif finding applications arise when identifying shared regulatory signals within DNA sequences or shared functional and structural elements within protein sequences. Due to the diversity of contexts in which motif finding is applied, several variations of the problem are commonly studied.

6.

Integrating quantitative information from ChIP-chip experiments into motif finding

Shim H Keles S 《Biostatistics (Oxford, England)》2008,9(1):51-65

7.

Identification of SNP interactions using logic regression

Schwender H Ickstadt K 《Biostatistics (Oxford, England)》2008,9(1):187-198

Interactions of single nucleotide polymorphisms (SNPs) are assumed to be responsible for complex diseases such as sporadic breast cancer. Important goals of studies concerned with such genetic data are thus to identify combinations of SNPs that lead to a higher risk of developing a disease and to measure the importance of these interactions. There are many approaches based on classification methods such as CART and random forests that allow measuring the importance of single variables. But none of these methods enable the importance of combinations of variables to be quantified directly. In this paper, we show how logic regression can be employed to identify SNP interactions explanatory for the disease status in a case-control study and propose 2 measures for quantifying the importance of these interactions for classification. These approaches are then applied on the one hand to simulated data sets and on the other hand to the SNP data of the GENICA study, a study dedicated to the identification of genetic and gene-environment interactions associated with sporadic breast cancer.

8.

Regulatory motif discovery using a population clustering evolutionary algorithm (total citations: 2; self-citations: 0; citations by others: 2)

Lones MA Tyrrell AM 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2007,4(3):403-414

This paper describes a novel evolutionary algorithm for regulatory motif discovery in DNA promoter sequences. The algorithm uses data clustering to logically distribute the evolving population across the search space. Mating then takes place within local regions of the population, promoting overall solution diversity and encouraging discovery of multiple solutions. Experiments using synthetic data sets have demonstrated the algorithm's capacity to find position frequency matrix models of known regulatory motifs in relatively long promoter sequences. These experiments have also shown the algorithm's ability to maintain diversity during search and discover multiple motifs within a single population. The utility of the algorithm for discovering motifs in real biological data is demonstrated by its ability to find meaningful motifs within muscle-specific regulatory sequences.
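
The abstract above refers to position frequency matrix models of motifs. As a minimal illustration of that data structure only (not of the evolutionary algorithm itself), the sketch below counts bases per position across a set of aligned sites. The function name and the example sites are hypothetical and do not come from the paper.

```python
from typing import Dict, List


def position_frequency_matrix(sites: List[str], alphabet: str = "ACGT") -> Dict[str, List[int]]:
    """Count how often each base occurs at each position of aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    # One row of per-position counts for every symbol in the alphabet.
    pfm = {base: [0] * length for base in alphabet}
    for site in sites:
        for pos, base in enumerate(site.upper()):
            pfm[base][pos] += 1  # raises KeyError for symbols outside the alphabet
    return pfm


# Hypothetical aligned binding sites, for illustration only.
example_pfm = position_frequency_matrix(["ACG", "ACT", "AGG"])
```

Dividing each column by the number of sites would turn these counts into per-position base frequencies.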

9.

Subtle motifs: defining the limits of motif finding algorithms (total citations: 4; self-citations: 0; citations by others: 4)

Keich U Pevzner PA 《Bioinformatics (Oxford, England)》2002,18(10):1382-1390

MOTIVATION: What constitutes a subtle motif? Intuitively, it is a motif that is almost indistinguishable, in the statistical sense, from random motifs. This question has important practical consequences: consider, for example, a biologist who is generating a sample of upstream regulatory sequences with the goal of finding a regulatory pattern that is shared by these sequences. If the sequences are too short then one risks losing some of the regulatory patterns that are located further upstream. Conversely, if the sequences are too long, the motif becomes too subtle and one is then likely to encounter random motifs which are at least as significant statistically as the regulatory pattern itself. In practical terms one would like to recognize the sequence length threshold, or the twilight zone, beyond which the motifs are in some sense too subtle. RESULTS: The paper defines the motif twilight zone where every motif finding algorithm would be exposed to random motifs which are as significant as the one which is sought. We also propose an objective tool for evaluating the performance of subtle motif finding algorithms. Finally we apply these tools to evaluate the success of our MULTIPROFILER algorithm in detecting subtle motifs.

10.

RSIR: regularized sliced inverse regression for motif discovery (total citations: 3; self-citations: 0; citations by others: 3)

Zhong W Zeng P Ma P Liu JS Zhu Y 《Bioinformatics (Oxford, England)》2005,21(22):4169-4175

11.

A deterministic motif finding algorithm with application to the human genome

Hon LS Jain AN 《Bioinformatics (Oxford, England)》2006,22(9):1047-1054

12.

A study on the application of topic models to motif finding algorithms

Josep Basha Gutierrez Kenta Nakai 《BMC bioinformatics》2016,17(19):502

13.

Using RNA secondary structures to guide sequence motif finding towards single-stranded regions (total citations: 2; self-citations: 0; citations by others: 2)

Hiller M Pudimat R Busch A Backofen R 《Nucleic acids research》2006,34(17):e117

RNA binding proteins recognize RNA targets in a sequence specific manner. Apart from the sequence, the secondary structure context of the binding site also affects the binding affinity. Binding sites are often located in single-stranded RNA regions and it was shown that the sequestration of a binding motif in a double-strand abolishes protein binding. Thus, it is desirable to include knowledge about RNA secondary structures when searching for the binding motif of a protein. We present the approach MEMERIS for searching for sequence motifs in a set of RNA sequences and simultaneously integrating information about secondary structures. To abstract from specific structural elements, we precompute position-specific values measuring the single-strandedness of all substrings of an RNA sequence. These values are used as prior knowledge about the motif starts to guide the motif search. Extensive tests with artificial and biological data demonstrate that MEMERIS is able to identify motifs in single-stranded regions even if a stronger motif located in double-strand parts exists. The discovered motif occurrences in biological datasets mostly coincide with known protein-binding sites. This algorithm can be used for finding the binding motif of single-stranded RNA-binding proteins in SELEX or other biological sequence data.

14.

Soon-Heng Tan Willy Hugo Wing-Kin Sung See-Kiong Ng 《BMC bioinformatics》2006,7(1):502-16

Background

An important class of interaction switches for biological circuits and disease pathways are short binding motifs. However, the biological experiments to find these binding motifs are often laborious and expensive. With the availability of protein interaction data, novel binding motifs can be discovered computationally: by applying standard motif extracting algorithms on protein sequence sets each interacting with either a common protein or a protein group with similar properties. The underlying assumption is that proteins with common interacting partners will share some common binding motifs. Although novel binding motifs have been discovered with such an approach, it is not applicable if a protein interacts with very few other proteins or when prior knowledge of protein groups is unavailable or erroneous. Experimental noise in input interaction data can further degrade the already poor performance of such approaches.

15.

ACoM: A classification method for elementary flux modes based on motif finding

Pérès S Vallée F Beurton-Aimar M Mazat JP 《Bio Systems》2011,103(3):410-419

Elementary flux mode analysis is a powerful tool for the theoretical study of metabolic networks. However, when the networks are complex, the determination of elementary flux modes leads to a combinatorial explosion of their number, which prevents drawing simple conclusions from their analysis. To deal with this problem we have developed a method based on the Agglomeration of Common Motifs (ACoM) for classifying elementary flux modes. We applied this algorithm to describe the decomposition into elementary flux modes of the central carbon metabolism in Bacillus subtilis and of the yeast mitochondrial energy metabolism. ACoM helps to give biological meaning to the different elementary flux modes and to the relatedness between reactions. ACoM, which can be viewed as a bi-clustering method, can be of general use for sets of vectors with values 0, +1 or −1.

16.

Dose finding using the biased coin up-and-down design and isotonic regression (total citations: 4; self-citations: 0; citations by others: 4)

Stylianou M Flournoy N 《Biometrics》2002,58(1):171-177

We are interested in finding a dose that has a prespecified toxicity rate in the target population. In this article, we investigate five estimators of the target dose to be used with the up-and-down biased coin design (BCD) introduced by Durham and Flournoy (1994, Statistical Decision Theory and Related Topics). These estimators are derived using maximum likelihood, weighted least squares, sample averages, and isotonic regression. A linearly interpolated isotonic regression estimate is shown to be simple to derive and to perform as well as or better than the other target dose estimators in terms of mean square error and average number of subjects needed for convergence in most scenarios studied.
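
To make the idea of a linearly interpolated isotonic regression estimate concrete, here is a minimal sketch using the standard pool-adjacent-violators algorithm followed by linear interpolation at the target rate. It is an illustration under stated assumptions, not the authors' estimator as published: the function names, the positive per-dose weights (for example, subjects treated at each dose), the fallback rule, and the example numbers are all hypothetical.

```python
from typing import List, Sequence


def pava(rates: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Weighted pool-adjacent-violators: nondecreasing fit to observed rates."""
    # Each block stores [weighted mean, total weight, number of pooled doses].
    blocks: List[List[float]] = []
    for rate, weight in zip(rates, weights):
        blocks.append([rate, weight, 1])
        # Pool backwards while adjacent blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2  # assumes strictly positive weights
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted: List[float] = []
    for mean, _, count in blocks:
        fitted.extend([mean] * int(count))
    return fitted


def interpolated_target_dose(doses: Sequence[float], fitted: Sequence[float], target: float) -> float:
    """Dose at which the linearly interpolated isotonic fit equals the target rate."""
    for i in range(len(doses) - 1):
        lo, hi = fitted[i], fitted[i + 1]
        if lo <= target <= hi and hi > lo:
            frac = (target - lo) / (hi - lo)
            return doses[i] + frac * (doses[i + 1] - doses[i])
    # Assumed fallback: target outside the fitted range, use the closest dose.
    return min(zip(doses, fitted), key=lambda pair: abs(pair[1] - target))[0]


# Hypothetical observed toxicity rates at four dose levels, four subjects each.
fitted_rates = pava([0.0, 0.5, 0.25, 0.5], [4, 4, 4, 4])
estimate = interpolated_target_dose([1, 2, 3, 4], fitted_rates, target=0.3)
```

In this example the second and third dose levels violate monotonicity and are pooled to a common rate before interpolation.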

17.

Regulatory logic of neuronal diversity: Neuronal selector genes and selector motifs

Oliver Hobert 《Developmental biology》2008,319(2):489

18.

A software program combining sequence motif searches with keywords for finding repeats containing DNA sequences (total citations: 3; self-citations: 0; citations by others: 3)

Bilgen M Karaca M Onus AN Ince AG 《Bioinformatics (Oxford, England)》2004,20(18):3379-3386

MOTIVATION: One of the most interesting features of genomes (both coding and non-coding regions) is the presence of relatively short tandemly repeated DNA sequences known as tandem repeats (TRs). We developed a new PC-based stand-alone software analysis program, combining sequence motif searches with keywords such as organs, tissues, cell lines or development stages for finding exact, inexact and compound TRs. Tandem Repeats Analyzer 1.5 (TRA) has several advanced repeat search parameters/options over other repeat finder programs: it not only accepts GenBank, FASTA and expressed sequence tag (EST) sequence files but also analyzes multiple files containing multiple sequences. Advanced user-defined parameters/options let researchers apply different search criteria to motifs of varying lengths simultaneously. The outputs show statistical results to be evaluated by the user. The discovery of TRs in ESTs could be useful both for gene mapping and association studies and for discovering TRs located in coding regions of important genes that are expressed under various conditions of environment, stress, organ, tissue and development stage. RESULTS: In this paper, we demonstrated applications of TRA using 175 899 EST sequences for three Arabidopsis spp. downloaded from GenBank. The EST-SSRs/ESTs ratios were found to be 43.1%, 15.3% and 2.34% in A.lyrata, A.thaliana and A.halleri, respectively. Analysis revealed that organs, tissues and development stages possessed different amounts of repeats and repeat compositions. This indicated that the distribution of TRs among the tissues or organs may not be random, differing from the untranscribed repeats found in genomes. AVAILABILITY: The program can be obtained free by anonymous FTP from ftp.akdeniz.edu.tr/Araclar/TRA.
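
For readers unfamiliar with exact tandem repeats, the sketch below finds them with a regular-expression backreference. It is a simplified illustration and makes no claim about how TRA works internally; the function name, the `min_copies` parameter, and the example sequence are hypothetical, and inexact or compound repeats are not handled.

```python
import re
from typing import List, Tuple


def find_exact_tandem_repeats(seq: str, motif_len: int, min_copies: int = 3) -> List[Tuple[int, str, int]]:
    """Return (start, motif, copies) for exact tandem repeats of one motif length."""
    if motif_len < 1 or min_copies < 1:
        raise ValueError("motif_len and min_copies must be positive")
    # One motif of motif_len bases, then at least (min_copies - 1) further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        copies = len(match.group(0)) // motif_len
        hits.append((match.start(), match.group(1), copies))
    return hits


# Hypothetical sequence containing four tandem copies of "AT" starting at index 2.
repeats = find_exact_tandem_repeats("GGATATATATCC", motif_len=2, min_copies=3)
```

Note that a run of a single base also matches longer motif lengths (for example, "AAAAAA" as three copies of "AA"), so a real tool would need to filter such cases.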

19.

Regulatory role for a conserved motif adjacent to the homeodomain of Hox10 proteins

Guerreiro I Casaca A Nunes A Monteiro S Nóvoa A Ferreira RB Bom J Mallo M 《Development (Cambridge, England)》2012,139(15):2703-2710

Development of the vertebrate axial skeleton requires the concerted activity of several Hox genes. Among them, Hox genes belonging to the paralog group 10 are essential for the formation of the lumbar region of the vertebral column, owing to their capacity to block rib formation. In this work, we explored the basis for the rib-repressing activity of Hox10 proteins. Because genetic experiments in mice demonstrated that Hox10 proteins are strongly redundant in this function, we first searched for common motifs among the group members. We identified the presence of two small sequences flanking the homeodomain that are phylogenetically conserved among Hox10 proteins and that seem to be specific for this group. We show here that one of these motifs is required but not sufficient for the rib-repressing activity of Hox10 proteins. This motif includes two potential phosphorylation sites, which are essential for protein activity as their mutation to alanines resulted in a total loss of rib-repressing properties. Our data indicate that this motif has a significant regulatory function, modulating interactions with more N-terminal parts of the Hox protein, eventually triggering the rib-repressing program. In addition, this motif might also regulate protein activity by alteration of the protein's DNA-binding affinity through changes in the phosphorylation state of two conserved tyrosine residues within the homeodomain.

20.

On counting position weight matrix matches in a sequence, with application to discriminative motif finding

Sinha S 《Bioinformatics (Oxford, England)》2006,22(14):e454-e463
