
1.

Clustering RNA structural motifs in ribosomal RNAs using secondary structural alignment

Zhong C Zhang S 《Nucleic acids research》2012,40(3):1307-1317

RNA structural motifs are the building blocks of the complex RNA architecture. Identification of non-coding RNA structural motifs is a critical step towards understanding their structures and functionalities. In this article, we present a clustering approach for de novo RNA structural motif identification. We applied our approach to a data set containing 5S, 16S and 23S rRNAs and rediscovered many known motifs, including GNRA tetraloop, kink-turn, C-loop, sarcin-ricin, reverse kink-turn, hook-turn, E-loop and tandem-sheared motifs, with higher accuracy than the state-of-the-art clustering method. We also identified a number of potential novel instances of GNRA tetraloop, kink-turn, sarcin-ricin and tandem-sheared motifs. More importantly, several novel structural motif families have been revealed by our clustering analysis. We identified a highly asymmetric bulge loop motif that resembles a rope sling. We also found an internal loop motif that can significantly increase the twist of the helix. Finally, we discovered a subfamily of the hexaloop motif, which has significantly different geometry compared with the currently known hexaloop motif. Our discoveries presented in this article have largely increased current knowledge of RNA structural motifs.

2.

Nucleotide variation of regulatory motifs may lead to distinct expression patterns

Segal L Lapidot M Solan Z Ruppin E Pilpel Y Horn D 《Bioinformatics (Oxford, England)》2007,23(13):i440-i449


3.

Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae (total citations: 12; self-citations: 0; citations by others: 12)

Hughes JD Estep PW Tavazoie S Church GM 《Journal of molecular biology》2000,296(5):1205-1214


4.

Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space

Rahul Karnik Michael A. Beer 《PloS one》2015,10(10)

The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs, in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than, motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
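To make the idea of classifying sequences with a full PWM and a score threshold concrete, here is a minimal sketch. It is not MotifSpec: the function names, the uniform background of 0.25, the pseudocount, and the toy count matrix are all assumptions made for illustration, and the threshold is passed in by hand rather than learned.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log2-odds PWM (uniform background assumed)."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def best_score(pwm, sequence):
    """Return the highest PWM score over all windows of the sequence."""
    width = len(pwm)
    scores = [
        sum(pwm[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    ]
    return max(scores) if scores else float("-inf")

def classify(pwm, sequence, threshold):
    """Call a sequence 'bound' when its best window reaches the threshold."""
    return best_score(pwm, sequence) >= threshold

# Hypothetical counts for a 3-bp motif with consensus TGA (illustrative only).
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 8, "C": 0, "G": 0, "T": 0},
]
toy_pwm = log_odds_pwm(toy_counts)
```

In a discriminative setting, the threshold would be chosen to separate bound from unbound sequences as well as possible, instead of being fixed in advance as it is here.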

5.

A computational pipeline to generate MHC binding motifs

Wang P Sidney J Sette A Peters B 《Immunome research》2011,7(2):3

BACKGROUND: Major histocompatibility complex (MHC) class I molecules play key roles in host immunity against pathogens by presenting peptide antigens to CD8+ T-cells. Many variants of MHC molecules exist, and each has a unique preference for certain peptide ligands. Both experimental approaches and computational algorithms have been utilized to analyze these peptide MHC binding characteristics. Traditionally, MHC binding specificities have been described in terms of binding motifs. Such motifs classify certain peptide positions as primary and secondary anchors according to their impact on binding, and they list the preferred and deleterious residues at these positions. This provides a concise and easily communicable summary of MHC binding specificities. However, so far there has been no algorithm to generate such binding motifs in an automated and uniform fashion. In this paper, we present a computational pipeline that takes peptide MHC binding data as input and produces a concise MHC binding motif. We tested our pipeline on a set of 18 MHC class I molecules and showed that the derived motifs are consistent with historic expert assignments. We have implemented a pipeline that formally codifies rules to generate MHC binding motifs. The pipeline has been incorporated into the immune epitope database and analysis resource (IEDB) and motifs can be visualized while browsing MHC alleles in the IEDB.

6.

A graph-based motif detection algorithm models complex nucleotide dependencies in transcription factor binding sites

Naughton BT Fratkin E Batzoglou S Brutlag DL 《Nucleic acids research》2006,34(20):5730-5739


7.

SOMEA: self-organizing map based extraction algorithm for DNA motif identification with heterogeneous model

Lee NK Wang D 《BMC bioinformatics》2011,12(Z1):S16


8.

Identification of co-regulated genes through Bayesian clustering of predicted regulatory binding sites

Qin ZS McCue LA Thompson W Mayerhofer L Lawrence CE Liu JS 《Nature biotechnology》2003,21(4):435-439


9.

NoFold: RNA structure clustering without folding or alignment

Sarah A. Middleton Junhyong Kim 《RNA (New York, N.Y.)》2014,20(11):1671-1683


10.

A novel Bayesian DNA motif comparison method for clustering and retrieval

Habib N Kaplan T Margalit H Friedman N 《PLoS computational biology》2008,4(2):e1000010


11.

iGibbs: improving Gibbs motif sampler for proteins by sequence clustering and iterative pattern sampling

Kim S Wang Z Dalkilic M 《Proteins》2007,66(3):671-681

The motif prediction problem is to predict short, conserved subsequences that are part of a family of sequences, and it is a very important biological problem. Gibbs is one of the first successful motif algorithms; it runs very fast compared with other algorithms, and its search behavior is based on the well-studied Gibbs random sampling. However, motif prediction is a very difficult problem and Gibbs may not predict true motifs in some cases. Thus, the authors explored the possibility of improving the prediction accuracy of Gibbs while retaining its fast runtime performance. In this paper, the authors considered Gibbs only for proteins, not for DNA binding sites. The authors have developed iGibbs, an integrated motif search framework for proteins that employs two previous techniques of their own: one for guiding motif search by clustering sequences and another by pattern refinement. These two techniques are combined into a new double clustering approach to guiding motif search. The unique feature of their framework is that users do not have to specify the number of motifs to be predicted when motifs occur in different subsets of the input sequences, since it automatically clusters input sequences into clusters and predicts motifs from the clusters. Tests on the PROSITE database show that their framework improved the prediction accuracy of Gibbs significantly. Compared with more exhaustive search methods like MEME, iGibbs predicted motifs more accurately and ran one order of magnitude faster.
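For readers unfamiliar with the Gibbs random sampling that the abstract above builds on, the following is a minimal sketch of a plain site sampler. It is not iGibbs and includes neither the sequence clustering nor the pattern refinement described above. The function name, the parameters, the uniform-background simplification, and the toy sequences are assumptions made for illustration.

```python
import random

def gibbs_motif_sampler(sequences, width, iterations=200, pseudocount=1.0, seed=0):
    """Sample one motif start position per sequence (one site per sequence assumed)."""
    rng = random.Random(seed)
    alphabet = sorted(set("".join(sequences)))
    # Start from a random site in every sequence.
    positions = [rng.randrange(len(s) - width + 1) for s in sequences]
    for _ in range(iterations):
        for held_out, seq in enumerate(sequences):
            # Build a count profile from the current sites of all other sequences.
            counts = [{a: pseudocount for a in alphabet} for _ in range(width)]
            for idx, (other, pos) in enumerate(zip(sequences, positions)):
                if idx == held_out:
                    continue
                for offset in range(width):
                    counts[offset][other[pos + offset]] += 1
            column_total = len(sequences) - 1 + pseudocount * len(alphabet)
            # Weight each candidate site in the held-out sequence by its
            # probability under the profile (uniform background assumed).
            weights = []
            for start in range(len(seq) - width + 1):
                weight = 1.0
                for offset in range(width):
                    weight *= counts[offset][seq[start + offset]] / column_total
                weights.append(weight)
            # Resample the held-out site in proportion to those weights.
            positions[held_out] = rng.choices(range(len(weights)), weights=weights)[0]
    return positions

# Hypothetical toy protein fragments sharing the planted segment "WCHW".
toy_sequences = ["AKLWCHWGE", "WCHWPPLAK", "GGSTWCHWA", "LLWCHWKDE"]
toy_positions = gibbs_motif_sampler(toy_sequences, width=4)
```

Because the sampler is stochastic, it can settle on a shifted or spurious site. This is the kind of failure the abstract refers to when it notes that Gibbs may not predict true motifs in some cases.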

12.

Enrichment and aggregation of topological motifs are independent organizational principles of integrated interaction networks

Michoel T Joshi A Nachtergaele B Van de Peer Y 《Molecular bioSystems》2011,7(10):2769-2778


13.

MuMoD: a Bayesian approach to detect multiple modes of protein–DNA binding from genome-wide ChIP data

Leelavati Narlikar 《Nucleic acids research》2013,41(1):21-32

High-throughput chromatin immunoprecipitation has become the method of choice for identifying genomic regions bound by a protein. Such regions are then investigated for overrepresented sequence motifs, the assumption being that they must correspond to the binding specificity of the profiled protein. However, this approach often fails: many bound regions do not contain the ‘expected’ motif. This is because binding DNA directly at its recognition site is not the only way the protein can cause the region to immunoprecipitate. Its binding specificity can change through association with different co-factors, it can bind DNA indirectly, through intermediaries, or even enforce its function through long-range chromosomal interactions. Conventional motif discovery methods, though largely capable of identifying overrepresented motifs from bound regions, lack the ability to characterize such diverse modes of protein–DNA binding and binding specificities. We present a novel Bayesian method that identifies distinct protein–DNA binding mechanisms without relying on any motif database. The method successfully identifies co-factors of proteins that do not bind DNA directly, such as mediator and p300. It also predicts literature-supported enhancer–promoter interactions. Even for well-studied direct-binding proteins, this method provides compelling evidence for previously uncharacterized dependencies within positions of binding sites, long-range chromosomal interactions and dimerization.

14.

DotAligner: identification and clustering of RNA structure motifs

Martin A. Smith Stefan E. Seemann Xiu Cheng Quek John S. Mattick 《Genome biology》2017,18(1):244


15.

Identification of a novel calcium binding motif based on the detection of sequence insertions in the animal peroxidase domain of bacterial proteins

S Santamaría-Hernando T Krell MI Ramos-González 《PloS one》2012,7(7):e40698

Proteins of the animal heme peroxidase (ANP) superfamily differ greatly in size since they have either one or two catalytic domains that match profile PS50292. The orf PP_2561 of Pseudomonas putida KT2440 that we have called PepA encodes a two-domain ANP. The alignment of these domains with those of PepA homologues revealed a variable number of insertions with the consensus G-x-D-G-x-x-[GN]-[TN]-x-D-D. This motif has also been detected in the structure of pseudopilin (pdb 3G20), where it was found to be involved in Ca(2+) coordination although a sequence analysis did not reveal the presence of any known calcium binding motifs in this protein. Isothermal titration calorimetry revealed that a peptide containing this consensus motif specifically bound calcium ions with affinities ranging from 33 to 79 μM depending on the pH. Microcalorimetric titrations of the purified N-terminal ANP-like domain of PepA revealed Ca(2+) binding with a K(D) of 12 μM and a stoichiometry of 1.25 calcium ions per protein monomer. This domain exhibited peroxidase activity after its reconstitution with heme. These data led to the definition of a novel calcium binding motif that we have termed PERCAL and which was abundantly present in animal peroxidase-like domains of bacterial proteins. Bacterial heme peroxidases thus possess two different types of calcium binding motifs, namely PERCAL and the related hemolysin type calcium binding motif, with the latter being located outside the catalytic domains and in their C-terminal end. A phylogenetic tree of ANP-like catalytic domains of bacterial proteins with PERCAL motifs, including single domain peroxidases, was divided into two major clusters, representing domains with and without PERCAL motif containing insertions. We have verified that the recently reported classification of bacterial heme peroxidases in two families (cd09819 and cd09821) is unrelated to these insertions. Sequences matching PERCAL were detected in all kingdoms of life.
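The consensus G-x-D-G-x-x-[GN]-[TN]-x-D-D quoted above is written in a PROSITE-like notation, where `x` stands for any residue and brackets list the allowed alternatives. As a rough sketch of how such a pattern could be scanned against a protein sequence, it can be translated into a regular expression. The helper names and the test sequence below are hypothetical, and the converter handles only the three element types that occur in this pattern.

```python
import re

PERCAL = "G-x-D-G-x-x-[GN]-[TN]-x-D-D"  # consensus as quoted in the abstract

def prosite_to_regex(pattern):
    """Translate a simple PROSITE-like pattern (residues, x, [..]) into a regex."""
    parts = []
    for element in pattern.split("-"):
        # 'x' means any residue; literals and bracket classes pass through unchanged.
        parts.append("." if element == "x" else element)
    return "".join(parts)

def find_matches(pattern, protein):
    """Return (start, matched_segment) for every match, overlapping ones included."""
    # The lookahead lets the scan report matches that overlap each other.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(protein)]

# Synthetic sequence constructed to contain one match (not a real protein).
toy_protein = "MKGADGSSGTADDLL"
toy_hits = find_matches(PERCAL, toy_protein)
```

A match found this way only indicates sequence similarity to the consensus. Calcium binding itself was established experimentally in the study, by calorimetry.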

16.

Learning “graph-mer” Motifs that Predict Gene Expression Trajectories in Development

Xuejing Li Casandra Panea Chris H. Wiggins Valerie Reinke Christina Leslie 《PLoS computational biology》2010,6(4)


17.

Combining phylogenetic motif discovery and motif clustering to predict co-regulated genes (total citations: 2; self-citations: 0; citations by others: 2)

Jensen ST Shen L Liu JS 《Bioinformatics (Oxford, England)》2005,21(20):3832-3839


18.

Genome wide identification of DNA binding motifs of NodD-factor in Sinorhizobium meliloti and Mesorhizobium loti

Khan F Agarwal S Mishra BN 《Journal of bioinformatics and computational biology》2005,3(4):773-801
