1.

Finding the most significant common sequence and structure motifs in a set of RNA sequences. Total citations: 12; self-citations: 4; citations by others: 12

J Gorodkin L J Heyer G D Stormo 《Nucleic acids research》1997,25(18):3724-3732

We present a computational scheme to locally align a collection of RNA sequences using sequence and structure constraints. In addition, the method searches for the resulting alignments with the most significant common motifs, among all possible collections. The first part utilizes a simplified version of the Sankoff algorithm for simultaneous folding and alignment of RNA sequences, but maintains tractability by constructing multi-sequence alignments from pairwise comparisons. The algorithm finds the multiple alignments using a greedy approach and has similarities to both CLUSTAL and CONSENSUS, but the core algorithm assures that the pairwise alignments are optimized for both sequence and structure conservation. The choice of scoring system and the method of progressively constructing the final solution are important considerations that are discussed. Example solutions, and comparisons with other approaches, are provided. The solutions include finding consensus structures identical to published ones.

2.

GPRM: A genetic programming approach to finding common RNA secondary structure elements

Hu YJ 《Nucleic acids research》2003,31(13):3446-3449

RNA molecules play an important role in many biological activities. Knowing their secondary structure can help us better understand their ability to function. The methods for RNA structure determination have traditionally been implemented through biochemical, biophysical and phylogenetic analyses. With the advance of computer technology, an increasing number of computational approaches have recently been developed. They have different goals and apply various algorithms. For example, some focus on secondary structure prediction for a single sequence; some aim at finding a global alignment of multiple sequences. Some predict the structure based on free energy minimization; some make comparative sequence analyses to determine the structure. In this paper, we describe how to correctly use GPRM, a genetic programming approach to finding common secondary structure elements in a set of unaligned coregulated or homologous RNA sequences. GPRM can be accessed at http://bioinfo.cis.nctu.edu.tw/service/gprm/.

3.

RSEARCH: Finding homologs of single structured RNA sequences

Robert J Klein Sean R Eddy 《BMC bioinformatics》2003,4(1):44

Background

For many RNA molecules, secondary structure rather than primary sequence is the evolutionarily conserved feature. No programs have yet been published that allow searching a sequence database for homologs of a single RNA molecule on the basis of secondary structure.

Results

We have developed a program, RSEARCH, that takes a single RNA sequence with its secondary structure and utilizes a local alignment algorithm to search a database for homologous RNAs. For this purpose, we have developed a series of base pair and single nucleotide substitution matrices for RNA sequences called RIBOSUM matrices. RSEARCH reports the statistical confidence for each hit as well as the structural alignment of the hit. We show several examples in which RSEARCH outperforms the primary sequence search programs BLAST and SSEARCH. The primary drawback of the program is that it is slow. The C code for RSEARCH is freely available from our lab's website.

Conclusion

RSEARCH outperforms primary sequence programs in finding homologs of structured RNA sequences.

4.

Alignment of protein sequences using secondary structure: a modified dynamic programming method

F Fischel-Ghodsian G Mathiowitz T F Smith 《Protein engineering》1990,3(7):577-581

A method for comparison of protein sequences based on their primary and secondary structure is described. Protein sequences are annotated with predicted secondary structures (using a modified Chou and Fasman method). Two lettered code sequences are generated (Xx, where X is the amino acid and x is its annotated secondary structure). Sequences are compared with a dynamic programming method (STRALIGN) that includes a similarity matrix for both the amino acids and secondary structures. The similarity value for each paired two-lettered code is a linear combination of similarity values for the paired amino acids and their annotated secondary structures. The method has been applied to eight globin proteins (28 pairs) for which the X-ray structure is known. For protein pairs with high primary sequence similarity (greater than 45%), STRALIGN alignment is identical to that obtained by a dynamic programming method using only primary sequence information. However, alignment of protein pairs with lower primary sequence similarity improves significantly with the addition of secondary structure annotation. Alignment of the pair with the least primary sequence similarity of 16% was improved from 0 to 37% 'correct' alignment using this method. In addition, STRALIGN was successfully applied to seven pairs of distantly related cytochrome c proteins, and three pairs of distantly related picornavirus proteins.
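The scoring idea in this abstract (a two-letter code per position, scored as a linear combination of amino acid and secondary structure similarity) can be illustrated with a small sketch. This is not the STRALIGN implementation: the identity-based similarity values, the weight `w`, the linear gap penalty, the choice of a global (Needleman-Wunsch style) recurrence, and all function names are assumptions made for illustration only.

```python
def annotate(residues, structure):
    """Build the two-letter code sequence: one (amino acid, structure) pair per position."""
    return list(zip(residues, structure))


def pair_score(a, b, w=0.5):
    """Linear combination of amino acid and secondary structure similarity.

    Both similarities are toy identity scores (1.0 for a match, 0.0 otherwise);
    a real method would look them up in similarity matrices.
    """
    aa_sim = 1.0 if a[0] == b[0] else 0.0
    ss_sim = 1.0 if a[1] == b[1] else 0.0
    return w * aa_sim + (1.0 - w) * ss_sim


def global_align_score(s, t, w=0.5, gap=-1.0):
    """Global dynamic programming alignment score over two-letter codes (assumed variant)."""
    n, m = len(s), len(t)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair_score(s[i - 1], t[j - 1], w),  # align the two codes
                dp[i - 1][j] + gap,                                    # gap in t
                dp[i][j - 1] + gap,                                    # gap in s
            )
    return dp[n][m]
```

With `w=1.0` the score reduces to a sequence-only comparison, which gives a simple way to see how the structure annotation changes the result for pairs with low sequence similarity.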

5.

Mining parasite data using genetic programming

Barrett J Kostadinova A Raga JA 《Trends in parasitology》2005,21(5):207-209

Genetic programming is a technique that can be used to tackle the hugely demanding data-processing problems encountered in the natural sciences. Application of genetic programming to a problem using parasites as biological tags demonstrates its potential for developing explanatory models using data that are both complex and noisy.

6.

Discovering common stem-loop motifs in unaligned RNA sequences

Gorodkin J Stricklin SL Stormo GD 《Nucleic acids research》2001,29(10):2135-2144

7.

An approach to delineate primers for a group of poorly conserved sequences incorporating the common motif region

Sahu M Sahu J Sahoo S Dehury B Sarma K Sarmah R Sen P Modi MK Barooah M 《Bioinformation》2012,8(4):181-184

Glutathione synthetase (gshB) has previously been reported to confer tolerance to acidic soil conditions in Rhizobium species. Cloning the gene coding for this enzyme necessitates the design of proper primer sets, which in turn depends on the identification of high quality sequence similarity in multiple global alignments. In this experiment, a group of homologous gene sequences related to the gshB gene (accession no: gi-86355669:327589-328536) of Rhizobium etli CFN 42 were extracted from NCBI nucleotide sequence databases using BLASTN and were analyzed for designing degenerate primers. However, the T-coffee multiple global alignment results did not show any block of conserved region for the above sequence set to design the primers. Therefore, we attempted to identify the location of a common motif region based on multiple local alignments employing the MEME algorithm supported with MAST and Primer3. The results revealed some common motif regions that enabled us to design the primer sets for related gshB gene sequences. The result will be validated in the wet lab.

8.

What the papers say: Engineering a plant RNA virus for expression of foreign genetic sequences

Donald L. Nuss 《BioEssays : news and reviews in molecular, cellular and developmental biology》1986,4(3):133-134

9.

Pseudoknots: a new motif in the RNA game.

C W Pleij 《Trends in biochemical sciences》1990,15(4):143-147

In the last few years a novel RNA folding principle called pseudoknotting has emerged. Originally discovered in noncoding regions of plant viral RNAs, pseudoknots now appear to be a widespread structural motif in a number of functionally different RNAs. These structural elements are part of tRNA-like structures and are involved in folding catalytic sites of ribozymes. They increase the efficiency of ribosomal frameshifting or can serve as specific binding sites for regulatory proteins.

10.

Finding the right template: RNA Pol IV, a plant-specific RNA polymerase

Vaughn MW Martienssen RA 《Molecular cell》2005,17(6):754-756

11.

Finding optimal vaccination strategies under parameter uncertainty using stochastic programming

Tanner MW Sattenspiel L Ntaimo L 《Mathematical biosciences》2008,215(2):144-151

We present a stochastic programming framework for finding the optimal vaccination policy for controlling infectious disease epidemics under parameter uncertainty. Stochastic programming is a popular framework for including the effects of parameter uncertainty in a mathematical optimization model. The problem is initially formulated to find the minimum cost vaccination policy under a chance-constraint. The chance-constraint requires that the probability that R(*)

12.

Finding associations in dense genetic maps: a genetic algorithm approach

Clark TG De Iorio M Griffiths RC Farrall M 《Human heredity》2005,60(2):97-108

Large-scale association studies hold promise for discovering the genetic basis of common human disease. These studies will consist of a large number of individuals, as well as a large number of genetic markers, such as single nucleotide polymorphisms (SNPs). The potential size of the data and the resulting model space require the development of efficient methodology to unravel associations between phenotypes and SNPs in dense genetic maps. Our approach uses a genetic algorithm (GA) to construct logic trees consisting of Boolean expressions involving strings or blocks of SNPs. These blocks or nodes of the logic trees consist of SNPs in high linkage disequilibrium (LD), that is, SNPs that are highly correlated with each other due to evolutionary processes. At each generation of our GA, a population of logic tree models is modified using selection, cross-over and mutation moves. Logic trees are selected for the next generation using a fitness function based on the marginal likelihood in a Bayesian regression framework. Mutation and cross-over moves use LD measures to propose changes to the trees, and facilitate the movement through the model space. We demonstrate our method and the flexibility of logic tree structure with variable nodal lengths on simulated data from a coalescent model, as well as data from a candidate gene study of quantitative genetic variation.

13.

The common and the distinctive features of the bulged-G motif based on a 1.04 Å resolution RNA structure

Correll CC Beneken J Plantinga MJ Lubbers M Chan YL 《Nucleic acids research》2003,31(23):6806-6818

Bulged-G motifs are ubiquitous internal RNA loops that provide specific recognition sites for proteins and RNAs. To establish the common and distinctive features of the motif we determined the structures of three variants and compared them with related structures. The variants are 27-nt mimics of the sarcin/ricin loop (SRL) from Escherichia coli 23S ribosomal RNA that is an essential part of the binding site for elongation factors (EFs). The wild-type SRL has now been determined at 1.04 Å resolution, supplementing data obtained before at 1.11 Å and allowing the first calculation of coordinate error for an RNA motif. The other two structures, having a viable (C2658U•G2663A) or a lethal mutation (C2658G•G2663C), were determined at 1.75 and 2.25 Å resolution, respectively. Comparisons reveal that bulged-G motifs have a common hydration and geometry, with flexible junctions at flanking structural elements. Six conserved nucleotides preserve the fold of the motif; the remaining seven to nine vary in sequence and alter contacts in both grooves. Differences between accessible functional groups of the lethal mutation and those of the viable mutation and wild-type SRL may account for the impaired elongation factor binding to ribosomes with the C2658G•G2663C mutation and may underlie the lethal phenotype.

14.

G-ribo: a new structural motif in ribosomal RNA

Steinberg SV Boutorine YI 《RNA (New York, N.Y.)》2007,13(4):549-554

Analysis of the available crystal structures of the ribosome and of its subunits has revealed a new RNA motif that we call G-ribo. The motif consists of two double helices positioned side-by-side and connected by an unpaired region. The juxtaposition of the two helices is kept by a complex system of tertiary interactions spread over several layers of stacked nucleotides. In the center of this arrangement, the ribose of a nucleotide from one helix is specifically packed with the ribose and the minor-groove edge of a guanosine from the other helix. In total, we found eight G-ribo motifs in both ribosomal subunits. The location of these motifs suggests that at least some of them play an important role in the formation of the ribosome structure and/or in its function.

15.

PicXAA-R: efficient structural alignment of multiple RNA sequences using a greedy approach

Sahraeian SM Yoon BJ 《BMC bioinformatics》2011,12(Z1):S38

16.

Dynalign: an algorithm for finding the secondary structure common to two RNA sequences. Total citations: 28; self-citations: 0; citations by others: 28

Mathews DH Turner DH 《Journal of molecular biology》2002,317(2):191-203

With the rapid increase in the size of the genome sequence database, computational analysis of RNA will become increasingly important in revealing structure-function relationships and potential drug targets. RNA secondary structure prediction for a single sequence is 73% accurate on average for a large database of known secondary structures. This level of accuracy provides a good starting point for determining a secondary structure either by comparative sequence analysis or by the interpretation of experimental studies. Dynalign is a new computer algorithm that improves the accuracy of structure prediction by combining free energy minimization and comparative sequence analysis to find a low free energy structure common to two sequences without requiring any sequence identity. It uses a dynamic programming construct suggested by Sankoff. Dynalign, however, restricts the maximum distance, M, allowed between aligned nucleotides in the two sequences. This makes the calculation tractable because the complexity is simplified to O(M^3 N^3), where N is the length of the shorter sequence. The accuracy of Dynalign was tested with sets of 13 tRNAs, seven 5S rRNAs, and two R2 3' UTR sequences. On average, Dynalign predicted 86.1% of known base-pairs in the tRNAs, as compared to 59.7% for free energy minimization alone. For the 5S rRNAs, the average accuracy improves from 47.8% to 86.4%. The secondary structure of the R2 3' UTR from Drosophila takahashii is poorly predicted by standard free energy minimization. With Dynalign, however, the structure predicted in tandem with the sequence from Drosophila melanogaster nearly matches the structure determined by comparative sequence analysis.
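The effect of the distance restriction described in this abstract can be made concrete with a short sketch. It only enumerates which index pairs remain alignable under a bound M; it is not the Dynalign recursion, and the reading of the constraint as |i - k| <= M on zero-based positions, along with the function name `allowed_pairs`, is an assumption made for illustration.

```python
def allowed_pairs(n1, n2, M):
    """Return index pairs (i, k) that may be aligned when |i - k| <= M.

    i indexes the first sequence (length n1), k the second (length n2).
    Assumed reading of the restriction; the published algorithm may differ in detail.
    """
    return [(i, k) for i in range(n1) for k in range(n2) if abs(i - k) <= M]


# Without the restriction every (i, k) pair is a candidate; with a small M
# only a band around the diagonal survives.
unrestricted = 5 * 5
banded = len(allowed_pairs(5, 5, 1))
```

Because each aligned position is confined to a band of width governed by M rather than ranging over the whole second sequence, the number of states the dynamic programming tables must cover grows with M instead of with the full sequence length.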

17.

Locomotif: from graphical motif description to RNA motif search

Reeder J Reeder J Giegerich R 《Bioinformatics (Oxford, England)》2007,23(13):i392-i400

MOTIVATION AND RESULTS: Motivated by the recent rise of interest in small regulatory RNAs, we present Locomotif--a new approach for locating RNA motifs that goes beyond the previous ones in three ways: (1) motif search is based on efficient dynamic programming algorithms, incorporating the established thermodynamic model of RNA secondary structure formation. (2) motifs are described graphically, using a Java-based editor, and search algorithms are derived from the graphics in a fully automatic way. The editor allows us to draw secondary structures, annotated with size and sequence information. They closely resemble the established, but informal way in which RNA motifs are communicated in the literature. Thus, the learning effort for Locomotif users is minimal. (3) Locomotif employs a client-server approach. Motifs are designed by the user locally. Search programs are generated and compiled on a bioinformatics server. They are made available both for execution on the server, and for download as C source code plus an appropriate makefile. AVAILABILITY: Locomotif is available at http://bibiserv.techfak.uni-bielefeld.de/locomotif.

18.

The kink-turn: a new RNA secondary structure motif. Total citations: 29; self-citations: 0; citations by others: 29

Klein DJ Schmeing TM Moore PB Steitz TA 《The EMBO journal》2001,20(15):4214-4221

Analysis of the Haloarcula marismortui large ribosomal subunit has revealed a common RNA structure that we call the kink-turn, or K-turn. The six K-turns in H. marismortui 23S rRNA superimpose with an r.m.s.d. of 1.7 Å. There are two K-turns in the structure of Thermus thermophilus 16S rRNA, and the structures of U4 snRNA and L30e mRNA fragments form K-turns. The structure has a kink in the phosphodiester backbone that causes a sharp turn in the RNA helix. Its asymmetric internal loop is flanked by C-G base pairs on one side and sheared G-A base pairs on the other, with an A-minor interaction between these two helical stems. A derived consensus secondary structure for the K-turn includes 10 consensus nucleotides out of 15, and predicts its presence in the 5'-UTR of L10 mRNA, helix 78 in Escherichia coli 23S rRNA and human RNase MRP. Five K-turns in 23S rRNA interact with nine proteins. While the observed K-turns interact with proteins of unrelated structures in different ways, they interact with L7Ae and two homologous proteins in the same way.

19.

A common RNA recognition motif identified within a defined U1 RNA binding domain of the 70K U1 snRNP protein. Total citations: 130; self-citations: 0; citations by others: 130

C C Query R C Bentley J D Keene 《Cell》1989,57(1):89-101

We have defined the RNA binding domain of the 70K protein component of the U1 small nuclear ribonucleoprotein to a region of 111 amino acids. This domain encompasses an octamer sequence that has been observed in other proteins associated with RNA, but has not previously been shown to bind directly to a specific RNA sequence. Within the U1 RNA binding domain, an 80 amino acid consensus sequence that is conserved in many presumed RNA binding proteins was discerned. This sequence pattern appears to represent an RNA recognition motif (RRM) characteristic of a distinct family of proteins. By site-directed mutagenesis, we determined that the 70K protein consists of 437 amino acids (52 kd), and found that its aberrant electrophoretic migration is due to a carboxy-terminal charged domain structurally similar to two Drosophila proteins (su(wa) and tra) that may regulate alternative pre-messenger RNA splicing.

20.

Discriminative motif discovery in DNA and protein sequences using the DEME algorithm

Emma Redhead Timothy L Bailey 《BMC bioinformatics》2007,8(1):385
