Similar articles
 20 similar articles found (search time: 15 ms)
1.
Fast and sensitive multiple sequence alignments on a microcomputer   Total citations: 99, self-citations: 0, citations by others: 99
A strategy is described for the rapid alignment of many long nucleic acid or protein sequences on a microcomputer. The program described can handle up to 100 sequences of 1200 residues each. The approach is based on progressively aligning sequences according to the branching order in an initial phylogenetic tree. The results obtained using the package appear to be as sensitive as those from any other available method. Received on October 7, 1988; accepted on December 6, 1988

2.
Using PC/GENE for protein and nucleic acid analysis   Total citations: 4, self-citations: 0, citations by others: 4
This paper describes a series of protein analyses using the molecular biology software package PC/GENE, which runs on an IBM or compatible microcomputer. A nucleic acid sequence was first edited and then translated into an amino acid sequence. The amino acid composition, isoelectric point, molecular weight, and other properties of the sequence were determined. Programs to predict secondary structure, alpha helix membrane associations, hydrophobic and hydrophilic regions, and surface and antigenic sites from the amino acid sequence were also used. A search was made in a data base for sequences containing a region similar to a region in the protein sequence. Sequence alignments and queries of data bases can also be performed.
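Two of the analyses listed above, amino acid composition and the location of hydrophobic and hydrophilic regions, can be illustrated with a minimal Python sketch. This is not PC/GENE code: the function names `composition` and `hydropathy_profile` are hypothetical, and the use of the Kyte-Doolittle hydropathy scale with a sliding window is an assumption, since the abstract does not say which scale or method the package applies.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(protein):
    """Return the fraction of each amino acid present in the sequence."""
    counts = Counter(protein)
    total = len(protein)
    return {aa: counts[aa] / total for aa in sorted(counts)}


def hydropathy_profile(protein, window=5):
    """Return the mean hydropathy of every full window along the sequence.

    Positive values suggest hydrophobic stretches, negative values
    hydrophilic ones.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    toy_protein = "MKKLLIVVAAGDEE"  # made-up sequence for illustration only
    print(composition(toy_protein))
    print(hydropathy_profile(toy_protein))
```

The window length controls how smooth the profile is; longer windows are commonly used when looking for membrane-spanning segments, shorter ones for surface regions.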

3.
Vienna RNA secondary structure server   Total citations: 1, self-citations: 0, citations by others: 1
The Vienna RNA secondary structure server provides a web interface to the most frequently used functions of the Vienna RNA software package for the analysis of RNA secondary structures. It currently offers prediction of secondary structure from a single sequence, prediction of the consensus secondary structure for a set of aligned sequences and the design of sequences that will fold into a predefined structure. All three services can be accessed via the Vienna RNA web server at http://rna.tbi.univie.ac.at/.

4.
Apple Macintosh programs for nucleic and protein sequence analyses   Total citations: 4, self-citations: 1, citations by others: 3
This paper describes a package of programs for handling and analyzing nucleic acid and protein sequences using the Apple Macintosh microcomputer. There are three important features of these programs: first, because of the now classical Macintosh interface, the programs can be easily used by persons with little or no computer experience. Second, it is possible to save all the data, written in an editable scrolling text window or drawn in a graphic window, as files that can be directly used either as word processing documents or as picture documents. Third, sequences can be easily exchanged with any other computer. The package is composed of thirteen programs, written in the Pascal programming language.

5.
6.
We have developed a collection of programs for manipulation and analysis of nucleotide and protein sequences. The package was written in Fortran 77 on a Sirius1/Victor microcomputer and can be easily implemented on a large variety of other computers. Some of the programs have already been adapted for use on a Vax 11. Our aim was to develop programs consisting of small, comprehensible and well documented units that have very fast execution times and are comfortably interactive. The package is therefore suitable for individual modifications, even with little understanding of computer languages.

7.
Certain sequences, known as chameleon sequences, take both alpha- and beta-conformations in natural proteins. We demonstrate that a wild chameleon sequence fused to the C-terminal alpha-helix or beta-sheet in foreign stable proteins from hyperthermophiles forms the same conformation as the host secondary structure. However, no secondary structural formation is observed when the sequence is attached to the outside of the secondary structure. These results indicate that this sequence inherently possesses an ability to make either alpha- or beta-conformation, depending on the sequentially neighboring secondary structure if little other nonlocal interaction occurs. Thus, chameleon sequences take on a satellite state through contagion by the power of a secondary structure. We propose this "conformational contagion" as a new nonlocal determinant factor in protein structure and misfolding related to protein conformational diseases.

8.
R-Coffee is a multiple RNA alignment package, derived from T-Coffee, designed to align RNA sequences while exploiting secondary structure information. R-Coffee uses an alignment-scoring scheme that incorporates secondary structure information within the alignment. It works particularly well as an alignment improver and can be combined with any existing sequence alignment method. In this work, we used R-Coffee to compute multiple sequence alignments combining the pairwise output of sequence aligners and structural aligners. We show that R-Coffee can improve the accuracy of all the sequence aligners. We also show that the consistency-based component of T-Coffee can improve the accuracy of several structural aligners. R-Coffee was tested on 388 BRAliBase reference datasets and on 11 longer Cmfinder datasets. Altogether our results suggest that the best protocol for aligning short sequences (less than 200 nt) is the combination of R-Coffee with the RNA pairwise structural aligner Consan. We also show that the simultaneous combination of the four best sequence alignment programs with R-Coffee produces alignments almost as accurate as those obtained with R-Coffee/Consan. Finally, we show that R-Coffee can also be used to align longer datasets beyond the usual scope of structural aligners. R-Coffee is freely available for download, along with documentation, from the T-Coffee web site (www.tcoffee.org).

9.
Computational tools for prediction of the secondary structure of two or more interacting nucleic acid molecules are useful for understanding mechanisms for ribozyme function, determining the affinity of an oligonucleotide primer to its target, and designing good antisense oligonucleotides, novel ribozymes, DNA code words, or nanostructures. Here, we introduce new algorithms for prediction of the minimum free energy pseudoknot-free secondary structure of two or more nucleic acid molecules, and for prediction of alternative low-energy (sub-optimal) secondary structures for two nucleic acid molecules. We provide a comprehensive analysis of our predictions against secondary structures of interacting RNA molecules drawn from the literature. Analysis of our tools on 17 sequences of up to 200 nucleotides that do not form pseudoknots shows that they have 79% accuracy, on average, for the minimum free energy predictions. When the best of 100 sub-optimal foldings is taken, the average accuracy increases to 91%. The accuracy decreases as the sequences increase in length and as the number of pseudoknots and tertiary interactions increases. Our algorithms extend the free energy minimization algorithm of Zuker and Stiegler for secondary structure prediction, and the sub-optimal folding algorithm by Wuchty et al. Implementations of our algorithms are freely available in the package MultiRNAFold.

10.
Compilation and analysis of viroid and viroid-like RNA sequences.   Total citations: 2, self-citations: 1, citations by others: 1
We have created a catalogue comprising all viroid and viroid-like RNA sequences which to our knowledge had been either published or made available from on-line sequence libraries as of October 1, 1995. In the development of this catalogue nomenclature ambiguities were removed, the likely ancestral sequence of most species was determined and the most stable secondary structures of these sequences were predicted using the MulFold package. Only viroids of PSTVd-type possessed a rod-like secondary structure, while most other viroids adopted branched secondary structures. Several viroids have predicted secondary structures that include either a Y or cruciform structure reminiscent of the tRNA-like end of virus genomes at an extremity. However, it remains unknown whether or not these predicted structures are adopted in solution, and if they serve a particular function in vivo. Additional information such as the position of the self-catalytic domains is included in the catalogue. An analysis of the data compiled in the catalogue is included. The catalogue will be available on the world wide web (http://www.callistro.si.usherb.ca/jpperra), on computer disk and in printed form. It should provide an excellent reference point for further studies.

11.
An algorithm for multiple sequence comparison was implemented in FORTRAN 77 for VAX/VMS in GCG-compatible format. The MULTICOMP program package includes several procedures with which one query sequence can be compared simultaneously to several DNA, RNA or amino acid sequences. The same technique was also introduced for comparing propensities of secondary structural features, which can be predicted on the basis of amino acid sequences. The technique has been applied to a wide range of sequence and structural analyses.

12.
BACKGROUND: With the ever-increasing number of sequenced RNAs and the establishment of new RNA databases, such as the Comparative RNA Web Site and Rfam, there is a growing need for accurately and automatically predicting RNA structures from multiple alignments. Since RNA secondary structure is often conserved in evolution, the well known, but underused, mutual information measure for identifying covarying sites in an alignment can be useful for identifying structural elements. This article presents MIfold, a MATLAB toolbox that employs mutual information, or a related covariation measure, to display and predict conserved RNA secondary structure (including pseudoknots) from an alignment. RESULTS: We show that MIfold can be used to predict simple pseudoknots, and that the performance can be adjusted to make it either more sensitive or more selective. We also demonstrate that the overall performance of MIfold improves with the number of aligned sequences for certain types of RNA sequences. In addition, we show that, for these sequences, MIfold is more sensitive but less selective than the related RNAalifold structure prediction program and is comparable with the COVE structure prediction package. CONCLUSION: MIfold provides a useful supplementary tool to programs such as RNA Structure Logo, RNAalifold and COVE, and should be useful for automatically generating structural predictions for databases such as Rfam.
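The mutual information measure mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration of the standard definition, MI(i, j) = sum over observed (a, b) of p(a, b) * log2(p(a, b) / (p(a) * p(b))), applied to two alignment columns; it is not MIfold itself (a MATLAB toolbox), and the function name `mutual_information` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `alignment` is a list of equal-length aligned sequences. Gaps, if
    present, are treated as an ordinary symbol (a simplifying assumption).
    """
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    n = len(alignment)

    freq_i = Counter(col_i)              # single-column symbol counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))  # joint counts of symbol pairs

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Made-up alignment: columns 0 and 4 covary as Watson-Crick pairs,
    # columns 1 to 3 are fully conserved.
    toy_alignment = ["GACUC", "CACUG", "AACUU", "UACUA"]
    print(mutual_information(toy_alignment, 0, 4))  # covarying columns
    print(mutual_information(toy_alignment, 1, 2))  # conserved columns
```

In the toy alignment the covarying pair scores 2 bits while the conserved pair scores 0, which shows why mutual information highlights compensatory changes but carries no signal for invariant positions.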

13.
RAGA: RNA sequence alignment by genetic algorithm.   Total citations: 7, self-citations: 0, citations by others: 7
We describe a new approach for accurately aligning two homologous RNA sequences when the secondary structure of one of them is known. To do so we developed two software packages, called RAGA and PRAGA, which use a genetic algorithm approach to optimize the alignments. RAGA is mainly an extension of SAGA, an earlier package for multiple protein sequence alignment. In PRAGA several genetic algorithms run in parallel and exchange individual solutions. This method allows us to optimize an objective function that describes the quality of an RNA pairwise alignment, taking into account both primary and secondary structure, including pseudoknots. We report results obtained using PRAGA on nine test cases of pairs of eukaryotic small subunit rRNA sequences (nuclear and mitochondrial).

14.
Multiple sequence alignments become biologically meaningful only if conserved, functionally important residues and preserved secondary structural elements can be identified at equivalent positions. This is particularly important for transmembrane proteins like G-protein coupled receptors (GPCRs) with seven transmembrane helices. TM-MOTIF is a software package and an effective alignment viewer to identify and display conserved motifs and amino acid substitutions (AAS) at each position of the aligned set of homologous sequences of GPCRs. The key feature of the package is to display the predicted membrane topology for seven transmembrane helices in seven colours (VIBGYOR colouring scheme) and to map the identified motifs onto their respective helix/loop regions. It is an interactive package which allows the user to submit a query or a pre-aligned set of GPCR sequences to align with a reference sequence, like rhodopsin, whose structure has been solved experimentally. It also makes it possible to identify the nearest homologue from the available inbuilt GPCR or Olfactory Receptor cluster dataset whose association is already known for its receptor type. AVAILABILITY: The database is available for free at mini@ncbs.res.in.

15.

Background

The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented.

Results

TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold. TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations. Secondary structures composed of base pairs with estimated probabilities higher than a significance threshold are shown to be more accurate for TurboFold than for alternative methods that estimate base pairing probabilities. TurboFold-MEA, which uses base pairing probabilities from TurboFold in a maximum expected accuracy algorithm for secondary structure prediction, has accuracy comparable to the best performing secondary structure prediction methods. 
The computational and memory requirements for TurboFold are modest and, in terms of sequence length and number of sequences, scale much more favorably than joint alignment and folding algorithms.

Conclusions

TurboFold is an iterative probabilistic method for predicting secondary structures for multiple RNA sequences that efficiently and accurately combines the information from the comparative analysis between sequences with the thermodynamic folding model. Unlike most other multi-sequence structure prediction methods, TurboFold does not enforce strict commonality of structures and is therefore useful for predicting structures for homologous sequences that have diverged significantly. TurboFold can be downloaded as part of the RNAstructure package at http://rna.urmc.rochester.edu.

16.
A microcomputer program which locates tRNA genes within long DNA sequences is described. The search is performed either by identifying tRNA-like secondary structures or by locating eukaryotic RNA polymerase III promoter consensus sequences. The program is also useful in finding inverted repeats allowing the formation of stem-loop secondary structures in tRNA. The program has been developed in BASIC and 6502 Assembler and runs on the Apple II plus and IIe microcomputers. The execution is quite fast; all the operations are carried out in 1–90 s, depending on the required task and on the sequence length. Received on March 1, 1985; accepted on April 25, 1985
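The inverted-repeat search described in this abstract can be illustrated with a short Python sketch. It is not the original BASIC/6502 program: the function name `find_inverted_repeats` and its parameters (`stem_len`, `min_loop`, `max_loop`) are hypothetical, and the sketch assumes exact Watson-Crick complementarity on a DNA alphabet, with no mismatches or G-U wobble pairs.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def find_inverted_repeats(dna, stem_len=4, min_loop=3, max_loop=8):
    """Find candidate stem-loops formed by exact inverted repeats.

    Returns a list of (start, loop_length) tuples, where `start` is the
    0-based position of the left stem arm. Assumption: both arms have
    length `stem_len` and must pair perfectly.
    """
    hits = []
    for start in range(len(dna) - 2 * stem_len - min_loop + 1):
        left = dna[start:start + stem_len]
        if any(base not in COMPLEMENT for base in left):
            continue  # skip windows containing ambiguous symbols
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop
            right = dna[right_start:right_start + stem_len]
            if len(right) < stem_len:
                break  # right arm would run past the end of the sequence
            if right == target:
                hits.append((start, loop))
    return hits


if __name__ == "__main__":
    # Made-up sequence: GACG ... CGTC flanks a 4-base loop.
    print(find_inverted_repeats("TTGACGAAAACGTCTT"))
```

On the toy sequence the only hit is the arm starting at position 2 with a 4-base loop. A scan of this kind costs time proportional to sequence length times the number of loop sizes tried, which is consistent with run times that depend on the sequence length.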

17.
Visually examining RNA structures can greatly aid in understanding their potential functional roles and in evaluating the performance of structure prediction algorithms. As many functional roles of RNA structures can already be studied given the secondary structure of the RNA, various methods have been devised for visualizing RNA secondary structures. Most of these methods depict a given RNA secondary structure as a planar graph consisting of base-paired stems interconnected by roundish loops. In this article, we present an alternative method of depicting RNA secondary structure as arc diagrams. This is well suited for structures that are difficult or impossible to represent as planar stem-loop diagrams. Arc diagrams can intuitively display pseudo-knotted structures, as well as transient and alternative structural features. In addition, they facilitate the comparison of known and predicted RNA secondary structures. An added benefit is that structure information can be displayed in conjunction with a corresponding multiple sequence alignment, thereby highlighting structure and primary sequence conservation and variation. We have implemented the visualization algorithm as a web server R-chie as well as a corresponding R package called R4RNA, which allows users to run the software locally and across a range of common operating systems.

18.
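This paper presents a computer program system for nucleic acid sequence analysis implemented on a microcomputer (IBM PC). The system consists of three levels and 18 functional modules; menus and interactive dialogue allow users to learn and use it quickly. The implementation uses data structure techniques such as tree structures, first-in-last-out stacks and sparse matrices, statistical methods such as the Bayes method, and a series of graph-theoretic methods such as Kruskal's algorithm and Floyd's algorithm. The release of this software system should benefit molecular biology research.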

19.
MOTIVATION: To predict the consensus secondary structure, possibly including pseudoknots, of a set of unaligned RNA sequences. RESULTS: We have designed a method based on a new representation of any RNA secondary structure as a set of structural relationships between the helices of the structure. We refer to this representation as a structural pattern. In a first step, we use thermodynamic parameters to select, for each sequence, the best secondary structures according to energy minimization and we represent each of them using its corresponding structural pattern. In a second step, we search for the repeated structural patterns, i.e. the largest structural patterns that occur in at least one sequence, i.e. included in at least one of the structural patterns associated with each sequence. Thanks to an efficient encoding of structural patterns, this search comes down to identifying the largest repeated word suffixes in a dictionary. In a third step, we compute the plausibility of each repeated structural pattern by checking if it occurs more frequently in the studied sequences than in random RNA sequences. We then suppose that the consensus secondary structure corresponds to the repeated structural pattern that displays the highest plausibility. We present several experiments concerning tRNA, fragments of 16S rRNA and 10Sa RNA (including pseudoknots); in each of them, we found the putative consensus secondary structure.

20.
Biophysical Journal, 2017, 112(1): 16-21
Intrinsically disordered proteins and regions (IDPs) represent a large class of proteins that are defined by conformational heterogeneity and lack of persistent tertiary/secondary structure. IDPs play important roles in a range of biological functions, and their dysregulation is central to numerous diseases, including neurodegeneration and cancer. The conformational ensembles of IDPs are encoded by their amino acid sequences. Here, we present two computational tools that are designed to enable rapid and high-throughput analyses of a wide range of physicochemical properties encoded by IDP sequences. The first, CIDER, is a user-friendly webserver that enables rapid analysis of IDP sequences. The second, localCIDER, is a high-performance software package that enables a wide range of analyses relevant to IDP sequences. In addition to introducing the two packages, we demonstrate the utility of these resources using examples where sequence analysis offers biophysical insights.
