共查询到20条相似文献,搜索用时 15 毫秒
1.
We describe an algorithm for finding nucleotide residues stronglycorrelated with the amino acid acceptor functions of transferRNAs. The algorithm exploits the fact that each tRNA acceptsonly one of 20 amino acids. The algorithm is applied to 37 Saccharomycescerevisiae transfer RNAs. Received on January 28, 1987 相似文献
2.
Local gapped subforest alignment and its application in finding RNA structural motifs. 总被引:1,自引:0,他引:1
RNA molecules whose secondary structures contain similar substructures often have similar functions. Therefore, an important task in the study of RNA is to develop methods for discovering substructures in RNA secondary structures that occur frequently (also referred to as motifs). In this paper, we consider the problem of computing an optimal local alignment of two given labeled ordered forests F1 and F2. This problem asks for a substructure of F1 and a substructure of F2 that exhibit a high similarity. Since an RNA molecule's secondary structure can be represented as a labeled ordered forest, the problem we study has a direct application to finding potential motifs. We generalize the previously studied concept of a closed subforest to a gapped subforest and present the first algorithm for computing the optimal local gapped subforest alignment of F1 and F2. We also show that our technique can improve the time and space complexity of the previously most efficient algorithm for optimal local closed subforest alignment. Furthermore, we prove that a special case of our local gapped subforest alignment problem is equivalent to a problem known in the literature as the local sequence-structure alignment problem (lssa) and modify our main algorithm to obtain a much faster algorithm for lssa than the one previously proposed. An implementation of our algorithm is available at www.comp.nus.edu.sg/~bioinfo/LGSFAligner/. Its running time is significantly faster than the original lssa program. 相似文献
3.
MOTIVATION: This paper is concerned with algorithms for aligning two whole genomes so as to identify regions that possibly contain conserved genes. Motivated by existing heuristic-based software tools, we initiate the study of an optimization problem that attempts to uncover conserved genes with a global concern. Another interesting feature in our formulation is the tolerance of noise, which also complicates the optimization problem. A brute-force approach takes time exponential in the noise level. RESULTS: We show how an insight into the optimization structure can lead to a drastic improvement in the time and space requirement [precisely, to O(k2n2) and O(k2n), respectively, where n is the size of the input and k is the noise level]. The reduced space requirement allows us to implement the new algorithm, called MaxMinCluster, on a PC. It is exciting to see that when tested with different real data sets, MaxMinCluster consistently uncovers a high percentage of conserved genes that have been published by GenBank. Its performance is indeed favorably compared to MUMmer (perhaps the most popular software tool for uncovering conserved genes in a whole-genome scale). AVAILABILITY: The source code is available from the website http://www.csis.hku.hk/~colly/maxmincluster/ detailed proof of the propositions can also be found there. 相似文献
4.
Super pairwise alignment (SPA): an efficient approach to global alignment for homologous sequences. 总被引:6,自引:0,他引:6
Sequence analysis is the basis of bioinformatics, while sequence alignment is a fundamental task for sequence analysis. The widely used alignment algorithm, Dynamic Programming, though generating optimal alignment, takes too much time due to its high computation complexity O(N(2)). In order to reduce computation complexity without sacrificing too much accuracy, we have developed a new approach to align two homologous sequences. The new approach presented here, adopting our novel algorithm which combines the methods of probabilistic and combinatorial analysis, reduces the computation complexity to as low as O(N). The computation speed by our program is at least 15 times faster than traditional pairwise alignment algorithms without a loss of much accuracy. We hence named the algorithm Super Pairwise Alignment (SPA). The pairwise alignment execution program based on SPA and the detailed results of the aligned sequences discussed in this article are available upon request. 相似文献
5.
Background
The comparison of homologous sequences from different species is an essential approach to reconstruct the evolutionary history of species and of the genes they harbour in their genomes. Several complete mitochondrial and nuclear genomes are now available, increasing the importance of using multiple sequence alignment algorithms in comparative genomics. MtDNA has long been used in phylogenetic analysis and errors in the alignments can lead to errors in the interpretation of evolutionary information. Although a large number of multiple sequence alignment algorithms have been proposed to date, they all deal with linear DNA and cannot handle directly circular DNA. Researchers interested in aligning circular DNA sequences must first rotate them to the "right" place using an essentially manual process, before they can use multiple sequence alignment tools. 相似文献6.
We present a fast algorithm to produce a graphic matrix representationof sequence homology. The algorithm is based on lexicographicalordering of fragments. It preserves most of the options of asimple naive algorithm with a significant increase in speed.This algorithm was the basis for a program, called DNAMAT, thathas been extensively tested during the last three years at theWeizmann Institute of Science and has proven to be very useful.In addition we suggest a way to extend our approach to analysea series of related DNA or RNA sequences, in order to determinecertain common structural features. The analysis is done bysumming a set of dot-matrices to produce an overallmatrix that displays structural elements common to most of thesequences. We give an example of this procedure by analysingtRNA sequences. Received on June 26, 1986; accepted on September 28, 1986 相似文献
7.
An iterative refinement algorithm for consistency based multiple structural alignment methods 总被引:1,自引:0,他引:1
MOTIVATION: Multiple STructural Alignment (MSTA) provides valuable information for solving problems such as fold recognition. The consistency-based approach tries to find conflict-free subsets of alignments from a pre-computed all-to-all Pairwise Alignment Library (PAL). If large proportions of conflicts exist in the library, consistency can be hard to get. On the other hand, multiple structural superposition has been used in many MSTA methods to refine alignments. However, multiple structural superposition is dependent on alignments, and a superposition generated based on erroneous alignments is not guaranteed to be the optimal superposition. Correcting errors after making errors is not as good as avoiding errors from the beginning. Hence it is important to refine the pairwise library to reduce the number of conflicts before any consistency-based assembly. RESULTS: We present an algorithm, Iterative Refinement of Induced Structural alignment (IRIS), to refine the PAL. A new measurement for the consistency of a library is also proposed. Experiments show that our algorithm can greatly improve T-COFFEE performance for less consistent pairwise alignment libraries. The final multiple alignment outperforms most state-of-the-art MSTA algorithms at assembling 15 transglycosidases. Results on three other benchmarks showed that the algorithm consistently improves multiple alignment performance. AVAILABILITY: The C++ code of the algorithm is available upon request. 相似文献
8.
Omura F Fujita A Miyajima K Fukui N 《Bioscience, biotechnology, and biochemistry》2005,69(6):1162-1171
The Saccharomyces cerevisiae Put4 permease is significant for the transport of proline, alanine, and glycine. Put4p downregulation is counteracted by npi1 mutation that affects the cellular ubiquitination function. Here we describe mutant Put4 permeases, in which up to nine lysine residues in the cytoplasmic N-terminal domain have been replaced by arginine. The steady-state protein level of the mutant permease Put4-20p (Lys9, Lys34, Lys35, Lys60, Lys68, Lys71, Lys93, Lys105, Lys107 --> Arg) was largely higher compared to that of the wild-type Put4p, indicating that the N-terminal lysines can undergo ubiquitination and the subsequent degradation steps. Proline is the only amino acid that yeast assimilates with difficulty under standard brewing conditions. A lager yeast strain provided with Put4-20p was able to assimilate proline efficiently during beer fermentations. These results suggest possible industrial applications of the mutant Put4 permeases in improved fermentation systems for beer and other alcoholic beverages based on proline-rich fermentable sources. 相似文献
9.
RNA Sampler: a new sampling based algorithm for common RNA secondary structure prediction and structural alignment 总被引:1,自引:0,他引:1
MOTIVATION: Non-coding RNA genes and RNA structural regulatory motifs play important roles in gene regulation and other cellular functions. They are often characterized by specific secondary structures that are critical to their functions and are often conserved in phylogenetically or functionally related sequences. Predicting common RNA secondary structures in multiple unaligned sequences remains a challenge in bioinformatics research. Methods and RESULTS: We present a new sampling based algorithm to predict common RNA secondary structures in multiple unaligned sequences. Our algorithm finds the common structure between two sequences by probabilistically sampling aligned stems based on stem conservation calculated from intrasequence base pairing probabilities and intersequence base alignment probabilities. It iteratively updates these probabilities based on sampled structures and subsequently recalculates stem conservation using the updated probabilities. The iterative process terminates upon convergence of the sampled structures. We extend the algorithm to multiple sequences by a consistency-based method, which iteratively incorporates and reinforces consistent structure information from pairwise comparisons into consensus structures. The algorithm has no limitation on predicting pseudoknots. In extensive testing on real sequence data, our algorithm outperformed other leading RNA structure prediction methods in both sensitivity and specificity with a reasonably fast speed. It also generated better structural alignments than other programs in sequences of a wide range of identities, which more accurately represent the RNA secondary structure conservations. AVAILABILITY: The algorithm is implemented in a C program, RNA Sampler, which is available at http://ural.wustl.edu/software.html 相似文献
10.
We present an efficient algorithm for statistical multiple alignment based on the TKF91 model of Thorne, Kishino, and Felsenstein (1991) on an arbitrary k-leaved phylogenetic tree. The existing algorithms use a hidden Markov model approach, which requires at least O( radical 5(k)) states and leads to a time complexity of O(5(k)L(k)), where L is the geometric mean sequence length. Using a combinatorial technique reminiscent of inclusion/exclusion, we are able to sum away the states, thus improving the time complexity to O(2(k)L(k)) and considerably reducing memory requirements. This makes statistical multiple alignment under the TKF91 model a definite practical possibility in the case of a phylogenetic tree with a modest number of leaves. 相似文献
11.
Ying Chih Lin Chin Lung Lu Hwan-You Chang Chuan Yi Tang 《Journal of computational biology》2005,12(1):102-112
In the study of genome rearrangement, the block-interchanges have been proposed recently as a new kind of global rearrangement events affecting a genome by swapping two nonintersecting segments of any length. The so-called block-interchange distance problem, which is equivalent to the sorting-by-block-interchange problem, is to find a minimum series of block-interchanges for transforming one chromosome into another. In this paper, we study this problem by considering the circular chromosomes and propose a Omicron(deltan) time algorithm for solving it by making use of permutation groups in algebra, where n is the length of the circular chromosome and delta is the minimum number of block-interchanges required for the transformation, which can be calculated in Omicron(n) time in advance. Moreover, we obtain analogous results by extending our algorithm to linear chromosomes. Finally, we have implemented our algorithm and applied it to the circular genomic sequences of three human vibrio pathogens for predicting their evolutionary relationships. Consequently, our experimental results coincide with the previous ones obtained by others using a different comparative genomics approach, which implies that the block-interchange events seem to play a significant role in the evolution of vibrio species. 相似文献
12.
13.
14.
反义RNA及其在植物基因工程领域的应用 总被引:6,自引:0,他引:6
随着反义RNA的发现及对其研究的深入,反义RNA技术已被广泛应用于基因调控的研究中。本介绍了反义RNA的概念,并就反义RNA的作用机理和在植物基因工程领域的应用进行了综述。其作用机理包括:在原核生物中反义RNA与引物RNA前体及mRNA分子5′的不同区域进行互补,从而抑制其复制、转录和翻译;在其核生物中反义RNA影响mRNA前体拼接、转移及mRNA分子5′和3′正常修饰。在植物基因工程领域,反义RNA主要应用于抑制果实成熟、抗病、作为反向筛选标记基因、控制花色、控制淀粉合成、控制油料种子中脂肪酸的合成、控制雄性不育等方面。 相似文献
15.
16.
An efficient hybridized genetic algorithm architecture for the flexible job shop scheduling problem 总被引:1,自引:0,他引:1
The work presented in this paper proposes hybridized genetic algorithm architecture for the Flexible Job Shop Scheduling Problem (FJSP). The efficiency of the genetic algorithm is enhanced by integrating it with an initial population generation algorithm and a local search method. The usefulness of the proposed methodology is illustrated with the aid of an extensive computational study on 184 benchmark problems with the objective of minimizing the makespan. Results highlight the ability of the proposed algorithm to first obtain optimal or near-optimal solutions, and second to outperform or produce comparable results with these obtained by other best-known approaches in literature. 相似文献
17.
Sequence comparison is one of the major tasks in bioinformatics, which could serve as evidence of structural and functional conservation, as well as of evolutionary relations. There are several similarity/dissimilarity measures for sequence comparison, but challenges remains. This paper presented a binomial model-based measure to analyze biological sequences. With help of a random indicator, the occurrence of a word at any position of sequence can be regarded as a random Bernoulli variable, and the distribution of a sum of the word occurrence is well known to be a binomial one. By using a recursive formula, we computed the binomial probability of the word count and proposed a binomial model-based measure based on the relative entropy. The proposed measure was tested by extensive experiments including classification of HEV genotypes and phylogenetic analysis, and further compared with alignment-based and alignment-free measures. The results demonstrate that the proposed measure based on binomial model is more efficient. 相似文献
18.
Constrained multiple sequence alignment tool development and its application to RNase family alignment 总被引:1,自引:0,他引:1
Tang CY Lu CL Chang MD Tsai YT Sun YJ Chao KM Chang JM Chiou YH Wu CM Chang HT Chou WI 《Journal of bioinformatics and computational biology》2003,1(2):267-287
In this paper, we design a heuristic algorithm of computing a constrained multiple sequence alignment (CMSA for short) for guaranteeing that the generated alignment satisfies the user-specified constraints that some particular residues should be aligned together. If the number of residues needed to be aligned together is a constant alpha, then the time-complexity of our CMSA algorithm for aligning K sequences is O(alphaKn(4)), where n is the maximum of the lengths of sequences. In addition, we have built up such a CMSA software system and made several experiments on the RNase sequences, which mainly function in catalyzing the degradation of RNA molecules. The resulting alignments illustrate the practicability of our method. 相似文献
19.
Optimal alignment between groups of sequences and its application to multiple sequence alignment 总被引:11,自引:2,他引:11
Four algorithms, AD, were developed to align two groupsof biological sequences. Algorithm A is equivalent to the conventionaldynamic programming method widely used for aligning ordinarysequences, whereas algorithms B D are designed to evaluatethe cost for a deletion/insertion more accurately when internalgaps are present in either or both groups of sequences. Rigorousoptimization of the sum of pairs (SP) score isachieved by algorithm D, whose average performance is closeto O(MNL2) where M and N are numbers of sequences included inthe two groups and L is the mean length of the sequences. AlgorithmB uses some app mximations to cope with profile-based operations,whereas algorithm C is a simpler variant of algorithm D. Thesegroup-to-group alignment algorithms were applied to multiplesequence alignment with two iterative strategies: a progressivemethod based on a given binary tree and a randomized grouping-realignmentmethod. The advantages and disadvantages of the four algorithmsare discussed on the basis of the results of exatninations ofseveral protein families. 相似文献
20.
Incremental genetic K-means algorithm and its application in gene expression data analysis 总被引:1,自引:0,他引:1