Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage |
| |
Authors: | Altschul SF; Erickson BW |
| |
Institution: | Department of Applied Mathematics, Massachusetts Institute of Technology. |
| |
Abstract: | The similarity of two nucleotide sequences is often expressed in terms of
evolutionary distance, a measure of the amount of change needed to
transform one sequence into the other. Given two sequences with a small
distance between them, can their similarity be explained by their base
composition alone? The nucleotide order of these sequences contributes to
their similarity if the distance is much smaller than their average
permutation distance, which is obtained by calculating the distances for
many random permutations of these sequences. To determine whether their
similarity can be explained by their dinucleotide and codon usage, random
sequences must be chosen from the set of permuted sequences that preserve
dinucleotide and codon usage. The problem of choosing random dinucleotide
and codon-preserving permutations can be expressed in the language of graph
theory as the problem of generating random Eulerian walks on a directed
multigraph. An efficient algorithm for generating such walks is described.
This algorithm can be used to choose random sequence permutations that
preserve (1) dinucleotide usage, (2) dinucleotide and trinucleotide usage,
or (3) dinucleotide and codon usage. For example, the similarity of two
60-nucleotide DNA segments from the human beta-1 interferon gene
(nucleotides 196-255 and 499-558) is not just the result of their nonrandom
dinucleotide and codon usage.
|
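As an illustration of case (1) in the abstract, the following is a minimal Python sketch of a dinucleotide-preserving shuffle built on the Eulerian-walk formulation. It is not taken from the paper's text. The function names (`dinucleotide_shuffle`, `dinucleotide_counts`, `_reaches`) and the rejection step used to pick the "last edge" leaving each vertex are assumptions made for this example. Trinucleotide and codon preservation, cases (2) and (3), are not covered.

```python
import random
from collections import Counter, defaultdict


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides in seq."""
    return Counter(a + b for a, b in zip(seq, seq[1:]))


def _reaches(v, target, last_edge):
    """Follow the chosen last edges from v; True if target is reached."""
    seen = set()
    while v != target:
        if v in seen:  # ran into a cycle that avoids target
            return False
        seen.add(v)
        v = last_edge[v]
    return True


def dinucleotide_shuffle(seq, rng=random):
    """Return a random permutation of seq with the same dinucleotide counts.

    Vertices are nucleotides; each adjacent pair in seq is a directed edge.
    A shuffled sequence corresponds to an Eulerian walk that uses every edge
    exactly once, starting at seq[0] and ending at seq[-1].
    """
    if len(seq) < 2:
        return seq
    first, last = seq[0], seq[-1]

    # successors[v] lists, with multiplicity, the letters that follow v.
    successors = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        successors[a].append(b)

    # For every vertex except the final one, pick the edge that will be
    # used last. Retry until these edges all lead to the final vertex;
    # this guarantees the walk below cannot get stuck early.
    others = [v for v in successors if v != last]
    while True:
        last_edge = {v: rng.choice(successors[v]) for v in others}
        if all(_reaches(v, last, last_edge) for v in others):
            break

    # Shuffle the remaining edges of each vertex, keeping the chosen
    # last edge at the end of its list.
    order = {}
    for v, succ in successors.items():
        rest = list(succ)
        if v in last_edge:
            rest.remove(last_edge[v])
        rng.shuffle(rest)
        if v in last_edge:
            rest.append(last_edge[v])
        order[v] = rest

    # Walk the graph, consuming each vertex's edges in the chosen order.
    used = Counter()
    out = [first]
    cur = first
    for _ in range(len(seq) - 1):
        nxt = order[cur][used[cur]]
        used[cur] += 1
        out.append(nxt)
        cur = nxt
    return "".join(out)
```

Calling `dinucleotide_shuffle(seq, random.Random(0))` on a DNA string returns a sequence of the same length with the same first and last nucleotides and the same dinucleotide counts. An average permutation distance could then be estimated by computing the distance for many such shuffles of the two sequences.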
| |
Keywords: | |
This article is indexed in Oxford and other databases. |
|