Stochastic models for heterogeneous DNA sequences |
| |
Authors: | Gary A Churchill |
| |
Institution: | (1) Department of Biostatistics, University of Washington, 98195 Seattle, WA, U.S.A. |
| |
Abstract: | The composition of naturally occurring DNA sequences is often strikingly heterogeneous. In this paper, the DNA sequence is
viewed as a stochastic process with local compositional properties determined by the states of a hidden Markov chain. The
model used is a discrete-state, discrete-outcome version of a general model for non-stationary time series proposed by Kitagawa
(1987). A smoothing algorithm is described which can be used to reconstruct the hidden process and produce graphic displays
of the compositional structure of a sequence. The problem of parameter estimation is approached using likelihood methods and
an EM algorithm for approximating the maximum likelihood estimate is derived. The methods are applied to sequences from yeast
mitochondrial DNA, human and mouse mitochondrial DNAs, a human X chromosomal fragment and the complete genome of bacteriophage
lambda. |
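The smoothing step mentioned in the abstract can be illustrated with the standard scaled forward-backward recursion for a discrete-state, discrete-outcome hidden Markov chain. This is only a minimal sketch, not the paper's own algorithm or notation: the function name `smooth`, the two states (AT-rich and GC-rich), and every probability below are assumptions chosen for illustration, and the toy sequence is not data from the paper.

```python
def smooth(seq, init, trans, emit):
    """Return P(state at t | whole sequence) for every position t.

    seq   : string of observed bases
    init  : initial state probabilities (list, length k)
    trans : k x k transition matrix, trans[i][j] = P(next = j | current = i)
    emit  : list of k dicts, emit[i][base] = P(base | state i)
    """
    n, k = len(seq), len(init)

    # Forward pass: alpha[t][i] = P(state_t = i | seq[0..t]), rescaled at each step.
    alpha = []
    prev = None
    for t, base in enumerate(seq):
        if t == 0:
            row = [init[i] * emit[i][base] for i in range(k)]
        else:
            row = [emit[j][base] * sum(prev[i] * trans[i][j] for i in range(k))
                   for j in range(k)]
        total = sum(row)
        prev = [x / total for x in row]
        alpha.append(prev)

    # Backward pass: beta[t][i] is proportional to P(seq[t+1..] | state_t = i).
    beta = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = seq[t + 1]
        row = [sum(trans[i][j] * emit[j][nxt] * beta[t + 1][j] for j in range(k))
               for i in range(k)]
        total = sum(row)
        beta[t] = [x / total for x in row]

    # Combine and normalise to get the smoothed posterior at each position.
    posterior = []
    for t in range(n):
        row = [alpha[t][i] * beta[t][i] for i in range(k)]
        total = sum(row)
        posterior.append([x / total for x in row])
    return posterior


# Hypothetical two-state example: state 0 is AT-rich, state 1 is GC-rich.
init = [0.5, 0.5]
trans = [[0.9, 0.1],
         [0.1, 0.9]]
emit = [{"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
        {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4}]
sequence = "AAAATTTTGGGGCCCC"

posterior = smooth(sequence, init, trans, emit)
for base, probs in zip(sequence, posterior):
    print(base, round(probs[0], 3))  # smoothed probability of the AT-rich state
```

Plotting the first column of `posterior` against sequence position gives the kind of graphic display of compositional structure that the abstract refers to. In practice the parameters would be estimated from the sequence, for example by EM, rather than fixed by hand as they are here.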
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|