Similar literature
 20 similar documents found (search time: 0 ms)
1.
2.
Buhler and Tompa (2002) introduced the random projection algorithm for the motif discovery problem and demonstrated that this algorithm performs well on both simulated and biological samples. We describe a modification of the random projection algorithm, called the uniform projection algorithm, which utilizes a different choice of projections. We replace the random selection of projections by a greedy heuristic that approximately equalizes the coverage of the projections. We show that this change in selection of projections leads to improved performance on motif discovery problems. Furthermore, the uniform projection algorithm is directly applicable to other problems where the random projection algorithm has been used, including comparison of protein sequence databases.
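The idea of replacing random projections with a coverage-equalizing greedy choice can be illustrated with a minimal sketch. It assumes that "coverage" means how often each position of an l-mer is used across the chosen projections; the heuristic in the paper may equalize a different quantity. The function names `greedy_projections` and `bucket_lmers` and the parameters `l`, `k`, and `m` are hypothetical, introduced only for this illustration.

```python
import random
from collections import defaultdict


def greedy_projections(l, k, m, seed=0):
    """Pick m projections (k of l positions each), always taking the
    least-used positions so far. Illustrative assumption, not the
    paper's exact heuristic."""
    rng = random.Random(seed)
    usage = [0] * l  # how many projections cover each position
    projections = []
    for _ in range(m):
        # Order positions by usage count, breaking ties at random.
        order = sorted(range(l), key=lambda p: (usage[p], rng.random()))
        chosen = tuple(sorted(order[:k]))
        for p in chosen:
            usage[p] += 1
        projections.append(chosen)
    return projections, usage


def bucket_lmers(sequences, l, projection):
    """Hash every l-mer into a bucket keyed by its letters at the
    projected positions, as in projection-based motif finding."""
    buckets = defaultdict(list)
    for s_idx, seq in enumerate(sequences):
        for start in range(len(seq) - l + 1):
            key = "".join(seq[start + p] for p in projection)
            buckets[key].append((s_idx, start))
    return buckets
```

Under this assumption, the usage counts of any two positions never differ by more than one, whereas purely random selection gives no such guarantee. Buckets that collect many l-mers would then be candidates for refinement into motifs.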

3.
4.
5.
6.
MOTIVATION: This paper studies the problem of discovering subsequences, known as motifs, that are common to a given collection of related biosequences, by proposing a greedy algorithm for learning a mixture of motifs model through likelihood maximization. The approach sequentially adds a new motif to a mixture model by performing a combined scheme of global and local search to initialize its parameters appropriately. In addition, a hierarchical partitioning scheme based on kd-trees is presented for partitioning the input dataset in order to speed up the global search procedure. The proposed method compares favorably with the well-known MEME approach and successfully addresses several drawbacks of MEME. RESULTS: Experimental results indicate that the algorithm is advantageous in identifying larger groups of motifs characteristic of biological families with significant conservation. In addition, it offers better diagnostic capabilities by building more powerful statistical motif models with improved classification accuracy.

7.
8.
9.
10.
Stochastic models for heterogeneous DNA sequences   Total citations: 10, self-citations: 0, citations by others: 10
The composition of naturally occurring DNA sequences is often strikingly heterogeneous. In this paper, the DNA sequence is viewed as a stochastic process with local compositional properties determined by the states of a hidden Markov chain. The model used is a discrete-state, discrete-outcome version of a general model for non-stationary time series proposed by Kitagawa (1987). A smoothing algorithm is described which can be used to reconstruct the hidden process and produce graphic displays of the compositional structure of a sequence. The problem of parameter estimation is approached using likelihood methods, and an EM algorithm for approximating the maximum likelihood estimate is derived. The methods are applied to sequences from yeast mitochondrial DNA, human and mouse mitochondrial DNAs, a human X chromosomal fragment and the complete genome of bacteriophage lambda.
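As a rough illustration of how a smoothing pass can reconstruct a hidden compositional state, the sketch below applies the standard scaled forward-backward recursion to a hidden Markov chain with discrete emissions. This is a textbook smoother and is not claimed to be the exact recursion used in the paper; the function name `smooth` and the parameters in the usage note are assumptions chosen for the example.

```python
def smooth(seq, init, trans, emit):
    """Return P(state at t | whole sequence) for each position t.

    init[s]     : initial probability of state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][b]  : probability that state s emits base b
    """
    n, num_states = len(seq), len(init)
    states = range(num_states)

    # Forward pass, normalizing each step to avoid numerical underflow.
    fwd, scale, prev = [], [], None
    for t, base in enumerate(seq):
        if t == 0:
            row = [init[s] * emit[s][base] for s in states]
        else:
            row = [emit[s][base] * sum(prev[r] * trans[r][s] for r in states)
                   for s in states]
        c = sum(row)
        prev = [x / c for x in row]
        fwd.append(prev)
        scale.append(c)

    # Backward pass, reusing the forward scaling constants.
    bwd = [[1.0] * num_states for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = seq[t + 1]
        bwd[t] = [sum(trans[s][r] * emit[r][nxt] * bwd[t + 1][r] for r in states)
                  / scale[t + 1] for s in states]

    # The smoothed posterior is the product of the two scaled terms.
    return [[fwd[t][s] * bwd[t][s] for s in states] for t in range(n)]
```

For example, with a hypothetical AT-rich state 0 and GC-rich state 1, each with a 0.9 probability of staying put, calling `smooth` on a sequence returns per-position probabilities that can be plotted along the sequence to display its compositional structure. In a full analysis the parameters would be estimated, for instance by EM, rather than fixed by hand.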

11.
12.
13.
14.
We present a method for encoded tagging and imaging of short nucleic acid motif chains (oligomotifs) using selective hybridization of heterogeneous Au nanoparticles (Au-NP). The resulting encoded NP string is thus representative of the underlying motif sequence. As the NPs are much more massive than the motifs, the motif chain order can be directly observed using scanning electron microscopy. Using this technique we demonstrate direct sequencing of oligomotifs in single DNA molecules consisting of four 100-nt motif chains tagged with four different types of NPs. The method outlined is a precursor for a high density direct sequencing technology.

15.
16.
17.
18.
19.
20.