Similar literature
1.
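Liu Juan, Gao Jie. 《生物信息学》 (Chinese Journal of Bioinformatics), 2011, 9(2): 97-101
A time series model is used to analyze influenza B and influenza C viruses, and a new time series representation of their DNA sequences is proposed: the CGR radian sequence. CGR coordinates are used to convert the influenza B and influenza C DNA sequences into CGR radian sequences, and the long-memory ARFIMA model is introduced to fit both classes of sequence. Ten randomly selected influenza B sequences and ten randomly selected influenza C sequences all show long-range correlation and are fitted well. The results also suggest that the two kinds of virus sequence can tentatively be identified with different ARFIMA models, namely ARFIMA(0,d,4) and ARFIMA(1,d,1).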

2.
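Analysis of the RHD gene of the Rh blood group system based on the chaos game method   Cited by: 3 (self-citations: 0, by others: 3)
Gao Lei, Qi Bin, Zhu Ping. 《生命科学研究》 (Life Science Research), 2009, 13(5): 408-412
Using the chaos game representation (CGR) of protein sequences based on the classical HP model, a CGR plot of the protein sequence of the RHD gene is given. The plot can be regarded as a characteristic map of the secondary structure of the protein sequence and has some reference value for clinical blood group identification. In addition, following the CGR method for DNA sequences proposed by Jeffrey in 1990, a CGR plot of the DNA sequence of the RHD gene is given. From this plot the corresponding two-step Markov transition probability matrix of the RHD gene is calculated, and the probability matrix shows the gene's usage preference for the third base of the triplets that encode amino acids.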
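The abstract does not spell out how the transition matrix is built, so the following is only a minimal sketch under one possible reading: a second-order Markov table giving the probability of each base conditioned on the two bases before it, estimated from trinucleotide counts. The function name `second_order_transitions` is hypothetical, and the paper derives its matrix from the CGR plot rather than by direct counting.

```python
from collections import defaultdict


def second_order_transitions(sequence):
    """Estimate P(next base | previous two bases) from overlapping triplets.

    Assumption: "two-step transition" is read here as a second-order
    Markov model; this is an illustration, not the paper's procedure.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(sequence) - 2):
        context = sequence[i:i + 2]      # the two preceding bases
        next_base = sequence[i + 2]      # the base that follows them
        counts[context][next_base] += 1

    probabilities = {}
    for context, next_counts in counts.items():
        total = sum(next_counts.values())
        probabilities[context] = {
            base: n / total for base, n in next_counts.items()
        }
    return probabilities
```

Restricting the windows to codon-aligned positions would give a table closer to the third-base preference the abstract mentions; the sketch above uses all overlapping triplets for simplicity.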

3.
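Objective: To predict the protein sequences of influenza A (H1N1) virus in future years using a long-memory model. Methods: Based on time series analysis, CGR (chaos game) sequences are first constructed and then fitted with a model. For 70 influenza virus protein sequences with relatively high homology, selected from 1943 to 2012, the chaos game representation is applied first, and then an ARFIMA(p,d,q) model is used to fit and predict their first 10 positions. Results: With very few exceptions, the value at each position of the original protein sequences falls within the prediction region, which indicates that the selected model is reasonable. Conclusion: The model can be used to predict influenza virus protein sequences in future years and is of significant research value for the prediction and prevention of influenza viruses.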

4.
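The genetic algorithm is a computational model that simulates the process of biological evolution, and it is a global optimization search algorithm. A new method is proposed that applies the genetic algorithm to the problem of identifying transcription factor binding sites. A consensus sequence model is used to describe the conserved motif; the alignment of the motif sequence against the sequences under test is encoded and thereby converted into an optimization problem over a search space, and the genetic algorithm is used to search for the optimal solution and predict the transcription factor binding sites. Experimental results show that the new method is effective: it accurately identifies the transcription factor binding sites under test while using only a small amount of memory.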

5.
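In studies of DNA sequence similarity, the commonly used dynamic programming algorithms lack a theoretical basis for the gap penalty function, which makes the choice subjective and leads to differing results. This paper proposes a DNA sequence similarity measure based on the DTW (Dynamic Time Warping) distance that addresses this problem. A graphical representation is used to convert each DNA sequence into a time series, and the DTW distance between the series is then calculated to measure sequence similarity and characterize the properties of the DNA sequences, yielding a method for comparing DNA sequence similarity. The method is used to compare and analyze the similarity of seven neurotoxin gene sequences of the East Asian scorpion (Buthus martensi Karsch neurotoxin), which verifies the effectiveness and accuracy of the measure.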
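The two steps described above (sequence to time series, then DTW distance) can be sketched as follows. The abstract does not say which graphical representation the authors use, so the purine/pyrimidine walk in `dna_walk` is a stand-in chosen purely for illustration, and both function names are hypothetical. The DTW recurrence itself is the standard one with absolute difference as the local cost.

```python
def dna_walk(sequence):
    """Map a DNA string to a numeric series (assumed mapping, not the paper's).

    Purines (A, G) step +1, pyrimidines (C, T) step -1; the series is the
    running sum of the steps.
    """
    steps = {"A": 1, "G": 1, "C": -1, "T": -1}
    series, height = [], 0
    for base in sequence.upper():
        height += steps[base]
        series.append(height)
    return series


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric series."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because the warping path can repeat elements of either series, insertions and deletions are absorbed without an explicit gap penalty, which is the property the abstract appeals to.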

6.
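Building on a random-primer library, the full-length cDNA of dsRNA segment IV of the Bombyx mori cytoplasmic polyhedrosis virus (BmCPV) was cloned by RT-PCR using cross-designed primers and an added single-stranded DNA adapter, and its complete sequence was determined. The segment is 3,262 bp long and contains one complete open reading frame encoding a mature polypeptide of 1,058 amino acids. Sequence analysis shows that the segment shares 89% nucleotide sequence homology and 95% amino acid sequence homology with segment IV of the Japanese BmCPV.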

7.
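The genomic DNA and cDNA sequences of a lactase gene (EMBL Accession No. AJ431643) were cloned from a lactase-producing strain of Aspergillus candidus. Sequence analysis shows that the genomic DNA sequence of the lactase gene is 3458 bp long and contains 8 introns, and that the cDNA coding region is 3015 bp long and encodes 1005 amino acids, of which the first 19 form a signal peptide. The amino acid sequence contains 11 potential glycosylation sites. Comparison with lactase gene sequences from other sources shows that this gene has low homology with the great majority of lactase genes. Although it has relatively high homology with the lactase sequence of Aspergillus oryzae ATCC 20423, its enzymatic properties are superior to those of the latter, so the lactase gene of Aspergillus candidus may be a new gene with broader prospects for production and application.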

8.
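The long terminal repeat (LTR) of avian reticuloendotheliosis virus (REV) was amplified by PCR and cloned between the EcoR I and Sac I sites of the multiple cloning site (MCS) of plasmid pUC-18, and the polyadenylation sequence of the BGH gene was cloned between the SphI and HindIII sites as a terminator, yielding the recombinant plasmid pUC-LTR. The GFP gene and the REV envelope glycoprotein gp90 gene were each cloned into the pUC-LTR vector to obtain plasmids pUC-LTR-GFP and pUC-LTR-gp90. Expression of the foreign genes could be detected 48 h after transfection with the recombinant plasmids. This study suggests that the REV LTR can serve as a promoter for constructing expression plasmids.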

9.
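Zheng Wenhua, Xu Xu. 《生命的化学》 (Chemistry of Life), 2004, 24(3): 259-262
DNA of a specific sequence is usually detected by methods such as PCR, DNA hybridization, and ligase reactions. In recent years, a number of new methods based on different principles have been developed, the more important of which are based on non-enzymatic ligation reactions, molecular beacons, nanoparticles, and inhibitor-DNA-enzyme (IDE) constructs.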

10.
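Objective: To clone the excisionase (xis) gene of phage ψ297 and study its inheritance and variation. Methods: Chromosomal DNA was extracted from Escherichia coli O157:H7 strain EH297, the target gene was located by walking PCR, and the excisionase gene was obtained by molecular biology methods including cloning, subcloning, and DNA sequencing. The gene was then analyzed with sequence analysis software. Results: The complete sequence of the excisionase gene (xis) encoded by phage ψ297 was cloned. It is 255 bp long and encodes a protein (Xis) of 84 amino acids. The gene and protein sequences were compared with those of other members of the excisionase family of λ phages. The excisionase gene (xis) of phage ψ297 differs from that of phage VT1-Sakai by only 4 nucleotides, and its Xis protein is identical to that of phage VT1-Sakai, whereas it shares only 47.2% similarity with the Xis protein of phage 933W. Conclusion: The excisionase gene (xis) encoded by phage ψ297 is homologous to the excisionase genes of λ phages.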

11.
Chaos game representation of gene structure.   Cited by: 21 (self-citations: 2, by others: 19)
This paper presents a new method for representing DNA sequences. It permits the representation and investigation of patterns in sequences, visually revealing previously unknown structures. Based on a technique from chaotic dynamics, the method produces a picture of a gene sequence which displays both local and global patterns. The pictures have a complex structure which varies depending on the sequence. The method is termed Chaos Game Representation (CGR). CGR raises a new set of questions about the structure of DNA sequences, and is a new tool for investigating gene structure.
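As a rough illustration of how such a picture is produced, the sketch below computes CGR coordinates with the usual chaos-game rule: each base moves the current point halfway toward the corner of the unit square assigned to that base. The corner assignment, the starting point at the center of the square, and the function name `cgr_points` are assumptions made for this example; the abstract itself does not specify them.

```python
# Assumed corner assignment for the four bases (one common convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return one (x, y) point per base of the sequence.

    Each point is the midpoint between the previous point and the corner
    of the current base. Characters other than A, C, G, T are skipped.
    """
    x, y = start
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # ignore ambiguous symbols such as N
        corner_x, corner_y = CORNERS[base]
        x, y = (x + corner_x) / 2.0, (y + corner_y) / 2.0
        points.append((x, y))
    return points
```

Plotting the returned points as a scatter plot gives the CGR image; each point's quadrant is determined by the most recent base, its sub-quadrant by the base before that, and so on.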

12.
Similar to the chaos game representation (CGR) of DNA sequences proposed by Jeffrey (Nucleic Acids Res. 18 (1990) 2163), a new CGR of protein sequences based on the detailed HP model is proposed. Multifractal and correlation analyses of the measures based on the CGR of protein sequences from complete genomes are performed. The Dq spectra of all organisms studied are multifractal-like and sufficiently smooth for the Cq curves to be meaningful. The Cq curves of bacteria resemble a classical phase transition at a critical point. The correlation distance of the difference between the measure based on the CGR of protein sequences and its fractal background is also proposed to construct a more precise phylogenetic tree of bacteria.

13.
Chaos Game Representation (CGR) can recognize patterns in the nucleotide sequences, obtained from databases, of a class of genes using the techniques of fractal structures and by considering DNA sequences as strings composed of four units, G, A, T and C. Such recognition of patterns relies only on visual identification and no mathematical characterization of CGR is known. The present report describes two algorithms that can predict the presence or absence of a stretch of nucleotides in any gene family. The first algorithm can be used to generate DNA sequences represented by any point in the CGR. The second algorithm can simulate known CGR patterns for different gene families by setting the probabilities of occurrence of different di- or trinucleotides by a trial and error process using some guidelines and approximate rules-of-thumb. The validity of the second algorithm has been tested by simulating sequences that can mimic the CGRs of vertebrate non-oncogenes, proto-oncogenes and oncogenes. These algorithms can provide a mathematical basis of the CGR patterns obtained using nucleotide sequences from databases.

14.
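To study the multifractal properties of genomic sequences in depth, 12 relatively long DNA sequences are first selected and converted into 12 corresponding time series according to their coding and non-coding segments. Multifractal Hurst analysis is then performed on the 12 time series: their Hurst exponents are calculated and used to analyze the self-similarity of the sequences. The resulting Hurst exponents are further compared with the one-dimensional DNA walk model. All 12 sequences are found to have long-range correlation, which indicates that long-range correlation does exist in DNA sequences.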

15.
The chaos game representation (CGR) is a scatter plot derived from a DNA sequence, with each point of the plot corresponding to one base of the sequence. If the DNA sequence were a random collection of bases, the CGR would be a uniformly filled square; conversely, any patterns visible in the CGR represent some pattern (information) in the DNA sequence. In this paper, patterns previously observed in a variety of DNA sequences are explained solely in terms of nucleotide, dinucleotide and trinucleotide frequencies.
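The link between CGR patterns and oligonucleotide frequencies can be checked numerically: binning CGR points into a 2^k by 2^k grid gives the same counts as tallying k-mers directly. The sketch below illustrates this under assumed conventions (corner assignment, starting point at the center, uppercase A/C/G/T input only); the function names are hypothetical, and it is intended for short sequences where floating-point rounding cannot move a point across a cell boundary.

```python
from collections import Counter

# Assumed corner assignment (bit pairs) for the four bases.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_cell_counts(sequence, k):
    """Count CGR points per cell of a 2^k x 2^k grid.

    The first k-1 points are skipped because they depend on the starting
    point rather than on a full k-mer.
    """
    size = 2 ** k
    x, y = 0.5, 0.5
    counts = Counter()
    for i, base in enumerate(sequence):
        corner_x, corner_y = CORNERS[base]
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        if i >= k - 1:
            counts[(int(x * size), int(y * size))] += 1
    return counts


def kmer_cell(kmer):
    """Grid cell of a k-mer: the last base supplies the highest bit."""
    col = sum(CORNERS[b][0] << i for i, b in enumerate(kmer))
    row = sum(CORNERS[b][1] << i for i, b in enumerate(kmer))
    return (col, row)


def kmer_cell_counts(sequence, k):
    """Count overlapping k-mers, keyed by the grid cell each one maps to."""
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        counts[kmer_cell(sequence[i:i + k])] += 1
    return counts
```

With k = 1, 2, and 3 the cells correspond to nucleotides, dinucleotides, and trinucleotides respectively, so a sparse or dense region of the plot is simply a rare or common oligonucleotide.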

16.
Analysis of genomic sequences by Chaos Game Representation   Cited by: 4 (self-citations: 0, by others: 4)
MOTIVATION: Chaos Game Representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to find the coordinates for their position in a continuous space. This distribution of positions has two properties: it is unique, and the source sequence can be recovered from the coordinates such that distance between positions measures similarity between the corresponding sequences. The possibility of using the latter property to identify succession schemes has been entirely overlooked in previous studies, which raises the possibility that CGR may be upgraded from a mere representation technique to a sequence modeling tool. RESULTS: The distribution of positions in the CGR plane was shown to be a generalization of Markov chain probability tables that accommodates non-integer orders. Therefore, Markov models are particular cases of CGR models rather than the reverse, as currently accepted. In addition, the CGR generalization has both practical (computational efficiency) and fundamental (scale independence) advantages. These results are illustrated by using Escherichia coli K-12 as a test data-set, in particular, the genes thrA, thrB and thrC of the threonine operon.

17.
Hai ming Ni  Da wei Qi  Hongbo Mu 《Genomics》2018,110(3):180-190
Converting a DNA sequence to an image by using chaos game representation (CGR) is an effective genome sequence pretreatment technology, which provides the basis for further analysis between different genes. In this paper, we constructed gene CGR images for 10 mammal species, 48 hepatitis E viruses (HEV), and 10 kinds of bacteria, respectively, and calculated the mean structural similarity (MSSIM) coefficient between every two CGR images. Our analysis shows that the MSSIM coefficient of gene CGR images can accurately reflect the degree of similarity between different genomes. Hierarchical clustering analysis was used to calculate the class affiliation and construct a dendrogram. Large numbers of experiments showed that this method gives results comparable to the traditional Clustal X phylogenetic tree construction method, and is significantly faster in the clustering analysis process. Meanwhile, the combined MSSIM and CGR method was also able to efficiently cluster large genome sequences, which the traditional multiple sequence alignment methods (e.g. Clustal X, Clustal Omega, Clustal W, etc.) cannot classify.

18.
A new method to determine entropic profiles in DNA sequences is presented. It is based on the chaos-game representation (CGR) of gene structure, a technique which produces a fractal-like picture of DNA sequences. First, the CGR image was divided into squares 4^(-m) in size (m being the desired resolution), and the point density counted. Second, appropriate intervals were adjusted, and then a histogram of densities was prepared. Third, Shannon's formula was applied to the probability-distribution histogram, thus obtaining a new entropic estimate for DNA sequences, the histogram entropy, a measurement that goes with the level of constraints on the DNA sequence. Lastly, the entropic profile for the sequence was drawn, by considering the entropies at each resolution level, thus providing a way to summarize the complexity of large genomic regions or even entire genomes at different resolution levels. The application of the method to DNA sequences reveals that entropic profiles obtained in this way, as opposed to previously published ones, clearly discriminate between random and natural DNA sequences. Entropic profiles also show a different degree of variability within and between genomes. The results of these analyses are discussed in relation both to the genome compartmentalization in vertebrates and to the differential action of compositional and/or functional constraints on DNA sequences.
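The first three steps can be sketched as follows, with several assumptions that the abstract leaves open: the grid at resolution m has 2^m by 2^m cells (each of area 4^(-m)), the density of a cell is its raw point count, and the histogram uses one bin per distinct count instead of the adjusted intervals the authors mention. The corner assignment and the function name `histogram_entropy` are likewise hypothetical.

```python
import math
from collections import Counter

# Assumed corner assignment (bit pairs) for the four bases.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def histogram_entropy(sequence, m):
    """Shannon entropy (bits) of the histogram of CGR cell densities.

    Assumptions: 2^m x 2^m grid, raw counts as densities, one histogram
    bin per distinct count value. Input must contain only A, C, G, T.
    """
    size = 2 ** m
    x, y = 0.5, 0.5
    cell_counts = Counter()
    for base in sequence:
        corner_x, corner_y = CORNERS[base]
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        cell_counts[(int(x * size), int(y * size))] += 1

    # Density of every cell, including the empty ones.
    densities = [
        cell_counts.get((col, row), 0)
        for col in range(size)
        for row in range(size)
    ]
    histogram = Counter(densities)  # density value -> number of cells
    total_cells = size * size
    return -sum(
        (n / total_cells) * math.log2(n / total_cells)
        for n in histogram.values()
    )
```

Evaluating the function for m = 1, 2, 3, and so on and plotting the values against m would give a curve in the spirit of the entropic profile described in the last step.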
