Similar articles
20 similar articles found.
1.
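High-dimensional-space digital coding of the genetic code and DNA sequences   Total citations: 13, self-citations: 7, citations by others: 6
Binary digital coding is the most basic coding scheme in information science. When the four digits 0 (00), 1 (01), 2 (10) and 3 (11) are used to encode the four bases (C, T, A, G) in binary, there are 24 possible coding combinations, of which 8 satisfy the base complementarity rule; these 8 are topologically equivalent. The format 0123/CTAG, ordered by the molecular weight of the bases, is the most ideal coding format. Encoding the character sequence of DNA with binary numbers has the following advantages: 1) it reduces information redundancy and improves coding efficiency; 2) the structure and functional groups of the bases can ...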
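The counts quoted in this abstract can be checked by brute-force enumeration. The sketch below is illustrative only: it assumes that the "base complementarity rule" means complementary bases receive bitwise-complementary 2-bit codes (so the two codes sum to 3), an interpretation chosen because it reproduces the 8-of-24 figure, not one the abstract spells out. The function names are hypothetical.

```python
from itertools import permutations

BASES = "CTAG"
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def all_codings():
    """Return every assignment of the digits 0..3 to the four bases (4! = 24)."""
    return [dict(zip(BASES, digits)) for digits in permutations(range(4))]


def respects_complementarity(coding):
    """Assumed rule: complementary bases get bitwise-complementary codes,
    i.e. code(b) + code(complement(b)) == 3 (00 <-> 11, 01 <-> 10)."""
    return all(coding[b] + coding[COMPLEMENT[b]] == 3 for b in BASES)


def encode(seq, coding):
    """Pack a DNA string into one integer, two bits per base,
    with the first base in the most significant position."""
    value = 0
    for base in seq:
        value = (value << 2) | coding[base]
    return value


codings = all_codings()
valid = [c for c in codings if respects_complementarity(c)]
ctag = dict(zip(BASES, range(4)))  # the 0123/CTAG format named in the abstract

print(len(codings), len(valid), ctag in valid)
print(format(encode("CTAG", ctag), "08b"))
```

Under this assumed rule, the 0123/CTAG format is one of the qualifying codings, and a sequence of n bases packs into 2n bits.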

2.
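mtDNA sequences from non-destructive sampling and phylogeny of five rare Parnassius butterflies (绢蝶) of China   Total citations: 17, self-citations: 0, citations by others: 17
陈永久  沈发荣  Acta Genetica Sinica (《遗传学报》) 1999, 26(3): 203-207
Partial DNA sequences of the mitochondrial cytochrome b gene were determined, by DNA sequencing of non-destructively sampled material, for five rare Parnassius species: four from Baima Snow Mountain in Yunnan and one from the Tianshan Mountains in Xinjiang. In the 433 bp of sequence obtained, A+T accounts for about 75.4%, and 40 nucleotide sites are variable (about 9.24%). The primary sequence data show abundant DNA sequence variation among the five species. A molecular phylogenetic tree of the five species built with PAUP 3.1.1 (parsimony) shows that 爱珂绢蝶 and 巴裔绢蝶 are relatively closely related, 阿波...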

3.
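Isolation and analysis of a rice lipid transfer protein gene   Total citations: 3, self-citations: 0, citations by others: 3
Using the cDNA (pFDRSC110) of rice lipid transfer protein (LTP) as a probe, an LTP gene was screened from a genomic library of rice (Oryza sativa L. sp. indica) cultivar "Guangluai 4" (广陆矮4号), and a 1.8 kb fragment was sequenced. The gene encodes a peptide of 121 amino acids, and its coding region is interrupted by a 90 bp intron. Besides two TATA boxes and a poly(A) processing signal, the regulatory region contains several palindromic sequences and inverted repeats. The N-terminus of the encoded protein is a 28-amino-acid signal peptide, and homology comparison shows that the protein has the typical features of plant LTPs.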

4.
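Cloning and sequence analysis of banana bunchy top virus genes   Total citations: 11, self-citations: 0, citations by others: 11
肖火根  Hu John  Chinese Journal of Virology (《病毒学报》) 1999, 15(1): 55-63
DNA component I (DNA-1) and the coat protein (CP) and movement protein (MP) genes of a Chinese isolate of banana bunchy top virus (BBTV) were cloned and sequenced. BBTV DNA-1 contains 1103 nucleotides and shares 87%-88% and 96.9%-98% nucleotide sequence homology with South Pacific and Asian isolates, respectively. The replicase encoded by DNA-1 contains 186 amino acid residues and shares 84.4%-95.8% and 97.6%, 98.0% amino acid sequence homology with South Pacific and Asian isolates, respectively. The coat protein gene consists of 5...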

5.
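In recent years, as the Human Genome Project (HGP) has been carried out worldwide, deciphering the genetic code of humans and many model organisms has become an important discipline in biology, and an enormous amount of genomic information has been generated. Analyzing this information is an indispensable part of human genome research, and it has driven the emergence and development of bioinformatics. As a new field, bioinformatics takes the analysis of genomic DNA sequence information as its starting point; once information on protein-coding regions has been obtained, it models and predicts the spatial structure of proteins, and then, based on the function of specific proteins, carries out the necessary...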

6.
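Using RNA of the Inner Mongolia isolate (NM) of beet necrotic yellow vein virus (BNYVV) as template, the cDNA clone pGBF6 of BNYVV RNA4 was obtained by reverse transcription and PCR amplification. Sequence analysis showed that pGBF6 contains a full-length RNA4 cDNA insert of 1465 nucleotides with an open reading frame of 849 nucleotides, encoding a protein of 282 amino acids with a molecular weight of 31 kDa. Compared with RNA4 of the French F2 isolate, the homologies of its nucleotide sequence and of the deduced amino acid sequence are, respectively, ...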
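The figures in this abstract are mutually consistent, which a few lines of arithmetic can confirm. This is a rough sketch under two assumptions that the abstract does not state: that the 849-nucleotide open reading frame is counted with its stop codon, and that an average residue mass of about 110 Da (a common rule of thumb) is good enough for an estimate. All names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110  # assumed rule-of-thumb average mass per residue


def orf_to_protein_length(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_nt nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # The stop codon occupies one codon but adds no amino acid.
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues):
    """Rough molecular mass in kDa estimated from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000


residues = orf_to_protein_length(849)  # 283 codons, one of them the stop
mass = approx_mass_kda(residues)       # 282 * 110 / 1000

print(residues, mass)
```

With these assumptions the estimate lands on the 282 residues and roughly 31 kDa reported for the RNA4 product.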

7.
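Using total RNA of the Inner Mongolia isolate of beet necrotic yellow vein virus (BNYVV NM) as template, cDNA clones of natural deletion mutants of RNA2, RNA3 and RNA4 were obtained by RT-PCR amplification. Sequence analysis showed that the RNA2 natural deletion mutant lacks 348 nucleotides at the C-terminal end of the coding region of the 75 kD readthrough protein (deleted positions nt 1488 to nt 1835). RNA3 lacks 360 nucleotides within the coding region of its 25 kD protein (deleted positions nt 729 to nt 1088). The naturally deleted region of RNA4 lies in the 31 kD protein...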
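The deletion sizes follow from the quoted coordinates if both endpoints are treated as 1-based and inclusive, which is an assumption here. The sketch also checks whether each length is a multiple of three; that is an inference from the lengths alone (a deletion of whole codons would not shift the downstream reading frame), not something the abstract claims. Function names are hypothetical.

```python
def deletion_length(start_nt, end_nt):
    """Length of a deletion covering 1-based positions start_nt..end_nt inclusive."""
    if end_nt < start_nt:
        raise ValueError("end position must not precede start position")
    return end_nt - start_nt + 1


def preserves_frame(length):
    """True when the deleted length is a whole number of codons."""
    return length % 3 == 0


rna2 = deletion_length(1488, 1835)
rna3 = deletion_length(729, 1088)

print(rna2, preserves_frame(rna2))
print(rna3, preserves_frame(rna3))
```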

8.
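Soybean mosaic virus (SMV) causes a severe disease of soybean (Glycine max L.). cDNAs covering all protein-coding regions of the genome of SMV-ZK (a Chinese SMV isolate) were amplified by RT-PCR and cloned. Sequencing and analysis of the HC-Pro, NIb and CP coding regions showed that SMV-ZK is highly homologous to SMV-G2, demonstrating at the molecular level that SMV-G2-like strains occur in soybean crops in China. The SMV-ZK cDNAs were cloned into bacterial expression vectors, and the expression products of six cDNAs were obtained and purified. This work lays a foundation for further study of the functions of the SMV genome.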

9.
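Satellite, minisatellite and microsatellite DNA: tandem repeat sequences in eukaryotic genomes
姜运良 (College of Animal Science and Technology, Shandong Agricultural University, Tai'an, Shandong 271018)
Keywords: satellite; minisatellite; microsatellite; tandem repeat sequences
In eukaryotic genomes, structural genes encoding proteins (enzymes) make up only a small part (10%-20%); most of the rest consists of repetitive sequences. ...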

10.
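Molecular evidence for the transformation of wheat with total DNA of 大赖草 (Leymus racemosus)   Total citations: 11, self-citations: 0, citations by others: 11
缪军  赵民安  李维琪  Acta Genetica Sinica (《遗传学报》) 2000, 27(7): 621-627
Four highly repetitive sequence clones from the barley genome, pHv7161, pHv71789, pHv7191 and pHv7293, were labeled by two methods (digoxigenin and isotope) and used as probes for molecular hybridization, under high-stringency conditions, against the genomes of Xinjiang Leymus racemosus (donor), spring wheat 761 (recipient), and large-spike transformants obtained by introducing total L. racemosus DNA through the pollen-tube pathway. The results show that the four probes detect a dispersed repetitive sequence in the genome that has a tandem repeat unit. Comparison of the hybridization patterns of the recipient, the donor and the transformants showed that in the transformants there appeared...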

11.
Summary: The compositional distributions of coding sequences and DNA molecules (in the 50-100-kb range) are remarkably narrower in murids (rat and mouse) compared to humans (as well as to all other mammals explored so far). In murids, both distributions begin at higher and end at lower GC values. A comparison of homologous coding sequences from murids and humans revealed that their different compositional distributions are due to differences in GC levels in all three codon positions, particularly of genes located at both ends of the distribution. In turn, these differences are responsible for differences in both codon usage and amino acids. When GC levels at first+second codon positions and third codon positions, respectively, of murid genes are plotted against corresponding GC levels of homologous human genes, linear relationships (with very high correlation coefficients and slopes of about 0.78 and 0.60, respectively) are found. This indicates a conservation of the order of GC levels in homologous genes from humans and murids. (The same comparison for mouse and rat genes indicates a conservation of GC levels of homologous genes.) A similar linear relationship was observed when plotting GC levels of corresponding DNA fractions (as obtained by density gradient centrifugation in the presence of a sequence-specific ligand) from mouse and human. These findings indicate that orderly compositional changes affecting not only coding sequences but also noncoding sequences took place since the divergence of murids. Such directional fixations of mutations point to the existence of selective pressures affecting the genome as a whole.

12.
13.
The nucleotide sequence running from the genetic left end of bacteriophage T7 DNA to within the coding sequence of gene 4 is given, except for the internal coding sequence for the gene 1 protein, which has been determined elsewhere. The sequence presented contains nucleotides 1 to 3342 and 5654 to 12,100 of the approximately 40,000 base-pairs of T7 DNA. This sequence includes: the three strong early promoters and the termination site for Escherichia coli RNA polymerase; eight promoter sites for T7 RNA polymerase; six RNAase III cleavage sites; the primary origin of replication of T7 DNA; the complete coding sequences for 13 previously known T7 proteins, including the anti-restriction protein, protein kinase, DNA ligase, the gene 2 inhibitor of E. coli RNA polymerase, single-strand DNA binding protein, the gene 3 endonuclease, and lysozyme (which is actually an N-acetylmuramyl-L-alanine amidase); the complete coding sequences for eight potential new T7-coded proteins; and two apparently independent initiation sites that produce overlapping polypeptide chains of gene 4 primase. More than 86% of the first 12,100 base-pairs of T7 DNA appear to be devoted to specifying amino acid sequences for T7 proteins, and the arrangement of coding sequences and other genetic elements is very efficient. There is little overlap between coding sequences for different proteins, but junctions between adjacent coding sequences are typically close, the termination codon for one protein often overlapping the initiation codon for the next. For almost half of the potential T7 proteins, the sequence in the messenger RNA that can interact with 16 S ribosomal RNA in initiation of protein synthesis is part of the coding sequence for the preceding protein. The longest non-coding region, about 900 base-pairs, is at the left end of the DNA. The right half of this region contains the strong early promoters for E. coli RNA polymerase and the first RNAase III cleavage site. The left end contains the terminal repetition (nucleotides 1 to 160), followed by a striking array of repeated sequences (nucleotides 175 to 340) that might have some role in packaging the DNA into phage particles, and an A·T-rich region (nucleotides 356 to 492) that contains a promoter for T7 RNA polymerase, and which might function as a replication origin.

14.
15.
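Correlation analysis of local GC levels in human protein-coding genes   Total citations: 2, self-citations: 0, citations by others: 2
陈祥贵  胡军  杨潇  Hereditas (Beijing) (《遗传》) 2008, 30(9): 1169-1174
GC content is an important feature of the base composition of genomic DNA sequences and carries information on gene structure, function and evolution. In this study, the DNA sequences of 7,992 non-redundant human protein-coding genes were extracted from public databases, and the local GC content of different gene regions and the correlations among them were analyzed. The results show that local GC content within genes is heterogeneous: the 5′ untranslated region has the highest GC level, 62.56%, whereas the 3′ untranslated region has the lowest, 43.97%. The GC content of the 3′ flanking sequence represents well the GC level of the long DNA segment in which the gene lies. Although the GC content of the open reading frame is higher than that of introns, the 3′ untranslated region and the 3′ flanking sequence, the GC contents of these four regions are highly correlated with one another. The mean GC content at the third codon position (GC3) is 58.09%, significantly higher than that at the first and second codon positions, and it is highly correlated with the GC level of the open reading frame, with a correlation coefficient as high as 0.91. GC3 also correlates fairly strongly with the GC levels of introns, the 3′ untranslated region and the 3′ flanking sequence; the slope of the linear regression of GC3 on the GC content of the 3′ flanking sequence is 1.25. GC3 can therefore serve as a sensitive indicator of changes in the GC level of the region in which a gene lies. In contrast, the GC levels of the first and second codon positions, the 5′ flanking sequence and the 5′ untranslated region correlate only weakly with those of other gene regions. These results suggest that bases at the third codon position of protein-coding regions, in introns, in the 3′ untranslated region and in the 3′ flanking sequence may have undergone similar evolutionary processes, whereas the first and second codon positions, the 5′ flanking sequence and the 5′ untranslated region have undergone different mutation and selection because of functional requirements.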
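For readers unfamiliar with GC3, the quantities compared in this abstract can be computed from a coding sequence by slicing it by codon position. The following is a minimal sketch, not the authors' pipeline: the function names and the toy sequence are hypothetical, and whether the stop codon is counted is a choice left to the caller (it is counted here).

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) for an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    # cds[i::3] collects every base at codon position i + 1.
    return tuple(gc_fraction(cds[i::3]) for i in range(3))


toy_cds = "ATGGCCAAGTAA"  # hypothetical 4-codon example: ATG GCC AAG TAA
gc1, gc2, gc3 = gc_by_codon_position(toy_cds)

print(gc1, gc2, gc3, gc_fraction(toy_cds))
```

Correlating GC3 against the GC content of introns or flanking regions across many genes, as the abstract describes, would then be an ordinary regression over the per-gene values.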

16.
17.
The complete nucleotide sequence of the Clostridium thermocellum celE gene, coding for an endo-beta-1,4-glucanase (endoglucanase E; EGE) with xylan-hydrolysing activity has been determined. The structural gene consists of an open reading frame (ORF) of 2442 bp commencing with a GTG start codon and followed by a TAA stop codon. The nucleotide sequence obtained has been confirmed by comparing the predicted amino acid sequence with that derived by N-terminal amino acid sequencing of the purified protein. The EGE sequence contains a region homologous to the reiterated domain found at the C terminus of other endoglucanases from the same organism. BAL 31 deletions of the structural gene have revealed the extent to which this conserved sequence is necessary for endoglucanase and xylanase activity. A region of DNA, upstream from the structural gene has also been sequenced and a ribosome-binding site and putative promoter sequences have been identified. A second ORF which ends 349 bp 5' to the GTG start codon of the celE gene has also been identified. The encoded product contains a C terminus homologous to other C. thermocellum endoglucanases.

18.
The sequence of 4.4 kilobase pairs (kbp) from the conventional right terminus of the A + T-rich light-DNA (L-DNA) sequences of the herpesvirus saimiri (HVS) genome contains a leftward-directed open reading frame (ORF) for a 1,299-residue protein. The molecular weight predicted for the protein (143,000) is in good agreement with the estimates of 150,000 to 160,000 for the major nonglycosylated polypeptide of the virion tegument (the 160K polypeptide), previously shown to be encoded by this region of the genome. The first initiation codon of the ORF is only 250 nucleotides from the junction of the L-DNA component with the G + C-rich terminal reiterations (i.e., heavy or H-DNA) of the genome. An unusually A + T-rich sequence (43 of 45 nucleotides are A or T, relative to a mean composition of 40% G + C for the ORF) occurs some 75 bp 5' to this initiation codon, and the first polyadenylation signal (AATAAA) on this DNA strand occurs 18 bp 3' to the termination codon. The amino acid sequence predicted for the 160K protein of HVS is homologous over most of its length to the 1,318-residue protein encoded by the leftmost major ORF of the G + C-rich genome of Epstein-Barr virus (BNRF1, the 140K nonglycosylated membrane antigen). No homology to either of these proteins is evident among the products predicted from the complete sequence of the alpha herpesvirus varicella-zoster virus. Thus gamma herpesviruses with coding sequences which differ in mean nucleotide composition by some 20% G + C have homologous proteins encoded at similar positions with respect to genome termini, with the right end of HVS being homologous to the left end of Epstein-Barr virus.

19.
DNA sequences, potentially coding for histidine-rich proteins, were isolated from a P. falciparum genomic library using an oligonucleotide probe consisting of histidine codon repeats. Sequencing revealed that the different DNA fragments contain long repetitive regions very homologous to the probe. One clone was fully sequenced and contains two open reading frames that overlap in the repetitive region but are located on opposite strands. Analysis suggests that both are coding. One frame could code for a small histidine-rich protein, the other for a protein containing many aspartic acid residues. Southern blotting revealed that these sequences are conserved in all three P. falciparum strains studied.
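One way to see how a histidine-codon repeat and an aspartate-rich frame can share the same stretch of DNA on opposite strands is to reverse-complement a repeat and read it in each frame. The sketch below is only an illustration: the abstract does not say which histidine codon the probe used, so the CAT repeat is an assumption, and the helper names are hypothetical. In the standard genetic code CAT encodes histidine, GAT aspartic acid, ATG methionine, and TGA is a stop codon.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def codons(seq, frame):
    """Split seq into complete codons starting at offset frame (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


probe_like = "CAT" * 6                     # assumed histidine-codon repeat
opposite = reverse_complement(probe_like)  # the same region on the other strand
frames = {f: set(codons(opposite, f)) for f in range(3)}

for frame, found in frames.items():
    print(frame, sorted(found))
```

In this toy case, one frame of the opposite strand consists entirely of GAT codons, which matches the kind of arrangement the abstract reports; the real clone's sequence may of course differ.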

20.