Similar literature
 20 similar articles found (search time: 62 ms)
1.
2.
Frenkel FE  Korotkov EV 《Gene》2008,421(1-2):52-60
We introduce a new concept of triplet periodicity class (TPC) and a measure of similarity between such classes. We classified 472288 triplet periodicity (TP) regions found in 578868 genes from the 29th release of the KEGG databank. In total, 2520 classes were obtained. They contain 94% of the 472288 TP cases found. For 92% of the TP regions contained in classes, the same linkage of TP to the open reading frame (ORF) is observed. For 8% of TP cases we found a shift between the ORF of a gene and the ORF common to the majority of genes contained in a TPC. For these 8% of periodic regions, hypothetical amino acid sequences corresponding to the ORF built by the TPC were constructed. The BLAST program showed that 2679 hypothetical amino acid sequences have statistically significant similarity to proteins from the UniProt databank. We suppose that 8% of TP regions contained in classes possess a mutation originating from an ORF shift. The obtained TPCs can be used for identifying the coding regions of genes as well as for searching for mutations arising from ORF shifts.

3.
Two distinct processed calmodulin genes of rat (lambda SC8 and lambda SC9) were identified and cloned, and their DNA sequences were determined. The existence of direct repeats of 19 base-pairs for lambda SC8 or 9 base-pairs for lambda SC9 at both ends of the coding plus non-coding regions suggested a possible involvement of a mRNA-mediated process of insertion. Total genomic Southern hybridization suggested the existence of at least three different calmodulin-related genes in the rat genome. The other gene was the bona fide calmodulin gene (lambda SC4), which was split into at least five exons. lambda SC9 contained insertions of one nucleotide and two 17 base-pair direct repeats in the coding region. These insertions cause frameshift mutations, probably preventing it from encoding a functional calmodulin. It also carried an insertion of a rat middle repetitive sequence, identifier sequence (IDS: Sutcliffe et al., 1982), in the 3'-non-coding region. Otherwise, it consisted of an almost identical DNA sequence to that of the bona fide calmodulin gene (lambda SC4), including the 3'-non-coding region down to the poly(A) recognition signal, A-A-T-A-A-A. On the other hand, lambda SC8 did not possess frameshift mutations in the coding region, and hence was capable of encoding a functional protein. In fact, a probe specific to the lambda SC8 sequence identified a band in Northern blotting whose size was 300 nucleotides smaller than that of authentic calmodulin mRNA. Comparison of the nucleotide sequences showed that only the coding regions of these two processed genes were homologous, indicating that the divergence of these two processed genes from the common ancestor calmodulin was an ancient event.

4.
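Sequence variation analysis of the chloroplast infA-rpl36 region in Triticeae species   Cited by: 3 (self-citations: 1, citations by others: 2)
Liu Chang  Yang Zujun  Li Guangrong  Feng Juan  Deng Kejun  Huang Jian  Ren Zhenglong 《遗传》2006,28(10):1265-1272
Primers were designed from the sequence of the infA-rpl36 region of the wheat chloroplast genome and used for PCR amplification and sequencing in 12 diploid and polyploid species of the Triticeae, yielding 12 DNA sequences 584~603 bp in length. Sequence analysis showed that nucleotide variation in the infA-rpl36 intergenic spacer of the tested species was markedly higher than in the gene coding regions. Nucleotide sequence homology in the coding regions was as high as 97%, indicating that the target fragment is highly conserved. However, large insertion and deletion mutations occurred in the infA coding region of five species, causing substantial changes in the deduced amino acid sequences. This confirms that infA is one of the most active genes in the chloroplast genome, whereas rpl36 varied little, indicating that different chloroplast genes evolve at different rates. Phylogenetic tree analysis based on the determined sequences showed that the polyploid species intermediate wheatgrass (Thinopyrum intermedium) has several different cytoplasmic origins and, like its nuclear genome, is evolutionarily complex.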

5.
We cloned ras-related sequences from goldfish genomic libraries constructed as recombinants using the lambda phage. Restriction enzyme mapping of the clones obtained revealed three kinds of ras-related sequences among approximately 350,000 genomic clones. One of these clones was partially sequenced. Comparison with the nucleotide sequences of mammalian ras genes showed that the determined sequences covered the predicted amino acid coding regions and parts of the intervening regions. The predicted amino acid sequences of the cloned ras-related goldfish gene suggested that the coding region is localized separately in DNA, and that its exon-intron boundaries are exactly the same as those of corresponding mammalian genes. The nucleotide and amino acid sequences of the goldfish ras-related gene may have extensive homologies to mammalian p21 protein. Among the three mammalian ras proteins, the predicted amino acid sequence of the sequenced ras-related goldfish clone is most closely homologous (96%) to the Kirsten ras protein. Differences in the predicted amino acid sequence were greatest in the sequence predicted from the fourth exon; fewer differences were found in the sequence from the third exon, and only slight or no differences were found in the sequence predicted for the first and second exons. The 12th and 61st amino acids from the N-terminal of the protein, which are thought to be critical positions for GTP binding and catalysis, are both conserved in the goldfish protein. (ABSTRACT TRUNCATED AT 250 WORDS)

6.
7.
We have isolated almost full-length cDNA clones corresponding to human erythrocyte membrane sialoglycoproteins alpha (glycophorin A) and delta (glycophorin B). The predicted amino acid sequence of delta differs at two amino acid residues from the sequence determined by peptide sequencing. The sialoglycoprotein delta clone we have isolated contains an interrupting sequence within the region that gives rise to the cleaved N-terminal leader sequence for the protein and represents a product that is unlikely to be inserted into the erythrocyte membrane. Comparison of the cDNA sequences of alpha and delta shows very strong homology at the DNA level within the coding regions. The two mRNA sequences are closely related and differ by a number of clearly defined insertions and deletions.

8.
Complementary and genomic DNA clones corresponding to the human serum amyloid P component (SAP) mRNA have been isolated and analyzed. The nucleotide sequences of the cDNA and the corresponding regions of the genomic SAP DNA reported here were identical, and revealed that after coding for a signal peptide of 19 amino acids and the first two amino acids of the mature SAP protein, there is one small intron of 115 base pairs (bp), followed by a nucleotide sequence coding for the remaining 202 amino acid residues. The SAP gene has an ATATAAA sequence 29 bp upstream from the cap site, but there is no CAAT box-like sequence. A possible polyadenylation signal sequence, ATTAAA, was found 28 bp upstream from the polyadenylation site. A comparison of the genomic SAP DNA sequence with that of human C-reactive protein (CRP) revealed a striking overall homology which was not uniform: several highly conserved regions were bounded by non-homologous regions. This comparison provides further support for the hypothesis that SAP and CRP are products of a gene duplication event.

9.
Hu JM  Fu HC  Lin CH  Su HJ  Yeh HH 《Journal of virology》2007,81(4):1746-1761
The nanovirus Banana bunchy top virus (BBTV) has six standard components in its genome and occasionally contains components encoding additional Rep (replication initiation protein) genes. Phylogenetic network analysis of coding sequences of DNA 1 and 3 confirmed the two major groups of BBTV, a Pacific and an Asian group, but showed evidence of web-like phylogenies for some genes. Phylogenetic analysis of 102 major common regions (CR-Ms) from all six components showed a possible concerted evolution within the Pacific group, which is likely due to recombination in this region. The CR-M of additional Rep genes is close to that of DNA 1 and 2. Comparison of tree topologies constructed with DNA 1 and DNA 3 coding sequences of 14 BBTV isolates showed distinct phylogenetic histories based on Kishino-Hasegawa and Shimodaira-Hasegawa tests. The results of principal component analysis of amino acid and codon usages indicate that DNA 1 and 3 have a codon bias different from that of all other genes of nanoviruses, including all currently known additional Rep genes of BBTV, which suggests a possible ancient genome reassortment event between distinctive nanoviruses.

10.
The concept of the phase shift of triplet periodicity (TP) was used to search for potential DNA insertions in genes from 17 bacterial genomes. A mathematical algorithm for detecting these insertions has been developed. This approach can detect potential insertions and deletions with lengths that are not multiples of three bases, especially insertions of relatively large DNA fragments (>100 bases). A new similarity measure between triplet matrices was employed to improve the sensitivity of detecting the TP phase shift. Sequences of 17,220 bacterial genes, each consisting of more than 1,200 bases, were analyzed, and the presence of a TP phase shift was shown in ~16% of the analyzed genes (2,809 genes), which is about 4 times more than detected in our previous work. We propose that shifts of the TP phase may indicate shifts of the reading frame in genes after insertions of DNA fragments with lengths that are not multiples of three bases. A relationship between the phase shifts of TP and the frame shifts in genes is discussed.
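
The idea of a triplet matrix and a TP phase shift in the abstract above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, mutual information is used as a generic periodicity score, and the sum of absolute frequency differences stands in for the similarity measure, which the abstract does not specify. It is not the authors' algorithm.

```python
import math

BASES = "ACGT"

def triplet_matrix(seq, offset=0):
    """Count each base at each of the three codon positions (3 x 4 matrix).

    `offset` rotates the frame: position index is (i - offset) mod 3.
    """
    counts = [[0] * 4 for _ in range(3)]
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:  # skip ambiguous symbols such as N
            counts[(i - offset) % 3][j] += 1
    return counts

def mutual_information(counts):
    """Mutual information (bits) between codon position and base identity."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return 0.0
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(counts[p][b] for p in range(3)) for b in range(4)]
    mi = 0.0
    for p in range(3):
        for b in range(4):
            n = counts[p][b]
            if n:
                mi += (n / total) * math.log2(n * total / (row_sums[p] * col_sums[b]))
    return mi

def _frequencies(counts):
    """Normalize each row of a count matrix to frequencies."""
    return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in counts]

def phase_shift(left, right):
    """Frame offset (0, 1 or 2) of `right` that best matches the frame of `left`.

    Assumed distance: sum of absolute differences of per-position frequencies.
    """
    ref = _frequencies(triplet_matrix(left))

    def distance(offset):
        cand = _frequencies(triplet_matrix(right, offset))
        return sum(abs(ref[p][b] - cand[p][b]) for p in range(3) for b in range(4))

    return min(range(3), key=distance)
```

For example, `phase_shift("ATG" * 50, "C" + "ATG" * 50)` reports an offset of 1, because the single inserted base moves the periodic pattern by one position. A nonzero offset between neighbouring windows of a gene is the kind of signal the abstract associates with an insertion whose length is not a multiple of three.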

11.
12.
Purifying and directional selection in overlapping prokaryotic genes   Cited by: 4 (self-citations: 0, citations by others: 4)
In overlapping genes, the same DNA sequence codes for two proteins using different reading frames. Analysis of overlapping genes can help in understanding the mode of evolution of a coding region from noncoding DNA. We identified 71 pairs of convergent genes, with overlapping 3' ends longer than 15 nucleotides, that are conserved in at least two prokaryotic genomes. Among the overlap regions, we observed a statistically significant bias towards the 123:132 phase (i.e. the second codon base in one gene facing the degenerate third position in the second gene). This phase ensures the least mutual constraint on nonconservative amino acid replacements in both overlapping coding sequences. The excess of this phase is compatible with directional (positive) selection acting on the overlapping coding regions. This could be a general evolutionary mode for genes emerging from noncoding sequences, in which the protein sequence has not been subject to selection.

13.
14.
Structural comparison of yeast ribosomal protein genes.   Cited by: 12 (self-citations: 19, citations by others: 12)
The primary structure of the genes encoding the yeast ribosomal proteins L17a and L25 was determined, as well as the positions of the 5'- and 3'-termini of the corresponding mRNAs. Comparison of the gene sequences to those obtained for various other yeast ribosomal protein genes revealed several similarities. In all split genes the intron is located near the 5'-side of the amino acid coding region. Among the introns a clear pattern of sequence conservation can be observed. In particular, the intron-exon boundaries and a region close to the 3'-splice site show sequence homology. Conserved sequences were also found in the leader and trailer regions of the ribosomal protein mRNAs. The 5'-flanking regions of the yeast ribosomal protein genes appeared to contain sequence elements that many but not all ribosomal protein genes have in common, and which may therefore be implicated in the coordinate expression of these genes. The amino acid coding sequences of the ribosomal protein genes show a biased codon usage. Like most yeast ribosomal protein molecules, L17a and L25 are particularly basic at their N-terminus.

15.
DNA sequence of the Herpes simplex virus type 2 glycoprotein D gene   Cited by: 30 (self-citations: 0, citations by others: 30)
R J Watson 《Gene》1983,26(2-3):307-312
We describe a 1635-bp Herpes simplex virus type 2 (HSV-2) DNA sequence containing the entire coding region of glycoprotein D (gD-2). The amino acid sequence of gD-2, deduced from the nucleotide sequence, was compared to that of the analogous Herpes simplex virus type 1 (HSV-1) glycoprotein (gD-1). The two glycoproteins are 85% homologous and contain highly conserved regions as long as 49 amino acids. Comparison of DNA sequences upstream from the gD-1 and gD-2 coding regions identified possible conserved regulatory sequences.

16.
Recent nuclear transfer of organellar DNA is thought to result mainly in nonfunctional nuclear sequences or in genetic dysfunction. Here we show that nuclear exons encoding novel protein sequences can be generated by insertions of organellar DNA. Most of the protein sequences do not correspond to preexisting organellar coding sequences or they represent markedly reshaped protein domains, reflecting the recruitment and adaptation of encoded proteins to new functions. Organelle-derived DNA insertions might be responsible for many more ancient functional exon acquisitions that are not directly detectable.

17.
Despite the agricultural importance of both potato and tomato, very little is known about their chloroplast genomes. Analysis of the complete sequences of tomato, potato, tobacco, and Atropa chloroplast genomes reveals significant insertions and deletions within certain coding regions or regulatory sequences (e.g., deletion of repeated sequences within 16S rRNA, ycf2 or ribosomal binding sites in ycf2). RNA, photosynthesis, and ATP synthase genes are the least divergent, and the most divergent genes are clpP, cemA, ccsA, and matK. Repeat analyses identified 33–45 direct and inverted repeats ≥30 bp with a sequence identity of at least 90%; all but five of the repeats shared by all four Solanaceae genomes are located in the same genes or intergenic regions, suggesting a functional role. A comprehensive genome-wide analysis of all coding sequences and intergenic spacer regions was done for the first time in chloroplast genomes. Only four spacer regions are fully conserved (100% sequence identity) among all genomes; deletions or insertions within some intergenic spacer regions result in less than 25% sequence identity, underscoring the importance of choosing appropriate intergenic spacers for plastid transformation and providing valuable new information on the phylogenetic utility of the chloroplast intergenic spacer regions. Comparison of coding sequences with expressed sequence tags showed a considerable amount of variation, resulting in amino acid changes; none of the C-to-U conversions observed in potato and tomato were conserved in tobacco and Atropa. It is possible that there has been a loss of conserved editing sites in potato and tomato.

18.
19.
20.
The genes of the BanI restriction-modification system specific for GGPyPuCC were cloned from the chromosomal DNA of Bacillus aneurinolyticus IAM1077, and the coding regions were assigned on the nucleotide sequence on the basis of the N-terminal amino acid sequences and molecular weights of the enzymes. The restriction and modification genes coded for polypeptides with calculated molecular weights of 39,841 and 42,637, respectively. Both enzymes were encoded by the same DNA strand. The restriction gene was located upstream of the methylase gene, separated by 21 bp. The cloned genes were significantly expressed in E. coli cells, so that the respective enzymes could be purified to homogeneity. Analysis by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and gel filtration indicated that the catalytically active form of the endonuclease was dimeric and that of the methylase was monomeric. Comparison of the amino acid sequences revealed no significant homology between the endonuclease and methylase, though both enzymes recognize the same target sequence. Sequence comparison with other related enzymes indicated that BanI methylase contains sequences common to cytosine-specific methylases.
