Similar articles
20 similar articles found.
1.
The bacterial DNA sequences in the GenBank database were divided into coding and noncoding regions and examined for the base-trimer distribution in every triplet frame on the sense and antisense strands. The results revealed that for the noncoding region, both strands have very similar base-trimer distributions and have no frame specificity; that is, DNA is symmetric in the noncoding region. For the coding region, on the other hand, the symmetry is broken only in the triplet framework, and we found a special triplet-frame-specific symmetry which appears when the two complementary strands of the coding region are read from their 5′ ends. In addition, the following frame specificity was also observed in the distribution of stop codons on the antisense strand of the coding region. When the antisense sequences of the open reading frames (ORFs) in the database are read in the three reading frames, the same reading frame as the corresponding ORF contains a significantly larger number of long open frames without stop codons (i.e., nonstop frames [NSFs]) than expected, while the number of NSFs in the other two reading frames is similar to the expected one. That is, NSFs as well as ORFs are maintained in a frame-specific manner, and in this sense, DNA becomes symmetrical even in the coding region. These two kinds of frame-specific symmetries indicate that only an ORF and its complementary triplets are specifically recognized and maintained in DNA. We suppose that the antisense strands as well as the sense strands in the coding region may be transcribed, thereby producing various kinds of proteins corresponding to NSFs, though their amount may not be large. The presence of these proteins should have some benefits for living organisms, and therefore we propose that these proteins are upcoming enzymes having novel functions. Correspondence to: I. Urabe
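The frame-by-frame reading of the antisense strand described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the function names are hypothetical, the standard stop codons TAA, TAG and TGA are assumed, and the "longest stop-free run" is used here only as a simple stand-in for a nonstop frame (NSF). When the ORF length is a multiple of three, frame 0 of the reverse complement is built from the complementary triplets of the ORF codons.

```python
# Illustrative sketch only: longest stop-free run in each reading frame
# of the antisense strand. All names here are assumptions, not from the paper.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand, read from its own 5' end."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_nonstop_run(seq: str, frame: int) -> int:
    """Length, in codons, of the longest stretch without a stop codon
    when `seq` is read in reading frame `frame` (0, 1 or 2)."""
    best = current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0  # a stop codon ends the current run
        else:
            current += 1
            best = max(best, current)
    return best


def antisense_nonstop_profile(orf: str) -> dict[int, int]:
    """Map each antisense reading frame to its longest stop-free run."""
    antisense = reverse_complement(orf)
    return {frame: longest_nonstop_run(antisense, frame) for frame in range(3)}
```

For the toy ORF `ATGGCCTAA`, the antisense strand is `TTAGGCCAT`; frame 0 is stop-free, while frame 1 begins with the stop codon `TAG`.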

2.
Origin and properties of non-coding ORFs in the yeast genome. Total citations: 4 (self-citations: 0; citations by others: 4)
In a recent paper we have estimated the total number of protein coding open reading frames (ORFs) in the Saccharomyces cerevisiae genome, based on their properties, at about 4800. This number is much smaller than the 5800-6000 which is widely accepted. In this paper we analyse differences between the set of ORFs with known phenotypes annotated in the Munich Information Centre for Protein Sequences (MIPS) database and ORFs for which the probability of coding, counted by us, is very low. We have found that many of the latter ORFs have properties of antisense sequences of coding ORFs, which suggests that they could have been generated by duplication of coding sequences. Since coding sequences generate ORFs inside themselves, with especially high frequency in the antisense sequences, we have looked for homology between known proteins and hypothetical polypeptides generated by ORFs under consideration in all the six phases. For many ORFs we have found paralogues and orthologues in phases different than the phase which had been assumed in the MIPS database as coding.

3.
4.
Gene, 1997, 194(1): 143-155
In recent studies it has been suggested that long reading frames on the antisense strand of open reading frames (ORFs) are more frequent than expected. The vertebrate DNA database was searched for long (greater than 900 bp) antisense non-stop reading frames (aNRFs) that overlap known coding regions. The sequences obtained were predominantly positioned in DNA with a high usage of G or C in the third codon position of the sense ORF. The major class of sequences revealed by the search was that of the heat-shock protein 70 kDa (Hsp70) family. A long Hsp70 aNRF was found in many Hsp70 sequences and occurred in species as diverse as fish, flies, fungi and bacteria. The role of codon usage bias was analysed both in the specific case of the Hsp70 genes and in a general species-wide context. The data obtained showed that even the very long aNRFs present in the Hsp70 family could be explained by codon usage bias on the sense strand. Codon usage bias is determined by GC content at the third codon position of the sense ORF and, in some species, by a high expression level of the gene in question. Such an explanation for the occurrence of long aNRFs cannot exclude that some aNRFs are transcribed and translated.
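The quantity this abstract relies on, GC content at the third codon position of the sense ORF (often called GC3), can be computed as in the short sketch below. The function name `gc3_content` is a hypothetical choice, and the input is assumed to be an in-frame ORF starting at its first codon; trailing bases that do not complete a codon are ignored.

```python
# Illustrative sketch only: fraction of codons whose third base is G or C.
# `gc3_content` is a hypothetical helper name, not taken from the paper.

def gc3_content(orf: str) -> float:
    """Return the fraction of complete codons with G or C in position 3."""
    orf = orf.upper()
    third_bases = [orf[i + 2] for i in range(0, len(orf) - 2, 3)]
    if not third_bases:
        raise ValueError("sequence must contain at least one complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)
```

For example, `ATGGCCTAA` has the codons ATG, GCC and TAA, so two of its three third-position bases are G or C.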

5.
The nucleotide sequence of the ftf gene from Streptococcus mutans GS-5 was determined. The deduced amino acid sequence indicates that the unprocessed fructosyltransferase gene product has a molecular weight of 87,600. A typical streptococcal signal sequence is present at the amino terminus of the protein. The processed enzyme is relatively hydrophilic and has a pI of 5.66. An inverted repeat structure was detected upstream from the ftf gene and may function in the regulation of fructosyltransferase expression. Sequencing of the regions flanking the gene revealed the presence of four other putative open reading frames (ORFs). Two of these, ORFs 2 and 3, appear to code for low-molecular-weight proteins containing amino acid sequences sharing homology with several gram-positive bacterial DNA-binding proteins. In addition, ORF 3 is transcribed from the ftf DNA coding strand. Partial sequencing of ORF 4 suggests that its gene product may be an extracellular protein.

6.
7.
8.
9.
The question whether the noncoding DNA strand had or still has the capability for encoding functional polypeptides has been addressed in several articles. The theoretical background of the views advocating this idea arose from two groups of findings. One of them was based on various observations implying that the genetic code was adapted for double-strand coding. The other group of theories arose from the observation of gene-length overlapping open reading frames (O-ORFs) on the antisense DNA strand in a number of genes. In fact, the above theories, which I term selectionist, advance a novel conception of gene evolution, proposing that new genes can be created by the utilization of the antisense DNA strand. In contrast, neutralist theory claims that the O-ORFs are mere by-products of evolutionary processes acting to create special codon usage and base distribution patterns in the coding sequences. Received: 16 June 2000 / Accepted: 31 August 2000

10.
11.
Sequence of figwort mosaic virus DNA (caulimovirus group). Total citations: 19 (self-citations: 3; citations by others: 16)

12.
Expressed sequence tag (EST) databases contain a significant number (5-20%) of reversed, antisense, cDNA sequences that can be recognized by the label "reversed clone: similarity on wrong strand" in the annotations to the sequence. Despite this high number of altered sequences, no attempt has been made to explain the alteration in molecular terms, or to evaluate their effect on the quality of the information curated in EST databases. In this paper we try to explain how these altered sequences originate, and propose a plausible mechanism: a "double priming" of the first strand oligo-dT primer at both ends of nascent cDNAs. In this way, a symmetrical cDNA intermediate is generated, an intermediate that can be cloned after partial digestion with the restriction enzyme used for the directional cloning. Furthermore, when "secondary" priming takes place inside the cDNA, the chain synthesized is prone to be truncated prematurely, with the subsequent loss of upstream information. One of the most subtle effects of this cloning alteration is the generation of virtual open reading frames (ORFs) in sequences with no homologues available for comparison. Nevertheless, and according to our model and our data, the "double priming mechanism" does not shift the ORF affected, so antisense sequences should be considered as normal ones after a simple transformation into their inverse-complementary forms.

13.
14.
A novel cytorhabdovirus, tentatively named Actinidia virus D (AcVD), was identified from kiwifruit (Actinidia chinensis) in China using high-throughput sequencing technology. The genome of AcVD consists of 13,589 nucleotides and is organized into seven open reading frames (ORFs) in its antisense strand, coding for proteins in the order N-P-P3-M-G-P6-L. The ORFs were flanked by a 3′ leader sequence and a 5′ trailer sequence and are separated by conserved intergenic junctions. The genome sequence of AcVD was 44.6%–51.5% identical to those of reported cytorhabdoviruses. The proteins encoded by AcVD shared the highest sequence identities, ranging from 27.3% (P6) to 44.5% (L), with the respective proteins encoded by reported cytorhabdoviruses. Phylogenetic analysis revealed that AcVD clustered together with the cytorhabdovirus Wuhan insect virus 4. The subcellular locations of the viral proteins N, P, P3, M, G, and P6 in epidermal cells of Nicotiana benthamiana leaves were determined. The M protein of AcVD uniquely formed filament structures and was associated with microtubules. Bimolecular fluorescence complementation assays showed that three proteins, N, P, and M, self-interact, protein N plays a role in the formation of cytoplasm viroplasm, and protein M recruits N, P, P3, and G to microtubules. In addition, numerous paired proteins interact in the nucleus. This study presents the first evidence of a cytorhabdovirus infecting kiwifruit plants and full location and interaction maps to gain insight into viral protein functions.

15.
Gene overlap occurs when two or more genes are encoded by the same nucleotides. This phenomenon is found in all taxonomic domains, but is particularly common in viruses, where it may increase the information content of compact genomes or influence the creation of new genes. Here we report a global comparative study of overlapping open reading frames (OvRFs) of 12,609 virus reference genomes in the NCBI database. We retrieved metadata associated with all annotated open reading frames (ORFs) in each genome record to calculate the number, length, and frameshift of OvRFs. Our results show that while the number of OvRFs increases with genome length, they tend to be shorter in longer genomes. The majority of overlaps involve +2 frameshifts, predominantly found in dsDNA viruses. Antisense overlaps in which one of the ORFs was encoded in the same frame on the opposite strand (−0) tend to be longer. Next, we develop a new graph-based representation of the distribution of overlaps among the ORFs of genomes in a given virus family. In the absence of an unambiguous partition of ORFs by homology at this taxonomic level, we used an alignment-free k-mer based approach to cluster protein coding sequences by similarity. We connect these clusters with two types of directed edges to indicate (1) that constituent ORFs are adjacent in one or more genomes, and (2) that these ORFs overlap. These adjacency graphs not only provide a natural visualization scheme, but also a novel statistical framework for analyzing the effects of gene- and genome-level attributes on the frequencies of overlaps.
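The kind of per-genome bookkeeping this abstract describes (overlap length and relative frame of two annotated ORFs) might be sketched as follows. This is a simplified illustration under stated assumptions, not the study's pipeline: the `Orf` record, the 0-based half-open coordinates, and both function names are hypothetical, and only the same-strand frame offset is computed because the labelling convention for antisense overlaps is not spelled out in the abstract.

```python
# Illustrative sketch only. Coordinates are assumed to be 0-based and
# half-open on the forward strand; `Orf` and both helpers are hypothetical.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Orf:
    start: int   # first base, inclusive, in forward-strand coordinates
    end: int     # one past the last base
    strand: str  # "+" or "-"


def overlap_length(a: Orf, b: Orf) -> int:
    """Number of nucleotides shared by the two ORFs (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def same_strand_frame_offset(a: Orf, b: Orf) -> Optional[int]:
    """Frame of `b` relative to `a` (0, 1 or 2) when both lie on the same
    strand; None for opposite strands, which need a separate convention."""
    if a.strand != b.strand:
        return None
    if a.strand == "+":
        return (b.start - a.start) % 3
    # On the minus strand translation starts at the `end` coordinate.
    return (a.end - b.end) % 3
```

With `Orf(0, 30, "+")` and `Orf(20, 50, "+")`, the overlap is 10 nucleotides and the second ORF is offset by 2 relative to the first.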

16.
D X Liu, S C Inglis. Journal of Virology, 1992, 66(10): 6143-6154
mRNA3 specified by the coronavirus infectious bronchitis virus appears to be functionally tricistronic, having the capacity to encode three small proteins (3a, 3b, and 3c) from separate open reading frames (ORFs). The mechanism by which this can occur was investigated through in vitro translation studies using synthetic mRNAs containing the 3a, 3b, and 3c ORFs, and the results suggest that translation of the most distal of the three ORFs, that for 3c, is mediated by an unconventional, cap-independent mechanism involving internal initiation. This conclusion is based on several observations. A synthetic mRNA whose peculiar 5' end structure prevents translation of the 5'-proximal ORFs (3a and 3b) directs the synthesis of 3c normally. Translation of 3c, unlike that of 3a and 3b, was insensitive to the presence of the 5' cap analog 7-methyl-GTP, and it was unaffected by alteration of the sequence contexts for initiation on the 3a and 3b ORFs. Finally, an mRNA in which the 3a/b/c infectious bronchitis virus coding region was placed downstream of the influenza A virus nucleocapsid protein gene directed the efficient synthesis of 3c as well as nucleocapsid protein, whereas initiation at 3a and 3b could not be detected. Expression of the 3c ORF from this mRNA, however, was abolished when the 3a and 3b coding region was deleted, indicating that 3c initiation is dependent on upstream sequence elements which together may serve as a ribosomal internal entry site similar to those described for picornaviruses.

17.
Nucleotide sequence of the traD region in the Escherichia coli F sex factor. Total citations: 11 (self-citations: 0; citations by others: 11)
M B Jalajakumari, P A Manning. Gene, 1989, 81(2): 195-202
The complete nucleotide sequence has been determined of a 3635-bp region, extending from the HpaI site in traT, at F coordinate 90.3 kb, to beyond the end of traD, of the F sex factor plasmid of Escherichia coli K-12. This region contains the C-terminal coding part of traT and the entire traD gene. An open reading frame (ORF) of 2148 bp within the sequence confirms that traD encodes an 81.4-kDa cytoplasmic membrane protein. The TraD protein has several regions with an unusually high pI (greater than 10), suggesting that they may correspond to the DNA-binding domains. Several other ORFs were detected within the region including the gene (ORF1) for a 26.3-kDa protein and ORF2, probably corresponding to traI, which continues to the end of the sequence. An ORF for an 8.5-kDa protein preceded by an excellent promoter and ribosome-binding site is present in the region following traD but on the opposite strand. This promoter is thought to correspond to the major RNA polymerase binding site in this region, implying that traI does not have its own promoter. The lack of a typical terminator following traD and ORF1 and the translational coupling provided by overlapping stop and start codons is consistent with this conclusion.

18.
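Chemical synthesis and cloning of antisense cDNA fragments of hog cholera virus (猪瘟病毒反义cDNA片段的化学合成及克隆). Total citations: 1 (self-citations: 0; citations by others: 1)
Tu Changchun (涂长春), Jiang Nan (江南). Chinese Journal of Virology (病毒学报), 1992, 8(4): 383-385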

19.
MOTIVATION: Overlapping gene coding sequences (CDSs) are particularly common in viruses but also occur in more complex genomes. Detecting such genes with conventional gene-finding algorithms can be difficult for several reasons. If an overlapping CDS is on the same read-strand as a known CDS, then there may not be a distinct promoter or mRNA. Furthermore, the constraints imposed by double-coding can result in atypical codon biases. However, these same constraints lead to particular mutation patterns that may be detectable in sequence alignments. RESULTS: In this paper, we investigate several statistics for detecting double-coding sequences with pairwise alignments--including a new maximum-likelihood method. We also develop a model for double-coding sequence evolution. Using simulated sequences generated with the model, we characterize the distribution of each statistic as a function of sequence composition, length, divergence time and double-coding frame. Using these results, we develop several algorithms for detecting overlapping CDSs. The algorithms were tested on known overlapping CDSs and other overlapping open reading frames (ORFs) in the hepatitis B virus (HBV), Escherichia coli and Salmonella typhimurium genomes. The algorithms should prove useful for detecting novel overlapping genes--especially short coding ORFs in viruses. AVAILABILITY: Programs may be obtained from the authors. SUPPLEMENTARY INFORMATION: http://biochem.otago.ac.nz/double.html.

20.
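Cloning of carnation ACC oxidase cDNA and construction of its antisense plant expression vectors (康乃馨ACC氧化酶cDNA的克隆及其反义植物表达载体的构建). Total citations: 1 (self-citations: 0; citations by others: 1)
Total RNA was extracted from carnation (Dianthus caryophyllus L.) petals by a modified one-step guanidine isothiocyanate method. A pair of primers was designed and synthesized according to the reported sequence of the carnation ACC oxidase (1-aminocyclopropane-1-carboxylic acid oxidase, CO) gene, and a specific fragment of about 1.2 kb was obtained by RT-PCR. The fragment was ligated into the pGEM®-T Easy vector and sequenced; it is 1156 bp in full length, with a 915-bp coding region encoding 304 amino acid residues. Sequence analysis showed that the sequence is completely identical to the carnation ACC oxidase cDNA sequence in GenBank (L35152), suggesting that the gene may be completely or highly conserved within the species. The fragment was then inserted in reverse orientation between the 35S promoter and the NOS terminator of the plant expression vector pBI121 to construct the antisense plant expression vector pBO. In addition, the flower-specific promoter PchsA was inserted into the HindIII/XbaI sites of pBI121 to construct the intermediate vector pCHB, and the carnation ACC oxidase gene was then inserted in reverse orientation into the XbaI/Satl sites of pCHB to construct another antisense plant expression vector, pCBO.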
