Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Sánchez J 《Bioinformation》2011,6(9):327-329
All coding DNAs exhibit 3-base periodicity (TBP), which may be defined as the tendency of nucleotides and higher order n-tuples, e.g. trinucleotides (triplets), to be preferentially spaced by 3, 6, 9, etc. bases, and we have proposed an association between TBP and clustering of same-phase triplets. We here investigated whether TBP was affected by intercodon dinucleotide tendencies and whether clustering of same-phase triplets was involved. Under a constant protein sequence, intercodon dinucleotide frequencies depend on the distribution of synonymous codons. Possible effects were therefore revealed by randomly exchanging synonymous codons without altering protein sequences and then documenting changes in TBP via the frequency distribution of distances (FDD) of DNA triplets. A tripartite positive correlation was found between intercodon dinucleotide frequencies, clustering of same-phase triplets and TBP. Thus, intercodon C|A (where "|" indicates the boundary between codons) was more frequent in native human DNA than in the codon-shuffled sequences; higher C|A frequency occurred along with more frequent clustering of C|AN triplets (where N jointly represents A, C, G and T) and with intense CAN TBP. The opposite was found for C|G, which was less frequent in native than in shuffled sequences; lower C|G frequency occurred together with reduced clustering of C|GN triplets and with less intense CGN TBP. We hence propose that intercodon dinucleotides affect TBP via same-phase triplet clustering. A possible biological relevance of our findings is briefly discussed.
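The abstract does not spell out how the frequency distribution of distances is computed, so the following is only a minimal sketch under one plausible reading: count the distances between every pair of occurrences of a given triplet, then ask what share of those distances is a multiple of 3. The function names, the `max_dist` cutoff, and the toy sequence are illustrative assumptions, not the author's method.

```python
from collections import Counter


def triplet_positions(seq, triplet):
    """Start indices of every (possibly overlapping) occurrence of `triplet`."""
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == triplet]


def distance_distribution(seq, triplet, max_dist=30):
    """Count distances between all pairs of occurrences, up to `max_dist` bases.

    Assumption: "distance" means the difference between start positions.
    """
    positions = triplet_positions(seq, triplet)
    counts = Counter()
    for idx, first in enumerate(positions):
        for second in positions[idx + 1:]:
            distance = second - first
            if distance > max_dist:
                break  # positions are sorted, so later ones are farther away
            counts[distance] += 1
    return counts


def periodicity_fraction(counts):
    """Share of counted distances that are multiples of 3 (a crude TBP proxy)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    in_phase = sum(n for d, n in counts.items() if d % 3 == 0)
    return in_phase / total


# Toy input: a perfectly periodic repeat, so every distance is a multiple of 3.
toy = "CAG" * 5
toy_counts = distance_distribution(toy, "CAG")
```

Comparing `periodicity_fraction` for a native sequence against its codon-shuffled counterparts would be one way to mimic the comparison described above, although the published analysis may weight or normalize distances differently.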

2.
3.
Liang Y, Hong Y, Parslow TG 《Journal of virology》2005,79(16):10348-10355
The influenza A virus genome consists of eight negative-sense RNA segments. The cis-acting signals that allow these viral RNA segments (vRNAs) to be packaged into influenza virus particles have not been fully elucidated, although the 5' and 3' untranslated regions (UTRs) of each vRNA are known to be required. Efficient packaging of the NA, HA, and NS segments also requires coding sequences immediately adjacent to the UTRs, but it is not yet known whether the same is true of other vRNAs. By assaying packaging of genetically tagged vRNA reporters during plasmid-directed influenza virus assembly in cells, we have now mapped cis-acting sequences that are sufficient for packaging of the PA, PB1, and PB2 segments. We find that each involves portions of the distal coding regions. Efficient packaging of the PA or PB1 vRNAs requires at least 40 bases of 5' and 66 bases of 3' coding sequences, whereas packaging of the PB2 segment requires at least 80 bases of 5' coding region but is independent of coding sequences at the 3' end. Interestingly, artificial reporter vRNAs carrying mismatched ends (i.e., whose 5' and 3' ends are derived from different vRNA segments) were poorly packaged, implying that the two ends of any given vRNA may collaborate in forming specific structures to be recognized by the viral packaging machinery.

4.
Single-nucleotide polymorphisms (SNPs) can make an important contribution to our understanding of genetic backgrounds that may influence medical conditions and ethnic diversity. We undertook a systematic survey of genomic DNA for SNPs located not only in coding sequences but also in non-coding regions (e.g., introns and 5' flanking regions) of selected genes. Using DNA samples from 48 Japanese patients with rheumatoid arthritis (RA) as templates, we surveyed 41 genes that represent candidates for RA, screening a total of 104 kb of DNA (30 kb of coding sequences and 74 kb of non-coding DNA). Within this 104 kb of genomic sequences we identified 163 polymorphisms (1 per 638 bases on average), of which 142 were single-nucleotide substitutions and the remainder, insertions or deletions. Of the coding SNPs, 52% were non-synonymous substitutions, and non-conservative amino acid changes were observed in a quarter of those. Sixty-nine polymorphisms showed high frequencies for minor alleles (more than 15%) and 20 revealed low frequencies (<5%). Our results indicated a greater average distance between SNPs than others have reported, but this disparity may reflect the type of genes surveyed and/or the relative ethnic homogeneity of our test population.
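The headline figures in this abstract can be rechecked with a few lines of arithmetic. The sketch below assumes 1 kb = 1,000 bases and uses only the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the abstract (assumption: 1 kb = 1,000 bases).
coding_bases = 30_000
noncoding_bases = 74_000
polymorphisms = 163
substitutions = 142

surveyed_bases = coding_bases + noncoding_bases   # total DNA screened
spacing = surveyed_bases / polymorphisms          # average bases per polymorphism
indels = polymorphisms - substitutions            # "the remainder"

print(f"surveyed: {surveyed_bases} bases")
print(f"average spacing: one polymorphism per {spacing:.0f} bases")
print(f"insertions/deletions: {indels}")
```

Under that assumption the total comes to 104 kb, the average spacing rounds to 638 bases, and the remainder is 21 insertions or deletions, which agrees with the abstract.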

5.
cDNA sequence data from E. coli phages, for which complete genome sequences are known, have been analysed. From this analysis thirteen triplets have been identified as markers to distinguish protein-coding frames from fortuitous open reading frames. The region of -18 to +18 nucleotides around ATG/GTG has been analysed and used to distinguish initiator codons from internal ATG/GTG. With the aid of the criteria defined above, a method has been developed to locate protein coding sequences by a combination of 'gene search by signal' and 'gene search by content' approaches. Application of this method to prokaryotic systems, including those which were not part of our database, indicates that it is quite accurate and general in nature.

6.
The organization of 5S ribosomal RNA (rRNA) genes in the genome of Schizosaccharomyces pombe has been investigated by restriction and hybridization analyses. The 5S rRNA genes were not linked to the other three species of rRNA genes, which formed a repeating unit of 6.9 megadaltons, but were located in other regions surrounded by heterogeneous sequences. The 5S rRNA gene organization in S. pombe is therefore different from that in two other yeasts, Saccharomyces cerevisiae and Torulopsis utilis. Four restriction segments of different sizes, each containing a single 5S rRNA gene, were cloned on a bacterial plasmid, and the sequences in and around the RNA coding regions were determined. In the RNA coding regions, the sequences in the four clones were identical except that one residue was substituted in one clone. In the flanking regions, the sequences were extremely AT-rich and highly heterogeneous. The sequences were also markedly different from those in the corresponding regions of the other two yeasts. The presence of T-clusters in the regions immediately after the RNA coding sequences was the only notable homology among the four clones and the other two yeasts.

7.
A simplified plasmid-directed coupled system [Robakis, N., Cenatiempo, Y., Meza-Basso, L., Brot, N., & Weissbach, H. (1983) Methods Enzymol. 101, 690-706] was used to study the accuracy of natural messenger translation in vitro. In this system, protein synthesis is limited to the formation of the N-terminal di- or tripeptide of the gene product. Such a control is obtained by restricting the supply of aminoacyl-tRNAs in the assay medium to those corresponding specifically to the first two or three triplets in the mRNA coding sequence. We analyzed comparatively the interaction of 6 different codons with their cognate tRNAs and 18 noncognate tRNAs able to recognize triplets differing from the legitimate sequences by one base only. Special attention was paid to the single base errors occurring at the first and second codon positions during ribosomal selection of aminoacyl-tRNA molecules. The noncognate tRNAs were assayed either in the absence of the legitimate tRNAs or under competition conditions. They were chosen so that all the possibilities for misreading any particular base as each of the other three bases could be studied. First, it was mainly observed that translation mistakes can be equally detected in the first and second codon positions; there is no compelling evidence for a most or least accurate site. Second, pyrimidines seem to be read more accurately than purines. In particular, U cannot be read as either C or G, and C can hardly be mistaken for any other base. (ABSTRACT TRUNCATED AT 250 WORDS)

8.
Recently, we have shown that peptide nucleic acid (PNA) tridecamers targeted to the codon 74, 128 and 149 regions of Ha-ras mRNA arrested translation elongation in vitro. Our data demonstrated for the first time that PNAs with mixed base sequence targeted to the coding region of a messenger RNA could arrest the translation machinery and polypeptide chain elongation. The peculiarity of the complexes formed with PNA tridecamers and Ha-ras mRNA rests upon the stability of PNA-mRNA hybrids, which are not dissociated by cellular proteins or multiple denaturing conditions. In the present study, we show that shorter PNAs such as a dodecamer or an undecamer targeted to the codon 74 region arrest translation elongation in vitro. The 13-, 12-, and 11-mer PNAs contain eight and the 10-mer PNA seven contiguous pyrimidine residues. Upon binding with parallel Hoogsteen base-pairing to the PNA-RNA duplex, six of the cytosine bases and one thymine base of a second PNA can form C.G*C(+) and T.A*T triplets. Melting experiments show two well-resolved transitions corresponding to the dissociation of the third strand from the core duplex and to melting of the duplex at higher temperature. The enzymatic structure mapping of a target 27-mer RNA revealed a hairpin structure that is disrupted upon binding of trideca-, dodeca-, undeca- and decamer PNAs. We show that the non-bonded nucleobase overhangs on the RNA stabilize the PNA-RNA hybrids and probably assist the PNA in overcoming the stable secondary structure of the RNA target. The great stability of PNA-RNA duplex and triplex structures allowed us to identify both 1:1 and 2:1 PNA-RNA complexes using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Therefore, it is possible to successfully target mixed sequences in structured regions of messenger RNA with short PNA oligonucleotides that form duplex and triplex structures that can arrest elongating ribosomes.

9.
Summary: Various measures of sequence dissimilarity have been evaluated by how well the additive least squares estimation of edges (branch lengths) of an unrooted evolutionary tree fit the observed pairwise dissimilarity measures and by how consistent the trees are for different data sets derived from the same set of sequences. This evaluation provided sensitive discrimination among dissimilarity measures and among possible trees. Dissimilarity measures not requiring prior sequence alignment did about as well as did the traditional mismatch counts requiring prior sequence alignment. Application of Jukes-Cantor correction to singlet mismatch counts worsened the results. Measures not requiring alignment had the advantage of being applicable to sequences too different to be critically alignable. Two different measures of pairwise dissimilarity not requiring alignment have been used: (1) multiplet distribution distance (MDD), the square of the Euclidean distance between vectors of the fractions of base singlets (or doublets, or triplets, or…) in the respective sequences, and (2) complements of long words (CLW), the count of bases not occurring in significantly long common words. MDD was applicable to sequences more different than was CLW (noncoding), but the latter often gave better results where both measures were available (coding). MDD results were improved by using longer multiplets and, if the sequences were coding, by using the larger amino acid and codon alphabets rather than the nucleotide alphabet. The additive least squares method could be used to provide a reasonable consensus of different trees for the same set of species (or related genes).
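The definition of MDD given in the abstract (squared Euclidean distance between vectors of multiplet fractions) translates fairly directly into code. The sketch below is an illustration under stated assumptions: multiplets are counted in overlapping windows over the nucleotide alphabet, and windows containing other characters are ignored. The function names are hypothetical and the original study may have counted multiplets differently.

```python
from collections import Counter
from itertools import product


def multiplet_fractions(seq, k, alphabet="ACGT"):
    """Fractions of each length-k word, counted in overlapping windows.

    Assumption: windows containing characters outside `alphabet` are skipped.
    """
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid multiplets of this length")
    return [counts[w] / total for w in words]


def mdd(seq_a, seq_b, k=1):
    """Multiplet distribution distance: squared Euclidean distance between
    the two sequences' multiplet-fraction vectors."""
    fractions_a = multiplet_fractions(seq_a, k)
    fractions_b = multiplet_fractions(seq_b, k)
    return sum((x - y) ** 2 for x, y in zip(fractions_a, fractions_b))


# Identical compositions give 0; disjoint singlet compositions give 2.
same = mdd("ACGT", "TGCA", k=1)
disjoint = mdd("AAAA", "CCCC", k=1)
```

Because only word frequencies are compared, no alignment is needed, which is the property the abstract highlights. Raising `k` corresponds to the "longer multiplets" that the authors report as improving results.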

10.
This report deals with the study of compositional properties of human gene sequences, evaluating similarities and differences among functionally distinct sectors of the gene independently of the reading frame. To retrieve the compositional information of DNA, we present a neighbor base dependent coding system in which the alphabet of 64 letters (DNA triplets) is compressed to an alphabet of 14 letters, here termed triplet composons. The triplets containing the same set of distinct bases in whatever order and number form a triplet composon. The reading of the DNA sequence is performed starting at any letter of the initial triplet and then moving, triplet-to-triplet, until the end of the sequence. The readings were made in an overlapping way along the length of the sequences. The analysis of the compositional content in terms of the composon usage frequencies of the gene sequences shows that: (i) the compositional content of the sequences is far from that of random sequences, even in the case of non-protein coding sequences; (ii) coding sequences can be classified as components of compositional clusters; and (iii) intron sequences in a cluster have the same composon usage frequencies, even as their base composition differs notably from that of their home coding sequences. A comparison of the composon usage frequencies between human and mouse homologous genes indicated that two clusters found in humans do not have their counterpart in mouse, whereas the other clusters are stable in both species with respect to their composon usage frequencies in both coding and noncoding sequences.
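The compression from 64 triplets to 14 composons follows from the definition above: a composon is identified by the set of distinct bases in a triplet, and there are 4 one-base sets, 6 two-base sets and 4 three-base sets. The sketch below illustrates this; labelling each composon by its sorted distinct bases and the function names are assumptions made here for illustration, not the authors' notation.

```python
from collections import Counter
from itertools import product


def composon(triplet):
    """Label a triplet by its set of distinct bases, e.g. 'GAG' -> 'AG'."""
    return "".join(sorted(set(triplet)))


# Enumerating all 64 triplets yields exactly 14 distinct composons.
ALL_COMPOSONS = sorted({composon("".join(t)) for t in product("ACGT", repeat=3)})


def composon_usage(seq):
    """Composon usage frequencies from overlapping triplet windows."""
    counts = Counter(composon(seq[i:i + 3]) for i in range(len(seq) - 2))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: counts[name] / total for name in ALL_COMPOSONS}


usage = composon_usage("AAAC")  # windows: 'AAA' and 'AAC'
```

Reading in overlapping windows, as the abstract describes, is what makes the resulting frequencies independent of the reading frame.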

11.
12.
Most of the gene prediction algorithms for prokaryotes are based on Hidden Markov Models or similar machine-learning approaches, which imply the optimization of a high number of parameters. The present paper presents a novel method for the classification of coding and non-coding regions in prokaryotic genomes, based on a suitably defined compression index of a DNA sequence. The main features of this new method are the non-parametric logic and the construction of a dictionary of words extracted from the sequences. These dictionaries can be very useful for further analyses of the genomic sequences themselves. The proposed approach has been applied to some complete prokaryotic genomes, obtaining optimal scores of correctly recognized coding and non-coding regions. Several false-positive and false-negative cases have been investigated in detail, revealing that this approach can fail in the presence of highly structured coding regions (e.g., genes coding for modular proteins) or quasi-random non-coding regions (e.g., regions hosting non-functional fragments of copies of functional genes; regions hosting promoters or other protein-binding sequences). We perform only an overall comparison with other gene-finder software, since at this step we are not interested in building another gene-finder system, but only in exploring the possibility of the suggested approach.

13.
J Sugihara, T O Baldwin 《Biochemistry》1988,27(8):2872-2880
Ten recombinant plasmids have been constructed by deletion of specific regions from the plasmid pTB7 that carries the luxA and luxB genes, encoding the alpha and beta subunits of luciferase from Vibrio harveyi, such that luciferases with normal alpha subunits and variant beta subunits were produced in Escherichia coli cells carrying the recombinant plasmids. The original plasmid, which conferred bioluminescence (upon addition of exogenous aldehyde substrate) on E. coli carrying it, was constructed by insertion of a 4.0-kb HindIII fragment of V. harveyi DNA into the HindIII site of plasmid pBR322 [Baldwin, T.O., Berends, T., Bunch, T. A., Holzman, T. F., Rausch, S. K., Shamansky, L., Treat, M. L., & Ziegler, M. M. (1984) Biochemistry 23, 3663-3667]. Deletion mutants in the 3' region of luxB were divided into three groups: (A) those with deletions in the 3' untranslated region that left the coding sequences intact, (B) those that left the 3' untranslated sequences intact but deleted short stretches of the 3' coding region of the beta subunit, and (C) those for which the 3' deletions extended from the untranslated region into the coding sequences. Analysis of the expression of luciferase from these variant plasmids has demonstrated two points concerning the synthesis of luciferase subunits and the assembly of those subunits into active luciferase in E. coli. First, deletion of DNA sequences 3' to the translational open reading frame of the beta subunit that contain a potential stem and loop structure resulted in dramatic reduction in the level of accumulation of active luciferase in cells carrying the variant plasmids, even though the luxAB coding regions remained intact.

14.
15.
16.
17.
To investigate the energetic and structural character of RNA-DNA duplex base triplets and mRNA-tRNA triplex base triplets, 64 sets of three-dimensional models of each were constructed and optimized by homology modeling using the software InsightII. Comparative statistical methods and cluster analysis were adopted to study these features. The results showed: (i) all energy parameters of monomer RNA-DNA hybrid triplets and ternary complexes appeared significantly different; some parameters related to the overall molecules, such as overall energy, bond energy and coulomb energy, have statistically significant correlations between the structures in vacuum and in aqueous solution, while other parameters, including theta energy, phi energy, hydrogen bond energy and non-bond energy, changed significantly, but not continuously. (ii) However, the case of mRNA-tRNA triplets was much more complicated in that only the bond energy's correlation coefficient is -0.8. Notably, the main contributions of GC pairs and G/A/U bases were of interest. The models of RNA-DNA hybrid triplets and mRNA-tRNA triplets should be helpful for the study of base pairing in codons and the biological effectiveness of antisense nucleic acids.

18.
The primary structure of an insert from a clone isolated from a bovine pituitary cDNA library by hybridization with a prolactin-specific probe has been determined. It was found that a rearrangement of the cDNA took place in the process of cloning. The rearrangement includes the inversion of the 5'-terminal region and the deletion of the central part of the cDNA. However, from the structure of the insert we were able to deduce the sequences of the 5'- and 3'-terminal regions of bovine preprolactin mRNA (257 and 551 bases long). The comparison of these sequences with those published earlier revealed several differences in the primary structure. The most essential of them is an additional triplet coding for alanine at position -22 of the signal peptide. The heterogeneity of bovine preprolactin mRNA in the region coding for the signal peptide is considered to be a consequence of alternative splicing, as was shown for rat preprolactin mRNA.

19.
J L Weber 《Gene》1987,52(1):103-109
The genome of the human malaria parasite Plasmodium falciparum has an A + T content of about 82%, higher than any other organism whose DNA has been characterized. Computer analysis of 36 kb of available nucleotide sequences from this species showed that the coding regions, with an A + T content of 69.0%, are flanked by more A + T-rich regions of 86.0% A + T. Within the coding sequences, the A/T ratio was 1.68 in the mRNA sense strand, and overall A + T content in the three codon positions increased in the order 1st-2nd-3rd position. Codons with T or especially A in the third position were strongly preferred. Codon usage among individual parasite genes was very similar compared to genes from other species. Dinucleotide frequencies for the parasite DNA were close to those expected for a random sequence with the known base composition, except that the CpG frequency in the coding sequences was low.
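The quantities reported here (overall A + T content and A + T content at each codon position) are simple to compute for any in-frame coding sequence. The following is a minimal sketch, assuming the input starts at codon position 1 and ignoring any trailing partial codon; the function names and the toy sequence are illustrative and unrelated to the original computer analysis.

```python
def at_content(seq):
    """Fraction of bases that are A or T (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def at_by_codon_position(cds):
    """A + T fraction at codon positions 1, 2 and 3.

    Assumption: `cds` is in frame; a trailing partial codon is dropped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [at_content(cds[pos:usable:3]) for pos in range(3)]


# Toy sequence (GCA GCT GCA): A/T appears only in the third codon position.
toy_cds = "GCAGCTGCA"
toy_profile = at_by_codon_position(toy_cds)
```

For a gene set like the one analysed in the abstract, one would expect the three returned values to rise from the first to the third position, mirroring the reported 1st-2nd-3rd ordering.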

20.