1.

Visualization of the protein-coding regions with a self adaptive spectral rotation approach

Chen B, Ji P 《Nucleic acids research》 2011, 39(1): e3

Identifying protein-coding regions in DNA sequences is an active problem in computational biology. In this study, we present a self-adaptive spectral rotation (SASR) approach, which visualizes coding regions in DNA sequences based on the triplet periodicity property and requires no prior training. It is intended to support rough prediction of coding regions when the extra information needed to train other leading methods is unavailable. At each position in the DNA sequence, a Fourier spectrum is calculated from the posterior subsequence. From these spectra, a random walk in the complex plane is generated as the SASR's graphic output. Applications of the SASR to real DNA data show that patterns in the graphic output reveal the locations of coding regions and the frame shifts between them: arcs indicate coding regions, stable points indicate non-coding regions, and the shapes of corners reveal frame shifts. Tests on a genomic data set from Saccharomyces cerevisiae show that the graphic patterns for coding and non-coding regions differ greatly, so that coding regions can be distinguished visually. A time cost test shows that the SASR can be implemented with a computational complexity of O(N).
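
The triplet periodicity that this abstract relies on can be illustrated with a much simpler measure than SASR itself. The sketch below is not the authors' algorithm: it only computes the Fourier power at frequency 1/3 over the four base-indicator sequences of a window, a standard period-3 signal. The function names, window size and step are assumptions chosen for illustration.

```python
import cmath


def period3_power(seq: str) -> float:
    """Fourier power at frequency 1/3, summed over A, C, G, T indicators.

    Illustrative measure only (assumption); this is not the SASR method.
    """
    n = len(seq)
    if n == 0:
        return 0.0
    w = cmath.exp(-2j * cmath.pi / 3)  # one-third turn in the complex plane
    total = 0.0
    for base in "ACGT":
        # Sum the rotation factor over every position holding this base.
        s = sum(w ** (k % 3) for k, ch in enumerate(seq) if ch == base)
        total += abs(s) ** 2
    return total / n


def sliding_profile(seq: str, window: int = 60, step: int = 3) -> list:
    """Period-3 power for each window along the sequence (hypothetical sizes)."""
    return [
        period3_power(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]
```

A perfectly periodic toy sequence such as "ACG" repeated gives a high value in every window, while a homopolymer gives a value near zero; real coding regions fall somewhere in between.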

2.

Identification of gene expression elements in Histomonas meleagridis using splinkerette PCR, a variation of ligated adaptor PCR

Lynn EC, Beckstead RB 《The Journal of parasitology》 2012, 98(1): 135-141

3.

Statistical characteristics of primary structures of the functional regions of the Escherichia coli genome. III. Computer recognition of coding regions

M Iu Borodovskiĭ, Iu A Sprizhitskiĭ, E I Golovanov, A A Aleksandrov 《Molekuliarnaia biologiia》 1986, 20(5): 1390-1398

4.

Polypeptide structure and encoding location of the adenovirus serotype 2 late, nonstructural 33K protein. (Total citations: 8; self-citations: 6; citations by others: 2)

E A Oosterom-Dragon, C W Anderson 《Journal of virology》 1983, 45(1): 251-263

5.

Sequence and transcription analysis of the human cytomegalovirus DNA polymerase gene (Total citations: 55; self-citations: 30; citations by others: 25)

T Kouzarides, A T Bankier, S C Satchwell, K Weston, P Tomlinson, B G Barrell 《Journal of virology》 1987, 61(1): 125-133

6.

Cloning of the gene for the 73 kD subunit of the DNA polymerase alpha primase of Drosophila melanogaster.

S Cotterill, I R Lehman, P McLachlan 《Nucleic acids research》 1992, 20(16): 4325-4330

7.

Using Triplet Periodicity of Nucleotide Sequences for Finding Potential Reading Frame Shifts in Genes

F.E. Frenkel, E.V. Korotkov 《DNA research》 2009, 16(2): 105-114

We introduce a novel approach for detecting possible mutations that lead to a reading frame (RF) shift in a gene. Deletions and insertions in DNA coding regions are significant events for genes because an RF shift alters an extensive region of the amino acid sequence coded by the gene. The suggested method is based on the phenomenon of triplet periodicity (TP) in coding regions of genes and its relative resistance to substitutions in the DNA sequence. We attempted to extend 326 933 regions of continuous TP found in genes from the KEGG databank by considering possible insertions and deletions. In total, we found 824 genes where such an extension was possible and statistically significant. We then generated amino acid sequences according to the active (KEGG's) and hypothetically ancient RFs in order to find confirmation of a shift at the protein level. As a result, 64 sequences have protein similarities only for the ancient RF, 176 only for the active RF, 3 for both, and 581 have no protein similarity at all. We aimed to establish a lower bound for the number of genes in which a shift between RF and TP is possible. Further ways to increase the number of revealed RF shifts are discussed.
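
One simple way to picture triplet periodicity and its link to the reading frame is a 3 x 4 table of nucleotide counts by codon position. The sketch below is an assumption-laden illustration, not the measure used in the paper: it builds that table for a chosen phase and scores it with mutual information between codon position and nucleotide. The names `tp_matrix` and `mutual_information` are hypothetical.

```python
import math
from collections import Counter


def tp_matrix(seq: str, phase: int = 0) -> list:
    """3 x 4 count matrix: rows are codon positions, columns are A, C, G, T.

    `phase` is the assumed offset of the reading frame (hypothetical parameter).
    """
    counts = Counter(
        ((i - phase) % 3, ch) for i, ch in enumerate(seq) if ch in "ACGT"
    )
    return [[counts[(p, b)] for b in "ACGT"] for p in range(3)]


def mutual_information(matrix: list) -> float:
    """Mutual information (bits) between codon position and nucleotide."""
    total = sum(map(sum, matrix))
    if total == 0:
        return 0.0
    row = [sum(r) for r in matrix]
    col = [sum(c) for c in zip(*matrix)]
    mi = 0.0
    for i, r in enumerate(matrix):
        for j, n in enumerate(r):
            if n:  # empty cells contribute nothing
                mi += n / total * math.log2(n * total / (row[i] * col[j]))
    return mi
```

Changing `phase` rotates the rows of the matrix without changing the score, so in this simplified picture a frame shift shows up as a region whose matrix matches a rotated version of its neighbour's.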

8.

Cloning and sequencing of Plasmodium falciparum DNA fragments containing repetitive regions potentially coding for histidine-rich proteins: identification of two overlapping reading frames (Total citations: 2; self-citations: 0; citations by others: 2)

R Lenstra, L d'Auriol, B Andrieu, J Le Bras, F Galibert 《Biochemical and biophysical research communications》 1987, 146(1): 368-377

DNA sequences potentially coding for histidine-rich proteins were isolated from a P. falciparum genomic library using an oligonucleotide probe consisting of histidine codon repeats. Sequencing revealed that the different DNA fragments contain long repetitive regions highly homologous to the probe. One clone was fully sequenced and contains two open reading frames that overlap in the repetitive region but are located on opposite strands. Analysis suggests that both are coding. One frame could code for a small histidine-rich protein, the other for a protein containing many aspartic acid residues. Southern blotting revealed that these sequences are conserved in all three P. falciparum strains studied.

9.

Evolutionary conservation of putative functional domains in the human homolog of the murine His-1 gene

《Gene》1997,184(2):169-176

10.

Complete c-mos (rat) nucleotide sequence: presence of conserved domains in c-mos proteins (Total citations: 5; self-citations: 1; citations by others: 4)

F A van der Hoorn, J Firzlaff 《Nucleic acids research》 1984, 12(4): 2147-2156

Recently we described the isolation of c-mos (rat). The gene belongs to the family of oncogenes. Two facts render c-mos unique among the oncogenes: (a) it does not contain intervening sequences and (b) its expression was never detected in a large number of normal mouse tissues examined. We undertook the sequence analysis of c-mos (rat) in order to compare it to the nucleotide sequences published for c-mos (mouse), c-mos (human), c-src and bovine protein kinase. c-mos (rat) contains an open reading frame of 1017 nucleotides, coding for a polypeptide of 339 amino acids. c-mos (rat) makes use of the same ATG that defines the N-terminus of the c-mos (human) protein. By comparing all c-mos sequences available, we found sequences with high mutational rates to be confined to certain domains. This comparison, together with data on the biological activities of the cloned DNA, allowed us to tentatively define regions involved in (a) function(s) of c-mos other than transformation.
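
The figures quoted here agree with plain codon arithmetic: 339 amino acids x 3 nucleotides = 1017 nucleotides, which suggests the stop codon is not included in the 1017. As a rough illustration of how such an open reading frame might be located, the sketch below scans the three forward frames for the longest ATG-to-stop stretch. `longest_orf` is a hypothetical helper, not something from the paper, and it ignores the reverse strand.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard stop codons


def longest_orf(seq: str):
    """Return (start, end) of the longest forward-strand ORF, or None.

    `start` is the index of the ATG; `end` is the index of the stop codon,
    so seq[start:end] is the coding part without the stop (assumed convention).
    """
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the frame
            elif start is not None and codon in STOPS:
                if best is None or i - start > best[1] - best[0]:
                    best = (start, i)
                start = None  # look for the next ORF in this frame
    return best
```

For a toy sequence with an ATG followed by 338 further codons and a stop, the returned span is 1017 nucleotides long, that is, 339 codons.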

11.

Construction of a multiframe vector to express coding sequences in Escherichia coli

Domínguez-Martínez V, Guarneros-Peña G, Segura-Nieto M, Curiel-Quesada E 《Canadian journal of microbiology》 2001, 47(1): 72-76

Cloning of foreign DNA fragments for coding sequence analysis in Escherichia coli usually involves sets of three vectors. To simplify this, we constructed an expression vector named pMFV7 containing three ATG codons in different frames downstream of a Shine-Dalgarno sequence, assuming that the ribosome can use any of the three start codons in an alternative manner. Translation beginning at any of the start codons would drive the expression of any coding fragment cloned downstream. To test the feasibility of this proposal, we cloned DNA fragments of the lacZ gene in each of the possible reading frames downstream from the pMFV7 start codons. Sequence analysis of the N-terminal regions around the fusion sites indicates that ribosomes indeed initiate translation at each of the three initiation codons. In one case, levels of beta-galactosidase activity depended largely on the N-terminus of the translation products. We conclude that pMFV7 may be useful for expressing coding sequences regardless of their reading frame.
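
The role of the reading frame described in this abstract can be made concrete with a minimal in silico translation. The sketch below is only an illustration under the standard genetic code and has nothing to do with the pMFV7 construct itself: it translates a fragment starting at offset 0, 1 or 2, showing that the same insert gives three different products. The `translate` helper and the use of "X" for unknown codons are assumptions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate `seq` from offset `frame` (0, 1 or 2); unknown codons give X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame, len(seq) - 2, 3)
    )
```

Translating "CATGGCTTAA" in frame 1 reads ATG GCT TAA, while frame 0 of the same string reads CAT GGC TTA, a different peptide with no start codon.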

12.

Conservation of gene organization in the lymphotropic herpesviruses herpesvirus Saimiri and Epstein-Barr virus. (Total citations: 13; self-citations: 12; citations by others: 1)

U A Gompels, M A Craxton, R W Honess 《Journal of virology》 1988, 62(3): 757-767

By analyses of short DNA sequences, we have deduced the overall arrangement of genes in the (A + T)-rich coding sequences of herpesvirus saimiri (HVS) relative to the arrangements of homologous genes in the (G + C)-rich coding sequences of the Epstein-Barr virus (EBV) genome and the (A + T)-rich sequences of the varicella-zoster virus (VZV) genome. Fragments of HVS DNA from 13 separate sites within the 111 kilobase pairs of the light DNA coding sequences of the genome were subcloned into M13 vectors, and sequences of up to 350 bases were determined from each of these sites. Amino acid sequences predicted for fragments of open reading frames defined by these sequences were compared with a library of the protein sequences of major open reading frames predicted from the complete DNA sequences of VZV and EBV. Of the 13 short amino acid sequences obtained from HVS, only 3 were recognizably homologous to proteins encoded by VZV, but all 13 HVS sequences were unambiguously homologous to gene products encoded by EBV. The HVS reading frames identified by this method included homologs of the major capsid polypeptides, glycoprotein H, the major nonstructural DNA-binding protein, thymidine kinase, and the homolog of the regulatory gene product of the BMLF1 reading frame of EBV. Locally as well as globally, the order and relative orientation of these genes resembled those of their homologs on the EBV genome. Despite the major differences in their nucleotide compositions and in the nature and arrangements of reiterated DNA sequences, the genomes of the lymphotropic herpesviruses HVS and EBV encode closely related proteins, and they share a common organization of these coding sequences which differs from that of the neurotropic herpesviruses, VZV and herpes simplex virus.

13.

Nucleotide sequence of the gene responsible for D-xylose uptake in Escherichia coli.

N Kurose, K Watanabe, A Kimura 《Nucleic acids research》 1986, 14(17): 7115-7123

14.

OCPAT: an online codon-preserved alignment tool for evolutionary genomic analysis of protein coding sequences

Guozhen Liu, Monica Uddin, Munirul Islam, Morris Goodman, Lawrence I Grossman, Roberto Romero, Derek E Wildman 《Source code for biology and medicine》 2007, 2(1): 5

15.

Nucleotide sequence of a gene cluster involved in entry of E colicins and single-stranded DNA of infecting filamentous bacteriophages into Escherichia coli. (Total citations: 29; self-citations: 19; citations by others: 10)

T P Sun, R E Webster 《Journal of bacteriology》 1987, 169(6): 2667-2674

Mutations in fii or tolA of the fii-tolA-tolB gene cluster at 17 min on the Escherichia coli map render cells tolerant to high concentrations of the E colicins and do not allow the DNA of infecting single-stranded filamentous bacteriophages to enter the bacterial cytoplasm. The nucleotide sequence of a 1,854-base-pair DNA fragment carrying the fii region was determined. This sequence predicts three open reading frames sequentially coding for proteins of 134, 230, and 142 amino acids, followed by the potential start of the tolA gene. Oligonucleotide mutagenesis of each open reading frame and maxicell analysis demonstrated that all open reading frames are expressed in vivo. Sequence analysis of mutant fii genes identified the 230-amino acid protein as the fii gene product. Chromosomal insertion mutations were constructed in each of the two remaining open reading frames. The phenotype resulting from an insertion of the chloramphenicol gene into the gene coding for the 142-amino acid protein is identical to that of mutations in fii and tolA. This gene is located between fii and tolA, and we propose the designation of tolQRA for this cluster in which tolQ is the former fii gene and tolR is the new open reading frame. The protein products of this gene cluster play an important role in the transport of large molecules such as the E colicins and filamentous phage DNA into the bacterium.

16.

The primary structure of a halorhodopsin from Natronobacterium pharaonis. Structural, functional and evolutionary implications for bacterial rhodopsins and halorhodopsins (Total citations: 3; self-citations: 0; citations by others: 3)

J K Lanyi, A Duschl, G W Hatfield, K May, D Oesterhelt 《The Journal of biological chemistry》 1990, 265(3): 1253-1260

We cloned and sequenced the gene coding for the polypeptide of a halorhodopsin in Natronobacterium pharaonis (named here pharaonis halorhodopsin). Peptide sequencing of cyanogen bromide fragments, and immunoreactions of the protein and synthetic peptides derived from the COOH-terminal gene sequence, confirmed that the open reading frame is the structural gene for the pharaonis halorhodopsin polypeptide. The flanking DNA sequences, as well as those for other bacterial rhodopsins, were compared to previously proposed archaebacterial consensus sequences. In pairwise comparisons of the open reading frame with DNA sequences for bacterio-opsin and halo-opsin from Halobacterium halobium, silent divergences (mutations/nucleotide at codon positions which do not result in amino acid changes) were calculated. These indicate very considerable evolutionary distance between each pair of genes. In spite of this, the three protein sequences show extensive similarities, indicating strong selective pressures. Conserved and conservatively replaced amino acid residues in all three proteins identify general features essential for ion-motive bacterial rhodopsins, responsible for overall structure and chromophore properties. Comparison of the bacteriorhodopsin sequence with those of the two halorhodopsins, on the other hand, identifies features involved in their specific (proton and chloride ion) transport functions.

17.

Are Noncoding Sequences of Rickettsia prowazekii Remnants of "Neutralized" Genes?

Holste D, Weiss O, Grosse I, Herzel H 《Journal of molecular evolution》 2000, 51(4): 353-362

It has been hypothesized that a large fraction of the 24% noncoding DNA in R. prowazekii consists of degraded genes. This hypothesis has been based on the relatively high G+C content of noncoding DNA. However, a comparison with other genomes also having a low overall G+C content shows that this argument would also apply to other bacteria. To test this hypothesis, we study the coding potential in sets of genes, pseudogenes, and intergenic regions. We find that the correlation function and the χ²-measure are clearly indicative of the coding function of genes and pseudogenes. However, both coding potentials give almost no indication of a preexisting reading frame in the remaining 23% of noncoding DNA. We simulate the degradation of genes due to single-nucleotide substitutions and insertions/deletions and quantify the number of mutations required to remove indications of the reading frame. We discuss a reduced selection pressure as another possible origin of this comparatively large fraction of noncoding sequences. Received: 27 December 1999 / Accepted: 5 July 2000
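
The degradation experiment mentioned in this abstract can be mimicked in a toy setting. The sketch below is not the authors' procedure: it swaps in a Fourier period-3 power in place of their correlation function and χ²-measure, applies only random substitutions (no insertions or deletions), and records how the signal falls as mutations accumulate. All names, the seed and the mutation counts are assumptions.

```python
import cmath
import random


def period3_power(seq: str) -> float:
    """Fourier power at frequency 1/3 (stand-in measure, an assumption)."""
    w = cmath.exp(-2j * cmath.pi / 3)
    total = 0.0
    for base in "ACGT":
        s = sum(w ** (k % 3) for k, ch in enumerate(seq) if ch == base)
        total += abs(s) ** 2
    return total / len(seq) if seq else 0.0


def substitute(seq: str, n_mut: int, rng: random.Random) -> str:
    """Apply n_mut single-nucleotide substitutions at random positions."""
    s = list(seq)
    for _ in range(n_mut):
        i = rng.randrange(len(s))
        # Always pick a base different from the current one.
        s[i] = rng.choice([b for b in "ACGT" if b != s[i]])
    return "".join(s)


def decay_curve(seq: str, steps: int, per_step: int, seed: int = 0) -> list:
    """Period-3 power before mutation and after each round of substitutions."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    curve = [period3_power(seq)]
    for _ in range(steps):
        seq = substitute(seq, per_step, rng)
        curve.append(period3_power(seq))
    return curve
```

Reading off the step at which the curve reaches the level of a shuffled sequence gives a crude estimate of how many substitutions are needed to erase the frame signal in this toy model.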

18.

Classification of triplet periodicity in DNA sequences of genes taken from KEGG databank

Frenkel' FE Korotkov EV 《Molekuliarnaia biologiia》2008,42(4):707-720

We classified 472,288 regions of triplet periodicity found in 578,868 genes from release 29 of the KEGG databank. A new concept of a triplet periodicity class and a measure of similarity between classes are introduced. In total, 2520 classes were created, containing 94% of the triplet periodicity regions found. For 92% of the triplet periodicity regions contained in classes, an identical linkage of triplet periodicity to the reading frame is observed. For the remaining cases, a shift was observed between the reading frame of a gene and the reading frame common to the majority of genes in the triplet periodicity class. These periodicity regions were translated into hypothetical amino acid sequences according to the reading frame implied by the triplet periodicity class. Using the BLAST program, it was shown that 2660 hypothetical amino acid sequences have statistically significant similarity to proteins from the UniProt databank. We suppose that the 8% of triplet periodicity regions assigned to classes have mutated through a reading frame shift. The classes of triplet periodicity created can be used for identifying coding regions of genes as well as for searching for mutations arising from reading frame shifts.

19.

Cysteine residue periodicity is a conserved structural feature of variable surface proteins from Paramecium tetraurelia. (Total citations: 3; self-citations: 0; citations by others: 3)

E Nielsen, Y You, J Forney 《Journal of molecular biology》 1991, 222(4): 835-841

The DNA sequences of the entire coding regions of the A and C type variable surface protein genes from Paramecium tetraurelia, stock 51, have been determined. The 8151 nucleotide open reading frame of the A gene contains several tandem repeats of 210 nucleotides within the central portion of the molecule as well as a periodic structure defined by cysteine residues. The 6699 nucleotide open reading frame of the C gene does not contain any identifiable tandem repeats or internal similarity but maintains a periodicity based on the cysteine residue spacing. The deduced amino acid sequences encoded by the two genes are most similar within the 600 amino-terminal and 600 carboxyl-terminal amino acid residues; the central portions show only limited sequence similarity. We conclude that internal repeats are not a conserved feature of variable surface proteins in Paramecium and discuss the possible importance of the regular pattern of cysteine residues.
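
A periodicity based on cysteine spacing, as described here, can be examined with a very small tally. The sketch below is a hypothetical helper, not the authors' analysis: it lists the distances between consecutive cysteines in a one-letter protein sequence and counts how often each distance occurs, so that a dominant spacing stands out.

```python
from collections import Counter


def cysteine_spacings(protein: str) -> Counter:
    """Count distances between consecutive cysteine (C) residues."""
    positions = [i for i, aa in enumerate(protein) if aa == "C"]
    # Pair each cysteine with the next one and tally the gaps.
    return Counter(b - a for a, b in zip(positions, positions[1:]))
```

For a toy sequence with a cysteine every fourth residue, the tally contains a single spacing of 4; a real surface protein would be expected to show a broader but still peaked distribution.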

20.

Sequence organization of two recombinant plasmids containing genes for the major heat shock-induced protein of D. melanogaster. (Total citations: 35; self-citations: 0; citations by others: 35)

E A Craig, B J McCarthy, S C Wadsworth 《Cell》 1979, 16(3): 575-588

We have isolated recombinant DNA clones which include cDNA and chromosomal DNA sequences of the major heat shock-inducible gene of Drosophila. With the cDNA fragments used as specific hybridization probes, DNA:DNA reassociation and in situ hybridization analysis demonstrated that the DNA sequences are repeated approximately 7 times in the haploid Drosophila genome, and that gene sequences are present at both the 87A and 87C loci on the cytological map. The cloned cDNA and homologous cloned chromosomal DNA hybridized to mRNA which translated in vitro into the major 70K heat shock-specific protein. Here we summarize a study of the organization of genes coding for the 70K heat shock-specific protein contained in the two recombinant chromosomal DNA plasmids pG3 and pG5. On the basis of R loop hybridization experiments and restriction enzyme analysis, we conclude that a 14 kb fragment, G3, contains three copies of the gene coding for the 70K protein. A second 9.2 kb fragment, G5, contains one copy of the gene coding for the 70K protein. Hybridization of labeled poly(A)-containing RNA to restriction endonuclease-cleaved DNA indicates that the mRNA coding regions in G3 and G5 are each approximately 2100 bp long. The three tandemly repeated genes of G3 are separated by approximately 1400 bp of spacer DNA. The two internal spacer regions in G3 appear to be identical, whereas differences in restriction enzyme sites indicate that the sequences adjacent to the cluster differ from the internal spacer and from each other.