Similar literature
 20 similar articles found (search time: 46 ms)
1.
To manage and intelligently mine the avalanche of genomic sequences, intuitive and user-friendly graphical interfaces are required. Here we present BlastXtract2, which exclusively facilitates early exploration of un-annotated genomic and metagenomic sequences. Various formats of translated searches, including the commonly used BlastX, of multiple sequences against multiple protein databases can be uploaded to a relational database server, which can be accessed via a locally installed web-server. There, an intuitive GUI allows straightforward data-mining and enables quick detection of potential frameshifts and poorly sequenced or assembled regions, making BlastXtract2 a unique and valuable tool for early exploration of (meta)genomic sequences.

Availability

Source code, documentation and an online demo version are available at https://github.com/ClaessonLab/BlastXtract2
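
The abstract above mentions detecting potential frameshifts from translated (BlastX-style) search hits. A minimal sketch of one common heuristic follows: flag pairs of hits on the same query that match the same protein on the same strand but in different reading frames, and that overlap or lie close together. This is an illustration under stated assumptions, not BlastXtract2's actual method; the `Hit` record, the `candidate_frameshifts` function, and the `max_gap` threshold are all hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One translated-search hit (hypothetical record layout)."""
    subject: str  # protein identifier
    frame: int    # reading frame: +1..+3 or -1..-3
    q_start: int  # 1-based start on the nucleotide query
    q_end: int    # 1-based end on the nucleotide query


def candidate_frameshifts(hits, max_gap=30):
    """Return pairs of hits that suggest a frameshift.

    A pair qualifies when both hits match the same protein on the same
    strand in different frames, and the second hit starts no more than
    max_gap nucleotides after the first one ends (overlaps included).
    """
    pairs = []
    ordered = sorted(hits, key=lambda h: min(h.q_start, h.q_end))
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            same_strand = (a.frame > 0) == (b.frame > 0)
            gap = min(b.q_start, b.q_end) - max(a.q_start, a.q_end)
            if (a.subject == b.subject and same_strand
                    and a.frame != b.frame and gap <= max_gap):
                pairs.append((a, b))
    return pairs


# Hypothetical example data: the first two hits match the same protein
# in frames +1 and +2, five nucleotides apart.
hits = [
    Hit("P1", 1, 1, 300),
    Hit("P1", 2, 305, 600),
    Hit("P2", 1, 900, 1200),
]
for first, second in candidate_frameshifts(hits):
    print(first.subject, first.frame, "->", second.frame)
```

In practice the threshold would need tuning, and a flagged pair could also reflect a genuine biological feature rather than a sequencing or assembly error.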

2.
3.
AIMS: To identify a Listeria welshimeri-specific gene that can be used for identification of this species by PCR. METHODS AND RESULTS: Through comparative analysis of genomic DNA from Listeria species using dot blot hybridization, an L. welshimeri-specific clone was isolated that contained a gene segment whose translated protein sequence is similar to enzyme IIBC from phosphotransferase systems in other bacteria. Using oligonucleotide primers derived from this L. welshimeri-specific clone, a 608-bp fragment was amplified from L. welshimeri genomic DNA and not from other Listeria species or other Gram-negative and Gram-positive species. CONCLUSION AND SIGNIFICANCE: The PCR employing L. welshimeri-specific primers shows promise as a useful method for differentiating L. welshimeri from other Listeria species and related bacteria.

4.
MOTIVATION: Locating protein-coding exons (CDSs) on a eukaryotic genomic DNA sequence is the initial and an essential step in predicting the functions of the genes embedded in that part of the genome. Accurate prediction of CDSs may be achieved by directly matching the DNA sequence with a known protein sequence or profile of a homologous family member(s). RESULTS: A new convention for encoding a DNA sequence into a series of 23 possible letters (translated codon or tron code) was devised to improve this type of analysis. Using this convention, a dynamic programming algorithm was developed to align a DNA sequence and a protein sequence or profile so that the spliced and translated sequence optimally matches the reference, in the same way as standard protein sequence alignment allowing for long gaps. The objective function also takes account of frameshift errors, coding potentials, and translational initiation, termination and splicing signals. This method was tested on Caenorhabditis elegans genes of known structures. The accuracy of prediction measured in terms of a correlation coefficient (CC) was about 95% at the nucleotide level for the 288 genes tested, and 97.0% for the 170 genes whose product and closest homologue share more than 30% identical amino acids. We also propose a strategy to improve the accuracy of prediction for a set of paralogous genes by means of iterative gene prediction and reconstruction of the reference profile derived from the predicted sequences. AVAILABILITY: The source code for the program 'aln' written in ANSI-C and the test data will be available via anonymous FTP at ftp.genome.ad.jp/pub/genomenet/saitama-cc. CONTACT: gotoh@cancer-c.pref.saitama.jp

5.
L H Soe, C K Shieh, S C Baker, M F Chang, M M Lai. Journal of Virology, 1987, 61(12):3968-3976
A 28-kilodalton protein has been suggested to be the amino-terminal protein cleavage product of the putative coronavirus RNA polymerase (gene A) (M.R. Denison and S. Perlman, Virology 157:565-568, 1987). To elucidate the structure and mechanism of synthesis of this protein, the nucleotide sequence of the 5' 2.0 kilobases of the coronavirus mouse hepatitis virus strain JHM genome was determined. This sequence contains a single, long open reading frame and predicts a highly basic amino-terminal region. Cell-free translation of RNAs transcribed in vitro from DNAs containing gene A sequences in pT7 vectors yielded proteins initiated from the 5'-most optimal initiation codon at position 215 from the 5' end of the genome. The sequence preceding this initiation codon predicts the presence of a stable hairpin loop structure. The presence of an RNA secondary structure at the 5' end of the RNA genome is supported by the observation that gene A sequences were more efficiently translated in vitro when upstream noncoding sequences were removed. By comparing the translation products of virion genomic RNA and in vitro transcribed RNAs, we established that our clones encompassing the 5'-end mouse hepatitis virus genomic RNA encode the 28-kilodalton N-terminal cleavage product of the gene A protein. Possible cleavage sites for this protein are proposed.

6.
7.
Recombinant and native forms of cyclohexanone monooxygenase (CMO) from Acinetobacter NCIB 9871 were analyzed by mass spectrometry to probe ambiguities arising from the presence of multiple DNA sequences for the enzyme in GenBank. A CMO gene corresponding exactly to the nucleotide sequence described by Iwaki et al. (10) was amplified from genomic DNA, cloned into pET15b, and the recombinant protein purified from a bacterial expression system. Electrospray mass spectrometry of both the recombinant material and the native form of CMO isolated from Acinetobacter yielded molecular weights within 0.01% of those predicted from the translated gene sequence of Iwaki et al. (10). Trypsin and chymotrypsin digests of native CMO, analyzed by electrospray and MALDI mass spectrometry, provided greater than 97% coverage of the protein and confirmed the presence of specific peptide sequences predicted by the Iwaki sequence alone. Therefore, the primary sequence of native Acinetobacter CMO is identical to the gene sequence for chnB deposited under accession number AB006902.

8.
9.
D Bégu, P V Graves, C Domec, G Arselin, S Litvak, A Araya. The Plant Cell, 1990, 2(12):1283-1290
RNA editing of subunit 9 of the wheat mitochondrial ATP synthase has been studied by cDNA and protein sequence analysis. Most of the cDNA clones sequenced (95%) showed that editing by C-to-U transitions occurred at eight positions in the coding region. Consequently, 5 amino acids were changed in the protein when compared with the sequence predicted from the gene. Two edited codons gave no changes (silent editing). One of the C-to-U transitions generated a stop codon by modifying the arginine codon CGA to UGA. Thus, the protein produced is 6 amino acids shorter than that deduced from the genomic sequence. Minor forms of cDNA with partial or overedited sequences were also found. Protein sequence and amino acid composition analyses confirmed the results obtained by cDNA sequencing and showed that the major form of edited atp9 mRNA is translated.
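
The stop-codon editing event described above (arginine codon CGA edited to UGA) can be illustrated with a short sketch. Only the two standard genetic code entries needed for the example are included; the function name `edit_c_to_u` is hypothetical and chosen for illustration.

```python
# The two standard genetic code entries this illustration needs.
CODON_TABLE = {"CGA": "Arg", "UGA": "Stop"}


def edit_c_to_u(codon: str, position: int) -> str:
    """Return the RNA codon with the C at a 0-based position changed to U."""
    if codon[position] != "C":
        raise ValueError("C-to-U editing requires a C at the edited position")
    return codon[:position] + "U" + codon[position + 1:]


unedited = "CGA"
edited = edit_c_to_u(unedited, 0)
print(unedited, CODON_TABLE[unedited], "->", edited, CODON_TABLE[edited])
```

Because the edited codon terminates translation, every codon downstream of it in the genomic reading frame goes untranslated, which is consistent with the shorter protein reported in the abstract.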

10.
We used the N-terminal amino acid sequence of dihydrolipoamide dehydrogenase from Haloferax volcanii to design and synthesize two oligonucleotide probes that were used to identify and clone a 4.3 kilobase pair (kbp) fragment from MboI restriction endonuclease digestion of Hf. volcanii genomic DNA. The nucleotide sequence of a 1.5-kbp region of this clone was determined and this revealed an open reading frame that translated into a protein with good homology to dihydrolipoamide dehydrogenase from other sources. The first 48 amino acids were identical with the N-terminal sequence data obtained from the purified protein. The complete primary structure of the halophilic dihydrolipoamide dehydrogenase was analyzed in terms of its homologies to dihydrolipoamide dehydrogenases from other sources and its molecular adaptations to high intracellular ionic strength.

11.
Endonexin is a 32 kDa, calcium-dependent membrane-binding protein that is one of a group of proteins that binds to chromaffin granule membranes and may regulate membrane fusion events occurring during exocytosis. In this study an oligonucleotide probe that codes for a highly conserved, repeated sequence present in this and related proteins was used to isolate a 2,048 nucleotide cDNA encoding endonexin from a bovine liver cDNA library. The translated amino acid sequence of endonexin shows the four domain structure characteristic of proteins in this class. The nucleotide sequence is 55 to 61% identical to that of the related membrane-binding proteins lipocortin, calpactin, endonexin II and (half of) 68 kDa calelectrin. Southern blot analysis of bovine genomic DNA suggests the presence of a single gene for this protein. A consensus nucleotide sequence (TCTGGGAACTTC) was identified in the 5' nontranslated portion of the endonexin mRNA that is also represented in the messages for calpactin and endonexin II.

12.
The gene for bovine interphotoreceptor retinoid-binding protein (IRBP) has been cloned, and its nucleotide sequence has been determined. The IRBP gene is about 11.6 kilobase pairs (kb) and contains four exons and three introns. It is transcribed into a large mRNA of approximately 6.4 kb and translated into a large protein of 145,000 daltons. To prove the identity of the genomic clone, we determined the protein sequence of several tryptic and cyanogen bromide fragments of purified bovine IRBP protein and localized them in the protein predicted from its nucleotide sequence. There is a 4-fold repeat structure in the protein sequence with 30-40% sequence identity and many conservative substitutions between any two of the four protein repeats. The third and fourth repeats are the most similar pair. All three of the introns in the IRBP gene fall in the fourth protein repeat. Two of the exons, the first and the fourth, are large, 3173 and 2447 bases, respectively. The introns are each about 1.5-2.2 kb long. The human IRBP gene has a sequence that is similar to one of the introns from the bovine gene. The unexpected gene structure and protein repeat structure in the bovine gene lead us to propose a model for the evolution of the IRBP gene.

13.
14.
S Fabijanski, M Pellegrini. Gene, 1982, 18(3):267-276
A Drosophila genomic DNA library in the vector Charon 4 was screened using cDNA derived from the small (6S-12S) poly(A)+ mRNA of 2-6-h-old Drosophila embryos. This fraction of mRNA is enriched for ribosomal protein-coding sequences. The selected recombinants were hybridized to total mRNA under conditions which allowed for isolation of homologous mRNAs. The mRNA from these RNA/DNA hybrids was eluted and translated in vitro. The translation products were analyzed by one- and two-dimensional electrophoresis with authentic ribosomal proteins as standards. One cloned DNA segment was found to contain a ribosomal protein gene, and a sequence which hybridizes strongly to at least 5 other ribosomal protein mRNAs.

15.
16.
The long terminal repeat (LTR) region of mouse mammary tumor virus (MMTV) is known to contain an open reading frame of sufficient length to code for a protein of 36,000 Mr. The coding capacity of the 3' sequences of MMTV genomic RNA has been demonstrated by in vitro translation studies, which have reported the synthesis of four related proteins: p36, p24, p21, and p18. These proteins are overlapping translation products of the same open reading frame, with the smaller ones initiating at internal methionine codons. From the predicted amino acid sequence of the LTR protein, we have selected a region likely to be antigenic, obtained a synthetic peptide of that region, and raised antiserum to the peptide. The antipeptide serum specifically immunoprecipitated all four proteins from in vitro translated genomic 3' MMTV RNA, plus an additional one of 32,000 Mr. Published sequence data of MMTV LTRs show an internal AUG codon at a position which could initiate a protein of 32,000 Mr. The three smaller in vitro translation products (p24, p21, and p18) were consistently synthesized in much greater amounts than the p36 or p32 protein. The relative amount of each in vitro synthesized protein from genomic MMTV RNA could be predicted and was in good agreement with the postulated effect of flanking nucleotides on the efficiency of the respective AUG initiation codon. Polyadenylated RNAs, isolated from various mouse tissues, were selected by hybridization to plasmid DNA containing MMTV LTR sequences immobilized on nitrocellulose. In vitro translation of hybrid-selected mRNAs isolated from BALB/c mouse lactating mammary glands and carcinogen-induced mammary tumors, followed by immunoprecipitation with antipeptide serum, revealed that only one polypeptide was synthesized by the MMTV LTR-specific mRNA, the 36,000 Mr species.

17.
The aceA gene from Acetobacter xylinum was identified and cloned from a genomic DNA library. The complete DNA sequence was determined and computer analysis of the translated gene sequence revealed homology with the deduced amino acid sequence of gumD from Xanthomonas campestris. Therefore aceA is likely to encode the phosphate-prenyl glucose 1-phosphate transferase catalyzing the first step in acetan biosynthesis in A. xylinum.

18.

Background  

High quality sequence alignments of RNA and DNA sequences are an important prerequisite for the comparative analysis of genomic sequence data. Nucleic acid sequences, however, exhibit a much larger sequence heterogeneity compared to their encoded protein sequences due to the redundancy of the genetic code. It is desirable, therefore, to make use of the amino acid sequence when aligning coding nucleic acid sequences. In many cases, however, only a part of the sequence of interest is translated. On the other hand, overlapping reading frames may encode multiple alternative proteins, possibly with intermittent non-coding parts. Examples are, in particular, RNA virus genomes.

19.
Protein A1 is one of the major components of mammalian ribonucleoprotein particles (hnRNP). Human protein A1 cDNA cloning and sequencing revealed the existence of at least two protein isoforms. Among the cDNAs examined, sequence differences were found both in the structural portion, leading to amino acid changes (Tyr to Phe or Arg to Lys), and in the non-translated 3'-region, where two T-stretches of different length were observed. Interestingly, one of the amino acid substitutions falls into a consensus sequence common to many RNA binding proteins. Northern blot analysis of poly A+ RNAs from five human tissues revealed two mRNA forms of 1500 and 1900 nt due to alternative polyadenylation. Analysis of genomic DNA showed at least 30 A1-specific sequences, some of which correspond to processed pseudogenes. These results suggest that protein A1 is encoded by a multigene family.

20.
We screened an expression library of the yeast form of Paracoccidioides brasiliensis with a pool of sera from patients with paracoccidioidomycosis (PCM) that had been pre-adsorbed with mycelium. A sequence (PbYmnt) was obtained and characterized. A genomic clone was obtained by PCR of P. brasiliensis total DNA. The sequence contained a single open reading frame (ORF) encoding a protein of 357 amino acid residues, with a molecular mass of 39.78 kDa. The deduced amino acid sequence exhibited identity to mannosyl- and glycosyltransferases from several sources. A DXD motif was present in the translated gene and this sequence is characteristic of the glycosyltransferases. Hydropathy analysis revealed a single transmembrane region near the amino terminus of the molecule that suggested a type II membrane protein. The PbYmnt was expressed preferentially in the yeast parasitic phase. The accession number of the nucleotide sequence of PbYmnt and its flanking regions is AF374353. A recombinant protein was generated in Escherichia coli. Our data suggest that PbYmnt encodes one member of a glycosyltransferase family of proteins and that our strategy was useful in the isolation of differentially expressed genes.
