Similar literature
 20 similar records found
1.
F Tao, C Tai, Z Liu, A Wang, Y Wang, L Li, C Gao, C Ma, P Xu. Journal of Bacteriology, 2012, 194(16): 4457-4458
Klebsiella pneumoniae LZ is a bacterium isolated from soil that can produce 1,3-propanediol from glycerol. Here we present a 5,431,750-bp assembly of its genome sequence. We annotated 9 coding sequences (CDSs) responsible for glycerol fermentation to 1,3-propanediol, 19 CDSs encoding glycerol utilization, and 134 CDSs related to its virulence and defense.

2.
MOTIVATION: Overlapping gene coding sequences (CDSs) are particularly common in viruses but also occur in more complex genomes. Detecting such genes with conventional gene-finding algorithms can be difficult for several reasons. If an overlapping CDS is on the same read-strand as a known CDS, then there may not be a distinct promoter or mRNA. Furthermore, the constraints imposed by double-coding can result in atypical codon biases. However, these same constraints lead to particular mutation patterns that may be detectable in sequence alignments. RESULTS: In this paper, we investigate several statistics for detecting double-coding sequences with pairwise alignments, including a new maximum-likelihood method. We also develop a model for double-coding sequence evolution. Using simulated sequences generated with the model, we characterize the distribution of each statistic as a function of sequence composition, length, divergence time and double-coding frame. Using these results, we develop several algorithms for detecting overlapping CDSs. The algorithms were tested on known overlapping CDSs and other overlapping open reading frames (ORFs) in the hepatitis B virus (HBV), Escherichia coli and Salmonella typhimurium genomes. The algorithms should prove useful for detecting novel overlapping genes, especially short coding ORFs in viruses. AVAILABILITY: Programs may be obtained from the authors. SUPPLEMENTARY INFORMATION: http://biochem.otago.ac.nz/double.html.
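
To make the notion of an overlapping ORF concrete, the following is a minimal sketch that scans the three forward reading frames of a sequence for ATG-to-stop ORFs and reports pairs that overlap in different frames. It is not the statistics or the maximum-likelihood method described in the abstract; the function names, the `min_codons` threshold, and the toy sequence are assumptions made for illustration.

```python
from itertools import combinations

STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG..stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    codons from the ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ORF
    return orfs

def overlapping_pairs(orfs):
    """Return pairs of ORFs in different frames whose coordinates overlap."""
    return [
        (a, b)
        for a, b in combinations(orfs, 2)
        if a[2] != b[2] and a[0] < b[1] and b[0] < a[1]
    ]

# Invented toy sequence: one ORF in frame 0 and a shorter one nested in frame 2.
toy = "ATGGCATGCCCAAATGACTAA"
orfs = find_orfs(toy)
pairs = overlapping_pairs(orfs)
```

A real analysis would also scan the reverse strand and, as the abstract notes, would need evolutionary evidence from alignments to decide whether an overlapping ORF is actually coding.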

3.
Our goal was to identify evolutionarily conserved frame transitions in protein coding regions and to uncover the underlying functional role of these structural aberrations. We used the ab initio frameshift prediction program, GeneTack, to detect reading frame transitions in 206 991 genes (fs-genes) from 1106 complete prokaryotic genomes. We grouped 102 731 fs-genes into 19 430 clusters based on sequence similarity between protein products (fs-proteins) as well as conservation of the predicted position of the frameshift and its direction. We identified 4010 pseudogene clusters and 146 clusters of fs-genes apparently using recoding (local deviation from using the standard genetic code) due to possessing specific sequence motifs near frameshift positions. Particularly interesting was the finding of a novel type of organization of the dnaX gene, where recoding is required for synthesis of the longer subunit, τ. We selected 20 clusters of predicted recoding candidates and designed a series of genetic constructs with a reporter gene or affinity tag whose expression would require a frameshift event. Expression of the constructs in Escherichia coli demonstrated enrichment of the set of candidates with sequences that trigger genuine programmed ribosomal frameshifting; we have experimentally confirmed four new families of programmed frameshifts.

4.
Ribosomes can be programmed to shift from one reading frame to another during translation. Hepatitis C virus (HCV) uses such a mechanism to produce F protein from the -2/+1 reading frame. We now report that the HCV frameshift signal can mediate the synthesis of the core protein of the zero frame, the F protein of the -2/+1 frame, and a 1.5-kDa protein of the -1/+2 frame. This triple decoding function does not require sequences flanking the frameshift signal and is apparently independent of membranes and the synthesis of the HCV polyprotein. Two consensus -1 frameshift sequences in the HCV type 1 frameshift signal facilitate ribosomal frameshifts into both overlapping reading frames. A sequence which is located immediately downstream of the frameshift signal and has the potential to form a double stem-loop structure can significantly enhance translational frameshifting in the presence of the peptidyl-transferase inhibitor puromycin. Based on these results, a model is proposed to explain the triple decoding activities of the HCV ribosomal frameshift signal.
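
The relationship between the zero frame and the -1/+2 frame can be illustrated with a short sketch that translates a sequence in each forward frame and then simulates a single -1 slip. The sequence below is an invented toy, not the HCV frameshift signal, and `translate_with_shift` is a hypothetical helper that ignores everything about how real signals stimulate slippage.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

def translate(seq, frame=0):
    """Translate every complete codon of `seq` starting at offset `frame`."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3)
    )

def translate_with_shift(seq, shift_at, shift=-1):
    """Translate in frame 0 up to nucleotide index `shift_at` (a multiple
    of 3), then slip by `shift` nucleotides and keep translating.

    shift=-1 re-reads one nucleotide, which places the ribosome in the
    -1/+2 frame relative to the original frame.
    """
    head = translate(seq[:shift_at])
    tail = translate(seq[shift_at + shift:])
    return head + tail

toy = "ATGAAAAAACGTTGA"             # invented example, frame 0 reads M K K R stop
frame0 = translate(toy, 0)
frame1 = translate(toy, 1)
frame2 = translate(toy, 2)
shifted = translate_with_shift(toy, shift_at=9, shift=-1)
```

After the slip, the tail of `shifted` matches the tail of the frame 2 translation, which is the sense in which a -1 frameshift and the +2 frame describe the same reading frame.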

5.
6.
7.
F Tao, X Wang, C Ma, C Yang, H Tang, Z Gai, P Xu. Journal of Bacteriology, 2012, 194(17): 4755-4756
Xanthomonas campestris JX, a soil bacterium, is an industrially productive strain for xanthan gum. Here we present a 5.0-Mb assembly of its genome sequence. We have annotated 12 coding sequences (CDSs) responsible for xanthan gum biosynthesis, 346 CDSs encoding carbohydrate metabolism, and 69 CDSs related to virulence, defense, and plant disease.

8.
F Tao, H Tang, Z Gai, F Su, X Wang, X He, P Xu. Journal of Bacteriology, 2011, 193(24): 7011-7012
Pseudomonas putida Idaho is an organic-solvent-tolerant strain which can degrade and adapt to high concentrations of organic solvents. Here, we announce its first draft genome sequence (6,363,067 bp). We annotated 192 coding sequences (CDSs) responsible for aromatic compound metabolism, 40 CDSs encoding phospholipid synthesis, and 212 CDSs related to stress response.

9.

Background  

Detecting new coding sequences (CDSs) in viral genomes can be difficult for several reasons. The typically compact genomes often contain a number of overlapping coding and non-coding functional elements, which can result in unusual patterns of codon usage; conservation between related sequences can be difficult to interpret, especially within overlapping genes; and viruses often employ non-canonical translational mechanisms (e.g. frameshifting, stop codon read-through, leaky-scanning and internal ribosome entry sites), which can conceal potentially coding open reading frames (ORFs).

10.
11.
12.
13.
M D Ryan, J Drew. The EMBO Journal, 1994, 13(4): 928-933
We describe the construction of a plasmid (pCAT2AGUS) encoding a polyprotein in which a 19 amino acid sequence spanning the 2A region of the foot-and-mouth disease virus (FMDV) polyprotein was inserted between the reporter genes chloramphenicol acetyl transferase (CAT) and beta-glucuronidase (GUS) maintaining a single, long open reading frame. Analysis of translation reactions programmed by this construct showed that the inserted FMDV sequence functioned in a manner similar to that observed in FMDV polyprotein processing: the CAT2AGUS polyprotein underwent a cotranslational, apparently autoproteolytic, cleavage yielding CAT-2A and GUS. Analysis of translation products derived from a series of constructs in which sequences were progressively deleted from the N-terminal region of the FMDV 2A insertion showed that cleavage required a minimum of 13 residues. The FMDV 2A sequence therefore provides the opportunity to engineer either whole proteins or domains such that they are cleaved apart cotranslationally with high efficiency.

14.
Erwinia amylovora causes the economically important disease fire blight that affects rosaceous plants, especially pear and apple. Here we report the complete genome sequence and annotation of strain ATCC 49946. The analysis of the sequence and its comparison with sequenced genomes of closely related enterobacteria revealed signs of pathoadaptation to rosaceous hosts.

Erwinia amylovora, a plant-associated member of the Enterobacteriaceae, causes fire blight, a devastating disease of rosaceous plants, especially pear and apple (6). The complete genome of Ea273 (ATCC 49946), a virulent strain isolated from an infected apple tree in New York State, was sequenced. Total DNA was extracted and prepared in pMAQ1 shotgun libraries. The complete shotgun sequence was obtained by using dye terminator chemistry in ABI 3730 automated sequencers and contains 88,457 reads (11.12-fold coverage), yielding a theoretical coverage of the genome of 99.99%. The sequence was assembled, finished, and annotated as described previously (1, 5), using Artemis (4) to collate data and facilitate annotation.

The genome of E. amylovora consists of a circular chromosome of 3,805,874 bp and two plasmids, AMYP1 (28,243 bp) and AMYP2 (71,487 bp). Coding regions in the chromosome account for 85.1% of the total sequence, with 3,483 identified coding sequences (CDSs). Two hundred fifty-four (7%) of the CDSs do not have any matches in current NCBI databases; 114 (3.3%) correspond to conserved hypothetical proteins. Forty-nine CDSs (1.4%) are similar to genes from mobile elements such as integrases, transposases, and bacteriophages, and 110 CDSs (3.2%) were classified as pseudogenes due to interruptions or truncations of the CDSs. The remaining 2,956 annotated CDSs include, among other categories, genes involved in biosynthesis of the cellular envelope and modifications of surface proteins (299 CDSs [11%]) and genes involved in signal transduction and regulation (228 CDSs [8%]). Seven rRNA operons and 78 tRNA sequences were identified in the chromosome; two new clusters were identified (AMY1550-1575 and AMY2648-2676) that resemble the T3SS-encoding SSR-1 island of Sodalis glossinidius (2), and four clusters that contain genes for biosynthesis of flagella, which based on their location might be regulated independently.

The smaller plasmid, AMYP1, had been reported as pEA29 (3); its sequence is nearly identical to the one reported here. The larger plasmid, AMYP2, renamed pEA72 for consistency in nomenclature, contains 87 predicted CDSs, with two predicted mobile-element-related CDSs and one pseudogene. Among the CDSs with annotated functions are a cluster of genes (AMYP2_49 to AMYP2_62) that encode a putative type IV fimbrial system (pil genes).

The genome of E. amylovora is only 3.8 Mb long, whereas most free-living enterobacteria, including plant pathogens, have genomes of 4.5 Mb to 5.5 Mb. Comparison of the genome of Ea273 with the sequenced genomes of 15 closely related enterobacteria identified 21 lineage-specific regions, which might be considered genomic islands. E. amylovora has many more predicted pseudogenes, relative to other enterobacteria with similar lifestyles. Given its size and the preponderance of pseudogenes, genome reduction may have occurred via mutational inactivation and subsequent deletion with the following consequences: E. amylovora has fewer genes involved in anaerobic respiration and fermentation than are found in typical related enterobacteria; this likely results in a reduced capacity to live in anaerobic environments.

The genome sequence of E. amylovora has revealed clear signs of pathoadaptation to the rosaceous plant environment. For example, T3SS-related proteins are present that are more similar to proteins of other plant pathogens than to proteins of closely related enterobacteria. These include type III effectors, homologous to those of plant-pathogenic pseudomonads, which confer virulence to E. amylovora in plants, and a sorbitol-metabolizing cluster that may confer a competitive advantage for survival in rosaceous plants. The reduced genome size and erosion or loss of genes involved in anaerobic respiration and nitrate assimilation are remarkable, relative to other plant- and animal-pathogenic members of the Enterobacteriaceae.
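
The CDS tallies quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and the mean read length is a rough estimate that assumes the 11.12-fold coverage refers to the chromosome plus both plasmids, which the abstract does not state.

```python
# Figures as quoted in the abstract.
chromosome_bp = 3_805_874
plasmids_bp = {"AMYP1": 28_243, "AMYP2": 71_487}
total_cds = 3_483
categories = {
    "no database match": 254,
    "conserved hypothetical": 114,
    "mobile-element related": 49,
    "pseudogenes": 110,
}

# CDSs left after removing the four categories above.
remaining_cds = total_cds - sum(categories.values())

# Share of each category among all chromosomal CDSs, in percent.
percentages = {
    name: round(100 * count / total_cds, 1)
    for name, count in categories.items()
}

# Rough mean read length, ASSUMING coverage was computed over the
# chromosome plus both plasmids (an assumption, not stated in the text).
genome_bp = chromosome_bp + sum(plasmids_bp.values())
mean_read_bp = 11.12 * genome_bp / 88_457
```

The remainder comes out at 2,956 CDSs and the four shares at 7.3%, 3.3%, 1.4% and 3.2%, in line with the rounded values quoted in the abstract.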

15.
Pseudomonas putida strain S12, a well-studied solvent-tolerant bacterium, is considered a platform strain for the production of many chemicals. Here, we present a 6.28-Mb assembly of its genome sequence. We have annotated 32 coding sequences (CDSs) encoding efflux systems of organic compounds and 195 CDSs responsible for the metabolism of aromatic compounds.

16.
17.
An "integrated model" of programmed ribosomal frameshifting
Many viral mRNAs, including those of HIV-1, can make translating ribosomes change reading frame. Altering the efficiencies of programmed ribosomal frameshift (PRF) inhibits viral propagation. Because PRF is a new target for potential antiviral agents, it is important to understand how it is controlled. Incorporation of the current models describing PRF into the context of the translation elongation cycle leads us to propose an 'integrated model' of PRF both as a guide towards further characterization of PRF at the molecular and biochemical levels, and for the identification of new targets for antiviral therapeutics.

18.
The gene-finding programs developed so far have not paid much attention to the detection of short protein coding regions (CDSs). However, the detection of short CDSs is important for the study of photosynthesis. We utilized GeneHacker, a gene-finding program based on the hidden Markov model (HMM), to detect short CDSs (from 90 to 300 bases) in a 1.0-Mb contiguous sequence of cyanobacterium Synechocystis sp. strain PCC6803 which carries a complete set of genes for oxygenic photosynthesis. GeneHacker differs from other gene-finding programs based on the HMM in that it utilizes di-codon statistics as well. GeneHacker successfully detected seven out of the eight short CDSs annotated in this sequence and was clearly superior to GeneMark in this range of length. GeneHacker detected 94 potentially new CDSs, 9 of which have counterparts in the genetic databases. Four of the nine CDSs were less than 150 bases and were photosynthesis-related genes. The results show the effectiveness of GeneHacker in detecting very short CDSs corresponding to genes.
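
As a rough illustration of what "di-codon statistics" means, the sketch below counts adjacent codon pairs in an in-frame coding sequence and converts the counts to relative frequencies. It is not GeneHacker's HMM and makes no claim about how that program uses such statistics; the function names and the toy sequence are assumptions.

```python
from collections import Counter

def dicodon_counts(cds):
    """Count adjacent codon pairs (di-codons) in an in-frame sequence.

    Trailing nucleotides that do not form a complete codon are ignored.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    return Counter(a + b for a, b in zip(codons, codons[1:]))

def dicodon_frequencies(cds):
    """Return relative di-codon frequencies (empty dict if fewer than 2 codons)."""
    counts = dicodon_counts(cds)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}

# Invented toy sequence: ATG AAA AAA AAA TGA
toy = "ATGAAAAAAAAATGA"
counts = dicodon_counts(toy)
freqs = dicodon_frequencies(toy)
```

In a gene finder, frequencies of this kind would be estimated from known genes and compared against a background model; for very short CDSs the number of di-codons observed is small, which is part of what makes them hard to detect.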

19.
Sphingomonas sp. strain ATCC 31555 can produce an anionic heteropolysaccharide, welan gum, which shows excellent stability and viscosity retention even at high temperatures. Here we present a 4.0-Mb assembly of its genome sequence. We have annotated 10 coding sequences (CDSs) responsible for welan gum biosynthesis and 55 CDSs related to monosaccharide metabolism.

20.
Phytophthora infestans is a devastating phytopathogenic oomycete that causes late blight on tomato and potato. Recent genome sequencing efforts of P. infestans and other Phytophthora species are generating vast amounts of sequence data providing opportunities to unlock the complex nature of pathogenesis. However, accurate annotation of Phytophthora genomes will be a significant challenge. Most of the information about gene structure in these species was gathered from a handful of genes resulting in significant limitations for development of ab initio gene-calling programs. In this study, we collected a total of 150 bioinformatically determined near full-length cDNA (FLcDNA) sequences of P. infestans that were predicted to contain full open reading frame sequences. We performed detailed computational analyses of these FLcDNA sequences to obtain a snapshot of P. infestans gene structure, gauge the degree of sequence conservation between P. infestans genes and those of Phytophthora sojae and Phytophthora ramorum, and identify patterns of gene conservation between P. infestans and various eukaryotes, particularly fungi, for which genome-wide translated protein sequences are available. These analyses helped us to define the structural characteristics of P. infestans genes using a validated data set. We also determined the degree of sequence conservation within the genus Phytophthora and identified a set of fast evolving genes. Finally, we identified a set of genes that are shared between Phytophthora and fungal phytopathogens but absent in animal fungal pathogens. These results confirm that plant pathogenic oomycetes and fungi share virulence components, and suggest that eukaryotic microbial pathogens that share similar lifestyles also share a similar set of genes independently of their phylogenetic relatedness.
