Similar literature
 Found 20 similar articles; search took 31 ms
1.
2.
Overlapping of genes, especially in an anti-parallel fashion, is quite rare in eukaryotic genomes. We have found a rare instance of exon overlapping involving the CHRNE and MINK gene loci on chromosome 17 in humans. CHRNE codes for the ε subunit of the nicotinic acetylcholine receptor (AChR), whereas MINK encodes a serine/threonine kinase belonging to the GCK family. To elucidate the evolutionary trail of this gene overlapping event, we examined the genomes of a number of primates and found that mutations in the polyadenylation signal of the CHRNE gene in early hominoids led to the overlap. Upon extending this analysis to genomes of other orders of placental mammals, we observed that the overlapping occurred at least three times independently during the course of mammalian evolution. Because CHRNE and MINK are differentially expressed, the potentially hazardous mutations responsible for the exon overlap seem to have escaped evolutionary pressures by differential temporo-spatial expression of the two genes.

3.

Background  

Overlapping genes (OGs) are defined as adjacent genes whose coding sequences overlap partially or entirely. They are ubiquitous in microbial genomes and more conserved between species than non-overlapping genes. Based on this property, we previously implemented a web server, named OGtree, that allows the user to reconstruct genome trees of some prokaryotes according to their pairwise OG distances. By analogy to the analyses of gene content and gene order, the OG distance we defined between two genomes was based on a measure combining OG content (i.e., the normalized number of shared orthologous OG pairs) and OG order (i.e., the normalized OG breakpoint distance) in their whole genomes. A shortcoming of using the concept of breakpoints to define the OG distance is that it cannot be applied to multi-chromosomal genomes. In addition, the amount of overlapping coding sequence between some distantly related prokaryotic genomes may be so limited that it is hard to find enough OGs to properly evaluate their pairwise OG distances.
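The abstract describes the OG distance only in words, so the following is a minimal sketch under stated assumptions rather than the OGtree formula. It assumes each genome is a single chromosome given as an ordered list of unique orthologous OG-pair identifiers, that the content term is one minus the shared fraction of the smaller genome, that the order term counts unsigned adjacency breakpoints among shared OGs, and that the two terms are combined by a hypothetical `weight`. All function names are illustrative.

```python
from typing import List


def content_distance(a: List[str], b: List[str]) -> float:
    """Assumed content term: 1 - (shared OG pairs / size of the smaller genome)."""
    shared = set(a) & set(b)
    smaller = min(len(set(a)), len(set(b)))
    return 1.0 - len(shared) / smaller if smaller else 1.0


def breakpoint_distance(a: List[str], b: List[str]) -> float:
    """Assumed order term: fraction of adjacencies in `a` (restricted to
    shared OGs) that are not adjacent in `b`, ignoring orientation."""
    shared = set(a) & set(b)
    a_shared = [og for og in a if og in shared]
    b_shared = [og for og in b if og in shared]
    if len(a_shared) < 2:
        return 0.0
    adjacent_in_b = {frozenset(pair) for pair in zip(b_shared, b_shared[1:])}
    breaks = sum(
        1 for pair in zip(a_shared, a_shared[1:])
        if frozenset(pair) not in adjacent_in_b
    )
    return breaks / (len(a_shared) - 1)


def og_distance(a: List[str], b: List[str], weight: float = 0.5) -> float:
    """Hypothetical weighted combination of the content and order terms."""
    return weight * content_distance(a, b) + (1 - weight) * breakpoint_distance(a, b)


# Toy, made-up genomes (ordered OG identifiers).
genome_a = ["og1", "og2", "og3", "og4"]
genome_b = ["og1", "og3", "og2", "og5"]
distance = og_distance(genome_a, genome_b)
```

Because the order term walks along one linear sequence per genome, the sketch also illustrates the shortcoming noted above: it has no natural extension to genomes split across several chromosomes.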

4.
We present a bacterial genome computational analysis pipeline, called GenVar. The pipeline, based on the program GeneWise, is designed to analyze an annotated genome and automatically identify missed gene calls and sequence variants such as genes with disrupted reading frames (split genes) and those with insertions and deletions (indels). For a given genome to be analyzed, GenVar relies on a database containing closely related genomes (such as other species or strains) as well as a few additional reference genomes. GenVar also helps identify gene disruptions probably caused by sequencing errors. We exemplify GenVar's capabilities by presenting results from the analysis of four Brucella genomes. Brucella is an important human pathogen and zoonotic agent. The analysis revealed hundreds of missed gene calls, new split genes and indels, several of which are species specific and hence provide valuable clues to the genomic basis of Brucella pathogenicity and host specificity.

5.
Biodiversity estimates based on ribosomal operon sequence diversity rely on the premise that a sequence is characteristic of a single specific taxon or operational taxonomic unit (OTU). Here, we have studied the sequence diversity of 14 ribosomal RNA operons (rrn) contained in the genomes of two isolates (five operons in each genome) and four metagenomic fosmids, all from the same seawater sample. Complete sequencing of the isolate genomes and the fosmids establishes that they represent strains of the same species, Alteromonas macleodii, with average nucleotide identity (ANI) values >97%. Nonetheless, we observed high levels of intragenomic heterogeneity (i.e., variability between operons of a single genome) affecting multiple regions of the 16S and 23S rRNA genes as well as the internal transcribed spacer 1 (ITS-1) region. Furthermore, the ribosomal operons exhibited intergenomic heterogeneity (i.e., variability between operons located in separate genomes) in each of these regions, compounding the variability. Our data reveal the extensive heterogeneity observed in natural populations of A. macleodii at a single point in time and support the idea that distinct lineages of A. macleodii exist in the deep Mediterranean. These findings highlight the potential of rRNA fingerprinting methods to misrepresent species diversity while simultaneously failing to recognize the ecological significance of individual strains.

6.
Feng Y, Chen Z, Liu SL. PLoS One. 2011;6(11):e27754

Background

Many facultative bacterial pathogens have undergone extensive gene decay processes, possibly due to lack of selection pressure during evolutionary conversion from free-living to intracellular lifestyle. Shigella, the causative agents of human shigellosis, have arisen from different E. coli-like ancestors independently by convergent paths. As these bacteria all have lost large numbers of genes by mutation or deletion, they can be used as ideal models for systematically studying the process of gene function loss in different bacteria living under similar selection pressures.

Methodologies/Principal Findings

We compared the sequenced Shigella genomes and re-defined decayed genes (pseudogenes plus deleted genes) in these bacteria. Altogether, 85 genes are decayed in all five analyzed Shigella strains and 1456 genes are decayed in at least one Shigella strain. Genes coding for carbon utilization, cell motility, transporter or membrane proteins are prone to inactivation. Decayed genes tend to concentrate in certain operons rather than being evenly distributed across the whole genome. Genes in the decayed operons accumulated more non-synonymous mutations than the remaining genes and also have lower expression levels.

Conclusions

Different Shigella lineages underwent convergent gene decay processes, and inactivation of one gene would reduce the selection pressure on the other genes in the same operon. The pool of superfluous genes for Shigella may contain at least two thousand genes, and the gene decay processes may continue in Shigella until a minimum genome harboring only essential genes is reached.

7.
Chiu CH, Tang P, Chu C, Hu S, Bao Q, Yu J, Chou YY, Wang HS, Lee YS. Nucleic Acids Research. 2005;33(5):1690-1698
Salmonella enterica serovar Choleraesuis (S. Choleraesuis), a highly invasive serovar among non-typhoidal Salmonella, usually causes sepsis or extra-intestinal focal infections in humans. S. Choleraesuis infections have now become particularly difficult to treat because of the emergence of resistance to multiple antimicrobial agents. The 4.7 Mb genome sequence of a multidrug-resistant S. Choleraesuis strain, SC-B67, was determined. Genome-wide comparison of three sequenced Salmonella genomes revealed that more deletion events occurred in S. Choleraesuis SC-B67 and S. Typhi CT18 relative to S. Typhimurium LT2. S. Choleraesuis has 151 pseudogenes, which, among the three Salmonella genomes, include the highest percentage of pseudogenes arising from the genes involved in bacterial chemotaxis signal-transduction pathways. Mutations in these genes may increase smooth swimming of the bacteria, potentially allowing more effective interactions with and invasion of host cells to occur. A key regulatory gene of the TetR/AcrR family, acrR, was inactivated through the introduction of an internal stop codon, resulting in overexpression of AcrAB that appears to be associated with ciprofloxacin resistance. While lateral gene transfer providing basic functions to allow niche expansion in the host and environment is maintained during the evolution of different serovars of Salmonella, genes providing little overall selective benefit may be lost rapidly. Our findings suggest that the formation of pseudogenes may provide a simple evolutionary pathway that complements gene acquisition to enhance virulence and antimicrobial resistance in S. Choleraesuis.

8.
9.
Microorganisms have evolved to occupy certain environmental niches, and the metabolic genes essential for growth in these locations are retained in the genomes. Many microorganisms inhabit niches located in the human body, sometimes causing disease, and may retain genes essential for growth in locations such as the bloodstream and urinary tract, or for growth during intracellular invasion of the host's macrophages. Strains of Escherichia coli (E. coli) and Salmonella spp. are thought to have evolved over 100 million years from a common ancestor, and now cause disease in specific niches within humans. Here we have used a genome-scale metabolic model representing the pangenome of E. coli, which contains all metabolic reactions encoded by genes from 16 E. coli genomes, and have simulated environmental conditions found in the human bloodstream, urinary tract, and macrophage to determine essential metabolic genes needed for growth in each location. We compared the predicted essential genes for three E. coli strains and one Salmonella strain that cause disease in each host environment, and determined that essential gene retention could be accurately predicted using this approach. This project demonstrated that simulating human body environments such as the bloodstream can lead to accurate computational predictions of essential or important genes.

10.
11.
12.
13.
The level of sequence heterogeneity among rrn operons within genomes determines the accuracy of diversity estimation by 16S rRNA-based methods. Furthermore, the occurrence of widespread horizontal gene transfer (HGT) between distantly related rrn operons casts doubt on reconstructions of phylogenetic relationships. For this study, patterns of distribution of rrn copy numbers, interoperonic divergence, and redundancy of 16S rRNA sequences were evaluated. Bacterial genomes display up to 15 operons and operon numbers up to 7 are commonly found, but ~40% of the organisms analyzed have either one or two operons. Among the Archaea, a single operon appears to dominate and the highest number of operons is five. About 40% of sequences among 380 operons in 76 bacterial genomes with multiple operons were identical to at least one other 16S rRNA sequence in the same genome, and in 38% of the genomes all 16S rRNAs were invariant. For Archaea, the number of identical operons was only 25%, but only five genomes with 21 operons are currently available. These considerations suggest an upper bound of roughly threefold overestimation of bacterial diversity resulting from cloning and sequencing of 16S rRNA genes from the environment; however, the inclusion of genomes with a single rrn operon may lower this correction factor to ~2.5. Divergence among operons appears to be small overall for both Bacteria and Archaea, with the vast majority of 16S rRNA sequences showing <1% nucleotide differences. Only five genomes with operons with a higher level of nucleotide divergence were detected, and Thermoanaerobacter tengcongensis exhibited the highest level of divergence (11.6%) noted to date. Overall, four of the five extreme cases of operon differences occurred among thermophilic bacteria, suggesting a much higher incidence of HGT in these bacteria than in other groups.
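The two redundancy figures quoted above (the share of 16S rRNA sequences identical to another copy in the same genome, and the share of genomes whose copies are all invariant) can be illustrated with a small sketch. It assumes each genome is supplied as a list of 16S sequence strings and that "identical" means exact string equality; the function name and the toy sequences are hypothetical and are not data from the study.

```python
from collections import Counter
from typing import Dict, List, Tuple


def rrn_redundancy(genomes: Dict[str, List[str]]) -> Tuple[float, float]:
    """Return (fraction of 16S copies identical to another copy in the
    same genome, fraction of genomes in which all copies are identical)."""
    total_copies = 0
    redundant_copies = 0
    invariant_genomes = 0
    for sequences in genomes.values():
        counts = Counter(sequences)
        total_copies += len(sequences)
        # Every copy of a sequence seen more than once has an identical partner.
        redundant_copies += sum(n for n in counts.values() if n > 1)
        # A multi-operon genome is invariant if only one distinct sequence occurs.
        if len(sequences) > 1 and len(counts) == 1:
            invariant_genomes += 1
    if total_copies == 0:
        return 0.0, 0.0
    return redundant_copies / total_copies, invariant_genomes / len(genomes)


# Toy, made-up input: two genomes with multiple rrn operons each.
toy_genomes = {
    "genome_1": ["ACGT", "ACGT", "ACGA"],
    "genome_2": ["TTGA", "TTGA"],
}
redundant_fraction, invariant_fraction = rrn_redundancy(toy_genomes)
```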

14.
The genomes of three strains of Listeria monocytogenes that have been associated with food-borne illness in the USA were subjected to whole genome comparative analysis. A total of 51, 97 and 69 strain-specific genes were identified in L. monocytogenes strains F2365 (serotype 4b, cheese isolate), F6854 (serotype 1/2a, frankfurter isolate) and H7858 (serotype 4b, meat isolate), respectively. Eighty-three genes were restricted to serotype 1/2a and 51 to serotype 4b strains. These strain- and serotype-specific genes probably contribute to observed differences in pathogenicity, and the ability of the organisms to survive and grow in their respective environmental niches. The serotype 1/2a-specific genes include an operon that encodes the rhamnose biosynthetic pathway that is associated with teichoic acid biosynthesis, as well as operons for five glycosyl transferases and an adenine-specific DNA methyltransferase. A total of 8,603 and 105,050 high quality single nucleotide polymorphisms (SNPs) were found on the draft genome sequences of strain H7858 and strain F6854, respectively, when compared with strain F2365. Whole genome comparative analyses revealed that the L. monocytogenes genomes are essentially syntenic, with the majority of genomic differences consisting of phage insertions, transposable elements and SNPs.

15.
Salmonella Newport has ranked in the top three Salmonella serotypes associated with foodborne outbreaks from 1995 to 2011 in the United States. In the current study, we selected 26 S. Newport strains isolated from diverse sources and geographic locations and then conducted 454 shotgun pyrosequencing procedures to obtain 16–24× coverage of high quality draft genomes for each strain. Comparative genomic analysis of 28 S. Newport strains (including 2 reference genomes) and 15 outgroup genomes identified more than 140,000 informative SNPs. A resulting phylogenetic tree consisted of four sublineages and indicated that S. Newport had a clear geographic structure. Strains from Asia were divergent from those from the Americas. Our findings demonstrated that analysis using whole genome sequencing data resulted in a more accurate picture of phylogeny compared to that using single genes or small sets of genes. We selected loci around the mutS gene of S. Newport to differentiate distinct lineages, including those between invH and mutS genes at the 3′ end of Salmonella Pathogenicity Island 1 (SPI-1), ste fimbrial operon, and Clustered, Regularly Interspaced, Short Palindromic Repeats (CRISPR) associated-proteins (cas). These genes in the outgroup genomes held high similarity with either S. Newport Lineage II or III at the same loci. S. Newport Lineages II and III have different evolutionary histories in this region and our data demonstrated genetic flow and homologous recombination events around mutS. The findings suggested that S. Newport Lineages II and III diverged early in the serotype evolution and have evolved largely independently. Moreover, we identified genes that could delineate sublineages within the phylogenetic tree and that could be used as potential biomarkers for trace-back investigations during outbreaks. Thus, whole genome sequencing data enabled us to better understand the genetic background of pathogenicity and evolutionary history of S. Newport and also provided additional markers for epidemiological response.

16.
The immunodominant lipopolysaccharide is a key antigenic factor for Gram-negative pathogens such as salmonellae where it plays key roles in host adaptation, virulence, immune evasion, and persistence. Variation in the lipopolysaccharide is also the major differentiating factor that is used to classify Salmonella into over 2600 serovars as part of the Kaufmann-White scheme. While lipopolysaccharide diversity is generally associated with sequence variation in the lipopolysaccharide biosynthesis operon, extraneous genetic factors such as those encoded by the glucosyltransferase (gtr) operons provide further structural heterogeneity by adding additional sugars onto the O-antigen component of the lipopolysaccharide. Here we identify and examine the O-antigen modifying glucosyltransferase genes from the genomes of Salmonella enterica and Salmonella bongori serovars. We show that Salmonella generally carries between 1 and 4 gtr operons that we have classified into 10 families on the basis of gtrC sequence with apparent O-antigen modification detected for five of these families. The gtr operons localize to bacteriophage-associated genomic regions and exhibit a dynamic evolutionary history driven by recombination and gene shuffling events leading to new gene combinations. Furthermore, evidence of Dam- and OxyR-dependent phase variation of gtr gene expression was identified within eight gtr families. Thus, as O-antigen modification generates significant intra- and inter-strain phenotypic diversity, gtr-mediated modification is fundamental in assessing Salmonella strain variability. This will inform appropriate vaccine and diagnostic approaches, in addition to contributing to our understanding of host-pathogen interactions.

17.

Background

Frankia is a genus of soil actinobacteria forming nitrogen-fixing root-nodule symbiotic relationships with non-leguminous woody plant species, collectively called actinorhizals, from eight dicotyledonous families. Frankia strains are classified into four host-specificity groups (HSGs), each of which exhibits a distinct host range. Genome sizes of representative strains of Alnus, Casuarina, and Elaeagnus HSGs are highly diverged and are positively correlated with the size of their host ranges.

Results

The content and size of 12 Frankia genomes were investigated by in silico comparative genome hybridization and pulsed-field gel electrophoresis, respectively. Data were collected from four query strains of each HSG and compared with those of reference strains possessing completely sequenced genomes. The degree of difference in genome content between query and reference strains varied depending on HSG. Elaeagnus query strains were missing the greatest number (22–32%) of genes compared with the corresponding reference genome; Casuarina query strains lacked the fewest (0–4%), with Alnus query strains intermediate (14–18%). In spite of the remarkable gene loss, genome sizes of Alnus and Elaeagnus query strains were larger than would be expected based on total length of the absent genes. In contrast, Casuarina query strains had smaller genomes than expected.

Conclusions

The positive correlation between genome size and host range held true across all investigated strains, supporting the hypothesis that size and genome content differences are responsible for observed diversity in host plants and host plant biogeography among Frankia strains. In addition, our results suggest that different dynamics of shuffling of genome content have contributed to these symbiotic and biogeographic adaptations. Elaeagnus strains, and to a lesser extent Alnus strains, have gained and lost many genes to adapt to a wide range of environments and host plants. Conversely, rather than acquiring new genes, Casuarina strains have discarded genes to reduce genome size, suggesting an evolutionary orientation towards existence as specialist symbionts.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-609) contains supplementary material, which is available to authorized users.

18.
Almost 50 years following the discovery of the prokaryotic operon, the functional relevance of gene order within operons remains unclear. In this work, we take advantage of the eroded genome of Mycobacterium leprae to add evidence supporting the notion that functionally less important genes have a tendency to be located at the end of its operons. M. leprae’s genome includes 1133 pseudogenes and 1614 protein-coding genes and can be compared with the closely related genome of M. tuberculosis. Assuming M. leprae’s pseudogenes to represent dispensable genes, we have studied the position of these pseudogenes in the operons of M. leprae and of their orthologs in M. tuberculosis. We observed that both tend to be located in the 3′ (downstream) half of the operon (P-values of 0.03 and 0.18, respectively). Analysis of pseudogenes in all available prokaryotic genomes confirms this trend (P-value of 7.1 × 10^-7). In a complementary analysis, we found a significant tendency for essential genes to be located in the 5′ (upstream) half of the operon (P-value of 0.006). Our work provides an indication that, in prokaryotes, functionally less important genes have a tendency to be located at the end of operons, while more relevant genes tend to be located toward operon starts.
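The abstract does not say which statistical test produced these P-values. As one plausible illustration only, the sketch below assigns each gene to the 5′ or 3′ half of its operon and applies a one-sided exact binomial test against the null hypothesis that both halves are equally likely. The function names, the rule that drops the middle gene of odd-length operons, and the toy positions are all assumptions.

```python
from math import comb
from typing import List, Optional, Tuple


def operon_half(position: int, operon_length: int) -> Optional[str]:
    """Classify a 1-based gene position as "5'" or "3'".
    The middle gene of an odd-length operon belongs to neither half."""
    midpoint = (operon_length + 1) / 2
    if position == midpoint:
        return None
    return "3'" if position > midpoint else "5'"


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def downstream_bias_p_value(genes: List[Tuple[int, int]]) -> float:
    """One-sided P-value for an excess of genes in the 3' half.
    `genes` holds (position, operon_length) pairs."""
    halves = [operon_half(pos, length) for pos, length in genes]
    informative = [h for h in halves if h is not None]
    downstream = sum(1 for h in informative if h == "3'")
    return binomial_upper_tail(downstream, len(informative))


# Toy, made-up pseudogene positions: (position in operon, operon length).
toy_pseudogenes = [(3, 3), (4, 4), (2, 3), (1, 2), (2, 2), (3, 4), (5, 5), (4, 5)]
p_value = downstream_bias_p_value(toy_pseudogenes)
```

In the toy input, six of the seven informative genes fall in the 3′ half, so the tail probability is (7 + 1) / 2^7.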

19.
The complete SfiI and I-CeuI physical maps of four Bacillus subtilis (natto) strains, which were previously isolated as natto (fermented soybean) starters, were constructed to elucidate the genome structure. Not only the similarity in genome size and organization but also the microheterogeneity of the gene context was revealed. No large-scale genome rearrangements among the four strains were indicated by mapping of the genes, including 10 rRNA operons (rrn) and relevant genes required for natto production, to the loci corresponding to those of the B. subtilis strain Marburg 168. However, restriction fragment length polymorphism and the presence or absence of strain-specific DNA sequences, such as the prophages SPβ, skin element, and PBSX, as well as the insertion element IS4Bsu1, could be used to identify one of these strains as a Marburg type and the other three strains as natto types. The genome structure and gene heterogeneity were also consistent with the type of indigenous plasmids harbored by the strains.

20.
Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publicly available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics trees, and to identify the pan- and core genomes of this set of sequenced strains. A hierarchical clustering of variable genes allowed clear separation of the strains into clusters, including known pathotypes; clinically relevant serotypes can also be resolved in this way. In contrast, when in silico MLST was performed, many of the strains appeared jumbled and less well resolved. The predicted pan-genome comprises 15,741 gene families, and only 993 (6%) of the families are represented in every genome, comprising the core genome. The variable or ‘accessory’ genes thus make up more than 90% of the pan-genome and about 80% of a typical genome; some of these variable genes tend to be co-localized on genomic islands. The diversity within the species E. coli, and the overlap in gene content between this and related species, suggests a continuum rather than sharp species borders in this group of Enterobacteriaceae.
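The pan- and core-genome definitions used above reduce to a set union and a set intersection over per-genome gene-family sets. The sketch below assumes gene families have already been assigned to each genome (the clustering step is outside its scope); the function name and the toy strains are hypothetical.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan-genome, core genome) as sets of gene-family identifiers."""
    if not genomes:
        return set(), set()
    families = list(genomes.values())
    pan = set().union(*families)          # families present in any genome
    core = set.intersection(*families)    # families present in every genome
    return pan, core


# Toy, made-up gene-family content for three strains.
toy_strains = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
pan, core = pan_and_core(toy_strains)
core_fraction = len(core) / len(pan)

# The figures reported in the abstract: 993 core families out of 15,741.
reported_core_percent = 993 / 15741 * 100
```

The reported counts give a core share of about 6.3%, consistent with the rounded 6% in the abstract and with accessory families making up more than 90% of the pan-genome.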
