Similar articles
 20 similar articles found.
1.
Hacker J, Carniel E. EMBO reports, 2001, 2(5): 376-381
The compositions of bacterial genomes can be changed rapidly and dramatically through a variety of processes including horizontal gene transfer. This form of change is key to bacterial evolution, as it leads to ‘evolution in quantum leaps’. Horizontal gene transfer entails the incorporation of genetic elements transferred from another organism, perhaps in an earlier generation, directly into the genome, where they form ‘genomic islands’, i.e. blocks of DNA with signatures of mobile genetic elements. Genomic islands whose functions increase bacterial fitness, either directly or indirectly, have most likely been positively selected and can be termed ‘fitness islands’. Fitness islands can be divided into several subtypes: ‘ecological islands’ in environmental bacteria and ‘saprophytic islands’, ‘symbiosis islands’ or ‘pathogenicity islands’ (PAIs) in microorganisms that interact with living hosts. Here we discuss ways in which PAIs contribute to the pathogenic potency of bacteria, and the idea that genetic entities similar to genomic islands may also be present in the genomes of eukaryotes.

2.
Microbial genes that are “novel” (no detectable homologs in other species) have become of increasing interest as environmental sampling suggests that there are many more such novel genes in yet-to-be-cultured microorganisms. By analyzing known microbial genomic islands and prophages, we developed criteria for systematic identification of putative genomic islands (clusters of genes of probable horizontal origin in a prokaryotic genome) in 63 prokaryotic genomes, and then characterized the distribution of novel genes and other features. All but a few of the genomes examined contained significantly higher proportions of novel genes in their predicted genomic islands compared with the rest of their genome (paired t test, P = 4.43E-14 to 1.27E-18, depending on method). Moreover, the reverse observation (i.e., higher proportions of novel genes outside of islands) never reached statistical significance in any organism examined. We show that this higher proportion of novel genes in predicted genomic islands is not due to less accurate gene prediction in genomic island regions, but likely reflects a genuine increase in novel genes in these regions for both bacteria and archaea. This represents the first comprehensive analysis of novel genes in prokaryotic genomic islands and provides clues regarding the origin of novel genes. Our collective results imply that there are different gene pools associated with recently horizontally transmitted genomic regions versus regions that are primarily vertically inherited. Moreover, there are more novel genes within the gene pool associated with genomic islands. Since genomic islands are frequently associated with a particular microbial adaptation, such as antibiotic resistance, pathogen virulence, or metal resistance, this suggests that microbes may have access to a larger “arsenal” of novel genes for adaptation than previously thought.

3.
Recognizing the pseudogenes in bacterial genomes   Cited by: 9 (self-citations: 0; citations by others: 9)
Pseudogenes are now known to be a regular feature of bacterial genomes and are found in particularly high numbers within the genomes of recently emerged bacterial pathogens. As most pseudogenes are recognized by sequence alignments, we use newly available genomic sequences to identify the pseudogenes in 11 genomes from 4 bacterial genera, each of which contains at least 1 human pathogen. The numbers of pseudogenes range from 27 in Staphylococcus aureus MW2 to 337 in Yersinia pestis CO92 (i.e. 1–8% of the annotated genes in the genome). Most pseudogenes are formed by small frameshifting indels, but because stop codons are A + T-rich, the two low-G + C Gram-positive taxa (Streptococcus and Staphylococcus) have relatively high fractions of pseudogenes generated by nonsense mutations when compared with more G + C-rich genomes. Over half of the pseudogenes are produced from genes whose original functions were annotated as ‘hypothetical’ or ‘unknown’; however, several broadly distributed genes involved in nucleotide processing, repair or replication have become pseudogenes in one of the sequenced Vibrio vulnificus genomes. Although many of our comparisons involved closely related strains with broadly overlapping gene inventories, each genome contains a largely unique set of pseudogenes, suggesting that pseudogenes are formed and eliminated relatively rapidly from most bacterial genomes.

4.
Predatory bacteria seek and consume other live bacteria. Although belonging to taxonomically diverse groups, relatively few bacterial predator species are known. Consequently, it is difficult to assess the impact of predation within the bacterial realm. As no genetic signatures distinguishing them from non-predatory bacteria are known, genomic resources cannot be exploited to uncover novel predators. In order to identify genes specific to predatory bacteria, we developed a bioinformatic tool called DiffGene. This tool automatically identifies marker genes that are specific to phenotypic or taxonomic groups, by mapping the complete gene content of all available fully-sequenced genomes for the presence/absence of each gene in each genome. A putative ‘predator region’ of ~60 amino acids in the tryptophan 2,3-dioxygenase (TDO) protein was found to probably be a predator-specific marker. This region is found in all known obligate predator and a few facultative predator genomes, and is absent from most facultative predators and all non-predatory bacteria. We designed PCR primers that uniquely amplify a ~180bp-long sequence within the predators’ TDO gene, and validated them in monocultures as well as in metagenetic analysis of environmental wastewater samples. This marker, in addition to its usage in predator identification and phylogenetics, may finally permit reliable enumeration and cataloguing of predatory bacteria from environmental samples, as well as uncovering novel predators.

5.
The pangenomic diversity in Burkholderia pseudomallei is high, with approximately 5.8% of the genome consisting of genomic islands. Genomic islands are known hotspots for recombination driven primarily by site-specific recombination associated with tRNAs. However, recombination rates in other portions of the genome are also high, a feature we expected to disrupt gene order. We analyzed the pangenome of 37 isolates of B. pseudomallei and demonstrate that the pangenome is ‘open’, with approximately 136 new genes identified with each new genome sequenced, and that the global core genome consists of 4568±16 homologs. Genes associated with metabolism were statistically overrepresented in the core genome, and genes associated with mobile elements, disease, and motility were primarily associated with accessory portions of the pangenome. The frequency distribution of genes present in between 1 and 37 of the genomes analyzed matches well with a model of genome evolution in which 96% of the genome has very low recombination rates but 4% of the genome recombines readily. Using homologous genes among pairs of genomes, we found that gene order was highly conserved among strains, despite the high recombination rates previously observed. High rates of gene transfer and recombination are incompatible with retaining gene order unless these processes are either highly localized to specific sites within the genome, or are characterized by symmetrical gene gain and loss. Our results demonstrate that both processes occur: localized recombination introduces many new genes at relatively few sites, and recombination throughout the genome generates the novel multi-locus sequence types previously observed while preserving gene order.

6.
Currently there is no successful computational approach for identification of genes encoding novel functional RNAs (fRNAs) in genomic sequences. We have developed a machine learning approach using neural networks and support vector machines to extract common features among known RNAs for prediction of new RNA genes in the unannotated regions of prokaryotic and archaeal genomes. The Escherichia coli genome was used for development, but we have applied this method to several other bacterial and archaeal genomes. Networks based on nucleotide composition were 80–90% accurate in jackknife testing experiments for bacteria and 90–99% for hyperthermophilic archaea. We also achieved a significant improvement in accuracy by combining these predictions with those obtained using a second set of parameters consisting of known RNA sequence motifs and the calculated free energy of folding. Several known fRNAs not included in the training datasets were identified as well as several hundred predicted novel RNAs. These studies indicate that there are many unidentified RNAs in simple genomes that can be predicted computationally as a precursor to experimental study. Public access to our RNA gene predictions and an interface for user predictions is available via the web.

7.
In sequenced genomes of prokaryotes, anomalous DNA (aDNA) can be recognized, among other features, by atypical clustering of dinucleotides. We hypothesized that atypical clustering of hexameric endonuclease recognition sites in aDNA allows the specific isolation of anomalous sequences in vitro. Clustering of endonuclease recognition sites in aDNA regions of eight published prokaryotic genome sequences was demonstrated. In silico digestion of the Neisseria meningitidis MC58 genome, using four selected endonucleases, revealed that of the 27 small fragments predicted (<5 kb), 21 were located in known genomic islands. Of the 24 calculated fragments (>300 bp and <5 kb), 22 met our criteria for aDNA, i.e. a high dinucleotide dissimilarity and/or aberrant GC content. The four enzymes also allowed the identification of aDNA fragments from the related Z2491 strain. Similarly, the sequenced genomes of three strains of Escherichia coli assessed by in silico digestion using XbaI yielded strain-specific sets of fragments of anomalous composition. In vitro applicability of the method was demonstrated by using adaptor-linked PCR, yielding the predicted fragments from the N. meningitidis MC58 genome. In conclusion, this strategy allows the selective isolation of aDNA from prokaryotic genomes by a simple restriction digest–amplification–cloning–sequencing scheme.
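
The in silico digestion step described in this abstract can be illustrated with a minimal sketch. It is not the authors' software: the function names `digest` and `small_fragments` are hypothetical, the sequence is treated as linear, and each cut is placed at the start of the recognition site rather than at the enzyme's true cleavage position. The only enzyme fact used is that XbaI recognizes TCTAGA; the size window defaults to the >300 bp and <5 kb range quoted above.

```python
import re


def digest(sequence, sites):
    """Cut a linear sequence at the start of every recognition-site match.

    Simplifying assumption: the cut is placed at the first base of the site,
    not at the enzyme's real cleavage offset. Returns fragment lengths in order.
    """
    seq = sequence.upper()
    cuts = set()
    for site in sites:
        # A lookahead pattern reports overlapping occurrences as well.
        for match in re.finditer("(?=%s)" % re.escape(site.upper()), seq):
            cuts.add(match.start())
    # Interior cut positions plus both ends of the sequence.
    bounds = [0] + sorted(c for c in cuts if 0 < c < len(seq)) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def small_fragments(lengths, min_len=300, max_len=5000):
    """Keep fragments strictly inside the (min_len, max_len) size window."""
    return [n for n in lengths if min_len < n < max_len]


if __name__ == "__main__":
    # Toy sequence with two XbaI sites (TCTAGA); a real run would load a genome.
    toy = "AAAA" + "TCTAGA" + "CCCC" + "TCTAGA" + "GG"
    lengths = digest(toy, ["TCTAGA"])
    print(lengths)
    print(small_fragments(lengths, min_len=5, max_len=9))
```

Fragments that pass the size filter would then be checked for anomalous composition (dinucleotide dissimilarity or GC content), which this sketch leaves out.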

8.
A gene in a genome is defined as putative alien (pA) if its codon usage difference from the average gene exceeds a high threshold and codon usage differences from ribosomal protein genes, chaperone genes and protein-synthesis-processing factors are also high. pA gene clusters in bacterial genomes are relevant for detecting genomic islands (GIs), including pathogenicity islands (PAIs). Four other analyses appropriate to this task are G+C genome variation (the standard method); genomic signature divergences (dinucleotide bias); extremes of codon bias; and anomalies of amino acid usage. For example, the cagA domain of Helicobacter pylori is highly deviant in its genome signature and codon bias from the rest of the genome. Using these methods we can detect two potential PAIs in the Neisseria meningitidis genome, which contain hemagglutinin and/or hemolysin-related genes. Additionally, G+C variation and genome signature differences of the Mycobacterium tuberculosis genome indicate two pA gene clusters.
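
Two of the analyses named in this abstract, G+C variation and genomic signature (dinucleotide bias) divergence, can be sketched as follows. The abstract gives no formulas, so this sketch assumes the commonly used dinucleotide relative-abundance form: each dinucleotide frequency is divided by the product of its two mononucleotide frequencies, counted over the sequence and its reverse complement, and two sequences are compared by the mean absolute difference over the 16 dinucleotides. The function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def gc_content(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def signature(seq):
    """Dinucleotide relative abundances over both strands (assumed formula)."""
    seq = seq.upper()
    reverse_complement = seq.translate(COMPLEMENT)[::-1]
    mono = {base: 0 for base in "ACGT"}
    di = {pair: 0 for pair in DINUCLEOTIDES}
    for strand in (seq, reverse_complement):
        for base in strand:
            if base in mono:
                mono[base] += 1
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair in di:  # skips pairs containing ambiguous bases such as N
                di[pair] += 1
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    rho = {}
    for pair, count in di.items():
        expected = (mono[pair[0]] / n_mono) * (mono[pair[1]] / n_mono)
        # Observed over expected frequency; 0.0 when a base is absent.
        rho[pair] = (count / n_di) / expected if expected > 0 else 0.0
    return rho


def signature_difference(seq_a, seq_b):
    """Mean absolute difference between two signatures over 16 dinucleotides."""
    rho_a, rho_b = signature(seq_a), signature(seq_b)
    return sum(abs(rho_a[p] - rho_b[p]) for p in DINUCLEOTIDES) / 16


if __name__ == "__main__":
    window, genome = "ATATATATAT", "GCGCGCGCGC"  # toy stand-ins
    print(gc_content(window), gc_content(genome))
    print(signature_difference(window, genome))
```

In practice the comparison would be made between a sliding window and the whole genome, with the window flagged when its difference exceeds a chosen threshold; that threshold is not stated in the abstract.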

9.
Genes vary greatly in their long-term phylogenetic stability and there exists no general explanation for these differences. The cytochrome P450 (CYP450) gene superfamily is well suited to investigating this problem because it is large and well studied, and it includes both stable and unstable genes. CYP450 genes encode oxidase enzymes that function in metabolism of endogenous small molecules and in detoxification of xenobiotic compounds. Both types of enzymes have been intensively studied. My analysis of ten nearly complete vertebrate genomes indicates that each genome contains 50–80 CYP450 genes, which are about evenly divided between phylogenetically stable and unstable genes. The stable genes are characterized by few or no gene duplications or losses in species ranging from bony fish to mammals, whereas unstable genes are characterized by frequent gene duplications and losses (birth–death evolution) even among closely related species. All of the CYP450 genes that encode enzymes with known endogenous substrates are phylogenetically stable. In contrast, most of the unstable genes encode enzymes that function as xenobiotic detoxifiers. Nearly all unstable CYP450 genes in the mouse and human genomes reside in a few dense gene clusters, forming unstable gene islands that arose by recurrent local gene duplication. Evidence for positive selection in amino acid sequence is restricted to these unstable CYP450 genes, and sites of selection are associated with substrate-binding regions in the protein structure. These results can be explained by a general model in which phylogenetically stable genes have core functions in development and physiology, whereas unstable genes have accessory functions associated with unstable environmental interactions such as toxin and pathogen exposure. Unstable gene islands in vertebrates share some functional properties with bacterial genomic islands, though they arise by local gene duplication rather than horizontal gene transfer.

10.
Linkage maps are valuable tools in genetic and genomic studies. For sweet cherry, linkage maps have been constructed using mainly microsatellite markers (SSRs) and, recently, using single nucleotide polymorphism markers (SNPs) from a cherry 6K SNP array. Genotyping-by-sequencing (GBS), a new methodology based on high-throughput sequencing, holds great promise for identification of a high number of SNPs and construction of high density linkage maps. In this study, GBS was used to identify SNPs from an intra-specific sweet cherry cross. A total of 8,476 high quality SNPs were selected for mapping. The physical position for each SNP was determined using the peach genome, Peach v1.0, as reference, and a homogeneous distribution of markers along the eight peach scaffolds was obtained. On average, 65.6% of the SNPs were present in genic regions and 49.8% were located in exonic regions. In addition to the SNPs, a group of SSRs was also used for construction of linkage maps. Parental and consensus high density maps were constructed by genotyping 166 siblings from a ‘Rainier’ x ‘Rivedel’ (Ra x Ri) cross. Using the Ra x Ri population, 462, 489 and 985 markers were mapped into eight linkage groups in the ‘Rainier’, ‘Rivedel’ and Ra x Ri maps, respectively, with 80% of mapped SNPs located in genic regions. The resulting maps spanned 549.5, 582.6 and 731.3 cM for the ‘Rainier’, ‘Rivedel’ and consensus maps, respectively, with an average distance of 1.2 cM between adjacent markers for both the ‘Rainier’ and ‘Rivedel’ maps and of 0.7 cM for the Ra x Ri map. High synteny and co-linearity were observed among the resulting maps and with Peach v1.0. These new high density linkage maps provide valuable information on the sweet cherry genome, and serve as the basis for identification of QTLs and genes relevant for the breeding of the species.

11.
Parallel analysis of RNA ends (PARE) is a technique utilizing high-throughput sequencing to profile uncapped mRNA cleavage or decay products on a genome-wide basis. Tools currently available to validate miRNA targets using PARE data employ only annotated genes, whereas important targets may be found in unannotated genomic regions. To handle such cases and to scale to the growing availability of PARE data and genomes, we developed a new tool, ‘sPARTA’ (small RNA-PARE target analyzer), that utilizes a built-in, plant-focused target prediction module (aka ‘miRferno’). sPARTA not only exhibits an unprecedented gain in speed but also shows greater predictive power by validating more targets, compared to a popular alternative. In addition, the novel ‘seed-free’ mode, optimized to find targets irrespective of complementarity in the seed-region, identifies novel intergenic targets. To fully capitalize on the novelty and strengths of sPARTA, we developed a web resource, ‘comPARE’, for plant miRNA target analysis; this facilitates the systematic identification and analysis of miRNA-target interactions across multiple species, integrated with visualization tools. This collation of high-throughput small RNA and PARE datasets from different genomes further facilitates re-evaluation of existing miRNA annotations, resulting in a ‘cleaner’ set of microRNAs.

12.
Connected gene neighborhoods in prokaryotic genomes   Cited by: 12 (self-citations: 1; citations by others: 11)
A computational method was developed for delineating connected gene neighborhoods in bacterial and archaeal genomes. These gene neighborhoods are not typically present, in their entirety, in any single genome, but are held together by overlapping, partially conserved gene arrays. The procedure was applied to comparing the orders of orthologous genes, which were extracted from the database of Clusters of Orthologous Groups of proteins (COGs), in 31 prokaryotic genomes and resulted in the identification of 188 clusters of gene arrays, which included 1001 of 2890 COGs. These clusters were projected onto actual genomes to produce extended neighborhoods including additional genes, which are adjacent to the genes from the clusters and are transcribed in the same direction, which resulted in a total of 2387 COGs being included in the neighborhoods. Most of the neighborhoods consist predominantly of genes united by a coherent functional theme, but also include a minority of genes without an obvious functional connection to the main theme. We hypothesize that although some of the latter genes might have unsuspected roles, others are maintained within gene arrays because of the advantage of expression at a level that is typical of the given neighborhood. We designate this phenomenon ‘genomic hitchhiking’. The largest neighborhood includes 79 genes (COGs) and consists of overlapping, rearranged ribosomal protein superoperons; apparent genome hitchhiking is particularly typical of this neighborhood and other neighborhoods that consist of genes coding for translation machinery components. Several neighborhoods involve previously undetected connections between genes, allowing new functional predictions. Gene neighborhoods appear to evolve via complex rearrangement, with different combinations of genes from a neighborhood fixed in different lineages.

13.
Because the properties of horizontally-transferred genes will reflect the mutational proclivities of their donor genomes, they often show atypical compositional properties relative to native genes. Parametric methods use these discrepancies to identify bacterial genes recently acquired by horizontal transfer. However, compositional patterns of native genes vary stochastically, leaving no clear boundary between typical and atypical genes. As a result, while strongly atypical genes are readily identified as alien, genes of ambiguous character are poorly classified when a single threshold separates typical and atypical genes. This limitation affects all parametric methods that examine genes independently, and escaping it requires the use of additional genomic information. We propose that the performance of all parametric methods can be improved by using a multiple-threshold approach. First, strongly atypical alien genes and strongly typical native genes would be identified using conservative thresholds. Genes with ambiguous compositional features would then be classified by examining gene context, including the class (native or alien) of flanking genes. By including additional genomic information in a multiple-threshold framework, we observed a remarkable improvement in the performance of several popular, but algorithmically distinct, methods for alien gene detection.
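
The two-pass, multiple-threshold idea in this abstract can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: `classify_genes`, the atypicality scores, and both threshold values are hypothetical, and the context rule used here (an ambiguous gene takes the class of its nearest confidently classified neighbors only when both sides agree) is one simple reading of "examining gene context".

```python
def classify_genes(scores, native_max, alien_min):
    """Two-pass classification of genes listed in chromosomal order.

    Pass 1: conservative thresholds label clearly native or clearly alien genes.
    Pass 2: an ambiguous gene inherits the class of its nearest confidently
    labelled neighbors if both sides agree (assumed context rule).
    """
    first_pass = []
    for score in scores:
        if score >= alien_min:
            first_pass.append("alien")
        elif score <= native_max:
            first_pass.append("native")
        else:
            first_pass.append(None)  # ambiguous after pass 1

    resolved = list(first_pass)
    for i, label in enumerate(first_pass):
        if label is not None:
            continue
        # Nearest confidently labelled gene on each side, if any.
        left = next((first_pass[j] for j in range(i - 1, -1, -1)
                     if first_pass[j] is not None), None)
        right = next((first_pass[j] for j in range(i + 1, len(first_pass))
                      if first_pass[j] is not None), None)
        resolved[i] = left if left is not None and left == right else "ambiguous"
    return resolved


if __name__ == "__main__":
    # Hypothetical atypicality scores for eight adjacent genes.
    scores = [0.1, 0.5, 0.2, 0.9, 0.6, 0.95, 0.5, 0.1]
    print(classify_genes(scores, native_max=0.3, alien_min=0.8))
```

Only pass-1 labels are consulted during pass 2, so a gene resolved by context never propagates its label to other ambiguous genes.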

14.
The α-proteobacteria represent one of the most diverse bacterial subdivisions, displaying extreme variations in lifestyle, geographical distribution and genome size. Species for which genome data are available have been classified into a species tree based on a conserved set of vertically inherited core genes. By mapping the variation in gene content onto the species tree, genomic changes can be associated with adaptations to specific growth niches. Genes for adaptive traits are mostly located in ‘plasticity zones’ in the bacterial genome, which also contain mobile elements and are highly variable across strains. By physically separating genes for information processing from genes involved in interactions with the surrounding environment, the rate of evolutionary change can be substantially enhanced for genes underlying adaptation to new growth habitats, possibly explaining the ecological success of the α-proteobacterial subdivision.

15.
Bacterial genomes generally consist of stable regions termed core genome, and variable regions that form the so-called flexible gene pool. The flexible part is composed of bacteriophages, plasmids, transposons as well as unstable large regions that have been termed genomic islands. Genomic islands encoding virulence factors of pathogenic bacteria have been designated "pathogenicity islands". Pathogenicity islands were first discovered in uropathogenic Escherichia coli and presently more than 30 bacterial species carrying pathogenicity islands have been described. This review summarises the current knowledge on bacterial genomic islands and their general features, and discusses their putative role in the evolution of microbes in the light of genomics of pathogenic bacteria.

16.
17.
We devised software tools to systematically investigate the contents and contexts of bacterial tRNA and tmRNA genes, which are known insertion hotspots for genomic islands (GIs). The strategy, based on MAUVE-facilitated multigenome comparisons, was used to examine 87 Escherichia coli MG1655 tRNA and tmRNA genes and their orthologues in E. coli EDL933, E. coli CFT073 and Shigella flexneri Sf301. Our approach identified 49 GIs occupying ~1.7 Mb that mapped to 18 tRNA genes, missing 2 but identifying a further 30 GIs as compared with Islander [Y. Mantri and K. P. Williams (2004), Nucleic Acids Res., 32, D55–D58]. All these GIs had many strain-specific CDS, anomalous GC contents and/or significant dinucleotide biases, consistent with foreign origins. Our analysis demonstrated marked conservation of sequences flanking both empty tRNA sites and tRNA-associated GIs across all four genomes. Remarkably, there were only 2 upstream and 5 downstream deletions adjacent to the 328 loci investigated. In silico PCR analysis based on conserved flanking regions was also used to interrogate hotspots in another eight completely or partially sequenced E. coli and Shigella genomes. The tools developed are ideal for the analysis of other bacterial species and will lead to in silico and experimental discovery of new genomic islands.

18.
The occurrence of polyploidy in land plant evolution has led to an acceleration of genome modifications relative to other crown eukaryotes and is correlated with key innovations in plant evolution. Extensive genome resources provide for relating genomic changes to the origins of novel morphological and physiological features of plants. Ancestral gene contents for key nodes of the plant family tree are inferred. Pervasive polyploidy in angiosperms appears likely to be the major factor generating novel angiosperm genes and expanding some gene families. However, most gene families lose most duplicated copies in a quasi-neutral process, and a few families are actively selected for single-copy status. One of the great challenges of evolutionary genomics is to link genome modifications to speciation, diversification and the morphological and/or physiological innovations that collectively compose biodiversity. Rapid accumulation of genomic data and its ongoing investigation may greatly improve the resolution at which evolutionary approaches can contribute to the identification of specific genes responsible for particular innovations. The resulting more ‘particulate’ understanding of plant evolution may elevate to a new level fundamental knowledge of botanical diversity, including economically important traits in the crop plants that sustain humanity.

19.
20.
Comparative whole-genome analyses have demonstrated that horizontal gene transfer (HGT) provides a significant contribution to prokaryotic genome innovation. The evolution of specific prokaryotes is therefore tightly linked to the environment in which they live and the communal pool of genes available within that environment. Here we use the term supergenome to describe the set of all genes that a prokaryotic ‘individual’ can draw on within a particular environmental setting. Conjugative plasmids can be considered particularly successful entities within the communal pool, which have enabled HGT over large taxonomic distances. These plasmids are collections of discrete regions of genes that function as ‘backbone modules’ to undertake different aspects of overall plasmid maintenance and propagation. Conjugative plasmids often carry suites of ‘accessory elements’ that contribute adaptive traits to the hosts and, potentially, other resident prokaryotes within specific environmental niches. Insight into the evolution of plasmid modules therefore contributes to our knowledge of gene dissemination and evolution within prokaryotic communities. This communal pool provides the prokaryotes with an important mechanistic framework for obtaining adaptability and functional diversity that alleviates the need for large genomes of specialized ‘private genes’.
