Similar literature
20 similar articles found (search time: 895 ms)
1.
The gene-dense chromosomes of archaea and bacteria were long thought to be devoid of pseudogenes, but with the massive increase in available genome sequences, whole genome comparisons between closely related species have identified mutations that have rendered numerous genes inactive. Comparative analyses of sequenced archaeal genomes revealed numerous pseudogenes, which can constitute up to 8.6% of the annotated coding sequences in some genomes. The largest proportion of pseudogenes is created by gene truncations, followed by frameshift mutations. Within archaeal genomes, large numbers of pseudogenes contain more than one inactivating mutation, suggesting that pseudogenes are deleted from the genome more slowly in archaea than in bacteria. Although archaea seem to retain pseudogenes longer than do bacteria, most archaeal genomes have unique repertoires of pseudogenes.

2.
Studies of neutrally evolving sequences suggest that differences in eukaryotic genome sizes result from different rates of DNA loss. However, very few pseudogenes have been identified in microbial species, and the processes whereby genes and genomes deteriorate in bacteria remain largely unresolved. The typhus-causing agent, Rickettsia prowazekii, is exceptional in that as much as 24% of its 1.1-Mb genome consists of noncoding DNA and pseudogenes. To test the hypothesis that the noncoding DNA in the R. prowazekii genome represents degraded remnants of ancestral genes, we systematically examined all of the identified pseudogenes and their flanking sequences in three additional Rickettsia species. Consistent with the hypothesis, we observe sequence similarities between genes and pseudogenes in one species and intergenic DNA in another species. We show that deletions are more frequent and, on average, larger than insertions in neutrally evolving pseudogene sequences. Our results suggest that inactivated genetic material in the Rickettsia genomes deteriorates spontaneously due to a mutation bias for deletions and that the noncoding sequences represent DNA in the final stages of this degenerative process.
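The comparison of deletions and insertions described in this abstract can be illustrated with a minimal sketch. The event list below is invented for illustration and is not data from the study; the helper name `summarize_indels` and the `(kind, size)` input format are assumptions.

```python
from statistics import mean


def summarize_indels(events):
    """Return count and mean size (bp) for deletions and insertions.

    `events` is a list of (kind, size_bp) tuples, where kind is
    "del" or "ins". This input format is hypothetical.
    """
    summary = {}
    for kind in ("del", "ins"):
        sizes = [size for k, size in events if k == kind]
        summary[kind] = {
            "count": len(sizes),
            "mean_size": mean(sizes) if sizes else 0.0,
        }
    return summary


# Made-up events, NOT data from the Rickettsia study.
events = [("del", 12), ("del", 3), ("del", 45), ("ins", 1), ("ins", 2)]
summary = summarize_indels(events)

# A deletion bias in this sketch means deletions are both more
# frequent and larger on average than insertions.
deletion_bias = (
    summary["del"]["count"] > summary["ins"]["count"]
    and summary["del"]["mean_size"] > summary["ins"]["mean_size"]
)
print(summary, deletion_bias)
```

A real analysis would infer the events from alignments of pseudogenes against their intact orthologs; that step is outside the scope of this sketch.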

3.
Recognizing the pseudogenes in bacterial genomes (total citations: 9; self-citations: 0; citations by others: 9)
Pseudogenes are now known to be a regular feature of bacterial genomes and are found in particularly high numbers within the genomes of recently emerged bacterial pathogens. As most pseudogenes are recognized by sequence alignments, we use newly available genomic sequences to identify the pseudogenes in 11 genomes from 4 bacterial genera, each of which contains at least 1 human pathogen. The numbers of pseudogenes range from 27 in Staphylococcus aureus MW2 to 337 in Yersinia pestis CO92 (i.e. 1–8% of the annotated genes in the genome). Most pseudogenes are formed by small frameshifting indels, but because stop codons are A + T-rich, the two low-G + C Gram-positive taxa (Streptococcus and Staphylococcus) have relatively high fractions of pseudogenes generated by nonsense mutations when compared with more G + C-rich genomes. Over half of the pseudogenes are produced from genes whose original functions were annotated as ‘hypothetical’ or ‘unknown’; however, several broadly distributed genes involved in nucleotide processing, repair or replication have become pseudogenes in one of the sequenced Vibrio vulnificus genomes. Although many of our comparisons involved closely related strains with broadly overlapping gene inventories, each genome contains a largely unique set of pseudogenes, suggesting that pseudogenes are formed and eliminated relatively rapidly from most bacterial genomes.
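The remark that stop codons are A + T-rich can be checked directly from the standard genetic code. The sketch below computes the A + T fraction of the three stop codons and lists the sense codons that are one substitution away from a stop; the function names are illustrative assumptions, and the code does not reproduce the authors' analysis.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")  # standard genetic code


def at_fraction(codons):
    """Fraction of A or T bases across the given codons."""
    bases = "".join(codons)
    return sum(base in "AT" for base in bases) / len(bases)


def one_step_stop_precursors():
    """Sense codons that become a stop codon via a single substitution."""
    precursors = set()
    for stop in STOP_CODONS:
        for pos in range(3):
            for base in "ACGT":
                codon = stop[:pos] + base + stop[pos + 1:]
                if codon not in STOP_CODONS:
                    precursors.add(codon)
    return precursors


stop_at = at_fraction(STOP_CODONS)        # 7 of 9 bases are A or T
precursors = one_step_stop_precursors()
print(round(stop_at, 3), len(precursors))
```

Seven of the nine stop-codon bases are A or T, so substitutions toward A or T are the ones that tend to create stops, which is consistent with the higher share of nonsense mutations reported for the low-G + C genomes.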

4.
5.
Homma K, Fukuchi S, Kawabata T, Ota M, Nishikawa K. Gene, 2002, 294(1-2): 25-33
Pseudogenes are open reading frames (ORFs) encoding dysfunctional proteins with high homology to known protein-coding genes. Although pseudogenes were reported to exist in the genomes of many eukaryotes and bacteria, no systematic search for pseudogenes in the Escherichia coli genome has been carried out. Genome comparisons of E. coli strains K-12 and O157 revealed that many protein-coding sequences have prematurely terminated orthologs encoding unstable proteins. To systematically screen for pseudogenes, we selected ORFs generated by premature termination of the orthologous protein-coding genes and subsequently excluded those possibly arising from sequence errors. Lastly we eliminated those with close homologs in this and other species, as these shortened ORFs may actually have functions. The process produced 95 and 101 pseudogene candidates in K-12 and O157, respectively. The assigned three-dimensional structures suggest that most of the encoded proteins cannot fold properly and thus are dysfunctional, indicating that they are probably pseudogenes. Therefore, the existence of a significant number of probable pseudogenes in E. coli is predicted, awaiting experimental verification. Most of them were found to be genes with paralogs or horizontally transferred genes or both. We suggest that pseudogenes constitute a small fraction of the genomes of free-living bacteria in general, reflecting the faster elimination than production of pseudogenes.
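The first screening step described above, selecting ORFs that terminate prematurely relative to an ortholog, might look roughly like the sketch below. The function names, the toy sequences and the `min_fraction` cutoff are assumptions made for illustration; the abstract does not state the authors' actual criteria, and the later filters (sequencing errors, close homologs) are not modelled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(cds):
    """Index (in codons) of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3].upper() in STOP_CODONS:
            return i // 3
    return None


def is_truncated(cds, ortholog_codons, min_fraction=0.8):
    """Flag a CDS whose first stop falls well before the ortholog's end.

    `ortholog_codons` is the number of sense codons in the intact
    ortholog; `min_fraction` is an arbitrary illustrative cutoff.
    """
    stop = first_in_frame_stop(cds)
    if stop is None:
        return False
    return stop < min_fraction * ortholog_codons


# Toy sequences, not real E. coli genes.
intact = "ATGGCTGCAGGTCTGTAA"     # 5 sense codons, then a stop
truncated = "ATGGCTTAAGGTCTGTAA"  # premature stop at codon index 2

print(is_truncated(intact, 5), is_truncated(truncated, 5))
```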

6.
Tso JY, Sun XH, Kao TH, Reece KS, Wu R. Nucleic Acids Research, 1985, 13(7): 2485-2502
Full length cDNAs encoding the glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase (GAPDH) from rat and man have been isolated and sequenced. Many GAPDH gene-related sequences have been found in both genomes based on genomic blot hybridization analysis. Only one functional gene product is known. Results from genomic library screenings suggest that there are 300-400 copies of these sequences in the rat genome and approximately 100 in the human genome. Some of these related sequences have been shown to be processed pseudogenes. We have isolated several rat cDNA clones corresponding to these pseudogenes indicating that some pseudogenes are transcribed. Rat and human cDNAs are 89% homologous in the coding region, and 76% homologous in the first 100 base pairs of the 3'-noncoding region. Comparison of these two cDNA sequences with those of the chicken, Drosophila and yeast genes allows the analysis of the evolution of the GAPDH genes in detail.

7.
8.
9.
MOTIVATION: Mammalian genomes contain many 'genomic fossils', i.e. pseudogenes. These are disabled copies of functional genes that have been retained in the genome by gene duplication or retrotransposition events. Pseudogenes are important resources in understanding the evolutionary history of genes and genomes. RESULTS: We have developed a homology-based computational pipeline ('PseudoPipe') that can search a mammalian genome and identify pseudogene sequences in a comprehensive and consistent manner. The key steps in the pipeline involve using BLAST to rapidly cross-reference potential 'parent' proteins against the intergenic regions of the genome and then processing the resulting 'raw hits', i.e. eliminating redundant ones, clustering together neighbors, and associating and aligning clusters with a unique parent. Finally, pseudogenes are classified based on a combination of criteria including homology, intron-exon structure, and existence of stop codons and frameshifts.
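The "clustering together neighbors" step mentioned in this abstract can be pictured as merging nearby hits that share a parent protein. The sketch below is not the PseudoPipe implementation: the `Hit` record, the `cluster_hits` function and the `max_gap` value are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    parent: str  # identifier of the query ("parent") protein


def cluster_hits(hits, max_gap=30):
    """Merge hits with the same chromosome and parent that overlap
    or lie within `max_gap` bp of each other (arbitrary threshold)."""
    clusters = []
    for hit in sorted(hits, key=lambda h: (h.chrom, h.parent, h.start)):
        last = clusters[-1] if clusters else None
        if (last is not None
                and last.chrom == hit.chrom
                and last.parent == hit.parent
                and hit.start - last.end <= max_gap):
            # Extend the current cluster to cover the new hit.
            clusters[-1] = Hit(last.chrom, last.start,
                               max(last.end, hit.end), last.parent)
        else:
            clusters.append(hit)
    return clusters


# Toy coordinates, not real BLAST output.
hits = [
    Hit("chr1", 100, 200, "P1"),
    Hit("chr1", 150, 260, "P1"),
    Hit("chr1", 280, 350, "P1"),
    Hit("chr1", 1000, 1100, "P1"),
    Hit("chr1", 120, 180, "P2"),
]
clusters = cluster_hits(hits)
print(clusters)
```

Strand and alignment score are ignored here to keep the example short; a realistic version would need both, along with a rule for choosing a single parent when clusters from different parents overlap.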

10.
Extensive gene rearrangement is reported in the mitochondrial genomes of lungless salamanders (Plethodontidae). In each genome with a novel gene order, there is evidence that the rearrangement was mediated by duplication of part of the mitochondrial genome, including the presence of both pseudogenes and additional, presumably functional, copies of duplicated genes. All rearrangement-mediating duplications include either the origin of light-strand replication and the nearby tRNA genes or the regions flanking the origin of heavy-strand replication. The latter regions comprise nad6, trnE, cob, trnT, an intergenic spacer between trnT and trnP and, in some genomes, trnP, the control region, trnF, rrnS, trnV, rrnL, trnL1, and nad1. In some cases, two copies of duplicated genes, presumptive regulatory regions, and/or sequences with no assignable function have been retained in the genome following the initial duplication; in other genomes, only one of the duplicated copies has been retained. Both tandem and nontandem duplications are present in these genomes, suggesting different duplication mechanisms. In some of these mitochondrial DNAs, up to 25% of the total length is composed of tandem duplications of noncoding sequence that includes putative regulatory regions and/or pseudogenes of tRNAs and protein-coding genes along with the otherwise unassignable sequences. These data indicate that imprecise initiation and termination of replication, slipped-strand mispairing, and intramolecular recombination may all have played a role in generating repeats during the evolutionary history of plethodontid mitochondrial genomes.

11.
Classification and nomenclature of all human homeobox genes (total citations: 2; self-citations: 0; citations by others: 2)

Background

The homeobox genes are a large and diverse group of genes, many of which play important roles in the embryonic development of animals. Increasingly, homeobox genes are being compared between genomes in an attempt to understand the evolution of animal development. Despite their importance, the full diversity of human homeobox genes has not previously been described.

Results

We have identified all homeobox genes and pseudogenes in the euchromatic regions of the human genome, finding many unannotated, incorrectly annotated, unnamed, misnamed or misclassified genes and pseudogenes. We describe 300 human homeobox loci, which we divide into 235 probable functional genes and 65 probable pseudogenes. These totals include 3 genes with partial homeoboxes and 13 pseudogenes that lack homeoboxes but are clearly derived from homeobox genes. These figures exclude the repetitive DUX1 to DUX5 homeobox sequences of which we identified 35 probable pseudogenes, with many more expected in heterochromatic regions. Nomenclature is established for approximately 40 formerly unnamed loci, reflecting their evolutionary relationships to other loci in human and other species, and nomenclature revisions are proposed for around 30 other loci. We use a classification that recognizes 11 homeobox gene 'classes' subdivided into 102 homeobox gene 'families'.

Conclusion

We have conducted a comprehensive survey of homeobox genes and pseudogenes in the human genome, described many new loci, and revised the classification and nomenclature of homeobox genes. The classification scheme may be widely applicable to homeobox genes in other animal genomes and will facilitate comparative genomics of this important gene superclass.

12.
13.
14.
The Salmonella enterica serovars Enteritidis, Dublin, and Gallinarum are closely related but differ in virulence and host range. To identify the genetic elements responsible for these differences and to better understand how these serovars are evolving, we sequenced the genomes of Enteritidis strain LK5 and Dublin strain SARB12 and compared these genomes to the publicly available Enteritidis P125109, Dublin CT 02021853 and Dublin SD3246 genome sequences. We also compared the publicly available Gallinarum genome sequences from biotype Gallinarum 287/91 and Pullorum RKS5078. Using bioinformatic approaches, we identified single nucleotide polymorphisms, insertions, deletions, and differences in prophage and pseudogene content between strains belonging to the same serovar. Through our analysis we also identified several prophage cargo genes and pseudogenes that affect virulence and may contribute to a host-specific, systemic lifestyle. These results strongly argue that the Enteritidis, Dublin and Gallinarum serovars of Salmonella enterica evolve by acquiring new genes through horizontal gene transfer, followed by the formation of pseudogenes. The loss of genes necessary for a gastrointestinal lifestyle ultimately leads to a systemic lifestyle and niche exclusion in the host-specific serovars.

15.
Pseudogenes are defined as non-functional relatives of genes whose protein-coding abilities are lost and are no longer expressed within cells. They are an outcome of accumulation of mutations within a gene whose end product is not essential for survival. Proper investigation of the procedure of pseudogenization is relevant for estimating occurrence of duplications in genomes. Frankineae houses an interesting group of microorganisms, carving a niche in the microbial world. This study was undertaken with the objective of determining the abundance of pseudogenes, understanding strength of purifying selection, investigating evidence of pseudogene expression, and analysing their molecular nature, their origin, evolution and deterioration patterns amongst domain families. Investigation revealed the occurrence of 956 core Pfam families sharing common characteristics indicating co-evolution. WD40, Rve_3, DDE_Tnp_IS240 and phage integrase core domains are larger families, having more pseudogenes, signifying a probability of harmful foreign genes being disabled within transposable elements. High selective pressure depicted that gene families rapidly duplicating and evolving undoubtedly facilitated creation of a number of pseudogenes in Frankineae. Codon usage analysis between protein-coding genes and pseudogenes indicated a wide degree of variation with respect to different factors. Moreover, the majority of pseudogenes were under the effect of purifying selection. Frankineae pseudogenes were under stronger selective constraints, indicating that they were functional for a very long time and became pseudogenes abruptly. The origin and deterioration of pseudogenes has been attributed to selection and mutational pressure acting upon sequences for adapting to stressed soil environments.

16.

Background

Insertion sequences (ISs) are approximately 1 kbp long “jumping” genes found in prokaryotes. ISs encode the protein Transposase, which facilitates the excision and reinsertion of ISs in genomes, making these sequences a type of class I (“cut-and-paste”) Mobile Genetic Elements. ISs are proposed to be involved in the reductive evolution of symbiotic prokaryotes. Our previous sequencing of the genome of the cyanobacterium ‘Nostoc azollae’ 0708, living in a tight perpetual symbiotic association with a plant (the water fern Azolla), revealed the presence of an eroding genome, with a high number of insertion sequences (ISs) together with an unprecedented large proportion of pseudogenes. To investigate the role of ISs in the reductive evolution of ‘Nostoc azollae’ 0708, and potentially in the formation of pseudogenes, a bioinformatic investigation of the IS identities and positions in 47 cyanobacterial genomes was conducted. To widen the scope, the IS contents were analysed qualitatively and quantitatively in 20 other genomes representing both free-living and symbiotic bacteria.

Results

Insertion Sequences were not randomly distributed in the bacterial genomes and were found to transpose short distances from their original location (“local hopping”) and pseudogenes were enriched in the vicinity of IS elements. In general, symbiotic organisms showed higher densities of IS elements and pseudogenes than non-symbiotic bacteria. A total of 1108 distinct repeated sequences over 500 bp were identified in the 67 genomes investigated. In the genome of ‘Nostoc azollae’ 0708, IS elements were apparent at 970 locations (14.3%), with 428 being full-length. Morphologically complex cyanobacteria with large genomes showed higher frequencies of IS elements, irrespective of life style.

Conclusions

The apparent co-location of IS elements and pseudogenes found in prokaryotic genomes implies earlier IS transpositions into genes. As transpositions tend to be local rather than genome-wide, this likely explains the proximity between IS elements and pseudogenes. These findings suggest that ISs facilitate reductive evolution, for instance in the symbiotic cyanobacterium ‘Nostoc azollae’ 0708 and in other obligate prokaryotic symbionts.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1386-7) contains supplementary material, which is available to authorized users.
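The enrichment of pseudogenes near IS elements reported in this abstract could be quantified, in the simplest case, as the fraction of pseudogenes lying within a fixed distance of any IS element. The sketch below assumes single-coordinate positions on a linear chromosome and an arbitrary window size; none of the names or numbers come from the study.

```python
from bisect import bisect_left


def fraction_near(features, anchors, window):
    """Fraction of feature positions within `window` bp of any anchor."""
    anchors = sorted(anchors)
    if not features or not anchors:
        return 0.0
    near = 0
    for pos in features:
        i = bisect_left(anchors, pos)
        # Only the nearest anchor on each side needs to be checked.
        candidates = anchors[max(i - 1, 0):i + 1]
        if any(abs(pos - a) <= window for a in candidates):
            near += 1
    return near / len(features)


# Toy coordinates, NOT positions from 'Nostoc azollae' 0708.
is_positions = [1000, 5000]
pseudogene_positions = [900, 1200, 3000, 5050]
fraction = fraction_near(pseudogene_positions, is_positions, window=250)
print(fraction)
```

To call this an enrichment, the observed fraction would have to be compared against a null expectation, for example from randomly shuffled pseudogene positions, and a circular chromosome would need wrap-around distances.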

17.
We present the 4.8-Mb complete genome sequence of Salmonella enterica serovar Typhi strain Ty2, a human-specific pathogen causing typhoid fever. A comparison with the genome sequence of recently isolated S. enterica serovar Typhi strain CT18 showed that 29 of the 4,646 predicted genes in Ty2 are unique to this strain, while 84 genes are unique to CT18. Both genomes contain more than 200 pseudogenes; 9 of these genes in CT18 are intact in Ty2, while 11 intact CT18 genes are pseudogenes in Ty2. A half-genome interreplichore inversion in Ty2 relative to CT18 was confirmed. The two strains exhibit differences in prophages, insertion sequences, and island structures. While CT18 carries two plasmids, one conferring multiple drug resistance, Ty2 has no plasmids and is sensitive to antibiotics.

18.
Pseudogenes (total citations: 8; self-citations: 0; citations by others: 8)

19.
One of the common features of bacterial genomes is a strong compositional asymmetry between differently replicating DNA strands (leading and lagging). The main cause of the observed bias is the mutational pressure associated with replication. This suggests that genes translocated between differently replicating DNA strands are subjected to a higher mutational pressure, which may influence their composition and divergence rate. Analyses of groups of completely sequenced bacterial genomes have revealed that the highest divergence rate is observed for the DNA sequences that in closely related genomes are located on different DNA strands with respect to their role in replication. Paradoxically, for this group of sequences the absolute values of divergence rate are higher for closely related species than for more diverged ones. Since this effect concerns only the specific group of orthologs, there must be a specific mechanism introducing bias into the structure of chromosome by enriching the set of homologs in trans position in newly diverged species in relatively highly diverged sequences. These highly diverged sequences may be of varied nature: (1) paralogs or other fast-evolving genes under weak selection; or (2) pseudogenes that will probably be eliminated from the genome during further evolution; or (3) genes whose history after divergence is longer than the history of the genomes in which they are found. The use of these highly diverged sequences for phylogenetic analyses may influence the topology and branch length of phylogenetic trees. The changing mutational pressure may contribute to the emergence of genes with new functions as well.
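Compositional asymmetry between the leading and lagging strands is commonly summarized with GC skew, (G − C) / (G + C), computed along one strand. The abstract does not say which statistic the authors used, so the sketch below is only a generic illustration on a toy sequence; the function names and window size are assumptions.

```python
def gc_skew(seq):
    """(G - C) / (G + C) for a DNA string; 0.0 if it has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def windowed_skew(seq, window):
    """GC skew in consecutive non-overlapping windows of `window` bp."""
    return [
        gc_skew(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


# Toy sequence: a G-rich half followed by a C-rich half, mimicking
# the sign change of skew between differently replicating regions.
toy = "GGGCGGAC" + "CCCGCCTC"
skews = windowed_skew(toy, 8)
print(skews)
```

On a real bacterial chromosome, the sign of the skew typically changes near the origin and terminus of replication, so a gene that moves to the other strand finds itself in a sequence context with the opposite bias.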

20.
All six arms of the group 1 chromosomes of hexaploid wheat (Triticum aestivum) were sequenced with Roche/454 to 1.3- to 2.2-fold coverage and compared with similar data sets from the homoeologous chromosome 1H of barley (Hordeum vulgare). Six to ten thousand gene sequences were sampled per chromosome. These were classified into genes that have their closest homologs in the Triticeae group 1 syntenic region in Brachypodium, rice (Oryza sativa), and/or sorghum (Sorghum bicolor) and genes that have their homologs elsewhere in these model grass genomes. Although the number of syntenic genes was similar between the homologous groups, the number of nonsyntenic genes was found to be extremely diverse between wheat and barley and even between wheat subgenomes. Besides a small core group of genes that are nonsyntenic in other grasses but conserved among Triticeae, we found thousands of genic sequences that are specific to chromosomes of one single species or subgenome. By examining in detail 50 genes from chromosome 1H for which BAC sequences were available, we found that many represent pseudogenes that resulted from transposable element activity and double-strand break repair. Thus, Triticeae seem to accumulate nonsyntenic genes frequently. Since many of them are likely to be pseudogenes, total gene numbers in Triticeae are prone to pronounced overestimates.
