Similar literature
 20 similar documents found
1.

Background

Comparative evolutionary analysis of whole genomes requires not only accurate annotation of gene space, but also proper annotation of the repetitive fraction which is often the largest component of most if not all genomes larger than 50 kb in size.

Results

Here we present the Rice TE database (RiTE-db) - a genus-wide collection of transposable elements and repeated sequences across 11 diploid species of the genus Oryza and the closely-related out-group Leersia perrieri. The database consists of more than 170,000 entries divided into three main types: (i) a classified and curated set of publicly-available repeated sequences, (ii) a set of consensus assemblies of highly-repetitive sequences obtained from genome sequencing surveys of 12 species, and (iii) a set of full-length TEs, identified and extracted from 12 whole genome assemblies.

Conclusions

This is the first report of a repeat dataset that spans the majority of repeat variability within an entire genus, and one that includes complete elements as well as unassembled repeats. The database allows sequence browsing, downloading, and similarity searches. Because of the strategy adopted, the RiTE-db opens a new path to unprecedented direct comparative studies that span the entire nuclear repeat content of 15 million years of Oryza diversity.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1762-3) contains supplementary material, which is available to authorized users.

2.
3.

Background

Pseudomonas aeruginosa is an important opportunistic pathogen responsible for many infections in hospitalized and immunocompromised patients. Previous reports estimated that approximately 10% of its 6.6 Mbp genome varies from strain to strain and is therefore referred to as “accessory genome”. Elements within the accessory genome of P. aeruginosa have been associated with differences in virulence and antibiotic resistance. As whole genome sequencing of bacterial strains becomes more widespread and cost-effective, methods to quickly and reliably identify accessory genomic elements in newly sequenced P. aeruginosa genomes will be needed.

Results

We developed a bioinformatic method for identifying the accessory genome of P. aeruginosa. First, the core genome was determined based on sequence conserved among the completed genomes of twelve reference strains using Spine, a software program developed for this purpose. The core genome was 5.84 Mbp in size and contained 5,316 coding sequences. We then developed an in silico genome subtraction program named AGEnt to filter out core genomic sequences from P. aeruginosa whole genomes to identify accessory genomic sequences of these reference strains. This analysis determined that the accessory genome of P. aeruginosa ranged from 6.9-18.0% of the total genome, was enriched for genes associated with mobile elements, and consisted mostly of genes with unknown or unclear function. Using these genomes, we showed that AGEnt performed well compared to other publicly available programs designed to detect accessory genomic elements. We then demonstrated the utility of the AGEnt program by applying it to the draft genomes of two previously unsequenced P. aeruginosa strains, PA99 and PA103.
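The idea of in silico genome subtraction described above can be illustrated with a toy sketch. This is not how AGEnt or Spine work internally; it assumes a simplified exact k-mer match in place of real sequence alignment, and the function names `kmers` and `accessory_regions` are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def accessory_regions(genome, core_seqs, k=8):
    """Return half-open (start, end) intervals of `genome` that are not
    covered by any k-mer present in the core sequences.

    Assumption: exact k-mer matching stands in for alignment, so this is
    only a toy model of genome subtraction.
    """
    core = set()
    for seq in core_seqs:
        core |= kmers(seq, k)

    # Mark every position that lies inside a k-mer shared with the core.
    covered = [False] * len(genome)
    for i in range(len(genome) - k + 1):
        if genome[i:i + k] in core:
            for j in range(i, i + k):
                covered[j] = True

    # Collect maximal runs of uncovered positions.
    regions = []
    start = None
    for i, is_core in enumerate(covered):
        if not is_core and start is None:
            start = i
        elif is_core and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(genome)))
    return regions
```

For example, with `core_seqs = ["AAAACCCCGGGG"]` and `k=4`, a genome that carries an extra `TTTTTT` insertion between two core segments yields that insertion as its only accessory region.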

Conclusions

The P. aeruginosa genome is rich in accessory genetic material. The AGEnt program accurately identified the accessory genomes of newly sequenced P. aeruginosa strains, even when draft genomes were used. As P. aeruginosa genomes become available at an increasingly rapid pace, this program will be useful in cataloging the expanding accessory genome of this bacterium and in discerning correlations between phenotype and accessory genome makeup. The combination of Spine and AGEnt should be useful in defining the accessory genomes of other bacterial species as well.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-737) contains supplementary material, which is available to authorized users.

4.
《PloS one》2009,4(7)

Background

Streptococcus suis is a zoonotic pathogen that infects pigs and can occasionally cause serious infections in humans. S. suis infections occur sporadically in humans in Europe and North America, but a recent major outbreak has been described in China with high levels of mortality. The mechanisms of S. suis pathogenesis in humans and pigs are poorly understood.

Methodology/Principal Findings

The sequencing of whole genomes of S. suis isolates provides opportunities to investigate the genetic basis of infection. Here we describe whole genome sequences of three S. suis strains from the same lineage: one from European pigs, and two from human cases from China and Vietnam. Comparative genomic analysis was used to investigate the variability of these strains. S. suis is phylogenetically distinct from other Streptococcus species for which genome sequences are currently available. Accordingly, ∼40% of the ∼2 Mb genome is unique in comparison to other Streptococcus species. Finer genomic comparisons within the species showed a high level of sequence conservation; virtually all of the genome is common to the S. suis strains. The only exceptions are three ∼90 kb regions, present in the two isolates from humans, composed of integrative conjugative elements and transposons. Carried in these regions are coding sequences associated with drug resistance. In addition, small-scale sequence variation has generated pseudogenes in putative virulence and colonization factors.

Conclusions/Significance

The genomic inventories of genetically related S. suis strains, isolated from distinct hosts and diseases, exhibit high levels of conservation. However, the genomes provide evidence that horizontal gene transfer has contributed to the evolution of drug resistance.

5.
Yan C  Sun G  Sun D 《PloS one》2011,6(10):e26853

Background

Previous cytological and single copy nuclear gene data suggested that the St and Y genomes in the StY-genomic Elymus species originated from different donors: the St from a diploid species in Pseudoroegneria and the Y from an unknown diploid species, which is now extinct or undiscovered. However, ITS data suggested that the Y and St genomes shared the same progenitor, although rather few St genome species were studied. A recent analysis of many samples of the St genome species Pseudoroegneria spicata (Pursh) À. Löve suggested that one accession of P. spicata was the most likely donor of the Y genome. The present study tested whether intraspecific variation during sampling could affect the outcome of analyses to determine the origin of the Y genome in allotetraploid StY species. We also explored the evolutionary dynamics of these species.

Methodology/Principal Findings

Two single copy nuclear genes, the second largest subunit of RNA polymerase II (RPB2) and the translation elongation factor G (EF-G), were sequenced from 58 accessions of Pseudoroegneria and Elymus species and analyzed, together with sequences from Hordeum (H), Agropyron (P), Australopyrum (W), Lophopyrum (Ee), Thinopyrum (Ea), Thinopyrum (Eb), and Dasypyrum (V), using maximum parsimony, maximum likelihood and Bayesian methods. Sequence comparisons among all these genomes revealed that the St and Y genomes are relatively dissimilar. Extensive sequence variation was detected not only between the sequences from the St and Y genomes, but also among the sequences from diploid St genome species. Phylogenetic analyses separated the Y sequences from the St sequences.

Conclusions/Significance

Our results confirmed that the St and Y genomes in Elymus species originated from different donors, and demonstrated that intraspecific variation does not affect the identification of genome origin in polyploids. Moreover, the sequence data provided evidence supporting convergent genome evolution in allopolyploid StY genome species.

6.

Background and Aims

It is known that the miniature inverted-repeat transposable element (MITE) preferentially inserts into low-copy-number sequences or genic regions. Characterization of the second largest subunit of low-copy nuclear RNA polymerase II (RPB2) has indicated that MITEs and indels have shaped the homoeologous RPB2 loci in the St and H genomes of Elymus species in Triticeae. The aims of this study were to determine whether a MITE is present in the RPB2 gene in Hordeum genomes, and to compare the gene evolution of RPB2 with other diploid Triticeae species. The sequences were used to reconstruct the phylogeny of the genus Hordeum.

Methods

RPB2 regions from all diploid species of Hordeum, one tetraploid species (H. brevisubulatum) and ten accessions of diploid Triticeae species were amplified and sequenced. Parsimony analysis of the DNA dataset was performed in order to reveal the phylogeny of Hordeum species.

Key Results

A MITE was detected in the Xu genome. A 27–36 bp indel sequence was found in the I and Xu genomes, but deleted in the Xa and some H genome species. Interestingly, the indel length in H genomes corresponds well to their geographical distribution. Phylogenetic analysis of the RPB2 sequences positioned the H and Xa genomes in one monophyletic group. The I and Xu genomes are distinctly separated from the H and Xa ones. The RPB2 data also separated all New World H genome species except H. patagonicum ssp. patagonicum from the Old World H genome species.

Conclusions

MITE and large indels have shaped the RPB2 loci between the Xu and H, I and Xa genomes. The phylogenetic analysis of the RPB2 sequences confirmed the monophyly of Hordeum. The maximum-parsimony analysis demonstrated the four genomes to be subdivided into two groups.

Key words: Molecular evolution, RPB2, Hordeum, transposable element, phylogeny

7.

Background

Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes.

Methodology/Principal Findings

We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes.

Conclusion

The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.

8.

Background

The mechanism of high-altitude adaptation has been studied in certain mammals. However, in avian species like the ground tit Pseudopodoces humilis, the adaptation mechanism remains unclear. The phylogeny of the ground tit is also controversial.

Results

Using next generation sequencing technology, we generated and assembled a draft genome sequence of the ground tit. The assembly contained 1.04 Gb of sequence that covered 95.4% of the whole genome and had higher N50 values, at the level of both scaffolds and contigs, than other sequenced avian genomes. About 1.7 million SNPs were detected, 16,998 protein-coding genes were predicted and 7% of the genome was identified as repeat sequences. Comparisons between the ground tit genome and other avian genomes revealed a conserved genome structure and confirmed the phylogeny of ground tit as not belonging to the Corvidae family. Gene family expansion and positively selected gene analysis revealed genes that were related to cardiac function. Our findings contribute to our understanding of the adaptation of this species to extreme environmental living conditions.
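The N50 values mentioned above summarize assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that standard definition follows; the function name `n50` is an assumption made for illustration, and the numbers in the usage note are invented, not taken from the ground tit assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of the overall total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
```

Calling `n50([2, 2, 2, 3, 3, 4, 8, 8])` gives 8, because the two longest sequences already account for 16 of the 32 total bases.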

Conclusions

Our data and analysis contribute to the study of avian evolutionary history and provide new insights into the adaptation mechanisms to extreme conditions in animals.

9.

Background

The yaws treponemes, Treponema pallidum ssp. pertenue (TPE) strains, are closely related to syphilis-causing strains of Treponema pallidum ssp. pallidum (TPA). Yaws and syphilis are distinguished on the basis of epidemiological characteristics, clinical symptoms, and several genetic signatures of the corresponding causative agents.

Methodology/Principal Findings

To precisely define genetic differences between TPA and TPE, high-quality whole genome sequences of three TPE strains (Samoa D, CDC-2, Gauthier) were determined using next-generation sequencing techniques. TPE genome sequences were compared to four genomes of TPA strains (Nichols, DAL-1, SS14, Chicago). The genome structure was identical in all three TPE strains with similar length ranging between 1,139,330 bp and 1,139,744 bp. No major genome rearrangements were found when compared to the four TPA genomes. The whole genome nucleotide divergence (dA) between TPA and TPE subspecies was 4.7 and 4.8 times higher than the observed nucleotide diversity (π) among TPA and TPE strains, respectively, corresponding to 99.8% identity between TPA and TPE genomes. A set of 97 (9.9%) TPE genes encoded proteins containing two or more amino acid replacements or other major sequence changes. The TPE divergent genes were mostly from the group encoding potential virulence factors and genes encoding proteins with unknown function.

Conclusions/Significance

Hypothetical genes with genetic differences consistently found between TPE and TPA strains are candidates for virulence factors of syphilitic treponemes. Seventeen TPE genes were predicted to be under positive selection, and eleven of them coded for either predicted exported proteins or membrane proteins, suggesting their possible association with the cell surface. Sequence changes between TPE and TPA strains and changes specific to individual strains represent suitable targets for subspecies- and strain-specific molecular diagnostics.

10.
11.

Background

Understanding genetic determinants of a microbial phenotype generally involves creating and comparing isogenic strains differing at the locus of interest, but the naturally existing genomic and phenotypic diversity of microbial populations has rarely been exploited. Here we report use of a diverse collection of 616 carriage isolates of Streptococcus pneumoniae and their genome sequences to help identify a novel determinant of pneumococcal colonization.

Results

A spontaneously arising laboratory variant (SpnYL101) of a capsule-switched TIGR4 strain (TIGR4:19F) showed reduced ability to establish mouse nasal colonization and lower resistance to non-opsonic neutrophil-mediated killing in vitro, a phenotype correlated with in vivo success. Whole genome sequencing revealed 5 single nucleotide polymorphisms (SNPs) affecting 4 genes in SpnYL101 relative to its ancestor. To evaluate the effect of variation in each gene, we performed an in silico screen of 616 previously published genome sequences to identify pairs of closely-related, serotype-matched isolates that differ at the gene of interest, and compared their resistance to neutrophil-killing. This method allowed rapid examination of multiple candidate genes and found phenotypic differences apparently associated with variation in SP_1645, a RelA/SpoT homolog (RSH) involved in the stringent response. To establish causality, the alleles corresponding to SP_1645 were switched between TIGR4:19F and SpnYL101. The wild-type SP_1645 conferred higher resistance to neutrophil-killing and competitiveness in mouse colonization. Using a similar strategy, variation in another RSH gene (TIGR4 locus tag SP_1097) was found to alter resistance to neutrophil-killing.

Conclusions

These results indicate that analysis of naturally existing genomic diversity complements traditional genetics approaches to accelerate genotype-phenotype analysis.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1573-6) contains supplementary material, which is available to authorized users.

12.

Background

Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp.

Results

We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis of the mitochondrial genomes of pepper and tobacco, both members of the Solanaceae, revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In a comparison between the pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously-reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, a large number of small sequence segments that were absent from or found at different locations in the Jeju mitochondrial genome were combined. The incorporation of repeats and the overlap of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in the evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes.

Conclusions

Although a large portion of sequence context was shared by the mitochondrial genomes of CMS and male-fertile pepper lines, extensive genome rearrangements were detected. CMS candidate genes were located on the edges of highly-rearranged CMS-specific DNA regions and near repeat sequences. These characteristics were also detected among CMS-associated genes in other species, implying that a common mechanism might be involved in the evolution of CMS-associated genes.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-561) contains supplementary material, which is available to authorized users.

13.

Background

Bacteriophages that infect the opportunistic pathogen Pseudomonas aeruginosa have been classified into several groups. One of them, which includes temperate phage particles with icosahedral heads and long flexible tails, bears genomes whose architecture and replication mechanism, but not their nucleotide sequences, are like those of coliphage Mu. By comparing the genomic sequences of this group of P. aeruginosa phages one could draw conclusions about their ontogeny and evolution.

Results

Two newly isolated Mu-like phages of P. aeruginosa are described and their genomes sequenced and compared with those available in the public data banks. The genome sequences of the two phages are similar to each other and to those of a group of P. aeruginosa transposable phages. Comparing twelve of these genomes revealed a common genomic architecture in the group. Each phage genome had numerous genes with homologues in all the other genomes and a set of variable genes specific for each genome. The first group, which comprised most of the genes with assigned functions, was named “core genome”, and the second group, containing mostly short ORFs without assigned functions, was called “accessory genome”. As in other phage groups, variable genes are confined to specific regions in the genome.
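The core/accessory split described above can be sketched with plain set operations, assuming each genome has already been reduced to a set of gene-family identifiers (the homology search that produces those identifiers is not shown). The function name `split_core_accessory` and the example identifiers are hypothetical.

```python
def split_core_accessory(gene_sets):
    """Split gene families into a shared core and per-genome accessory sets.

    gene_sets maps a genome name to the set of gene-family identifiers
    found in it. Core families occur in every genome; everything else is
    reported as accessory for the genome that carries it.
    """
    if not gene_sets:
        raise ValueError("gene_sets must contain at least one genome")
    core = set.intersection(*gene_sets.values())
    accessory = {name: genes - core for name, genes in gene_sets.items()}
    return core, accessory
```

With three toy genomes `{"phiA": {"A", "B", "x1"}, "phiB": {"A", "B", "x2"}, "phiC": {"A", "B", "x1", "x3"}}`, the core is `{"A", "B"}` and the accessory set of `phiC` is `{"x1", "x3"}`.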

Conclusion

Based on the known and inferred functions of some of the variable genes of the phages analyzed here, these genes appear to confer selective advantages for phage survival under particular host conditions. We speculate that phages have developed a mechanism for horizontally acquiring genes and incorporating them at specific loci in the genome, which helps phage adaptation to the selective pressures imposed by the host.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1146) contains supplementary material, which is available to authorized users.

14.

Background

The soybean-Bradyrhizobium symbiosis can be highly efficient in fixing nitrogen, but few genomic sequences of elite inoculant strains are available. Here we contribute information on the genomes of two commercial strains that are broadly applied to soybean crops in the tropics. B. japonicum CPAC 15 (=SEMIA 5079) is outstanding in its saprophytic capacity and competitiveness, whereas B. diazoefficiens CPAC 7 (=SEMIA 5080) is known for its high efficiency in fixing nitrogen. Both are well adapted to tropical soils. The genomes of CPAC 15 and CPAC 7 were compared to each other and also to those of B. japonicum USDA 6T and B. diazoefficiens USDA 110T.

Results

Differences in genome size were found between species, with B. japonicum having larger genomes than B. diazoefficiens. Although most of the four genomes were syntenic, genome rearrangements within and between species were observed, including events in the symbiosis island. In addition to the symbiotic region, several genomic islands were identified. Altogether, these features must confer high genomic plasticity that might explain adaptation and differences in symbiotic performance. It was not possible to attribute known functions to half of the predicted genes. About 10% of each genome was composed of genes exclusive to that strain, but up to 98% of them were of unknown function or coded for mobile genetic elements. In CPAC 15, more genes were associated with secondary metabolites, nutrient transport, iron-acquisition and IAA metabolism, potentially correlated with higher saprophytic capacity and competitiveness than seen with CPAC 7. In CPAC 7, more genes were related to the metabolism of amino acids and hydrogen uptake, potentially correlated with higher efficiency of nitrogen fixation than seen with CPAC 15.

Conclusions

Several differences and similarities detected between the two elite soybean-inoculant strains and between the two species of Bradyrhizobium provide new insights into adaptation to tropical soils, efficiency of N2 fixation, nodulation and competitiveness.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-420) contains supplementary material, which is available to authorized users.

15.
16.

Background

Brassica rapa is one of the most economically important vegetable crops worldwide. Owing to its agronomic importance and phylogenetic position, B. rapa provides a crucial reference to understand polyploidy-related crop genome evolution. The high degree of sequence identity and remarkably conserved genome structure between Arabidopsis and Brassica genomes enables comparative tiling sequencing using Arabidopsis sequences as references to select the counterpart regions in B. rapa, which is a strong challenge of structural and comparative crop genomics.

Results

We assembled 65.8 megabase-pairs of non-redundant euchromatic sequence of B. rapa and compared this sequence to the Arabidopsis genome to investigate chromosomal relationships, macrosynteny blocks, and microsynteny within blocks. The triplicated B. rapa genome contains only approximately twice the number of genes as in Arabidopsis because of genome shrinkage. Genome comparisons suggest that B. rapa has a distinct organization of ancestral genome blocks as a result of recent whole genome triplication followed by a unique diploidization process. A lack of the most recent whole genome duplication (3R) event in the B. rapa genome, atypical of other Brassica genomes, may account for the emergence of B. rapa from the Brassica progenitor around 8 million years ago.

Conclusions

This work demonstrates the potential of using comparative tiling sequencing for genome analysis of crop species. Based on a comparative analysis of the B. rapa sequences and the Arabidopsis genome, it appears that polyploidy and chromosomal diploidization are ongoing processes that collectively stabilize the B. rapa genome and facilitate its evolution.

17.

Background

Rigorous study of mitochondrial functions and cell biology in the budding yeast, Saccharomyces cerevisiae, has advanced our understanding of mitochondrial genetics. This yeast is now a powerful model for population genetics, owing to large genetic diversity and highly structured populations among wild isolates. Comparative mitochondrial genomic analyses between yeast species have revealed broad evolutionary changes in genome organization and architecture. A fine-scale view of recent evolutionary changes within S. cerevisiae has not been possible due to low numbers of complete mitochondrial sequences.

Results

To address challenges of sequencing AT-rich and repetitive mitochondrial DNAs (mtDNAs), we sequenced two divergent S. cerevisiae mtDNAs using a single-molecule sequencing platform (PacBio RS). Using de novo assemblies, we generated highly accurate complete mtDNA sequences. These mtDNA sequences were compared with 98 additional mtDNA sequences gathered from various published collections. Phylogenies based on mitochondrial coding sequences and intron profiles revealed that intraspecific diversity in mitochondrial genomes generally recapitulated the population structure of nuclear genomes. Analysis of intergenic sequence indicated a recent expansion of mobile elements in certain populations. Additionally, our analyses revealed that certain populations lacked introns previously believed conserved throughout the species, as well as the presence of introns never before reported in S. cerevisiae.

Conclusions

Our results revealed that the extensive variation in S. cerevisiae mtDNAs is often population specific, thus offering a window into the recent evolutionary processes shaping these genomes. In addition, we offer an effective strategy for sequencing these challenging AT-rich mitochondrial genomes for small scale projects.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1664-4) contains supplementary material, which is available to authorized users.

18.
19.
Yang CH  Chang HW  Ho CH  Chou YC  Chuang LY 《PloS one》2011,6(3):e17729

Background

Complete mitochondrial (mt) genome sequencing is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. For sequencing long templates such as the entire mtDNA, it is essential to design primers for Polymerase Chain Reaction (PCR) amplicons that partly overlap each other. The presented chromosome walking strategy provides the overlapping design, which addresses the problem of unreliable sequencing data at the 5′ end and enables effective sequencing. However, current algorithms and tools mostly focus on primer design for a local region of the genomic sequence. Accordingly, it is still challenging to provide primer sets for the entire mtDNA.

Methodology/Principal Findings

The purpose of this study is to develop an integrated primer design algorithm for the entire mt genome in general, and for common primer sets for closely-related species in particular. We use ClustalW to generate the multiple sequence alignment needed to find the conserved sequences in closely-related species. These conserved sequences are suitable for designing common primers for the entire mtDNA. Using particle swarm optimization (PSO), a heuristic algorithm, all the designed primers were computationally validated to fit the common primer design constraints, such as melting temperature, primer length, GC content, PCR product length, secondary structure, specificity, and terminal limitation. The overlap requirement for PCR amplicons across the entire mtDNA is satisfied by defining the overlapping region with a sliding window technique. Finally, primer sets were designed within the overlapping region. The primer sets for the entire mtDNA sequences were successfully demonstrated on two closely-related fish species. The pseudo code for the primer design algorithm is provided.
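Two of the ingredients named above, basic primer constraints and overlapping amplicon windows, can be sketched as follows. This is not the authors' algorithm and does not implement PSO; it is a minimal illustration under stated assumptions. The threshold defaults are invented for the example, the melting temperature uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rough estimate for short oligonucleotides, and the genome is treated as linear. All function names are hypothetical.

```python
def gc_content(primer):
    """Fraction of G and C bases in an uppercase primer sequence."""
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in degrees C by the Wallace rule."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def passes_constraints(primer, min_len=18, max_len=24,
                       min_gc=0.40, max_gc=0.60, min_tm=50, max_tm=65):
    """Check length, GC content and Tm; thresholds are assumed defaults."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_content(primer) <= max_gc
            and min_tm <= wallace_tm(primer) <= max_tm)


def amplicon_windows(genome_len, amplicon_len, overlap):
    """Tile a linear sequence with half-open (start, end) windows so that
    consecutive windows share `overlap` bases; the last one may be shorter."""
    if genome_len <= 0:
        raise ValueError("genome_len must be positive")
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must satisfy 0 <= overlap < amplicon_len")
    step = amplicon_len - overlap
    windows = []
    start = 0
    while True:
        end = min(start + amplicon_len, genome_len)
        windows.append((start, end))
        if end == genome_len:
            break
        start += step
    return windows
```

For instance, `amplicon_windows(2500, 1000, 200)` returns `[(0, 1000), (800, 1800), (1600, 2500)]`, so each amplicon shares 200 bases with its neighbour. A fuller design would also score secondary structure and specificity, which this sketch leaves out.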

Conclusions/Significance

Our proposed sliding window-based PSO algorithm provides the necessary primer sets for amplification and sequencing of the entire mt genome.

20.

Background

Cryptosporidium hominis is a dominant species for human cryptosporidiosis. Within the species, IbA10G2 is the most virulent subtype responsible for all C. hominis–associated outbreaks in Europe and Australia, and is a dominant outbreak subtype in the United States. In recent years, IaA28R4 has become a major new subtype in the United States. In this study, we sequenced the genomes of two field specimens from each of the two subtypes and conducted a comparative genomic analysis of the obtained sequences with those from the only fully sequenced Cryptosporidium parvum genome.

Results

Altogether, 8.59-9.05 Mb of Cryptosporidium sequences in 45–767 assembled contigs were obtained from the four specimens, representing 94.36-99.47% coverage of the expected genome. These genomes had complete synteny in gene organization and 96.86-97.0% and 99.72-99.83% nucleotide sequence similarities to the published genomes of C. parvum and C. hominis, respectively. Several major insertions and deletions were seen between C. hominis and C. parvum genomes, involving mostly members of multicopy gene families near telomeres. The four C. hominis genomes were highly similar to each other and divergent from the reference IaA25R3 genome in some highly polymorphic regions. Major sequence differences among the four specimens sequenced in this study were in the 5′ and 3′ ends of chromosome 6 and the gp60 region, largely the result of genetic recombination.

Conclusions

The sequence similarity among specimens of the two dominant outbreak subtypes and genetic recombination in chromosome 6, especially around the putative virulence determinant gp60 region, suggest that genetic recombination plays a potential role in the emergence of hyper-transmissible C. hominis subtypes. The high sequence conservation between C. parvum and C. hominis genomes and significant differences in copy numbers of MEDLE family secreted proteins and insulinase-like proteases indicate that telomeric gene duplications could potentially contribute to host expansion in C. parvum.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1517-1) contains supplementary material, which is available to authorized users.
