Similar articles
1.
2.
Yang CW, Chen SM. PLoS ONE 2012, 7(2): e30751

Background

Variation in the genomes of single-stranded RNA viruses affects their infectivity and pathogenicity in two ways. First, viral genome sequence variations lead to changes in viral protein sequences and activities. Second, viral genome sequence variation produces diversity at the level of nucleotide composition and diversity in the interactions between viral RNAs and host toll-like receptors (TLRs). A viral genome-typing method based on this type of diversity has not yet been established.

Methodology/Principal Findings

In this study, we propose a novel genomic trait called the “TLR stimulatory trimer composition” (TSTC) and two quantitative indicators, Score S and Score N, named “TLR stimulatory scores” (TSS). Using the complete genome sequences of 10,994 influenza A viruses (IAV) and 251 influenza B viruses, we show that TSTC analysis reveals the diversity of Score S and Score N among the IAVs isolated from various hosts. In addition, we show that low values of Score S are correlated with high pathogenicity and pandemic potential in IAVs. Finally, we use Score S and Score N to construct a logistic regression model to recognize IAV strains that are highly pathogenic or have high pandemic potential.
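
The trimer composition underlying TSTC can be illustrated with a short sketch. The abstract does not say how Score S and Score N are derived from the trimer counts, so the code below only tabulates relative trimer frequencies. The function name `trimer_composition`, the use of overlapping windows, and the T-to-U conversion are assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from itertools import product


def trimer_composition(seq):
    """Return the relative frequency of each of the 64 RNA trimers in seq.

    Hypothetical helper: overlapping windows of length 3 are counted, and
    windows containing ambiguous bases (anything other than A, C, G, U)
    are ignored.
    """
    seq = seq.upper().replace("T", "U")
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    trimers = ["".join(p) for p in product("ACGU", repeat=3)]
    total = sum(counts[t] for t in trimers)
    return {t: (counts[t] / total if total else 0.0) for t in trimers}


# Toy sequence, not a real influenza segment.
comp = trimer_composition("AUGGUUGUUAUG")
print(comp["GUU"], comp["AUG"])
```

A score built on such a table would weight each trimer by some measure of its TLR stimulatory activity; those weights are not given in the abstract and are deliberately left out here.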

Conclusions/Significance

Results from the TSTC analysis indicate that there are large differences between human and avian IAV genomes (except for segment 3), as illustrated by Score S. Moreover, segments 1, 2, 3 and 4 may be major determinants of the stimulatory activity exerted on human TLRs 7 and 8. We also find that a low Score S value is associated with high pathogenicity and pandemic potential in IAV. The π value from the TSS-derived logistic regression model is useful for recognizing emerging IAVs that have high pathogenicity and pandemic potential.

3.

Background

Frankia is a genus of soil actinobacteria forming nitrogen-fixing root-nodule symbiotic relationships with non-leguminous woody plant species, collectively called actinorhizals, from eight dicotyledonous families. Frankia strains are classified into four host-specificity groups (HSGs), each of which exhibits a distinct host range. Genome sizes of representative strains of Alnus, Casuarina, and Elaeagnus HSGs are highly diverged and are positively correlated with the size of their host ranges.

Results

The content and size of 12 Frankia genomes were investigated by in silico comparative genome hybridization and pulsed-field gel electrophoresis, respectively. Data were collected from four query strains of each HSG and compared with those of reference strains possessing completely sequenced genomes. The degree of difference in genome content between query and reference strains varied depending on HSG. Elaeagnus query strains were missing the greatest number (22–32%) of genes compared with the corresponding reference genome; Casuarina query strains lacked the fewest (0–4%), with Alnus query strains intermediate (14–18%). In spite of the remarkable gene loss, genome sizes of Alnus and Elaeagnus query strains were larger than would be expected based on total length of the absent genes. In contrast, Casuarina query strains had smaller genomes than expected.

Conclusions

The positive correlation between genome size and host range held true across all investigated strains, supporting the hypothesis that size and genome content differences are responsible for observed diversity in host plants and host plant biogeography among Frankia strains. In addition, our results suggest that different dynamics of shuffling of genome content have contributed to these symbiotic and biogeographic adaptations. Elaeagnus strains, and to a lesser extent Alnus strains, have gained and lost many genes to adapt to a wide range of environments and host plants. Conversely, rather than acquiring new genes, Casuarina strains have discarded genes to reduce genome size, suggesting an evolutionary orientation towards existence as specialist symbionts.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-609) contains supplementary material, which is available to authorized users.

4.

Background

Pseudomonas aeruginosa is an important opportunistic pathogen responsible for many infections in hospitalized and immunocompromised patients. Previous reports estimated that approximately 10% of its 6.6 Mbp genome varies from strain to strain and is therefore referred to as “accessory genome”. Elements within the accessory genome of P. aeruginosa have been associated with differences in virulence and antibiotic resistance. As whole genome sequencing of bacterial strains becomes more widespread and cost-effective, methods to quickly and reliably identify accessory genomic elements in newly sequenced P. aeruginosa genomes will be needed.

Results

We developed a bioinformatic method for identifying the accessory genome of P. aeruginosa. First, the core genome was determined based on sequence conserved among the completed genomes of twelve reference strains using Spine, a software program developed for this purpose. The core genome was 5.84 Mbp in size and contained 5,316 coding sequences. We then developed an in silico genome subtraction program named AGEnt to filter out core genomic sequences from P. aeruginosa whole genomes to identify accessory genomic sequences of these reference strains. This analysis determined that the accessory genome of P. aeruginosa ranged from 6.9% to 18.0% of the total genome, was enriched for genes associated with mobile elements, and consisted mostly of genes with unknown or unclear function. Using these genomes, we showed that AGEnt performed well compared to other publicly available programs designed to detect accessory genomic elements. We then demonstrated the utility of the AGEnt program by applying it to the draft genomes of two previously unsequenced P. aeruginosa strains, PA99 and PA103.
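
The idea of in silico genome subtraction can be sketched with plain coordinate arithmetic. The example assumes the core regions of a genome are already known as half-open intervals; the accessory genome is then whatever remains. This is a toy illustration under that assumption, not the AGEnt implementation, and the function names are hypothetical.

```python
def subtract_core(genome_length, core_intervals):
    """Return the intervals of [0, genome_length) not covered by any core interval.

    Intervals are half-open (start, end) pairs and may overlap.
    """
    accessory = []
    cursor = 0  # first position not yet known to be covered by core sequence
    for start, end in sorted(core_intervals):
        if start > cursor:
            accessory.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < genome_length:
        accessory.append((cursor, genome_length))
    return accessory


def accessory_fraction(genome_length, core_intervals):
    """Fraction of the genome that lies outside the core intervals."""
    accessory = subtract_core(genome_length, core_intervals)
    return sum(end - start for start, end in accessory) / genome_length


# Toy 1,000 bp genome with three (partly overlapping) core regions.
core = [(0, 400), (350, 600), (700, 1000)]
acc = subtract_core(1000, core)
frac = accessory_fraction(1000, core)
print(acc, frac)
```

In practice the core intervals would come from aligning the genome against a core-genome reference, which is the step this sketch takes as given.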

Conclusions

The P. aeruginosa genome is rich in accessory genetic material. The AGEnt program accurately identified the accessory genomes of newly sequenced P. aeruginosa strains, even when draft genomes were used. As P. aeruginosa genomes become available at an increasingly rapid pace, this program will be useful in cataloging the expanding accessory genome of this bacterium and in discerning correlations between phenotype and accessory genome makeup. The combination of Spine and AGEnt should be useful in defining the accessory genomes of other bacterial species as well.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-737) contains supplementary material, which is available to authorized users.

5.

Background

Metaviriomes, the viral genomes present in an environment, have been studied by direct sequencing of the viral DNA or by cloning in small insert libraries. The short reads generated by both approaches make it very difficult to assemble and annotate such flexible genomic entities. Many environmental viruses belong to unknown groups or prey on uncultured and little known cellular lineages, and hence might not be present in databases.

Methodology and Principal Findings

Here we have used a different approach, the cloning of viral DNA into fosmids before sequencing, to obtain natural contigs that are close to the size of a viral genome. We have studied a relatively low diversity extreme environment: saturated NaCl brines, which simplifies the analysis and interpretation of the data. Forty-two different viral genomes were retrieved, and some of these were almost complete, and could be tentatively identified as head-tail phages (Caudovirales).

Conclusions and Significance

We found a cluster of phage genomes that most likely infect Haloquadratum walsbyi, the square archaeon and major component of the community in these hypersaline habitats. The identity of the prey could be confirmed by the presence of CRISPR spacer sequences shared by the virus and one of the available strain genomes. Other viral clusters detected appeared to prey on the Nanohaloarchaea and on the bacterium Salinibacter ruber, covering most of the diversity of microbes found in this type of environment. This approach therefore appears to be a viable alternative for describing metaviriomes in a much more detailed and reliable way than the more common approaches based on direct sequencing. An example of transfer of a CRISPR cluster, including repeats and spacers, was found by chance, supporting the dynamic nature and frequent transfer of this peculiar prokaryotic mechanism of cell protection.

6.

Background

By reshuffling genomes, structural genomic reorganizations provide genetic variation on which natural selection can work. Understanding the mechanisms underlying this process has been a long-standing question in evolutionary biology. In this context, our purpose in this study is to characterize the genomic regions involved in structural rearrangements between human and macaque genomes and determine their influence on meiotic recombination as a way to explore the adaptive role of genome shuffling in mammalian evolution.

Results

We first constructed a highly refined map of the structural rearrangements and evolutionary breakpoint regions in the human and rhesus macaque genomes based on orthologous genes and whole-genome sequence alignments. Using two different algorithms, we refined the genomic position of known rearrangements previously reported by cytogenetic approaches and described new putative micro-rearrangements (inversions and indels) in both genomes. A detailed analysis of the rhesus macaque genome showed that evolutionary breakpoints are in gene-rich regions, being enriched in GO terms related to the immune system. We also identified defense-response genes within a chromosome inversion fixed in the macaque lineage, underlining the relevance of structural genomic changes in evolutionary and/or adaptation processes. Moreover, by combining in silico and experimental approaches, we studied the recombination pattern of specific chromosomes that have undergone rearrangements between human and macaque lineages.

Conclusions

Our data suggest that adaptive alleles – in this case, genes involved in the immune response – might have been favored by genome rearrangements in the macaque lineage.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-530) contains supplementary material, which is available to authorized users.

7.

Background

Mammalian genomes commonly harbor endogenous viral elements. Due to a lack of comparable genome-scale sequence data, far less is known about endogenous viral elements in avian species, even though their small genomes may enable important insights into the patterns and processes of endogenous viral element evolution.

Results

Through a systematic screening of the genomes of 48 species sampled across the avian phylogeny we reveal that birds harbor a limited number of endogenous viral elements compared to mammals, with only five viral families observed: Retroviridae, Hepadnaviridae, Bornaviridae, Circoviridae, and Parvoviridae. All nonretroviral endogenous viral elements are present at low copy numbers and in few species, with only endogenous hepadnaviruses widely distributed, although these have been purged in some cases. We also provide the first evidence for endogenous bornaviruses and circoviruses in avian genomes, although at very low copy numbers. A comparative analysis of vertebrate genomes revealed a simple linear relationship between endogenous viral element abundance and host genome size, such that the occurrence of endogenous viral elements in bird genomes is 6- to 13-fold less frequent than in mammals.

Conclusions

These results reveal that avian genomes harbor relatively small numbers of endogenous viruses, particularly those derived from RNA viruses, and hence are either less susceptible to viral invasions or purge them more effectively.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0539-3) contains supplementary material, which is available to authorized users.

8.

Background

Cost-effective next-generation sequencing technologies now enable the production of genomic datasets for many novel planktonic eukaryotes, representing an understudied reservoir of genetic diversity. Ostreococcus tauri is the smallest free-living photosynthetic eukaryote known to date, a coccoid green alga that was first isolated in 1995 in a lagoon by the Mediterranean Sea. Its simple features, ease of culture and the sequencing of its 13 Mb haploid nuclear genome have promoted this microalga as a new model organism for cell biology. Here, we investigated the quality of genome assemblies of Illumina GAIIx 75 bp paired-end reads from O. tauri, thereby also improving the existing assembly and showing the genome to be stably maintained in culture.

Results

The 3 assemblers used, ABySS, CLCBio and Velvet, produced 95% complete genomes in 1402 to 2080 scaffolds with a very low rate of misassembly. Reciprocally, these assemblies improved the original genome assembly by filling in 930 gaps. Combined with additional analysis of raw reads and PCR sequencing effort, 1194 gaps have been solved in total adding up to 460 kb of sequence. Mapping of RNAseq Illumina data on this updated genome led to a twofold reduction in the proportion of multi-exon protein coding genes, representing 19% of the total 7699 protein coding genes. The comparison of the DNA extracted in 2001 and 2009 revealed the fixation of 8 single nucleotide substitutions and 2 deletions during the approximately 6000 generations in the lab. The deletions either knocked out or truncated two predicted transmembrane proteins, including a glutamate-receptor like gene.

Conclusion

High coverage (>80 fold) paired-end Illumina sequencing enables a high quality 95% complete genome assembly of a compact ~13 Mb haploid eukaryote. This genome sequence has remained stable for 6000 generations of lab culture.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1103) contains supplementary material, which is available to authorized users.

9.

Background

De novo genome assembly of next-generation sequencing data is one of the most important current problems in bioinformatics, essential in many biological applications. In spite of a significant amount of work in this area, better solutions are still very much needed.

Results

We present a new program, SAGE, for de novo genome assembly. As opposed to most assemblers, which are de Bruijn graph based, SAGE uses the string-overlap graph. SAGE builds upon great existing work on string-overlap graph and maximum likelihood assembly, bringing an important number of new ideas, such as the efficient computation of the transitive reduction of the string overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and the improved use of mate pairs and min-cost flow for supporting edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers.
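
Transitive reduction, mentioned above, removes an overlap edge whenever the same two reads are already connected through intermediate reads. The sketch below shows the concept on a small acyclic graph using a simple reachability search. It is not SAGE's algorithm: the abstract does not describe the efficient method, and a real string-overlap graph would also check overlap lengths before discarding an edge. The read names and the function name are hypothetical.

```python
def transitive_reduction(graph):
    """Return a copy of a directed acyclic graph without redundant edges.

    graph maps each node to the set of its direct successors. An edge
    u -> w is dropped when w can also be reached from u through another
    successor of u, that is, via a path of two or more edges.
    """
    def reachable_from(start):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in graph.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reduced = {}
    for u, successors in graph.items():
        indirect = set()
        for v in successors:
            indirect |= reachable_from(v)
        reduced[u] = {w for w in successors if w not in indirect}
    return reduced


# r1 overlaps r2 and r3, and r2 overlaps r3, so r1 -> r3 is redundant.
overlaps = {"r1": {"r2", "r3"}, "r2": {"r3"}, "r3": set()}
reduced = transitive_reduction(overlaps)
print(reduced)
```

This naive version repeats a graph search for every successor, which is why assemblers rely on more efficient formulations for graphs with millions of reads.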

Conclusions

SAGE benefits from innovations in almost every aspect of the assembly process: error correction of input reads, string-overlap graph construction, read copy counts estimation, overlap graph analysis and reduction, contig extraction, and scaffolding. We hope that these new ideas will help advance the current state-of-the-art in an essential area of research in genomics.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-302) contains supplementary material, which is available to authorized users.

10.

Background

Unclassified simian strain Treponema Fribourg-Blanc was isolated in 1966 from baboons (Papio cynocephalus) in West Africa. This strain was morphologically indistinguishable from T. pallidum ssp. pallidum or ssp. pertenue strains, and it was shown to cause human infections.

Methodology/Principal Findings

To precisely define genetic differences between Treponema Fribourg-Blanc (unclassified simian isolate, FB) and T. pallidum ssp. pertenue strains (TPE), a high quality sequence of the whole Fribourg-Blanc genome was determined with 454-pyrosequencing and Illumina sequencing platforms. Combined average coverage of both methods was greater than 500×. Restriction target sites (n = 1,773), identified in silico, of selected restriction enzymes within the Fribourg-Blanc genome were verified experimentally and no discrepancies were found. When compared to the other three sequenced TPE genomes (Samoa D, CDC-2, Gauthier), no major genome rearrangements were found. The Fribourg-Blanc genome clustered with other TPE strains (especially with the TPE CDC-2 strain), while T. pallidum ssp. pallidum strains clustered separately as well as the genome of T. paraluiscuniculi strain Cuniculi A. Within coding regions, 6 deletions, 5 insertions and 117 substitutions differentiated Fribourg-Blanc from other TPE genomes.

Conclusions/Significance

The Fribourg-Blanc genome showed genetic characteristics similar to those of other TPE strains. Therefore, we propose to rename the unclassified simian isolate to Treponema pallidum ssp. pertenue strain Fribourg-Blanc. Since the Fribourg-Blanc strain was shown to cause experimental infection in human hosts, non-human primates could serve as possible reservoirs of TPE strains. This could considerably complicate recent efforts to eradicate yaws. Genetic differences specific to Fribourg-Blanc could then contribute to the identification of cases of animal-derived yaws infections.

11.

Background

More than 20% of the world’s population is at risk for infection by filarial nematodes and >180 million people worldwide are already infected. Along with infection comes significant morbidity that has a socioeconomic impact. The eight filarial nematodes that infect humans are Wuchereria bancrofti, Brugia malayi, Brugia timori, Onchocerca volvulus, Loa loa, Mansonella perstans, Mansonella streptocerca, and Mansonella ozzardi, of which three have published draft genome sequences. Since all have humans as the definitive host, standard avenues of research that rely on culturing and genetics have often not been possible. Therefore, genome sequencing provides an important window into understanding the biology of these parasites. The need for large amounts of high quality genomic DNA from homozygous, inbred lines; the availability of only short sequence reads from next-generation sequencing platforms at a reasonable expense; and the lack of random large insert libraries have limited our ability to generate high quality genome sequences for these parasites. However, the Pacific Biosciences single molecule, real-time sequencing platform holds great promise in reducing input amounts and generating sufficiently long sequences that bypass the need for large insert paired libraries.

Results

Here, we report on efforts to generate a more complete genome assembly for L. loa using genetically heterogeneous DNA isolated from a single clinical sample and sequenced on the Pacific Biosciences platform. To obtain the best assembly, numerous assemblers and sequencing datasets were analyzed, combined, and compared. Quiver-informed trimming of an assembly of only Pacific Biosciences reads by HGAP2 was selected as the final assembly of 96.4 Mbp in 2,250 contigs. This results in ~9% more of the genome in ~85% fewer contigs from ~80% less starting material at a fraction of the cost of previous Roche 454-based sequencing efforts.

Conclusions

The result is the most complete filarial nematode assembly produced thus far and demonstrates the utility of single molecule sequencing on the Pacific Biosciences platform for genetically heterogeneous metazoan genomes.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-788) contains supplementary material, which is available to authorized users.

12.

Background

Influenza viruses exist as a large group of closely related viral genomes, also called quasispecies. The composition of this influenza viral quasispecies can be determined by an accurate and sensitive sequencing technique and data analysis pipeline. We compared the suitability of two benchtop next-generation sequencers for whole genome influenza A quasispecies analysis: the Illumina MiSeq sequencing-by-synthesis and the Ion Torrent PGM semiconductor sequencing technique.

Results

We first compared the accuracy and sensitivity of both sequencers using plasmid DNA and different ratios of wild type and mutant plasmid. Illumina MiSeq sequencing reads were one and a half times more accurate than those of the Ion Torrent PGM. The majority of sequencing errors were substitutions on the Illumina MiSeq and insertions and deletions, mostly in homopolymer regions, on the Ion Torrent PGM. To evaluate the suitability of the two techniques for determining the genome diversity of influenza A virus, we generated plasmid-derived PR8 virus and grew this virus in vitro. We also optimized an RT-PCR protocol to obtain uniform coverage of all eight genomic RNA segments. The sequencing reads obtained with both sequencers could successfully be assembled de novo into the segmented influenza virus genome. After mapping of the reads to the reference genome, we found that the detection limit for reliable recognition of variants in the viral genome required a frequency of 0.5% or higher. This threshold exceeds the background error rate resulting from the RT-PCR reaction and the sequencing method. Most of the variants in the PR8 virus genome were present in hemagglutinin, and these mutations were detected by both sequencers.
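
The 0.5% detection limit can be read as a simple frequency filter applied after read mapping. The sketch below assumes per-position base counts are already available (for example, from a pileup) and reports non-reference bases at or above the threshold. The data layout and the function name `call_variants` are assumptions for illustration; the abstract does not describe the authors' pipeline at this level of detail.

```python
MIN_FREQUENCY = 0.005  # 0.5% detection limit reported in the abstract


def call_variants(pileup, reference, min_frequency=MIN_FREQUENCY):
    """Report non-reference bases whose read frequency reaches the threshold.

    pileup is a list with one dict per reference position, mapping each
    observed base to its read count. Returns a list of
    (position, reference_base, variant_base, frequency) tuples.
    """
    variants = []
    for pos, (counts, ref_base) in enumerate(zip(pileup, reference)):
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage, nothing to call
        for base, n in sorted(counts.items()):
            if base != ref_base and n / depth >= min_frequency:
                variants.append((pos, ref_base, base, n / depth))
    return variants


# Toy counts at two positions of a reference "AC".
pileup = [
    {"A": 9990, "G": 10},   # G at 0.1%: below the threshold, not reported
    {"C": 9900, "T": 100},  # T at 1.0%: reported
]
calls = call_variants(pileup, "AC")
print(calls)
```

A fixed cutoff like this only makes sense when it sits above the combined RT-PCR and sequencing error rate, which is the point the abstract makes about the threshold.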

Conclusions

Our approach underlines the power and limitations of two commonly used next-generation sequencers for the analysis of influenza virus gene diversity. We conclude that the Illumina MiSeq platform is better suited for detecting variant sequences whereas the Ion Torrent PGM platform has a shorter turnaround time. The data analysis pipeline that we propose here will also help to standardize variant calling in small RNA genomes based on next-generation sequencing data.

13.

Background

Multipartite mitochondrial genomes are very rare in animals but have been found previously in two insect orders with highly rearranged genomes, the Phthiraptera (parasitic lice), and the Psocoptera (booklice/barklice).

Results

We provide the first report of a multipartite mitochondrial genome architecture in a third order with highly rearranged genomes: Thysanoptera (thrips). We sequenced the complete mitochondrial genomes of two divergent members of the Scirtothrips dorsalis cryptic species complex. The East Asia 1 species has the single circular chromosome common to animals while the South Asia 1 species has a genome consisting of two circular chromosomes. The fragmented South Asia 1 genome exhibits extreme chromosome size asymmetry with the majority of genes on the large, 14.28 kb, chromosome and only nad6 and trnC on the 0.92 kb mini-circle chromosome. This genome also features paralogous control regions with high similarity suggesting a very recent origin of the nad6 mini-circle chromosome in the South Asia 1 cryptic species.

Conclusions

Thysanoptera, along with the other minor paraneopteran insect orders, should be considered models for rapid mitochondrial genome evolution, including fragmentation. Continued use of these models will facilitate a greater understanding of recombination and other mitochondrial genome evolutionary processes across eukaryotes.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1672-4) contains supplementary material, which is available to authorized users.

14.

Background

Trypanosoma cruzi is the causal agent of Chagas disease. Recently, the genomes of representative strains from two major evolutionary lineages were sequenced, allowing the construction of a detailed genetic diversity map for this important parasite. However, this map is focused on coding regions of the genome, leaving a vast space of regulatory regions uncharacterized in terms of their evolutionary conservation and/or divergence.

Methodology

Using data from the hybrid CL Brener and Sylvio X10 genomes (from the TcVI and TcI Discrete Typing Units, respectively), we identified intergenic regions that share a common evolutionary ancestry, and are present in both CL Brener haplotypes (TcII-like and TcIII-like) and in the TcI genome; as well as intergenic regions that were conserved in only two of the three genomes/haplotypes analyzed. The genetic diversity in these regions was characterized in terms of the accumulation of indels and nucleotide changes.

Principal Findings

Based on this analysis we have identified i) a core of highly conserved intergenic regions, which remained essentially unchanged in independently evolving lineages; ii) intergenic regions that show high diversity in spite of still retaining their corresponding upstream and downstream coding sequences; iii) a number of defined sequence motifs that are shared by a number of unrelated intergenic regions. A fraction of indels explains the diversification of some intergenic regions by the expansion/contraction of microsatellite-like repeats.

15.

Background

So-called 936-type phages are among the most frequently isolated phages in dairy facilities utilising Lactococcus lactis starter cultures. Despite extensive efforts to control phage proliferation and decades of research, these phages continue to negatively impact cheese production in terms of the final product quality and consequently, monetary return.

Results

Whole genome sequencing and in silico analysis of three 936-type phage genomes identified several putative (orphan) methyltransferase (MTase)-encoding genes located within the packaging and replication regions of the genome. Utilising SMRT sequencing, methylome analysis was performed on all three phages, allowing the identification of adenine modifications consistent with N-6 methyladenine sequence methylation, which in some cases could be attributed to these phage-encoded MTases. Heterologous gene expression revealed that M.Phi145I/M.Phi93I and M.Phi93DAM, encoded by genes located within the packaging module, provide protection against the restriction enzymes HphI and DpnII, respectively, representing the first functional MTases identified in members of 936-type phages.

Conclusions

SMRT sequencing technology enabled the identification of the target motifs of MTases encoded by the genomes of three lytic 936-type phages and these MTases represent the first functional MTases identified in this species of phage. The presence of these MTase-encoding genes on 936-type phage genomes is assumed to represent an adaptive response to circumvent host encoded restriction-modification systems thereby increasing the fitness of the phages in a dynamic dairy environment.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-831) contains supplementary material, which is available to authorized users.

16.

Background

There is a need to characterize genomes of the foodborne pathogen, Salmonella enterica serovar Enteritidis (SE) and identify genetic information that could be ultimately deployed for differentiating strains of the organism, a need that is yet to be addressed mainly because of the high degree of clonality of the organism. In an effort to achieve the first characterization of the genomes of SE of Canadian origin, we carried out massively parallel sequencing of the nucleotide sequence of 11 SE isolates obtained from poultry production environments (n = 9), a clam and a chicken, assembled finished genomes and investigated diversity of the SE genome.

Results

The median genome size was 4,678,683 bp. A total of 4,833 chromosomal genes defined the pan genome of our field SE isolates, consisting of 4,600 genes present in all the genomes (the core genome) and 233 genes absent in at least one genome (the accessory genome). Genome diversity was demonstrable by the presence of 1,360 loci showing single nucleotide polymorphism (SNP) in the core genome, which was used to portray the genetic distances by means of a phylogenetic tree for the SE isolates. The accessory genome consisted mostly of previously identified SE prophage sequences as well as two apparently full-sized novel prophages, namely a 28 kb sequence provisionally designated as SE-OLF-10058 (3) prophage and a 43 kb sequence provisionally designated as SE-OLF-10012 prophage.
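
The pan, core and accessory definitions used above map directly onto set operations: the pan genome is the union of all gene sets, the core genome is their intersection, and the accessory genome is the difference (4,600 + 233 = 4,833 in the abstract). The sketch below shows this on invented isolate and gene names; it assumes gene presence or absence has already been determined for each genome.

```python
def partition_pan_genome(gene_sets):
    """Split genes into pan, core and accessory sets.

    gene_sets maps each isolate name to the set of genes found in it.
    """
    genomes = list(gene_sets.values())
    pan = set().union(*genomes)          # genes present in at least one genome
    core = set.intersection(*genomes)    # genes present in every genome
    accessory = pan - core               # genes absent from at least one genome
    return pan, core, accessory


# Hypothetical isolates and gene names, for illustration only.
isolates = {
    "isolate_1": {"geneA", "geneB", "geneC"},
    "isolate_2": {"geneA", "geneB", "prophage1"},
    "isolate_3": {"geneA", "geneB", "geneC", "prophage2"},
}
pan, core, accessory = partition_pan_genome(isolates)
print(len(pan), len(core), len(accessory))
```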

Conclusions

The number of SNPs identified in the relatively large core genome of SE is a reflection of substantial diversity that could be exploited for strain differentiation as shown by the development of an informative phylogenetic tree. Prophage sequences can also be exploited for SE strain differentiation and lineage tracking. This work has laid the groundwork for further studies to develop a readily adoptable laboratory test for the subtyping of SE.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-713) contains supplementary material, which is available to authorized users.

17.
18.

Background

Nucleomorphs are residual nuclei derived from eukaryotic endosymbionts in chlorarachniophyte and cryptophyte algae. The endosymbionts that gave rise to nucleomorphs and plastids in these two algal groups were green and red algae, respectively. Despite their independent origin, the chlorarachniophyte and cryptophyte nucleomorph genomes share similar genomic features such as extreme size reduction and a three-chromosome architecture. This suggests that similar reductive evolutionary forces have acted to shape the nucleomorph genomes in the two groups. Thus far, however, only a single chlorarachniophyte nucleomorph and plastid genome has been sequenced, making broad evolutionary inferences within the chlorarachniophytes and between chlorarachniophytes and cryptophytes difficult. We have sequenced the nucleomorph and plastid genomes of the chlorarachniophyte Lotharella oceanica in order to gain insight into nucleomorph and plastid genome diversity and evolution.

Results

The L. oceanica nucleomorph genome was found to consist of three linear chromosomes totaling ~610 kilobase pairs (kbp), much larger than the 373 kbp nucleomorph genome of the model chlorarachniophyte Bigelowiella natans. The L. oceanica plastid genome is 71 kbp in size, similar to that of B. natans. Unexpectedly long (~35 kbp) sub-telomeric repeat regions were identified in the L. oceanica nucleomorph genome; internal multi-copy regions were also detected. Gene content analyses revealed that nucleomorph house-keeping genes and spliceosomal intron positions are well conserved between the L. oceanica and B. natans nucleomorph genomes. More broadly, gene retention patterns were found to be similar between nucleomorph genomes in chlorarachniophytes and cryptophytes. Chlorarachniophyte plastid genomes showed near identical protein coding gene complements as well as a high level of synteny.

Conclusions

We have provided insight into the process of nucleomorph genome evolution by elucidating the fine-scale dynamics of sub-telomeric repeat regions. Homologous recombination at the chromosome ends appears to be frequent, serving to expand and contract nucleomorph genome size. The main factor influencing nucleomorph genome size variation between different chlorarachniophyte species appears to be expansion-contraction of these telomere-associated repeats rather than changes in the number of unique protein coding genes. The dynamic nature of chlorarachniophyte nucleomorph genomes lies in stark contrast to their plastid genomes, which appear to be highly stable in terms of gene content and synteny.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-374) contains supplementary material, which is available to authorized users.

19.

Background

Human papillomavirus 16 (HPV16) species group (alpha-9) of the Alphapapillomavirus genus contains HPV16, HPV31, HPV33, HPV35, HPV52, HPV58 and HPV67. These HPVs account for 75% of invasive cervical cancers worldwide. Viral variants of these HPVs differ in evolutionary history and pathogenicity. Moreover, a comprehensive nomenclature system for HPV variants is lacking, limiting comparisons between studies.

Methods

DNA from cervical samples previously characterized for HPV type was obtained from multiple geographic regions to screen for novel variants. The complete 8 kb genomes of 120 variants representing the major and minor lineages of the HPV16-related alpha-9 HPV types were sequenced to capture maximum viral heterogeneity. Viral evolution was characterized by constructing phylogenetic trees based on complete genomes using multiple algorithms. Maximal and viral region specific divergence was calculated by global and pairwise alignments. Variant lineages were classified and named using an alphanumeric system; the prototype genome was assigned to the A lineage for all types.

Results

The range of genome-genome sequence heterogeneity varied from 0.6% for HPV35 to 2.2% for HPV52 and included 1.4% for HPV31, 1.1% for HPV33, 1.7% for HPV58 and 1.1% for HPV67. Nucleotide differences of approximately 1.0%–10.0% and 0.5%–1.0% of the complete genomes were used to define variant lineages and sublineages, respectively. Each gene/region differs in sequence diversity, from most variable to least variable: noncoding region 1 (NCR1)/noncoding region 2 (NCR2) > upstream regulatory region (URR) > E6/E7 > E2/L2 > E1/L1.
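
The lineage and sublineage thresholds can be illustrated with a small calculation on two aligned genomes. The sketch assumes the sequences are already aligned to equal length and counts every mismatching column, gaps included, as a difference. The abstract gives only approximate ranges, so the handling of the shared 1.0% boundary and the function names are assumptions for illustration.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of alignment columns at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * differences / len(seq_a)


def classify_variant_pair(pct):
    """Apply the approximate ranges from the abstract to a pairwise difference."""
    if 1.0 <= pct <= 10.0:
        return "different lineages"
    if 0.5 <= pct < 1.0:
        return "different sublineages"
    return "outside the reported ranges"


# Toy 200-column alignments, not real HPV genomes.
prototype = "A" * 200
variant_1 = "A" * 197 + "CCC"   # 3 differences in 200 columns
variant_2 = "A" * 199 + "C"     # 1 difference in 200 columns
pct_1 = percent_difference(prototype, variant_1)
pct_2 = percent_difference(prototype, variant_2)
print(pct_1, classify_variant_pair(pct_1))
print(pct_2, classify_variant_pair(pct_2))
```

For a complete genome of about 8 kb, a 1.0% difference corresponds to roughly 80 nucleotide changes.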

Conclusions

These data define maximum viral genomic heterogeneity of HPV16-related alpha-9 HPV variants. The proposed nomenclature system facilitates the comparison of variants across epidemiological studies. Sequence diversity and phylogenies of this clinically important group of HPVs provide the basis for further studies of discrete viral evolution, epidemiology, pathogenesis and preventative/therapeutic interventions.

20.

Background

Single-cell genome sequencing has the potential to allow the in-depth exploration of the vast genetic diversity found in uncultured microbes. We used the marine cyanobacterium Prochlorococcus as a model system for addressing important challenges facing high-throughput whole genome amplification (WGA) and complete genome sequencing of individual cells.

Methodology/Principal Findings

We describe a pipeline that enables single-cell WGA on hundreds of cells at a time while virtually eliminating non-target DNA from the reactions. We further developed a post-amplification normalization procedure that mitigates extreme variations in sequencing coverage associated with multiple displacement amplification (MDA), and demonstrated that the procedure increased sequencing efficiency and facilitated genome assembly. We report genome recovery as high as 99.6% with reference-guided assembly, and 95% with de novo assembly starting from a single cell. We also analyzed the impact of chimera formation during MDA on de novo assembly, and discuss strategies to minimize the presence of incorrectly joined regions in contigs.

Conclusions/Significance

The methods described in this paper will be useful for sequencing genomes of individual cells from a variety of samples.
