Similar literature
20 similar documents found.
1.
Genome sequencing currently requires DNA from pools of numerous nearly identical cells (clones), leaving the genome sequences of many difficult-to-culture microorganisms unattainable. We report a sequencing strategy that eliminates culturing of microorganisms by using real-time isothermal amplification to form polymerase clones (plones) from the DNA of single cells. Two Escherichia coli plones, analyzed by Affymetrix chip hybridization, demonstrate that plonal amplification is specific and the bias is randomly distributed. Whole-genome shotgun sequencing of Prochlorococcus MIT9312 plones showed 62% coverage of the genome from one plone at a sequencing depth of 3.5x, and 66% coverage from a second plone at a depth of 4.7x. Genomic regions not revealed in the initial round of sequencing are recovered by sequencing PCR amplicons derived from plonal DNA. The mutation rate in single-cell amplification is <2 × 10⁻⁵, better than that of current genome sequencing standards. Polymerase cloning should provide a critical tool for systematic characterization of genome diversity in the biosphere.
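
For context, the coverage figures in this abstract can be compared with the idealized expectation for uniformly placed shotgun reads. The sketch below uses the Lander-Waterman approximation, which is an assumption brought in here and is not mentioned in the abstract: the expected fraction of a genome covered at least once at depth c is 1 - e^(-c). The function name is illustrative only.

```python
import math

def expected_fraction_covered(depth: float) -> float:
    """Expected genome fraction covered at least once under the
    Lander-Waterman model (uniform, independent read placement)."""
    return 1.0 - math.exp(-depth)

# (sequencing depth, observed genome coverage) as reported in the abstract.
plones = [(3.5, 0.62), (4.7, 0.66)]

for depth, observed in plones:
    expected = expected_fraction_covered(depth)
    print(f"depth {depth}x: idealized {expected:.3f}, observed {observed:.2f}")
```

Under this idealized model both plones would be expected to exceed 95% coverage, so the reported 62% and 66% are consistent with uneven representation after single-cell amplification. This is a rough comparison, not a reanalysis of the authors' data.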

2.
Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost-effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA, which is ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes and a natural host–microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10⁵-fold amplification of the target genomes with <6.7-fold amplification of the background. SWGA-treated genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster resulted in up to 70% of high-throughput resequencing reads mapping to the W. pipientis genome. By contrast, 2–9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs.
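
The core idea of SWGA, choosing priming motifs that are common in the target genome but rare in the background, can be illustrated with a simple k-mer frequency comparison. The sketch below is not the authors' published selection procedure: the ratio score, the pseudocount, the function names, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter

def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def rank_selective_motifs(target: str, background: str, k: int, top: int = 5):
    """Rank k-mers by how much more frequent (per position) they are in
    the target than in the background. A pseudocount of 1 on the
    background count avoids division by zero."""
    t_counts = kmer_counts(target, k)
    b_counts = kmer_counts(background, k)
    t_positions = max(len(target) - k + 1, 1)
    b_positions = max(len(background) - k + 1, 1)
    scored = []
    for motif, t_n in t_counts.items():
        t_freq = t_n / t_positions
        b_freq = (b_counts.get(motif, 0) + 1) / b_positions
        scored.append((t_freq / b_freq, motif))
    scored.sort(reverse=True)
    return [(motif, round(score, 2)) for score, motif in scored[:top]]

# Toy, made-up sequences purely for illustration (not real genomes).
target = "ATTATTATTGCATTATTATT" * 3
background = "GCGCGGCCGCGCGGCCGCGC" * 30

print(rank_selective_motifs(target, background, k=4, top=3))
```

A practical motif choice would also have to consider factors this sketch ignores, such as primer melting temperature and how evenly the motif sites are spaced along the target genome.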

3.
The multiple displacement amplification method has revolutionized genomic studies of uncultured bacteria, where the extraction of pure DNA in sufficient quantity for next-generation sequencing is challenging. However, the method is problematic in that it amplifies the target DNA unevenly, induces the formation of chimeric reads and also amplifies contaminating DNA. Here, we have tested the reproducibility of the multiple displacement amplification method using serial dilutions of extracted genomic DNA and intact cells from the cultured endosymbiont Bartonella australis. The amplified DNA was sequenced with the Illumina sequencing technology, and the results were compared to sequence data obtained from unamplified DNA in this study as well as from a previously published genome project. We show that artifacts such as the extent of the amplification bias, the percentage of chimeric reads and the relative fraction of contaminating DNA increase dramatically for the smallest amounts of template DNA. The pattern of read coverage was reproducibly obtained for samples with higher amounts of template DNA, suggesting that the bias is non-random and genome-specific. A re-analysis of previously published sequence data obtained after amplification from clonal endosymbiont populations confirmed these predictions. We conclude that many of the artifacts associated with the use of the multiple displacement amplification method can be alleviated or much reduced by using multiple cells as the template for the amplification. These findings should be particularly useful for researchers studying the genomes of endosymbionts and other uncultured bacteria, for which a small clonal population of cells can be isolated.

4.

Background

Next-generation sequencing sample preparation requires nanogram to microgram quantities of DNA; however, many relevant samples are comprised of only a few cells. Genomic analysis of these samples requires a whole genome amplification method that is unbiased and free of exogenous DNA contamination. To address these challenges we have developed protocols for the production of DNA-free consumables including reagents and have improved upon multiple displacement amplification (iMDA).

Results

A specialized ethylene oxide treatment was developed that renders free DNA and DNA present within Gram-positive bacterial cells undetectable by qPCR. To reduce DNA contamination in amplification reagents, a combination of ion exchange chromatography, filtration, and lot testing protocols was developed. Our multiple displacement amplification protocol employs a second strand-displacing DNA polymerase, improved buffers, improved reaction conditions and DNA-free reagents. The iMDA protocol, when used in combination with DNA-free laboratory consumables and reagents, significantly improved efficiency and accuracy of amplification and sequencing of specimens with moderate to low levels of DNA. The sensitivity and specificity of sequencing of amplified DNA prepared using iMDA were compared to those of DNA obtained with two commercial whole genome amplification kits using 10 fg (~1-2 bacterial cells' worth) of bacterial genomic DNA as a template. Analysis showed >99% of the iMDA reads mapped to the template organism whereas only 0.02% of the reads from the commercial kits mapped to the template. To assess the ability of iMDA to achieve balanced genomic coverage, a non-stochastic amount of bacterial genomic DNA (1 pg) was amplified and sequenced, and data obtained were compared to sequencing data obtained directly from genomic DNA. The iMDA DNA and genomic DNA sequencing had comparable coverage (99.98% of the reference genome at ≥1X coverage and 99.9% at ≥5X coverage) while maintaining both balance and representation of the genome.

Conclusions

The iMDA protocol, in combination with DNA-free laboratory consumables, significantly improved the ability to sequence specimens with low levels of DNA. iMDA has broad utility in metagenomics, diagnostics, ancient DNA analysis, pre-implantation embryo screening, single-cell genomics, whole genome sequencing of unculturable organisms, and forensic applications for both human and microbial targets.
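
The statement in the Results that 10 fg corresponds to roughly 1-2 bacterial cells' worth of DNA can be sanity-checked with a short mass calculation. The sketch below rests on two assumptions not stated in the abstract: an average mass of about 650 g/mol per double-stranded base pair, and an E. coli-sized genome of 4.6 Mb (the abstract does not name the template organism). The function names are illustrative.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one double-stranded base pair

def genome_mass_fg(genome_size_bp: float) -> float:
    """Mass of a single double-stranded genome copy, in femtograms."""
    grams = genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e15

def genome_copies(template_fg: float, genome_size_bp: float) -> float:
    """Number of genome copies contained in a given template mass."""
    return template_fg / genome_mass_fg(genome_size_bp)

ASSUMED_GENOME_BP = 4.6e6  # hypothetical E. coli-sized genome

print(f"one genome copy: {genome_mass_fg(ASSUMED_GENOME_BP):.2f} fg")
print(f"copies in 10 fg: {genome_copies(10, ASSUMED_GENOME_BP):.1f}")
print(f"copies in 1 pg:  {genome_copies(1000, ASSUMED_GENOME_BP):.0f}")
```

With these assumptions one genome copy weighs about 5 fg, so 10 fg is about two copies, in line with the abstract's estimate, and 1 pg is on the order of 200 copies. Smaller genomes would give proportionally more copies per femtogram.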

5.
Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using phi29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2% genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small-subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9% of the sequences had significant similarities to known proteins, and "clusters of orthologous groups" (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible. The reported SSU rRNA sequences and library clone end sequences are listed with their respective GenBank accession numbers, DQ 404590 to DQ 404652, DQ 404654 to DQ 404938, and DX 385314 to DX 389173.

6.
Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using phi29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2% genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small-subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9% of the sequences had significant similarities to known proteins, and "clusters of orthologous groups" (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible.

7.
A rat PAC library was constructed in the vector pPAC4 from genomic DNA isolated from female Brown Norway rats. This library consists of 215,409 clones arrayed in 614 384-well microtiter plates. An average insert size of 143 kb was estimated from 217 randomly isolated clones, thus representing approximately 10-fold genome coverage. This coverage provides a very high probability that the library contains a unique sequence in genome screening. Tests on randomly selected clones demonstrated that they are very stable, with only 4 of 130 clones showing restriction digest fragment alterations after 80 generations of serial growth. FISH analysis using 70 randomly chosen PACs revealed no significant chimeric clones. About 7% of the clones analyzed contained repetitive sequences related to centromeric regions that hybridized to some but not all centromeres. DNA plate pools and superpools were made, and high-density filters each containing an array of 8 plates in duplicate were prepared. Library screening on these superpools and appropriate filters with 10 single-locus rat markers revealed an average of 8 positive clones, in agreement with the estimated high genomic coverage of this library and representation of the rat genome. This library provides a new resource for rat genome analysis, in particular the identification of genes involved in models of multifactorial disease. The library and high-density filters are currently available to the scientific community.
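
The claim that 10-fold coverage gives "a very high probability" of containing any unique sequence can be made concrete with the standard Clarke-Carbon relation, P = 1 - (1 - f)^N, where f is the insert size as a fraction of the genome and N is the number of clones. The abstract does not state the genome size it used, so the sketch below back-calculates an implied genome size from the stated 10-fold coverage. That step, and the function name, are assumptions for illustration.

```python
import math

def prob_sequence_in_library(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon probability that a given unique sequence is present
    in at least one clone of a random library."""
    f = insert_bp / genome_bp  # fraction of the genome in one insert
    return 1.0 - (1.0 - f) ** n_clones

# Figures from the abstract.
n_clones = 215_409
insert_bp = 143_000
stated_coverage = 10.0

# Assumption: genome size implied by the stated ~10-fold coverage.
implied_genome_bp = n_clones * insert_bp / stated_coverage

p = prob_sequence_in_library(n_clones, insert_bp, implied_genome_bp)
print(f"implied genome size: {implied_genome_bp / 1e9:.2f} Gb")
print(f"P(sequence present): {p:.6f}")
print(f"approximation 1 - e^-c: {1 - math.exp(-stated_coverage):.6f}")
```

Under this idealized model a single-locus marker would also be expected to hit about 10 clones on average, which is of the same order as the 8 positives per marker reported in the screening.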

8.
Corynebacterium pseudotuberculosis is a gram-positive bacterium that causes caseous lymphadenitis in sheep and goats. However, despite the economic losses caused by caseous lymphadenitis, there is little information about the molecular mechanisms of pathogenesis of this bacterium. Genomic libraries constructed in bacterial artificial chromosome (BAC) vectors have become the method of choice for clone development in high-throughput genomic-sequencing projects. Large-insert DNA libraries are useful for isolation and characterization of important genomic regions and genes. In order to identify targets that might be useful for genome sequencing, we constructed a C. pseudotuberculosis BAC library in the vector pBeloBAC11. This library contains about 18,000 BAC clones, with inserts ranging in size from 25 to 120 kb, theoretically representing a 390-fold coverage of the C. pseudotuberculosis genome (estimated to be 2.5-3.1 Mb). Many genomic survey sequences (GSSs) with homology to C. diphtheriae, C. glutamicum, C. efficiens, and C. jeikeium proteins were observed within a sample of 215 sequenced clones, confirming their close phylogenetic relationship. Computer analyses of GSSs did not detect chimeric, deleted, or rearranged BAC clones, showing that this library has low redundancy. This GSS collection is now available for further genetic and physical analysis of the C. pseudotuberculosis genome. The GSS strategy that we used to develop our library proved to be efficient for the identification of genes and will be an important tool for mapping, assembly, comparative, and functional genomic studies in a C. pseudotuberculosis genome sequencing project that will begin this year.

9.

Background

With an estimated 38 million people worldwide currently infected with human immunodeficiency virus (HIV), and an additional 4.1 million people becoming infected each year, it is important to understand how this virus mutates and develops resistance in order to design successful therapies.

Methodology/Principal Findings

We report a novel experimental method for amplifying full-length HIV genomes without the use of sequence-specific primers for high-throughput DNA sequencing, followed by assembly of full-length viral genome sequences from the resulting large dataset. Illumina was chosen for sequencing due to its ability to provide greater coverage of the HIV genome compared to prior methods, allowing for more comprehensive characterization of the heterogeneity present in the HIV samples analyzed. Our novel amplification method in combination with Illumina sequencing was used to analyze two HIV populations: a homogeneous HIV population based on the canonical NL4-3 strain and a heterogeneous viral population obtained from an HIV patient's infected T cells. In addition, the resulting sequence was analyzed using a new computational approach to obtain a consensus sequence and several metrics of diversity.

Significance

This study demonstrates how a lower-bias amplification method in combination with next-generation DNA sequencing provides in-depth, complete coverage of the HIV genome, enabling a stronger characterization of the quasispecies present in a clinically relevant HIV population as well as future study of how HIV mutates in response to a selective pressure.

10.

Background

Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) has rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition is still a great challenge to the currently available NGS platforms. The genomes of some important pathogenic organisms like Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content) display extremes of base composition. The standard library preparation procedures that employ PCR amplification have been shown to cause uneven read coverage particularly across AT and GC rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures suitable for low quantity starting material and tolerant to extremely high AT content sequences.

Results

We have used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical and a clinical isolate (containing ~53% host contamination). By analyzing and comparing the quality of sequence data generated, we show that our optimized conditions, which involve a PCR additive (TMAC), produce amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC-neutral templates.

Conclusion

We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain the complexity of both extremes of base composition. This development will greatly benefit sequencing clinical samples that often require amplification due to low mass of DNA starting material.

11.
Construction of DNA fragment libraries for next-generation sequencing can prove challenging, especially for samples with low DNA yield. Protocols devised to circumvent the problems associated with low starting quantities of DNA can result in amplification biases that skew the distribution of genomes in metagenomic data. Moreover, sample throughput can be slow, as current library construction techniques are time-consuming. This study evaluated Nextera, a new transposon-based method that is designed for quick production of DNA fragment libraries from a small quantity of DNA. The sequence read distribution across nine phage genomes in a mock viral assemblage met predictions for six of the least-abundant phages; however, the rank order of the most abundant phages differed slightly from predictions. De novo genome assemblies from Nextera libraries provided long contigs spanning over half of the phage genome; in four cases where full-length genome sequences were available for comparison, consensus sequences were found to match over 99% of the genome with near-perfect identity. Analysis of areas of low and high sequence coverage within phage genomes indicated that GC content may influence coverage of sequences from Nextera libraries. Comparisons of phage genomes prepared using both Nextera and a standard 454 FLX Titanium library preparation protocol suggested that the coverage biases according to GC content observed within the Nextera libraries were largely attributable to bias in the Nextera protocol rather than to the 454 sequencing technology. Nevertheless, given suitable sequence coverage, the Nextera protocol produced high-quality data for genomic studies. For metagenomics analyses, effects of GC amplification bias would need to be considered; however, the library preparation standardization that Nextera provides should benefit comparative metagenomic analyses.

12.
Reliable and accurate pre-implantation genetic diagnosis (PGD) of a patient's embryos by next-generation sequencing (NGS) depends on efficient whole genome amplification (WGA) of a representative biopsy sample. However, the performance of the current state-of-the-art WGA methods has not been evaluated for sequencing. Using low template DNA (15 pg) and single cells, we showed that the two PCR-based WGA systems SurePlex and MALBAC are superior to the REPLI-g multiple displacement amplification (MDA) WGA system in terms of consistent and reproducible genome coverage and sequence bias across the 24 chromosomes, allowing better normalization of test to reference sequencing data. When copy number variation sequencing (CNV-Seq) was applied to single-cell WGA products derived by either SurePlex or MALBAC amplification, we showed that known disease CNVs in the range of 3–15 Mb could be reliably and accurately detected at the correct genomic positions. These findings indicate that our CNV-Seq pipeline incorporating either SurePlex or MALBAC as the key initial WGA step is a powerful methodology for clinical PGD to identify euploid embryos in a patient's cohort for uterine transplantation.

13.
14.

Background

Knowledge of the origins, distribution, and inheritance of variation in the malaria parasite (Plasmodium falciparum) genome is crucial for understanding its evolution; however the 81% (A+T) genome poses challenges to high-throughput sequencing technologies. We explore the viability of the Roche 454 Genome Sequencer FLX (GS FLX) high throughput sequencing technology for both whole genome sequencing and fine-resolution characterization of genetic exchange in malaria parasites.

Results

We present a scheme to survey recombination in the haploid stage genomes of two sibling parasite clones, using whole genome pyrosequencing that includes a sliding window approach to predict recombination breakpoints. Whole genome shotgun (WGS) sequencing generated approximately 2 million reads, with an average read length of approximately 300 bp. De novo assembly using a combination of WGS and 3 kb paired end libraries resulted in contigs ≤ 34 kb. More than 8,000 of the 24,599 SNP markers identified between parents were genotyped in the progeny, resulting in a marker density of approximately 1 marker/3.3 kb and allowing for the detection of previously unrecognized crossovers (COs) and many non crossover (NCO) gene conversions throughout the genome.

Conclusions

By sequencing the 23 Mb genomes of two haploid progeny clones derived from a genetic cross at more than 30× coverage, we captured high resolution information on COs, NCOs and genetic variation within the progeny genomes. This study is the first to resequence progeny clones to examine fine structure of COs and NCOs in malaria parasites.

15.
In the present study, we describe the deep sequencing and structural analysis of the genome of a Holstein breed bull. Our aim was to obtain a high-quality Holstein bull genome reference sequence and to describe the different types of variation in its genome compared with the Hereford breed reference. We generated four mate-paired libraries and one fragment library from 30 μg of genomic DNA. Colour-space FASTA reads were mapped and paired to the reference cow (Bos taurus) genome assembly from Oct. 2011 (Baylor 4.6.1/bosTau7). Initial sequencing produced 4,864,054,296 reads of 50 bp. Average mapping efficiency was 71.7%, and altogether 3,494,534,136 reads and 157,928,163,086 bp were successfully mapped, resulting in 60× coverage. This is the highest coverage of a bovine genome published so far. Tertiary analysis found 6,362,988 SNPs in the bull's genome: 4,045,889 heterozygous and 2,317,099 homozygous variants. Annotation revealed that 4,330,337 of the discovered SNPs were already present in the dbSNP database (build 137); the remaining 2,032,651 SNPs were therefore novel. There were 312,879 large indels, accounting for 245,947,845 bp of variation across the genome. Small indels (633,310 in number) accounted for a total of 2,542,552 nucleotides of variation. Only 106,768 small indels were listed in dbSNP. Finally, we identified 2,758 inversions in the bull's genome, covering 23,099,054 bp in total. The largest inversion was 87,440 bp. In conclusion, high-coverage sequencing revealed different types of novel variants in the bull's genome. Better knowledge of the functions of these variants is needed.
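
The counts reported in this abstract are internally consistent, which a few lines of arithmetic can confirm. The sketch below uses only numbers from the abstract. The derived quantities (mean mapped length per read and the genome span implied by 60× coverage) are back-of-the-envelope values, not figures the authors report, and the variable names are illustrative.

```python
# Figures taken from the abstract.
total_reads = 4_864_054_296
mapped_reads = 3_494_534_136
mapped_bases = 157_928_163_086
total_snps = 6_362_988
heterozygous = 4_045_889
homozygous = 2_317_099
snps_in_dbsnp = 4_330_337
stated_coverage = 60

# Derived checks.
novel_snps = total_snps - snps_in_dbsnp
read_mapping_fraction = mapped_reads / total_reads
mean_mapped_length = mapped_bases / mapped_reads
implied_genome_bp = mapped_bases / stated_coverage

print(f"het + hom equals total SNPs: {heterozygous + homozygous == total_snps}")
print(f"novel SNPs: {novel_snps:,}")
print(f"read-level mapping fraction: {read_mapping_fraction:.1%}")
print(f"mean mapped length per read: {mean_mapped_length:.1f} bp")
print(f"genome span implied by 60x: {implied_genome_bp / 1e9:.2f} Gb")
```

The read-level mapping fraction comes out slightly above the stated 71.7% average. One possible explanation, offered only as a guess, is that the published value is an average across libraries rather than a pooled ratio.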

16.
Resting egg banks are unique windows that allow us to directly observe shifts in population genetics and phenotypes over time as natural populations evolve. Though a variety of planktonic organisms also produce resting stages, the keystone freshwater consumer, Daphnia, is a well-known model for paleogenetics and resurrection ecology. Nevertheless, paleogenomic investigations are limited largely because resting eggs do not contain enough DNA for genomic sequencing. In fact, genomic studies even on extant populations include a laborious preparatory phase of batch culturing dozens of individuals to generate sufficient genomic DNA. Here, we furnish a protocol to generate whole genomes of single ephippial (resting) eggs and single daphniids. Whole genomes of single ephippial eggs and single adults were amplified using the Qiagen REPLI-g Single Cell kit, followed by the NEBNext Ultra DNA Library Prep Kit for library construction and Illumina sequencing. We compared the quality of the single-egg and single-individual amplified genomes to the standard batch genomic DNA extraction in the absence of genome amplification. At mean 20× depth, coverage was essentially identical for the amplified single individual relative to the unamplified batch extracted genome (>90% of the genome was covered and callable). Finally, while amplification resulted in a slight loss of heterozygosity for the amplified genomes, estimates were largely comparable and illustrate the utility and limitations of this approach in estimating population genetic parameters over long periods of time in natural populations of Daphnia and also other small species known to produce resting stages.

17.
Multiple Displacement Amplification (MDA) of DNA using φ29 (phi29) DNA polymerase amplifies DNA several billion-fold, which has proved to be potentially very useful for evaluating genome information in a culture-independent manner. Whole genome sequencing using DNA from a single prokaryotic genome copy amplified by MDA has not yet been achieved due to the formation of chimeras and skewed amplification of genomic regions during the MDA step, which then precludes genome assembly. We have addressed the issue by using 10 ng of genomic Vibrio cholerae DNA extracted within an agarose plug to ensure circularity as a starting point for MDA and then sequencing the amplified yield using the SOLiD platform. We successfully managed to assemble the entire genome of V. cholerae strain LMA3984-4 (environmental O1 strain isolated in urban Amazonia) using a hybrid de novo assembly strategy. Using our method, only 178 out of 16,713 (1%) of contigs were not able to be inserted into either chromosome scaffold, and out of these 178, only 3 appeared to be chimeras. The other contigs seem to be the result of template-independent non-specific amplification during MDA, yielding spurious reads. Extraction of genomic DNA within an agarose plug in order to ensure circularity of the extracted genome might be key to minimizing amplification bias by MDA for WGS.

18.
A bacterial artificial chromosome (BAC) library containing large genomic DNA inserts is an important tool for genome physical mapping, map-based cloning, and genome sequencing. To isolate genes via a map-based cloning strategy and to perform physical mapping of the cotton genome, a high-quality BAC library containing large cotton DNA inserts is needed. We have developed a BAC library of the restoring line 0-613-2R for isolating the fertility restorer (Rf1) gene and for genomic research in cotton (Gossypium hirsutum L.). The BAC library contains 97 825 clones stored in 255 microtiter plates of 384 wells each. Random samples of BACs digested with the NotI enzyme indicated that the average insert size is approximately 130 kb, with a range of 80-275 kb, and that 95.7% of the BAC clones in the library have an insert size larger than 100 kb. Based on a cotton genome size of 2 250 Mb, library coverage is 5.7 × haploid genome equivalents. Four clones were selected randomly from the library to determine the stability of the BAC clones. The fingerprints of each clone digested with the NotI and HindIII enzymes did not differ between generations 0 and 100. Thus, the stability of a single BAC clone can be sustained for at least 100 generations. Eight simple sequence repeat (SSR) markers flanking the Rf1 gene were chosen to screen the BAC library by pooled PCR, and 25 positive clones were identified, an average of 3.1 positive clones per SSR marker.
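
The coverage figure in this abstract follows from the usual relation: coverage = (number of clones × average insert size) / genome size. The sketch below reproduces it, together with the plate count and the positives-per-marker average, using only the numbers given in the abstract. The function and variable names are illustrative.

```python
import math

def library_coverage(n_clones: int, avg_insert_kb: float, genome_mb: float) -> float:
    """Library coverage in haploid genome equivalents."""
    return (n_clones * avg_insert_kb) / (genome_mb * 1000.0)

# Figures from the abstract.
n_clones = 97_825
avg_insert_kb = 130
genome_mb = 2_250
wells_per_plate = 384
positive_clones = 25
ssr_markers = 8

coverage = library_coverage(n_clones, avg_insert_kb, genome_mb)
plates_needed = math.ceil(n_clones / wells_per_plate)
positives_per_marker = positive_clones / ssr_markers

print(f"coverage: {coverage:.2f} haploid genome equivalents")
print(f"384-well plates needed: {plates_needed}")
print(f"positive clones per SSR marker: {positives_per_marker:.3f}")
```

All three derived values agree with the abstract (about 5.7×, 255 plates, and about 3.1 positives per marker). The observed positives per marker fall below the nominal coverage, which is not unusual for pooled PCR screening but is not explained in the abstract.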

19.
20.

Background

Bacterial viruses (phages) play a critical role in shaping microbial populations as they influence both host mortality and horizontal gene transfer. As such, they have a significant impact on local and global ecosystem function and human health. Despite their importance, little is known about the genomic diversity harbored in phages, as methods to capture complete phage genomes have been hampered by the lack of knowledge about the target genomes, and difficulties in generating sufficient quantities of genomic DNA for sequencing. Of the approximately 550 phage genomes currently available in the public domain, fewer than 5% are marine phage.

Methodology/Principal Findings

To advance the study of phage biology through comparative genomic approaches we used marine cyanophage as a model system. We compared DNA preparation methodologies (DNA extraction directly from either phage lysates or CsCl purified phage particles), and sequencing strategies that utilize either Sanger sequencing of a linker amplification shotgun library (LASL) or of a whole genome shotgun library (WGSL), or 454 pyrosequencing methods. We demonstrate that genomic DNA sample preparation directly from a phage lysate, combined with 454 pyrosequencing, is best suited for phage genome sequencing at scale, as this method is capable of capturing complete continuous genomes with high accuracy. In addition, we describe an automated annotation informatics pipeline that delivers high-quality annotation and yields few false positives and negatives in ORF calling.

Conclusions/Significance

These DNA preparation, sequencing and annotation strategies enable a high-throughput approach to the burgeoning field of phage genomics.
