1.

Small deletion variants have stable breakpoints commonly associated with Alu elements

de Smith AJ Walters RG Coin LJ Steinfeld I Yakhini Z Sladek R Froguel P Blakemore AI 《PloS one》2008,3(8):e3104

Copy number variants (CNVs) contribute significantly to human genomic variation, with over 5000 loci reported, covering more than 18% of the euchromatic human genome. Little is known, however, about the origin and stability of variants of different size and complexity. We investigated the breakpoints of 20 small, common deletions, representing a subset of those originally identified by array CGH, using Agilent microarrays, in 50 healthy French Caucasian subjects. By sequencing PCR products amplified using primers designed to span the deleted regions, we determined the exact size and genomic position of the deletions in all affected samples. For each deletion studied, all individuals carrying the deletion share identical upstream and downstream breakpoints at the sequence level, suggesting that the deletion event occurred just once and later became common in the population. This is supported by linkage disequilibrium (LD) analysis, which has revealed that most of the deletions studied are in moderate to strong LD with surrounding SNPs, and have conserved long-range haplotypes. Analysis of the sequences flanking the deletion breakpoints revealed an enrichment of microhomology at the breakpoint junctions. More significantly, we found an enrichment of Alu repeat elements, the overwhelming majority of which intersected deletion breakpoints at their poly-A tails. We found no enrichment of LINE elements or segmental duplications, in contrast to other reports. Sequence analysis revealed enrichment of a conserved motif in the sequences surrounding the deletion breakpoints, although whether this motif has any mechanistic role in the formation of some deletions has yet to be determined. Considered together with existing information on more complex inherited variant regions, and reports of de novo variants associated with autism, these data support the presence of different subgroups of CNV in the genome which may have originated through different mechanisms.

2.

Chromosomal inversions associated with environmental adaptation in honeybees

Matthew J. Christmas Andreas Wallberg Ignas Bunikis Anna Olsson Ola Wallerman Matthew T. Webster 《Molecular ecology》2019,28(6):1358-1374

Chromosomal inversions can facilitate local adaptation in the presence of gene flow by suppressing recombination between well‐adapted native haplotypes and poorly adapted migrant haplotypes. East African mountain populations of the honeybee Apis mellifera are highly divergent from neighbouring lowland populations at two extended regions in the genome, despite high similarity in the rest of the genome, suggesting that these genomic regions harbour inversions governing local adaptation. Here, we utilize a new highly contiguous assembly of the honeybee genome to characterize these regions. Using whole‐genome sequencing data from 55 highland and lowland bees, we find that the highland haplotypes at both regions are present at high frequencies in three independent highland populations but extremely rare elsewhere. The boundaries of both divergent regions are characterized by regions of high homology with each other positioned in opposite orientations and contain highly repetitive, long inverted repeats with homology to transposable elements. These regions are likely to represent inversion breakpoints that participate in nonallelic homologous recombination. Using long‐read data, we confirm that the lowland samples are contiguous across breakpoint regions. We do not find evidence for disruption of functional sequence by these breakpoints, which suggests that the inversions are likely maintained due to their allelic content conferring local adaptation in highland environments. Finally, we identify a third divergent genomic region, which contains highly divergent segregating haplotypes that also may contain inversion variants under selection. The results add to a growing body of evidence indicating the importance of chromosomal inversions in local adaptation.

3.

A high-resolution map of synteny disruptions in gibbon and human genomes

Carbone L Vessere GM ten Hallers BF Zhu B Osoegawa K Mootnick A Kofler A Wienberg J Rogers J Humphray S Scott C Harris RA Milosavljevic A de Jong PJ 《PLoS genetics》2006,2(12):e223

Gibbons are part of the same superfamily (Hominoidea) as humans and great apes, but their karyotype has diverged faster from the common hominoid ancestor. At least 24 major chromosome rearrangements are required to convert the presumed ancestral karyotype of gibbons into that of the hominoid ancestor. Up to 28 additional rearrangements distinguish the various living species from the common gibbon ancestor. Using the northern white-cheeked gibbon (2n = 52) (Nomascus leucogenys leucogenys) as a model, we created a high-resolution map of the homologous regions between the gibbon and human. The positions of 100 synteny breakpoints relative to the assembled human genome were determined at a resolution of about 200 kb. Interestingly, 46% of the gibbon–human synteny breakpoints occur in regions that correspond to segmental duplications in the human lineage, indicating a common source of plasticity leading to a different outcome in the two species. Additionally, the full sequences of 11 gibbon BACs spanning evolutionary breakpoints reveal either segmental duplications or interspersed repeats at the exact breakpoint locations. No specific sequence element appears to be common among independent rearrangements. We speculate that the extraordinarily high level of rearrangements seen in gibbons may be due to factors that increase the incidence of chromosome breakage or fixation of the derivative chromosomes in a homozygous state.

4.

Segmental duplications and copy-number variation in the human genome (total citations: 33; self-citations: 0; citations by others: 33)

Sharp AJ Locke DP McGrath SD Cheng Z Bailey JA Vallente RU Pertz LM Clark RA Schwartz S Segraves R Oseroff VV Albertson DG Pinkel D Eichler EE 《American journal of human genetics》2005,77(1):78-88

The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. 
Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic disorders.

5.

Sequence-level analysis of the diploidization process in the triplicated FLOWERING LOCUS C region of Brassica rapa

Yang TJ Kim JS Kwon SJ Lim KB Choi BS Kim JA Jin M Park JY Lim MH Kim HI Lim YP Kang JJ Hong JH Kim CB Bhak J Bancroft I Park BS 《The Plant cell》2006,18(6):1339-1347

Strong evidence exists for polyploidy having occurred during the evolution of the tribe Brassiceae. We show evidence for the dynamic and ongoing diploidization process by comparative analysis of the sequences of four paralogous Brassica rapa BAC clones and the homologous 124-kb segment of Arabidopsis thaliana chromosome 5. We estimated the times since divergence of the paralogous and homologous lineages. The three paralogous subgenomes of B. rapa triplicated 13 to 17 million years ago (MYA), very soon after the Arabidopsis and Brassica divergence occurred at 17 to 18 MYA. In addition, a pair of BACs represents a more recent segmental duplication, which occurred approximately 0.8 MYA, and provides an exception to the general expectation of three paralogous segments within the B. rapa genome. The Brassica genome segments show extensive interspersed gene loss relative to the inferred structure of the ancestral genome, whereas the Arabidopsis genome segment appears little changed. All 32 genes in the Arabidopsis genome segment have representatives in Brassica, but the hexaploid complement of 96 has been reduced to 54 in the three subgenomes, with compression of the genomic region lengths they occupy to between 52 and 110 kb. The gene content of the recently duplicated B. rapa genome segments is identical, but intergenic sequences differ.

6.

Heterogeneous duplications in patients with Pelizaeus-Merzbacher disease suggest a mechanism of coupled homologous and nonhomologous recombination

Woodward KJ Cundall M Sperle K Sistermans EA Ross M Howell G Gribble SM Burford DC Carter NP Hobson DL Garbern JY Kamholz J Heng H Hodes ME Malcolm S Hobson GM 《American journal of human genetics》2005,77(6):966-987

We describe genomic structures of 59 X-chromosome segmental duplications that include the proteolipid protein 1 gene (PLP1) in patients with Pelizaeus-Merzbacher disease. We provide the first report of 13 junction sequences, which gives insight into underlying mechanisms. Although proximal breakpoints were highly variable, distal breakpoints tended to cluster around low-copy repeats (LCRs) (50% of distal breakpoints), and each duplication event appeared to be unique (100 kb to 4.6 Mb in size). Sequence analysis of the junctions revealed no large homologous regions between proximal and distal breakpoints. Most junctions had microhomology of 1-6 bases, and one had a 2-base insertion. Boundaries between single-copy and duplicated DNA were identical to the reference genomic sequence in all patients investigated. Taken together, these data suggest that the tandem duplications are formed by a coupled homologous and nonhomologous recombination mechanism. We suggest repair of a double-stranded break (DSB) by one-sided homologous strand invasion of a sister chromatid, followed by DNA synthesis and nonhomologous end joining with the other end of the break. This is in contrast to other genomic disorders that have recurrent rearrangements formed by nonallelic homologous recombination between LCRs. Interspersed repetitive elements (Alu elements, long interspersed nuclear elements, and long terminal repeats) were found at 18 of the 26 breakpoint sequences studied. No specific motif that may predispose to DSBs was revealed, but single or alternating tracts of purines and pyrimidines that may cause secondary structures were common. Analysis of the 2-Mb region susceptible to duplications identified proximal-specific repeats and distal LCRs in addition to the previously reported ones, suggesting that the unique genomic architecture may have a role in nonrecurrent rearrangements by promoting instability.

7.

Haplotype-based genomic sequencing of a chromosomal polymorphism in the white-throated sparrow (Zonotrichia albicollis)

Davis JK Mittel LB Lowman JJ Thomas PJ Maney DL Martin CL;NISC Comparative Sequencing Program Thomas JW 《The Journal of heredity》2011,102(4):380-390

Inversion polymorphisms have been linked to a variety of fundamental biological and evolutionary processes. Yet few studies have used large-scale genomic sequencing to directly compare the haplotypes associated with the standard and inverted chromosome arrangements. Here we describe the targeted genomic sequencing and comparison of haplotypes representing alternative arrangements of a common inversion polymorphism linked to a suite of phenotypes in the white-throated sparrow (Zonotrichia albicollis). More than 7.4 Mb of genomic sequence was generated and assembled from both the standard (ZAL2) and inverted (ZAL2(m)) arrangements. Sequencing of a pair of inversion breakpoints led to the identification of a ZAL2-specific segmental duplication, as well as evidence of breakpoint reusage. Comparison of the haplotype-based sequence assemblies revealed low genetic differentiation outside versus inside the inversion indicative of historical patterns of gene flow and suppressed recombination between ZAL2 and ZAL2(m). Finally, despite ZAL2(m) being maintained in a near constant state of heterozygosity, no signatures of genetic degeneration were detected on this chromosome. Overall, these results provide important insights into the genomic attributes of an inversion polymorphism linked to mate choice and variation in social behavior.

8.

Molecular characterization of a meiotic recombinational hotspot enhancing homologous equal crossing-over (total citations: 14; self-citations: 4; citations by others: 10)

Y Uematsu H Kiefer R Schulze K Fischer-Lindahl M Steinmetz 《The EMBO journal》1986,5(9):2123-2129

We have cloned and sequenced a meiotic recombinational hotspot between the A beta 3 and A beta 2 genes in the major histocompatibility complex (MHC) of the mouse. This recombinational hotspot in the Mus musculus castaneus cas3 haplotype was previously localized to a region of 9.5 kb of DNA in which five independent crossing-over events occurred at the unusually high frequency of 0.6%. Aside from cas3, the hotspot appears to be absent in many other MHC haplotypes. We have now confined the five recombinational breakpoints to a stretch of 3.5 kb of DNA. From the nucleotide sequence around the recombinational breakpoints, determined in the parental cas3 and b haplotypes as well as for two recombinant haplotypes, we show that the two recombinant haplotypes were generated by homologous equal crossing-over and place the breakpoints within two non-overlapping stretches of 10 and 36 bp, respectively. Comparison of the DNA sequences of the hotspot-positive cas3 and the hotspot-negative b haplotypes reveals a number of differences, in particular a CAGA-repeat sequence that is present in six copies in CAS3 DNA but in only four copies in C57BL/6 DNA. This repeat sequence is reminiscent of one in a previously characterized hotspot in the E beta gene.

9.

Linked‐read sequencing enables haplotype‐resolved resequencing at population scale

Dave Lutgen Raphael Ritter Remi‐André Olsen Holger Schielzeth Joel Gruselius Philip Ewels Jesús T. García Hadoram Shirihai Manuel Schweizer Alexander Suh Reto Burri 《Molecular ecology resources》2020,20(5):1311-1322

The feasibility to sequence entire genomes of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences – including the quantification and dating of admixture, introgression and demographic events, and inference of selective sweeps – are still limited by the lack of high‐quality haplotype information. The newest generation of sequencing technology now promises significant progress. To establish the feasibility of haplotype‐resolved genome resequencing at population scale, we investigated properties of linked‐read sequencing data of songbirds of the genus Oenanthe across a range of sequencing depths. Our results based on the comparison of downsampled (25×, 20×, 15×, 10×, 7×, and 5×) with high‐coverage data (46–68×) of seven bird genomes mapped to a reference suggest that phasing contiguities and accuracies adequate for most population genomic analyses can be reached already with moderate sequencing effort. At 15× coverage, phased haplotypes span about 90% of the genome assembly, with 50% and 90% of phased sequences located in phase blocks longer than 1.25–4.6 Mb (N50) and 0.27–0.72 Mb (N90). Phasing accuracy reaches beyond 99% starting from 15× coverage. Higher coverages yielded higher contiguities (up to about 7 Mb/1 Mb [N50/N90] at 25× coverage), but only marginally improved phasing accuracy. Phase block contiguity improved with input DNA molecule length; thus, higher‐quality DNA may help keeping sequencing costs at bay. In conclusion, even for organisms with gigabase‐sized genomes like birds, linked‐read sequencing at moderate depth opens an affordable avenue towards haplotype‐resolved genome resequencing at population scale.
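
The N50 and N90 figures quoted in this abstract follow the usual contiguity definition: the block length at which blocks of that length or longer contain at least 50% (or 90%) of all phased sequence. A minimal Python sketch of that calculation is given here for illustration only; the function name `nx` and the toy block lengths are hypothetical and are not taken from the paper or its software.

```python
def nx(block_lengths, percent):
    """Return the Nx statistic for a collection of block lengths.

    Nx is the length L such that blocks of length >= L together
    hold at least `percent` % of the summed length of all blocks.
    `percent` is an integer between 1 and 100 (50 gives N50).
    """
    if not block_lengths:
        raise ValueError("block_lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")

    # Walk from the longest block downwards, accumulating length.
    ordered = sorted(block_lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]  # not reached for valid input; kept for safety


# Toy example (hypothetical phase-block lengths, in Mb).
blocks = [8, 5, 3, 2, 2]
n50 = nx(blocks, 50)
n90 = nx(blocks, 90)
```

With these toy lengths the total is 20 Mb, so N50 is the length of the block at which the running sum first reaches 10 Mb and N90 the one at which it first reaches 18 Mb.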

10.

Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry

Kelkar DS Kumar D Kumar P Balakrishnan L Muthusamy B Yadav AK Shrivastava P Marimuthu A Anand S Sundaram H Kingsbury R Harsha HC Nair B Prasad TS Chauhan DS Katoch K Katoch VM Kumar P Chaerkady R Ramachandran S Dash D Pandey A 《Molecular & cellular proteomics : MCP》2011,10(12):M111.011627

The genome sequencing of the H37Rv strain of Mycobacterium tuberculosis was completed in 1998, followed by the whole genome sequencing of a clinical isolate, CDC1551, in 2002. Since then, the genomic sequences of a number of other strains have become available, making it one of the better studied pathogenic bacterial species at the genomic level. However, annotation of its genome remains challenging because of high GC content and dissimilarity to other model prokaryotes. To this end, we carried out an in-depth proteogenomic analysis of the M. tuberculosis H37Rv strain using Fourier transform mass spectrometry with high resolution at both MS and tandem MS levels. In all, we identified 3176 proteins from Mycobacterium tuberculosis representing ~80% of its total predicted gene count. In addition to protein database search, we carried out a genome database search, which led to identification of ~250 novel peptides. Based on these novel genome search-specific peptides, we discovered 41 novel protein coding genes in the H37Rv genome. Using peptide evidence and alternative gene prediction tools, we also corrected 79 gene models. Finally, mass spectrometric data from N terminus-derived peptides confirmed 727 existing annotations for translational start sites while correcting those for 33 proteins. We report creation of a high confidence set of protein coding regions in the Mycobacterium tuberculosis genome obtained by high resolution tandem mass-spectrometry at both precursor and fragment detection steps for the first time. This proteogenomic approach should be generally applicable to other organisms whose genomes have already been sequenced for obtaining a more accurate catalogue of protein-coding genes.

11.

Chromosomal targeting of replicating plasmids in the yeast Hansenula polymorpha.

K N Faber G J Swaving F Faber G Ab W Harder M Veenhuis P Haima 《Journal of general microbiology》1992,138(11):2405-2416

Using an optimized transformation protocol we have studied the possible interactions between transforming plasmid DNA and the Hansenula polymorpha genome. Plasmids consisting only of a pBR322 replicon, an antibiotic resistance marker for Escherichia coli and the Saccharomyces cerevisiae LEU2 gene were shown to replicate autonomously in the yeast at an approximate copy number of 6 (copies per genome equivalent). This autonomous behaviour is probably due to an H. polymorpha replicon-like sequence present on the S. cerevisiae LEU2 gene fragment. Plasmids replicated as multimers consisting of monomers connected in a head-to-tail configuration. Two out of nine transformants analysed appeared to contain plasmid multimers in which one of the monomers contained a deletion. Plasmids containing internal or flanking regions of the genomic alcohol oxidase gene were shown to integrate by homologous single or double cross-over recombination. Both single- and multi-copy (two or three) tandem integrations were observed. Targeted integration occurred in 1-22% of the cases and was only observed with plasmids linearized within the genomic sequences, indicating that homologous linear ends are recombinogenic in H. polymorpha. In the cases in which no targeted integration occurred, double-strand breaks were efficiently repaired in a homology-independent way. Repair of double-strand breaks was precise in 50-68% of the cases. Linearization within homologous as well as nonhomologous plasmid regions stimulated transformation frequencies up to 15-fold.

12.

Cytogenetically balanced translocations are associated with focal copy number alterations

Watson SK deLeeuw RJ Horsman DE Squire JA Lam WL 《Human genetics》2007,120(6):795-805

Current cytogenetic methods (e.g., G-banding and multicolor chromosomal painting) allow detection of translocation events but lack the resolution to (a) locate the breakpoints precisely at the chromosome band level or (b) discriminate balanced translocations from translocations with copy number alterations not previously reported, or imperfectly balanced translocations. In this study, we demonstrate that cytogenetically balanced translocations are in fact frequently associated with segmental gain or loss of DNA. The recent development of a whole genome tiling path BAC array has enabled tiling resolution analysis of genomic segmental copy number status. Combining tiling resolution BAC array comparative genomic hybridization (array CGH) with G-Banding analysis and multicolor chromosomal painting approaches such as spectral karyotyping (SKY) facilitates high-resolution mapping of genomic alterations associated with imperfectly balanced translocations. Using a refined version of our CGH array we have deduced the copy number status throughout the genomes of three cytogenetically well-characterized prostate cancer cell lines (PC3, DU145, LNCaP) to determine whether translocations are associated with focal gains and losses of DNA. At 78 kb tiling resolution we identified the boundaries of 170, 80, and 34 known and novel copy number alterations (CNA) in these cell line genomes, respectively. Thirty-three of the 36 known translocations (92%, P < 0.001) in DU145 were associated with segmental CNA. Likewise, 80% (P < 0.001) of the known translocations showed association in LNCaP. Although many translocation breakpoints exhibit segmental alteration in PC3, the pattern of chromosomal rearrangements is too complex for use in comprehensive association with CNA boundaries. Our results reveal that imperfectly balanced translocations in tumor genomes are a phenomenon that occurs at frequencies much higher than previously demonstrated. 
Electronic supplementary material: Supplementary material is available in the online version of this article and is accessible for authorized users.

13.

Low-copy repeats mediate the common 3-Mb deletion in patients with velo-cardio-facial syndrome (total citations: 16; self-citations: 0; citations by others: 16)

Edelmann L Pandita RK Morrow BE 《American journal of human genetics》1999,64(4):1076-1086

Velo-cardio-facial syndrome (VCFS) is the most common microdeletion syndrome in humans. It occurs with an estimated frequency of 1 in 4,000 live births. Most cases occur sporadically, indicating that the deletion is recurrent in the population. More than 90% of patients with VCFS and a 22q11 deletion have a similar 3-Mb hemizygous deletion, suggesting that sequences at the breakpoints confer susceptibility to rearrangements. To define the region containing the chromosome breakpoints, we constructed an 8-kb-resolution physical map. We identified a low-copy repeat in the vicinity of both breakpoints. A set of genetic markers were integrated into the physical map to determine whether the deletions occur within the repeat. Haplotype analysis with genetic markers that flank the repeats showed that most patients with VCFS had deletion breakpoints in the repeat. Within the repeat is a 200-kb duplication of sequences, including a tandem repeat of genes/pseudogenes, surrounding the breakpoints. The genes in the repeat are GGT, BCRL, V7-rel, POM121-like, and GGT-rel. Physical mapping and genomic fingerprint analysis showed that the repeats are virtually identical in the 200-kb region, suggesting that the deletion is mediated by homologous recombination. Examination of two three-generation families showed that meiotic intrachromosomal recombination mediated the deletion.

14.

Diverse approaches to achieving grain yield in wheat

Barrero RA Bellgard M Zhang X 《Functional & integrative genomics》2011,11(1):37-48

Artificial selection (domestication and breeding) leaves a strong footprint in plant genomes. Second generation high throughput DNA sequencing technologies make it possible to sequence the gene complement of a plant genome within 3 to 5 months, and the costs of doing so are declining very quickly. This makes it practical to identify genomic regions that have undergone very strong selection. Available reference sequences of important crops such as rice, maize, and sorghum will promote the wide use of re-sequencing strategies in these crops. Marker/trait associations, especially haplotype (or haplotype block) association analyses, will help the precise mapping of important genomic regions and location of favored alleles or haplotypes for breeding. This mini-review examines a genomics approach to defining yield traits in wheat.

15.

A rhesus macaque radiation hybrid map and comparative analysis with the human genome (total citations: 10; self-citations: 0; citations by others: 10)

Murphy WJ Agarwala R Schäffer AA Stephens R Smith C Crumpler NJ David VA O'Brien SJ 《Genomics》2005,86(4):383-395

The genomes of nonhuman primates are powerful references for better understanding the recent evolution of the human genome. Here we compare the order of 802 genomic markers mapped in a rhesus macaque (Macaca mulatta) radiation hybrid panel with the human genome, allowing for nearly complete cross-reference to the human genome at an average resolution of 3.5 Mb. At least 23 large-scale chromosomal rearrangements, mostly inversions, are needed to explain the changes in marker order between human and macaque. Analysis of the breakpoints flanking inverted chromosomal segments and estimation of their duplication divergence dates provide additional evidence implicating segmental duplications as a major mechanism of chromosomal rearrangement in recent primate evolution.

16.

SVA retrotransposon insertion-associated deletion represents a novel mutational mechanism underlying large genomic copy number changes with non-recurrent breakpoints

Julia Vogt Kathrin Bengesser Kathleen BM Claes Katharina Wimmer Victor-Felix Mautner Rick van Minkelen Eric Legius Hilde Brems Meena Upadhyaya Josef Högel Conxi Lazaro Thorsten Rosenbaum Simone Bammert Ludwine Messiaen David N Cooper Hildegard Kehrer-Sawatzki 《Genome biology》2014,15(6):R80

17.

Genomic organization of the genes coding for the six main histones of the chicken: complete sequence of the H5 gene (total citations: 23; self-citations: 0; citations by others: 23)

A Ruiz-Carrillo M Affolter J Renaud 《Journal of molecular biology》1983,170(4):843-859

18.

Insertion of a short Alu sequence into the hMSH2 gene following a double cross over next to sequences with chi homology

《Gene》1996,174(1):175-179

19.

A sequence-based survey of the complex structural organization of tumor genomes (total citations: 1; self-citations: 0; citations by others: 1)

Raphael BJ Volik S Yu P Wu C Huang G Linardopoulou EV Trask BJ Waldman F Costello J Pienta KJ Mills GB Bajsarowicz K Kobayashi Y Sridharan S Paris PL Tao Q Aerni SJ Brown RP Bashir A Gray JW Cheng JF de Jong P Nefedov M Ried T Padilla-Nash HM Collins CC 《Genome biology》2008,9(3):R59-17

20.

A hybrid approach for the automated finishing of bacterial genomes

Bashir A Klammer AA Robins WP Chin CS Webster D Paxinos E Hsu D Ashby M Wang S Peluso P Sebra R Sorenson J Bullard J Yen J Valdovino M Mollova E Luong K Lin S LaMay B Joshi A Rowe L Frace M Tarr CL Turnsek M Davis BM Kasarskis A Mekalanos JJ Waldor MK Schadt EE 《Nature biotechnology》2012,30(7):701-707

Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.