Similar articles
 20 similar articles found (search time: 15 ms)
1.
Genome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies now enable assembling genomes at unprecedented quality and contiguity. However, the difficulty in assembling repeat‐rich and GC‐rich regions (genomic “dark matter”) limits insights into the evolution of genome structure and regulatory networks. Here, we compare the efficiency of currently available sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter. By adopting different de novo assembly strategies, we compare individual draft assemblies to a curated multiplatform reference assembly and identify the genomic features that cause gaps within each assembly. We show that a multiplatform assembly implementing long‐read, linked‐read and proximity sequencing technologies performs best at recovering transposable elements, multicopy MHC genes, GC‐rich microchromosomes and the repeat‐rich W chromosome. Telomere‐to‐telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is now possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects for optimized completeness of both the coding and noncoding parts of nonmodel genomes.

2.
Biémont C. Genetics 2010, 186(4):1085-1093
The idea that some genetic factors are able to move around chromosomes emerged more than 60 years ago when Barbara McClintock first suggested that such elements existed and had a major role in controlling gene expression and that they also have had a major influence in reshaping genomes in evolution. It was many years, however, before the accumulation of data and theories showed that this latter revolutionary idea was correct although, understandably, it fell far short of our present view of the significant influence of what are now known as "transposable elements" in evolution. In this article, I summarize the main events that influenced my thinking about transposable elements as a young scientist and the influence and role of these specific genomic elements in evolution over subsequent years. Today, we recognize that the findings about genomic changes affected by transposable elements have considerably altered our view of the ways in which genomes evolve and work.

3.
Microsatellites, transposable elements and the X chromosome   Total citations: 4, self-citations: 0, citations by others: 4
Variability at microsatellite (MS) loci is generally perceived as resulting from an interaction between mutation and genetic drift and, to a lesser extent, selection and recombination. Less investigated has been the reason for MS accumulation in genomes. We present here a simple model that could account for the variation in density of MS loci, assuming that they are created either through replication slippage or in association with transposable elements. Microsatellites then evolve under the forces cited above. We use this framework to revisit two results obtained from high-density genomic maps of the human and mouse genomes built with thousands of CA repeats: MS loci are (1) less variable and (2) less dense on the X chromosome than on autosomes. The first result is most likely explained by differential mutation on the X chromosome and the autosomes. The second result may be explained by differential mutation, provided the distributions of MS loci are still not at equilibrium. Selection, acting either directly on large allele size or indirectly on the transposable elements associated with MS, may explain the same result. The framework developed here is a first step toward more rigorous models, calling for additional data.

4.
Genomics 2021, 113(6):4163-4172
This analysis presents five genome assemblies of four Notostraca taxa. Notostraca origin dates to the Permian/Upper Devonian and the extant forms show a striking morphological similarity to fossil taxa. The comparison of sequenced genomes with other Branchiopoda genomes shows that, despite the morphological stasis, Notostraca share a dynamic genome evolution with high turnover for gene families' expansion/contraction and a transposable element content comparable to other branchiopods. While the Notostraca substitution rate appears similar or lower in comparison to other branchiopods, a subset of genes shows a faster evolutionary pace, highlighting the difficulty of generalizing about genomic stasis versus dynamism. Moreover, we found that the variation of Triops cancriformis transposable element content appeared linked to reproductive strategies, in line with theoretical expectations. Overall, besides providing new genomic resources for the study of these organisms, which appear relevant for their ecology and evolution, we also confirmed the decoupling of morphological and molecular evolution.

5.
Transposable elements are mobile DNA sequences that integrate into host genomes using diverse mechanisms with varying degrees of target site specificity. While the target site preferences of some engineered transposable elements are well studied, the natural target preferences of most transposable elements are poorly characterized. Using population genomic resequencing data from 166 strains of Drosophila melanogaster, we identified over 8,000 new insertion sites not present in the reference genome sequence that we used to decode the natural target preferences of 22 families of transposable element in this species. We found that terminal inverted repeat transposon and long terminal repeat retrotransposon families present clade-specific target site duplications and target site sequence motifs. Additionally, we found that the sequence motifs at transposable element target sites are always palindromes that extend beyond the target site duplication. Our results demonstrate the utility of population genomics data for high-throughput inference of transposable element targeting preferences in the wild and establish general rules for terminal inverted repeat transposon and long terminal repeat retrotransposon target site selection in eukaryotic genomes.
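
Note: the abstract above describes target site motifs as palindromes. In DNA terms this usually means a sequence that equals its own reverse complement. The following is a minimal sketch of that check, not the method used in the study; the function names and the example sequences are hypothetical, and the check assumes an unambiguous A/C/G/T alphabet.

```python
# Illustrative sketch only: helper names and test sequences are assumptions,
# not taken from the cited study.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on both strands (5'->3')."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    print(is_palindromic("GAATTC"))   # EcoRI site, a classic palindrome
    print(is_palindromic("GATTACA"))  # odd length, so it cannot be one
```

An odd-length sequence can never pass this strict test, because its middle base would have to be its own complement; real motif analyses typically work with position weight matrices rather than exact strings.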

6.
The ithomiine butterflies (Nymphalidae: Danainae) represent the largest known radiation of Müllerian mimetic butterflies. They dominate by number the mimetic butterfly communities, which include species such as the iconic neotropical Heliconius genus. Recent studies on the ecology and genetics of speciation in Ithomiini have suggested that sexual pheromones, colour pattern and perhaps hostplant could drive reproductive isolation. However, no reference genome was available for Ithomiini, which has hindered further exploration on the genetic architecture of these candidate traits, and more generally on the genomic patterns of divergence. Here, we generated high-quality, chromosome-scale genome assemblies for two Melinaea species, M. marsaeus and M. menophilus, and a draft genome of the species Ithomia salapia. We obtained genomes with a size ranging from 396 to 503 Mb across the three species and scaffold N50 of 40.5 and 23.2 Mb for the two chromosome-scale assemblies. Using collinearity analyses we identified massive rearrangements between the two closely related Melinaea species. An annotation of transposable elements and gene content was performed, as well as a specialist annotation to target chemosensory genes, which are crucial for host plant detection and mate recognition in mimetic species. A comparative genomic approach revealed independent gene expansions in ithomiines and particularly in gustatory receptor genes. These first three genomes of ithomiine mimetic butterflies constitute a valuable addition and a welcome comparison to existing biological models such as Heliconius, and will enable further understanding of the mechanisms of adaptation in butterflies.

7.
As a greater number and diversity of high-quality vertebrate reference genomes become available, it is increasingly feasible to use these references to guide new draft assemblies for related species. Reference-guided assembly approaches may substantially increase the contiguity and completeness of a new genome using only low levels of genome coverage that might otherwise be insufficient for de novo genome assembly. We used low-coverage (∼3.5–5.5x) Illumina paired-end sequencing to assemble draft genomes of two bird species (the Gunnison Sage-Grouse, Centrocercus minimus, and the Clark's Nutcracker, Nucifraga columbiana). We used these data to estimate de novo genome assemblies and reference-guided assemblies, and compared the information content and completeness of these assemblies by comparing CEGMA gene set representation, repeat element content, simple sequence repeat content, and GC isochore structure among assemblies. Our results demonstrate that even lower-coverage genome sequencing projects are capable of producing informative and useful genomic resources, particularly through the use of reference-guided assemblies.

8.

Background

Transposable elements are found in the genomes of nearly all eukaryotes. The recent completion of the Release 3 euchromatic genomic sequence of Drosophila melanogaster by the Berkeley Drosophila Genome Project has provided precise sequence for the repetitive elements in the Drosophila euchromatin. We have used this genomic sequence to describe the euchromatic transposable elements in the sequenced strain of this species.

Results

We identified 85 known and eight novel families of transposable element varying in copy number from one to 146. A total of 1,572 full and partial transposable elements were identified, comprising 3.86% of the sequence. More than two-thirds of the transposable elements are partial. The density of transposable elements increases an average of 4.7 times in the centromere-proximal regions of each of the major chromosome arms. We found that transposable elements are preferentially found outside genes; only 436 of 1,572 transposable elements are contained within the 61.4 Mb of sequence that is annotated as being transcribed. A large proportion of transposable elements is found nested within other elements of the same or different classes. Lastly, an analysis of structural variation from different families reveals distinct patterns of deletion for elements belonging to different classes.

Conclusions

This analysis represents an initial characterization of the transposable elements in the Release 3 euchromatic genomic sequence of D. melanogaster for which comparison to the transposable elements of other organisms can begin to be made. These data have been made available on the Berkeley Drosophila Genome Project website for future analyses.

9.
O'Brien HE, Gong Y, Fung P, Wang PW, Guttman DS. PLoS ONE 2011, 6(11):e27199
Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes that frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an "enhanced-quality draft" genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2-5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains.

10.
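Research progress on Tc1-like transposons   Total citations: 1, self-citations: 0, citations by others: 1
Transposons are widespread in the genomes of diverse organisms; they can transpose between different chromosomal sites and amplify extensively within the genome. Transposon activity can cause recombination and variation in genomes or genes and accelerate biodiversity and evolutionary rates, and it is regarded as an intrinsic driver of genome evolution. Transposons fall into two classes: retrotransposons and DNA transposons. Tc1-like transposons are the most diverse and most widely distributed group within the DNA transposon superfamily. This article briefly reviews the structural features of Tc1-like transposons and the mechanisms of their amplification, transposition and bursts, and discusses prospects for their application and future research.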

11.
12.
High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the de novo assembly of whole genomes. Nevertheless, assembly of complex genomes remains challenging, in part due to the presence of dispersed repeats which introduce ambiguity during genome reconstruction. Transposable elements (TEs) can be particularly problematic, especially for TE families exhibiting high sequence identity, high copy number, or complex genomic arrangements. While TEs strongly affect genome function and evolution, most current de novo assembly approaches cannot resolve long, identical, and abundant families of TEs. Here, we applied a novel Illumina technology called TruSeq synthetic long-reads, which are generated through highly-parallel library preparation and local assembly of short read data and which achieve lengths of 1.5–18.5 Kbp with an extremely low error rate (0.03% per base). To test the utility of this technology, we sequenced and assembled the genome of the model organism Drosophila melanogaster (reference genome strain y; cn, bw, sp) achieving an N50 contig size of 69.7 Kbp and covering 96.9% of the euchromatic chromosome arms of the current reference genome. TruSeq synthetic long-read technology enables placement of individual TE copies in their proper genomic locations as well as accurate reconstruction of TE sequences. We entirely recovered and accurately placed 4,229 (77.8%) of the 5,434 annotated transposable elements with perfect identity to the current reference genome. As TEs are ubiquitous features of genomes of many species, TruSeq synthetic long-reads, and likely other methods that generate long-reads, offer a powerful approach to improve de novo assemblies of whole genomes.
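
Note: the abstract above reports an N50 contig size. Under the common definition, N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The sketch below illustrates that definition only; the function name and the toy contig lengths are hypothetical and are not data from the study.

```python
# Illustrative sketch of the usual N50 definition; the function name and
# the example lengths are assumptions, not values from the cited study.
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    toy_contigs = [80, 70, 50, 40, 30, 20]  # total 290, half is 145
    print(n50(toy_contigs))  # 80 + 70 = 150 >= 145, so this prints 70
```

Because N50 depends only on the length distribution, it measures contiguity and says nothing about whether the contigs are correctly assembled.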

13.
Identifying factors influencing transposable element activity is essential for understanding how these elements impact genomes and their evolution as well as for fully exploiting them as functional genomics tools and gene-therapy vectors. Using a genetics-based approach, the influence of genomic position on piggyBac mobility in Drosophila melanogaster was assessed while controlling for element structure, genetic background, and transposase concentration. The mobility of piggyBac elements varied over more than two orders of magnitude solely as a result of their locations within the genome. The influence of genomic position on element activities was independent of factors resulting in position-dependent transgene expression ("position effects"). Elements could be relocated to new genomic locations without altering their activity if ≥ 500 bp of genomic DNA originally flanking the element was also relocated. Local intrinsic factors within the neighboring DNA that determined the activity of piggyBac elements were portable not only within the genome but also when elements were moved to plasmids. The predicted bendability of the first 50 bp flanking the 5' and 3' termini of piggyBac elements could account for 60% of the variance in position-dependent activity observed among elements. These results are significant because positional influences on transposable element activities will impact patterns of accumulation of elements within genomes. Manipulating and controlling the local sequence context of piggyBac elements could be a powerful, novel way of optimizing gene vector activity.

14.
Silencing of transposable elements in plants.   Total citations: 1, self-citations: 0, citations by others: 1

15.
Advances in modern sequencing technologies allow us to generate sufficient data to analyze hundreds of bacterial genomes from a single machine in a single day. This potential for sequencing massive numbers of genomes calls for fully automated methods to produce high-quality assemblies and variant calls. We introduce Pilon, a fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions. Pilon works with many types of sequence data, but is particularly strong when supplied with paired end data from two Illumina libraries with small (e.g., 180 bp) and large (e.g., 3–5 Kb) inserts. Pilon significantly improves draft genome assemblies by correcting bases, fixing mis-assemblies and filling gaps. For both haploid and diploid genomes, Pilon produces more contiguous genomes with fewer errors, enabling identification of more biologically relevant genes. Furthermore, Pilon identifies small variants with high accuracy as compared to state-of-the-art tools and is unique in its ability to accurately identify large sequence variants including duplications and resolve large insertions. Pilon is being used to improve the assemblies of thousands of new genomes and to identify variants from thousands of clinically relevant bacterial strains. Pilon is freely available as open source software.

16.
17.
Salmonella enterica is divided into four subspecies containing a large number of different serovars, several of which are important zoonotic pathogens and some show a high degree of host specificity or host preference. We compare 45 sequenced S. enterica genomes that are publicly available (22 complete and 23 draft genome sequences). Of these, 35 were found to be of sufficiently good quality to allow a detailed analysis, along with two Escherichia coli strains (K-12 substr. DH10B and the avian pathogenic E. coli (APEC O1) strain). All genomes were subjected to standardized gene finding, and the core and pan-genome of Salmonella were estimated to be around 2,800 and 10,000 gene families, respectively. The constructed pan-genomic dendrograms suggest that gene content is often, but not uniformly correlated to serotype. Any given Salmonella strain has a large stable core, whilst there is an abundance of accessory genes, including the Salmonella pathogenicity islands (SPIs), transposable elements, phages, and plasmid DNA. We visualize conservation in the genomes in relation to chromosomal location and DNA structural features and find that variation in gene content is localized in a selection of variable genomic regions or islands. These include the SPIs but also encompass phage insertion sites and transposable elements. The islands were typically well conserved in several, but not all, isolates—a difference which may have implications in, e.g., host specificity.
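
Note: the abstract above estimates the core and pan-genome in gene families. In the simplest set-based view, the core genome is the intersection of the gene families found in every genome, and the pan-genome is their union. The sketch below shows only that set arithmetic; the function name, strain labels and family identifiers are hypothetical, and the study's actual clustering and estimation procedure is not described in the abstract.

```python
# Illustrative sketch: strain names and gene-family IDs are made up, and
# real analyses first cluster genes into families by sequence similarity.
def core_and_pan(genomes):
    """Given {genome_name: set_of_gene_families}, return (core, pan) sets."""
    family_sets = list(genomes.values())
    if not family_sets:
        raise ValueError("at least one genome is required")
    core = set.intersection(*family_sets)  # families present in every genome
    pan = set.union(*family_sets)          # families present in any genome
    return core, pan


if __name__ == "__main__":
    toy = {
        "strain_a": {"famA", "famB", "famC"},
        "strain_b": {"famA", "famB", "famD"},
        "strain_c": {"famA", "famB", "famC", "famE"},
    }
    core, pan = core_and_pan(toy)
    print(sorted(core))               # ['famA', 'famB']
    print(len(pan))                   # 5
    print(sorted(pan - core))         # accessory families
```

The difference `pan - core` corresponds to the accessory gene content, which in the abstract includes pathogenicity islands, transposable elements, phages and plasmid DNA.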

18.
Although whole-genome sequencing is greatly extending our knowledge of the genetic capacity of bacterial species, it is only directly informative for the particular strain sequenced. Many bacterial species exhibit more or less genetic polymorphism within their populations and characterising this variety is an extremely important way of elucidating the biology of these species. Often genomic polymorphisms are associated with multicopy elements, particularly transposable elements. We describe a novel method that efficiently characterises the sequences of such polymorphisms. We have optimised heminested inverse PCR (hINVPCR) to assess the diversity of insertional polymorphisms of a transposable element (IS6110) in clinical isolates of Mycobacterium tuberculosis. To increase the yield of information, genomic DNA was digested with different endonucleases (Bsp1286I, HaeII or PvuI), and primers based on both the 5' and 3' ends of IS6110 were used to amplify and determine the genomic sequence upstream (or downstream) of the transposable element. We found that both the choice of restriction enzyme and the use of primers at both ends of the transposable element significantly increased the diversity of the insertion sites identified. Band stabbing was incorporated into the method as an alternative to cloning in order to screen large numbers of isolates at a sequence level in a rapid and labour-efficient fashion. We describe some of the purposes to which such data can be put.

19.
Advances in sequencing technology allow genomes to be sequenced at vastly decreased costs. However, the assembled data frequently are highly fragmented with many gaps. We present a practical approach that uses Illumina sequences to improve draft genome assemblies by aligning sequences against contig ends and performing local assemblies to produce gap-spanning contigs. The continuity of a draft genome can thus be substantially improved, often without the need to generate new data.

20.
Earp DJ, Lowe B, Baker B. Nucleic Acids Research 1990, 18(11):3271-3279
The isolation of sequences flanking integrated transposable elements is an important step in gene tagging strategies. We have demonstrated that sequences flanking transposons integrated into complex genomes can be simply and rapidly obtained using the polymerase chain reaction. Amplification of such sequences was established in a model system, a transgenic tobacco plant carrying a single Ac element, and successfully applied to the cloning of a specific Spm element from a maize line carrying multiple Spm hybridizing sequences. The described utilization of methylation sensitive restriction enzymes (including those with degenerate recognition sequences) in the generation of templates for amplification will simplify the cloning and mapping of genomic sequences adjacent to transposable elements.
