Similar literature
Found 20 similar articles (search time: 359 ms)
1.
Triticeae species (including wheat, barley and rye) have huge and complex genomes due to polyploidization and a high content of transposable elements (TEs). TEs are known to play a major role in the structure and evolutionary dynamics of Triticeae genomes. During the last 5 years, substantial stretches of contiguous genomic sequence from various species of Triticeae have been generated, making it necessary to update and standardize TE annotations and nomenclature. In this study we propose standard procedures for these tasks, based on structure, nucleic acid and protein sequence homologies. We report statistical analyses of TE composition and distribution in large blocks of genomic sequences from wheat and barley. Altogether, 3.8 Mb of wheat sequence available in the databases was analyzed or re-analyzed, and compared with 1.3 Mb of re-annotated genomic sequences from barley. The wheat sequences were relatively gene-rich (one gene per 23.9 kb), although wheat gene-derived sequences represented only 7.8% (159 elements) of the total, while the remainder mainly comprised coding sequences found in TEs (54.7%, 751 elements). Class I elements [mainly long terminal repeat (LTR) retrotransposons] accounted for the major proportion of TEs, in terms of sequence length as well as element number (83.6% and 498, respectively). In addition, we show that the gene-rich sequences of wheat genome A seem to have a higher TE content than those of genomes B and D, or of barley gene-rich sequences. Moreover, among the various TE groups, MITEs were most often associated with genes: 43.1% of MITEs fell into this category. Finally, the TRIM and copia elements were shown to be the most active TEs in the wheat genome. The implications of these results for the evolution of diploid and polyploid wheat species are discussed. Electronic Supplementary Material Supplementary material is available for this article at

2.
Bread wheat (Triticum aestivum) is one of the most important crops worldwide. However, because of its large, hexaploid, highly repetitive genome it is a challenge to develop efficient means for molecular analysis and genetic improvement in wheat. To better understand the composition and molecular evolution of the hexaploid wheat homoeologous genomes and to evaluate the potential of BAC-end sequences (BES) for marker development, we have followed a chromosome-specific strategy and generated 11 Mb of random BES from chromosome 3B, the largest chromosome of bread wheat. The sequence consisted of about 86% of repetitive elements, 1.2% of coding regions, and 13% remained unknown. With 1.2% of the sequence length corresponding to coding sequences, 6000 genes were estimated for chromosome 3B. New repetitive sequences were identified, including a Triticineae-specific tandem repeat (Fat) that represents 0.6% of the B-genome and has been differentially amplified in the homoeologous genomes before polyploidization. About 10% of the BES contained junctions between nested transposable elements that were used to develop chromosome-specific markers for physical and genetic mapping. Finally, sequence comparison with 2.9 Mb of random sequences from the D-genome of Aegilops tauschii suggested that the larger size of the B-genome is due to a higher content in repetitive elements. It also indicated which families of transposable elements are mostly responsible for differential expansion of the homoeologous wheat genomes during evolution. Our data demonstrate that BAC-end sequencing from flow-sorted chromosomes is a powerful tool for analysing the structure and evolution of polyploid and highly repetitive genomes.

3.
Organisms with a high density of transposable elements (TEs) exhibit nesting, with subsequent repeats found inside previously inserted elements. Nesting splits the sequence structure of TEs and makes annotation of repetitive areas challenging. We present TEnest, a repeat identification and display tool made specifically for highly repetitive genomes. TEnest identifies repetitive sequences and reconstructs separated sections to provide full-length repeats and, for long-terminal repeat (LTR) retrotransposons, calculates age since insertion based on LTR divergence. TEnest provides a chronological insertion display to give an accurate visual representation of TE integration history showing timeline, location, and families of each TE identified, thus creating a framework from which evolutionary comparisons can be made among various regions of the genome. A database of repeats has been developed for maize (Zea mays), rice (Oryza sativa), wheat (Triticum aestivum), and barley (Hordeum vulgare) to illustrate the potential of TEnest software. All currently finished maize bacterial artificial chromosomes totaling 29.3 Mb were analyzed with TEnest to provide a characterization of the repeat insertions. Sixty-seven percent of the maize genome was found to be made up of TEs; of these, 95% are LTR retrotransposons. The rate of solo LTR formation is shown to be dissimilar across retrotransposon families. Phylogenetic analysis of TE families reveals specific events of extreme TE proliferation, which may explain the high quantities of certain TE families found throughout the maize genome. The TEnest software package is available for use on PlantGDB under the tools section (http://www.plantgdb.org/prj/TE_nest/TE_nest.html); the source code is available from (http://wiselab.org).
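The abstract does not say how TEnest converts LTR divergence into an insertion age. A common approach, sketched here as an assumption rather than as TEnest's actual method, relies on the two LTRs of an element being identical at insertion: the corrected divergence K between them is divided by twice a substitution rate r, giving T = K / (2r). The function name, the Jukes-Cantor correction, and the default rate of 1.3e-8 substitutions per site per year are illustrative choices not taken from the abstract.

```python
import math


def ltr_insertion_age(ltr5: str, ltr3: str, rate: float = 1.3e-8) -> float:
    """Estimate years since insertion from two pre-aligned LTR sequences.

    `rate` (substitutions per site per year) is an assumed parameter;
    choose a value appropriate for the species being studied.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be aligned to the same length")
    # Compare only alignment columns where neither sequence has a gap.
    pairs = [
        (a, b)
        for a, b in zip(ltr5.upper(), ltr3.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions in the alignment")
    # Proportion of differing sites (p-distance).
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    # Jukes-Cantor correction for multiple substitutions at one site.
    k = -0.75 * math.log(1 - 4 * p / 3)
    # Both LTRs accumulate substitutions independently, hence the factor 2.
    return k / (2 * rate)
```

With a 100 bp alignment that differs at 3 sites and the assumed rate, the estimate comes out at a little under 1.2 million years.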

4.
5.
The genomes of barley and wheat, two of the world's most important crops, are very large and complex due to their high content of repetitive DNA. In order to obtain a whole-genome sequence sample, we performed two runs of 454 (GS20) sequencing on genomic DNA of barley cv. Morex, which yielded approximately 1% of a haploid genome equivalent. Almost 60% of the sequences comprised known transposable element (TE) families, and another 9% represented novel repetitive sequences. We also discovered high amounts of low-complexity DNA and non-genic low-copy DNA. We identified almost 2300 protein coding gene sequences and more than 660 putative conserved non-coding sequences. Comparison of the 454 reads with previously published genomic sequences suggested that TE families are distributed unequally along chromosomes. This was confirmed by in situ hybridizations of selected TEs. A comparison of these data for the barley genome with a large sample of publicly available wheat sequences showed that several TE families that are highly abundant in wheat are absent from the barley genome. This finding implies that the TE composition of their genomes differs dramatically, despite their very similar genome size and their close phylogenetic relationship.

6.
7.
A 454 sequencing snapshot was utilised to investigate the genome composition and nucleotide diversity of transposable elements (TEs) for several Triticeae taxa, including Triticum aestivum, Hordeum vulgare, Hordeum spontaneum and Secale cereale together with relatives of the A, B and D genome donors of wheat, Triticum urartu (A), Aegilops speltoides (S) and Aegilops tauschii (D). Additional taxa containing the A genome, Triticum monococcum and its wild relative Triticum boeoticum, were also included. The main focus of the analysis was on the genomic composition of TEs as these make up at least 80% of the overall genome content. Although more than 200 TE families were identified in each species, approximately 50% of the overall genome comprised 12–15 TE families. The BARE1 element was the largest contributor to all genomes, contributing more than 10% to the overall genome. We also found that several TE families differ strongly in their abundance between species, indicating that TE families can thrive extremely successfully in one species while going virtually extinct in another. Additionally, the nucleotide diversity of BARE1 populations within individual genomes was measured. Interestingly, the nucleotide diversity in the domesticated barley H. vulgare cv. Barke was found to be twice as high as in its wild progenitor H. spontaneum, suggesting that the domesticated barley gained nucleotide diversity from the addition of different genotypes during the domestication and breeding process. In the rye/wheat lineage, sequence diversity of BARE1 elements was generally higher, suggesting that factors such as geographical distribution and mating systems might play a role in intragenomic TE diversity.

8.
Discovering and detecting transposable elements in genome sequences (total citations: 2; self-citations: 0; citations by others: 2)
The contribution of transposable elements (TEs) to genome structure and evolution as well as their impact on genome sequencing, assembly, annotation and alignment has generated increasing interest in developing new methods for their computational analysis. Here we review the diversity of innovative approaches to identify and annotate TEs in the post-genomic era, covering both the discovery of new TE families and the detection of individual TE copies in genome sequences. These approaches span a broad spectrum in computational biology including de novo, homology-based, structure-based and comparative genomic methods. We conclude that the integration and visualization of multiple approaches and the development of new conceptual representations for TE annotation will further advance the computational analysis of this dynamic component of the genome.

9.
The techniques that are usually used to detect transposable elements (TEs) in nucleic acid sequences rely on sequence similarity with previously characterized elements. However, these methods are likely to miss many elements in various organisms. We tested two strategies for the detection of unknown elements. The first, which we call "TBLASTX strategy," searches for TE sequences by comparing the six-frame translations of the nucleic acid sequences of known TEs with the genomic sequence of interest. The second, "repeat-based strategy," searches genomic sequences for long repeats and clusters them in groups of similar sequences. TE copies from a given family are expected to cluster together. We tested the Drosophila melanogaster genomic sequence and the recently sequenced Anopheles gambiae genome in which most TEs remain unknown. We showed that the "TBLASTX strategy" is very efficient as it detected at least 332 new TE families in D. melanogaster and 400 in A. gambiae. This was unexpected in Drosophila as TEs of this organism have been extensively studied. The "repeat-based strategy" appeared to be very inefficient because of two problems: (i) TE copies are heavily deleted and few copies share homologous regions, and (ii) segmental duplications are frequent and it is not easy to distinguish them from TE copies.
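The six-frame translation step behind a TBLASTX-style search can be illustrated with a minimal sketch. It covers only the translation of one nucleotide sequence into its three forward and three reverse-complement reading frames using the standard genetic code, not the similarity search that follows. The function names and the use of "X" for codons containing ambiguous bases are illustrative assumptions, not part of the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Return the protein translation of each of the six reading frames."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse_complement[offset:])
    return frames
```

Comparing at the protein level is what lets such a search pick up diverged elements whose nucleotide sequences no longer align well.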

10.
New classes of repetitive DNA elements were effectively identified by isolating small fragments of the elements from the wheat genome. A wheat A genome library was constructed from Triticum monococcum by degenerate cleavage with EcoO109I, the recognition sites of which consisted of 5'-PuGGNCCPy-3' multi-sequences. Three novel repetitive sequences pTm6, pTm69 and pTm58 derived from the A genome were screened and tested for high copy number using a blotting approach. pTm6 showed identity with integrase domains of the barley Ty1-Copia retrotransposon BARE-1 and pTm58 showed similarity to the barley Ty3-gypsy-like retrotransposon Romani. pTm69, however, constituted a tandem array with useful genomic specificities, but did not share any identity with known repetitive elements. This study also sought to isolate wheat D-genome-specific repetitive elements regardless of the level of methylation, by genomic subtraction. Total genomic DNA of Aegilops tauschii was cleaved into short fragments with a methylation-insensitive 4 bp cutter, MboI, and then common DNA sequences between Ae. tauschii and Triticum turgidum were subtracted by annealing with excess T. turgidum genomic DNA. The D genome repetitive sequence pAt1 was isolated and used to identify an additional novel repetitive sequence family from wheat bacterial artificial chromosomes with a size range of 1,395-1,850 bp. These methods led to the identification of two unique repetitive families.

11.
Transposable elements (TEs) constitute >80% of the wheat genome but their dynamics and contribution to size variation and evolution of wheat genomes (Triticum and Aegilops species) remain unexplored. In this study, 10 genomic regions have been sequenced from wheat chromosome 3B and used to constitute, along with all publicly available genomic sequences of wheat, 1.98 Mb of sequence (from 13 BAC clones) of the wheat B genome and 3.63 Mb of sequence (from 19 BAC clones) of the wheat A genome. Analysis of TE sequence proportions (as percentages), ratios of complete to truncated copies, and estimation of insertion dates of class I retrotransposons showed that specific types of TEs have undergone waves of differential proliferation in the B and A genomes of wheat. While both genomes show similar rates and relatively ancient proliferation periods for the Athila retrotransposons, the Copia retrotransposons proliferated more recently in the A genome whereas Gypsy retrotransposon proliferation is more recent in the B genome. It was possible to estimate for the first time the proliferation periods of the abundant CACTA class II DNA transposons, relative to that of the three main retrotransposon superfamilies. Proliferation of these TEs started prior to and overlapped with that of the Athila retrotransposons in both genomes. However, they also proliferated during the same periods as Gypsy and Copia retrotransposons in the A genome, but not in the B genome. As estimated from their insertion dates and confirmed by PCR-based tracing analysis, the majority of differential proliferation of TEs in B and A genomes of wheat (87 and 83%, respectively), leading to rapid sequence divergence, occurred prior to the allotetraploidization event that brought them together in Triticum turgidum and Triticum aestivum, <0.5 million years ago. More importantly, the allotetraploidization event appears to have neither enhanced nor repressed retrotranspositions. We discuss the apparent proliferation of TEs as resulting from their insertion, removal, and/or combinations of both evolutionary forces.

12.
Repetitive genomic sequences might have various structural features and properties distinct from those of the known transposable elements (TE). Here, the content and properties of the repetitive sequences present in a 200-kb region around the rice waxy locus were analyzed using the available rice genomic database. In our previous Southern blotting analysis, 70% of the segments in this region showed smeared patterns, but according to the present database analysis, the proportion of repetitive sequences in this region was only 15%. The repetitive segments in this 200-kb region comprised 75 repetitive sequences that we classified into 46 subfamilies: 21 subfamilies were known TEs or repetitive sequences and 25 subfamilies consisted of newly identified TEs or novel types of repetitive sequences. The region contains no long terminal repeat (LTR) retrotransposable elements, but miniature inverted repeat transposable elements (MITEs) constituted a major class among the elements identified. These MITEs showed remarkable structural divergence: 12 elements were found to be new members of known MITE superfamilies, while five elements had novel terminal structures, and did not belong to any known TE families. Interestingly, about 10% of the repetitive sequences, including virus-like sequences did not have any of the usual characteristics of TEs, suggesting that a certain proportion of repetitive sequences that might not share the transpositional mechanisms of known elements are dispersed in the compact rice genome.

13.
14.
15.
C Gao, M Xiao, X Ren, A Hayward, J Yin, L Wu, D Fu, J Li. Genomics, 2012, 100(4): 222-230
The movement of transposable elements (TE) in eukaryotic genomes can often result in the occurrence of nested TEs (the insertion of TEs into pre-existing TEs). We performed a general TE assessment using available databases to detect nested TEs and analyze their characteristics and putative functions in eukaryote genomes. A total of 802 TEs were found to be inserted into 690 host TEs from a total number of 11,329 TEs. We reveal that repetitive sequences are associated with an increased occurrence of nested TEs and sequence bias of TE insertion. A high proportion of the genes which were associated with nested TEs are predicted to localize to organelles and participate in nucleic acid and protein binding. Many of these function in metabolic processes, and encode important enzymes for transposition and integration. Therefore, nested TEs in eukaryotic genomes may negatively influence genome expansion, and enrich the diversity of gene expression or regulation.

16.
High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the de novo assembly of whole genomes. Nevertheless, assembly of complex genomes remains challenging, in part due to the presence of dispersed repeats which introduce ambiguity during genome reconstruction. Transposable elements (TEs) can be particularly problematic, especially for TE families exhibiting high sequence identity, high copy number, or complex genomic arrangements. While TEs strongly affect genome function and evolution, most current de novo assembly approaches cannot resolve long, identical, and abundant families of TEs. Here, we applied a novel Illumina technology called TruSeq synthetic long-reads, which are generated through highly-parallel library preparation and local assembly of short read data and which achieve lengths of 1.5–18.5 Kbp with an extremely low error rate (0.03% per base). To test the utility of this technology, we sequenced and assembled the genome of the model organism Drosophila melanogaster (reference genome strain y; cn, bw, sp) achieving an N50 contig size of 69.7 Kbp and covering 96.9% of the euchromatic chromosome arms of the current reference genome. TruSeq synthetic long-read technology enables placement of individual TE copies in their proper genomic locations as well as accurate reconstruction of TE sequences. We entirely recovered and accurately placed 4,229 (77.8%) of the 5,434 annotated transposable elements with perfect identity to the current reference genome. As TEs are ubiquitous features of genomes of many species, TruSeq synthetic long-reads, and likely other methods that generate long-reads, offer a powerful approach to improve de novo assemblies of whole genomes.
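The N50 contig size quoted above is a standard assembly statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. A minimal sketch of that definition follows; the function name is illustrative and the code is not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    # Add contigs from longest to shortest until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
```

For contigs of lengths 2, 3, 4, 5 and 6 (total 20), the two longest already sum to 11, so the N50 is 5.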

17.
The Drosophila melanogaster genome contains approximately 100 distinct families of transposable elements (TEs). In the euchromatic part of the genome, each family is present in a small number of copies (5-150 copies), with individual copies of TEs often present at very low frequencies in populations. This pattern is likely to reflect a balance between the inflow of TEs by transposition and the removal of TEs by natural selection. The nature of natural selection acting against TEs remains controversial. We provide evidence that selection against chromosome abnormalities caused by ectopic recombination limits the spread of some TEs. We also demonstrate for the first time that some TE families in the Drosophila euchromatin appear to be only marginally affected by purifying selection and contain many copies at high population frequencies. We argue that TEs in these families attain high population frequencies and even reach fixation as a result of low family-wide transposition rates leading to low TE copy numbers and consequently reduced strength of selection acting on individual TE copies. Fixation of TEs in these families should provide an upward pressure on the size of intergenic sequences counterbalancing rapid DNA loss through small deletions. Copy-number-dependent selection on TE families caused by ectopic recombination may also promote diversity among TEs in the Drosophila genome.

18.
Reliable population DNA molecular markers are difficult to develop for molluscs, the reasons for which are largely unknown. Identical protocols for microsatellite marker development were implemented in three gastropods. Success rates were lower for Gibbula cineraria compared to Littorina littorea and L. saxatilis. Comparative genomic analysis of 47.2 kb of microsatellite-containing sequences (MCS) revealed a high incidence of cryptic repetitive DNA in their flanking regions. The majority of these were novel, and could be grouped into DNA families based upon sequence similarities. Significant inter-specific variation in abundance of cryptic repetitive DNA and DNA families was observed. Repbase scans show that a large proportion of cryptic repetitive DNA was identified as transposable elements (TEs). We argue that a large number of TEs and their transpositional activity may be linked to differential rates of DNA multiplication and recombination. This is likely to be an important factor explaining inter-specific variation in genome stability and hence microsatellite marker development success rates. Gastropods also differed significantly in the types of TE classes (autonomous vs non-autonomous) observed. We propose that dissimilar transpositional mechanisms differentiate the TE classes in terms of their propensity for transposition, fixation and/or silencing. Consequently, the phylogenetic conservation of non-autonomous TEs, such as CvA, suggests that dispersal of these elements may have behaved as microsatellite-inducing elements. Results seem to indicate that, compared to autonomous TEs, non-autonomous TEs may have a more active role in genome rearrangement processes. The implications of the findings for genomic rearrangement, stability and marker development are discussed.

19.
In plant species with large genomes such as wheat or barley, genome organization at the level of DNA sequence is largely unknown. The largest sequences that are publicly accessible so far from Triticeae genomes are two 60 kb and 66 kb intervals from barley. Here, we report on the analysis of a 211 kb contiguous DNA sequence from diploid wheat (Triticum monococcum L.). Five putative genes were identified, two of which show similarity to disease resistance genes. Three of the five genes are clustered in a 31 kb gene-enriched island while the two others are separated from the cluster and from each other by large stretches of repetitive DNA. About 70% of the contig is comprised of several classes of transposable elements. Ten different types of retrotransposons were identified, most of them forming a pattern of nested insertions similar to those found in maize and barley. Evidence was found for major deletion, insertion and duplication events within the analysed region, suggesting multiple mechanisms of genome evolution in addition to retrotransposon amplification. Seven types of foldback transposons, an element class previously not described for wheat genomes, were characterized. One such element was found to be closely associated with genes in several Triticeae species and may therefore be of use for the identification of gene-rich regions in these species.

20.
In eukaryotes, small noncoding RNA molecules of 16–29 nucleotides in length play crucial roles in the regulation of gene expression. Some 377 sequences representing rice pseudo-microRNAs (miRNAs) are available in release 13.0 of the miRBase sequence database and are grouped into 143 families. Most newly deposited miRNA sequences are likely to be species-specific. To understand the relationship between miRNAs and transposable elements (TEs) in rice, the RepeatMasker application was used to screen single-stranded precursor miRNA (pre-miRNA) sequences. This analysis revealed that 33.1% of miRNAs and 36.4% of miRNA families are associated with interspersed repeats, and most of them are species-specific. Furthermore, multiple miRNA families can be encoded by the same TE class. Alignment analysis revealed that miR439 originated from an MuDR4-OS TE, which amplified and diversified in the genome as an inverted repeat of the core sequence followed by multiple repeats. Multiple copies of miR445 and its complexity originate from and are driven by the DNA/Tourist TE class. These results provide an important contribution to the elucidation of TE-driven mechanisms that regulate the species specificity and complexity of rice miRNAs.
