1.

Background

Despite having predominately deleterious fitness effects, transposable elements (TEs) are major constituents of eukaryote genomes in general and of plant genomes in particular. Although the proportion of the genome made up of TEs varies at least four-fold across plants, the relative importance of the evolutionary forces shaping variation in TE abundance and distributions across taxa remains unclear. Under several theoretical models, mating system plays an important role in governing the evolutionary dynamics of TEs. Here, we use the recently sequenced Capsella rubella reference genome and short-read whole genome sequencing of multiple individuals to quantify abundance, genome distributions, and population frequencies of TEs in three recently diverged species of differing mating system, two self-compatible species (C. rubella and C. orientalis) and their self-incompatible outcrossing relative, C. grandiflora.

Results

We detect different dynamics of TE evolution in our two self-compatible species; C. rubella shows a small increase in transposon copy number, while C. orientalis shows a substantial decrease relative to C. grandiflora. The direction of this change in copy number is genome wide and consistent across transposon classes. For insertions near genes, however, we detect the highest abundances in C. grandiflora. Finally, we also find differences in the population frequency distributions across the three species.

Conclusion

Overall, our results suggest that the evolution of selfing may have different effects on TE evolution on a short and on a long timescale. Moreover, cross-species comparisons of transposon abundance are sensitive to reference genome bias, and efforts to control for this bias are key when making comparisons across species.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-602) contains supplementary material, which is available to authorized users.

2.

Background

Transposable Elements (TEs) are key components that shape the organization and evolution of genomes. Fungi have developed defense mechanisms against TE invasion such as RIP (Repeat-Induced Point mutation), MIP (Methylation Induced Premeiotically) and Quelling (RNA interference). RIP inactivates repeated sequences by promoting Cytosine to Thymine mutations, whereas MIP only methylates TEs at C residues. Both mechanisms require specific cytosine DNA Methyltransferases (RID1/Masc1) of the Dnmt1 superfamily.

Results

We annotated TE sequences from 10 fungal genomes with different TE content (1-70%). We then used these TE sequences to carry out a genome-wide analysis of C to T mutation biases. Genomes from either Ascomycota or Basidiomycota that were massively invaded by TEs (Blumeria, Melampsora, Puccinia) were characterized by a low frequency of C to T mutation bias (10-20%), whereas other genomes displayed intermediate to high frequencies (25-75%). We identified several dinucleotide signatures at these C to T mutation sites (CpA, CpT, and CpG). Phylogenomic analysis of fungal Dnmt1 MTases revealed a previously unreported association between these dinucleotide signatures and the presence/absence of sub-classes of Dnmt1.

Conclusions

We identified fungal genomes containing large numbers of TEs with many C to T mutations associated with species-specific dinucleotide signatures. This bias suggests that a basic defense mechanism against TE invasion similar to RIP is widespread in fungi, although the efficiency and specificity of this mechanism differ between species. Our analysis revealed that dinucleotide signatures are associated with the presence/absence of specific Dnmt1 subfamilies. In particular, an RID1-dependent RIP mechanism was found only in Ascomycota.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1347-1) contains supplementary material, which is available to authorized users.

3.

Background

Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.

Results

To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.

Conclusion

Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1647-5) contains supplementary material, which is available to authorized users.

4.

Background

Comparative evolutionary analysis of whole genomes requires not only accurate annotation of gene space, but also proper annotation of the repetitive fraction which is often the largest component of most if not all genomes larger than 50 kb in size.

Results

Here we present the Rice TE database (RiTE-db) - a genus-wide collection of transposable elements and repeated sequences across 11 diploid species of the genus Oryza and the closely-related out-group Leersia perrieri. The database consists of more than 170,000 entries divided into three main types: (i) a classified and curated set of publicly-available repeated sequences, (ii) a set of consensus assemblies of highly-repetitive sequences obtained from genome sequencing surveys of 12 species; and (iii) a set of full-length TEs, identified and extracted from 12 whole genome assemblies.

Conclusions

This is the first report of a repeat dataset that spans the majority of repeat variability within an entire genus, and one that includes complete elements as well as unassembled repeats. The database allows sequence browsing, downloading, and similarity searches. Because of the strategy adopted, the RiTE-db opens a new path to unprecedented direct comparative studies that span the entire nuclear repeat content of 15 million years of Oryza diversity.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1762-3) contains supplementary material, which is available to authorized users.

5.

Background

The 17 Gb bread wheat genome has massively expanded through the proliferation of transposable elements (TEs) and two recent rounds of polyploidization. The assembly of a 774 Mb reference sequence of wheat chromosome 3B provided us with the opportunity to explore the impact of TEs on the complex wheat genome structure and evolution at a resolution and scale not reached so far.

Results

We develop an automated workflow, CLARI-TE, for TE modeling in complex genomes. We precisely delineate 56,488 intact and 196,391 fragmented TEs along the 3B pseudomolecule, accounting for 85% of the sequence, and reconstruct 30,199 nested insertions. TEs have been mostly silent for the last one million years, and the 3B chromosome has been shaped by a succession of bursts that occurred between 1 and 3 million years ago. Accelerated TE elimination in the high-recombination distal regions is a driving force towards chromosome partitioning. CACTAs overrepresented in the high-recombination distal regions are significantly associated with recently duplicated genes. In addition, we identify 140 CACTA-mediated gene capture events with 17 genes potentially created by exon shuffling and show that 19 captured genes are transcribed and under selection pressure, suggesting the important role of CACTAs in the recent wheat adaptation.

Conclusion

Accurate TE modeling uncovers the dynamics of TEs in a highly complex and polyploid genome. It provides novel insights into chromosome partitioning and highlights the role of CACTA transposons in the high level of gene duplication in wheat.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0546-4) contains supplementary material, which is available to authorized users.

6.

Background

Many genomes contain a substantial number of transposable elements (TEs), a few of which are known to be involved in regulating gene expression. However, recent observations suggest that TEs may have played a very important role in the evolution of gene expression because many conserved non-genic sequences, some of which are known to be involved in gene regulation, resemble TEs.

Results

Here we investigate whether new TE insertions affect gene expression profiles by testing whether gene expression divergence between mouse and rat is correlated to the numbers of new transposable elements inserted near genes. We show that expression divergence is significantly correlated to the number of new LTR and SINE elements, but not to the numbers of LINEs. We also show that expression divergence is not significantly correlated to the numbers of ancestral TEs in most cases, which suggests that the correlations between expression divergence and the numbers of new TEs are causal in nature. We quantify the effect and estimate that TE insertion has accounted for ∼20% (95% confidence interval: 12% to 26%) of all expression profile divergence in rodents.

Conclusions

We conclude that TE insertions may have had a major impact on the evolution of gene expression levels in rodents.

7.

Summary

The classification of transposable elements (TEs) is a key step towards deciphering their potential impact on the genome. However, this process is often based on manual sequence inspection by TE experts. With the wealth of genomic sequences now available, this task requires automation, making it accessible to most scientists. We propose a new tool, PASTEC, which classifies TEs by searching for structural features and similarities. This tool outperforms currently available software for TE classification. The main innovation of PASTEC is the search for HMM profiles, which is useful for inferring the classification of unknown TEs on the basis of conserved functional domains of the proteins. In addition, PASTEC is the only tool providing an exhaustive spectrum of possible classifications to the order level of the Wicker hierarchical TE classification system. It can also automatically classify other repeated elements, such as SSR (Simple Sequence Repeats), rDNA or potential repeated host genes. Finally, the output of this new tool is designed to facilitate manual curation by providing biologists with all the evidence accumulated for each TE consensus.

Availability

PASTEC is available as a REPET module or standalone software (http://urgi.versailles.inra.fr/download/repet/REPET_linux-x64-2.2.tar.gz). It requires a Unix-like system. There are two standalone versions: one is parallelized (requiring Sun Grid Engine or Torque) and the other is not.

8.

Background and Aims

Although monocotyledonous plants comprise one of the two major groups of angiosperms and include >65 000 species, comprehensive genome analysis has been focused mainly on the Poaceae (grass) family. Due to this bias, most of the conclusions that have been drawn for monocot genome evolution are based on grasses. It is not known whether these conclusions apply to many other monocots.

Methods

To extend our understanding of genome evolution in the monocots, Asparagales genomic sequence data were acquired and the structural properties of asparagus and onion genomes were analysed. Specifically, several available onion and asparagus bacterial artificial chromosomes (BACs) with contig sizes >35 kb were annotated and analysed, with a particular focus on the characterization of long terminal repeat (LTR) retrotransposons.

Key Results

The results reveal that LTR retrotransposons are the major components of the onion and garden asparagus genomes. These elements are mostly intact (i.e. with two LTRs), have mainly inserted within the past 6 million years and are piled up into nested structures. Analysis of shotgun genomic sequence data and the observation of two copies for some transposable elements (TEs) in annotated BACs indicate that some families have become particularly abundant, as high as 4–5 % (asparagus) or 3–4 % (onion) of the genome for the most abundant families, as also seen in large grass genomes such as wheat and maize.

Conclusions

Although previous annotations of contiguous genomic sequences have suggested that LTR retrotransposons were highly fragmented in these two Asparagales genomes, the results presented here show that this was largely due to the methodology used. In contrast, this current work indicates an ensemble of genomic features similar to those observed in the Poaceae.

9.
10.

Background

The transposable element (TE) content of the genomes of plant species varies from near zero in the genome of Utricularia gibba to more than 80 % in many species. It is not well understood whether this variation in genome composition results from common mechanisms or stochastic variation. The major obstacles to investigating mechanisms of TE evolution have been a lack of comparative genomic data sets and efficient computational methods for measuring differences in TE composition between species. In this study, we describe patterns of TE evolution in 14 species in the flowering plant family Asteraceae and 1 outgroup species in the Calyceraceae to investigate phylogenetic patterns of TE dynamics in this important group of plants.

Results

Our findings indicate that TE families in the Asteraceae exhibit distinct patterns of non-neutral evolution, and that there has been a directional increase in copy number of Gypsy retrotransposons since the origin of the Asteraceae. Specifically, there is a marked increase in Gypsy abundance at the origin of the Asteraceae and at the base of the tribe Heliantheae. This latter shift in genome composition has had a significant impact on the diversity and abundance distribution of TEs in a lineage-specific manner.

Conclusions

We show that the TE-driven expansion of plant genomes can be facilitated by just a few TE families, and is likely accompanied by the modification and/or replacement of the TE community. Importantly, large shifts in TE composition may be correlated with major phylogenetic transitions.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1830-8) contains supplementary material, which is available to authorized users.

11.
12.
13.
14.
15.

Background

Personal genome assembly is a critical process when studying tumor genomes and other highly divergent sequences. The accuracy of downstream analyses, such as RNA-seq and ChIP-seq, can be greatly enhanced by using personal genomic sequences rather than standard references. Unfortunately, reads sequenced from these types of samples often have a heterogeneous mix of various subpopulations with different variants, making assembly extremely difficult using existing assembly tools. To address these challenges, we developed SHEAR (Sample Heterogeneity Estimation and Assembly by Reference; http://vk.cs.umn.edu/SHEAR), a tool that predicts SVs, accounts for heterogeneous variants by estimating their representative percentages, and generates personal genomic sequences to be used for downstream analysis.

Results

By making use of structural variant detection algorithms, SHEAR offers improved performance in the form of a stronger ability to handle difficult structural variant types and better computational efficiency. We compare against the lead competing approach using a variety of simulated scenarios as well as real tumor cell line data with known heterogeneous variants. SHEAR is shown to successfully estimate heterogeneity percentages in both cases, and demonstrates improved efficiency and a better ability to handle tandem duplications.

Conclusion

SHEAR allows for accurate and efficient SV detection and personal genomic sequence generation. It is also able to account for heterogeneous sequencing samples, such as from tumor tissue, by estimating the subpopulation percentage for each heterogeneous variant.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-84) contains supplementary material, which is available to authorized users.

16.
17.
18.

Background

Cochliobolus heterostrophus is a dothideomycete that causes Southern Corn Leaf Blight disease. There are two races, race O and race T, which differ by the absence (race O) or presence (race T) of ~1.2 Mb of DNA encoding genes responsible for the production of T-toxin, which makes race T much more virulent than race O. The presence of repetitive elements in fungal genomes is considered to be an important source of genetic variability between different species.

Results

A detailed analysis of class I and II TEs identified in the near complete genome sequence of race O was performed. In total in race O, 12 new families of transposons were identified. In silico evidence of recent activity was found for many of the transposons and analyses of expressed sequence tags (ESTs) demonstrated that these elements were actively transcribed. Various potentially active TEs were found near coding regions and may modify the expression and structure of these genes by acting as ectopic recombination sites. Transposons were found on scaffolds carrying polyketide synthase encoding genes, responsible for production of T-toxin in race T. Strong evidence of ectopic recombination was found, demonstrating that TEs can play an important role in the modulation of genome architecture of this species. The Repeat Induced Point mutation (RIP) silencing mechanism was shown to have high specificity in C. heterostrophus, acting only on transposons near coding regions.

Conclusions

New families of transposons were identified. In C. heterostrophus, the RIP silencing mechanism is efficient and selective. The co-localization of effector genes and TEs, therefore, exposes those genes to high rates of point mutations. This may accelerate the rate of evolution of these genes, providing a potential advantage for the host. Additionally, it was shown that ectopic recombination promoted by TEs appears to be the major event in the genome reorganization of this species and that a large number of elements are still potentially active. So, this study provides information about the potential impact of TEs on the evolution of C. heterostrophus.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-536) contains supplementary material, which is available to authorized users.

19.

Background

Vibrio parahaemolyticus is a Gram-negative halophilic bacterium. Infections with the bacterium could become systemic and can be life-threatening to immunocompromised individuals. Genome sequences of a few clinical isolates of V. parahaemolyticus are currently available, but the genome dynamics across the species and virulence potential of environmental strains on a genome-scale have not been described before.

Results

Here we present genome sequences of four V. parahaemolyticus clinical strains from stool samples of patients and five environmental strains in Hong Kong. Phylogenomic analysis based on single nucleotide polymorphisms revealed a clear distinction between the clinical and environmental isolates. A new gene cluster belonging to the biofilm associated proteins of V. parahaemolyticus was found in clinical strains. In addition, a novel small genomic island frequently found among clinical isolates was reported. A few environmental strains were found harboring virulence genes and prophage elements, indicating their virulence potential. A unique biphenyl degradation pathway was also reported. A database for V. parahaemolyticus (http://kwanlab.bio.cuhk.edu.hk/vp) was constructed here as a platform to access and analyze genome sequences and annotations of the bacterium.

Conclusions

We have performed a comparative genomics analysis of clinical and environmental strains of V. parahaemolyticus. Our analyses could facilitate understanding of the phylogenetic diversity and niche adaptation of this bacterium.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1135) contains supplementary material, which is available to authorized users.

20.