Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Large-scale features of the spatial arrangement of protein-coding segments (PCS) are investigated by means of the inter-PCS spacers' size distributions, which have been found to follow power-laws. Linearity in double-logarithmic scale extends to several orders of magnitude in the genomes of organisms as disparate as mammals, insects and plants. This feature is also present in the most compact eukaryotic genomes and in half of the examined bacteria, despite their very limited non-coding space. We have tried to determine the sequence of events in the course of genomes' evolution which may account for the formation of the observed size distributions. The proposed mechanism essentially includes two types of events: (i) segmental duplications (and possibly paleopolyploidy), and (ii) the subsequent loss of most of the duplicated genes. It is shown by computer simulations that the formulated scenario generates power-law-like inter-PCS spacers' size distributions, which remain robust for a variety of parameter choices, even if insertion of external sequences, such as viruses or proliferating retroelements is included. Moreover, power-laws are preserved after most of the non-coding DNA has been removed, thus explaining the finding of this pattern in genomes as compact as that of Takifugu rubripes.
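
As a rough illustration of the kind of measurement this abstract describes, the sketch below derives spacer lengths from a list of segment coordinates and estimates the slope of their size distribution on a double-logarithmic scale. The function names, the half-open `(start, end)` coordinate convention, and the power-of-two binning are assumptions made for this example; the abstract does not specify the authors' procedure.

```python
from collections import Counter


def spacer_lengths(segments):
    """Return gap lengths between consecutive segments on one sequence.

    `segments` is an iterable of (start, end) half-open intervals
    (hypothetical input format). Overlapping segments are merged.
    """
    ordered = sorted(segments)
    if not ordered:
        return []
    gaps = []
    prev_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start > prev_end:
            gaps.append(start - prev_end)
        prev_end = max(prev_end, end)
    return gaps


def loglog_slope(lengths):
    """Least-squares slope of log2(density) against log2(size).

    Sizes are grouped into bins [2^k, 2^(k+1)); each count is divided
    by the bin width 2^k to approximate a density. This binning is an
    illustrative choice, not the method of the cited work.
    """
    counts = Counter(int(x).bit_length() - 1 for x in lengths if x >= 1)
    if len(counts) < 2:
        raise ValueError("need at least two non-empty bins")
    xs = [k + 0.5 for k in counts]                      # bin centre in log2 units
    ys = [_log2(counts[k] / 2 ** k) for k in counts]    # log2 of density
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def _log2(value):
    import math
    return math.log2(value)
```

A roughly straight line over several bins, with a stable negative slope, would be consistent with the power-law-like behaviour reported above; a real analysis would also need a goodness-of-fit test, which this sketch omits.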

2.
The spatial distribution and clustering of repetitive elements have been studied extensively in recent years, as has their colocalization with other genomic components. Here we investigate the large-scale features of Alu and LINE1 spatial arrangement in the human genome by studying the size distribution of interrepeat distances. In most cases, we have found power-law size distributions extending over several orders of magnitude. We have also studied the correlations of the extent of the power law (linear region in double-logarithmic scale) and of the corresponding exponent (slope) with other genomic properties. A model has been formulated to explain the formation of the observed power laws. According to the model, two kinds of events occur repetitively in evolutionary time: random insertion of several types of intruding sequences and occasional loss of repeats belonging to the initial population due to "elimination" events. This simple mechanism is shown to reproduce the observed power-law size distributions and is compatible with our present knowledge on the dynamics of repeat proliferation in the genome.

3.
4.
Repetitive sequences are a major constituent of many eukaryote genomes and play roles in gene regulation, chromosome inheritance, nuclear architecture, and genome stability. The identification of repetitive elements has traditionally relied on in-depth, manual curation and computational determination of close relatives based on DNA identity. However, the rapid divergence of repetitive sequence has made identification of repeats by DNA identity difficult even in closely related species. Hence, the presence of unidentified repeats in genome sequences affects the quality of gene annotations and annotation-dependent analyses (e.g. microarray analyses). We have developed an enhanced repeat identification pipeline using two approaches. First, the de novo repeat finding program PILER-DF was used to identify interspersed repetitive elements in several recently finished Dipteran genomes. Repeats were classified, when possible, according to their similarity to known elements described in Repbase and GenBank, and also screened against annotated genes as one means of eliminating false positives. Second, we used a new program called RepeatRunner, which integrates results from both RepeatMasker nucleotide searches and protein searches using BLASTX. Using RepeatRunner with PILER-DF predictions, we masked repeats in thirteen Dipteran genomes and conclude that combining PILER-DF and RepeatRunner greatly enhances repeat identification in both well-characterized and un-annotated genomes.

5.
The pseudo-fourfold homotetrameric synapse formed by Cre protein and target DNA restricts site-specific recombination to sequences containing dyad-symmetric Cre-binding repeats. Mixtures of engineered altered-specificity Cre monomers can form heterotetramers that recombine nonidentical asymmetric sequences, allowing greater flexibility for target site selection in the genome of interest. However, the variety of tetramers allowed by random subunit association increases the chances of unintended reactivity at nontarget sites. This problem can be circumvented by specifying a unique spatial arrangement of heterotetramer subunits. By reconfiguring intersubunit protein-protein contacts, we directed the assembly of two different Cre monomers, each having a distinct DNA sequence specificity, in an alternating (ABAB) configuration. This designed heterotetramer preferentially recombined a particular pair of asymmetric Lox sites over other pairs, whereas a mixture of freely associating subunits showed little bias. Alone, the engineered monomers had reduced reactivity towards both dyad-symmetric and asymmetric sites. Specificity arose because the organization of Cre-binding repeats of the preferred substrate matched the programmed arrangement of the subunits in the heterotetrameric synapse. When this “spatial matching” principle is applied, Cre-mediated recombination can be directed to asymmetric DNA sequences with greater fidelity.

6.
Fourteen recombinant clones from Zea mays were studied with regard to their composition of unique and repetitive sequences. Southern hybridization experiments were used to classify restriction fragments of the clones into a unique, middle or highly repetitive class of reiteration frequency. All three classes were often found on the same genomic clone. Crosshybridization studies between clones showed that a given repeat might be present on several clones, and thus four families of highly repetitive elements were established. Heteroduplex analysis was used to show the arrangement and size of repeats common between several clones. A short interspersion pattern of unique, middle and highly repetitive DNA was found. The dispersed repetitive elements were 300-1300 bp in length. Analysis of the pattern produced by a given repeat in genomic Southern experiments suggests that some small dispersed repeats may also exist as part of a larger repeating unit elsewhere in the genome.

7.
Despite a vast expansion in the availability of epigenomic data, our knowledge of the chromatin landscape at interspersed repeats remains highly limited by difficulties in mapping short-read sequencing data to these regions. In particular, little is known about the locus-specific regulation of evolutionarily young transposable elements (TEs), which have been implicated in genome stability, gene regulation and innate immunity in a variety of developmental and disease contexts. Here we propose an approach for generating locus-specific protein–DNA binding profiles at interspersed repeats, which leverages information on the spatial proximity between repetitive and non-repetitive genomic regions. We demonstrate that the combination of HiChIP and a newly developed mapping tool (PAtChER) yields accurate protein enrichment profiles at individual repetitive loci. Using this approach, we reveal previously unappreciated variation in the epigenetic profiles of young TE loci in mouse and human cells. Insights gained using our method will be invaluable for dissecting the molecular determinants of TE regulation and their impact on the genome.

8.
A model of how the size of the repetitive part of the eukaryote genome evolves during speciation was subjected to analytical and computer treatment. The basic assumption of the model was that two classes of repetitive DNA contribute mainly to macroevolutionary changes in genome size: arrays of tandem repeats (ATR) changing through unequal crossover and mobile genetic elements (MGE) changing presumably through an integration mechanism of the Tn- and Is-kind operating in bacteria. Within the framework of this model, the macroevolution of the MGE size is formally equivalent to that of the ATR in the particular case when shifts of chromatids have only one repeat out of register. This allowed us to consider genome size as a large set of various ATRs. The results obtained are as follows. If the duplication and deletion of repeats have unequal fixation probabilities during each speciation act, the predicted species distributions of genome size significantly deviate from the real ones; if they have equal fixation probabilities, the calculated and real distributions agree. In the latter case, the model reproduces the salient features of real distributions provided that one accepts 1) an upper selective boundary nonspecifically limiting increase in genome size within the evolving taxonomic group and 2) non-neutrality of variability in genome size with respect to speciation.

9.
10.
We describe here a family of foldback transposons found in the genome of the higher eucaryote, the sea urchin Strongylocentrotus purpuratus. Two major classes of TU elements have been identified by analysis of genomic DNA and TU element clones. One class consists of largely similar elements with long terminal inverted repeats (IVRs) containing outer and inner domains and sharing a common middle segment that can undergo deletions. Some of these elements contain insertions. The second class is highly heterogeneous, with many different middle segments nonhomologous to those of the first class and variable-sized inverted repeats that contain only an outer domain. The middle and insertion segments of both classes carry sequences that also are found unassociated with the inverted repeats at many other genomic locations. We conclude that the TU elements are modular structures composed of inverted repeats plus other sequence domains that are themselves members of different families of dispersed repetitive sequences. Such modular elements may have a role in the dispersion and rearrangement of genomic DNA segments.

11.
High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the de novo assembly of whole genomes. Nevertheless, assembly of complex genomes remains challenging, in part due to the presence of dispersed repeats which introduce ambiguity during genome reconstruction. Transposable elements (TEs) can be particularly problematic, especially for TE families exhibiting high sequence identity, high copy number, or complex genomic arrangements. While TEs strongly affect genome function and evolution, most current de novo assembly approaches cannot resolve long, identical, and abundant families of TEs. Here, we applied a novel Illumina technology called TruSeq synthetic long-reads, which are generated through highly-parallel library preparation and local assembly of short read data and which achieve lengths of 1.5–18.5 Kbp with an extremely low error rate (0.03% per base). To test the utility of this technology, we sequenced and assembled the genome of the model organism Drosophila melanogaster (reference genome strain y; cn, bw, sp) achieving an N50 contig size of 69.7 Kbp and covering 96.9% of the euchromatic chromosome arms of the current reference genome. TruSeq synthetic long-read technology enables placement of individual TE copies in their proper genomic locations as well as accurate reconstruction of TE sequences. We entirely recovered and accurately placed 4,229 (77.8%) of the 5,434 annotated transposable elements with perfect identity to the current reference genome. As TEs are ubiquitous features of genomes of many species, TruSeq synthetic long-reads, and likely other methods that generate long-reads, offer a powerful approach to improve de novo assemblies of whole genomes.
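
For readers unfamiliar with the N50 statistic quoted in this abstract, the sketch below shows the usual definition: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions; only the counts 4,229 and 5,434 come from the abstract.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative sum first covers at least
    half of the total assembly length.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Toy example with made-up contig lengths (not data from the study).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)

# The recovery percentage reported in the abstract can be checked directly.
recovered_pct = round(100 * 4229 / 5434, 1)
```

Because N50 is weighted by length, a few long contigs can raise it substantially, so it is normally read alongside coverage figures such as the 96.9% quoted above.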

12.
The protozoans Trypanosoma cruzi, Trypanosoma brucei and Leishmania major (Tritryps) are evolutionarily ancient eukaryotes which cause human parasitosis worldwide. They present unique biological features. Indeed, canonical DNA/RNA cis-acting elements remain mostly elusive. Repetitive sequences, originally considered as selfish DNA, have lately been recognized as potentially important functional sequence elements in cell biology. In particular, the dinucleotide patterns have been related to genome compartmentalization, gene evolution and gene expression regulation. Thus, we perform a comparative analysis of the occurrence, length and location of dinucleotide repeats (DRs) in the Tritryp genomes and their putative associations with known biological processes. We observe that most types of DRs are more abundant than would be expected by chance. Complementary DRs usually display asymmetrical strand distribution, favoring TT and GT repeats in the coding strands. In addition, we find that GT repeats are among the longest DRs in the three genomes. We also show that specific DRs are non-uniformly distributed along the polycistronic unit, decreasing toward its boundaries. Distinctive non-uniform density patterns were also found in the intergenic regions, with predominance in the vicinity of the ORFs. These findings further support that DRs may control genome structure and gene expression.

13.
How natural selection acts to limit the proliferation of transposable elements (TEs) in genomes has been of interest to evolutionary biologists for many years. To describe TE dynamics in populations, previous studies have used models of transposition–selection equilibrium that assume a constant rate of transposition. However, since TE invasions are known to happen in bursts through time, this assumption may not be reasonable. Here we propose a test of neutrality for TE insertions that does not rely on the assumption of a constant transposition rate. We consider the case of TE insertions that have been ascertained from a single haploid reference genome sequence. By conditioning on the age of an individual TE insertion allele (inferred by the number of unique substitutions that have occurred within the particular TE sequence since insertion), we determine the probability distribution of the insertion allele frequency in a population sample under neutrality. Taking models of varying population size into account, we then evaluate predictions of our model against allele frequency data from 190 retrotransposon insertions sampled from North American and African populations of Drosophila melanogaster. Using this nonequilibrium neutral model, we are able to explain ∼80% of the variance in TE insertion allele frequencies based on age alone. Controlling for both nonequilibrium dynamics of transposition and host demography, we provide evidence for negative selection acting against most TEs as well as for positive selection acting on a small subset of TEs. Our work establishes a new framework for the analysis of the evolutionary forces governing large insertion mutations like TEs, gene duplications, or other copy number variants.

14.
The repetitive sequence PisTR-A has an unusual organization in the pea (Pisum sativum) genome, being present both as short dispersed repeats and as long arrays of tandemly arranged satellite DNA. Cloning, sequencing and FISH analysis of both PisTR-A variants revealed that the former occurs in the genome embedded within the sequence of Ty3/gypsy-like Ogre elements, whereas the latter forms homogenized arrays of satellite repeats at several genomic loci. The Ogre elements carry the PisTR-A sequences in their 3′ untranslated region (UTR) separating the gag-pol region from the 3′ LTR. This region was found to be highly variable among pea Ogre elements, and includes a number of other tandem repeats along with or instead of PisTR-A. Bioinformatic analysis of LTR-retrotransposons mined from available plant genomic sequence data revealed that the frequent occurrence of variable tandem repeats within 3′ UTRs is a typical feature of the Tat lineage of plant retrotransposons. Comparison of these repeats to known plant satellite sequences uncovered two other instances of satellites with sequence similarity to Tat-like retrotransposon 3′ UTR regions. These observations suggest that some retrotransposons may significantly contribute to satellite DNA evolution by generating a library of short repeat arrays that can subsequently be dispersed through the genome and eventually further amplified and homogenized into novel satellite repeats.

15.
Repetitive elements in genomes of parasitic protozoa.   Total citations: 8; self-citations: 0; citations by others: 8
Repetitive DNA elements have been a part of the genomic fauna of eukaryotes perhaps since their very beginnings. Millions of years of coevolution have given repeats central roles in chromosome maintenance and genetic modulation. Here we review the genomes of parasitic protozoa in the context of the current understanding of repetitive elements. Particular reference is made to repeats in five medically important species with ongoing or completed genome sequencing projects: Plasmodium falciparum, Leishmania major, Trypanosoma brucei, Trypanosoma cruzi, and Giardia lamblia. These organisms are used to illustrate five thematic classes of repeats with different structures and genomic locations. We discuss how these repeat classes may interact with parasitic life-style and also how they can be used as experimental tools. The story which emerges is one of opportunism and upheaval which have been employed to add genetic diversity and genomic flexibility.

16.
Selectivity and polarity of adenovirus type 5 DNA packaging are believed to be directed by an interaction of putative packaging factors with the cis-acting adenovirus packaging domain located within the genomic left end (nucleotides 194 to 380). In previous studies, this packaging domain was mutationally dissected into at least seven functional elements called A repeats. These elements, albeit redundant in function, exhibit differences in the ability to support viral packaging, with elements I, II, V, and VI as the most critical repeats. Viral packaging was shown to be sensitive to spatial changes between individual A repeats. To study the importance of spatial constraints in more detail, we performed site-directed mutagenesis of the 21-bp linker regions separating A repeats I and II, as well as A repeats V and VI. The results of our mutational analysis reveal previously unrecognized sequences that are critical for DNA encapsidation in vivo. On the basis of these results, we present a more complex consensus motif for the adenovirus packaging elements which is bipartite in structure. DNA encapsidation is compromised by changes in spacing between the two conserved parts of the consensus motif, rather than between different A repeats. Genetic evidence implicating packaging elements as independent units in viral DNA packaging is derived from the selection of revertants from a packaging-deficient adenovirus: multimerization of packaging repeats is sufficient for the evolution of packaging-competent viruses. Finally, we identify minimally sized segments of the adenovirus packaging domain that can confer viability and packaging activity to viruses carrying gross truncations within their left-end sequences. Coinfection experiments using the revertant as well as the minimal-packaging-domain mutant viruses strengthen existing arguments for the involvement of limiting, trans-acting components in viral DNA packaging.

17.
Lavrov DV, Maikova OO, Pett W, Belikov SI. Gene, 2012, 505(1): 91-99
Demosponges, the largest and most diverse class in the phylum Porifera, possess mitochondrial DNA (mtDNA) markedly different from that in other animals. Although several studies investigated evolution of demosponge mtDNA among major lineages of the group, the changes within these groups remain largely unexplored. Recently we determined mitochondrial genomic sequence of the Lake Baikal sponge Lubomirskia baicalensis and described proliferation of small inverted repeats (hairpins) that occurred in it since the divergence between L. baicalensis and the most closely related cosmopolitan freshwater sponge Ephydatia muelleri. Here we report mitochondrial genomes of three additional species of Lake Baikal sponges: Swartschewskia papyracea, Rezinkovia echinata and Baikalospongia intermedia morpha profundalis (Demospongiae, Haplosclerida, Lubomirskiidae) and from a more distantly related freshwater sponge Corvomeyenia sp. (Demospongiae, Haplosclerida, Metaniidae). We use these additional sequences to explore mtDNA evolution in Baikalian sponges, paying particular attention to the variation in the rates of nucleotide substitutions and the distribution of hairpins, abundant in these genomes. We show that most of the changes in Lubomirskiidae mitochondrial genomes are due to insertion/deletion/duplication of these elements rather than single nucleotide substitutions. Thus inverted repeats can act as an important force in evolution of mitochondrial genome architecture and be a valuable marker for population- and species-level studies in this group. In addition, we infer (((Rezinkovia+Lubomirskia)+Swartschewskia)+Baikalospongia) phylogeny for the family Lubomirskiidae based on the analysis of mitochondrial coding sequences from freshwater sponges.

18.
We describe a comprehensive and general approach for mapping centromeres and present a detailed characterization of two maize centromeres. Centromeres are difficult to map and analyze because they consist primarily of repetitive DNA sequences, which in maize are the tandem satellite repeat CentC and interspersed centromeric retrotransposons of maize (CRM). Centromeres are defined epigenetically by the centromeric histone H3 variant, CENH3. Using novel markers derived from centromere repeats, we have mapped all ten centromeres onto the physical and genetic maps of maize. We were able to completely traverse centromeres 2 and 5, confirm physical maps by fluorescence in situ hybridization (FISH), and delineate their functional regions by chromatin immunoprecipitation (ChIP) with anti-CENH3 antibody followed by pyrosequencing. These two centromeres differ substantially in size, apparent CENH3 density, and arrangement of centromeric repeats; and they are larger than the rice centromeres characterized to date. Furthermore, centromere 5 consists of two distinct CENH3 domains that are separated by several megabases. Succession of centromere repeat classes is evidenced by the fact that elements belonging to the recently active recombinant subgroups of CRM1 colonize the present day centromeres, while elements of the ancestral subgroups are also found in the flanking regions. Using abundant CRM and non-CRM retrotransposons that inserted in and near these two centromeres to create a historical record of centromere location, we show that maize centromeres are fluid genomic regions whose borders are heavily influenced by the interplay of retrotransposons and epigenetic marks. Furthermore, we propose that CRMs may be involved in removal of centromeric DNA (specifically CentC), invasion of centromeres by non-CRM retrotransposons, and local repositioning of the CENH3.

19.
20.
Transposable elements (TEs) are considered to be genomic parasites and their interactions with their hosts have been likened to the coevolution between host and other nongenomic, horizontally transferred pathogens. TE families, however, are vertically inherited as integral segments of the nuclear genome. This transmission strategy has been suggested to weaken the selective benefits of host alleles repressing the transposition of specific TE variants. On the other hand, the elevated rates of TE transposition and high incidences of deleterious mutations observed during the rare cases of horizontal transfers of TE families between species could create at least a transient process analogous to the influence of horizontally transmitted pathogens. Here, we formally address this analogy, using empirical and theoretical analysis to specify the mechanism of how host–TE interactions may drive the evolution of host genes. We found that host TE-interacting genes actually have more pervasive evidence of adaptive evolution than immunity genes that interact with nongenomic pathogens in Drosophila. Yet, both our theoretical modeling and empirical observations comparing Drosophila melanogaster populations before and after the horizontal transfer of P elements, which invaded D. melanogaster early last century, demonstrated that horizontally transferred TEs have only a limited influence on host TE-interacting genes. We propose that the more prevalent and constant interaction with multiple vertically transmitted TE families may instead be the main force driving the fast evolution of TE-interacting genes, which is fundamentally different from the gene-for-gene interaction of host–pathogen coevolution.
