Similar articles
 20 similar articles found (search time: 31 ms)
1.
In the bovine genome we found two intrachromosomal DNA fragments flanked by inverted telomeric repeats (GenBank Accession Nos. AF136741 and AF136742). The internal parts of the fragments are homologous exclusively to human sequences and to the consensus sequence of the L1MC4 subfamily of LINE-1 retrotransposons, which are widespread among mammalian genomes. We found that the distribution of homologous human sequences within our fragments is not random, reflecting a complicated pattern of mechanisms of insertion and maintenance of retrotransposons in mammalian genomes. One possible explanation for the origin of truncated LINE-1 elements flanked by inverted telomeric repeats in the bovine genome is that extrachromosomal DNA fragments may be modified by telomerase and subsequently transferred into chromosomal DNA.

2.
In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus‐like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences, including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also included DNAs of unclear origin together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host‐dependent manner. Conversely, other simple mono‐ and dinucleotide sequence repeats (SSRs) were less frequently involved in insertion events than ATrs. Therefore, ATrs could be regarded as hot spots of double‐strand breaks that induce non‐homologous end joining. The insertions within ATrs occasionally generated new gene‐related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore, ATrs in plant genomes could be considered genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force.

3.
Inverted repeats have been found to occur in both prokaryotic and eukaryotic genomes. Usually they are short, and some have important functions in various biological processes. However, long inverted repeats are rare and can cause genome instability. Analyses of the C. elegans genome identified long, nearly perfect inverted repeat sequences involving both divergently and convergently oriented homologous gene pairs and complete intergenic sequences. Comparisons with the orthologous regions from the genomes of C. briggsae and C. remanei show that the inverted repeat structures are often far more conserved than the sequences. This observation implies that there is an active mechanism for maintaining the inverted repeat nature of the sequences.

4.
Most previous work on the evolution of mobile DNA was limited by incomplete sequence information. Whole genome sequences allow us to overcome this limitation. I study the nucleotide diversity of prominent members of five insertion sequence families whose transposition activity is encoded by a single transposase gene. Eighteen among 376 completely sequenced bacterial genomes and plasmids carry between 3 and 20 copies of a given insertion sequence. I show that these copies generally show very low DNA divergence. Specifically, more than 68% of the transposase genes are identical within a genome. The average number of amino acid replacement substitutions per replacement site is Ka = 0.013, and the average number of silent substitutions per silent site is Ks = 0.1. This low intragenomic diversity stands in stark contrast to a much higher divergence of the same insertion sequences among distantly related genomes. Gene conversion among protein-coding genes is unlikely to account for this lack of diversity. The relation between transposition frequencies and silent substitution rates suggests that most insertion sequences in a typical genome are evolutionarily young and have been recently acquired. They may undergo periodic extinction in bacterial lineages. By implication, they are detrimental to their host in the long run. This is also suggested by the highly skewed and patchy distribution of insertion sequences among genomes. In sum, one can think of insertion sequences as slow-acting infectious diseases of cell lineages.

5.
6.
Long sequences of the pigeon genome containing clusters of moderately repeated elements have been cloned. Molecular analysis has shown a dispersed distribution of the repeats in both pigeon and chicken genomes. Within a single cluster, a scrambled distribution of elements belonging to different families of repeats has been shown. Similar repeated sequences have been revealed within clusters. The analysed clusters of repeats are characterized by limited structural variability in the genomes. In situ hybridization revealed the localization of sequences complementary to the cloned clusters in pigeon and chicken macrochromosomes. Preferential localization has been demonstrated in telomeric and centromeric chromosome regions as well as in the region of R-bands.

7.
8.
Comparative genomics has revealed that variations in bacterial and archaeal genome DNA sequences cannot be explained by only neutral mutations. Virus resistance and plasmid distribution systems have resulted in changes in bacterial and archaeal genome sequences during evolution. The restriction-modification system, a virus resistance system, leads to avoidance of palindromic DNA sequences in genomes. Clustered, regularly interspaced, short palindromic repeats (CRISPRs) found in genomes represent yet another virus resistance system. Comparative genomics has shown that bacteria and archaea have failed to gain any DNA with GC content higher than the GC content of their chromosomes. Thus, horizontally transferred DNA regions have lower GC content than the host chromosomal DNA does. Some nucleoid-associated proteins bind DNA regions with low GC content and inhibit the expression of genes contained in those regions. This form of gene repression is another type of virus resistance system. On the other hand, bacteria and archaea have used plasmids to gain additional genes. Virus resistance systems influence plasmid distribution. Interestingly, the restriction-modification system and nucleoid-associated protein genes have been distributed via plasmids. Thus, GC content and genomic signatures do not reflect bacterial and archaeal evolutionary relationships.

9.
We study the length distribution functions for the 16 possible distinct dimeric tandem repeats in DNA sequences of diverse taxonomic partitions of GenBank (known human and mouse genomes, and complete genomes of Caenorhabditis elegans and yeast). For coding DNA, we find that all 16 distribution functions are exponential. For non-coding DNA, the distribution functions for most of the dimeric repeats have surprisingly long tails that fit a power-law function. We hypothesize that: (i) the exponential distributions of dimeric repeats in protein coding sequences indicate strong evolutionary pressure against tandem repeat expansion in coding DNA sequences; and (ii) long tails in the distributions of dimers in non-coding DNA may be a result of various mutational mechanisms. These long, non-exponential tails in the distribution of dimeric repeats in non-coding DNA are hypothesized to be due to the higher tolerance of non-coding DNA to mutations. By comparing genomes of various phylogenetic types of organisms, we find that the shapes of the distributions are not universal, but rather depend on the specific class of species and the type of dimer.
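As a rough illustration of the quantity this abstract describes, the sketch below tabulates, for each of the 16 dimers, how many tandem runs of each length occur in a sequence. It is not the authors' code: the function name `dimer_repeat_lengths`, the `min_units` threshold, and the regex-based scan are all assumptions made for the example, and a homopolymer run such as TTTTTT is counted as three units of the dimer TT.

```python
import re
from collections import Counter
from itertools import product


def dimer_repeat_lengths(seq, min_units=2):
    """Map each of the 16 dimers to a Counter {run length in units: count}.

    Hypothetical helper: a run is a maximal non-overlapping regex match of
    the dimer repeated at least `min_units` times in tandem.
    """
    seq = seq.upper()
    result = {}
    for a, b in product("ACGT", repeat=2):
        dimer = a + b
        pattern = re.compile("(?:%s){%d,}" % (dimer, min_units))
        # Each match is a whole number of dimer units, so len // 2 is exact.
        result[dimer] = Counter(
            len(m.group(0)) // 2 for m in pattern.finditer(seq)
        )
    return result


if __name__ == "__main__":
    hist = dimer_repeat_lengths("ACACACGGTTTTTT")
    print(dict(hist["AC"]), dict(hist["TT"]))
```

Plotting the logarithm of the counts against run length (for an exponential) or against the logarithm of run length (for a power law) is one simple way to compare the two shapes the abstract contrasts; real analyses would need much longer sequences than this toy input.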

10.
All organisms that have been studied until now have been found to have a differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes: complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. In contrast to what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare, and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short, and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.

11.
Simple sequence repeats (SSRs), or microsatellites, are repetitive nucleotide sequences with motifs of length 1–6 bp. They are scattered throughout the genomes of all known organisms, ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (indels) of their repeat units, with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity from microsatellite mutations, they have in-built mechanisms to arrest undue expansion of microsatellites; one such mechanism is constituted by the post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes, and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for the distribution and abundance of microsatellite tracts and to look for potentially polymorphic microsatellites. The available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv), were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain a few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show a scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences, although the influence of GC content and the role of point mutations in arresting microsatellite expansion cannot be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes.
Supplementary Data pertaining to this article is available on the Journal of Biosciences Website at
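To make the idea of "SSR tracts per kb" concrete, here is a minimal sketch of one way such a density could be computed. It is only an illustration under stated assumptions, not the method used in the study: the names `find_ssrs` and `ssr_density_per_kb`, the requirement of at least three tandem copies, and the regular expression are all choices made for this example, and published SSR scanners apply their own thresholds.

```python
import re

# Assumption for this sketch: a tract is a motif of 1-6 bp repeated at
# least three times in tandem. The lazy quantifier makes the regex prefer
# the shortest motif that explains the run (e.g. "A" rather than "AA").
SSR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{2,}")


def find_ssrs(seq):
    """Return a list of (start, motif, copy_number) for each tract found."""
    seq = seq.upper()
    tracts = []
    for m in SSR_PATTERN.finditer(seq):
        motif = m.group(1)
        tracts.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return tracts


def ssr_density_per_kb(seq):
    """Number of SSR tracts per 1000 bp of sequence (0.0 for empty input)."""
    if not seq:
        return 0.0
    return len(find_ssrs(seq)) * 1000.0 / len(seq)


if __name__ == "__main__":
    example = "GGACACACACTTGCA"
    print(find_ssrs(example), ssr_density_per_kb(example))
```

Because the count depends strongly on the minimum copy number and on whether mononucleotide runs are included, densities from a sketch like this are comparable only to other runs made with the same settings.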

12.
A contextual analysis of the nucleotide sequences of the regions in which 22 Alu repeats are located in the human genome has been carried out, and some of their peculiarities have been revealed. In particular, marked and statistically non-random homology between the repeats and the regions of their integration has been shown. A mechanism for choosing the insertion regions of Alu repeats in the genome has been suggested, taking these peculiarities into account. Using a sample of 80 human Alu repeat sequences, the peculiarities of the location of these repeats within the genome have been investigated. A tendency toward the formation of Alu repeat clusters in various regions of the genome was revealed. A range of possible mechanisms for the emergence of such Alu clusters is considered. On the basis of the data obtained, an "attraction" mechanism is proposed, according to which integration of Alu repeats into a given region of the genome increases the probability of insertion of other Alu repeats into the same region.

13.
The rhizobia are a group of bacteria widely studied for their capacity to form intimate symbiotic relationships with leguminous plants. However, they are also interesting for containing a remarkable abundance of repetitive genetic elements, such as long DNA repeats. In this study we analyzed in depth long, exact DNA repeats in five representative rhizobial genomes: Rhizobium etli, Rhizobium leguminosarum, Bradyrhizobium japonicum, Sinorhizobium meliloti and Mesorhizobium loti. The results suggest that a huge proportion of repeats can be located in either plasmid or chromosome replicons, except in B. japonicum, which lacks plasmids but contains the largest number of repeats and the longest repeat elements of the genomes analyzed here. Interestingly, we detected a slight correlation between the density of repeats (either number or length) and genome size. As expected, the highest percentage of DNA repeats code for mobile genetic elements, including insertion sequences, recombinases, and transposases. Some repeats corresponded to non-coding or intergenic regions, while in genomes like that of R. etli, a significant percentage of large repeats, mainly located in plasmids, were strongly associated with symbiotic and nitrogen fixation activities. In conclusion, our analysis shows that rhizobial genomes contain a high density of long DNA repeats, which might facilitate recombination events and genome rearrangements, functioning in adaptation and persistence during saprophytic or symbiotic life.

14.
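The lysogenic bacteriophage CTXΦ of Vibrio cholerae carries the cholera toxin genes ctxAB. The PⅢ protein encoded by its structural gene gⅢ recognizes TcpA, the major structural subunit of the toxin co-regulated pilus (TCP) of V. cholerae, allowing the phage to infect TCP-bearing V. cholerae and convert it into a toxigenic strain. CTXΦ also has a precursor, pre-CTXΦ, which does not carry ctxAB; the phages can be classified into different types according to the sequence type of the regulatory gene rstR in the CTXΦ genome. Various combinations and arrangements of CTXΦ/pre-CTXΦ genomes and their subtypes have been found in the genomes of different V. cholerae strains. Studying the genomic diversity of this phage family makes it possible to analyze its evolution and its role in the formation of toxigenic V. cholerae strains. In this study, four non-toxigenic V. cholerae strains of serogroups O1 and O139 were found to carry pre-CTXΦ genomes with diverse rstR sequence types, and the genomic features of pre-CTXΦ in the four strains were analyzed further. The genome sequences of the four strains were obtained using third-generation gene sequencing methods (short-read sequencing and single-molecule long-read sequencing technologies). Long-read sequencing and assembly analysis yielded precise arrangements of pre-CTXΦ genomes containing long repeat structures and clarified the diverse pre-CTXΦ genome arrangements in the four sequenced strains. A classical-type pre-CTXΦ was found in the genome of the non-toxigenic strain VC3193; in addition, a transposon structure of Klebsiella pneumoniae was found for the first time in the pre-CTXΦ genome of strain VC702 (GenBank accession number: SRIL00000000). In these four sequenced strains, the receptor TcpA and the PⅢ protein of pre-CTXΦ also showed clearly divergent sequences, including new TcpA and PⅢ sequence types, suggesting a complex correspondence in receptor-ligand recognition when the CTXΦ family infects host bacteria. This study enriches the understanding of the diversity of CTXΦ/pre-CTXΦ family genomes and their integrated arrangements, and provides further evidence for analyzing the horizontal transfer of this lysogenic phage among V. cholerae strains with different genetic characteristics and its role in promoting the formation of new toxigenic clones.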

15.
The protozoans Trypanosoma cruzi, Trypanosoma brucei and Leishmania major (Tritryps) are evolutionarily ancient eukaryotes that cause human parasitosis worldwide. They present unique biological features. Indeed, canonical DNA/RNA cis-acting elements remain mostly elusive. Repetitive sequences, originally considered selfish DNA, have lately been recognized as potentially important functional sequence elements in cell biology. In particular, dinucleotide patterns have been related to genome compartmentalization, gene evolution and gene expression regulation. Thus, we perform a comparative analysis of the occurrence, length and location of dinucleotide repeats (DRs) in the Tritryp genomes and their putative associations with known biological processes. We observe that most types of DRs are more abundant than would be expected by chance. Complementary DRs usually display asymmetrical strand distribution, favoring TT and GT repeats in the coding strands. In addition, we find that GT repeats are among the longest DRs in the three genomes. We also show that specific DRs are non-uniformly distributed along the polycistronic unit, decreasing toward its boundaries. Distinctive non-uniform density patterns were also found in the intergenic regions, with predominance in the vicinity of the ORFs. These findings further support the idea that DRs may control genome structure and gene expression.

16.
Identifying and predicting the structural characteristics of novel repeats throughout the genome can lend insight into biological function. Specific repeats are believed to have biological significance as a function of their distribution patterns. We have developed 'GenomeMark,' a computer program that detects and statistically analyzes candidate repeats. Specifically, 'GenomeMark' identifies the periodic distribution of unique words, calculating their chi-square and Z-score values. Using 'GenomeMark,' we identified novel sequence words present in tandem throughout genomes. We found that these sequences have remarkable spacer sequence distributions and many were genome specific, validating the genome signature theory. Further analysis confirmed that many of these sequences have a specific biological function. The program is available from the authors upon request and is freely available for non-commercial and academic entities.
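The abstract does not say how 'GenomeMark' computes its statistics, so the following is only a generic sketch of what a Z-score for a sequence word can look like, not a reconstruction of that program. It assumes a simple null model in which bases are independent with the sequence's own base frequencies and treats the number of word occurrences as approximately binomial (which ignores correlations between overlapping positions); the function name `word_z_score` is hypothetical.

```python
import math
from collections import Counter


def word_z_score(seq, word):
    """Return (observed, expected, z) for `word` in `seq`.

    Null model (an assumption of this sketch): independent bases drawn
    with the sequence's own base frequencies; counts treated as binomial.
    """
    seq, word = seq.upper(), word.upper()
    positions = len(seq) - len(word) + 1
    if not word or positions <= 0:
        raise ValueError("word must be non-empty and no longer than seq")

    base_freq = Counter(seq)
    p = 1.0
    for base in word:
        p *= base_freq[base] / len(seq)  # probability of the word at one position

    observed = sum(
        1 for i in range(positions) if seq[i:i + len(word)] == word
    )
    expected = positions * p
    variance = positions * p * (1.0 - p)
    # Degenerate cases (p == 0 or p == 1) carry no information; report 0.
    z = (observed - expected) / math.sqrt(variance) if variance > 0 else 0.0
    return observed, expected, z


if __name__ == "__main__":
    print(word_z_score("ACGT" * 25, "ACGT"))
```

A strongly positive score marks a word as over-represented relative to base composition alone; assessing periodic spacing, as the abstract describes, would require an additional test on the distances between occurrences.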

17.
The Restriction On Computer (ROC) program (freely available at http://www.mcb.harvard.edu/gilbert/ROC) was developed and used to analyze the restriction fragment length distribution in the human genome. In contrast to other programs searching for restriction sites, ROC simultaneously analyzes several long nucleotide sequences, such as the entire genomes, and in essence simulates electrophoretic analysis of DNA restriction fragments. In addition, this program extracts and analyzes DNA repeats that account for peaks in the restriction fragment length distribution. The ROC analysis data are consistent with the experimental data obtained via in vitro restriction enzyme analysis (taxonomic printing). A difference between the in vitro and in silico results is explained by underrepresentation of tandem DNA repeats in genomic databases. The ROC analysis of individual genome fragments elucidated the nature of several DNA markers, which were earlier revealed by taxonomic printing, and showed that L1 and Alu repeats are nonrandomly distributed in various chromosomes. Another advantage is that the ROC procedure makes it possible to analyze the nonrandom character of a genomic distribution of short DNA sequences. The ROC analysis showed that a low poly(G) frequency is characteristic of the entire human genome, rather than of only coding sequences. The method was proposed for a more complex in silico analysis of the genome. For instance, it is possible to simulate DNA restriction together with blot hybridization and then to analyze the nature of markers revealed.
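The core idea of an in silico restriction digest can be sketched in a few lines. This is not the ROC program and makes no claim about its internals: the function names, the treatment of each sequence as a linear molecule, and the single-strand scan (adequate for palindromic recognition sites) are assumptions of the example. The only enzyme fact used is that EcoRI recognizes GAATTC and cuts after the first G.

```python
import re
from collections import Counter


def restriction_fragment_lengths(seq, site, cut_offset):
    """Fragment lengths from cutting a linear sequence at every `site`.

    `cut_offset` is the number of bases from the start of the site to the
    cut position (1 for EcoRI, G^AATTC). Hypothetical helper for this sketch.
    """
    seq = seq.upper()
    # A lookahead match has zero width, so overlapping sites are all found.
    finder = re.compile("(?=%s)" % re.escape(site.upper()))
    cuts = [m.start() + cut_offset for m in finder.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def length_histogram(sequences, site, cut_offset):
    """Pool fragment lengths over several sequences into one histogram."""
    hist = Counter()
    for s in sequences:
        hist.update(restriction_fragment_lengths(s, site, cut_offset))
    return hist


if __name__ == "__main__":
    print(restriction_fragment_lengths("AAGAATTCTTGAATTCGG", "GAATTC", 1))
```

Peaks in the pooled histogram correspond to fragment lengths that recur many times, which is how a digest of this kind can point to repeated elements carrying conserved restriction sites.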

18.
The structure of the transgenic mouse DNA region containing an integrated transgene (a fragment of the pBR322 sequence) was analysed. In one of the sequences flanking the transgene, short direct and inverted overlapping repeats were revealed at a distance of 60 bp from the integration site. In the same flanking sequence, there is an extended sequence (3.5 kbp) 0.3-1 kbp away from the transgene. It is repeated 100-300 times in the mouse genome and is highly conserved (homologs of the repeat have been revealed in other mammalian, bird, fish and insect genomes). This hitherto unknown family of highly conserved dispersed repeats has been denoted T1. We believe that both the revealed short inverted repeats, which are capable of forming hairpins with loops, and the T1 repeat are structures involved in the process of non-homologous insertion of foreign DNA into this region of the transgenic mouse genome.

19.
The Restriction On Computer (ROC) program (freely available at http://www.mcb.harvard.edu/gilbert/ROC) was developed and used to analyze the restriction fragment length distribution in the human genome. In contrast to other programs searching for restriction sites, ROC simultaneously analyzes several long nucleotide sequences, such as the entire genomes, and in essence simulates electrophoretic analysis of DNA restriction fragments. In addition, this program extracts and analyzes DNA repeats that account for peaks in the restriction fragment length distribution. The ROC analysis data are consistent with the experimental data obtained via in vitro restriction enzyme analysis (DNA taxonoprint). A difference between the in vitro and in silico results is explained by underrepresentation of tandem DNA repeats in genomic databases. The ROC analysis of individual genome fragments elucidated the nature of several DNA markers, which were earlier revealed by DNA taxonoprint, and showed that L1 and Alu repeats are nonrandomly distributed in various chromosomes. Another advantage is that the ROC procedure makes it possible to analyze the nonrandom character of a genomic distribution of short DNA sequences. The ROC analysis showed that a low poly(G) frequency is characteristic of the entire human genome, rather than of only coding sequences. The method was proposed for a more complex in silico analysis of the genome. For instance, it is possible to simulate DNA restriction together with blot hybridization and then to analyze the nature of markers revealed.

20.