Similar articles
Found 20 similar articles (search time: 31 ms)
1.
2.
A DNA fragment containing short tandem repeat sequences (approximately 86-bp repeat) was isolated from a Xenopus laevis cDNA library. Southern blot and in situ hybridization analyses revealed that the repeat was highly dispersed in the genome and was present at approximately 1 million copies per haploid genome. We named this element Xstir (Xenopus short tandemly and invertedly repeating element) after its arrangement in the genome. The majority of the genomic Xstir sequences were digested to monomer and dimer sizes with several restriction enzymes. Their sequences were found to be highly homogeneous and organized into tandem arrays in the genome. Alignment analyses of several known sequences showed that some of the Xstir-like sequences were also organized into interspersed inverted repeats. The inverted repeats consisted of an inverted pair of two differently modified Xstirs separated by a short insert. In addition, these were framed by another novel inverted repeat (Xstir-TIR). The Xstir-TIR sequence was also found at the ends of tandem Xstir arrays. Furthermore, we found that Xstir-TIR was linked to a motif characterizing the T2 family which belonged to a vertebrate MITE (miniature inverted-repeat transposable element) family, suggesting the importance of Xstir-TIR for their amplification and transposition. The present study of 11 anuran and 2 urodele species revealed that Xstir or Xstir-like sequences were extensively amplified in the three Xenopus species. Genomic Xstir populations of X. borealis and X. laevis were mutually indistinguishable but significantly different from that of X. tropicalis. Received: 5 April 2000 / Accepted: 3 August 2000

3.
The repetitive sequence PisTR-A has an unusual organization in the pea (Pisum sativum) genome, being present both as short dispersed repeats and as long arrays of tandemly arranged satellite DNA. Cloning, sequencing and FISH analysis of both PisTR-A variants revealed that the former occurs in the genome embedded within the sequence of Ty3/gypsy-like Ogre elements, whereas the latter forms homogenized arrays of satellite repeats at several genomic loci. The Ogre elements carry the PisTR-A sequences in their 3′ untranslated region (UTR) separating the gag-pol region from the 3′ LTR. This region was found to be highly variable among pea Ogre elements, and includes a number of other tandem repeats along with or instead of PisTR-A. Bioinformatic analysis of LTR-retrotransposons mined from available plant genomic sequence data revealed that the frequent occurrence of variable tandem repeats within 3′ UTRs is a typical feature of the Tat lineage of plant retrotransposons. Comparison of these repeats to known plant satellite sequences uncovered two other instances of satellites with sequence similarity to Tat-like retrotransposon 3′ UTR regions. These observations suggest that some retrotransposons may significantly contribute to satellite DNA evolution by generating a library of short repeat arrays that can subsequently be dispersed through the genome and eventually further amplified and homogenized into novel satellite repeats.

4.
The Pacific oyster (Crassostrea gigas) is globally distributed and is one of the most commercially and ecologically important marine organisms. However, little is known about the genome of this species. In this study, a C. gigas fosmid library was constructed that contains 459,936 clones with an average insert size of approximately 40 kb, representing 22.34-fold haploid genome equivalents. End sequencing generated 90,240 fosmid end sequences (FESs) with an average length of 384.27 base pairs (bp), covering approximately 2.58% of the Pacific oyster genome. The FESs were subsequently assembled and annotated, resulting in 6332 sequences with predicted open reading frames ≥300 and 1,189,100 bp of repeats. Furthermore, a total of 3200 microsatellite repeats were identified. Dinucleotide repeats occurred most abundantly, with AG and AAT being the most abundant dinucleotide and trinucleotide repeat classes, respectively. We also found that the repeat number was generally inversely related to the repeat element length. Microsatellite composition differed between the transcribed sequences and the genomic sequences. Point mutations in microsatellites were non-random and were under strong selection pressure. Overall, a comprehensive sequence resource for the Pacific oyster was created, including annotated transposable elements, tandem repeats, protein coding sequences and microsatellites. These initial findings will serve as resources for further in-depth studies of physical mapping, gene discovery, microsatellite marker development and evolution.

5.
We mapped and analyzed the microsatellites throughout 284,295,605 base pairs of the unambiguously assembled sequence scaffolds along 19 chromosomes of the haploid poplar genome. In total, we found 150,985 SSRs with repeat unit lengths between 2 and 5 bp. The established microsatellite physical map demonstrated that SSRs were distributed relatively evenly across the genome of Populus. On average, these SSRs occurred every 1883 bp within the poplar genome, and the SSR densities in intergenic regions, introns, exons and UTRs were 85.4%, 10.7%, 2.7% and 1.2%, respectively. We took di-, tri-, tetra- and pentamers as the four classes of repeat units and found that the density of each class of SSRs decreased with the repeat unit length, except for the tetranucleotide repeats. Notably, the length diversification of microsatellite sequences was negatively correlated with their repeat unit length, and the SSRs with shorter repeat units gained repeats faster than the SSRs with longer repeat units. We also found that the GC content of poplar sequence correlated significantly with the densities of SSRs with odd repeat unit lengths (tri- and penta-), but not with the densities of SSRs with even repeat unit lengths (di- and tetra-). In the poplar genome, there was evidence that the occurrence of different microsatellites was under selection, and the GC content in SSR sequences was found to relate significantly to the functional importance of microsatellites.
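The SSR survey described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the regular-expression scan, the function names `find_ssrs` and `mean_spacing`, and the threshold of three repeat copies are all assumptions made for illustration. Only the unit lengths (2 to 5 bp) and the two totals come from the abstract.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=5, min_repeats=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    min_repeats is a hypothetical threshold; the abstract does not state one.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif,
            # e.g. "ATAT" is really the dinucleotide "AT".
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)

def mean_spacing(total_bp, n_ssrs):
    """Average number of base pairs per SSR."""
    return total_bp / n_ssrs

# Toy sequence (made up for this example): (AT)4 followed by (CAG)3.
print(find_ssrs("GGATATATATCCAGCAGCAGTT"))

# Totals quoted in the abstract: 284,295,605 bp and 150,985 SSRs.
print(round(mean_spacing(284_295_605, 150_985)))
```

Dividing the two totals from the abstract gives about 1883 bp per SSR, which matches the average spacing the authors report.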

6.
Banana and plantain (Musa spp.) are grown in more than 120 countries in tropical and subtropical regions and constitute an important staple food for millions of people. A Musa acuminata ssp. malaccensis DH Pahang bacterial artificial chromosome (BAC) library (MAMB) was submitted for BAC-end sequencing. MAMB consists of 23,040 clones, with a 140-kbp average insert size, accounting for fivefold coverage of the banana genome. A total of 46,080 reads were generated, and 42,750 (92.8%) high-quality sequences were obtained after trimming for vector and quality. Analysis of these data shows a GC content of 41.39%, whereas interspersed repeats comprise 32.3%. The most common repeated sequences found show homology to ribosomal RNA genes, particularly 18S rRNA, while the Ty3/gypsy type monkey retrotransposon is the most common retro element. The sequence data were used to generate a banana-specific repeat library containing 54 new repetitive elements which accounted for 11.86% of the total nucleotides. Simple sequence repeats represent 0.7% of the sequence data and allowed the identification of 2,455 potentially useful marker sites. Functional annotation identified 2,705 sequences that could code for proteins of known function. Microsynteny analysis shows a higher number of co-linear matches to Oryza sativa, in contrast to Arabidopsis thaliana. This database of BAC-end sequences is useful for the assembly of the complete banana genome sequence and is important for identification in functional genomics experiments.
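The library statistics in this abstract follow from simple arithmetic, sketched below in Python. The function names are hypothetical, and the genome size is back-calculated from the quoted fivefold coverage rather than stated in the abstract, so it should be read as a rough implied value only.

```python
def implied_genome_size(n_clones, mean_insert_bp, coverage):
    """Genome size implied by a library: total cloned DNA / genome equivalents."""
    return n_clones * mean_insert_bp / coverage

def percent(part, whole, digits=1):
    """Percentage of part in whole, rounded to the given number of digits."""
    return round(100 * part / whole, digits)

# Figures quoted in the abstract.
clones = 23_040
insert_bp = 140_000      # 140-kbp average insert
coverage = 5             # "fivefold", presumably rounded
reads_total = 46_080
reads_kept = 42_750

# Two end reads per clone account for the total read count.
print(reads_total == 2 * clones)

# Fraction of reads surviving vector and quality trimming.
print(percent(reads_kept, reads_total))

# Implied genome size in Mbp (an estimate derived here, not a reported value).
print(implied_genome_size(clones, insert_bp, coverage) / 1e6)
```

The read count equals two end reads per clone, and the retained fraction reproduces the 92.8% given in the abstract.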

7.
A family of repetitive DNA elements of approximately 350 bp, Sat350, that are members of Toxoplasma gondii satellite DNA was further analyzed. Sequence analysis identified at least three distinct repeat types within this family, called types A, B, and C. B repeats were divided into the subtypes B1 and B2. A search for internal repetitions within this family permitted the identification of conserved regions and the design of PCR primers that amplify almost all these repetitive elements. These primers amplified the expected 350-bp repeats and a novel 680-bp repetitive element (Sat680) related to this family. Two additional tandemly repeated high-order structures corresponding to this satellite DNA family were found by searching the Toxoplasma genome database with these sequences. These studies were confirmed by sequence analysis and identified: (1) an arrangement of AB1CB2 350-bp repeats and (2) an arrangement of two 350-bp-like repeats, resulting in a 680-bp monomer. Sequence comparison and phylogenetic analysis indicated that both high-order structures may have originated from the same ancestral 350-bp repeat. PCR amplification, sequence analysis and Southern blot showed that similar high-order structures were also found in the Toxoplasma-sister taxon Neospora caninum. The Toxoplasma genome database ( ) permitted the assembly of a contig harboring Sat350 elements at one end and a long nonrepetitive DNA sequence flanking this satellite DNA. The region bordering the Sat350 repeats contained two differentially expressed sequence-related regions and interstitial telomeric sequences.

8.
A bovine genomic phagemid library was constructed with randomly sheared DNA. Enrichment of this single-stranded DNA library with CA or GT primers resulted in 45% positive clones. The 14% of positive clones with (CA · GT)>12, and not containing flanking repetitive elements, were sequenced, and the efficiency of marker production was compared with random M13 bacteriophage libraries. Primer sequences and genotyping information are presented for 390 informative bovine microsatellite markers. The genomic frequency for 11 tri- and tetranucleotide repeats was estimated by hybridization to a lambda genomic library. Only GCT, GGT, and GGAT were estimated to have a frequency of >100 per genome. Enrichment of the phagemid library for these repeats failed to provide a viable source of microsatellite markers in the bovine. Comparison of map interval lengths between 100 markers from the enriched library prepared from randomly sheared DNA and M13 bacteriophage libraries prepared from Mbo1 restriction digests suggested no bias in skeletal genomic coverage based on source of small insert DNA. In conclusion, enrichment of the bovine phagemid library provides a sufficient source of microsatellites so that small repeat lengths and flanking repetitive sequences common in the bovine can be eliminated, resulting in a high percentage of informative markers. The nucleotide sequence data reported in this paper have been submitted to GenBank and have been assigned the accession numbers U25689 and U25690.

9.
A computer-aided homology search of databases found that the nucleotide sequences flanking ATLN44, a non-LTR retrotransposon (LINE) from Arabidopsis thaliana, are repeated in the A. thaliana genome. These sequences are homologous to flanking sequences of 664 bp with terminal inverted repeat sequences of about 70 bp. The 664-bp sequence and most of the 14 homologues identified were flanked by direct repeat sequences of 9 bp. These findings indicate that the repeated sequence, named Tnat1, is a transposable element that duplicates a 9-bp sequence at the target site on transposition and that ATLN44 is inserted in one Tnat1 member. Interestingly, all of the Tnat1 members had tandem repeats comprised of several units of a 60-bp sequence, the number of repeats differing among Tnat1 members. Of the Tnat1 members identified, one was inserted into another sequence repeated in the A. thaliana genome: that sequence is about 770 bp long and has terminal inverted repeat sequences of about 110 bp. The sequence is flanked by direct repeats of a 9-bp sequence, indicating that it is another transposable element, named Tnat2, from A. thaliana. Moreover, Tnat2 members had a tandem repeat about 240 bp long. Tnat1 and Tnat2 with tandem repeats in their internal regions show no homology to each other or to any of the elements identified previously; therefore they appear to be novel transposable elements.

10.
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and trinucleotide repeats in Drosophila also seemed to be longer. Although the trends for different repeats are similar between different chromosomes within a genome, the density of repeats may vary between different chromosomes of the same species. The abundance or rarity of various di- and trinucleotide repeats in different genomes cannot be explained by nucleotide composition of a sequence or potential of repeated motifs to form alternative DNA structures. This suggests that in addition to nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. Moreover, analysis of complete genome coding DNA sequences of Drosophila, C. elegans, and yeast indicated that expansions of codon repeats corresponding to small hydrophilic amino acids are tolerated more, while strong selection pressures probably eliminate codon repeats encoding hydrophobic and basic amino acids. The locations and sequences of all of the repeat loci detected in genome sequences and coding DNA sequences are available at http://www.ncl-india.org/ssr and could be useful for further studies.

11.
Studies on microsatellite distribution and divergence in related genomes contribute towards understanding of genome evolution in eukaryotes. Despite the availability of whole genome sequences of four rice genomes, the occurrence and significance of microsatellites in the rice genome has remained a relatively unexplored area of research. We have aligned the genomes of two rice subspecies, indica and japonica, to understand the trends of microsatellite conservation and divergence in the rice genome. Nearly 62% of the indica microsatellites were also found in the japonica genome. Occurrence of microsatellites showed a negative association with that of retrotransposons. Microsatellite repeat unit length and sequence showed a direct influence on the microsatellite locus length. Further, microsatellite allele length was also influenced by the sequence characteristics of the neighbouring regions. CCG repeats were the most conserved microsatellite sequences across the different syntenic regions in the two rice genomes and often showed association with CpG islands. Our study suggested that microsatellite distribution is governed not only by a balance between replication slippage and point mutations, as proposed earlier, but also by the microsatellite motif sequence and the characteristics of microsatellite neighbouring regions in the genome. Thus, this study is likely to prove an important reference for understanding the process of microsatellite evolution and dynamics in the two rice subspecies.

12.
Novel functional role of CA repeats and hnRNP L in RNA stability   Total citations: 6 (self-citations: 1; citations by others: 5)
CA dinucleotide repeat sequences are very common in the human genome. We have recently demonstrated that the polymorphic CA repeats in intron 13 of the human endothelial nitric oxide synthase (eNOS) gene function as an unusual, length-dependent splicing enhancer. The CA repeat enhancer requires specific binding of hnRNP L for its activity. Here we show that in the absence of bound hnRNP L, the pre-mRNA is cleaved directly upstream of the CA repeats. The addition of recombinant hnRNP L restores RNA stability. CA repeats are both necessary and sufficient for this specific cleavage in the 5' adjacent RNA sequence. We conclude that, in addition to its role as a splicing activator, hnRNP L can act in vitro as a sequence-specific RNA protection factor. Based on the wide abundance of CA repetitive sequences in the human genome, this may represent a novel, generally important role of this abundant hnRNP protein.

13.
We have identified four novel repeats and two domains in cell surface proteins encoded by the Methanosarcina acetivorans genome and in some archaeal and bacterial genomes. The repeats correspond to a certain number of amino acid residues present in tandem in a protein sequence and each repeat is characterized by conserved sequence motifs. These correspond to: (a) a 42 amino acid (aa) residue RIVW repeat; (b) a 45 aa residue LGxL repeat; (c) a 42 aa residue LVIVD repeat; and (d) a 54 aa residue LGFP repeat. The domains correspond to a certain number of aa residues in a protein sequence that do not comprise internal repeats. These correspond to: (a) a 200 aa residue DNRLRE domain; and (b) a 70 aa residue PEGA domain. We discuss the occurrence of these repeats and domains in the different proteins and genomes analysed in this work.

14.
We have investigated the organisation, nucleotide sequence, and chromosomal distribution of a tandemly repeated, satellite DNA from Allium cepa (Liliaceae). The satellite, which constitutes about 4% of the A. cepa genome, may be resolved from main-band DNA in antibiotic-CsCl density gradients, and has a repeat length of about 375 base pairs (bp). A cloned member of the repeat family hybridises exclusively to chromosome telomeres and has a non-random distribution in interphase nuclei. We present the nucleotide sequences of three repeats, which differ at a large number of positions. In addition to arrays made up of 375-bp repeats, homologous sequences are found in units with a greater repeat length. This divergence between repeats reflects the heterogeneity of the satellite determined using other criteria. Possible constraints on the interchromosomal exchange of repeated sequences are discussed.

15.
Three different repeat sequences have been mapped within the cloned EcoRI fragments that contain the adult beta-globin genes from the BALB/c (Hbbd) mouse. One sequence, "a", occurs 1.5-2 kb 3' to the beta-major gene. A second, "b", is found 4 kb 5' and 7.5 kb 3' to the beta-minor gene. The 14-kb EcoRI fragment bearing the beta-minor gene carries at least one additional repetitive element, "c". Probing a BALB/c DNA library with each repeat has demonstrated that these sequences are moderately to highly repetitive and are extensively interspersed with each other throughout the genome. In addition, repeats "a" and "b" are preferentially found in satellite and main-band DNA, respectively. The occurrence of these repeats elsewhere in the beta-globin cluster was demonstrated by probing the non-adult globin clones with each repeat. The arrangement of these repeats around the non-adult genes is 5'-"b"-"b"-epsilon y-beta h1-beta h2-"c"-beta h3-3'. Probing the C57BL/10 (Hbbs) adult gene clones with these repeats demonstrated that the distribution of these sequences in the adult region of these two haplotypes is essentially the same.

16.
Taxus mairei is a critically endangered, commercially important, cultivated medicinal gymnosperm in China, but its genome has not been studied. In this study, we constructed a T. mairei fosmid library and analyzed the fosmid end sequences to provide a preliminary assessment of the genome. The library consists of one million clones with an average insert size of about 39 kb, amounting to 3.9 genome equivalents. Fosmid stability assays indicate that T. mairei DNA was stable during propagation in the fosmid system. End sequencing of both 5′ and 3′ ends of 968 individual clones generated 1,923 sequences after trimming, with an average sequence length of 839 bp. BLASTN searches of the nr and EST databases of GenBank and BLASTX searches of the nr database resulted in 560 (29.1%) significant hits (E < e−5). Repetitive sequence analysis revealed that 20.8% of end sequences are repetitive elements, which were composed of retroelements, DNA transposons, satellites, simple repeats, and low complexity sequences. The distribution pattern of the various repeat types was more similar to that of the gymnosperms Pinus and Picea than to that of monocots and dicots. The satellites of T. mairei were significantly longer than those of P. taeda and P. glauca. The tetra-nucleotide repeats of T. mairei were much longer than those of P. glauca and P. taeda. The fosmid library and fosmid end sequences, reported here for the first time, will serve as a useful resource for large-scale genome sequencing, physical mapping, SSR marker development and positional cloning, and provide a better understanding of the Taxus genome.

17.
Characterization of the nuclear ribosomal DNA of Euglena gracilis   Total citations: 4 (self-citations: 0; citations by others: 4)
S. E. Curtis, J. R. Rawson. Gene, 1981, 15(2-3): 237-247
A phage lambda recombinant library containing Euglena gracilis genomic DNA was screened for nuclear rDNA sequences. A recombinant phage was isolated that contained an 11.5-kb nuclear rDNA sequence. The 11.5-kb insert was mapped with restriction endonucleases and was shown to represent a complete rDNA repeat unit that carried the genes for the 19S, 25S, 5.8S and 5S cytoplasmic rRNAs. The 2000 rDNA repeat units per haploid genome are organized in the form of identical tandem repeats.

18.
We studied the occurrence of mammalian interspersed repeats (MIRs) in DNA and RNA of vertebrates, invertebrates, and bacteria using the data from GenBank. A special algorithm based on a weight position matrix with optimal alignment using dynamic programming was developed to search for the traces of MIR dissemination. This allowed us to search for highly divergent MIRs carrying deletions and insertions. MIRs were detected in genomes of various fishes, including Latimeria. This suggests that the origin of MIRs dates back more than 400 million years. The method to search for similarity between highly divergent sequences may be used to find the genome fragments from various ancient repeat families and from various gene families.
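The idea of aligning a position weight matrix to a sequence by dynamic programming, so that copies carrying insertions and deletions can still score well, can be sketched as follows. This is a generic Smith-Waterman-style illustration and not the authors' algorithm. The function names, the log-odds scoring with a uniform background, the pseudocount, and the linear gap penalty of -2 are all assumptions chosen for the example.

```python
import math

def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Turn per-column base counts into log2-odds scores (assumed scoring scheme)."""
    pwm = []
    for col in counts:
        total = sum(col.get(b, 0) for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((col.get(b, 0) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm

def best_local_score(pwm, seq, gap=-2.0):
    """Best local alignment score of the matrix columns against seq, gaps allowed."""
    prev = [0.0] * (len(seq) + 1)
    best = 0.0
    for column in pwm:
        cur = [0.0] * (len(seq) + 1)
        for j in range(1, len(seq) + 1):
            match = prev[j - 1] + column.get(seq[j - 1], 0.0)
            skip_column = prev[j] + gap     # deletion in the sequence
            extra_base = cur[j - 1] + gap   # insertion in the sequence
            cur[j] = max(0.0, match, skip_column, extra_base)
            best = max(best, cur[j])
        prev = cur
    return best

# Hypothetical six-column matrix built from a made-up consensus "ACGTAC".
toy_pwm = pwm_from_counts([{base: 10} for base in "ACGTAC"])

exact = best_local_score(toy_pwm, "TTACGTACTT")    # intact copy
deleted = best_local_score(toy_pwm, "TTACGACTT")   # copy missing one base
unrelated = best_local_score(toy_pwm, "TTTTTTTT")  # no copy present
print(exact > deleted > unrelated)
```

In this toy setting a copy with one deleted base still scores well above an unrelated sequence, which is the property that makes gapped matrix alignment useful for highly divergent repeats.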

19.
Tandemly repeated sequences are a major component of the eukaryotic genome. Although the general characteristics of tandem repeats have been well documented, the processes involved in their origin and maintenance remain unknown. In this study, a region on the paternal sex ratio (PSR) chromosome was analyzed to investigate the mechanisms of tandem repeat evolution. The region contains a junction between a tandem array of PSR2 repeats and a copy of the retrotransposon NATE, with other dispersed repeats (putative mobile elements) on the other side of the element. Little similarity was detected between the sequence of PSR2 and the region of NATE flanking the array, indicating that the PSR2 repeat did not originate from the underlying NATE sequence. However, a short region of sequence similarity (11/15 bp) and an inverted region of sequence identity (8 bp) are present on either side of the junction. These short sequences may have facilitated nonhomologous recombination between NATE and PSR2, resulting in the formation of the junction. Adjacent to the junction, the three most terminal repeats in the PSR2 array exhibited a higher sequence divergence relative to internal repeats, which is consistent with a theoretical prediction of the unequal exchange model for tandem repeat evolution. Other NATE insertion sites were characterized which show proximity to both tandem repeats and complex DNAs containing additional dispersed repeats. An "accretion model" is proposed to account for this association by the accumulation of mobile elements at the ends of tandem arrays and into "islands" within arrays. Mobile elements inserting into arrays will tend to migrate into islands and to array ends, due to the turnover in the number of intervening repeats. Received: 18 August 1997 / Accepted: 18 September 1998

20.
There are over 6000 internally eliminated DNA sequences (IESs) in the Tetrahymena genome that are deleted in a programmed fashion during the development of a polyploid, somatic macronucleus from a diploid germline micronucleus. Recently, based on several results, a homology and small RNA-based mechanism has been proposed for the efficient elimination of IES elements. Since the RNAi machinery is proposed to be intimately involved in silencing potentially harmful repeats such as transposons and viruses, characterization of repeats and the conditions for their developmental elimination from the somatic genome is warranted. Three short (500–600 bp) repeat families, members of which had been experimentally identified in IESs, that is, in micronucleus-specific DNA, are examined here using the Tetrahymena genome database. Members of all three families display varied degrees of truncation and are represented in macronuclear sequences. A 200 bp segment of one of the families can appear in the genome on its own, or as part of a 600 bp repeat detected experimentally, or in association with an unrelated 1 kb sequence to form a 1.2 kb repeat that is also frequently truncated. The 1 kb sequence contains a 300 bp section similar to a repeat associated with a non-long terminal repeat-like element and is often found accompanied by several more copies of this shorter repeat. These observations indicate that transposition may have had a role in the evolution of the short repeat families.
