Similar articles
 20 similar articles found (search time: 31 ms)
1.
DNA gel-blot and in situ hybridization with genome-specific repeated sequences have proven to be valuable tools in analyzing genome structure and relationships in species with complex allopolyploid genomes such as hexaploid oat (Avena sativa L., 2n = 6x = 42; AACCDD genome). In this report, we describe a systematic approach for isolating genome-, chromosome-, and region-specific repeated and low-copy DNA sequences from oat that can presumably be applied to any complex genome species. Genome-specific DNA sequences were first identified in a random set of A. sativa genomic DNA cosmid clones by gel-blot hybridization using labeled genomic DNA from different Avena species. Because no repetitive sequences were identified that could distinguish between the A and D genomes, sequences specific to these two genomes are referred to as A/D genome specific. A/D or C genome specific DNA subfragments were used as screening probes to identify additional genome-specific cosmid clones in the A. sativa genomic library. We identified clustered and dispersed repetitive DNA elements for the A/D and C genomes that could be used as cytogenetic markers for discrimination of the various oat chromosomes. Some analyzed cosmids appeared to be composed entirely of genome-specific elements, whereas others represented regions with genome- and non-specific repeated sequences with interspersed low-copy DNA sequences. Thus, genome-specific hybridization analysis of restriction digests of random and selected A. sativa cosmids also provides insight into the sequence organization of the oat genome.

2.
MOTIVATION: Complex genomes contain numerous repeated sequences, and genomic duplication is believed to be a main evolutionary mechanism to obtain new functions. Several tools are available for de novo repeat sequence identification, and many approaches exist for clustering homologous protein sequences. We present an efficient new approach to identify and cluster homologous DNA sequences with high accuracy at the level of whole genomes, excluding low-complexity repeats, tandem repeats and annotated interspersed repeats. We also determine the boundaries of each group member so that it closely represents a biological unit, e.g. a complete gene, or a partial gene coding a protein domain. RESULTS: We developed a program called HomologMiner to identify homologous groups applicable to genome sequences that have been properly marked for low-complexity repeats and annotated interspersed repeats. We applied it to the whole genomes of human (hg17), macaque (rheMac2) and mouse (mm8). Groups obtained include gene families (e.g. olfactory receptor gene family, zinc finger families), unannotated interspersed repeats and additional homologous groups that resulted from recent segmental duplications. Our program incorporates several new methods: a new abstract definition of consistent duplicate units, a new criterion to remove moderately frequent tandem repeats, and new algorithmic techniques. We also provide preliminary analysis of the output on the three genomes mentioned above, and show several applications including identifying boundaries of tandem gene clusters and novel interspersed repeat families. AVAILABILITY: All programs and datasets are downloadable from www.bx.psu.edu/miller_lab.

3.
FORRepeats: detects repeats on entire chromosomes and between genomes   Cited by: 1 (self-citations: 0, citations by others: 1)
MOTIVATION: As more and more whole genomes become available, there is a need for new methods to compare large sequences and transfer biological knowledge from annotated genomes to related new ones. BLAST is not suitable to compare multimegabase DNA sequences. MegaBLAST is designed to compare closely related large sequences. Some tools to detect repeats in large sequences have already been developed such as MUMmer or REPuter. They also have time or space restrictions. Moreover, in terms of applications, REPuter only computes repeats and MUMmer works better with related genomes. RESULTS: We present a heuristic method, named FORRepeats, which is based on a novel data structure called factor oracle. In the first step it detects exact repeats in large sequences. Then, in the second step, it computes approximate repeats and performs pairwise comparison. We compared its computational characteristics with BLAST and REPuter. Results demonstrate that it is fast and space economical. We show FORRepeats' ability to perform intra-genomic comparison and to detect repeated DNA sequences in the complete genome of the model plant Arabidopsis thaliana.
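The abstract above names the factor oracle as the underlying data structure but gives no implementation details. As a rough illustration only, the following Python sketch shows the standard online construction of a factor oracle for a sequence. It is not FORRepeats' code; the function names `build_factor_oracle` and `accepts` are hypothetical and chosen for this example.

```python
def build_factor_oracle(s):
    """Build a factor oracle for string s (standard online construction).

    Returns (trans, supply): trans[i] maps a character to the target state
    from state i, and supply[i] is the supply (suffix) link of state i.
    This is an illustrative sketch, not the FORRepeats implementation.
    """
    n = len(s)
    trans = [dict() for _ in range(n + 1)]
    supply = [-1] * (n + 1)
    for i in range(n):
        c = s[i]
        trans[i][c] = i + 1              # internal transition along the sequence
        k = supply[i]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1          # external transition added via supply links
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans, supply


def accepts(trans, word):
    """Return True if the oracle can read `word` starting from state 0."""
    state = 0
    for c in word:
        if c not in trans[state]:
            return False
        state = trans[state][c]
    return True


trans, supply = build_factor_oracle("ACGTACGGA")
print(accepts(trans, "TACG"))   # a substring of the input is always accepted
```

The oracle has one state per sequence position plus an initial state and at most 2n - 1 transitions, which is the kind of compactness the abstract refers to as being space economical. It accepts every substring of the input but may also accept some strings that are not substrings, so candidate repeats found through it would still need verification against the sequence.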

4.
Several complementary procedures were used to identify and characterize DNA sequences which are repeated within a 44 kilobase (kb) segment of rabbit chromosomal DNA containing four different rabbit β-like globin genes (β1–β4). Cross-hybridization between cloned DNAs from different regions of the gene cluster indicates the presence of a complex array of repeat sequences interspersed with the globin genes. We classified 20 different repeat sequences into five families whose members cross-hybridize. Electron microscopy was used to determine the location, size and relative orientations of many of the repeat sequences. Both direct and inverted repeats were identified, with sizes ranging from 140 to 1400 base pairs (bp). Each of the four closely linked globin genes is flanked by at least one pair of inverted repeats of 140–400 bp, and the entire set of four genes is flanked by an inverted repeat of 1400 bp. Two of the five repeat families contain repeat sequences of different sizes. We found that the smaller sequence elements can occur individually or in association with the larger repeat sequences, suggesting that the larger repeats may be composed of more than one smaller repeat sequence. The restriction fragments containing the intracluster repeats also contain sequences which are repeated many times in total rabbit genomic DNA, but it is not known whether the genomic and intracluster repeats are the same sequences. The results provide the first demonstration of the relationship between single-copy and repetitive DNA sequences in a large segment of chromosomal DNA containing a well characterized set of developmentally regulated genes.

5.
Seven barley species have been compared for organization of repeated sequences. Quantitative variation of repeated DNA fractions is demonstrated, though the total amount of sequences (reassociation up to Cot=10) in most cases does not vary. The repeats are divided into four groups by the mode of interspecific variability, with the help of dot and blot hybridization of the genomes under study with cloned highly repeated sequences of Hordeum vulgare. The first group contains the pHv7161 family of the most conservative sequences. The second group comprises moderately changing repeats. The third group includes highly variable HindIII repeats of Hordeum genomes, and the fourth group is represented by the pHv7191 family of repeats that are highly amplified in the H. vulgare genome. Comparative analysis of the content and organization of highly repeated sequences in the genome helps to clarify phylogenetic relationships in the genus and can be used to predict the success of interspecific hybridization.

6.
MRD is a database system to access the microsatellite repeats information of genomes such as archaea, eubacteria, and other eukaryotic genomes whose sequence information is available in public domains. MRD stores information about simple tandemly repeated k-mer sequences where k = 1 to 6, i.e. monomer to hexamer. The web interface allows the users to search for the repeat of their interest and to know about the association of the repeat with genes and genomic regions in the specific organism. The data contains the abundance and distribution of microsatellites in the coding and non-coding regions of the genome. The exact location of repeats with respect to genomic regions of interest (such as UTR, exon, intron or intergenic regions), whichever is applicable to the organism, is highlighted. MRD is available on the World Wide Web at and/or . The database is designed as an open-ended system to accommodate the microsatellite repeats information of other genomes whose complete sequences will be available in future through public domain.
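To make the definition of "simple tandemly repeated k-mer sequences where k = 1 to 6" concrete, here is a minimal Python sketch that scans a DNA string for such repeats with a regular expression. It is an assumption-laden illustration, not how MRD itself is built: the function name `find_microsatellites` and the `min_repeats` threshold are hypothetical, and real microsatellite tools usually apply motif-length-specific thresholds.

```python
import re


def find_microsatellites(seq, min_repeats=3, max_k=6):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Motif lengths run from 1 (monomer) to max_k (hexamer by default).
    Motifs that are themselves repetitions of a shorter unit (e.g. "ACAC")
    are skipped so each repeat is reported once, at its shortest motif.
    Hypothetical helper for illustration; thresholds are assumptions.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_k + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            is_compound = any(
                motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0
            )
            if is_compound:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


print(find_microsatellites("TTGACACACACAGG"))   # [(3, 'AC', 4)]
```

Start positions are 0-based. Associating each hit with a UTR, exon, intron or intergenic region, as the abstract describes, would additionally require an annotation table, which is outside the scope of this sketch.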

7.
Repetitive DNA sequences comprise a large percentage of plant genomes, and their characterization provides information about both species and genome evolution. We have isolated a recombinant clone containing a highly repeated DNA element (SB92) that is homologous to ca. 0.9% of the soybean genome or about 10^5 copies. This repeated sequence is tandemly arranged and is found in four or five major genomic locations. FISH analysis of metaphase chromosomes suggests that two of these locations are centromeric. We have determined the sequence of two cloned repeats and performed genomic sequencing to obtain a consensus sequence. The consensus repeat size was 92 bp and exhibited an average of 10% nucleotide substitution relative to the two cloned repeats. This high level of sequence diversity suggests an ancient origin but is inconsistent with the limited phylogenetic distribution of SB92, which is found at high copy number only in the annual soybeans. It therefore seems likely that this sequence is undergoing very rapid evolution.

8.
Chromosome breakage in germline and somatic genomes gives rise to copy number variation (CNV) responsible for genomic disorders and tumorigenesis. DNA sequence is known to play an important role in breakage at chromosome fragile sites; however, the sequences susceptible to double-strand breaks (DSBs) underlying CNV formation are largely unknown. Here we analyze 140 germline CNV breakpoints from 116 individuals to identify DNA sequences enriched at breakpoint loci compared to 2800 simulated control regions. We find that, overall, CNV breakpoints are enriched in tandem repeats and sequences predicted to form G-quadruplexes. G-rich repeats are overrepresented at terminal deletion breakpoints, which may be important for the addition of a new telomere. Interstitial deletions and duplication breakpoints are enriched in Alu repeats that in some cases mediate non-allelic homologous recombination (NAHR) between the two sides of the rearrangement. CNV breakpoints are enriched in certain classes of repeats that may play a role in DNA secondary structure, DSB susceptibility and/or DNA replication errors.

9.
We studied the structure, organization and relationship of repetitive DNA sequences in the genome of the scallop, Pecten maximus, a bivalve that is important both commercially and in marine ecology. Recombinant DNA libraries were constructed after partial digestion of genomic DNA from scallop with PstI and ApaI restriction enzymes. Clones containing repetitive DNA were selected by hybridisation to labelled DNA from scallop, oyster and mussel; colonies showing strong hybridisation only to scallop were selected for analysis and sequencing. Six non-homologous tandemly repeated sequences were identified in the sequences, and Southern hybridisation with all repeat families to genomic DNA digests showed characteristic ladders of hybridised bands. Three families had monomer lengths around 40 bp while three had repeats characteristic of the length wrapping around one (170 bp), or two (326 bp) nucleosomes. In situ hybridisation to interphase nuclei showed each family had characteristic numbers of clusters indicating contrasting arrangements. Two of the repeats had unusual repetitions of bases within their sequence, which may relate to the nature of microsatellites reported in bivalves. The study of these rapidly evolving sequences is valuable to understand an important source of genomic diversity, has the potential to provide useful markers for population studies and gives a route to identify mechanisms of DNA sequence evolution.

10.
We have examined the organization of the repeated and single copy DNA sequences in the genomes of two insects, the honeybee (Apis mellifera) and the housefly (Musca domestica). Analysis of the reassociation kinetics of honeybee DNA fragments 330 and 2,200 nucleotides long shows that approximately 90% of both size fragments is composed entirely of non-repeated sequences. Thus honeybee DNA contains few or no repeated sequences interspersed with nonrepeated sequences at a distance of less than a few thousand nucleotides. On the other hand, the reassociation kinetics of housefly DNA fragments 250 and 2,000 nucleotides long indicates that less than 15% of the longer fragments are composed entirely of single copy sequences. A large fraction of the housefly DNA therefore contains repeated sequences spaced less than a few thousand nucleotides apart. Reassociated repetitive DNA from the housefly was treated with S1 nuclease and sized on agarose A-50. The S1 resistant sequences have a bimodal distribution of lengths. Thirty-three percent is greater than 1,500 nucleotide pairs, and 67% has an average size about 300 nucleotide pairs. The genome of the housefly appears to have at least 70% of its DNA arranged as short repeats interspersed with single copy sequences in a pattern qualitatively similar to that of most eukaryotic genomes.

11.
Direct or inverse repeated sequences are important functional features of prokaryotic and eukaryotic genomes. Considering the unique mechanism, involving single-stranded genomic intermediates, by which adenovirus (Ad) replicates its genome, we investigated whether repetitive homologous sequences inserted into E1-deleted adenoviral vectors would affect replication of viral DNA. In these studies we found that inverted repeats (IRs) inserted into the E1 region could mediate predictable genomic rearrangements, resulting in vector genomes devoid of all viral genes. These genomes (termed DeltaAd.IR) contained only the transgene cassette flanked on both sides by precisely duplicated IRs, Ad packaging signals, and Ad inverted terminal repeat sequences. Generation of DeltaAd.IR genomes could also be achieved by coinfecting two viruses, each providing one inverse homology element. The formation of DeltaAd.IR genomes required Ad DNA replication and appeared to involve recombination between the homologous inverted sequences. The formation of DeltaAd.IR genomes did not depend on the sequence within or adjacent to the inverted repeat elements. The small DeltaAd.IR vector genomes were efficiently packaged into functional Ad particles. All functions for DeltaAd.IR replication and packaging were provided by the full-length genome amplified in the same cell. DeltaAd.IR vectors were produced at a yield of approximately 10^4 particles per cell, which could be separated from virions with full-length genomes based on their lighter buoyant density. DeltaAd.IR vectors infected cultured cells with the same efficiency as first-generation vectors; however, transgene expression was only transient due to the instability of deleted genomes within transduced cells. The finding that IRs present within Ad vector genomes can mediate precise genetic rearrangements has important implications for the development of new vectors for gene therapy approaches.

12.

Background  

Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics composed of multiple duplicated fragments. This complex genomic organization complicates analysis of the evolutionary history of these sequences. One model proposed to explain this mosaic pattern is a model of repeated aggregation and subsequent duplication of genomic sequences.

13.
We have used Fragmentation Sequencing logic to analyse the repetition structure of several large human genomic genes. The method, based on a proposed laboratory scheme for DNA sequencing, detects short sequences which are repeated near, but not necessarily adjacent, to each other (cryptically simple DNA). We find a low frequency of such repeats. There is a slight excess of such repeats in introns over exons, and a slight but significant excess in genomic DNA over random DNA, confirming that cryptically simple sequences are over-represented in the genome. The analysis suggests that Fragmentation Sequencing will be a suitable method for sequencing large mammalian genes.

14.
15.
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and trinucleotide repeats in Drosophila also seemed to be longer. Although the trends for different repeats are similar between different chromosomes within a genome, the density of repeats may vary between different chromosomes of the same species. The abundance or rarity of various di- and trinucleotide repeats in different genomes cannot be explained by nucleotide composition of a sequence or potential of repeated motifs to form alternative DNA structures. This suggests that in addition to nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. Moreover, analysis of complete genome coding DNA sequences of Drosophila, C. elegans, and yeast indicated that expansions of codon repeats corresponding to small hydrophilic amino acids are tolerated more, while strong selection pressures probably eliminate codon repeats encoding hydrophobic and basic amino acids. The locations and sequences of all of the repeat loci detected in genome sequences and coding DNA sequences are available at http://www.ncl-india.org/ssr and could be useful for further studies.

16.
Repetitive DNA and chromosome evolution in plants   Cited by: 32 (self-citations: 0, citations by others: 32)
Most higher plant genomes contain a high proportion of repeated sequences. Thus repetitive DNA is a major contributor to plant chromosome structure. The variation in total DNA content between species is due mostly to variation in repeated DNA content. Some repeats of the same family are arranged in tandem arrays, at the sites of heterochromatin. Examples from the Secale genus are described. Arrays of the same sequence are often present at many chromosomal sites. Heterochromatin often contains arrays of several unrelated sequences. The evolution of such arrays in populations is discussed. Other repeats are dispersed at many locations in the chromosomes. Many are likely to be or have evolved from transposable elements. The structures of some plant transposable elements, in particular the sequences of the terminal inverted repeats, are described. Some elements in soybean, Antirrhinum and maize have the same inverted terminal repeat sequences. Other elements of maize and wheat share terminal homology with elements from yeast, Drosophila, man and mouse. The evolution of transposable elements in plant populations is discussed. The amplification, deletion and transposition of different repeated DNA sequences and the spread of the mutations in populations produces a turnover of repetitive DNA during evolution. This turnover process and the molecular mechanisms involved are discussed and shown to be responsible for divergence of chromosome structure between species. Turnover of repeated genes also occurs. The molecular processes affecting repeats imply that the older a repetitive DNA family the more likely it is to exist in different forms and in many locations within a species. Examples to support this hypothesis are provided from the Secale genus.

17.
18.
MOTIVATION: Tandemly organized repetitive sequences (satellite DNA) are widespread in complex eukaryotic genomes. In plants, satellite repeats often represent a substantial part of nuclear DNA but only a little is known about the molecular mechanisms of their amplification and their possible role(s) in genome evolution and function. Unfortunately, addressing these questions via characterization of general sequence properties of known satellite repeats has been hindered by a difficulty in obtaining a complete and unbiased set of sequence data for this analysis. This is mainly due to the presence of multiple entries of homologous sequences and of single entries that contain more than one repeated unit (monomer) in the public databases. RESULTS: We have established a computer database specialized for plant satellite repeats (PlantSat) that integrates sequence data available from various resources with supplementary information including repeat consensus sequences, abundances, and chromosomal localizations. The sequences are stored as individual repeat monomers grouped into families, which simplifies their computer analysis and makes it more accurate. Using this feature, we have performed a basic sequence analysis of the whole set of plant satellite repeats with respect to their monomer length and nucleotide composition. The analysis revealed several preferred length ranges of the monomers (approximately 165 bp and its multiples) and an over-representation of the AA/TT dinucleotide in the repeats. We have also detected an enrichment of satellite DNA sequences for the motif CAAAA that is supposed to be involved in breakage-reunion of repeated sequences.
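The abstract reports an over-representation of the AA/TT dinucleotide but does not state how it was measured. One common way to quantify this, shown here purely as an assumed illustration, is the ratio of the observed dinucleotide frequency to the frequency expected from the single-base composition. The function name `dinucleotide_odds` is hypothetical and is not taken from PlantSat.

```python
from collections import Counter


def dinucleotide_odds(seq, dinuc):
    """Observed/expected ratio for one dinucleotide in a sequence.

    observed = count(dinuc) / number of overlapping dinucleotide positions
    expected = freq(first base) * freq(second base)
    A ratio above 1 suggests over-representation. This measure is an
    assumption for illustration; the source does not specify its statistic.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return float("nan")
    base_counts = Counter(seq)
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    observed = pair_counts[dinuc] / (n - 1)
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n)
    return observed / expected if expected else float("nan")


# Toy monomer: AA occurs 3 times in 7 positions, expected frequency 0.25.
print(round(dinucleotide_odds("AAAACCCC", "AA"), 3))   # 1.714
```

For satellite monomers, the AA and TT values would typically be computed per monomer and then summarized per family, since AA on one strand corresponds to TT on the other.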

19.
M Hollis  J Hindley 《Gene》1986,46(2-3):153-160
Representatives of the Sau3A family of short human repeated sequences [Meneveri et al., J. Mol. Biol. 186 (1985) 483-489] have been isolated from the small polydisperse circular DNA (spcDNA) of peripheral human lymphocytes. The prototype repeat is a 72-bp element which is at least partially tandemly repeated in spcDNA and human genomic DNA. In comparison with three major families of human repeated DNA, the Sau3A repeats are enriched in spcDNA. The function of spcDNA in normal and transformed eukaryotic cells is not understood and most studies have attempted to resolve this problem by molecular analysis of circular DNA isolated from cells in culture [see Rush and Misra, Plasmid 14 (1985) 177-191 for references]. We have studied the spcDNA present in normal uncultured human lymphocytes and present data pointing to the selective accumulation of the Sau3A family of repeated DNA within this population. The sequences of twelve of these repeats, the consensus sequence for this family and the sequence of a genomic repeat, are presented.

20.
J Li  F Wang  V Kashuba  C Wahlestedt  E R Zabarovsky 《BioTechniques》2001,31(4):788, 790, 792-793
The deletion of specific genomic sequences is believed to influence the pathogenesis of certain diseases such as cancer. Identification of these sequences could provide novel therapeutic avenues for the treatment of disease. Here, we describe a simple and robust method called cloning of deleted sequences (CODE), which allows the selective cloning of deleted sequences from complex human genomes. Briefly, genomic DNA from two sources (human normal and tumor samples) was digested with restriction enzymes (e.g., BamHI, BglII, and BclI), then ligated to special linkers, and amplified by PCR. Tester (normal) DNA was amplified using a biotinylated primer and dNTPs. Driver (tumor) DNA was amplified using a non-biotinylated primer, but with dUTP instead of dTTP. After denaturation and hybridization, all the driver DNA was destroyed with uracil-DNA glycosylase (UDG), and all imperfect hybrids were digested with mung bean nuclease. Sequences deleted from the driver DNA but present in the tester DNA were purified with streptavidin magnetic beads, and the cycle was repeated three more times. This procedure resulted in the rapid isolation and efficient cloning of genomic sequences homozygously deleted from the driver DNA sample, but present in the tester DNA fraction.
