Similar articles
 20 similar articles found (search time: 46 ms)
1.
A new family of repeats, the MB1 family, present at a few hundred thousand copies per human genome, has been revealed in the human genome by computer analysis of noncanonical similarity between nucleic acid sequences. Members of this family have also been found in the mouse and rat genomes; they have been identified as mirror-reflected copies (with respect to purines and pyrimidines) of the B1 repeats in the mouse genome and the Alu repeats in the human genome. The MB1 repeats tend to remain most similar over a length of 70 bp. They are not flanked by short repeats and do not contain a poly(A) region at the 3' end, which distinguishes them from repeats of the SINE family. It has been proposed that members of the Alu and MB1 repeat families can form the so-called H-form of DNA. The mirror-reflected repeat family could have been formed by replication of parallel DNA strands.

2.
We have previously shown that GAA trinucleotide repeats have undergone significant expansion in the human genome. Here we present the analysis of the length distribution of all 10 nonredundant trinucleotide repeat motifs in 20 complete eukaryotic genomes (6 mammalian, 2 nonmammalian vertebrates, 4 arthropods, 4 fungi, and 1 each of nematode, amoebozoa, alveolate, and plant), which showed that the abundance of large expansions of GAA trinucleotide repeats is specific to mammals. Analysis of human-chimpanzee-gorilla orthologs revealed that loci with large expansions are species-specific and have occurred after divergence from the common ancestor. PCR analysis of human controls revealed large expansions at multiple human (GAA)(30+) loci; nine loci showed expanded alleles containing >65 triplets, analogous to disease-causing expansions in Friedreich ataxia, including two that are in introns of genes of unknown function. The abundance of long GAA trinucleotide repeat tracts in mammalian genomes represents a significant mutation potential and source of interindividual variability.

3.
Long interspersed nuclear elements (LINEs) comprise about 21% of the human genome (of which L1 is most abundant) and are preferentially accumulated in AT-rich regions, as well as the X and Y chromosomes. Most knowledge of L1 distribution in mammals is restricted to human and mouse. Here we report the first investigation of L1 distribution in the genomes of a wide variety of eutherian mammals, including species in the two basal clades, Afrotheria and Xenarthra. Our results show L1 accumulation on the X of all eutherian mammals, an observation consistent with an ancestral involvement of these elements in the X-inactivation process (the Lyon repeat hypothesis). Surprisingly, conspicuous accumulation of L1 in AT-rich regions of the genome was not observed in any species outside of Euarchontoglires (represented by human, mouse and rabbit). Although several features were common to most species investigated, our comprehensive survey shows that the patterns observed in human and mouse are, in many aspects, far from typical for all mammals. We discuss these findings with reference to models that have previously been proposed to explain the AT distribution bias of L1 in human and mouse, and how this relates to the evolution of these elements in other eutherian genomes. Paul D. Waters and Gauthier Dobigny contributed equally to this work.

4.

Background

Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes?

Results

A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the human, dog, and cow genomes. To maximize genome coverage, the coordinates of all BAC end sequence hits to the cow and dog genomes were also converted to the equivalent human genome coordinates. The 84,624 sheep BACs (about 5.4-fold genome coverage) with paired ends in the correct orientation (tail-to-tail) and spacing, combined with information from sheep BAC comparative genome contigs (CGCs) built separately on the dog and cow genomes, were used to construct 1,172 sheep BAC-CGCs, covering 91.2% of the human genome. Clustered non-tail-to-tail and outsize BACs located close to the ends of many BAC-CGCs linked BAC-CGCs covering about 70% of the genome to at least one other BAC-CGC on the same chromosome. Using the BAC-CGCs, the intrachromosomal and interchromosomal BAC-CGC linkage information, human/cow and vertebrate synteny, and the sheep marker map, a virtual sheep genome was constructed. To identify BACs potentially located in gaps between BAC-CGCs, an additional set of 55,668 sheep BACs were positioned on the sheep genome with lower confidence. A coordinate conversion process allowed us to transfer human genes and other genome features to the virtual sheep genome to display on a sheep genome browser.

Conclusion

We demonstrate that limited sequencing of BACs, combined with positioning on a well-assembled genome and integration of locations from other, less well-assembled genomes, can yield extensive, detailed subgene-level maps of mammalian genomes for which genomic resources are currently limited.

5.
6.
The interspersed repeat content of mammalian genomes has been best characterized in human, mouse and cow. In this study, we carried out de novo identification of repeated elements in the equine genome and identified previously unknown elements present at low copy number. The equine genome contains typical eutherian mammal repeats, but also has a significant number of hybrid repeats in addition to clade-specific Long Interspersed Nuclear Elements (LINE). Equus caballus clade-specific LINE 1 (L1) repeats can be classified into approximately five subfamilies, three of which have undergone significant expansion. There are 1115 full-length copies of these equine L1, but of the 103 presumptive active copies, 93 fall within a single subfamily, indicating a rapid recent expansion of this subfamily. We also analysed both interspersed and simple sequence repeats (SSR) genome-wide, finding that some repeat classes are spatially correlated with each other as well as with G+C content and gene density. Based on these spatial correlations, we have confirmed that recently described ancestral vs. clade-specific genome territories can be defined by their repeat content. The clade-specific Short Interspersed Nuclear Element correlations were scattered over the genome and appear to have been extensively remodelled. In contrast, territories enriched for ancestral repeats tended to be contiguous domains. To determine if the latter territories were evolutionarily conserved, we compared these results with a similar analysis of the human genome, and observed similar ancestral repeat enriched domains. These results indicate that ancestral, evolutionarily conserved mammalian genome territories can be identified on the basis of repeat content alone. Interspersed repeats of different ages appear to be analogous to geologic strata, allowing identification of ancient vs. newly remodelled regions of mammalian genomes.

7.
Analysis of 37 short repetitive elements (SINEs) in rabbit DNA that are known as C repeats has revealed three that contribute functional polyadenylation signals to genes into which they have been inserted. Similar roles have been attributed to particular individual SINEs in rodents and primates before, suggesting that these roles may be common to SINEs in all mammalian orders. Although most SINEs appear to have little influence on the genome individually, the observation that three of 36 rabbit C repeats provide functional sequences suggests a mechanism for the maintenance of SINEs within mammalian genomes.

8.
9.
The genomes of birds are much smaller than mammalian genomes, and transposable elements (TEs) make up only 10% of the chicken genome, compared with 45% of the human genome. To study the mechanisms that constrain the copy numbers of TEs, and as a consequence the genome size of birds, we analyzed the distributions of LINEs (CR1s) and SINEs (MIRs) on the chicken autosomes and Z chromosome. We show that (1) CR1 repeats are longest on the Z chromosome and their length is negatively correlated with the local GC content; (2) the decay of CR1 elements is highly biased, and the 5'-ends of the insertions are lost much faster than their 3'-ends; (3) the GC distribution of CR1 repeats shows a bimodal pattern with repeats enriched in both AT-rich and GC-rich regions of the genome, but the CR1 families show large differences in their GC distribution; and (4) the few MIRs in the chicken are most abundant in regions with intermediate GC content. Our results indicate that the primary mechanism that removes repeats from the chicken genome is ectopic exchange and that the low abundance of repeats in avian genomes is likely to be the consequence of their high recombination rates.

10.

Background

Ancestral reconstructions of mammalian genomes have revealed that evolutionary breakpoint regions are clustered in regions that are more prone to break and reorganize. What is still unclear to evolutionary biologists is whether these regions are physically unstable due solely to sequence composition and/or genome organization, or whether they represent genomic areas where the selection against breakpoints is minimal.

Methodology and Principal Findings

Here we present a comprehensive study of the distribution of tandem repeats in great apes. We analyzed the distribution of tandem repeats in relation to the localization of evolutionary breakpoint regions in the human, chimpanzee, orangutan and macaque genomes. We observed an accumulation of tandem repeats in the genomic regions implicated in chromosomal reorganizations. In the case of the human genome, our analyses revealed that evolutionary breakpoint regions contained more base pairs implicated in tandem repeats than synteny blocks did, with AAAT being the motif most frequently involved in evolutionary regions. We found that the AAAT repeats located in evolutionary regions were preferentially associated with Alu elements.

Significance

Our observations provide evidence for the role of tandem repeats in shaping mammalian genome architecture. We hypothesize that an accumulation of specific tandem repeats in evolutionary regions can promote genome instability by altering the state of the chromatin conformation or by promoting the insertion of transposable elements.

11.
Short interspersed elements (SINEs) are ubiquitous in mammalian genomes. The remarkable variety of these repeats among placental orders indicates that most of them amplified in each lineage independently, following the mammalian radiation. Here, we present an ancient family of repeats, whose sequence divergence and common occurrence among placental mammals, marsupials and monotremes indicate their amplification during the Mesozoic era. They are called MIRs, for abundant Mammalian-wide Interspersed Repeats. With approximately 120,000 copies still detectable in the human genome (0.2-0.3% DNA), MIRs represent a 'fossilized' record of a major genetic event preceding the radiation of placental orders.

12.
The structure of the transgenic mouse DNA region containing an integrated transgene (a fragment of the pBR322 sequence) was analysed. In one of the sequences flanking the transgene, short direct and inverted overlapping repeats were found at a distance of 60 bp from the integration site. The same flanking sequence contains an extended sequence (3.5 kbp) located 0.3-1 kbp away from the transgene. It is repeated 100-300 times in the mouse genome and is highly conserved (homologs of the repeat have been found in other mammalian, bird, fish and insect genomes). This hitherto unknown family of highly conserved dispersed repeats has been designated T1. We believe that both the short inverted repeats, which are capable of forming hairpins with loops, and the T1 repeat are structures involved in the non-homologous insertion of foreign DNA into this region of the transgenic mouse genome.

13.
Complete archaeal genomes were probed for the presence of long (≥ 25 bp) oligonucleotide repeats (words). We detected the presence of many words distributed in tandem with narrow ranges of periodicity (i.e., spacer length between repeats). Similar words were not identified in genomes of non-archaeal species, namely Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Mycoplasma genitalium and Mycoplasma pneumoniae. BLAST similarity searches against the GenBank nucleotide sequence database revealed that these words were archaeal species-specific, indicating that they are of a signature character. Sequence analysis and genome viewing tools showed these repeats to be restricted to non-coding regions. Thus, archaea appear to possess a non-coding genomic signature that is absent in bacterial species. The identification of a species-specific genomic signature would be of great value to archaeal genome mapping, evolutionary studies and analyses of genome complexity.

14.
A 90-nucleotide (CAG)30 single-stranded DNA was used to probe Southern blots in order to indicate the quantity and distribution of long CAG repeats in selected genomes. Bovine and rat genomes were found to contain a particularly high content of CAG repeats, while the repeats were comparatively rare in the human genome. A particularly strong signal in the bovine genome was due to a CAG repeat associated with the 1.709 satellite. A similar element was found in goat and musk, but not in the other artiodactyls tested, suggesting that this particular CAG repeat developed some 10-20 million years ago within a 3.8-kb unit presently belonging to the satellite element and that this unit has later multiplied in the genome. Single-copy repeats could be discerned in yeast, but not in mammals. Thus the probe did not detect specific repeats in patients with CAG repeat diseases.

15.
The nucleotide sequence of the entire beta-like globin gene cluster of rabbits has been determined. This sequence of a continuous stretch of 44.5 × 10^3 base pairs (bp) starts about 6 × 10^3 bp upstream from epsilon (the 5'-most gene) and ends about 12 × 10^3 bp downstream from beta (the 3'-most gene). Analysis of the sequence reveals that: (1) the sequence is relatively A + T rich (about 60%); (2) regions with high G + C content are associated with OcC repeats, a short interspersed repeated DNA in rabbits; (3) the distribution of polypurines, polypyrimidines and alternating purine/pyrimidine tracts is not random within the cluster; (4) most open reading frames are associated with known globin coding regions, OcC repeats or long interspersed repeats (L1 repeats); (5) the most prominent open reading frames are found in the L1 repeats; (6) different strand asymmetries in base composition are associated with embryonic and adult genes as well as the tandem L1 repeats at the 3' end of the cluster; and (7) essentially all the repeats appear to have been inserted by a transposon mechanism. A comparison of the sequence with itself by a dot-plot analysis has revealed nine new members of the OcC family of repeats in addition to the six previously reported. The OcC repeats tend to be clustered, particularly in the epsilon-gamma and gamma-psi delta intergenic regions. Dot-plot comparisons between the rabbit and the human clusters have revealed extensive sequence matches. Homology starts about 6 × 10^3 bp 5' to epsilon, or as far upstream as the rabbit sequence is available. It continues throughout the entire cluster and stops about 0.7 × 10^3 bp 3' to beta, at which point several repeats have inserted in both rabbits and humans. Throughout the gene cluster, the homology is interrupted mainly by insertions or deletions in either the rabbit or the human genome. Almost all of the insertions are of known short or long repeated DNAs. The positions of the insertions are different in the two gene clusters, which indicates that both short and long repeats have been transposing throughout the genome for the time since the mammalian radiation. An alignment of rabbit and human sequences allows the calculation of the substitution rate around epsilon. Sequences far removed from the gene are evolving at a rate equivalent to the pseudogene rate, although some short regions show an apparently higher rate. (ABSTRACT TRUNCATED AT 400 WORDS)

16.
Microsatellite polymorphisms are invaluable for mapping vertebrate genomes. In order to estimate the occurrence of microsatellites in the rabbit genome and to assess their feasibility as markers in rabbit genetics, a survey of all types of mononucleotide, dinucleotide, trinucleotide and tetranucleotide repeats with a length of about 20 bp or more was conducted by searching the published rabbit DNA sequences in the EMBL nucleotide database (version 32). A total of 181 rabbit microsatellites could be extracted from the database. The estimated frequency of microsatellites in the rabbit genome was one microsatellite for every 2–3 kb of DNA. Dinucleotide repeats constituted the prevailing class of microsatellites, followed by trinucleotide, mononucleotide and tetranucleotide repeats, respectively. The average length of the microsatellites, as found in the database, was 26, 23, 23 and 22 bp for mono-, di-, tri- and tetranucleotide repeats, respectively. The most common repeat motif was AG, followed by A, AC, AGG and CCG. This group comprised about 70% of all extracted rabbit microsatellites. About 61% of the microsatellites were found in non-coding regions of genes, whereas 15% resided in (protein) coding regions. A significant fraction of rabbit microsatellites (about 22%) was found within interspersed repetitive DNA sequences.

17.
In the bovine genome we found two intrachromosomal DNA fragments flanked by inverted telomeric repeats (GenBank Accession Nos. AF136741 and AF136742). The internal parts of the fragments are homologous exclusively to the human sequences and to the consensus sequence of the L1MC4 subfamily of LINE-1 retrotransposons, which are widespread among mammalian genomes. We found that the distribution of homologous human sequences within our fragments is not random, reflecting a complicated pattern in the mechanisms of insertion and maintenance of retrotransposons in mammalian genomes. One possible explanation for the origin of truncated LINE-1 elements flanked by inverted telomeric repeats in the bovine genome is that extrachromosomal DNA fragments may be modified by telomerase and subsequently transferred into chromosomal DNA.

18.
Survey of simple sequence repeats in completed fungal genomes   (cited 7 times in total: 0 self-citations, 7 citations by others)
The use of simple sequence repeats (SSRs), or microsatellites, as genetic markers has become very popular because of their abundance and length variation between different individuals. SSRs are tandem repeat units of 1 to 6 base pairs that are found abundantly in many prokaryotic and eukaryotic genomes. This is the first study examining and comparing SSRs in completely sequenced fungal genomes. We analyzed and compared the occurrences, relative abundance, relative density, most common, and longest SSRs in nine taxonomically different fungal species: Aspergillus nidulans, Cryptococcus neoformans, Encephalitozoon cuniculi, Fusarium graminearum, Magnaporthe grisea, Neurospora crassa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Ustilago maydis. Our analysis revealed that, in all of the genomes studied, the occurrence, abundance, and relative density of SSRs varied and were not influenced by genome size. No correlation between relative abundance and genome size was observed, but N. crassa, the largest genome analyzed, had the highest relative abundance of SSRs. In most genomes, mononucleotide, dinucleotide, and trinucleotide repeats were more abundant than SSRs with longer repeat units. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit length increased. Furthermore, each organism had its own most common and longest SSRs. Our analysis showed that the relative abundance of SSRs in fungi is low compared with the human genome and that longer SSRs in fungi are rare. In addition to providing new information concerning the abundance of SSRs for each of these fungi, the results provide a general source of molecular markers that could be useful for a variety of applications such as population genetics and strain identification of fungal organisms.
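
As a rough illustration of the quantities named in this abstract, the following Python sketch scans a sequence for perfect SSRs with 1 to 6 bp units and reports their count, relative abundance and relative density. It is not the authors' pipeline: the minimum repeat counts in `MIN_REPEATS`, the function names, and the working definitions (relative abundance as SSRs per Mb, relative density as SSR base pairs per Mb) are assumptions made for this example, since the abstract does not state them.

```python
import re

# Hypothetical minimum repeat counts per unit length (an assumption;
# the abstract does not state the thresholds that were used).
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter motif (e.g. not 'ATAT')."""
    n = len(unit)
    return not any(
        n % d == 0 and unit == unit[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, unit, repeat_count) tuples for perfect SSRs with 1-6 bp units."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A unit of length k followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if is_primitive(unit):  # skip e.g. 'AAAA' reported as a 4-bp unit
                hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)


def ssr_summary(seq):
    """Count SSRs and normalise per megabase (assumed definitions, see lead-in)."""
    hits = find_ssrs(seq)
    megabases = len(seq) / 1e6
    ssr_bp = sum(len(unit) * count for _, unit, count in hits)
    return {
        "count": len(hits),
        "relative_abundance": len(hits) / megabases,  # SSRs per Mb
        "relative_density": ssr_bp / megabases,       # SSR bp per Mb
    }
```

Calling `ssr_summary` on each genome sequence would give per-genome values that can be compared against genome size. Imperfect (interrupted) repeats are ignored in this sketch.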

19.
Based on published information, we have identified 991 genes and gene-family clusters for cattle and 764 for pigs that have orthologues in the human genome. The relative linear locations of these genes on human sequence maps were used as "rulers" to annotate bovine and porcine genomes based on a CSAM (contiguous sets of autosomal markers) approach. A CSAM is an uninterrupted set of markers in one genome (primary genome; the human genome in this study) that is syntenic in the other genome (secondary genome; the bovine and porcine genomes in this study). The analysis revealed 81 conserved syntenies and 161 CSAMs between human and bovine autosomes and 50 conserved syntenies and 95 CSAMs between human and porcine autosomes. Using the human sequence map as a reference, these 991 and 764 markers could correlate 72 and 74% of the human genome with the bovine and porcine genomes, respectively. Based on the number of contiguous markers in each CSAM, we classified these CSAMs into five size groups as follows: singletons (one marker only), small (2-4 markers), medium (5-10 markers), large (11-20 markers), and very large (> 20 markers). Several bovine and porcine chromosomes appear to be represented as di-CSAM repeats in a tandem or dispersed way on human chromosomes. The number of potential CSAMs for which no markers are currently available was estimated to be 63 between human and bovine genomes and 18 between human and porcine genomes. These results provide basic guidelines for further gene and QTL mapping of the bovine and porcine genomes, as well as insight into the evolution of mammalian genomes.
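
The five CSAM size groups described in this abstract can be expressed as a small lookup. The Python sketch below encodes only the thresholds given in the text; the function names and the example marker counts are hypothetical and are not data from the study.

```python
from collections import Counter


def csam_size_class(n_markers):
    """Map the number of contiguous markers in a CSAM to its size group."""
    if n_markers < 1:
        raise ValueError("a CSAM contains at least one marker")
    if n_markers == 1:
        return "singleton"
    if n_markers <= 4:
        return "small"        # 2-4 markers
    if n_markers <= 10:
        return "medium"       # 5-10 markers
    if n_markers <= 20:
        return "large"        # 11-20 markers
    return "very large"       # more than 20 markers


def tally_size_classes(marker_counts):
    """Count how many CSAMs fall into each size group."""
    return Counter(csam_size_class(n) for n in marker_counts)


# Hypothetical marker counts for a handful of CSAMs (illustrative only).
example_counts = [1, 1, 3, 7, 12, 25]
example_tally = tally_size_classes(example_counts)
```

With these illustrative inputs, `example_tally` holds two singletons and one CSAM in each of the other four groups.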

20.