Similar literature
 20 similar articles found
1.
Behura SK  Severson DW 《Gene》2012,504(2):226-232
We present a detailed genome-scale comparative analysis of simple sequence repeats within protein coding regions among 25 insect genomes. The repetitive sequences in the coding regions primarily represented single codon repeats and codon pair repeats. The CAG triplet is highly repetitive in the coding regions of insect genomes. It is frequently paired with the synonymous codon CAA to code for polyglutamine repeats. The codon pairs that are least repetitive code for polyalanine repeats. The frequency of hexanucleotide and dinucleotide motifs of codon pair repeats is significantly (p<0.001) different in the Drosophila species compared to the non-Drosophila species. However, the frequency of synonymous and non-synonymous codon pair repeats varies in a correlated manner (r² = 0.79) among all the species. Results further show that perfect and imperfect repeats have significant association with the trinucleotide and hexanucleotide coding repeats in most of these insects. However, only select species show significant association between the numbers of perfect/imperfect hexamers and repeats coding for single amino acid/amino acid pair runs. Our data further suggest that genes containing simple sequence coding repeats may be under negative selection as they tend to be poorly conserved across species. The sequences of coding repeats of orthologous genes vary according to the known phylogeny among the species. In conclusion, the study shows that simple sequence coding repeats are important features of genome diversity among insects.

2.
All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.

3.
Microsatellite polymorphisms are invaluable for mapping vertebrate genomes. In order to estimate the occurrence of microsatellites in the rabbit genome and to assess their feasibility as markers in rabbit genetics, a survey on the presence of all types of mononucleotide, dinucleotide, trinucleotide and tetranucleotide repeats, with a length of about 20 bp or more, was conducted by searching the published rabbit DNA sequences in the EMBL nucleotide database (version 32). A total of 181 rabbit microsatellites could be extracted from the present database. The estimated frequency of microsatellites in the rabbit genome was one microsatellite for every 2–3 kb of DNA. Dinucleotide repeats constituted the prevailing class of microsatellites, followed by trinucleotide, mononucleotide and tetranucleotide repeats, respectively. The average length of the microsatellites, as found in the database, was 26, 23, 23 and 22 bp for mono-, di-, tri- and tetranucleotide repeats, respectively. The most common repeat motif was AG, followed by A, AC, AGG and CCG. This group comprised about 70% of all extracted rabbit microsatellites. About 61% of the microsatellites were found in non-coding regions of genes, whereas 15% resided in (protein) coding regions. A significant fraction of rabbit microsatellites (about 22%) was found within interspersed repetitive DNA sequences.

4.
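Abundance of microsatellites in the complete genome and ESTs of the red flour beetle, Tribolium castaneum
Microsatellites are molecular markers that have been developed intensively in recent years. To advance genetic research on the red flour beetle, Tribolium castaneum (Herbst), simple sequence repeats with repeat units of 1 to 6 bases were analyzed in the complete genome and in ESTs, and the abundance and distribution of microsatellites in the two data sets were compared. In the ESTs, microsatellites occurred at a frequency of one per 0.87 kb; mononucleotide repeats were the most abundant class, accounting for 71.25%, while hexa-, tri-, tetra-, di-, and pentanucleotide repeats accounted for 23.93%, 2.94%, 1.56%, 0.17%, and 0.15%, respectively. In the complete genome, microsatellites occurred at a frequency of one per 3.65 kb; hexanucleotide repeats were the most abundant class, accounting for 61.96%, while tri-, tetra-, mono-, penta-, and dinucleotide repeats accounted for 14.35%, 13.75%, 4.68%, 3.60%, and 1.69%, respectively. Microsatellites rich in A and T predominated, whereas those rich in G and C were comparatively few. Further analysis showed that microsatellite abundance was highly similar across chromosomes.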

5.
Simple Sequence Repeats (SSRs) are known to be scattered and present in high number in eukaryotic genomes. We demonstrate that dye-labeled oligodeoxyribonucleotides with repeated mono-, di-, tri-, or tetranucleotide motifs (15-20 nucleotides in length) have an unexpected ability to recognize SSR target sequences in non-denatured chromosomes. The results show that all these probes are able to invade chromosomes, independent of the size of the repeat motif, their nucleotide sequence, or their ability to form alternative B-DNA structures such as triplex DNA. This novel and remarkable property of binding SSR oligonucleotides to duplex DNA targets permitted the development of a non-denaturing fluorescence in situ hybridization method that quickly and efficiently detects SSR-enriched chromosome regions in mitotic, meiotic, and polytene chromosome spreads of different model organisms. These results have implications for genome analysis and for investigating the roles of SSRs in chromosome structure and function.

6.
A cosmid library made from brown-headed cowbird (Molothrus ater) DNA was examined for representation of 17 distinct microsatellite motifs including all possible mono-, di-, and trinucleotide microsatellites, and the tetranucleotide repeat (GATA)n. The overall density of microsatellites within cowbird DNA was found to be one repeat per 89 kb and the frequency of the most abundant motif, (AGC)n, was once every 382 kb. The abundance of microsatellites within the cowbird genome is estimated to be reduced approximately 15-fold compared to humans. The reduced frequency of microsatellites seen in this study is consistent with previous observations indicating reduced numbers of microsatellites and other interspersed repeats in avian DNA. In addition to providing new information concerning the abundance of microsatellites within an avian genome, these results provide useful insights for selecting cloning strategies that might be used in the development of locus-specific microsatellite markers for avian studies.

7.
We report the results of a comprehensive search of Drosophila melanogaster DNA sequences in GenBank for di-, tri-, and tetranucleotide repeats of more than four repeat units, and a DNA library screen for dinucleotide repeats. Dinucleotide repeats are more abundant (66%) than tri- (30%) or tetranucleotide (4%) repeats. We estimate that 1917 dinucleotide repeats with 10 or more repeat units are present in the euchromatic D. melanogaster genome and, on average, they occur once every 60 kb. Relative to many other animals, dinucleotide repeats in D. melanogaster are short. Tri- and tetranucleotide repeats have even fewer repeat units on average than dinucleotide repeats. Our WorldWide Web site (http://www.bio.cornell.edu/genetics/aquadro/aquadro.html) posts the complete list of 1298 microsatellites (≥ five repeat units) identified from the GenBank search. We also summarize assay conditions for 70 D. melanogaster microsatellites characterized in previous studies and an additional 56 newly characterized markers.

8.
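Analysis of minisatellite repeats in the genome of the Chinese shrimp, Fenneropenaeus chinensis
Gao Huan  Kong Jie 《Acta Zoologica Sinica》2005,51(1):101-107
By sequencing random genomic DNA fragments of the Chinese shrimp, we obtained genomic DNA sequences totaling about 641,000 bases, in which 1,720 repeat sequences were found. Of these, 398 were minisatellites, accounting for 23.14% of all repeats. The repeat units of these minisatellites were 7 to 165 bases long and were concentrated in the range of 7 to 21 bases; repeats with a 12-base unit were the most numerous, with 58 sequences, accounting for 14.57% of all minisatellites. With respect to copy number, repeats consisting of two copies of the repeat unit were the most numerous (137), followed by repeats with three copies (122), and the number of repeats decreased as copy number increased. Some of these sequences are available in GenBank under accession numbers AY699072-AY699076. The 398 repeats were composed of 398 different repeat units, so the minisatellites are of many types. We tentatively divided them into three classes, composed of two, three, or four kinds of bases, and further divided each class into subclasses by ordering the bases in each repeat from most to least frequent. These classifications show that minisatellites in the Chinese shrimp genome are, overall, A/T-rich repeats with a certain "hierarchy", which reveals their relationship to microsatellites: some minisatellites may have originated from microsatellites.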

9.
In a recent study, we reported that the combined average mutation rate of 10 di-, 6 tri-, and 8 tetranucleotide repeats in Drosophila melanogaster was 6.3 × 10⁻⁶ mutations per locus per generation, a rate substantially below that of microsatellite repeat units in mammals studied to date (range = 10⁻² to 10⁻⁵ per locus per generation). To obtain a more precise estimate of mutation rate for dinucleotide repeat motifs alone, we assayed 39 new dinucleotide repeat microsatellite loci in the mutation accumulation lines from our earlier study. Our estimate of mutation rate for a total of 49 dinucleotide repeats is 9.3 × 10⁻⁶ per locus per generation, only slightly higher than the estimate from our earlier study. We also estimated the relative difference in microsatellite mutation rate among di-, tri-, and tetranucleotide repeats in the genome of D. melanogaster using a method based on population variation, and we found that tri- and tetranucleotide repeats mutate at rates 6.4 and 8.4 times slower than that of dinucleotide repeats, respectively. The slower mutation rates of tri- and tetranucleotide repeats appear to be associated with a relatively short repeat unit length of these repeat motifs in the genome of D. melanogaster. A positive correlation between repeat unit length and allelic variation suggests that mutation rate increases as the repeat unit lengths of microsatellites increase.

10.
11.
Simple sequence repeats (SSRs) are becoming standard DNA markers for plant genome analysis and are being used in marker-assisted breeding. Because of their significance, we initiated this study to analyze the complete genome of Arabidopsis thaliana for the prevalence of mono-, di-, tri-, tetra-, penta-, and hexamer repeats in the coding and non-coding regions of the chromosomes and to map their exact positions on the sequence. We developed a program that can search for a repeat of any length and report its exact position on the chromosome and its frequency of occurrence in the genome. Analysis of the results reveals that the largest number of repeats was found in chromosome 1, followed by chromosomes 2 and 4, whereas chromosomes 3 and 5 contain relatively fewer of these repeats. Among the SSRs, hexamers and dimers were predominant in the chromosomes. Overall, chromosome 5 has the smallest number of repeats. The abundance or rarity of various simple repeats in different chromosomes is not explained by the nucleotide composition of the sequence or by the potential of repeated motifs to form alternative DNA structures. This suggests that, in addition to the nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. The positional information is given at www.geocities.com/amubioinfo/ARD. This positional information can help Arabidopsis researchers identify new polymorphisms in chromosomal regions of interest based on the SSRs that map in the area.
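
The abstract above describes a program that reports each repeat, its position, and its frequency, but gives no implementation details. The following is only a minimal sketch of one way such a search could work, assuming perfect tandem repeats, repeat units of 1 to 6 bases, and a hypothetical threshold of at least three copies; the function name `find_ssrs` and all parameters are illustrative and are not taken from the paper.

```python
import re
from collections import Counter


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return sorted (position, unit, copies) tuples for perfect tandem repeats.

    Positions are 1-based. The thresholds are assumptions for illustration,
    not the values used in the study.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A unit of k bases followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves built from a shorter motif
            # (e.g. "TT" is just the mononucleotide "T" repeated).
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start() + 1, unit, len(m.group(0)) // k))
    return sorted(hits)


def motif_frequencies(hits):
    """Count how many repeat loci were found for each motif."""
    return Counter(unit for _, unit, _ in hits)


if __name__ == "__main__":
    example = "GGTCACACACATTTTTTGG"
    for position, unit, copies in find_ssrs(example):
        print(position, unit, copies)
```

On a real chromosome the same scan would be run per sequence record, and the hits could then be intersected with gene annotations to separate coding from non-coding repeats; that step is omitted here.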

12.
We fit a Markov chain model of microsatellite evolution introduced by Kruglyak et al. to data on all di-, tri-, and tetranucleotide repeats in the yeast genome. Our results suggest that many features of the distribution of abundance and length of microsatellites can be explained by this simple model, which incorporates a competition between slippage events and base pair substitutions, with no need to invoke selection or constraints on the lengths. Our results provide some new information on slippage rates for individual repeat motifs, which suggest that AT-rich trinucleotide repeats have higher slippage rates. As our model predicts, we found that many repeats were adjacent to shorter repeats of the same motif. However, we also found a significant tendency of microsatellites of different motifs to cluster.

13.
Simple sequence repeats (SSRs) are present abundantly in most eukaryotic genomes. They affect several cellular processes like chromatin organization, regulation of gene activity, DNA repair, DNA recombination, etc. Though considerable data exists on using nuclear SSRs to infer phylogenetic relationships, the potential of chloroplast microsatellites (cpSSR), in this regard, remains largely unexplored. In the present study we probe various nucleotide repeat motifs (NRMs) / types of SSRs present in chloroplast genomes (cpDNA) of 12 species belonging to Brassicaceae family. NRMs show a non-random distribution in coding and non-coding compartments of cpDNA. As expected, trinucleotide repeats are more common in coding regions while other repeat motifs are prominent in non-coding DNA. Total numbers of SSRs in coding region show little variation between species while considerable variation is exhibited by SSRs in non-coding regions. Finally, we have designed universal primers that yield polymorphic amplicons from all 12 species. Our analysis also suggests that amplicon length polymorphism shows no significant relationship with sequence based phylogeny of SSRs in cpDNA of Brassicaceae family.

14.
The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and minisatellite repeats largely follows taxonomic groupings. We found that only particular simple sequence repeat motifs are amplified in gymnosperm genomes, while others such as (CAC)5 and (GACA)4 are present in only low copy numbers. The variation in abundance of simple sequence motifs reflects a similar situation to that found in angiosperms. Species of the two- and three-needle pine section Pinus are relatively conserved and can be distinguished from Pinus strobus which belongs to the five-needle pine section Strobus. The hybridization pattern of Picea species, bald cypress and gingko were different from the patterns detected in the Pinus species. Furthermore, sequences with homology to the plant telomeric repeat (TTTAGGG)n have been analyzed in the same set of gymnosperms. Telomere-like repeats are highly amplified within two- and three-needle pine genomes, such as slash pine (Pinus elliottii Engelm. var. elliottii), compared to P. strobus, Picea species, bald cypress and gingko. P. elliottii var. elliottii was used as a representative species to investigate the chromosomal organization of telomere-like sequences by fluorescence in situ hybridization (FISH). The telomere-like sequences are not restricted to the ends of chromosomes; they form large intercalary and pericentric blocks showing that they are a repeated component of the slash pine genome. Conifers have genomes larger than 20000 Mbp, and our results clearly demonstrate that repeats of low sequence complexity, such as (CA)8, (GA)8, (GGAT)4 and (GATA)4, and minisatellite- and telomere-like sequences represent a large fraction of the repetitive DNA of these species.
The striking differences in abundance and genome organization of the various repeat motifs suggest that these repetitive sequences evolved differently in the gymnosperm genomes investigated. Received: 1 October 1999 / Accepted: 3 November 1999

15.
Recent studies have shown the non-random distribution of microsatellite motifs between genomic regions within a particular species. This study investigates such microsatellite distributions in the genome of the economically important abalone Haliotis midae, via a bioinformatic survey. In particular, the association of specific repeat motifs to coding regions and transposable elements is investigated. An understanding of microsatellite genomic distribution will facilitate more efficient use and development of this popular molecular marker. A bias toward di- and tetranucleotide repeats was found in the H. midae genome. CA microsatellite units were the most abundant repeat motif, but were notably underrepresented in genic regions where GAGT repeats predominate. Approximately 17.5% and 21% of the microsatellites showed gene and/or transposable element associations, respectively. This could explain the high genomic frequencies of particular motifs across the genome and may allude to a possible functional role. The data presented in this study are the first to demonstrate such non-random dispersal of microsatellites in abalone and support previous findings arguing in favor of non-random distribution of repeat motifs.

16.
We have isolated, characterized and mapped 33 dinucleotide, three trinucleotide and one tetranucleotide repeat loci from the four major chromosomes of Drosophila pseudoobscura. Average inferred repeat unit length of the dinucleotide repeats is 12 repeat units, similar to D. melanogaster. Assays of D. pseudoobscura and populations of its sibling species, D. persimilis, using 10 of these loci show extremely high levels of variation compared with similar studies of dinucleotide repeat variation in D. melanogaster populations. The high levels of variation are consistent with an average mutation rate of approximately 10⁻⁶ per locus per generation and an effective population size of D. pseudoobscura approximately four times larger than that of D. melanogaster. Consistent with allozymes and nucleotide sequence polymorphism, the dinucleotide repeat loci reveal minimal structure across four populations of D. pseudoobscura. Finally, our preliminary recombinational mapping of 24 of these microsatellites suggests that the total recombinational genome size may be larger than previously inferred using morphological mutant markers.

17.
Oligonucleotide usage in archaeal and bacterial genomes can be linked to a number of properties, including codon usage (trinucleotides), DNA base-stacking energy (dinucleotides), and DNA structural conformation (di- to tetranucleotides). We wanted to assess the statistical information potential of different DNA ‘word-sizes’ and explore how oligonucleotide frequencies differ in coding and non-coding regions. In addition, we used oligonucleotide frequencies to investigate DNA composition and how DNA sequence patterns change within and between prokaryotic organisms. Among the results found was that prokaryotic chromosomes can be described by hexanucleotide frequencies, suggesting that prokaryotic DNA is predominantly short range correlated, i.e., information in prokaryotic genomes is encoded in short oligonucleotides. Oligonucleotide usage varied more within AT-rich and host-associated genomes than in GC-rich and free-living genomes, and this variation was mainly located in non-coding regions. Bias (selectional pressure) in tetranucleotide usage correlated with GC content, and coding regions were more biased than non-coding regions. Non-coding regions were also found to be approximately 5.5% more AT-rich than coding regions, on average, in the 402 chromosomes examined. Pronounced DNA compositional differences were found both within and between AT-rich and GC-rich genomes. GC-rich genomes were more similar and biased in terms of tetranucleotide usage in non-coding regions than AT-rich genomes. The differences found between AT-rich and GC-rich genomes may possibly be attributed to lifestyle, since tetranucleotide usage within host-associated bacteria was, on average, more dissimilar and less biased than free-living archaea and bacteria.

18.
I have examined potential determinants of the asymmetric distribution of nucleotide sequences in the genome of Escherichia coli as cataloged in GenBank release 44. I have used the frequency of occurrence of all possible tetranucleotides in a given sequence catalog or derivative as a comparative measure of asymmetry. The GenBank-cataloged strand and its complement show statistically similar (not complementary) distributions. The distribution is statistically similar in comparisons between the protein coding subset and the total genome, the coding subset and selected non-coding genes, the coding subset and the remainder of the DNA, and the coding subset and stable RNA sequences. I have compared the distribution in the genome of E. coli with the distributions found in the cataloged genomes of Salmonella typhimurium, Bacillus subtilis, and of coliphages lambda and T7. The distribution summed in both strands of the cataloged DNA differs statistically only in comparisons with lytic bacteriophage T7 because only the two strands of T7 show statistically dissimilar distributions. Despite similarities in tetranucleotide distribution, the pattern of codon complementarity in B. subtilis is different than that documented for E. coli. Thus, sequence asymmetry does not seem related to specific DNA function or to documented similarities or differences in codon bias. The sequence asymmetry of the E. coli genome may thus reflect a hitherto unsuspected pattern impressed on both strands of DNA which is or can be packaged into bacterial genomes.
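
The abstract above compares tetranucleotide frequency distributions between a cataloged strand and its complement without stating which statistic was used. As a rough illustration only, the sketch below counts overlapping 4-mers on a sequence and on its reverse complement and compares the two distributions with the total variation distance; that distance is an assumption chosen here for simplicity, and the function names are hypothetical rather than drawn from the paper.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_frequencies(seq, k=4):
    """Relative frequencies of overlapping k-mers, ignoring ambiguous bases."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Keep only words made entirely of A, C, G, T.
    counts = {w: c for w, c in counts.items() if set(w) <= set("ACGT")}
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def frequency_distance(f, g):
    """Total variation distance between two frequency tables (0 = identical)."""
    words = set(f) | set(g)
    return sum(abs(f.get(w, 0.0) - g.get(w, 0.0)) for w in words) / 2


if __name__ == "__main__":
    strand = "ATGGCGTACGCTTAGGCTAACGTTAGC"
    forward = kmer_frequencies(strand)
    reverse = kmer_frequencies(reverse_complement(strand))
    print(round(frequency_distance(forward, reverse), 3))
```

A distance near zero between the two strands would correspond to the "statistically similar" distributions the abstract reports for E. coli, whereas a formal significance test would need a proper null model, which this sketch does not attempt.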

19.
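Distribution patterns of microsatellites in the complete genome of the red jungle fowl, Gallus gallus
This paper analyzes the number and distribution of microsatellites in the complete genome of the red jungle fowl, Gallus gallus. A total of 282,728 microsatellites with repeat units of 1 to 6 bases were found, accounting for about 0.49% of the complete genome sequence (1.1 Gb) and occurring at a frequency of one per 3.89 kb; microsatellite lengths fell mainly in the range of 12 to 70 bases. Chromosomes 1, 2, and 3 had relatively high microsatellite frequencies, whereas no microsatellites were found on chromosome 32. Among the repeat classes, mononucleotide repeats were the most numerous, with 184,192 sequences (65.1% of the total), followed by tetra-, di-, tri-, penta-, and hexanucleotide repeats, which accounted for 12.8%, 9.7%, 7.2%, 4.6%, and 0.8% of the total, respectively. T, A, AT, GTTT, AAAC, G, C, ATTT, AC, GT, AAAT, ATT, AAC, AAT, GTT, AG, CT, CTTT, AAAG, GTTTT, AAACA, AAGG, and CCTT are the predominant microsatellite repeat types in the red jungle fowl genome. This study lays a foundation for the isolation and screening of microsatellite markers in the red jungle fowl, for studies of genetic diversity, and for comparative analyses of microsatellites across species.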

20.
Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed the genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus×domestica) genomes showed mei to have the highest density of SSRs, slightly higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, depending on which the macro-colinearity between the mei genome and Prunus T×E reference map was identified. The framework map of mei constructed provides a first step into subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species.
