1.

Survey of simple sequence repeats in completed fungal genomes (Total citations: 7; self-citations: 0; citations by others: 7)

Karaoglu H Lee CM Meyer W 《Molecular biology and evolution》2005,22(3):639-649

The use of simple sequence repeats (SSRs), or microsatellites, as genetic markers has become very popular because of their abundance and their length variation between individuals. SSRs are tandem repeat units of 1 to 6 base pairs that are found abundantly in many prokaryotic and eukaryotic genomes. This is the first study examining and comparing SSRs in completely sequenced fungal genomes. We analyzed and compared the occurrences, relative abundance, relative density, most common, and longest SSRs in nine taxonomically different fungal species: Aspergillus nidulans, Cryptococcus neoformans, Encephalitozoon cuniculi, Fusarium graminearum, Magnaporthe grisea, Neurospora crassa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Ustilago maydis. Our analysis revealed that, in all of the genomes studied, the occurrence, abundance, and relative density of SSRs varied and were not influenced by genome size. No correlation between relative abundance and genome size was observed, but N. crassa, the largest genome analyzed, had the highest relative abundance of SSRs. In most genomes, mononucleotide, dinucleotide, and trinucleotide repeats were more abundant than SSRs with longer repeat units. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit length increased. Furthermore, each organism had its own most common and longest SSRs. Our analysis showed that the relative abundance of SSRs in fungi is low compared with the human genome and that longer SSRs in fungi are rare. In addition to providing new information on the abundance of SSRs for each of these fungi, the results provide a general source of molecular markers that could be useful for a variety of applications such as population genetics and strain identification of fungal organisms.
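
The metrics named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes perfect SSRs are located with a regular expression, that relative abundance means SSRs per kb of sequence, and that relative density means SSR base pairs per kb of sequence. The minimum repeat counts and all function names are illustrative assumptions.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 bp); not taken from the paper.
MIN_REPEATS = {1: 10, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A unit followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are copies of a shorter unit (e.g. "AA", "ACAC").
            if any(unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len)):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return hits


def relative_abundance(hits, seq_len_bp):
    """SSRs per kb of analyzed sequence (assumed definition)."""
    return len(hits) / (seq_len_bp / 1000)


def relative_density(hits, seq_len_bp):
    """Base pairs covered by SSRs per kb of analyzed sequence (assumed definition)."""
    ssr_bp = sum(len(motif) * count for _, motif, count in hits)
    return ssr_bp / (seq_len_bp / 1000)
```

Because both metrics are normalized by sequence length, they can be compared across genomes of different sizes, which is the comparison the abstract reports.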

2.

Simple sequence repeats in different genome sequences of Shigella and comparison with high GC and AT-rich genomes.

Ashraf Hosseini Suvidya H Ranade Indira Ghosh Pramod Khandekar 《DNA sequence》2008,19(3):167-176

Simple sequence repeats (SSRs) are omnipresent in prokaryotes and eukaryotes and are found throughout the genome in both protein-coding and noncoding regions. In the present study, the whole-genome sequences of seven chromosomes (Shigella flexneri 2a str. 301 and 2457T, Shigella sonnei, Escherichia coli K12, Mycobacterium tuberculosis, Mycobacterium leprae and Staphylococcus saprophyticus) were downloaded from the GenBank database to identify the abundance, distribution and composition of SSRs, and to determine the difference between tandem repeats in the real genomes and in randomized genomes (generated with a sequence shuffling tool) of the organisms included in this study. The data obtained show that: (i) tandem repeats are widely distributed throughout the genomes; (ii) SSRs are differentially distributed among coding and noncoding regions in the investigated Shigella genomes; (iii) the total frequency of SSRs is higher in noncoding regions than in coding regions; (iv) in all investigated chromosomes, the ratio of trinucleotide SSRs is much higher in real genomes than in randomized genomes, while that of dinucleotide SSRs is lower; (v) the ratio of total and mononucleotide SSRs is higher in real genomes than in randomized genomes in E. coli K12, S. flexneri str. 301 and S. saprophyticus, lower in S. flexneri str. 2457T, S. sonnei and M. tuberculosis, and approximately the same in M. leprae; (vi) the frequency of codon repetitions varies considerably depending on the type of encoded amino acid.
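
The real-versus-randomized comparison described here can be sketched as follows. This is an assumption-laden illustration, not the shuffling tool the authors used: it shuffles individual bases (which preserves base composition only), and the function names, repeat threshold and number of shuffles are hypothetical.

```python
import random
import re


def count_tandem_repeats(seq, unit_len, min_repeats):
    """Count non-overlapping perfect tandem repeats with the given unit length.

    Note: for unit_len > 1 this also counts homopolymer runs (e.g. "AAA" x 4).
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    return sum(1 for _ in pattern.finditer(seq.upper()))


def shuffled_copy(seq, seed=0):
    """Return a base-shuffled copy of seq with identical base composition."""
    rng = random.Random(seed)
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def real_vs_shuffled(seq, unit_len, min_repeats, n_shuffles=20, seed=0):
    """Return (count in real sequence, mean count over shuffled copies)."""
    real = count_tandem_repeats(seq, unit_len, min_repeats)
    shuffled = [
        count_tandem_repeats(shuffled_copy(seq, seed + i), unit_len, min_repeats)
        for i in range(n_shuffles)
    ]
    return real, sum(shuffled) / n_shuffles
```

A real count well above the shuffled mean suggests that repeats of that unit length occur more often than base composition alone would predict; a count below it suggests depletion.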

3.

Incidence, complexity and diversity of simple sequence repeats across potexvirus genomes

Chaudhary Mashhood Alam Avadhesh Kumar Singh Choudhary Sharfuddin Safdar Ali 《Gene》2014

An in silico analysis of simple sequence repeats (SSRs) in the genomes of 32 species of potexviruses was performed, in which a total of 691 SSRs and 33 cSSRs were observed. Though SSRs were present in all the studied genomes, their incidence ranged from 11 to 30 per genome. Further, 10 potexvirus genomes possessed no cSSRs when extracted at a dMAX of 10 and, where present, the highest frequency was 3. SSR and cSSR incidence, relative density and relative abundance were not significantly correlated with genome size or GC content, suggesting an ongoing evolutionary and adaptive phase of the virus species. The SSRs present primarily ranged from mono- to trinucleotide repeat motifs, with a greatly skewed distribution across the coding and non-coding regions. The present work contributes to the ongoing compilation and analysis of the incidence, distribution and variation of viral repeat sequences in order to understand their evolutionary and functional relevance.

4.

Genome-Wide Comparative Analyses of Microsatellites in Papaya

Jianping Wang Cuixia Chen Jong-Kuk Na Qingyi Yu Shaobin Hou Robert E. Paull Paul H. Moore Maqsudul Alam Ray Ming 《Tropical plant biology》2008,1(3-4):278-292

Microsatellites, or simple sequence repeats (SSRs), are highly polymorphic and universally distributed in eukaryotes. SSRs have been used extensively as sequence-tagged markers in genetic studies. Recently, the functional and evolutionary importance of SSRs has received considerable attention. Here we report the mining and characterization of the SSRs in the papaya genome. We analyzed SSRs from 277.4 Mb of whole genome shotgun (WGS) sequences, 51.2 Mb of bacterial artificial chromosome (BAC) end sequences (BES), and 13.4 Mb of expressed sequence tag (EST) sequences. The papaya SSR density was one SSR per 0.7 kb of DNA sequence in the WGS, which was higher than that in BES and EST sequences. SSR abundance was dramatically reduced as the repeat length increased. According to SSR motif length, dinucleotide repeats were the most common motif in class I, whereas hexanucleotides were the most copious in class II SSRs. The tri- and hexanucleotide repeats of both classes were more frequent in EST sequences than in genomic sequences. In class I SSRs, AT and AAT were the most frequent motifs in BES and WGS sequences. By contrast, AG and AAG were the most abundant in EST sequences. For SSR marker development, 9,860 primer pairs were surveyed for amplification and polymorphism. Successful amplification and polymorphic rates were 66.6% and 17.6%, respectively. The highest polymorphic rates were achieved by AT, AG, and ATG motifs. The genome-wide analysis of microsatellites revealed their frequency and distribution in the papaya genome, which vary among plant genomes. This complete set of SSR markers throughout the genome will assist diverse genetic studies in papaya and related species.

5.

Characterization of Mononucleotide Repeats in Sequenced Prokaryotic Genomes

Coenye Tom; Vandamme Peter 《DNA research》2005,12(4):221-233

The increasing availability of prokaryotic genome sequences has shown that simple sequence repeats (SSRs) are widespread in prokaryotes and that there is extensive variation in their length, number and distribution. Considering their potential importance in generating genomic diversity, we determined the distribution of a specific group of SSRs, mononucleotide repeats of size between 5 and 13 nt, in 157 sequenced prokaryotic genomes. The data obtained in the present study show that (i) a large number of mononucleotide SSRs is present in all prokaryotic genomes investigated, (ii) shorter repeats are much more abundant than longer repeats, and (iii) in the majority of the genomes, longer mononucleotide SSRs are excluded from coding regions although we identified several organisms where mononucleotide SSRs are not excluded from the coding regions. We also observed that some genomes contain more mononucleotide SSRs than expected, while others contain significantly less. Bacterial genomes that contain much less mononucleotide SSRs than expected are generally larger and more GC-rich, while bacterial genomes that contain much more mononucleotide SSRs than expected are in general smaller and more AT-rich. Finally, we also noted that genomes that contain a high fraction of horizontally transferred genes have a lower mononucleotide SSR density and that A and T are generally overrepresented in mononucleotide SSRs.

6.

Map and analysis of microsatellites in the genome of Populus: The first sequenced perennial plant

Li ShuXian Yin TongMing 《Science in China Series C (English Edition)》2007,50(5):690-699

We mapped and analyzed the microsatellites throughout 284295605 base pairs of the unambiguously assembled sequence scaffolds along the 19 chromosomes of the haploid poplar genome. In total, we found 150985 SSRs with repeat unit lengths between 2 and 5 bp. The established microsatellite physical map demonstrated that SSRs were distributed relatively evenly across the genome of Populus. On average, these SSRs occurred every 1883 bp within the poplar genome, and the SSR densities in intergenic regions, introns, exons and UTRs were 85.4%, 10.7%, 2.7% and 1.2%, respectively. We took di-, tri-, tetra- and pentamers as the four classes of repeat units and found that the density of each class of SSRs decreased with repeat unit length, except for the tetranucleotide repeats. Notably, the length diversification of microsatellite sequences was negatively correlated with their repeat unit length, and SSRs with shorter repeat units gained repeats faster than SSRs with longer repeat units. We also found that the GC content of the poplar sequence correlated significantly with the densities of SSRs with odd repeat unit lengths (tri- and penta-), but had no significant correlation with the densities of SSRs with even repeat unit lengths (di- and tetra-). In the poplar genome, there was evidence that the occurrence of different microsatellites was under selection, and the GC content of SSR sequences was found to relate significantly to the functional importance of microsatellites.
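
The average spacing quoted in this abstract follows directly from the two totals it reports. A short check, using only figures from the abstract (the function and variable names are illustrative):

```python
def mean_spacing_bp(total_bp, ssr_count):
    """Average number of base pairs of sequence per SSR."""
    return total_bp / ssr_count


# Totals reported in the abstract.
spacing = mean_spacing_bp(284_295_605, 150_985)

# Percentages reported for intergenic regions, introns, exons and UTRs.
region_shares = {"intergenic": 85.4, "intron": 10.7, "exon": 2.7, "UTR": 1.2}

print(round(spacing))                         # 1883
print(round(sum(region_shares.values()), 1))  # 100.0
```

The four regional percentages sum to 100, which suggests they describe how the SSRs are apportioned among region types rather than a per-kb density within each region.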

7.

In silico analysis of SSRs in mitochondrial genomes of plants

Kuntal H Sharma V 《Omics : a journal of integrative biology》2011,15(11):783-789

Simple sequence repeats (SSRs) or microsatellites constitute a countable portion of genomes. However, the significance of SSRs in organelle genomes has not been completely understood. The availability of organelle genome sequences allows us to understand the organization of SSRs in their genic and intergenic regions. In the current study we surveyed the patterns of SSRs in mitochondrial genomes of different plant taxa. A total of 16 mitochondrial genomes, from algae to angiosperms, were considered in order to analyze the pattern of simple sequence repeats present in them. In this study, mononucleotide repeats of A/T were found to be more prevalent in mitochondrial genomes than other repeat types. The dinucleotide repeats TA/AT were the second most numerous, whereas tri-, tetra-, and pentanucleotide repeats were fewer and present in intronic or intergenic portions only. Mononucleotide repeats prevailed in protein-coding exonic portions of all organisms. These results indicate that the microsatellite pattern in mitochondrial genomes differs from that in nuclear genomes, and they highlight the organization and diversity of SSR loci in mitochondrial genomes. This is a novel report of microsatellite polymorphism in plant mitochondria at the whole-genome level.

8.

Comparison of simple sequence repeats in 19 Archaea

Trivedi S 《Genetics and molecular research : GMR》2006,5(4):741-772

All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.

9.

Microsatellites in different Potyvirus genomes: survey and analysis (Total citations: 2; self-citations: 0; citations by others: 2)

Zhao X Tan Z Feng H Yang R Li M Jiang J Shen G Yu R 《Gene》2011,488(1-2):52-56

Simple sequence repeats (SSRs) have been used extensively for various genetic and evolutionary studies in eukaryotic and prokaryotic organisms, while few such studies have been carried out in viruses. The genus Potyvirus is a good system in which to study the roles and evolution of SSRs in viruses. The densities, relative abundances, compositions and evolutionary inferences of SSRs in 45 different Potyvirus genomes were analyzed in this study. The results showed that the densities and relative abundances of SSRs are similar across all of these Potyvirus genomes. The number of SSRs decreases as the length of the repeat unit increases. Dinucleotide repeats are the most abundant, followed by trinucleotide repeats, and the numbers of tetra-, penta- and hexanucleotide repeats are very small. Repeats of AC/CA, AG/GA and AAG/GAA predominate, whereas repeats of CG/GC, ATA and CAC are rare. The genome sizes of the Potyvirus species have little influence on the total number and relative abundance of SSRs. Our study suggests that the variety of SSRs may be related to the genome diversity of Potyvirus. Potyvirus and HIV genomes may share a similar mode and a parallel level of evolution.
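
Pairs such as AC/CA and AAG/GAA in this abstract are the same tandem repeat read from a different starting position. One common way to tally them together is to reduce every motif to a canonical rotation. The sketch below is an illustration of that idea under the assumption that only rotations are merged (reverse complements are not), and the function names are hypothetical; the abstract does not state how the authors grouped motifs.

```python
from collections import Counter


def canonical_motif(motif):
    """Return the alphabetically smallest rotation of a repeat motif."""
    motif = motif.upper()
    rotations = [motif[i:] + motif[:i] for i in range(len(motif))]
    return min(rotations)


def tally_motifs(motifs):
    """Count motifs after merging rotations of the same repeat unit."""
    return Counter(canonical_motif(m) for m in motifs)


counts = tally_motifs(["AC", "CA", "GA", "AAG", "GAA", "AGA"])
print(counts["AC"], counts["AG"], counts["AAG"])  # 2 1 3
```

Merging rotations avoids splitting one repeat type across several table rows, which would otherwise understate how strongly a motif predominates.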

10.

Long perfect dinucleotide repeats are typical of vertebrates, show motif preferences and size convergence

Almeida P Penha-Gonçalves C 《Molecular biology and evolution》2004,21(7):1226-1233

Microsatellites are simple sequence repeats (SSRs) showing complex patterns of length, motif sizes, motif sequences, and repeat perfection. We studied the structure of the dinucleotide SSR population at the genome level by analyzing assembled DNA sequence across species. Three dinucleotide populations were distinguished when SSR genome frequency was analyzed as a function of repeat length and repeat perfection. A population of low-perfection SSRs was identified, which is constituted by short repeats and represents the vast majority of genomic dinucleotide SSRs across eukaryotic genomes. In turn, the highly perfect repeats are 30 to 50 times less frequent and, in addition to short repeats, also contain a long repeat population that is uniquely represented in vertebrate species. Distinctive features of this population include the modal peak in the frequency distribution of repeat length and the strong preferential usage of the repeat motifs AC and AG. These results raise the hypothesis that the ability to carry a distinct population of long, highly perfect dinucleotide repeats in the genome is a late acquisition in chordate evolution. Our analysis also suggests that different dinucleotide repeat populations have different dynamics and are likely to be underlain by different molecular mechanisms of generation and maintenance in the genome. Thus, these observations imply that caution should be taken in extrapolating results from studies on SSR mutability and on SSR phylogenetic comparisons that do not take into account the stratification of dinucleotide populations in the eukaryotic genome.

11.

Map and analysis of microsatellites in the genome of Populus: The first sequenced perennial plant

LI ShuXian & YIN TongMing 《Science China Life Sciences (English Edition)》2007,50(5):690-699

Environmental Sciences Division, Oak Ridge National Laboratory, TN, USA. We mapped and analyzed the microsatellites throughout 284295605 base pairs of the unambiguously assembled sequence scaffolds along the 19 chromosomes of the haploid poplar genome. In total, we found 150985 SSRs with repeat unit lengths between 2 and 5 bp. The established microsatellite physical map demonstrated that SSRs were distributed relatively evenly across the genome of Populus. On average, these SSRs occurred every 1883 bp within the poplar genome, and the SSR densities in intergenic regions, introns, exons and UTRs were 85.4%, 10.7%, 2.7% and 1.2%, respectively. We took di-, tri-, tetra- and pentamers as the four classes of repeat units and found that the density of each class of SSRs decreased with repeat unit length, except for the tetranucleotide repeats. Notably, the length diversification of microsatellite sequences was negatively correlated with their repeat unit length, and SSRs with shorter repeat units gained repeats faster than SSRs with longer repeat units. We also found that the GC content of the poplar sequence correlated significantly with the densities of SSRs with odd repeat unit lengths (tri- and penta-), but had no significant correlation with the densities of SSRs with even repeat unit lengths (di- and tetra-). In the poplar genome, there was evidence that the occurrence of different microsatellites was under selection, and the GC content of SSR sequences was found to relate significantly to the functional importance of microsatellites.

12.

An annotated catalogue of salivary gland transcripts in the adult female mosquito, Ædes ægypti*

José MC Ribeiro Bruno Arcà Fabrizio Lombardo Eric Calvo My Van Phan Prafulla K Chandra Stephen K Wikel 《BMC genomics》2007,8(1):1-27

Background

The number of completely sequenced plastid genomes available is growing rapidly. This array of sequences presents new opportunities to perform comparative analyses. In comparative studies, it is often useful to compare across wide phylogenetic spans and, within angiosperms, to include representatives from basally diverging lineages such as the genomes reported here: Nuphar advena (from a basal-most lineage) and Ranunculus macranthus (a basal eudicot). We report these two new plastid genome sequences and make comparisons (within angiosperms, seed plants, or all photosynthetic lineages) to evaluate features such as the status of ycf15 and ycf68 as protein coding genes, the distribution of simple sequence repeats (SSRs) and longer dispersed repeats (SDR), and patterns of nucleotide composition.

Results

The Nuphar [GenBank:NC_008788] and Ranunculus [GenBank:NC_008796] plastid genomes share characteristics of gene content and organization with many other chloroplast genomes. Like other plastid genomes, these genomes are A+T-rich, except for rRNA and tRNA genes. Detailed comparisons of Nuphar with Nymphaea, another member of the Nymphaeaceae, show that more than two-thirds of these genomes exhibit at least 95% sequence identity and that most SSRs are shared. In broader comparisons, SSRs vary among genomes in terms of abundance and length, and most contain repeat motifs based on A and T nucleotides.

Conclusion

SSR and SDR abundance varies by genome and, for SSRs, is proportional to genome size. Long SDRs are rare in the genomes assessed. SSRs occur less frequently than predicted and, although the majority of the repeat motifs do include A and T nucleotides, the A+T bias in SSRs is less than that predicted from the underlying genomic nucleotide composition. In codon usage, third positions show an A+T bias; however, variation in codon usage does not correlate with differences in A+T-richness. Thus, although plastome nucleotide composition shows "A+T richness", an A+T bias is not apparent upon more in-depth analysis, at least in these aspects. The pattern of evolution in the sequences identified as ycf15 and ycf68 is not consistent with them being protein-coding genes. In fact, these regions show no evidence of sequence conservation beyond what is normal for non-coding regions of the IR.

13.

The Complete Chloroplast Genome Sequence of the Medicinal Plant Salvia miltiorrhiza

Jun Qian Jingyuan Song Huanhuan Gao Yingjie Zhu Jiang Xu Xiaohui Pang Hui Yao Chao Sun Xian’en Li Chuyuan Li Juyan Liu Haibin Xu Shilin Chen 《PloS one》2013,8(2)

Salvia miltiorrhiza is an important medicinal plant with great economic and medicinal value. The complete chloroplast (cp) genome sequence of Salvia miltiorrhiza, the first sequenced member of the Lamiaceae family, is reported here. The genome is 151,328 bp in length and exhibits a typical quadripartite structure of the large (LSC, 82,695 bp) and small (SSC, 17,555 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,539 bp). It contains 114 unique genes, including 80 protein-coding genes, 30 tRNAs and four rRNAs. The genome structure, gene order, GC content and codon usage are similar to the typical angiosperm cp genomes. Four forward, three inverted and seven tandem repeats were detected in the Salvia miltiorrhiza cp genome. Simple sequence repeat (SSR) analysis among the 30 asterid cp genomes revealed that most SSRs are AT-rich, which contribute to the overall AT richness of these cp genomes. Additionally, fewer SSRs are distributed in the protein-coding sequences compared to the non-coding regions, indicating an uneven distribution of SSRs within the cp genomes. Entire cp genome comparison of Salvia miltiorrhiza and three other Lamiales cp genomes showed a high degree of sequence similarity and a relatively high divergence of intergenic spacers. Sequence divergence analysis discovered the ten most divergent and ten most conserved genes as well as their length variation, which will be helpful for phylogenetic studies in asterids. Our analysis also supports that both regional and functional constraints affect gene sequence evolution. Further, phylogenetic analysis demonstrated a sister relationship between Salvia miltiorrhiza and Sesamum indicum. The complete cp genome sequence of Salvia miltiorrhiza reported in this paper will facilitate population, phylogenetic and cp genetic engineering studies of this medicinal plant.
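
The region lengths in this abstract are internally consistent: in a quadripartite chloroplast genome the inverted repeat is present twice, so the total is LSC + SSC + 2 × IR. A quick check using only the reported figures (the function name is illustrative):

```python
def quadripartite_length(lsc_bp, ssc_bp, ir_bp):
    """Total genome length when the inverted repeat (IR) occurs in two copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


total_bp = quadripartite_length(82_695, 17_555, 25_539)
unique_genes = 80 + 30 + 4  # protein-coding + tRNA + rRNA genes

print(total_bp)      # 151328
print(unique_genes)  # 114
```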

14.

Mono-nucleotide repeats (MNRs): a neglected polymorphism for generating high density genetic maps in silico

Cohen H Danin-Poleg Y Cohen CJ Sprecher E Darvasi A Kashi Y 《Human genetics》2004,115(3):213-220

Short, tandemly repeated DNA motifs, termed SSRs (simple sequence repeats), are widely distributed throughout eukaryotic genomes and exhibit a high degree of polymorphism. The availability of size-based methods for genotyping SSRs has made them the markers of choice for genetic linkage studies in all higher eukaryotes. These genotyping methods are not efficiently applicable to mononucleotide repeats (MNRs). Consequently, MNRs, although highly frequent in the genome, have generally been ignored as genetic markers. In contrast to single nucleotide polymorphisms (SNPs), SSRs can be identified in silico once the genomic sequence or segment of interest is available, without requiring any additional information. This makes possible ad hoc saturation of a target chromosomal region with informative markers. In this context, MNRs appear to have much to offer by increasing the degree of marker saturation that can be obtained. Using the human genome sequence as a model, computational analysis demonstrates that MNRs 9–15 bp in size are highly abundant, appearing on average every 2.9 kb and exceeding the frequencies of di- and trinucleotide SSRs by two- and five-fold, respectively. To enable practical, high-throughput MNR genotyping, a rapid method was developed, based on sizing of fluorescent-labeled primer extension products. Genotyping of 16 arbitrarily chosen non-coding MNR sites along human chromosome 22 revealed that almost two-thirds (63%) of them were polymorphic, having 2–5 alleles per locus, with 20% of the polymorphic MNRs having more than two alleles. Thus, MNRs have potential for in silico saturation of sequenced eukaryote genomes with informative genetic markers. Helit Cohen and Yael Danin-Poleg contributed equally to this work.
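
The in silico identification step mentioned here can be sketched for mononucleotide repeats. The following is a minimal illustration, not the authors' software: it assumes an MNR is a maximal run of a single base and uses the 9–15 bp window from the abstract as default bounds; the function name is hypothetical.

```python
import re


def find_mnrs(seq, min_len=9, max_len=15):
    """Return (start, base, run_length) for maximal single-base runs in range."""
    hits = []
    # Each alternative matches one maximal homopolymer run.
    for m in re.finditer(r"A+|C+|G+|T+", seq.upper()):
        run_length = m.end() - m.start()
        if min_len <= run_length <= max_len:
            hits.append((m.start(), m.group(0)[0], run_length))
    return hits
```

Measuring maximal runs, rather than matching any 9–15 bp window, keeps a 16 bp tract from being reported as a 15 bp MNR.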

15.

Simple sequence repeats in organellar genomes of rice: frequency and distribution in genic and intergenic regions

Rajendrakumar P Biswal AK Balachandran SM Srinivasarao K Sundaram RM 《Bioinformatics (Oxford, England)》2007,23(1):1-4

MOTIVATION: Simple sequence repeats (SSRs) are abundant across genomes. However, the significance of SSRs in organellar genomes of rice has not been completely understood. The availability of organellar genome sequences allows us to understand the organization of SSRs in their genic and intergenic regions. RESULTS: We have analyzed SSRs in mitochondrial and chloroplast genomes of rice. We identified 2528 SSRs in the mitochondrial genome and an average of 870 SSRs in the chloroplast genomes. About 8.7% of the mitochondrial and 27.5% of the chloroplast SSRs were observed in the genic region. Dinucleotides were the most abundant repeats in genic and intergenic regions of the mitochondrial genome, while mononucleotides were predominant in the chloroplast genomes. The rps and nad gene clusters of mitochondria had the maximum repeats, while the rpo and ndh gene clusters of chloroplast had the maximum repeats. We identified SSRs in both organellar genomes and validated them in different cultivars and species.

16.

Analysis of simple sequence repeats (SSRs) dynamics in fungus Fusarium graminearum (Total citations: 1; self-citations: 0; citations by others: 1)

Singh R Sheoran S Sharma P Chatrath R 《Bioinformation》2011,5(10):402-404

The abundance of simple sequence repeats (SSRs), or microsatellites, and their inherent potential for variation have made them a valuable source of genetic markers in eukaryotes. We describe the organization and abundance of SSRs in the fungus Fusarium graminearum (the causative agent of Fusarium head blight, or head scab, of wheat). We identified 1705 SSRs of various nucleotide repeat motifs in the sequence database of F. graminearum. Mononucleotide repeats (62%) were the most abundant, followed by di- (20%) and trinucleotide repeats (14%). Tetra-, penta- and hexanucleotide repeats accounted for only 4% of SSRs. The estimated frequency of Class I SSRs (perfect repeats ≥20 nucleotides) was one SSR per 124.5 kb, whereas the frequency of Class II SSRs (perfect repeats >10 nucleotides and <20 nucleotides) was one SSR per 25.6 kb. The dynamics of SSRs will be a powerful tool for taxonomic, phylogenetic, genome mapping and population genetic studies, as SSR-based markers show high levels of allelic variation, codominant inheritance and ease of analysis.
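
The two SSR classes in this abstract are defined purely by tract length, which makes the classification easy to state as code. The sketch below assumes Class I means perfect repeats of at least 20 nucleotides and Class II means perfect repeats longer than 10 but shorter than 20 nucleotides; the function name is illustrative.

```python
def ssr_class(tract_length_nt):
    """Classify a perfect SSR tract by total length in nucleotides.

    Assumed thresholds: Class I is >= 20 nt, Class II is > 10 and < 20 nt.
    Shorter tracts fall outside both classes and return None.
    """
    if tract_length_nt >= 20:
        return "I"
    if tract_length_nt > 10:
        return "II"
    return None


# Example: an (AT)x12 tract is 24 nt long, an (AAG)x5 tract is 15 nt long.
print(ssr_class(2 * 12), ssr_class(3 * 5))  # I II
```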

17.

Simple sequence repeats in mycobacterial genomes

Sreenu VB Kumar P Nagaraju J Nagarajam HA 《Journal of biosciences》2007,32(1):3-15

Simple sequence repeats (SSRs) or microsatellites are the repetitive nucleotide sequences of motifs of length 1–6 bp. They are scattered throughout the genomes of all the known organisms ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (INDELS) of their repeat units with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity due to microsatellite mutations they have in-built mechanisms to arrest undue expansions of microsatellites and one such mechanism is constituted by post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for distribution and abundance of microsatellites tracts and to look for potentially polymorphic microsatellites. Available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv) were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences although influence of GC-content and role of point mutations in arresting microsatellite expansions can not be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes. 
Supplementary data pertaining to this article are available on the Journal of Biosciences website.

18.

Genome wide survey of microsatellites in ssDNA viruses infecting vertebrates

Ankit Jain Nikhil Mittal Prakash C. Sharma 《Gene》2014

Microsatellites or Simple Sequence Repeats (SSRs) are tandem iterations of one to six base pairs, non-randomly distributed throughout prokaryotic and eukaryotic genomes. Limited knowledge is available about distribution of microsatellites in single stranded DNA (ssDNA) viruses, particularly vertebrate infecting viruses. We studied microsatellite distribution in 118 ssDNA virus genomes belonging to three families of vertebrate infecting viruses namely Circoviridae, Parvoviridae, and Anelloviridae, and found that microsatellites constitute an important component of these virus genomes. Mononucleotide repeats were predominant followed by dinucleotide and trinucleotide repeats. A strong positive relationship existed between number of mononucleotide repeats and genome size among all the three virus families. A similar relationship existed for the occurrence of DTTPH (di-, tri-, tetra-, penta- and hexa-nucleotide) repeats in the families Anelloviridae and Parvoviridae only. Relative abundance and relative density of mononucleotide repeats showed a strong positive relationship with genome size in Circoviridae and Parvoviridae. However, in the case of DTTPH repeats, these features showed a strong relationship with genome size in Circoviridae only. On the other hand, relative microsatellite abundance and relative density of mononucleotide repeats were negatively correlated with GC content (%) in Parvoviridae genomes. On the basis of available annotations, our analysis revealed maximum occurrence of mononucleotide as well as DTTPH repeats in the coding regions of these virus genomes. Interestingly, after normalizing the length of the coding and non-coding regions of each virus genome, we found relative density of microsatellites much higher in the non-coding regions. 
We expect that the present study will help in better characterizing the stability, genome organization and evolution of these virus classes and may provide useful leads for deciphering the etiopathogenesis of these viruses.

19.

Distribution and characterization of simple sequence repeats in Gossypium raimondii genome

Changsong Zou Cairui Lu Youping Zhang Guoli Song 《Bioinformation》2012,8(17):801-806

Simple sequence repeats (SSRs) can be derived from the complete genome sequence. These markers are important for gene mapping as well as marker-assisted selection (MAS). To develop SSRs for cotton gene mapping, we selected the complete genome sequence of Gossypium raimondii, which consisted of 4447 non-redundant scaffolds. Out of the 775.2 Mb of sequence examined, a total of 136,345 microsatellites were identified, with a density of 5.69 kb per SSR in the G. raimondii genome, leading to the development of 112,177 primer pairs. The distribution of SSRs in the genome was non-random. Among the different motifs ranging from 1 to 6 bp, pentanucleotide repeats were the most abundant (30.5%), followed by tetranucleotide repeats (18.2%) and dinucleotide repeats (16.9%). Among all 457 identified motif types, the most frequently occurring repeat motifs were poly-AT/TA, which accounted for 79.8% of the total dinucleotide SSRs, followed by AAAT/TTTA with 51.5% of the total tetranucleotide SSRs. Further, 18,834 microsatellites were detected in the protein-coding genes, and the frequency of genes containing SSRs was 46.0% among the 40,976 genes of G. raimondii. The genome-based SSRs developed in the present study will lay the groundwork for developing large numbers of SSR markers for genetic mapping, gene discovery, genetic diversity analysis, and MAS breeding in cotton.

20.

Triplet repeats in human genome: distribution and their association with genes and other genomic regions (Total citations: 3; self-citations: 0; citations by others: 3)

Subramanian S Madgula VM George R Mishra RK Pandit MW Kumar CS Singh L 《Bioinformatics (Oxford, England)》2003,19(5):549-552
