Similar articles
1.
Simple sequence repeats (SSRs), or microsatellites, are known to be ubiquitous across all kingdoms of life, including viruses. However, imperfections in simple sequence repeats have so far been analyzed in the genomes of human, Escherichia coli and Human Immunodeficiency virus. The assessment of compound microsatellites in plant viral genomes is yet to be studied. Potyviruses severely affect crop plant growth and reduce economic yield in diverse cropping systems worldwide. Hence, we analyzed the nature and distribution of compound microsatellites present in the complete genomes of 45 potyvirus species. The results indicate that compound microsatellites accounted for 0% to 15.15% of all microsatellites and have low complexity compared with those of prokaryotic genomes. Overall, 14% of compound microsatellites were composed of similar motifs, and such motif duplications were observed for CA, TA and AG repeats. Among all 45 potyvirus genomes analyzed, the SSR couple (AG)-x-(AC) was the most abundant. Hence it is apparent that, in contrast to eukaryotes, the majority of compound microsatellites in potyviruses were composed of variant motifs. We also highlight the relative frequency of different classes of compound microsatellites as well as their patterns of distribution, and correlate these with the biology of potyviruses. Further characterization of such variation is important for elucidating the origin, mutational processes, and structure of these widely used, but incompletely understood sequences.
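To make the notion of a compound microsatellite such as (AG)-x-(AC) concrete, here is a minimal Python sketch. It is not the method used in the abstract above: the minimum repeat counts in `MIN_REPEATS` and the maximum gap `DMAX` are hypothetical values chosen for illustration, and the function names `find_ssrs` and `group_compound` are invented here. It finds perfect SSRs with a regular expression and then groups neighbours separated by at most `DMAX` bp.

```python
import re

# Hypothetical thresholds (not from the abstract): minimum number of
# repeats required for each motif length from 1 to 6 bp.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}
DMAX = 10  # hypothetical maximum gap (bp) between SSRs in one compound tract


def find_ssrs(seq):
    """Return sorted (start, end, motif, repeats) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // size))
    return sorted(hits)


def group_compound(ssrs, dmax=DMAX):
    """Group consecutive SSRs whose gap to the previous SSR is at most dmax bp."""
    groups, current = [], []
    for ssr in ssrs:
        if current and ssr[0] - current[-1][1] <= dmax:
            current.append(ssr)
        else:
            if len(current) > 1:
                groups.append(current)
            current = [ssr]
    if len(current) > 1:
        groups.append(current)
    return groups


ssrs = find_ssrs("AGAGAGAGTCTACACACAC")
compound = group_compound(ssrs)
print(ssrs)
print(len(compound))
```

In this toy sequence an (AG)4 tract and an (AC)4 tract are separated by the 3 bp spacer TCT, so they form one compound microsatellite under the assumed `DMAX`; lowering `dmax` below 3 would report none.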

2.
We mapped and analyzed the microsatellites throughout 284,295,605 base pairs of the unambiguously assembled sequence scaffolds along the 19 chromosomes of the haploid poplar genome. In total, we found 150,985 SSRs with repeat unit lengths between 2 and 5 bp. The resulting microsatellite physical map showed that SSRs were distributed relatively evenly across the Populus genome. On average, these SSRs occurred every 1,883 bp within the poplar genome, and the SSR densities in intergenic regions, introns, exons and UTRs were 85.4%, 10.7%, 2.7% and 1.2%, respectively. We took di-, tri-, tetra- and pentamers as the four classes of repeat units and found that the density of each class of SSRs decreased with repeat unit length, except for the tetranucleotide repeats. Notably, the length diversification of microsatellite sequences was negatively correlated with their repeat unit length, and SSRs with shorter repeat units gained repeats faster than SSRs with longer repeat units. We also found that the GC content of the poplar sequence correlated significantly with the densities of SSRs with odd repeat unit lengths (tri- and penta-), but had no significant correlation with the densities of SSRs with even repeat unit lengths (di- and tetra-). In the poplar genome, there was evidence that the occurrence of different microsatellites was under selection, and the GC content of SSR sequences was found to relate significantly to the functional importance of microsatellites.
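The headline figures in this abstract can be checked with simple arithmetic. The short Python sketch below uses only the numbers quoted above; the variable names are illustrative. It confirms that the reported spacing is the assembled length divided by the SSR count, and that the four regional percentages sum to 100%, which suggests they are shares of all SSRs rather than per-base densities (an interpretation, not something the abstract states).

```python
# Figures quoted in the poplar abstract.
genome_bp = 284_295_605   # assembled base pairs analyzed
ssr_count = 150_985       # SSRs with repeat units of 2 to 5 bp

# Mean spacing between SSRs, in bp.
mean_spacing_bp = genome_bp / ssr_count
print(round(mean_spacing_bp))

# Regional percentages as reported (intergenic, intron, exon, UTR).
region_pct = {"intergenic": 85.4, "intron": 10.7, "exon": 2.7, "UTR": 1.2}
total_pct = sum(region_pct.values())
print(round(total_pct, 1))
```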

3.
Environmental Sciences Division, Oak Ridge National Laboratory, TN, USA. We mapped and analyzed the microsatellites throughout 284,295,605 base pairs of the unambiguously assembled sequence scaffolds along the 19 chromosomes of the haploid poplar genome. In total, we found 150,985 SSRs with repeat unit lengths between 2 and 5 bp. The resulting microsatellite physical map showed that SSRs were distributed relatively evenly across the Populus genome. On average, these SSRs occurred every 1,883 bp within the poplar genome, and the SSR densities in intergenic regions, introns, exons and UTRs were 85.4%, 10.7%, 2.7% and 1.2%, respectively. We took di-, tri-, tetra- and pentamers as the four classes of repeat units and found that the density of each class of SSRs decreased with repeat unit length, except for the tetranucleotide repeats. Notably, the length diversification of microsatellite sequences was negatively correlated with their repeat unit length, and SSRs with shorter repeat units gained repeats faster than SSRs with longer repeat units. We also found that the GC content of the poplar sequence correlated significantly with the densities of SSRs with odd repeat unit lengths (tri- and penta-), but had no significant correlation with the densities of SSRs with even repeat unit lengths (di- and tetra-). In the poplar genome, there was evidence that the occurrence of different microsatellites was under selection, and the GC content of SSR sequences was found to relate significantly to the functional importance of microsatellites.

4.
An in silico analysis of simple sequence repeats (SSRs) in 30 species of tobamoviruses was performed. SSRs (mono- to hexanucleotide) were present at varying frequencies across species. Compound microsatellites, primarily of variant motifs, accounted for up to 11.43% of the SSRs. Motif duplications were observed for A, T, AT, and ACA repeats. (AG)–(TC) was the most prevalent SSR couple. SSRs were differentially localized in the coding region, with ~54% on the 128 kDa protein, while 20.37% were exclusive to the 186 kDa protein. Characterization of such variations is important for elucidating the origin, sequence variations, and structure of these widely used, but incompletely understood sequences.

5.
Microsatellites or Simple Sequence Repeats (SSRs) are tandem iterations of one to six base pairs, non-randomly distributed throughout prokaryotic and eukaryotic genomes. Limited knowledge is available about distribution of microsatellites in single stranded DNA (ssDNA) viruses, particularly vertebrate infecting viruses. We studied microsatellite distribution in 118 ssDNA virus genomes belonging to three families of vertebrate infecting viruses namely Circoviridae, Parvoviridae, and Anelloviridae, and found that microsatellites constitute an important component of these virus genomes. Mononucleotide repeats were predominant followed by dinucleotide and trinucleotide repeats. A strong positive relationship existed between number of mononucleotide repeats and genome size among all the three virus families. A similar relationship existed for the occurrence of DTTPH (di-, tri-, tetra-, penta- and hexa-nucleotide) repeats in the families Anelloviridae and Parvoviridae only. Relative abundance and relative density of mononucleotide repeats showed a strong positive relationship with genome size in Circoviridae and Parvoviridae. However, in the case of DTTPH repeats, these features showed a strong relationship with genome size in Circoviridae only. On the other hand, relative microsatellite abundance and relative density of mononucleotide repeats were negatively correlated with GC content (%) in Parvoviridae genomes. On the basis of available annotations, our analysis revealed maximum occurrence of mononucleotide as well as DTTPH repeats in the coding regions of these virus genomes. Interestingly, after normalizing the length of the coding and non-coding regions of each virus genome, we found relative density of microsatellites much higher in the non-coding regions. 
We believe that the present study will help to better characterize the stability, genome organization and evolution of these virus classes, and may provide useful leads for deciphering the etiopathogenesis of these viruses.

6.
Simple sequence repeats (SSRs) or microsatellites are the repetitive nucleotide sequences of motifs of length 1–6 bp. They are scattered throughout the genomes of all the known organisms ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (INDELS) of their repeat units with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity due to microsatellite mutations they have in-built mechanisms to arrest undue expansions of microsatellites and one such mechanism is constituted by post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for distribution and abundance of microsatellites tracts and to look for potentially polymorphic microsatellites. Available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv) were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences although influence of GC-content and role of point mutations in arresting microsatellite expansions can not be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes. 
Supplementary Data pertaining to this article is available on the Journal of Biosciences Website at

7.
Barley microsatellites: allele variation and mapping (cited 37 times: 0 self-citations, 37 citations by others)
Microsatellites have developed into a powerful tool for mapping mammalian genomes, and the first reports of their use in plants have been published. A database search of 228 barley sequences from GenBank and EMBL was made to determine which simple sequence repeat (SSR) motif prevails in barley. Nearly all types of SSRs were found. The (A)n and (T)n SSRs occurred more often than (C)n and (G)n for n10. Among the dinucleotide repeats, the (CG)n SSRs occurred least often. Trinucleotide repeats did not occur with n>7, and there is no correlation between the GC content of the trinucleotide motifs and the number of observed SSRs. Analysing 15 different microsatellites with 11 barleys yielded 2.1 alleles per microsatellite. Sequencing 25 putative microsatellites showed that the resolution capacity of high-quality agarose gels was sufficient to determine differences of only three base pairs. Five microsatellites were mapped on three different chromosomes of a barley RFLP map.

8.
Simultaneous identification and comparison of perfect and imperfect microsatellites within a genome is a valuable tool both to overcome the lack of a consensus definition of SSRs and to assess repeat history. Detailed analysis of the overall distribution of perfect and imperfect microsatellites in closely related bacterial taxa is expected to give new insight into the evolution of prokaryotic genomes. We have performed a genome-wide analysis of microsatellite distribution in four Escherichia coli and seven Chlamydial strains. Chlamydial strains generally have a higher density of SSRs and show greater intra-group differences in SSR distribution patterns than E. coli genomes. In most investigated genomes the distributions of the total lengths of matching perfect and imperfect trinucleotide repeats are highly similar, with the notable exception of C. muridarum. Closely related strains show more similar repeat distribution patterns than strains separated by a longer divergence time. The discrepancy between the preferred classes of perfect and imperfect repeats in C. muridarum implies accelerated evolution of SSRs in this particular strain. Our results suggest that microsatellites, although considerably less abundant than in eukaryotic genomes, may nevertheless play an important role in the evolution of prokaryotic genomes and several gene families.

9.

Background

The giant panda (Ailuropoda melanoleuca) is a critically endangered species endemic to China. Microsatellites are among the most popular molecular markers and have proven effective for estimating population size, paternity testing, and assessing genetic diversity in critically endangered species. The availability of the complete giant panda genome sequence provides the opportunity to carry out genome-wide scans for all types of microsatellite markers, which opens the way for the analysis and development of microsatellites in the giant panda.

Results

By screening the whole genome sequence of the giant panda in silico, we identified microsatellites in the giant panda genome and analyzed their frequency and distribution in different genomic regions. Based on our search criteria, a repertoire of 855,058 SSRs was detected, with mononucleotides being the most abundant. SSRs were found in all genomic regions and were more abundant in non-coding regions than in coding regions. A total of 160 primer pairs were designed to screen for polymorphic microsatellites using the selected tetranucleotide microsatellite sequences. Fifty-one novel polymorphic tetranucleotide microsatellite loci were discovered by genotyping blood DNA from 22 captive giant pandas. Finally, a total of 15 markers, which showed good polymorphism, stability, and repeatability in faecal samples, were used to establish a novel microsatellite marker system for the giant panda. In addition, a genotyping database for Chengdu captive giant pandas (n = 57) was set up using this standardized system. Moreover, a universal individual identification method was established and genetic diversity was analysed as applications of this marker system.

Conclusion

Microsatellite abundance and diversity were characterized in the giant panda genome. A total of 154,677 tetranucleotide microsatellites were identified, and 15 of them were found to be polymorphic and stable loci. The individual identification method and the genetic diversity analysis method in this study provide adequate material for the future study of the giant panda.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1268-z) contains supplementary material, which is available to authorized users.

10.
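Simple sequence repeats, also known as microsatellites, have been successfully applied in genomic and evolutionary studies of many eukaryotes, prokaryotes, and viruses, but microsatellites in bacteriophages have rarely been studied. We therefore carried out a comprehensive analysis of microsatellites and compound microsatellites (composed of two or more directly adjacent microsatellites) in 60 Caudovirales genomes. In total, 11,874 microsatellites and 449 compound microsatellites were observed in these 60 genomes. Correlation analysis showed that the number of microsatellites was positively and linearly correlated with genome size (ρ = 0.899, P < 0.01). The number of microsatellites in the reference sequences was lower than in the corresponding random sequences, an anomaly due mainly to the reference sequences containing fewer mononucleotide and dinucleotide repeats. A/T and AT/TA repeats were the predominant types of mononucleotide and dinucleotide repeats, so the GC content of mononucleotide repeats was markedly lower than that of the corresponding sequences; by contrast, the GC content of dinucleotide and trinucleotide repeats did not differ noticeably from that of the corresponding reference sequences. These results for Caudovirales genomes differ to some extent from those for the genomes of other organisms, and they help in understanding the distribution, evolution, and biological function of microsatellites in Caudovirales.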

11.
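Abundance and distribution of microsatellites in the whole mosquito genome (cited 6 times: 0 self-citations, 6 citations by others)
Microsatellites are a class of genetic marker that has been intensively developed in recent years. To advance genetic research on Anopheles, we analyzed simple sequence repeats (microsatellites) composed of repeat units of 1 to 6 bases across the whole Anopheles genome. We compared microsatellite abundance and distribution, as well as differences in distribution among chromosomal regions (exons, introns, and intergenic regions). Microsatellites account for about 2.14% of the Anopheles genome, and the X chromosome has the highest microsatellite density. In terms of abundance, A and C mononucleotide repeats are similarly abundant in the genome, and AC units are more than twice as abundant as AG units, whereas AT and CG units are very rare. Among tri- and tetranucleotide repeats, AGC, AAAC, and AAAT units are the most abundant, while ACG, ACT, AGG, CCG, ATGC, CCCG, ACTG, AACT, ACGT, AGAT, CCGG, ACCT, and AGCT units are all rare, and some pentanucleotide repeats are absent from one or even several chromosomes. Except for dinucleotide repeat units, which are relatively abundant in the exonic regions of 2L, all other repeat units are abundant in introns and intergenic regions. Further analysis showed that microsatellite abundance and distribution are very similar between the two arms of each chromosome.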

12.
13.
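Distribution patterns of microsatellites in the whole genome of the German cockroach (cited 3 times: 0 self-citations, 3 citations by others)
[Aim] To analyze the number and distribution patterns of microsatellites in the whole genome of the German cockroach Blattella germanica, and to functionally annotate genes containing microsatellites in their exons. [Methods] Microsatellite search software was used to determine the number, repeat counts, and positions of all microsatellites in the B. germanica genome; Python scripts were written to locate the microsatellites; and genes containing microsatellites in their exons were functionally annotated with Blast2Go and KASS. [Results] A total of 604,386 microsatellites with repeat units of 1 to 6 bases were found, with a total length of 15,301,255 bp, accounting for about 0.75% of the whole genome sequence (about 2.04 Gb), at a frequency of one per 3.37 kb. Most microsatellites were 12 to 60 bases long. Among the different types, trinucleotide repeats were the most numerous (226,876), accounting for 37.54% of all microsatellites, followed by tetranucleotide repeats (150,355; 24.88%), and then mononucleotide (141,167), dinucleotide (60,877), pentanucleotide (21,570), and hexanucleotide (3,541) repeats, accounting for 23.36%, 10.07%, 3.57%, and 0.59% of all microsatellites, respectively. The most frequent repeat motifs were ATT, AAT, A, T, AAAT, ATTT, and AT, which together comprised 411,789 microsatellites (68.13% of the total); each of these seven motifs had more than 30,000 microsatellites. A total of 2,372 microsatellites were located in exons, in 1,481 genes. GO annotation assigned 434 to cellular component, 402 to molecular function, and 660 to biological process. KEGG pathway analysis showed that genes related to metabolism were the most numerous (380), followed by those related to organismal systems (276), while genes related to genetic information processing were the fewest (92). [Conclusion] This study lays a foundation for further systematic, in-depth analysis of microsatellite function and for screening of microsatellite molecular markers in B. germanica.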
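The percentages in this abstract follow directly from the reported counts. As a rough consistency check, the Python sketch below recomputes them from the figures quoted above; the variable names are illustrative, and because the genome size is given only as "about 2.04 Gb", the genome-wide values are approximate.

```python
# Microsatellite counts per repeat-unit length, as reported in the abstract.
counts = {
    "mono": 141_167,
    "di": 60_877,
    "tri": 226_876,
    "tetra": 150_355,
    "penta": 21_570,
    "hexa": 3_541,
}
total_ssrs = sum(counts.values())

total_ssr_bp = 15_301_255  # total microsatellite length in bp
genome_bp = 2.04e9         # "about 2.04 Gb", so derived values are approximate

# Share of each class among all microsatellites, in percent.
shares = {k: round(100 * v / total_ssrs, 2) for k, v in counts.items()}

# Fraction of the genome covered by microsatellites, and mean spacing.
genome_fraction_pct = 100 * total_ssr_bp / genome_bp
spacing_kb = genome_bp / total_ssrs / 1000

print(total_ssrs, shares)
print(round(genome_fraction_pct, 2), round(spacing_kb, 2))
```

With the rounded genome size the spacing comes out slightly above the reported one per 3.37 kb, which is the kind of small discrepancy expected when the exact assembly length is not given.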

14.
Microsatellites, or simple sequence repeats (SSRs), have become the markers of choice for genetic studies with many crop species including wheat. Currently an international effort is underway to enrich the repertoire of available sequence tagged microsatellite site (STMS) markers in wheat. As a part of this effort, we have sequenced 43 clones obtained from a microsatellite-enriched wheat genomic library; 34 clones contained 41 different microsatellites. These microsatellites (mono-, di-, tri- nucleotide repeats) were classified as 19 simple perfect, 18 simple imperfect and 4 compound imperfect types. Dinucleotide repeats were the most abundant (70%). Primer pairs for only 16 microsatellites could be designed, since the flanking sequences of the others were either too short or were otherwise not suitable for designing the microsatellite specific primers. Microsatellite loci of the expected size and polymorphism were successfully amplified from 15 of these 16 primer pairs using three wheat varieties. 14 loci detected by 12 out of the 15 functional primer pairs were assigned to 11 specific chromosomes. An erratum to this article is available at .

15.
Simple sequence repeats (SSRs) or microsatellites are one of the most popular sources of genetic markers and play a significant role in gene function and genome organization. We identified SSRs in the genome of Ganoderma lucidum and analyzed their frequency and distribution in different genomic regions. We also compared the SSRs in G. lucidum with six other Agaricomycetes genomes: Coprinopsis cinerea, Laccaria bicolor, Phanerochaete chrysosporium, Postia placenta, Schizophyllum commune and Serpula lacrymans. Based on our search criteria, the total number of SSRs found ranged from 1206 to 6104 and covered from 0.04% to 0.15% of the fungal genomes. The SSR abundance was not correlated with the genome size, and mono- to tri-nucleotide repeats outnumbered other SSR categories in all of the species examined. In G. lucidum, a repertoire of 2674 SSRs was detected, with mono-nucleotides being the most abundant. SSRs were found in all genomic regions and were more abundant in non-coding regions than coding regions. The highest SSR relative abundance was found in introns (108 SSRs/Mb), followed by intergenic regions (84 SSRs/Mb). A total of 684 SSRs were found in the protein-coding sequences (CDSs) of 588 gene models, with 81.4% of them being tri- or hexa-nucleotides. After scanning for InterPro domains, 280 of these genes were successfully annotated, and 215 of them could be assigned to Gene Ontology (GO) terms. SSRs were also identified in 28 bioactive compound synthesis-related gene models, including one 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR), three polysaccharide biosynthesis genes and 24 cytochrome P450 monooxygenases (CYPs). Primers were designed for the identified SSR loci, providing the basis for the future development of SSR markers of this medicinal fungus.

16.
Studies on microsatellite distribution and divergence in related genomes contribute towards understanding of genome evolution in eukaryotes. Despite the availability of whole genome sequences of four rice genomes, the occurrence and significance of microsatellites in the rice genome have remained a relatively unexplored area of research. We have aligned the genomes of two rice subspecies, indica and japonica, to understand the trends of microsatellite conservation and divergence in the rice genome. Nearly 62% of the indica microsatellites were also found in the japonica genome. Occurrence of microsatellites showed a negative association with that of retrotransposons. Microsatellite repeat unit length and sequence showed a direct influence on microsatellite locus length. Further, microsatellite allele length was also influenced by the sequence characteristics of the neighbouring regions. CCG repeats were the most conserved microsatellite sequences across the different syntenic regions in the two rice genomes and often showed association with CpG islands. Our study suggests that microsatellite distribution is governed not only by a balance between replication slippage and point mutations, as proposed earlier, but also by the microsatellite motif sequence and the characteristics of microsatellite neighbouring regions in the genome. Thus, this study is likely to prove an important reference for understanding the process of microsatellite evolution and dynamics in the two rice subspecies.

17.
RAPD identification of microsatellites in Daphnia (cited 10 times: 0 self-citations, 10 citations by others)
Simple sequence repeats (SSRs, or microsatellites) have been constantly gaining importance as single-locus DNA markers in population genetics and behavioural ecology. We tested a PCR-based strategy for finding microsatellite loci in anonymous genomes, which avoids genomic library construction and screening, and the need for larger amounts of DNA. In the first step, parts of a genome are randomly amplified with arbitrary 10mer primers using RAPD fingerprinting. Labelled SSR-oligonucleotides serve as probes to detect complementary sequences in RAPD products by means of Southern analyses. Subsequently, positive RAPD fragments of suitable size are cloned and sequenced. Using GA and GT probes, we applied this approach to waterfleas (Daphnia) and revealed 37 hybridization signals in 20 RAPD profiles. Thirteen positive RAPD fragments from three Daphnia species and two hybrid 'species' were cloned and sequenced. In all cases simple sequence repeats were detected. We characterized seven perfect repeat loci, which were found to be polymorphic within and between species.

18.
Radish (Raphanus sativus L.) is an edible root vegetable crop that is cultivated worldwide and whose genome has been sequenced. Here we report the complete nucleotide sequence of the radish cultivar WK10039 chloroplast (cp) genome, along with a de novo assembly strategy using whole genome shotgun sequence reads obtained by next generation sequencing. The radish cp genome is 153,368 bp in length and has a typical quadripartite structure, composed of a pair of inverted repeat regions (26,217 bp each), a large single copy region (83,170 bp), and a small single copy region (17,764 bp). The radish cp genome contains 87 predicted protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Sequence analysis revealed the presence of 91 simple sequence repeats (SSRs) in the radish cp genome.
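The quadripartite structure described above implies that the four region lengths add up to the genome length. A small Python check using only the numbers in the abstract (variable names are illustrative) confirms this:

```python
# Region lengths of the radish chloroplast genome, in bp, from the abstract.
inverted_repeat_bp = 26_217   # each of the two inverted repeat regions
large_single_copy_bp = 83_170
small_single_copy_bp = 17_764

# Two inverted repeats plus the two single-copy regions.
genome_bp = 2 * inverted_repeat_bp + large_single_copy_bp + small_single_copy_bp
print(genome_bp)
```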

19.
Simple sequence repeats (SSRs), or microsatellites, constitute a significant portion of genomes; however, their significance in organellar genomes is not completely understood. The availability of organelle genome sequences allows us to understand the organization of SSRs in their genic and intergenic regions. In the present work, SSRs were identified and categorized in 14 mitochondrial and 22 chloroplast genomes of algal species belonging to Chlorophyta. SSRs were more numerous in non-coding regions than in coding regions, and mononucleotide repeats were the most frequent, followed by dinucleotide repeats, in both mitochondrial and chloroplast genomes. The largest numbers of SSRs were found in genes encoding the beta subunit of RNA polymerase in chloroplast genomes and NADH dehydrogenase in mitochondrial genomes. This is the first report of a whole-genome sequence analysis of SSRs in the organellar genomes of green algae.

20.
Background

Some ferns have medicinal properties and are used in therapeutic interventions. However, the classification and phylogenetic relationships of ferns remain incompletely reported. Considering that chloroplast genomes provide ideal information for species identification and evolution, in this study, three unpublished and one published ferns were sequenced and compared with other ferns to obtain comprehensive information on their classification and evolution.

Materials and Methods

The complete chloroplast genomes of Dryopteris goeringiana (Kunze) Koidz, D. crassirhizoma Nakai, Athyrium brevifrons Nakai ex Kitagawa, and Polystichum tripteron (Kunze) Presl were sequenced using the Illumina HiSeq 4000 platform. Simple sequence repeats (SSRs), nucleotide diversity, and RNA editing were investigated in all four species. Genome comparison and inverted repeat (IR) boundary expansion and contraction analyses were also performed. The relationships among the ferns were studied by phylogenetic analysis based on the whole chloroplast genomes.

Results

The whole chloroplast genomes ranged from 148,539 to 151,341 bp in size and exhibited typical quadripartite structures. Ten highly variable loci with parsimony informative (Pi) values of > 0.02 were identified. A total of 75–108 SSRs were identified, and only six SSRs were present in all four ferns. The SSRs contained a higher number of A + T than G + C bases. C-to-U conversion was the most common type of RNA editing event. Genome comparison analysis revealed that single-copy regions were more highly conserved than IR regions. IR boundary expansion and contraction varied among the four ferns. Phylogenetic analysis showed that species in the same genus tended to cluster together and had relatively close relationships.

Conclusion

The results provide valuable information on fern chloroplast genomes that will be useful to identify and classify ferns, and study their phylogenetic relationships and evolution.
