Similar literature
 Found 20 similar articles (search time: 281 ms)
1.
Mutational bias toward expansion or contraction of simple sequence repeats (SSRs) is referred to as the directionality of SSR evolution. In this communication, we report the mutational bias exhibited by mononucleotide SSRs occurring in the non-coding regions of several prokaryotic genomes. Our investigations revealed that strains or species lacking a mismatch repair (MMR) system generally show a higher number of polymorphic SSRs than species or strains that have an MMR system. An exception to this observation was seen in the mycobacterial genomes, which are MMR deficient yet showed mutations in only a few SSR tracts. This low incidence of SSR mutations even in an MMR-deficient background could be attributed to the high fidelity of the DNA polymerases as a consequence of the long generation time of the mycobacteria. MMR-deficient species generally did not show any bias toward mononucleotide SSR expansions or contractions, indicating neutral evolution of SSRs in these species. The MMR-proficient species, in which the observed mutations correspond to secondary mutations, showed a bias toward contraction of polymononucleotide tracts, perhaps indicating a low efficiency of the MMR system in repairing SSR-induced slippage errors on template strands. This bias toward deletion in mononucleotide SSR tracts might be a reason for the scarcity of long poly A|T and G|C tracts in prokaryotic systems, which are mostly MMR proficient. In conclusion, our study demonstrates the mutational dynamics of SSRs in relation to the presence or absence of the MMR system in prokaryotes.

2.
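Simple sequence repeats, also known as microsatellites, have been successfully applied in genomic and evolutionary studies of many eukaryotes, prokaryotes, and viruses, but microsatellites in bacteriophages have rarely been studied. We therefore carried out a comprehensive analysis of microsatellites and compound microsatellites (composed of two or more directly adjacent microsatellites) in 60 Caudovirales genomes, observing a total of 11,874 microsatellites and 449 compound microsatellites. Correlation analysis showed that the number of microsatellites is positively and linearly correlated with genome size (ρ = 0.899, P < 0.01). The reference sequences contain fewer microsatellites than the corresponding random sequences, an anomaly that arises mainly because the reference sequences contain fewer mononucleotide and dinucleotide repeats. A/T and AT/TA are the predominant types among mononucleotide and dinucleotide repeats, so the GC content of mononucleotide repeats is markedly lower than that of the corresponding sequences; by contrast, the GC content of dinucleotide and trinucleotide repeats does not differ appreciably from that of the corresponding reference sequences. These results for Caudovirales genomes differ to some extent from those for the genomes of other organisms, and they help in understanding the distribution, evolution, and biological function of microsatellites in Caudovirales.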

3.
Simultaneous identification and comparison of perfect and imperfect microsatellites within a genome is a valuable tool both to overcome the lack of a consensus definition of SSRs and to assess repeat history. Detailed analysis of the overall distribution of perfect and imperfect microsatellites in closely related bacterial taxa is expected to give new insight into the evolution of prokaryotic genomes. We have performed a genome-wide analysis of microsatellite distribution in four Escherichia coli and seven Chlamydial strains. Chlamydial strains generally have a higher density of SSRs and show greater intra-group differences in SSR distribution patterns than E. coli genomes. In most investigated genomes, the distributions of the total lengths of matching perfect and imperfect trinucleotide repeats are highly similar, with the notable exception of C. muridarum. Closely related strains show more similar repeat distribution patterns than strains separated by a longer divergence time. The discrepancy between the preferred classes of perfect and imperfect repeats in C. muridarum implies accelerated evolution of SSRs in this particular strain. Our results suggest that microsatellites, although considerably less abundant than in eukaryotic genomes, may nevertheless play an important role in the evolution of prokaryotic genomes and several gene families.

4.
Simple sequence repeats (SSRs) are indel mutational hotspots in genomes. In prokaryotes, SSR loci can cause phase variation, a microbial survival strategy that relies on stochastic, reversible on-off switching of gene activity. By analyzing multiple strains of 42 fully sequenced prokaryotic species, we measure the relative variability and density distribution of SSRs in coding regions. We demonstrate that repeat type strongly influences indel mutation rates, and that the most mutable types are most strongly avoided across genomes. We thoroughly characterize SSR density and variability as a function of N→C position along protein sequences. Using codon-shuffling algorithms that preserve amino acid sequence, we assess evolutionary pressures on SSRs. We find that coding sequences suppress repeats in the middle of proteins and enrich repeats near termini, yielding U-shaped SSR density curves. We show that for many species this characteristic shape can be attributed to purely biophysical constraints of protein structure. In multiple cases, however, particularly in certain pathogenic bacteria, we observe overenrichment of SSRs near protein N-termini significantly beyond expectation based on structural constraints. This increases the probability that frameshifts result in non-functional proteins, revealing that these species may evolutionarily tune SSR positions in coding regions to facilitate phase variation.
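
The abstract above mentions codon-shuffling algorithms that preserve the amino acid sequence but gives no details. The following is a minimal sketch of one possible such null model, not the authors' method: it permutes codons only among positions that encode the same amino acid, so both the protein and the codon usage are unchanged. The function names are hypothetical, and the input is assumed to be an in-frame, uppercase A/C/G/T coding sequence.

```python
import random
from collections import defaultdict

# Standard genetic code (NCBI translation table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split an in-frame coding sequence (uppercase A/C/G/T) into codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence; stop codons are rendered as '*'."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def shuffle_synonymous_codons(cds, rng):
    """Hypothetical null model: permute codons among positions that encode
    the same amino acid, leaving the translated protein unchanged."""
    codons = split_codons(cds)
    positions_by_aa = defaultdict(list)
    for index, codon in enumerate(codons):
        positions_by_aa[CODON_TABLE[codon]].append(index)
    shuffled = list(codons)
    for positions in positions_by_aa.values():
        pool = [codons[p] for p in positions]
        rng.shuffle(pool)  # reorder the synonymous codons in place
        for p, codon in zip(positions, pool):
            shuffled[p] = codon
    return "".join(shuffled)
```

Calling `shuffle_synonymous_codons(cds, random.Random(seed))` many times gives a set of shuffled sequences whose SSR content can be compared with that of the real gene. Other shuffling schemes (for example, drawing synonymous codons from genome-wide usage) would give a different null expectation.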

5.
Simple sequence repeat (SSR) markers are widely used in many plant and animal genomes due to their abundance, hypervariability, and suitability for high-throughput analysis. Development of SSR markers using molecular methods is time consuming, laborious, and expensive. Use of computational approaches to mine ever-increasing sequences such as expressed sequence tags (ESTs) in public databases permits rapid and economical discovery of SSRs. Most such efforts to date have focused on mining SSRs from monocotyledonous ESTs. In this study, we have computationally mined and examined the abundance of SSRs in more than 1.54 million ESTs belonging to 55 dicotyledonous species. The frequency of ESTs containing SSRs among species ranged from 2.65% to 16.82%. Dinucleotide repeats were found to be the most abundant, followed by tri- or mono-nucleotide repeats. The motifs A/T, AG/GA/CT/TC, and AAG/AGA/GAA/CTT/TTC/TCT were the predominant mono-, di-, and tri-nucleotide SSRs, respectively. Most of the mononucleotide SSRs contained 15-25 repeats, whereas the majority of the di- and tri-nucleotide SSRs contained 5-10 repeats. The comprehensive SSR survey data presented here demonstrate the potential of in silico mining of ESTs for rapid development of SSR markers for genetic analysis and applications in dicotyledonous crops.

6.
Simple sequence repeats (SSRs) are very common short repeats in eukaryotic genomes. "Long" SSRs are considered "hypermutable" sequences because they exhibit a high rate of expansion and contraction. Because they are potentially deleterious, long SSRs tend to be uncommon in coding sequences. However, several genes contain long SSRs in their exonic sequences. Here, we identify 1,291 human genes that host a mononucleotide SSR long enough to be prone to expansion or contraction, being called hypermutable hereafter. On the basis of Gene Ontology annotations, we show that only a restricted number of functions are overrepresented among those hypermutable genes including cell cycle and maintenance of DNA integrity. Using a probabilistic model, we show that genes involved in these functions are expected to host long SSRs because they tend to be long and/or are biased in nucleotide composition. Finally, we show that for almost all functions we observe fewer hypermutable sequences than expected under a neutral model. There are however interesting exceptions, for example, genes involved in protein and RNA transport, as well as meiosis and mismatch repair functions that have as many hypermutable genes as expected under neutrality. Conversely, there are functions (e.g., collagen-related genes) where hypermutable genes are more often avoided than in other functions. Our results show that, even though several functions harbor unusually long SSRs in their exons, long SSRs are deleterious sequences in almost all functions and are removed by purifying selection. The strength of this purifying selection however greatly varies from function to function. We discuss possible explanations for this intriguing result.

7.
Simple sequence repeats (SSRs) or microsatellites constitute a countable portion of genomes. However, the significance of SSRs in organelle genomes has not been completely understood. The availability of organelle genome sequences allows us to understand the organization of SSRs in their genic and intergenic regions. In the current study we surveyed the patterns of SSRs in mitochondrial genomes of different taxa of plants. A total of 16 mitochondrial genomes, from algae to angiosperms, were analyzed for the pattern of simple sequence repeats present in them. In this study, mononucleotide repeats of A/T were found to be more prevalent in mitochondrial genomes than other repeat types. The dinucleotide repeats TA/AT were the second most numerous, whereas tri-, tetra-, and pentanucleotide repeats were fewer in number and present in intronic or intergenic portions only. Mononucleotide repeats prevailed in protein-coding exonic portions of all organisms. These results indicate that the microsatellite pattern in mitochondrial genomes differs from that in nuclear genomes, and they also highlight the organization and diversity of SSR loci in mitochondrial genomes. This is a novel report of microsatellite polymorphism in plant mitochondria at the whole-genome level.

8.
Microsatellites or Simple Sequence Repeats (SSRs) are tandem iterations of one to six base pairs, non-randomly distributed throughout prokaryotic and eukaryotic genomes. Limited knowledge is available about distribution of microsatellites in single stranded DNA (ssDNA) viruses, particularly vertebrate infecting viruses. We studied microsatellite distribution in 118 ssDNA virus genomes belonging to three families of vertebrate infecting viruses namely Circoviridae, Parvoviridae, and Anelloviridae, and found that microsatellites constitute an important component of these virus genomes. Mononucleotide repeats were predominant followed by dinucleotide and trinucleotide repeats. A strong positive relationship existed between number of mononucleotide repeats and genome size among all the three virus families. A similar relationship existed for the occurrence of DTTPH (di-, tri-, tetra-, penta- and hexa-nucleotide) repeats in the families Anelloviridae and Parvoviridae only. Relative abundance and relative density of mononucleotide repeats showed a strong positive relationship with genome size in Circoviridae and Parvoviridae. However, in the case of DTTPH repeats, these features showed a strong relationship with genome size in Circoviridae only. On the other hand, relative microsatellite abundance and relative density of mononucleotide repeats were negatively correlated with GC content (%) in Parvoviridae genomes. On the basis of available annotations, our analysis revealed maximum occurrence of mononucleotide as well as DTTPH repeats in the coding regions of these virus genomes. Interestingly, after normalizing the length of the coding and non-coding regions of each virus genome, we found relative density of microsatellites much higher in the non-coding regions. 
We understand that the present study will help in the better characterization of the stability, genome organization and evolution of these virus classes and may provide useful leads to decipher the etiopathogenesis of these viruses.
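
The abstract above relies on counting SSRs (tandem iterations of 1 to 6 bp) and on two normalized measures. As a minimal sketch, the scan below finds perfect SSRs and computes relative abundance and relative density, assuming the common definitions of SSR count per kb and SSR-covered bases per kb of sequence. The minimum repeat thresholds and the function names are illustrative assumptions, not values taken from the study.

```python
# Assumed minimum repeat counts per motif length (illustrative only).
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in min_repeats.items():
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            if set(motif) <= set("ACGT") and is_primitive(motif):
                j = i + k
                while seq[j:j + k] == motif:  # extend the tract one unit at a time
                    j += k
                repeats = (j - i) // k
                if repeats >= min_rep:
                    hits.append((i, j, motif, repeats))
                    i = j  # continue scanning after the recorded tract
                    continue
            i += 1
    return sorted(hits)


def relative_abundance_and_density(seq, hits):
    """SSR count per kb and SSR-covered bases per kb of sequence."""
    kb = len(seq) / 1000
    covered = sum(end - start for start, end, _, _ in hits)
    return len(hits) / kb, covered / kb
```

Requiring a primitive motif means each tract is reported once under its shortest repeat unit, so a run of A is never also counted as an AA or AAA repeat. Imperfect and compound microsatellites are outside the scope of this sketch.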

9.
Simple sequence repeats (SSRs) or microsatellites are repetitive nucleotide sequences with motifs of length 1–6 bp. They are scattered throughout the genomes of all known organisms, ranging from viruses to eukaryotes. Microsatellites undergo mutations in the form of insertions and deletions (INDELS) of their repeat units, with some bias towards insertions that lead to microsatellite tract expansion. Although prokaryotic genomes derive some plasticity from microsatellite mutations, they have in-built mechanisms to arrest undue expansions of microsatellites, and one such mechanism is constituted by the post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes, and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for the distribution and abundance of microsatellite tracts and to look for potentially polymorphic microsatellites. The available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv), were analysed for frequencies and abundance of SSRs. Our analysis revealed that SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain a few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show a scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences, although the influence of GC content and the role of point mutations in arresting microsatellite expansions cannot be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes. 
Supplementary Data pertaining to this article is available on the Journal of Biosciences Website at

10.
Simple sequence repeats (SSRs) exist in both eukaryotic and prokaryotic genomes and are the most popular genetic markers, but the SSRs of mosquito genomes are still not well understood. In this study, we identified and analyzed the SSRs in 23 mosquito species at the whole-genome level, using Drosophila melanogaster as a reference. The results show that SSR numbers (33 076-560 175/genome) and genome sizes (574.57-1342.21 Mb) are significantly positively correlated (R~= 0.8992, P < 0.01), but the correlation varies among individual species. Among the six types of SSR, mono- to trinucleotide SSRs are dominant, with cumulative percentages of 95.14%-99.00% and densities of 195.65/Mb-787.51/Mb, whereas tetra- to hexanucleotide SSRs are rare, with 1.12%-4.22% and 3.76/Mb-40.23/Mb. (A/T)n, (AC/GT)n and (AGC/GCT)n are the most frequent motifs in mononucleotide, dinucleotide and trinucleotide SSRs, respectively, and the motif frequencies of tetra- to hexanucleotide SSRs appear to be species-specific. SSRs 10-20 bp in length are dominant, with an average number of 110 561 ± 93 482 and an average frequency of 87.25% ± 5.73%, and both number and frequency decline as length increases. Most SSRs (83.34% ± 7.72%) are located in intergenic regions, followed by intron regions (11.59% ± 5.59%), exon regions (3.74% ± 1.95%), and untranslated regions (1.32% ± 1.39%). Mono-, di- and trinucleotide SSRs are the main SSRs in both gene regions (98.55% ± 0.85%) and exon regions (99.27% ± 0.52%). On average, 42.52% of all genes contain SSRs, and the preference for SSR occurrence in different gene subcategories is species-specific. The study provides useful insights into SSR diversity, characteristics and distribution in the genomes of 23 mosquito species.

11.
In fungi, microsatellites occur less frequently throughout the genome and tend to be less polymorphic compared with other organisms. Most studies that develop microsatellites for fungi focus on dinucleotide and trinucleotide repeats, and thus mononucleotide repeats, which are much more abundant in fungal genomes, may represent an overlooked resource. This study examined the relative probabilities of polymorphism in mononucleotide, dinucleotide and trinucleotide repeats in Aspergillus nidulans. As previously found, the probability of polymorphism increased with increasing number of repeating units. Dinucleotide and trinucleotide repeats had higher probabilities of polymorphism than mononucleotide repeats, but this was offset by the presence of numerous long mononucleotide repeats within the genome. Mononucleotide microsatellites with 20 or more repeating units have a probability of polymorphism similar to dinucleotide and trinucleotide microsatellites, and therefore, consideration of mononucleotide repeats will substantially increase the number of potential markers available.

12.
Pseudomonas aeruginosa is an opportunistic pathogen that chronically infects the airways of cystic fibrosis (CF) patients and undergoes a process of genetic adaptation based on mutagenesis. We evaluated the role of mononucleotide G:C and A:T simple sequence repeats (SSRs) in this adaptive process. An in silico survey of the genome sequences of 7 P. aeruginosa strains showed that mononucleotide G:C SSRs but not A:T SSRs were greatly under-represented in coding regions, suggesting a strong counterselection process for G:C SSRs with lengths >5 bp but not for A:T SSRs. A meta-analysis of published whole genome sequence data for a P. aeruginosa strain from a CF patient with chronic airway infection showed that G:C SSRs but not A:T SSRs were frequently mutated during the infection process through the insertion or deletion of one or more SSR subunits. The mutation tendency of G:C SSRs was length-dependent and increased exponentially as a function of SSR length. When this strain naturally became a stable Mismatch Repair System (MRS)-deficient mutator, the increase in G:C SSR mutations (5-fold) was much higher than that of other types of mutation (2.2-fold or less). Sequence analysis of several mutated genes reported for two different collections, both containing mutator and non-mutator strains of P. aeruginosa from CF chronic infections, showed that the proportion of G:C SSR mutations was significantly higher in mutators than in non-mutators, whereas no such difference was observed for A:T SSR mutations. Our findings, taken together, provide genome-scale evidence that, in an MRS-deficient background, long G:C SSRs are able to stochastically bias mutagenic pathways by making the genes in which they are harbored more prone to mutation. The combination of MRS deficiency and virulence-related genes that contain long G:C SSRs is therefore a matter of concern in P. aeruginosa CF chronic infection.

13.
Simple sequence repeats (SSRs) are omnipresent in prokaryotes and eukaryotes and are found throughout the genome, in both protein-coding and noncoding regions. In the present study, the whole-genome sequences of seven chromosomes (Shigella flexneri 2a str. 301 and 2457T, Shigella sonnei, Escherichia coli K12, Mycobacterium tuberculosis, Mycobacterium leprae and Staphylococcus saprophyticus) were downloaded from the GenBank database to determine the abundance, distribution and composition of SSRs, and to compare the tandem repeats in each real genome with those in a randomized genome (generated with a sequence shuffling tool). The data obtained show that: (i) tandem repeats are widely distributed throughout the genomes; (ii) SSRs are differentially distributed among coding and noncoding regions in the investigated Shigella genomes; (iii) the total frequency of SSRs is higher in noncoding regions than in coding regions; (iv) in all investigated chromosomes, the ratio of trinucleotide SSRs is much higher in the real genomes than in the randomized genomes, whereas that of dinucleotide SSRs is lower; (v) the ratio of total and mononucleotide SSRs is higher in the real genome than in the randomized genome for E. coli K12, S. flexneri str. 301 and S. saprophyticus, lower for S. flexneri str. 2457T, S. sonnei and M. tuberculosis, and approximately the same for M. leprae; (vi) the frequency of codon repetitions varies considerably depending on the type of encoded amino acid.
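
The comparison between a real genome and a randomized genome described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's pipeline: the shuffling tool used by the authors is not named here, so a plain base-composition-preserving shuffle stands in for it, only mononucleotide tracts are counted, and the minimum tract length of 6 and the function names are hypothetical.

```python
import random
import re


def count_mono_ssrs(seq, min_len=6):
    """Count homopolymer tracts of at least min_len bases (assumed threshold)."""
    pattern = r"([ACGT])\1{%d,}" % (min_len - 1)
    return sum(1 for _ in re.finditer(pattern, seq))


def shuffled(seq, rng):
    """Return a random permutation of seq with the same base composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def observed_vs_shuffled(seq, n_shuffles=100, seed=0, min_len=6):
    """Return the observed tract count and the mean count over shuffled copies."""
    rng = random.Random(seed)
    observed = count_mono_ssrs(seq, min_len)
    total = sum(
        count_mono_ssrs(shuffled(seq, rng), min_len) for _ in range(n_shuffles)
    )
    return observed, total / n_shuffles
```

An observed count above the shuffled mean suggests that such tracts are over-represented relative to base composition alone, and a count below it suggests avoidance. A shuffle that also preserves dinucleotide frequencies would be a stricter null model.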

14.
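Microsatellites, or simple sequence repeats (SSRs), are ubiquitous in eukaryotic, prokaryotic, and viral genomes and are widely used in genetic and evolutionary research. In this study, whole-genome sequences of four different species of the genus Ebolavirus were downloaded from NCBI, and 36 sequences were selected as the study material. SSRs were extracted with the online tool IMEx and the data were tallied with Python scripts in order to analyze the distribution of SSRs across Ebola virus genomes. Dinucleotide SSRs were the most abundant, followed by mononucleotide SSRs; trinucleotide SSRs were present in small numbers, tetranucleotide SSRs were rarer still, and no pentanucleotide or hexanucleotide SSRs were found. Further analysis showed that, in all four species, SSRs containing A/T bases far outnumbered those containing C/G bases. Among mononucleotide SSRs, (A)n/(T)n far outnumbered (G)n/(C)n; no (GC/CG)n dinucleotide SSRs were present, nor any (GGC/CGG/GCG/CCG/CGC/GCC)n trinucleotide SSRs. These findings may be closely related to the pathogenic mechanism of Ebola virus. This analysis of SSRs in Ebola virus genome sequences provides further reference for studying the variation and pathogenic mechanisms of the virus.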

15.
Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed the genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus×domestica) genomes showed mei to have the highest density of SSRs, slightly higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy-one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, on the basis of which the macro-colinearity between the mei genome and the Prunus T×E reference map was identified. The framework map of mei constructed here provides a first step toward subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species.

16.
Plant genomes are complex and contain large amounts of repetitive DNA including microsatellites that are distributed across entire genomes. Whole genome sequences of several monocot and dicot plants that are available in the public domain provide an opportunity to study the origin, distribution and evolution of microsatellites, and also facilitate the development of new molecular markers. In the present investigation, a genome-wide analysis of microsatellite distribution in monocots (Brachypodium, sorghum and rice) and dicots (Arabidopsis, Medicago and Populus) was performed. A total of 797,863 simple sequence repeats (SSRs) were identified in the whole genome sequences of six plant species. Characterization of these SSRs revealed that mono-nucleotide repeats were the most abundant repeats, and that the frequency of repeats decreased with increase in motif length both in monocots and dicots. However, the frequency of SSRs was higher in dicots than in monocots both for nuclear and chloroplast genomes. Interestingly, GC-rich repeats were the dominant repeats only in monocots, with the majority of them being present in the coding region. These coding GC-rich repeats were found to be involved in different biological processes, predominantly binding activities. In addition, a set of 22,879 SSR markers that were validated by e-PCR were developed and mapped on different chromosomes in Brachypodium for the first time, with a frequency of 101 SSR markers per Mb. Experimental validation of 55 markers showed successful amplification of 80% SSR markers in 16 Brachypodium accessions. An online database 'BraMi' (Brachypodium microsatellite markers) of these genome-wide SSR markers was developed and made available in the public domain. The observed differential patterns of SSR marker distribution would be useful for studying microsatellite evolution in a monocot-dicot system. 
SSR markers developed in this study would be helpful for genomic studies in Brachypodium and related grass species, especially for the map based cloning of the candidate gene(s).

17.
All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.

18.
Survey of simple sequence repeats in completed fungal genomes   Total citations: 7, self-citations: 0, citations by others: 7
The use of simple sequence repeats or microsatellites as genetic markers has become very popular because of their abundance and length variation between different individuals. SSRs are tandem repeat units of 1 to 6 base pairs that are found abundantly in many prokaryotic and eukaryotic genomes. This is the first study examining and comparing SSRs in completely sequenced fungal genomes. We analyzed and compared the occurrences, relative abundance, relative density, most common, and longest SSRs in nine taxonomically different fungal species: Aspergillus nidulans, Cryptococcus neoformans, Encephalitozoon cuniculi, Fusarium graminearum, Magnaporthe grisea, Neurospora crassa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Ustilago maydis. Our analysis revealed that, in all of the genomes studied, the occurrence, abundance, and relative density of SSRs varied and were not influenced by genome size. No correlation between relative abundance and genome size was observed, but N. crassa, the largest genome analyzed, had the highest relative abundance of SSRs. In most genomes, mononucleotide, dinucleotide, and trinucleotide repeats were more abundant than the longer repeated SSRs. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit increased. Furthermore, each organism had its own common and longest SSRs. Our analysis showed that the relative abundance of SSRs in fungi is low compared with the human genome and that longer SSRs in fungi are rare. In addition to providing new information concerning the abundance of SSRs for each of these fungi, the results provide a general source of molecular markers that could be useful for a variety of applications such as population genetics and strain identification of fungal organisms.

19.
X Zhao  Y Tian  R Yang  H Feng  Q Ouyang  Y Tian  Z Tan  M Li  Y Niu  J Jiang  G Shen  R Yu 《BMC genomics》2012,13(1):435
ABSTRACT: BACKGROUND: The relationship between the level of repetitiveness in genomic sequence and genome size has been investigated using complete prokaryotic and eukaryotic genomes, but such studies have rarely been carried out for virus genomes. RESULTS: In this study, a total of 257 viruses were examined, covering 90% of genera. The results showed that the occurrence of simple sequence repeats (SSRs) is strongly, positively and significantly correlated with genome size. Each repeat class is distributed within a certain range of genome sequence lengths. Mono-, di- and tri- repeats are widely distributed in all virus genomes; tetra- SSRs are a common component of genomes larger than 100 kb; among genomes < 100 kb, no more than 50% contain penta- and hexa- SSRs. Principal components analysis (PCA) indicated that dinucleotide repeats contribute most strongly to the differences in SSRs among virus genomes. The results showed that SSRs tend to accumulate in larger virus genomes, and that the longer the genome sequence, the longer the repeat units. CONCLUSIONS: We conducted this research at the level of viruses as a whole. We conclude that genome size is an important factor affecting the occurrence of SSRs; hosts are also responsible for the variance in SSR content to a certain degree.

20.