Similar literature
20 similar articles found
1.
Staphylococci are Gram-positive bacteria that play an important role in infectious disease and are major causes of community-acquired and hospital-acquired infections. Strains of Staphylococcus aureus are reported to be genomically and phenotypically highly heterogeneous; hence in silico comparison of genomic data on simple sequence repeats may provide valuable information for understanding pathogenicity and control measures. This study determined the distribution of a specific group of simple sequence repeats (SSRs) in genome sequences of six Staphylococcus strains (Staphylococcus aureus COL, S. aureus MRSA252, S. aureus MSSA476, S. aureus Mu50, S. aureus MW2, S. aureus N315) and plasmid sequences of four Staphylococcus strains (Staphylococcus aureus COL pT181, Staphylococcus aureus MSSA pSAS, Staphylococcus aureus VRSAp, Staphylococcus aureus, Staphylococcus aureus pN315 DNA) downloaded from the GenBank database, in order to identify the abundance, distribution and composition of SSRs. The data obtained in the present study show that (i) a large number of tandem repeats are distributed throughout the genome and plasmid sequences; (ii) the number of mononucleotide SSRs decreased rapidly as the size of the repeat unit increased; (iii) the total frequency of SSRs in plasmid regions is lower than in genomic regions; (iv) in all investigated strains, AT/TA repeats predominate over GC/CG repeats in genomic as well as plasmid sequences; and (v) the AT dinucleotide predominates in all six Staphylococcus genome sequences.

2.
Survey of simple sequence repeats in completed fungal genomes (total citations: 7; self-citations: 0; citations by others: 7)
The use of simple sequence repeats or microsatellites as genetic markers has become very popular because of their abundance and length variation between different individuals. SSRs are tandem repeat units of 1 to 6 base pairs that are found abundantly in many prokaryotic and eukaryotic genomes. This is the first study examining and comparing SSRs in completely sequenced fungal genomes. We analyzed and compared the occurrences, relative abundance, relative density, most common, and longest SSRs in nine taxonomically different fungal species: Aspergillus nidulans, Cryptococcus neoformans, Encephalitozoon cuniculi, Fusarium graminearum, Magnaporthe grisea, Neurospora crassa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Ustilago maydis. Our analysis revealed that, in all of the genomes studied, the occurrence, abundance, and relative density of SSRs varied and were not influenced by genome size. No correlation between relative abundance and genome size was observed, but N. crassa, the largest genome analyzed, had the highest relative abundance of SSRs. In most genomes, mononucleotide, dinucleotide, and trinucleotide repeats were more abundant than SSRs with longer repeat units. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit length increased. Furthermore, each organism had its own most common and longest SSRs. Our analysis showed that the relative abundance of SSRs in fungi is low compared with the human genome and that longer SSRs in fungi are rare. In addition to providing new information concerning the abundance of SSRs for each of these fungi, the results provide a general source of molecular markers that could be useful for a variety of applications such as population genetics and strain identification of fungal organisms.

3.
Simple Sequence Repeats (SSRs) represent short tandem duplications found within all eukaryotic organisms. To examine the distribution of SSRs in the genome of Brassica rapa ssp. pekinensis, SSRs from different genomic regions representing 17.7 Mb of genomic sequence were surveyed. SSRs appear more abundant in non-coding regions (86.6%) than in coding regions (13.4%). Comparison of SSR densities in different genomic regions demonstrated that SSR density was greatest within the 5'-flanking regions of the predicted genes. The proportion of different repeat motifs varied between genomic regions, with trinucleotide SSRs more prevalent in predicted coding regions, reflecting the codon structure in these regions. SSRs were also preferentially associated with gene-rich regions, with peri-centromeric heterochromatin SSRs mostly associated with retrotransposons. These results indicate that the distribution of SSRs in the genome is non-random. Comparison of SSR abundance between B. rapa and the closely related species Arabidopsis thaliana suggests a greater abundance of SSRs in B. rapa, which may be due to the proposed genome triplication. Our results provide a comprehensive view of SSR genomic distribution and evolution in Brassica for comparison with the sequenced genomes of A. thaliana and Oryza sativa.

4.
Simple sequence repeats (SSRs) or microsatellites are a common component of genomes but vary greatly across species in their abundance. We tested the hypothesis that this variation is due in part to the AT/GC content of genomes, with genomes biased toward either high AT or high GC generating more short random repeats that are long enough to enhance expansion through slippage during replication. To test this hypothesis, we identified repeats with perfect tandem iterations of 1-6 bp from 25 protists with complete or near-complete genome sequences. As expected, the density and the frequency are highly related to genome AT content, with excellent fits to quadratic regressions with minima near a 50% AT content and rising toward both extremes. Within species, the same trends hold, except that the limited variation in AT content within each species places each mainly on the descending (GC rich), middle, or ascending (AT rich) part of the curve. The base usages of repeat motifs are also significantly correlated with genome nucleotide compositions: percentages of AT-rich motifs rise with increasing genome AT content, and vice versa for GC-rich subgroups. Amino acid homopolymer repeats also show the expected quadratic relationship, with higher abundance in species with AT content biased in either direction. Our results show that genome nucleotide composition explains up to half of the variance in the abundance and motif constitution of SSRs.

5.
Chen M, Tan Z, Zeng G. Bioinformation, 2011, 6(4): 171-172
Simple sequence repeats (SSRs) are ubiquitous short tandem repeats that are associated with various regulatory mechanisms and have been found in viral genomes. Herein, we develop MfSAT (Multi-functional SSRs Analytical Tool), a powerful new tool that can rapidly identify SSRs in multiple short viral genomes and then automatically calculate the numbers and proportions of the various SSR types (mono-, di-, tri-, tetra-, penta- and hexanucleotide repeats). Furthermore, it can also detect codon repeats and report the corresponding amino acid.
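
The kind of analysis this abstract describes (find perfect SSRs, then count and proportion them by repeat-unit length) can be sketched roughly as follows. This is not MfSAT's code: the function names, the regular-expression approach, and the minimum repeat counts in `MIN_REPEATS` are all assumptions made for illustration, since the abstract does not state the tool's thresholds or algorithm.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (1-6 bp). The abstract
# does not give MfSAT's thresholds; these values are illustrative only.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}
TYPE_NAMES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' runs already counted as mono
                hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


def summarize(hits):
    """Map each SSR type name to (count, proportion of all SSRs found)."""
    counts = Counter(TYPE_NAMES[len(motif)] for _, motif, _ in hits)
    total = sum(counts.values())
    return {name: (n, n / total) for name, n in counts.items()}


if __name__ == "__main__":
    demo = "GG" + "A" * 8 + "CC" + "AT" * 4 + "T" + "CAG" * 3 + "T"
    ssrs = find_ssrs(demo)
    print(ssrs)
    print(summarize(ssrs))
```

The sketch only handles perfect repeats over A, C, G and T and reports the leftmost reading frame of each repeat; imperfect repeats, ambiguous bases, and codon-repeat reporting are left out.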

6.
Simple sequence repeats (SSRs) are indel mutational hotspots in genomes. In prokaryotes, SSR loci can cause phase variation, a microbial survival strategy that relies on stochastic, reversible on-off switching of gene activity. By analyzing multiple strains of 42 fully sequenced prokaryotic species, we measure the relative variability and density distribution of SSRs in coding regions. We demonstrate that repeat type strongly influences indel mutation rates, and that the most mutable types are most strongly avoided across genomes. We thoroughly characterize SSR density and variability as a function of N→C position along protein sequences. Using codon-shuffling algorithms that preserve amino acid sequence, we assess evolutionary pressures on SSRs. We find that coding sequences suppress repeats in the middle of proteins, and enrich repeats near termini, yielding U-shaped SSR density curves. We show that for many species this characteristic shape can be attributed to purely biophysical constraints of protein structure. In multiple cases, however, particularly in certain pathogenic bacteria, we observe over-enrichment of SSRs near protein N-termini significantly beyond expectation based on structural constraints. This increases the probability that frameshifts result in non-functional proteins, revealing that these species may evolutionarily tune SSR positions in coding regions to facilitate phase variation.
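
One simple way to shuffle codons while preserving the amino acid sequence is to permute codons only among positions that encode the same amino acid. The sketch below illustrates that idea under stated assumptions; it is not the authors' algorithm, which the abstract does not describe. The function names are hypothetical, the standard genetic code is assumed, and the input is assumed to be an uppercase, in-frame coding sequence whose length is a multiple of three.

```python
import random
from collections import defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split an in-frame coding sequence into a list of codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate an in-frame coding sequence ('*' marks stop codons)."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def shuffle_synonymous(cds, rng):
    """Permute codons among positions encoding the same amino acid.

    The protein sequence and the overall codon usage are unchanged; only
    the placement of synonymous codons (and hence of repeats) varies.
    """
    codons = split_codons(cds)
    positions = defaultdict(list)
    for idx, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(idx)
    shuffled = list(codons)
    for idxs in positions.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)


if __name__ == "__main__":
    cds = "ATGAAAAAGAAAAAGCTGTTACTGTAA"
    null_seq = shuffle_synonymous(cds, random.Random(0))
    print(translate(cds), translate(null_seq))
```

Comparing SSR density in the real sequence against many such shuffled sequences would give a null expectation that holds protein sequence and codon usage fixed, which is the general purpose the abstract attributes to codon shuffling.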

7.
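Hepatitis B virus (HBV) belongs to the family Hepadnaviridae; it is distributed worldwide and poses a serious threat to human health. In this study, 106 full-length genome sequences of subtype B2 and 130 of subtype C2 were downloaded from NCBI GenBank and used to analyze the distribution of simple sequence repeats (SSRs) in the B2 and C2 genome sequences. The results show that the SSRs have relatively few repeat iterations and that dinucleotide SSRs are overwhelmingly predominant among the five SSR types studied, which may be related to the relatively high mutation rate of the B2 and C2 genome sequences. In addition, the most common SSRs and the mean relative abundance and mean relative density of SSRs across sequences were found to be similarly distributed in the B2 and C2 subtypes.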

8.
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and trinucleotide repeats in Drosophila also seemed to be longer. Although the trends for different repeats are similar between different chromosomes within a genome, the density of repeats may vary between different chromosomes of the same species. The abundance or rarity of various di- and trinucleotide repeats in different genomes cannot be explained by nucleotide composition of a sequence or potential of repeated motifs to form alternative DNA structures. This suggests that in addition to nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. Moreover, analysis of complete genome coding DNA sequences of Drosophila, C. elegans, and yeast indicated that expansions of codon repeats corresponding to small hydrophilic amino acids are tolerated more, while strong selection pressures probably eliminate codon repeats encoding hydrophobic and basic amino acids. The locations and sequences of all of the repeat loci detected in genome sequences and coding DNA sequences are available at http://www.ncl-india.org/ssr and could be useful for further studies.

9.
Simple sequence repeats (SSRs) can be derived from the complete genome sequence. These markers are important for gene mapping as well as marker-assisted selection (MAS). To develop SSRs for cotton gene mapping, we selected the complete genome sequence of Gossypium raimondii, which consisted of 4447 non-redundant scaffolds. Out of 775.2 Mb of sequence examined, a total of 136,345 microsatellites were identified, a density of one SSR per 5.69 kb in the G. raimondii genome, leading to the development of 112,177 primer pairs. The distribution of SSRs in the genome was non-random. Among the different motifs ranging from 1 to 6 bp, penta-nucleotide repeats were most abundant (30.5%), followed by tetra-nucleotide repeats (18.2%) and di-nucleotide repeats (16.9%). Among all 457 identified motif types, the most frequently occurring repeat motifs were poly-AT/TA, which accounted for 79.8% of the total di-nucleotide SSRs, followed by AAAT/TTTA with 51.5% of the total tetra-nucleotide SSRs. Further, 18,834 microsatellites were detected in protein-coding genes, and the frequency of genes containing SSRs was 46.0% among the 40,976 genes of G. raimondii. The genome-based SSRs developed in the present study will lay the groundwork for developing large numbers of SSR markers for genetic mapping, gene discovery, genetic diversity analysis, and MAS breeding in cotton.

10.
Simple sequence repeats are predominantly found in most organisms. They play a major role in studies of genetic diversity, and are useful as diagnostic markers for many diseases. The simple sequence repeats database (SSRD) for the human genome was created for easy access to such repeats, for analysis, and to be used to understand their biological significance. The data includes the abundance and distribution of SSRs in the coding and non-coding regions of the genome, as well as their association with the UTRs of genes. The exact locations of repeats with respect to genomic regions (such as UTRs, exons, introns or intergenic regions) and their association with STS markers are also highlighted. The resource will facilitate repeat sequence analysis in the human genome and the understanding of the functional and evolutionary significance of simple sequence repeats. SSRD is available through two websites, http://www.ccmb.res.in/ssr and http://www.ingenovis.com/ssr.

11.
Simple sequence repeat (SSR) markers are widely used in many plant and animal genomes due to their abundance, hypervariability, and suitability for high-throughput analysis. Development of SSR markers using molecular methods is time consuming, laborious, and expensive. Use of computational approaches to mine the ever-increasing number of sequences in public databases, such as expressed sequence tags (ESTs), permits rapid and economical discovery of SSRs. Most such efforts to date have focused on mining SSRs from monocotyledonous ESTs. In this study, we computationally mined and examined the abundance of SSRs in more than 1.54 million ESTs belonging to 55 dicotyledonous species. The frequency of ESTs containing SSRs among species ranged from 2.65% to 16.82%. Dinucleotide repeats were found to be the most abundant, followed by tri- or mono-nucleotide repeats. The motifs A/T, AG/GA/CT/TC, and AAG/AGA/GAA/CTT/TTC/TCT were the predominant mono-, di-, and tri-nucleotide SSRs, respectively. Most of the mononucleotide SSRs contained 15-25 repeats, whereas the majority of the di- and tri-nucleotide SSRs contained 5-10 repeats. The comprehensive SSR survey data presented here demonstrate the potential of in silico mining of ESTs for rapid development of SSR markers for genetic analysis and applications in dicotyledonous crops.

12.
The Limnanthaceae (Order Brassicales) is a family of 18 taxa of Limnanthes (meadowfoam) native to California, Oregon, and British Columbia. Cultivated meadowfoam (L. alba Benth.), a recently domesticated plant, has been the focus of research and development as an industrial oilseed for three decades. The goal of the present research was to develop several hundred simple sequence repeat (SSR) markers for genetic mapping, molecular breeding, and genomics research in wild and cultivated meadowfoam taxa. We developed 389 SSR markers for cultivated meadowfoam by isolating and sequencing 1,596 clones from L. alba genomic DNA libraries enriched for (AG)n or (AC)n repeats, identifying one or more unique SSRs in 696 clone sequences, and designing and testing primers for 624 unique SSRs. The SSR markers were screened for cross-taxa utility and polymorphisms among ten of 17 taxa in the Limnanthaceae; 373 of these markers were polymorphic and 106 amplified loci from every taxon. Cross-taxa amplification percentages ranged from 37.3% in L. douglasii ssp. rosea (145/389) to 85.6% in L. montana (333/389). The SSR markers amplified 4,160 unique bands from 14 genotypes sampled from ten taxa (10.7 unique bands per SSR marker), of which 972 were genotype-specific. Mean and maximum haplotype heterozygosities were 0.71 and 0.90, respectively, among six L. alba genotypes and 0.63 and 0.93, respectively, among 14 genotypes (ten taxa). The SSR markers supply a critical mass of high-throughput DNA markers for biological and agricultural research across the Limnanthaceae and open the way to the development of a genetic linkage map for meadowfoam (x = 5). Electronic Supplementary Material: Supplementary material is available in the online version of this article. Communicated by O. Savolainen.

13.
The physical distribution of ten simple-sequence repeated DNA motifs (SSRs) was studied on chromosomes of bread wheat, rye and hexaploid triticale. Oligomers with repeated di-, tri- or tetra-nucleotide motifs were used as probes for fluorescence in situ hybridization to root-tip metaphase and anther pachytene chromosomes. All motifs showed dispersed hybridization signals of varying strengths on all chromosomes. In addition, the motifs (AG)12, (CAT)5, (AAG)5, (GCC)5 and, in particular, (GACA)4 hybridized strongly to pericentromeric and multiple intercalary sites on the B genome chromosomes and on chromosome 4A of wheat, giving diagnostic patterns that resembled N-banding. In rye, all chromosomes showed strong hybridization of (GACA)4 at many intercalary sites that did not correspond to any other known banding pattern, but allowed identification of all R genome chromosome arms. Overall, SSR hybridization signals were found in related chromosome positions independently of the motif used and showed remarkably similar distribution patterns in wheat and rye, indicating the special role of SSRs in chromosome organization as a possible ancient genomic component of the tribe Triticeae (Gramineae). Received: 13 February 1998; in revised form: 18 August 1998 / Accepted: 18 August 1998

14.
The type and frequency of simple sequence repeats (SSRs) in plant genomes were investigated using the expanding quantity of DNA sequence data deposited in public databases. In Arabidopsis, 306 genomic DNA sequences longer than 10 kb and 36,199 EST sequences were searched for all possible mono- to pentanucleotide repeats. The average frequency of SSRs was one every 6.04 kb in genomic DNA, decreasing to one every 14 kb in ESTs. SSR frequency and type differed between coding, intronic, and intergenic DNA. Similar frequencies were found in other plant species. On the basis of these findings, an approach is proposed and demonstrated for the targeted isolation of single or multiple, physically clustered SSRs linked to any gene that has been mapped using low-copy DNA-based markers. The approach involves sample sequencing a small number of subclones of selected randomly sheared large insert DNA clones (e.g., BACs). It is shown to be both feasible and practicable, given the probability of fortuitously sequencing through an SSR. The approach is demonstrated in barley, where sample sequencing 34 subclones of a single BAC selected by hybridization to the Big1 gene revealed three SSRs. These allowed Big1 to be located at the top of barley linkage group 6HS.

15.
Simple sequence repeats (SSRs) are becoming standard DNA markers for plant genome analysis and are being used as markers in marker-assisted breeding. Because of their significance, we initiated this study to analyze the complete genome of Arabidopsis thaliana for the prevalence of mono-, di-, tri-, tetra-, penta- and hexamer repeats in the coding and non-coding regions of the chromosomes and to map their exact positions on the sequence. We have developed a program that can search for a repeat of any length and report its exact position on the chromosome and its frequency of occurrence in the genome. Analysis of the results reveals that the maximum number of repeats was found in chromosome 1, followed by chromosomes 2 and 4, whereas chromosomes 3 and 5 contain relatively fewer of these repeats. Among the SSRs, hexamers and dimers were more predominant in the chromosomes. Overall, the data showed that chromosome 5 has the minimum number of repeats. The abundance or rarity of various simple repeats in different chromosomes is not explained by the nucleotide composition of the sequence or by the potential of repeated motifs to form alternative DNA structures. This suggests that, in addition to the nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. The positional information is given at www.geocities.com/amubioinfo/ARD. This positional information can help Arabidopsis researchers to identify new polymorphisms in chromosomal regions of interest based on the SSRs that map in the area.

16.
The abundance of simple sequence repeats (SSRs) or microsatellites and their inherent potential for extensive allelic variation make them a valuable source of genetic markers in eukaryotes. In this study, we analyzed and compared the abundance and organisation of SSRs in the genomes of two important fungal pathogens of wheat, brown or leaf rust (Puccinia triticina) and black or stem rust (Puccinia graminis f. sp. tritici). The P. triticina genome, which is twice the size of the P. graminis f. sp. tritici genome, has lower relative abundance and SSR density. The distribution pattern of different SSR motifs provides evidence of greater accumulation of dinucleotide repeats, followed by trinucleotide repeats. More than two hundred different types of repeat motifs were observed in the genomes. The longest SSR motifs differed between the two genomes, and some of the repeat motifs were found at higher frequency. The information on relative abundance, relative density, length and frequency of different repeat motifs in Puccinia sp. will be useful for developing SSR markers that could find several applications in the analysis of fungal genomes, such as genetic diversity, population genetics, race identification and acquisition of new virulence.

17.
Although molecular markers and DNA sequence data are now available for many crop species, our ability to identify genetic variation associated with functional or adaptive diversity is still limited. In this study, our aim was to quantify and characterize diversity in a panel of cultivated and wild sorghums (Sorghum bicolor), establish genetic relationships, and, simultaneously, identify selection signals that might be associated with sorghum domestication. We assayed 98 simple sequence repeat (SSR) loci distributed throughout the genome in a panel of 104 accessions comprising 73 landraces (i.e., cultivated lines) and 31 wild sorghums. Evaluation of SSR polymorphisms indicated that landraces retained 86% of the diversity observed in the wild sorghums. The landraces and wilds were moderately differentiated (Fst = 0.13), but there was little evidence of population differentiation among racial groups of cultivated sorghums (Fst = 0.06). Neighbor-joining analysis showed that wild sorghums generally formed a distinct group, and about half the landraces tended to cluster by race. Overall, bootstrap support was low, indicating a history of gene flow among the various cultivated types or recent common ancestry. Statistical methods (Ewens-Watterson test for allele excess, lnRH, and Fst) for identifying genomic regions with patterns of variation consistent with selection gave significant results for 11 loci (approx. 15% of the SSRs used in the final analysis). Interestingly, seven of these loci mapped in or near genomic regions associated with domestication-related QTLs (i.e., shattering, seed weight, and rhizomatousness). We anticipate that such population genetics-based statistical approaches will be useful for re-evaluating extant SSR data for mining interesting genomic regions from germplasm collections. Electronic Supplementary Material: Supplementary material is available for this article.

18.
Simple sequence repeats (SSRs) composed of extensive tandem iterations of a single nucleotide or a short oligonucleotide are rare in most bacterial genomes, but they are common among Mycoplasma. Some of these repeats act as contingency loci in association with families of surface antigens. By contraction or expansion during replication, these SSRs increase genetic variance of the population and facilitate avoidance of the immune response of the host. Occurrence and distribution of SSRs are analyzed in complete genomes of 11 Mycoplasma and 3 related Mollicutes in order to gain insights into functional and evolutionary diversity of the SSRs in Mycoplasma. The results revealed an unexpected variety of SSRs with respect to their distribution and composition and suggest that it is unlikely that all SSRs function as contingency loci or recombination hot spots. Various types of SSRs are most abundant in Mycoplasma hyopneumoniae, whereas Mycoplasma penetrans, Mycoplasma mobile, and Mycoplasma synoviae do not contain unusually long SSRs. Mycoplasma hyopneumoniae and Mycoplasma pulmonis feature abundant short adenine and thymine runs periodically spaced at 11 and 12 bp, respectively, which likely affect the supercoiling propensities of the DNA molecule. Physiological roles of long adenine and thymine runs in M. hyopneumoniae appear independent of location upstream or downstream of genes, unlike contingency loci that are typically located in protein-coding regions or upstream regulatory regions. Comparisons among 3 M. hyopneumoniae strains suggest that the adenine and thymine runs are rarely involved in genome rearrangements. The results indicate that the SSRs in the Mycoplasma genomes play diverse roles, including modulating gene expression as contingency loci, facilitating genome rearrangements via recombination, affecting protein structure and possibly protein-protein interactions, and contributing to the organization of the DNA molecule in the cell.

19.
The zebrafish has drawn a great deal of attention as a developmental system because it offers the ability to combine excellent embryology and genetics. Here, we report that simple sequence repeats are abundant in the zebrafish genome and are highly polymorphic between two outbred lines, making them useful markers for the construction of a genetic map of this organism.

20.
The abundance of simple sequence repeats (SSRs) or microsatellites and their inherent potential for variation make them a valuable source of genetic markers in eukaryotes. We describe the organization and abundance of SSRs in the fungus Fusarium graminearum (the causative agent of Fusarium head blight or head scab of wheat). We identified 1705 SSRs of various nucleotide repeat motifs in the sequence database of F. graminearum. Mononucleotide repeats (62%) were most abundant, followed by di- (20%) and trinucleotide repeats (14%). Tetra-, penta- and hexanucleotide repeats accounted for only 4% of SSRs. The estimated frequency of Class I SSRs (perfect repeats ≥20 nucleotides) was one SSR per 124.5 kb, whereas the frequency of Class II SSRs (perfect repeats >10 nucleotides and <20 nucleotides) was one SSR per 25.6 kb. The dynamics of SSRs will be a powerful tool for taxonomic, phylogenetic, genome mapping and population genetic studies, as SSR-based markers show high levels of allelic variation, codominant inheritance and ease of analysis.
