Similar articles
20 similar articles found (search time: 9 ms)
1.
Simple sequence repeats (SSRs), or microsatellites, are repetitive nucleotide sequences built from motifs 1–6 bp long. They are scattered throughout the genomes of all known organisms, from viruses to eukaryotes. Microsatellites mutate through insertions and deletions (indels) of their repeat units, with some bias towards insertions, which leads to expansion of the microsatellite tract. Although prokaryotic genomes derive some plasticity from microsatellite mutations, they have built-in mechanisms to arrest undue expansion of microsatellites; one such mechanism is constituted by the post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these enzymes, and as a null hypothesis one could expect these genomes to harbour many long tracts. It is therefore interesting to analyse the mycobacterial genomes for the distribution and abundance of microsatellite tracts and to look for potentially polymorphic microsatellites. The available mycobacterial genomes, Mycobacterium avium, M. leprae, M. bovis and the two strains of M. tuberculosis (CDC1551 and H37Rv), were analysed for frequencies and abundance of SSRs. Our analysis revealed that the SSRs are distributed throughout the mycobacterial genomes at an average of 220–230 SSR tracts per kb. All the mycobacterial genomes contain few regions that are conspicuously denser or poorer in microsatellites compared to their expected genome averages. The genomes distinctly show a scarcity of long microsatellites despite the absence of a post-replicative DNA repair system. Such severe scarcity of long microsatellites could arise as a result of strong selection pressures operating against long and unstable sequences, although the influence of GC content and the role of point mutations in arresting microsatellite expansions cannot be ruled out. Nonetheless, the long tracts occasionally found in coding as well as non-coding regions may account for limited genome plasticity in these genomes.
Supplementary data pertaining to this article are available on the Journal of Biosciences website.
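
The definition of an SSR used in abstract 1 (tandem repeats of a 1–6 bp motif) can be sketched as a small regular-expression scan. This is only an illustration, not the authors' pipeline: the function name `find_ssrs` and the minimum repeat counts are assumptions chosen for the example, and only perfect (uninterrupted) repeats are detected.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs
    with motif lengths of 1-6 bp."""
    # Hypothetical thresholds (motif length -> minimum number of copies);
    # published surveys each choose their own cut-offs.
    if min_repeats is None:
        min_repeats = {1: 6, 2: 4, 3: 3, 4: 3, 5: 3, 6: 3}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # ([ACGT]{k}) captures one motif; \1{n-1,} demands at least n-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" as a
            # dinucleotide), so a poly-A run is reported once, as mononucleotide.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

Calling `find_ssrs` on a genome string returns 0-based start positions, which could then be binned to compare local SSR density against the genome average.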

2.
Simple sequence repeats in the Helicobacter pylori genome (cited 5 times: 4 self-citations, 1 by others)
We describe an integrated system for the analysis of DNA sequence motifs within complete bacterial genome sequences. This system is based around ACeDB, a genome database with an integrated graphical user interface; we identify and display motifs in the context of genetic, sequence and bibliographic data. Tomb et al. (1997) previously reported the identification of contingency genes in Helicobacter pylori through their association with homopolymeric tracts and dinucleotide repeats. With this as a starting point, we validated the system by a search for this type of repeat and used the contextual information to assess the likelihood that they mediate phase variation in the associated open reading frames (ORFs). We found all of the repeats previously described, and identified 27 putative phase-variable genes (including 17 previously described). These could be divided into three groups: lipopolysaccharide (LPS) biosynthesis, cell-surface-associated proteins and DNA restriction/modification systems. Five of the putative genes did not have obvious homologues in any of the public domain sequence databases. The reading frame of some ORFs was disrupted by the presence of the repeats, including the alpha(1-2) fucosyltransferase gene, necessary for the synthesis of the Lewis Y epitope. An additional benefit of this approach is that the results of each search can be analysed further and compared with those from other genomes. This revealed that H. pylori has an unusually high frequency of homopurine:homopyrimidine repeats, suggesting mechanistic biases that favour their presence and instability.

3.
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and trinucleotide repeats in Drosophila also seemed to be longer. Although the trends for different repeats are similar between different chromosomes within a genome, the density of repeats may vary between different chromosomes of the same species. The abundance or rarity of various di- and trinucleotide repeats in different genomes cannot be explained by nucleotide composition of a sequence or potential of repeated motifs to form alternative DNA structures. This suggests that in addition to nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. Moreover, analysis of complete genome coding DNA sequences of Drosophila, C. elegans, and yeast indicated that expansions of codon repeats corresponding to small hydrophilic amino acids are tolerated more, while strong selection pressures probably eliminate codon repeats encoding hydrophobic and basic amino acids. The locations and sequences of all of the repeat loci detected in genome sequences and coding DNA sequences are available at http://www.ncl-india.org/ssr and could be useful for further studies.

4.

Background

On most common microarray platforms, many genes are represented by multiple probes. Although this is quite common, no one has systematically explored the concordance between probes mapped to the same gene.

Results

Here we present an analysis of all cases in which multiple probe sets measure the same gene on the Affymetrix U133a GeneChip. We found that although in the majority of cases the two measurements tend to agree, there are a significant number of cases in which they differ from each other. In these cases the measurements cannot simply be averaged but should rather be handled individually.

Conclusion

Our analysis allows us to provide a comprehensive list of the correlation between all pairs of probe sets that are mapped to the same gene and thus allows microarray users to sort out the cases that deserve further analysis. Comparison between the set of highly correlated pairs and the set of pairs that tend to differ from each other reveals potential factors that may affect this concordance.
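
The pairwise comparison described in abstract 4 might be sketched as follows. The abstract does not say which correlation measure was used, so the Pearson coefficient here is an assumption, as are the function names and the probe-set and gene identifiers used when calling it.

```python
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.
    Raises ZeroDivisionError if either sequence is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def probe_pair_correlations(gene_to_probes, expression):
    """gene_to_probes: gene -> list of probe-set IDs mapped to that gene.
    expression: probe-set ID -> list of values across the same samples.
    Returns {(gene, probe_1, probe_2): correlation} for every pair."""
    result = {}
    for gene, probes in gene_to_probes.items():
        for p, q in combinations(sorted(probes), 2):
            result[(gene, p, q)] = pearson(expression[p], expression[q])
    return result
```

Pairs with a low or negative coefficient would be the ones to handle individually rather than average.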

5.
The tailing genome walking strategies are simple and efficient. However, they sometimes can be restricted due to the low stringency of homo-oligomeric primers. Here we modified their conventional tailing step by adding polythymidine and polyguanine to the target single-stranded DNA (ssDNA). The tailed ssDNA was then amplified exponentially with a specific primer in the known region and a primer comprising 5′ polycytosine and 3′ polyadenosine. The successful application of this novel method for identifying integration sites mediated by φC31 integrase in goat genome indicates that the method is more suitable for genomes with high complexity and local GC content.

6.
Ouyang Q, Zhao X, Feng H, Tian Y, Li D, Li M, Tan Z. Gene, 2012, 499(1): 37-40
The presence, locations and composition of simple sequence repeats (SSRs) in the Herpes simplex virus type 1 (HSV-1) genome were extracted and analyzed using the software Imperfect Microsatellite Extractor (IMEx). There were 663 mono-, 502 di-, 184 tri-, 20 tetra-, 4 penta- and 4 hexanucleotide SSRs, distributed differently between coding and noncoding regions of the HSV-1 genome. G/C, GC/CG, and (GGC)n were predominant among mononucleotide, dinucleotide, and trinucleotide repeats, respectively. Indeed, the results showed that GC content in simple sequence repeats was notably higher than that in the entire HSV-1 genome. Our data might be helpful for studying the pathogenesis, genome structure and evolution of HSV-1.

7.
MOTIVATION: Simple sequence repeats (SSRs) are abundant across genomes. However, the significance of SSRs in organellar genomes of rice has not been completely understood. The availability of organellar genome sequences allows us to understand the organization of SSRs in their genic and intergenic regions. RESULTS: We have analyzed SSRs in mitochondrial and chloroplast genomes of rice. We identified 2528 SSRs in the mitochondrial genome and an average of 870 SSRs in the chloroplast genomes. About 8.7% of the mitochondrial and 27.5% of the chloroplast SSRs were observed in the genic region. Dinucleotides were the most abundant repeats in genic and intergenic regions of the mitochondrial genome, while mononucleotides were predominant in the chloroplast genomes. The rps and nad gene clusters of mitochondria had the maximum repeats, while the rpo and ndh gene clusters of chloroplast had the maximum repeats. We identified SSRs in both organellar genomes and validated them in different cultivars and species.

8.
Musto et al. [H. Musto, H. Naya, A. Zavala, H. Romero, F. Alvarez-Valin, G. Bernardi, Genomic GC level, optimal growth temperature, and genome size in prokaryotes, Biochem. Biophys. Res. Commun. 347 (2006) 1-3] recently reported a linear correlation between GC content and genome length. The regression model was heteroscedastic, which suggested that the relationship might be more clearly defined. Alternative regression models (R² > 0.95) were fitted to a set of over 900 sequences compliant with Chargaff’s second parity rule. The new models suggest that the relationship between GC content and genome length is more complex than was originally suggested. While similar models can be derived for non-Chargaff compliant genomes, their interpretation is likely to be more difficult.

9.
Rickettsia typhi, the causative agent of murine typhus, is an obligate intracellular bacterium with a life cycle involving both vertebrate and invertebrate hosts. Here we present the complete genome sequence of R. typhi (1,111,496 bp) and compare it to the two published rickettsial genome sequences: R. prowazekii and R. conorii. We identified 877 genes in R. typhi encoding 3 rRNAs, 33 tRNAs, 3 noncoding RNAs, and 838 proteins, 3 of which are frameshifts. In addition, we discovered more than 40 pseudogenes, including the entire cytochrome c oxidase system. The three rickettsial genomes share 775 genes: 23 are found only in R. prowazekii and R. typhi, 15 are found only in R. conorii and R. typhi, and 24 are unique to R. typhi. Although most of the genes are colinear, there is a 35-kb inversion in gene order, which is close to the replication terminus, in R. typhi, compared to R. prowazekii and R. conorii. In addition, we found a 124-kb R. typhi-specific inversion, starting 19 kb from the origin of replication, compared to R. prowazekii and R. conorii. Inversions in this region are also seen in the unpublished genome sequences of R. sibirica and R. rickettsii, indicating that this region is a hot spot for rearrangements. Genome comparisons also revealed a 12-kb insertion in the R. prowazekii genome, relative to R. typhi and R. conorii, which appears to have occurred after the typhus (R. prowazekii and R. typhi) and spotted fever (R. conorii) groups diverged. The three-way comparison allowed further in silico analysis of the SpoT split genes, leading us to propose that the stringent response system is still functional in these rickettsiae.

10.
Simple sequence repeats in Cucumis mapping and map merging. (cited 14 times: 0 self-citations, 14 by others)
Thirty-four polymorphic simple-sequence repeats (SSRs) were evaluated for length polymorphism in melon (Cucumis melo L.) and cucumber (Cucumis sativus L.). SSR markers were located on three melon maps (18 on the map of 'Vedrantais' and PI 161375, 23 on the map of 'Piel de Sapo' and PI 161375, and 16 on the map of PI 414723 and 'Dulce'). In addition, 14 of the markers were located on the cucumber map of GY14 and PI 183967. SSRs proved to be randomly distributed throughout the melon and cucumber genomes. Mapping of the SSRs in the different maps led to the cross-identification of seven linkage groups in all melon maps. In addition, nine SSRs were common to both melon and cucumber maps. The potential of SSR markers as anchor points for melon-map merging and for comparative mapping with cucumber was demonstrated.

11.
Survey of simple sequence repeats in completed fungal genomes (cited 7 times: 0 self-citations, 7 by others)
The use of simple sequence repeats or microsatellites as genetic markers has become very popular because of their abundance and length variation between different individuals. SSRs are tandem repeat units of 1 to 6 base pairs that are found abundantly in many prokaryotic and eukaryotic genomes. This is the first study examining and comparing SSRs in completely sequenced fungal genomes. We analyzed and compared the occurrences, relative abundance, relative density, most common, and longest SSRs in nine taxonomically different fungal species: Aspergillus nidulans, Cryptococcus neoformans, Encephalitozoon cuniculi, Fusarium graminearum, Magnaporthe grisea, Neurospora crassa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Ustilago maydis. Our analysis revealed that, in all of the genomes studied, the occurrence, abundance, and relative density of SSRs varied and were not influenced by the genome sizes. No correlation between relative abundance and the genome sizes was observed, but it was shown that N. crassa, the largest genome analyzed, had the highest relative abundance of SSRs. In most genomes, mononucleotide, dinucleotide, and trinucleotide repeats were more abundant than the longer repeated SSRs. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit increased. Furthermore, each organism had its own common and longest SSRs. Our analysis showed that the relative abundance of SSRs in fungi is low compared with the human genome and that longer SSRs in fungi are rare. In addition to providing new information concerning the abundance of SSRs for each of these fungi, the results provide a general source of molecular markers that could be useful for a variety of applications such as population genetics and strain identification of fungal organisms.

12.
Simple repetitive sequences in the genomes of archaebacteria (cited 1 time: 0 self-citations, 1 by others)
Stretches of simple sequences poly(dG-dT).poly(dC-dA), poly(dG-dA).poly(dC-dT), poly(dG).poly(dC) and poly(dA).poly(dT), the occurrence of which is a characteristic feature of eukaryotic genomes, are found in the genomes of archaebacteria Halobacterium halobium and Sulfolobus acidocaldarius. In S. acidocaldarius these sequences constitute a considerable portion of the genome; they belong to a class of repetitive sequences dispersed throughout the genome, being transcribed and found in RNAs of different lengths.

13.

Background

Polymorphic tandem repeat typing is a new generic technology which has been proved to be very efficient for bacterial pathogens such as B. anthracis, M. tuberculosis, P. aeruginosa, L. pneumophila, Y. pestis. The previously developed tandem repeats database takes advantage of the release of genome sequence data for a growing number of bacteria to facilitate the identification of tandem repeats. The development of an assay then requires the evaluation of tandem repeat polymorphism on well-selected sets of isolates. In the case of major human pathogens, such as S. aureus, more than one strain is being sequenced, so that tandem repeats most likely to be polymorphic can now be selected in silico based on genome sequence comparison.

Results

In addition to the previously described general Tandem Repeats Database, we have developed a tool to automatically identify tandem repeats of a different length in the genome sequence of two (or more) closely related bacterial strains. Genome comparisons are pre-computed. The results of the comparisons are parsed in a database, which can be conveniently queried over the internet according to criteria of practical value, including repeat unit length, predicted size difference, etc. Comparisons are available for 16 bacterial species, and the orthopox viruses, including the variola virus and three of its close neighbors.

Conclusions

We are presenting an internet-based resource to help develop and perform tandem repeats based bacterial strain typing. The tools accessible at http://minisatellites.u-psud.fr now comprise four parts. The Tandem Repeats Database enables the identification of tandem repeats across entire genomes. The Strain Comparison Page identifies tandem repeats differing between different genome sequences from the same species. The "Blast in the Tandem Repeats Database" facilitates the search for a known tandem repeat and the prediction of amplification product sizes. The "Bacterial Genotyping Page" is a service for strain identification at the subspecies level.

14.
The annotated Arabidopsis genome sequence was exploited as a tool for carrying out comparative analyses of the Arabidopsis and Capsella rubella genomes. Comparison of a set of random, short C. rubella sequences with the corresponding sequences in Arabidopsis revealed that aligned protein-coding exon sequences differ from aligned intron or intergenic sequences in respect to the degree of sequence identity and the frequency of small insertions/deletions. Molecular-mapped markers and expressed sequence tags derived from Arabidopsis were used for genetic mapping in a population derived from an interspecific cross between Capsella grandiflora and C. rubella. The resulting eight Capsella linkage groups were compared to the sequence maps of the five Arabidopsis chromosomes. Fourteen colinear segments spanning approximately 85% of the Arabidopsis chromosome sequence maps and 92% of the Capsella genetic linkage map were detected. Several fusions and fissions of chromosomal segments as well as large inversions account for the observed arrangement of the 14 colinear blocks in the analyzed genomes. In addition, evidence for small-scale deviations from genome colinearity was found. Colinearity between the Arabidopsis and Capsella genomes is more pronounced than has been previously reported for comparisons between Arabidopsis and different Brassica species.

15.
The physical distribution of ten simple-sequence repeated DNA motifs (SSRs) was studied on chromosomes of bread wheat, rye and hexaploid triticale. Oligomers with repeated di-, tri- or tetra-nucleotide motifs were used as probes for fluorescence in situ hybridization to root-tip metaphase and anther pachytene chromosomes. All motifs showed dispersed hybridization signals of varying strengths on all chromosomes. In addition, the motifs (AG)12, (CAT)5, (AAG)5, (GCC)5 and, in particular, (GACA)4 hybridized strongly to pericentromeric and multiple intercalary sites on the B genome chromosomes and on chromosome 4A of wheat, giving diagnostic patterns that resembled N-banding. In rye, all chromosomes showed strong hybridization of (GACA)4 at many intercalary sites that did not correspond to any other known banding pattern, but allowed identification of all R genome chromosome arms. Overall, SSR hybridization signals were found in related chromosome positions independently of the motif used and showed remarkably similar distribution patterns in wheat and rye, indicating the special role of SSRs in chromosome organization as a possible ancient genomic component of the tribe Triticeae (Gramineae). Received: 13 February 1998; in revised form: 18 August 1998 / Accepted: 18 August 1998

16.
Japanese red pine Pinus densiflora has 2n = 24 chromosomes. After FISH detection of Arabidopsis-type (A-type) telomere sequences, many telomere signals were observed on these chromosomes at interstitial and proximal regions in addition to the chromosome ends. These interstitial and proximal signal sites were observed as DAPI-positive bands, suggesting that the interstitial and proximal telomere signal sites are composed of AT-rich highly repetitive sequences. Four DNA clones (PAL810, PAL1114, PAL1539, PAL1742) localized at the interstitial telomere signals were selected from an AluI-digested genomic DNA library using colony blot hybridization probed with A-type telomere sequences and characterized using FISH and Southern blot hybridization. The AT contents of these four selected clones were 60.8–76.3%, and repeat units of the telomere sequence and degenerated telomere sequences were found in their nucleotide sequences. Except for two sites of PAL1114, FISH signals of the four clones co-localized with interstitial and proximal A-type telomere sequence signals. FISH signals showed a similar distribution pattern, but the patterns of signal intensity were different among the four clones. PAL810, PAL1539 and PAL1742 showed similar FISH signal patterns, and the differences were only with respect to the signal intensity of some signal sites. PAL1114 had unique signals that appeared on chromosomes 7 and 10. Based on the results of the Southern blot hybridization, these four sequences are not arranged tandemly. Our results suggest that the interstitial A-type telomere sequence signal sites were composed of a mixture of several AT-rich repetitive sequences and that these repetitive sequences contained A-type telomere sequences or degenerated A-type telomere sequence repeats.

17.
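Distribution model of open reading frame lengths in genomes and genome evolution (cited 3 times: 1 self-citation, 2 by others)
The distribution of the number of open reading frames (ORFs) as a function of length was analyzed in the genomes of 5 eukaryotes, 15 bacteria and 10 archaea. The distributions were similar across organisms and showed clear regularity. Various distribution models were fitted and compared, and the results showed that for every organism this distribution follows a Γ(α, β) distribution; on this basis we hypothesize that the number of ORFs as a function of length in a genome follows a Γ(α, β) distribution. Analysis of the fitted parameters for each genome showed that the values of α and β correlate clearly with genome evolution. The evolutionary significance of α and β is discussed, and we conclude that eukaryotes preferentially use long genes. Based on the Γ(α, β) distribution, the upper limit on the number of ORFs in the yeast genome is estimated to be 5870. The method is of constructive value for studying genome evolution and for assessing the reliability of theoretically predicted genes.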
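
As a rough illustration of fitting a Γ(α, β) distribution to ORF lengths as in abstract 17, the sketch below uses the method of moments. The abstract does not state how its parameters were estimated, so this estimator is a stand-in; the function name is hypothetical, and β is assumed here to be a scale parameter (mean = αβ, variance = αβ²).

```python
from statistics import mean, pvariance

def gamma_moments_fit(lengths):
    """Method-of-moments estimate of (alpha, beta) for a gamma distribution,
    with alpha as the shape and beta as the scale parameter (an assumption)."""
    m = mean(lengths)
    v = pvariance(lengths, m)  # population variance around the sample mean
    alpha = m * m / v          # from mean = alpha*beta and var = alpha*beta**2
    beta = v / m
    return alpha, beta
```

For a list of ORF lengths in codons or base pairs, `gamma_moments_fit(lengths)` returns estimates whose product reproduces the sample mean; a maximum-likelihood fit would generally give somewhat different values.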

18.
The zebrafish has drawn a great deal of attention as a developmental system because it offers the ability to combine excellent embryology and genetics. Here, we report that simple sequence repeats are abundant in the zebrafish genome and are highly polymorphic between two outbred lines, making them useful markers for the construction of a genetic map of this organism.

19.
Simple sequence repeats (SSRs) can be derived from the complete genome sequence. These markers are important for gene mapping as well as marker-assisted selection (MAS). To develop SSRs for cotton gene mapping, we selected the complete genome sequence of Gossypium raimondii, which consisted of 4447 non-redundant scaffolds. Out of 775.2 Mb of sequence examined, a total of 136,345 microsatellites were identified, with a density of 5.69 kb per SSR in the G. raimondii genome, leading to the development of 112,177 primer pairs. The distribution of SSRs in the genome was non-random. Among the different motifs ranging from 1 to 6 bp, penta-nucleotide repeats were most abundant (30.5%), followed by tetra-nucleotide repeats (18.2%) and di-nucleotide repeats (16.9%). Among all 457 identified motif types, the most frequently occurring repeat motifs were poly-AT/TA, which accounted for 79.8% of the total di-nucleotide SSRs, followed by AAAT/TTTA with 51.5% of the total tetra-nucleotide SSRs. Further, 18,834 microsatellites were detected in the protein-coding genes, and the frequency of genes containing SSRs was 46.0% in 40,976 genes of G. raimondii. These genome-based SSRs developed in the present study will lay the groundwork for developing large numbers of SSR markers for genetic mapping, gene discovery, genetic diversity analysis, and MAS breeding in cotton.
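
The density quoted in abstract 19 follows directly from the two counts it reports, as this small check shows. The helper name `kb_per_ssr` is made up for the example, and it assumes 1 Mb = 1000 kb.

```python
def kb_per_ssr(sequence_mb, ssr_count):
    """Average spacing between SSRs in kb, given sequence length in Mb."""
    return sequence_mb * 1000 / ssr_count

# 775.2 Mb examined and 136,345 microsatellites, as reported in the abstract.
density = round(kb_per_ssr(775.2, 136_345), 2)
print(density)  # 5.69
```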

20.
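Li Wei, Chen Huaigu, Li Wei, Zhang Aixiang, Chen Lihua, Jiang Weili. Yi Chuan (Hereditas, Beijing), 2007, 29(9): 1154-1160
Using public fungal genome database resources, we systematically analyzed the structural types, distribution, abundance and longest sequences of SSRs in the genomes of Sclerotinia sclerotiorum and Botrytis cinerea, and compared them with SSRs in the genomes of several previously studied plant-pathogenic fungi, including Fusarium graminearum, Magnaporthe grisea and Ustilago maydis. The results show that SSRs are very abundant in the S. sclerotiorum and B. cinerea genomes, numbering 6,539 and 8,627 respectively, and that the two genomes are similar to some extent in SSR structural types and distribution patterns. Compared with the other pathogenic fungi, the S. sclerotiorum and B. cinerea genomes are richer in long tetra-, penta- and hexanucleotide repeat motifs, which gives these two fungi higher variability. We also found that the abundance of SSRs in fungal genomes is not necessarily related to genome size or GC content. The analysis of SSR abundance, frequency of occurrence and longest motifs in the S. sclerotiorum and B. cinerea genomes provides useful information for the rapid and convenient design of highly polymorphic SSR primers.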
