Similar articles
1.
Xin D  Sun J  Wang J  Jiang H  Hu G  Liu C  Chen Q 《Molecular biology reports》2012,39(9):9047-9057
Microsatellites, or simple sequence repeats (SSRs), are very useful molecular markers for a number of plant species. We used a new publicly available module (TROLL) to extract microsatellites from the public database of soybean expressed sequence tag (EST) sequences. A total of 12,833 sequences containing di- to penta-type SSRs were identified from 200,516 non-redundant soybean ESTs. On average, one SSR was found per 7.25 kb of EST sequences, with the tri-nucleotide motifs being the most abundant. Primer sequences flanking the SSR motifs were successfully designed for 9,638 soybean ESTs using the software primer3.0 and only 59 pairs of them were found in earlier studies. We synthesized 124 pairs of the primers to determine the polymorphism and heterozygosity among eight genotypes of soybean cultivars, which represented a wide range of the cultivated soybean cultivars. PCR amplification products with anticipated SSRs were obtained with 81 pairs of primers; 36 PCR products appeared to be homozygous and the remaining 45 PCR products appeared to be heterozygous and displayed polymorphism among the eight cultivars. We further analysed the EST sequences containing 45 polymorphic EST-SSR markers using the programs BLASTN and BLASTX. Sequence alignment showed that 29 ESTs have homologous sequences and 15 ESTs could be classified into a Uni-gene cluster with comparatively convincing protein products. Among these 15 ESTs belonging to a Uni-gene cluster, 9 SSRs were located in 3'-UTR, 4 SSRs were located in the intron region and 2 SSRs were located in the CDS region. None of these SSRs was located in the 5'-UTR. These novel SSRs identified in the ESTs of soybean provide useful information for gene mapping and cloning in future studies.

2.
With the advent of high-throughput sequencing technology, sequences from many genomes are being deposited to public databases at a brisk rate. Open access to large amounts of expressed sequence tag (EST) data in the public databases has provided a powerful platform for simple sequence repeat (SSR) development in species where sequence information is not available. SSRs are markers of choice for their high reproducibility, abundant polymorphism and high inter-specific transferability. The mining of SSRs from ESTs requires several high-throughput computational tools that must be executed individually, which is computationally intensive and time-consuming. To reduce the time lag and to streamline the cumbersome process of SSR mining from ESTs, we have developed a user-friendly, web-based EST-SSR pipeline "EST-SSR-MARKER PIPELINE (ESMP)". This pipeline integrates EST pre-processing, clustering, assembly and subsequently mining of SSRs from assembled EST sequences. The mining of SSRs from ESTs provides valuable information on the abundance of SSRs in ESTs and will facilitate the development of markers for genetic analysis and related applications such as marker-assisted breeding. AVAILABILITY: The database is available for free at http://bioinfo.aau.ac.in/ESMP.

3.
Simple sequence repeats (SSRs) are valuable molecular markers in many plant species. In common wheat (Triticum aestivum L.), which is characterized by its large genome and alloploidy, SSRs are one of the most useful markers. To increase SSR marker sources and construct an SSR-based linkage map of appropriate density, we tried to develop new SSR markers from SSR-enriched genomic libraries and the public database. SSRs having (GA)n and (GT)n motifs were isolated from enriched libraries, and di- and tri-nucleotide repeats were mined from expressed sequence tags (ESTs) and DNA sequences of Triticum species in the public database. Of the 1,147 primer pairs designed, 842 primers gave accurate amplification products, and 478 primers showed polymorphism among the nine wheat lines examined. Using a doubled haploid (DH) population from an intraspecific cross between Kitamoe and Münstertaler (KM), we constructed an SSR-based linkage map that consisted of 464 loci: 185 loci from genomic libraries, 65 loci from the sequence database including ESTs, 213 loci from the SSR markers already reported, and 1 locus of morphological marker. Although newly developed SSR loci were distributed throughout all chromosomes, clustering of them around putative centromeric regions was found on several chromosomes. The total length of the KM map spanned 3,441 cM and corresponded to approximately 86% genome coverage. The KM map comprised 23 linkage groups because two gaps of over 50 cM distance remained on chromosome 6A. This is the first report of an SSR-based linkage map using a single intraspecific population of common wheat. This mapping result suggests that it becomes possible to construct linkage maps with sufficient genome coverage using only SSR markers without RFLP markers, even in an intraspecific population of common wheat. Moreover, the new SSR markers will contribute to the enrichment of molecular marker resources in common wheat.

4.
The abundance of simple sequence repeats (SSRs), or microsatellites, and their inherent potential for variation make them a valuable source of genetic markers in eukaryotes. We describe the organization and abundance of SSRs in the fungus Fusarium graminearum (causative agent of Fusarium head blight or head scab of wheat). We identified 1705 SSRs of various nucleotide repeat motifs in the sequence database of F. graminearum. Mononucleotide repeats (62%) were the most abundant, followed by di- (20%) and trinucleotide repeats (14%). Tetra-, penta- and hexanucleotide repeats accounted for only 4% of SSRs. The estimated frequency of Class I SSRs (perfect repeats ≥20 nucleotides) was one SSR per 124.5 kb, whereas the frequency of Class II SSRs (perfect repeats >10 nucleotides and <20 nucleotides) was one SSR per 25.6 kb. The dynamics of SSRs will be a powerful tool for taxonomic, phylogenetic, genome mapping and population genetic studies, as SSR-based markers show high levels of allelic variation, codominant inheritance and ease of analysis.
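
The Class I / Class II split described in this abstract can be illustrated with a short Python sketch. The abstract does not say which tool was used, so this is only one plausible way to do it: the function names, the default thresholds and the scanning strategy are assumptions made for illustration, and the upper bound for Class II is taken to be "shorter than 20 nucleotides".

```python
def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_perfect_ssrs(seq: str, min_len: int = 11, max_motif: int = 6):
    """Return sorted (start, motif, repeat_count, length) tuples for perfect repeats.

    Hypothetical helper: scans once per motif size (1..max_motif) and keeps
    tandem runs that are at least min_len nucleotides long.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            # Count how many times the motif repeats back to back from position i.
            count = 1
            while seq[i + count * k:i + (count + 1) * k] == motif:
                count += 1
            length = count * k
            if (count >= 2 and length >= min_len
                    and set(motif) <= set("ACGT") and is_primitive(motif)):
                hits.append((i, motif, count, length))
                i += length  # skip past the repeat just recorded
            else:
                i += 1
    return sorted(hits)


def ssr_class(length: int):
    """Class I: >= 20 nt; Class II: > 10 and < 20 nt (assumed reading of the abstract)."""
    if length >= 20:
        return "Class I"
    if length > 10:
        return "Class II"
    return None


if __name__ == "__main__":
    # Made-up test sequence, not real F. graminearum data.
    example = "GG" + "AT" * 10 + "CCGT" + "A" * 12 + "GC" + "AAT" * 4 + "G"
    for start, motif, count, length in find_perfect_ssrs(example):
        print(start, motif, count, length, ssr_class(length))
```

Scanning each motif size separately and rejecting non-primitive motifs keeps a run such as (AT)10 from also being reported as (ATAT)5. Imperfect or compound repeats are deliberately out of scope here.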

5.
With the ever-increasing number of Expressed Sequence Tags (ESTs) from various sequencing projects, ESTs have become a valuable, first-hand source for in silico mining of simple sequence repeat (SSR) markers. We examined a total of 3419 EST sequences from three bamboo species, namely, Phyllostachys edulis, Bambusa oldhamii and Dendrocalamus sinicus, for the presence of di- to hexanucleotide microsatellites. The frequency of SSR-containing ESTs varied from 5.36% in B. oldhamii to 13.05% in P. edulis. No SSRs were found in D. sinicus. Tri-nucleotide repeats (49.34%) were most frequent in P. edulis, while no marked difference among repeat types was found in B. oldhamii. Flanking primer pairs were also designed in silico for the sequences containing SSRs, and their position on the genome was hypothesized using similarity searching. SSRs located in open reading frames (ORFs) were given functional annotation using Gene Ontology. Polymorphic SSRs were also detected using a new pipeline, polySSR. The polymorphism level was very low (2.43%), and the position of the polymorphic SSRs was determined. The development of SSRs and the study of polymorphism will help in the further study of intra- and inter- gene flow, genetic structure, variability, linkage mapping and evolutionary relationships in bamboo.

6.
Simple sequence repeats (SSRs) or microsatellites are an important class of molecular markers for genome analysis and plant breeding applications. In this paper, the SSR distributions within ESTs from the legumes soybean (Glycine max, representing 135.86 Mb), medicago (Medicago truncatula, 121.1 Mb) and lotus (Lotus japonicus, 45.4 Mb) have been studied relative to the distributions in cereals such as sorghum (Sorghum bicolor, 98.9 Mb), rice (Oryza sativa, 143.9 Mb) and maize (Zea mays, 183.7 Mb). The relative abundance, density, composition and putative annotations of di-, tri-, tetra- and penta-nucleotide repeats have been compared, and SSR-containing ESTs (SSR-ESTs) have been clustered to give a non-redundant set of EST-SSRs, available in a database. Further, a subset of such candidate EST-SSRs from sorghum has been tested for its ability to detect polymorphism between the Striga-susceptible, stay-green drought-tolerant mapping population parent 'E 36-1' and its Striga-resistant, non-stay-green counterpart 'N13'. Primer sets for 64% of the EST-SSRs tested produced a clear and specific PCR product band and 34% of these detected scorable polymorphism between the N13 and E 36-1 parental lines. Over half of these markers have been genotyped on 94 RILs from the (N13 x E 36-1)-based mapping population, with 42 markers mapping onto the ten sorghum linkage groups. This establishes the value of this database as a resource of molecular markers for practical applications in cereal and legume genetics and breeding. The primer pairs for non-redundant EST-SSRs have been designed and are freely available through the database (http://intranet.icrisat.org/gt1/ssr/ssrdatabase.html).

7.
To identify EST-SSR molecular markers, 41,986 cattle UniGene sequences from NCBI were mined for SSRs. A total of 1,831 SSRs were identified from 1,666 ESTs, which represented an average density of 19.88 kb per SSR. The frequency of EST-SSRs was 4.0%. The dinucleotide repeat motif was the most abundant SSR, accounting for 54%, followed by 22%, 13%, 7% and 4%, respectively, for tri-, hexa-, penta- and tetra-nucleotide repeats. Depending upon the length of the repeat unit, the length of microsatellites varied from 14 to 86 bp. Among the di- and tri-nucleotide repeats, AC/TG (57%) and AGC (12%) were the most abundant types. Annotation of EST-SSRs was also carried out. Three hundred primer pairs were randomly designed using the Primer Premier 5.0 and Oligo 5.0 programs for further experimental validation.
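
The summary figures in this abstract can be cross-checked with a little arithmetic, sketched below in Python. The counts and the 19.88 kb density come from the abstract. The total sequence length is not reported there, so the values derived from it are back-calculated assumptions, and the variable and function names are hypothetical.

```python
# Figures reported in the abstract.
N_SEQUENCES = 41_986   # cattle UniGene sequences mined
N_SSR_ESTS = 1_666     # sequences containing at least one SSR
N_SSRS = 1_831         # SSRs identified
KB_PER_SSR = 19.88     # reported average density (kb of sequence per SSR)


def percent(part: float, whole: float) -> float:
    """Return part / whole as a percentage."""
    return 100.0 * part / whole


# Share of sequences carrying an SSR; this should reproduce the reported 4.0%.
ssr_est_frequency = percent(N_SSR_ESTS, N_SEQUENCES)

# Average number of SSRs per SSR-containing sequence.
ssrs_per_ssr_est = N_SSRS / N_SSR_ESTS

# Back-calculated, NOT reported: total sequence length implied by the density,
# and the mean sequence length that follows from it.
implied_total_kb = N_SSRS * KB_PER_SSR
implied_mean_length_bp = 1000.0 * implied_total_kb / N_SEQUENCES

if __name__ == "__main__":
    print(f"SSR-containing sequences: {ssr_est_frequency:.2f}%")
    print(f"SSRs per SSR-containing sequence: {ssrs_per_ssr_est:.2f}")
    print(f"Implied total length: {implied_total_kb:.0f} kb")
    print(f"Implied mean sequence length: {implied_mean_length_bp:.0f} bp")
```

The first ratio agrees with the 4.0% frequency quoted in the abstract after rounding; the implied lengths are only a plausibility check on the stated density.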

8.
9.
We screened for simple sequence repeats (SSRs) found in ESTs derived from an EST-database development project ('Marine Genomics Europe' Network of Excellence). Different motifs of di-, tri-, tetra-, penta- and hexanucleotide SSRs were evaluated for variation in length and position in the expressed sequences, relative abundance and distribution in gilthead sea bream (Sparus aurata). We found 899 ESTs that harbor 997 SSRs (4.94%). On average, one SSR was found per 2.95 kb of EST sequence, and dinucleotide SSRs were the most abundant, accounting for 47.6% of the total number. EST-SSRs were used as templates for primer design. In total, 664 primer pairs could be successfully identified, and a subset of 206 primer pairs was synthesized, PCR-tested and visualized on ethidium bromide stained agarose gels. The main objective was to further assess the potential of EST-SSRs as informative markers and investigate their cross-species amplification in sixteen teleost fish species: seven sparid species and nine other species from different families. Approximately 78% of the primer pairs gave PCR products of expected size in gilthead sea bream, and as expected, the rate of successful amplification of sea bream EST-SSRs was higher in sparids, lower in other perciforms and even lower in species of the Clupeiform and Gadiform orders. We finally determined the polymorphism and the heterozygosity of 63 markers in a wild gilthead sea bream population; fifty-eight loci were found to be polymorphic with the expected heterozygosity and the number of alleles ranging from 0.089 to 0.946 and from 2 to 27, respectively. These tools and markers are expected to enhance the available genetic linkage map in gilthead sea bream, to assist comparative mapping and genome analyses for this species and further with other model fish species and finally to help advance genetic analysis for cultivated and wild populations and accelerate breeding programs.

10.
An in silico analysis of simple sequence repeats (SSRs) in the genomes of 32 species of potexviruses was performed, in which a total of 691 SSRs and 33 cSSRs were observed. Though SSRs were present in all the studied genomes, their incidence ranged from 11 to 30 per genome. Further, 10 potexvirus genomes possessed no cSSRs when extracted at a dMAX of 10, and where present, the highest frequency was 3. SSR and cSSR incidence, relative density and relative abundance were not significantly correlated with genome size and GC content, suggesting an ongoing evolutionary and adaptive phase of the virus species. The SSRs present primarily ranged from mono- to tri-nucleotide repeat motifs, with a greatly skewed distribution across the coding and non-coding regions. The present work is part of an ongoing compilation and analysis of the incidence, distribution and variation of viral repeat sequences to understand their evolutionary and functional relevance.

11.
Teleost fish genome projects involving model species are resulting in a rapid accumulation of genomic and expressed DNA sequences in public databases. The expressed sequence tags (ESTs) collected in the databases can be mined for the analysis of both structural and functional genomics. In this study, we analyzed in silico 49,430 unigenes representing a total of 692,654 ESTs from four model fish for their potential use in developing simple sequence repeats (SSRs), or microsatellites. After bioinformatic mining, a total of 3,018 EST-derived SSRs (EST-SSRs) were identified in 2,335 SSR-containing ESTs (SSR-ESTs). The frequency of identified SSR-ESTs ranged from 1.5% for Xiphophorus to 7.3% for zebrafish. The dinucleotide repeat motif is the most abundant SSR, accounting for 47%, 52%, 64%, and 78% for medaka, Fundulus, zebrafish, and Xiphophorus, respectively. Simulation analysis suggests that a majority of these EST-SSRs have sufficient flanking sequences for polymerase chain reaction (PCR) primer design. Comparative DNA sequence analyses of SSR-ESTs identified several cross-species SSRs and sequences that may be used as cross-reference genes in comparative studies. For example, the flanking sequences of one SSR, (CTG)n, within the pituitary tumor-transforming gene (PTTG) 1 interacting protein (PTTGIP) showed conservation spanning the medaka, Fundulus, human, and mouse genomes. This study provides a large body of information on EST-SSRs that can be useful for the development of polymorphic markers, gene mapping, and comparative genome analysis. Functional analysis of these SSR-ESTs may reveal their role in metabolism and gene evolution of these model species.

12.
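To develop functional EST-SSR markers in asparagus, 8,590 asparagus (Asparagus officinalis L.) EST sequences from the NCBI public database were searched for simple sequence repeats (SSRs). After redundant sequences were removed, 8,377 non-redundant sequences remained. A total of 469 EST-SSRs were mined from the non-redundant sequences, with one SSR occurring every 14.80 kb on average. Among all repeat motifs, dinucleotide repeats made up the largest share of SSRs at 40.51% (190/469), followed by trinucleotide repeats at 34.97% (164/469) and hexanucleotide repeats at 21.11% (99/469). Of all motifs, CT/AG occurred most often, 62 times, accounting for 13.22% (62/469) of all repeat motifs. Thirty SSR-containing EST sequences were selected, and primers were designed with the primer5 software to amplify the SSR loci. Of these, 27 primer pairs yielded amplification products and 24 gave clear, reliable target bands, accounting for 80% of the primer pairs. The number of asparagus alleles detected was fairly high, averaging 4.93 per primer pair. The development of these EST-SSR markers will support research on population genetic diversity, genetic map construction, gene mapping, molecular markers and pedigree analysis in asparagus.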

13.
Simple sequence repeats (SSRs) or microsatellites are one of the most popular sources of genetic markers and play a significant role in gene function and genome organization. We identified SSRs in the genome of Ganoderma lucidum and analyzed their frequency and distribution in different genomic regions. We also compared the SSRs in G. lucidum with six other Agaricomycetes genomes: Coprinopsis cinerea, Laccaria bicolor, Phanerochaete chrysosporium, Postia placenta, Schizophyllum commune and Serpula lacrymans. Based on our search criteria, the total number of SSRs found ranged from 1206 to 6104 and covered from 0.04% to 0.15% of the fungal genomes. The SSR abundance was not correlated with the genome size, and mono- to tri-nucleotide repeats outnumbered other SSR categories in all of the species examined. In G. lucidum, a repertoire of 2674 SSRs was detected, with mono-nucleotides being the most abundant. SSRs were found in all genomic regions and were more abundant in non-coding regions than coding regions. The highest SSR relative abundance was found in introns (108 SSRs/Mb), followed by intergenic regions (84 SSRs/Mb). A total of 684 SSRs were found in the protein-coding sequences (CDSs) of 588 gene models, with 81.4% of them being tri- or hexa-nucleotides. After scanning for InterPro domains, 280 of these genes were successfully annotated, and 215 of them could be assigned to Gene Ontology (GO) terms. SSRs were also identified in 28 bioactive compound synthesis-related gene models, including one 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR), three polysaccharide biosynthesis genes and 24 cytochrome P450 monooxygenases (CYPs). Primers were designed for the identified SSR loci, providing the basis for the future development of SSR markers of this medicinal fungus.

14.
The increasing availability of expressed sequence tags (ESTs) in wheat (Triticum aestivum) and related cereals provides a valuable resource of non-anonymous DNA molecular markers. We examined 170,746 wheat ESTs from the public (International Triticeae EST Cooperative) and Génoplante databases, previously clustered in contigs, for the presence of di- to hexanucleotide simple sequence repeats (SSRs). Analysis of 46,510 contigs identified 3,530 SSRs, which represented 7.5% of the total number of contigs. Only 74% of the sequences allowed primer pairs to be designed, 70% led to an amplification product, mainly of a high quality (68%), and 53% exhibited polymorphism for at least one cultivar among the eight tested. Even though dinucleotide SSRs were less represented than trinucleotide SSRs (15.5% versus 66.5%, respectively), the former showed a much higher polymorphism level (83% versus 46%). The effect of the number and type of repeats is also discussed. The development of new EST-SSR markers will have important implications for the genetic analysis and exploitation of the genetic resources of wheat and related species and will provide a more direct estimate of functional diversity.

15.
Simple sequence repeats (SSRs) derived from expressed sequence tags (ESTs) are valuable markers because they represent transcribed regions and often have putative functions. We mined and characterized microsatellites in melon ESTs. Three hundred and eighty‐three SSR loci were identified in 309 of 3188 unigenes assembled from 5747 EST and mRNA sequences in GenBank, with a frequency of one SSR per 4.7 kb. Twenty‐two polymorphic EST‐SSR markers were developed, with a mean allele number of 2.9 per locus and a mean expected heterozygosity of 0.442. Amplification products were also detected with 15 primer pairs in Cucumis sativus. These informative EST‐SSR markers can be used in melon genetic improvement projects.

16.
A genome-wide sequence search was conducted to identify simple sequence repeat (SSR) loci in phylloxera, Daktulosphaira vitifoliae, a major grape pest throughout the world. Collectively, 1524 SSR loci containing mono-, di-, tri-, tetra-, penta-, and hexanucleotide motifs were identified. Among them, trinucleotide repeats were the most abundant in the phylloxera genome (34.4%), followed by hexanucleotide (20.4%) and dinucleotide (19.6%) repeats. Mono-, tetra- and pentanucleotide repeats were found at frequencies of 1.3%, 11.2% and 12.9%, respectively. The abundance and inherent variations in SSRs provide valuable information for developing molecular markers. The high levels of allelic variation and codominant features of SSRs make this marker system a useful tool for genotyping, diversity assessment and population genetic studies of reproductive characteristics of phylloxera in agricultural and natural populations.

17.
Pineapple (Ananas comosus (L.) Merrill) is the second most important tropical fruit in terms of international trade. The availability of whole genomic sequences and expressed sequence tags (ESTs) offers an opportunity to identify and characterize microsatellite or simple sequence repeat (SSR) markers in pineapple. A total of 278,245 SSRs and 41,962 SSRs, with overall densities of 728.57 SSRs/Mb and 619.37 SSRs/Mb, were mined from genomic and EST sequences, respectively. 5′-untranslated regions (5′-UTRs) had the greatest amount of SSRs, with a 3.6–5.2 fold higher SSR density than other regions. For repeat length, 12 bp was the predominant repeat length in both the assembled genome and ESTs. Class I SSRs were underrepresented compared with class II SSRs. For motif length, dinucleotide repeats were the most abundant in genomic sequences, whereas trinucleotides were the most common motif in ESTs. Tri- and hexanucleotides made up a larger share of total SSRs in ESTs than in the whole genome. The SSR frequency decreased dramatically as the number of repeats increased. AT was the most frequent single motif across the entire genome, while AG was the most abundant motif in ESTs. Across six examined plant species, the pineapple genome displayed the highest density, substantially more than the second-place cucumber. Annotation and expression analyses were also conducted for genes containing SSRs. This thorough analysis of SSR markers in pineapple provided valuable information on the frequency and distribution of SSRs in the pineapple genome. This genomic resource will expedite genomic research and pineapple improvement.

18.
The availability of sequence data derived from shotgun sequencing programs enables mining for simple sequence repeats (SSRs), providing useful genetic markers for crop improvement. This study presents the development and characterization of 40 SSR markers from Brassica oleracea shotgun sequence and their cross‐amplification across Brassica species. The markers show reliable amplification, genome specificity and considerable polymorphism, demonstrating the utility of SSRs for genetic analysis of commercial Brassica germplasm.

19.
The detection of simple sequence repeats (SSRs) within expressed sequence tags (ESTs) connects potential microsatellite markers with specific genes, generating Type I markers. Using an in silico approach, we identified 1975 SSRs from the Genome Research on Atlantic Salmon Project EST database. We designed primers to amplify 158 SSRs, of which 65 amplified 76 loci (including 11 duplicated loci). Sixty‐one of the 76 loci were variable in 24 Atlantic salmon from seven populations, and 96% of these markers also amplify DNA from other salmonids. Functions for 16 of the SSR‐associated ESTs have been determined, confirming them as Type I markers.

20.
SSR (simple sequence repeat) markers derived from ESTs (expressed sequence tags), commonly called EST‐SSRs or genic SSRs, provide useful genetic markers for crop improvement. These are easy and economical to develop as by‐products of large‐scale EST resources that have become available as part of the functional genomic studies in many plant species. Here, we describe for the first time nine genic SSRs of coffee that were developed from the microsatellite‐containing ESTs from a cDNA library of moisture‐stressed leaves of the coffee variety ‘CxR’ (a commercial interspecific hybrid between Coffea congensis and Coffea canephora). The markers show considerable allelic diversity, with PIC values up to 0.70 and 0.75 for Coffea arabica and Coffea canephora, respectively, and robust cross‐species amplification in 16 other related taxa of coffee. The validation studies thus demonstrate the potential utility of the EST‐SSRs for genetic analysis of coffee germplasm.
