Similar articles
 20 similar articles found; search time 31 ms
1.
Simple sequence repeats (SSRs) developed from expressed sequence tags (ESTs), known as EST-SSRs, are a widely used and potentially valuable source of gene-based markers because of their high cross-taxon portability and their rapid, inexpensive development. The EST sequence information in publicly available databases is increasing rapidly. Computational approaches offer a better route to developing SSR markers from ESTs than conventional methods. In the present study, 12,851 EST sequences of Camellia sinensis, downloaded from the National Center for Biotechnology Information (NCBI), were mined for microsatellites. After preprocessing and assembly with various computational tools, 6,148 non-redundant EST sequences (4,779 singletons and 1,369 contigs) were obtained. Out of the 3,822.68 kb of sequence examined, 1,636 EST sequences (26.61%) containing 2,371 SSRs were detected, a density of 1 SSR per 1.61 kb, leading to the development of 245 primer pairs. These mined EST-SSR markers will support further studies of variability, mapping, and evolutionary relationships in Camellia sinensis. In addition, the SSRs developed here can be applied to studies across species.
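The percentage and density quoted in this abstract follow directly from the reported counts. A minimal Python sketch of that arithmetic is given here; the variable names are illustrative and not taken from the paper.

```python
# Counts quoted in the abstract (Camellia sinensis EST set).
non_redundant_ests = 6148      # 4779 singletons + 1369 contigs
ssr_containing_ests = 1636
total_ssrs = 2371
total_length_kb = 3822.68

# Share of non-redundant ESTs that contain at least one SSR.
ssr_est_percent = 100 * ssr_containing_ests / non_redundant_ests

# Average length of sequence per SSR (kb per SSR).
kb_per_ssr = total_length_kb / total_ssrs

print(f"{ssr_est_percent:.2f}% of non-redundant ESTs contain SSRs")
print(f"1 SSR per {kb_per_ssr:.2f} kb")
```

Both values round to the figures stated in the abstract (26.61% and 1 SSR per 1.61 kb).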

2.
Simple sequence repeat (SSR) markers are widely used in many plant and animal genomes due to their abundance, hypervariability, and suitability for high-throughput analysis. Development of SSR markers using molecular methods is time consuming, laborious, and expensive. Use of computational approaches to mine ever-increasing sequences such as expressed sequence tags (ESTs) in public databases permits rapid and economical discovery of SSRs. Most such efforts to date have focused on mining SSRs from monocotyledonous ESTs. In this study, we have computationally mined and examined the abundance of SSRs in more than 1.54 million ESTs belonging to 55 dicotyledonous species. The frequency of ESTs containing SSRs among species ranged from 2.65% to 16.82%. Dinucleotide repeats were found to be the most abundant, followed by tri- or mono-nucleotide repeats. The motifs A/T, AG/GA/CT/TC, and AAG/AGA/GAA/CTT/TTC/TCT were the predominant mono-, di-, and tri-nucleotide SSRs, respectively. Most of the mononucleotide SSRs contained 15-25 repeats, whereas the majority of the di- and tri-nucleotide SSRs contained 5-10 repeats. The comprehensive SSR survey data presented here demonstrate the potential of in silico mining of ESTs for rapid development of SSR markers for genetic analysis and applications in dicotyledonous crops.
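To make the idea of in silico SSR mining concrete, the sketch below scans a sequence for perfect mono-, di-, and tri-nucleotide repeats with a regular expression. It is not the pipeline used in the study: the function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen only for illustration.

```python
import re

# Minimum repeat counts per motif length. Illustrative thresholds only,
# not the ones used in the study.
MIN_REPEATS = {1: 15, 2: 5, 3: 5}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_n in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures a motif; \1{n-1,} requires n or more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_n - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            # A motif built from a shorter unit (e.g. "AA") belongs to a
            # shorter repeat class; skip it and rescan one base further on.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                pos = m.start() + 1
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
            pos = m.end()
    return sorted(hits)

example = "TTGC" + "AG" * 6 + "CCGT" + "AAG" * 5 + "TGCA"
print(find_ssrs(example))
```

The sketch reports only perfect repeats on one strand; real mining tools typically also handle compound and imperfect repeats and group complementary motifs such as AG and CT.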

3.
Microsatellites are the markers of choice due to their high abundance, reproducibility, degree of polymorphism, and co-dominant nature. They are mainly used for studying genetic variability in different species and for marker-assisted selection. Expressed sequence tags (ESTs) serve as the main resource for simple sequence repeats (SSRs). The computational approach to detecting SSRs and developing SSR markers from EST-SSRs is preferred over conventional methods because it greatly reduces time and cost. The available EST sequence databases, web interfaces, and standalone tools provide a platform for easy analysis of EST sequences, leading to the development of potential EST-SSR markers. This paper is an overview of the in silico approach to developing SSR markers from EST sequences using some of the most efficient tools that are freely available for academic purposes.

4.
Xin D, Sun J, Wang J, Jiang H, Hu G, Liu C, Chen Q. Molecular Biology Reports, 2012, 39(9): 9047-9057
Microsatellites, or simple sequence repeats (SSRs), are very useful molecular markers for a number of plant species. We used a new publicly available module (TROLL) to extract microsatellites from the public database of soybean expressed sequence tag (EST) sequences. A total of 12,833 sequences containing di- to penta-type SSRs were identified from 200,516 non-redundant soybean ESTs. On average, one SSR was found per 7.25 kb of EST sequences, with the tri-nucleotide motifs being the most abundant. Primer sequences flanking the SSR motifs were successfully designed for 9,638 soybean ESTs using the software primer3.0 and only 59 pairs of them were found in earlier studies. We synthesized 124 pairs of the primers to determine the polymorphism and heterozygosity among eight genotypes of soybean cultivars, which represented a wide range of the cultivated soybean cultivars. PCR amplification products with anticipated SSRs were obtained with 81 pairs of primers; 36 PCR products appeared to be homozygous and the remaining 45 PCR products appeared to be heterozygous and displayed polymorphism among the eight cultivars. We further analysed the EST sequences containing 45 polymorphic EST-SSR markers using the programs BLASTN and BLASTX. Sequence alignment showed that 29 ESTs have homologous sequences and 15 ESTs could be classified into a Uni-gene cluster with comparatively convincing protein products. Among these 15 ESTs belonging to a Uni-gene cluster, 9 SSRs were located in 3'-UTR, 4 SSRs were located in the intron region and 2 SSRs were located in the CDS region. None of these SSRs was located in the 5'-UTR. These novel SSRs identified in the ESTs of soybean provide useful information for gene mapping and cloning in future studies.

5.

Background  

Simple sequence repeat (SSR) or microsatellite markers are valuable for genetic research. Experimental methods to develop SSR markers are laborious, time consuming and expensive. In silico approaches have become a practicable and relatively inexpensive alternative during the last decade, although testing putative SSR markers is still time consuming and expensive. In many species only a relatively small percentage of SSR markers turn out to be polymorphic. This is particularly true for markers derived from expressed sequence tags (ESTs). EST databases contain a large redundancy of sequences, which may carry information on length polymorphisms in the SSRs they contain and on whether they were derived from heterozygotes or from different genotypes. Although a number of programs have been developed to identify SSRs in EST sequences, no software so far can detect putatively polymorphic SSRs.

6.
7.
Humulus lupulus, commonly known as hops, is a member of the family Moraceae. Many projects currently under way are leading to the accumulation of voluminous genomic and expressed sequence tag (EST) sequences in public databases. The genetically characterized domains in these databases are limited by the lack of reliable molecular markers. A large body of EST sequence data is available for hops. In the present study, simple sequence repeat (SSR) markers extracted from EST data were used as molecular markers for genetic characterization. A total of 25,495 EST sequences were examined and assembled to obtain full-length sequences. Mononucleotide SSR motifs were the most frequent, at 60.44% in contigs and 62.16% in singletons, whereas the lowest frequencies were observed for hexanucleotide SSRs in contigs (0.09%) and pentanucleotide SSRs in singletons (0.12%). Trinucleotide motifs most often coded for glutamic acid (GAA), while AT/TA was the most frequent dinucleotide repeat. Flanking primer pairs were designed in silico for the SSR-containing sequences. SSR-containing sequences were functionally categorized using Gene Ontology terms: biological process, cellular component, and molecular function.

8.
Teleost fish genome projects involving model species are resulting in a rapid accumulation of genomic and expressed DNA sequences in public databases. The expressed sequence tags (ESTs) collected in the databases can be mined for the analysis of both structural and functional genomics. In this study, we in silico analyzed 49,430 unigenes representing a total of 692,654 ESTs from four model fish for their potential use in developing simple sequence repeats (SSRs), or microsatellites. After bioinformatical mining, a total of 3,018 EST derived SSRs (EST-SSRs) were identified for 2,335 SSR containing ESTs (SSR-ESTs). The frequency of identified SSR-ESTs ranged from 1.5% for Xiphophorus to 7.3% for zebrafish. The dinucleotide repeat motif is the most abundant SSR, accounting for 47%, 52%, 64%, and 78% for medaka, Fundulus, zebrafish, and Xiphophorus, respectively. Simulation analysis suggests that a majority of these EST-SSRs have sufficient flanking sequences for polymerase chain reaction (PCR) primer design. Comparative DNA sequence analyses of SSR-ESTs identified several cross-species SSRs and sequences that may be used as cross-reference genes in comparative studies. For example, the flanking sequences of one SSR (CTG)n within the pituitary tumor-transforming gene (PTTG) 1 interacting protein (PTTGIP), showed conservation spanning the medaka, Fundulus, human, and mouse genomes. This study provides a large body of information on EST-SSRs that can be useful for the development of polymorphic markers, gene mapping, and comparative genome analysis. Functional analysis of these SSR-ESTs may reveal their role in metabolism and gene evolution of these model species.

9.
Expressed sequence tags (ESTs) from Coffea canephora leaves and fruits were used to search for types and frequencies of simple sequence repeats (EST–SSRs) with a motif length of 1–6 bp. From a non-redundant (NR) EST set of 5,534 potential unigenes, 6.8% SSR-containing sequences were identified, with an average density of one SSR every 7.73 kb of EST sequences. Trinucleotide repeats were found to be the most abundant (34.34%), followed by di- (25.75%) and hexa-nucleotide (22.04%) motifs. The development of unique genic SSR markers was optimized by a computational approach which allowed us to eliminate redundancy in the original EST set and also to test the specificity of each pair of designed primers. Twenty-five EST–SSRs were developed and used to evaluate cross-species transferability in the Coffea genus. The orthology was supported by the amplicon sequence similarity and the amplification patterns. The >94% identity of flanking sequences revealed high sequence conservation across the Coffea genus. A high level of polymorphic loci was obtained regardless of the species considered (from 75% for C. liberica to 86% for C. canephora). Moreover, the polymorphism revealed by EST–SSR was similar to that exposed by genomic SSR. It is concluded that Coffea ESTs are a valuable resource for microsatellite mining. EST-SSR markers developed from C. canephora sequences can be easily transferred to other Coffea species for which very little molecular information is available. They constitute a set of conserved orthologous markers, which would be ideal for assessing genetic diversity in coffee trees as well as for cross-referencing transcribed sequences in comparative genomics studies.

10.
With the ever-increasing number of expressed sequence tags (ESTs) from various sequencing projects, ESTs have become a valuable, first-hand source for in silico mining of simple sequence repeat (SSR) markers. We examined a total of 3,419 EST sequences from three bamboo species, namely Phyllostachys edulis, Bambusa oldhamii and Dendrocalamus sinicus, for the presence of di- to hexa-nucleotide microsatellites. The frequency of SSR-containing ESTs varied from 5.36% in B. oldhamii to 13.05% in P. edulis. No SSRs were found in D. sinicus. Tri-nucleotide repeats (49.34%) were most frequent in P. edulis, whereas repeat types did not differ much in frequency in B. oldhamii. Flanking primer pairs were also designed in silico for the sequences containing SSRs, and their position on the genome was hypothesized using similarity searching. SSRs located in open reading frames (ORFs) were given functional annotation using Gene Ontology. Polymorphic SSRs were also detected using a new pipeline, polySSR. The polymorphism level was very low (2.43%), and the position of the polymorphic SSRs was determined. The development of SSRs and the study of polymorphism will help in the further study of intra- and inter- gene flow, genetic structure, variability, linkage mapping and evolutionary relationships in bamboo.

11.
12.
Mining and characterizing microsatellites from citrus ESTs (cited 17 times: 0 self-citations, 17 by others)
Freely available computer programs were arranged in a pipeline to extract microsatellites from public citrus EST sequences, retrieved from the NCBI. In total, 3,278 bi- to hexa-type SSR-containing sequences were identified from 56,199 citrus ESTs. On average, one SSR was found per 5.2 kb of EST sequence, with the tri-nucleotide motifs as the most abundant. Primer sequences flanking SSR motifs were successfully identified from 2,295 citrus ESTs. Among those, a subset (100 pairs) was synthesized and tested to determine polymorphism and heterozygosity between/within two genera, sweet orange (C. sinensis) and Poncirus (P. trifoliata), which are the parents of the citrus core mapping population selected for an international citrus genomics effort. Eighty-seven pairs of primers gave PCR amplification to the anticipated SSRs, of which 52 and 35 appear to be homozygous and heterozygous, respectively, in sweet orange, and 67 and 20, respectively, in Poncirus. By pairing the loci between the two intergeneric species, it was found that 40 are heterozygous in at least one species with two alleles (9), three alleles (28), or four alleles (3), and the remaining 47 are homozygous in both species with either one allele (31) or two alleles (16). These EST-derived SSRs can be a resource used for understanding of the citrus SSR distribution and frequency, and development of citrus EST-SSR genetic and physical maps. These SSR primer sequences are available upon request. Electronic Supplementary Material Supplementary material is available for this article at and is accessible for authorized users.

13.
Expressed sequence tags (ESTs) are short, usually unedited sequences obtained by single-pass sequencing of cDNA clones from any cDNA library. Analyzing and comparing ESTs can provide information on gene expression, function and evolution. Large-scale EST sequencing has become an attractive alternative to plant genome sequencing. Currently, plant EST collections comprise over 3.8 million sequences from about 200 species. They have proved to be a valuable tool for gene discovery and plant metabolism analysis. Several plant-specific EST databases have been created which provide access to sequence data and bioinformatics-based tools for data mining. Searching EST collections allows pre-selection of genes for preparing cDNA arrays targeted to yield maximum information on specialized processes such as stress response and symbiotic nitrogen fixation. Also, EST-based molecular markers such as SNPs, SSRs, and indels are fast-developing tools for breeders and researchers.

14.
Simple sequence repeats (SSRs) derived from expressed sequence tags (ESTs) are valuable markers because they represent transcribed regions and often have putative functions. We mined and characterized microsatellites in melon ESTs. Three hundred and eighty-three SSR loci were identified in 309 of 3,188 unigenes assembled from 5,747 EST and mRNA sequences in GenBank, at a frequency of one SSR per 4.7 kb. Twenty-two polymorphic EST-SSR markers were developed, with a mean allele number of 2.9 per locus and a mean expected heterozygosity of 0.442. Amplification products were also detected in Cucumis sativus with 15 primer pairs. These informative EST-SSR markers can be used in melon genetic improvement projects.

15.
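Distribution characteristics of microsatellites in Ginkgo biloba EST sequences (cited 5 times: 0 self-citations, 5 by others)
Using 21,590 Ginkgo biloba EST sequences downloaded from NCBI, this paper analyzes the distribution of EST-SSRs (expressed sequence tag microsatellites) in ginkgo EST sequences and compares SSR characteristics across EST sequences of different lengths. After removal of redundant and low-quality sequences, 7,961 non-redundant EST sequences with a total length of 5,708.385 kb were obtained, and 405 EST sequences (5.09%) were found to contain 475 SSRs. EST sequences 400-1000 bp long contained 445 SSR loci, 93.68% of all SSRs. Dinucleotide and trinucleotide motifs were the main types of ginkgo EST-SSRs, accounting for 73.89% and 24.00% of all SSRs, respectively; the most common SSR motifs were (AT)n, (AG)n, (AC)n, (AAG)n and (AAT)n. Mining and analyzing the SSR locus information in ginkgo EST sequences lays a foundation for the targeted design of EST-SSR primers and the development of ginkgo EST-SSR molecular markers.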

16.
• Premise of the study: The redundancies in expressed sequence tags (ESTs) in the National Center for Biotechnology Information sequence database were used to identify and develop polymorphic simple sequence repeat (SSR) markers for pepper (Capsicum annuum). • Methods and Results: Sixty-eight polymorphic SSR loci were identified in the contigs (containing redundant ESTs) generated by assembling 118,060 pepper ESTs from the public sequence database. Thirty-three SSR markers exhibited polymorphism among 31 pepper varieties, with alleles per SSR marker ranging from two to six. The mean observed and expected heterozygosity were 0.28 and 0.39, respectively. There were 18 SSR markers with a motif repeat number of less than five, accounting for 55% of the total. • Conclusions: We demonstrated the value of mining the redundant sequences in public sequence databases for the development of polymorphic SSR markers, which can be used for marker-assisted breeding in pepper.
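For readers unfamiliar with the two heterozygosity statistics quoted above, the sketch below computes them for a single locus. The abstract does not say which estimator was used; this assumes the common definition of expected heterozygosity as one minus the sum of squared allele frequencies, and the genotype data are invented for illustration.

```python
from collections import Counter

def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2) over allele frequencies at one diploid locus."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

# Hypothetical allele sizes (bp) for four individuals at one SSR locus.
locus = [(150, 150), (150, 154), (154, 154), (150, 158)]
print(expected_heterozygosity(locus), observed_heterozygosity(locus))
```

Means such as those reported in the abstract would then be averages of these per-locus values over all markers.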

17.
The public availability of large quantities of gene sequence data provides a valuable resource for the mining of simple sequence repeat (SSR) molecular genetic markers for genetic analysis. These markers are inexpensive, require minimal labour to produce and can frequently be associated with functionally annotated genes. This study presents the characterization of barley EST-SSRs and the identification of putative polymorphic SSRs from EST data. Polymorphic SSRs are distinguished from monomorphic SSRs by the representation of varying motif lengths within an alignment of sequence reads. Two measures of confidence are calculated: redundancy of a polymorphism and co-segregation with accessions. The utility of this method is demonstrated through the discovery of 597 candidate polymorphic SSRs from a total of 452,642 consensus expressed sequences. PCR amplification primers were designed for the identified SSRs. Ten primer pairs were validated for polymorphism in barley and for transferability across species. Analysis of the polymorphisms in relation to SSR motif, length, position and annotation is discussed.
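The idea of separating polymorphic from monomorphic SSRs by comparing repeat lengths across redundant reads can be sketched as follows. This is not the authors' method: the function names, the `min_support` threshold standing in for their redundancy measure, and the example reads are all assumptions, and co-segregation with accessions is not modelled.

```python
import re
from collections import Counter

def repeat_count(read, motif):
    """Longest run of perfect tandem copies of motif in read (0 if none)."""
    runs = re.findall(r"(?:%s)+" % re.escape(motif), read.upper())
    return max((len(run) // len(motif) for run in runs), default=0)

def classify_locus(reads, motif, min_support=2):
    """Call a locus polymorphic when at least two different repeat counts
    are each seen in min_support or more reads (hypothetical criterion)."""
    counts = Counter(repeat_count(read, motif) for read in reads)
    counts.pop(0, None)  # ignore reads that do not contain the repeat
    supported = {n: c for n, c in counts.items() if c >= min_support}
    status = "polymorphic" if len(supported) >= 2 else "monomorphic"
    return status, supported

# Hypothetical reads clustered at one (AG)n locus.
reads = (["GGC" + "AG" * 6 + "TTC"] * 2
         + ["GGC" + "AG" * 8 + "TTC"] * 2
         + ["GGC" + "AG" * 7 + "TTC"])
print(classify_locus(reads, "AG"))
```

Requiring each repeat length to appear in more than one read is one simple way to reduce the chance that a single sequencing error is mistaken for a polymorphism.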

18.
Genomic resources for peach, a model species for Rosaceae, are being developed to accelerate gene discovery in other Rosaceae species by comparative mapping. Simple sequence repeats (SSRs) are an important tool for comparative mapping because of their high polymorphism and transportability. To accelerate the development of SSR markers, we analyzed publicly available Rosaceae expressed sequence tags (ESTs) for SSRs. A total of 17,284 ESTs from almond, peach and rose were assembled into putatively non-redundant EST sets. For comparison, 179,099 ESTs from Arabidopsis were also used in the analysis. About 4% of the assembled ESTs contained SSRs in Rosaceae, which was higher than the 2.4% found in Arabidopsis. About half of the SSRs were found in the putative UTR, and the estimated average distance between SSRs in the UTR was 5.5 kb in rose, 5.1 kb in almond, 7 kb in peach and 13 kb in Arabidopsis. In the putative coding region, the estimated average distance was two to four times longer than in the UTR. Rosaceae ESTs containing SSRs were functionally annotated using the GenBank nr database and further classified using the gene ontology terms associated with the matching sequences in the SwissProt database. The detailed data including the sequences and annotation results are available from .

19.
Switchgrass (Panicum virgatum L.) is a model cellulosic biofuel crop in the United States. Simple sequence repeat (SSR) markers are valuable resources for genetic mapping and molecular breeding. A large number of expressed sequence tags (ESTs) of switchgrass have recently become available from our sequencing project. The objectives of this study were to develop new SSR markers from the switchgrass EST sequences and to integrate them into an existing linkage map. More than 750 unique primer pairs (PPs) were designed from 243,600 EST contigs and tested for PCR amplification, resulting in 538 PPs effectively producing amplicons of expected sizes. Of the effective PPs, 481 amplifying informative bands in NL94 were screened for polymorphisms in a panel consisting of NL94 and its seven first-generation selfed (S1) progeny. This led to the selection of 117 polymorphic EST–SSRs to genotype a mapping population encompassing 139 S1 individuals of NL94. Of 83 markers demonstrating clearly scorable alleles in the mapping population, 79 were integrated into a published linkage map, with three linked to accessory loci and one unlinked. The newly identified EST–SSR loci were distributed in 17 of 18 linkage groups, with 27 (32.5%) exhibiting distorted segregation. The integration of EST–SSRs reduced the average marker interval from 4.2 cM to 3.7 cM and the number of gaps (each >15 cM) from 23 to 10. Developing new EST–SSRs and constructing a higher-density linkage map will facilitate quantitative trait locus mapping and provide a firm footing for marker-assisted breeding in switchgrass.

20.
The type and frequency of simple sequence repeats (SSRs) in plant genomes was investigated using the expanding quantity of DNA sequence data deposited in public databases. In Arabidopsis, 306 genomic DNA sequences longer than 10 kb and 36,199 EST sequences were searched for all possible mono- to pentanucleotide repeats. The average frequency of SSRs was one every 6.04 kb in genomic DNA, decreasing to one every 14 kb in ESTs. SSR frequency and type differed between coding, intronic, and intergenic DNA. Similar frequencies were found in other plant species. On the basis of these findings, an approach is proposed and demonstrated for the targeted isolation of single or multiple, physically clustered SSRs linked to any gene that has been mapped using low-copy DNA-based markers. The approach involves sample sequencing a small number of subclones of selected randomly sheared large insert DNA clones (e.g., BACs). It is shown to be both feasible and practicable, given the probability of fortuitously sequencing through an SSR. The approach is demonstrated in barley where sample sequencing 34 subclones of a single BAC selected by hybridization to the Big1 gene revealed three SSRs. These allowed Big1 to be located at the top of barley linkage group 6HS.
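The feasibility argument rests on how likely a random read is to pass through an SSR. A rough way to reason about this, which the abstract itself does not spell out, is to treat SSRs as occurring at random along the genome (a Poisson assumption) at the reported average spacing. The read length below is a hypothetical placeholder, since the abstract does not give one.

```python
import math

SSR_SPACING_KB = 6.04   # from the abstract: one SSR every 6.04 kb of genomic DNA
N_SUBCLONES = 34        # from the abstract
READ_LENGTH_KB = 0.5    # hypothetical read length, not given in the abstract

def p_at_least_one_ssr(read_kb, spacing_kb=SSR_SPACING_KB):
    """Poisson probability that a read of read_kb contains at least one SSR."""
    return 1.0 - math.exp(-read_kb / spacing_kb)

# Chance that a single read contains an SSR.
p_read = p_at_least_one_ssr(READ_LENGTH_KB)

# Expected number of SSRs across all sampled subclones, and the chance
# of finding none at all.
expected_ssrs = N_SUBCLONES * READ_LENGTH_KB / SSR_SPACING_KB
p_none_at_all = math.exp(-expected_ssrs)

print(f"P(SSR in one read) = {p_read:.3f}")
print(f"Expected SSRs in {N_SUBCLONES} reads = {expected_ssrs:.1f}")
print(f"P(no SSR in any read) = {p_none_at_all:.3f}")
```

With the assumed 0.5 kb read length the expected count is roughly 2.8 SSRs across 34 subclones; because the read length is a placeholder and the spacing is the Arabidopsis figure, this is only indicative.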
