Similar articles
1.
Simple sequence repeats (SSRs) developed from expressed sequence tags (ESTs), known as EST-SSRs, are a widely used and potentially valuable source of gene-based markers because of their high cross-taxon portability and their rapid, inexpensive development. EST sequence information in publicly available databases is growing rapidly. Computational mining offers a better route to developing SSR markers from ESTs than conventional methods. In the present study, 12,851 EST sequences of Camellia sinensis, downloaded from the National Center for Biotechnology Information (NCBI), were mined for microsatellites. After preprocessing and assembly with various computational tools, 6148 non-redundant EST sequences (4779 singletons and 1369 contigs) were obtained. Out of a total of 3822.68 kb of sequence examined, 1636 (26.61%) EST sequences containing 2371 SSRs were detected, a density of 1 SSR per 1.61 kb, leading to the development of 245 primer pairs. These EST-SSR markers will support further study of variability, mapping, and evolutionary relationships in Camellia sinensis. In addition, the SSRs can be applied to studies across species.
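The density and frequency figures quoted in this abstract follow from simple ratios. The sketch below reproduces them, assuming that density means total sequence length divided by the number of SSRs and that the percentage is SSR-containing sequences over non-redundant sequences; the function name `ssr_summary` is hypothetical and not taken from the study.

```python
def ssr_summary(total_kb, n_ssrs, n_ssr_seqs, n_seqs):
    """Return (kb of sequence per SSR, percent of sequences containing an SSR)."""
    kb_per_ssr = total_kb / n_ssrs
    pct_with_ssr = 100.0 * n_ssr_seqs / n_seqs
    return kb_per_ssr, pct_with_ssr


# Figures reported in the abstract for Camellia sinensis.
kb_per_ssr, pct = ssr_summary(3822.68, 2371, 1636, 6148)
print(f"1 SSR per {kb_per_ssr:.2f} kb; {pct:.2f}% of sequences contain SSRs")
```

Both ratios agree with the reported values (1 SSR per 1.61 kb and 26.61%).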

2.
Simple sequence repeats (SSRs) can be derived from the complete genome sequence. These markers are important for gene mapping as well as marker-assisted selection (MAS). To develop SSRs for cotton gene mapping, we selected the complete genome sequence of Gossypium raimondii, which consists of 4447 non-redundant scaffolds. Out of 775.2 Mb of sequence examined, a total of 136,345 microsatellites were identified, a density of one SSR per 5.69 kb in the G. raimondii genome, leading to the development of 112,177 primer pairs. The distribution of SSRs in the genome was non-random. Among the motifs ranging from 1 to 6 bp, penta-nucleotide repeats were the most abundant (30.5%), followed by tetra-nucleotide repeats (18.2%) and di-nucleotide repeats (16.9%). Among all 457 identified motif types, the most frequent repeat motifs were poly-AT/TA, which accounted for 79.8% of the di-nucleotide SSRs, followed by AAAT/TTTA with 51.5% of the tetra-nucleotide SSRs. Further, 18,834 microsatellites were detected in protein-coding genes, and 46.0% of the 40,976 genes of G. raimondii contained SSRs. The genome-based SSRs developed in the present study will lay the groundwork for developing large numbers of SSR markers for genetic mapping, gene discovery, genetic diversity analysis, and MAS breeding in cotton.

3.
Simple sequence repeat (SSR) markers are widely used in many plant and animal genomes due to their abundance, hypervariability, and suitability for high-throughput analysis. Development of SSR markers using molecular methods is time consuming, laborious, and expensive. Use of computational approaches to mine ever-increasing sequences such as expressed sequence tags (ESTs) in public databases permits rapid and economical discovery of SSRs. Most such efforts to date have focused on mining SSRs from monocotyledonous ESTs. In this study, we have computationally mined and examined the abundance of SSRs in more than 1.54 million ESTs belonging to 55 dicotyledonous species. The frequency of ESTs containing SSRs among species ranged from 2.65% to 16.82%. Dinucleotide repeats were found to be the most abundant, followed by tri- or mono-nucleotide repeats. The motifs A/T, AG/GA/CT/TC, and AAG/AGA/GAA/CTT/TTC/TCT were the predominant mono-, di-, and tri-nucleotide SSRs, respectively. Most of the mononucleotide SSRs contained 15-25 repeats, whereas the majority of the di- and tri-nucleotide SSRs contained 5-10 repeats. The comprehensive SSR survey data presented here demonstrate the potential of in silico mining of ESTs for rapid development of SSR markers for genetic analysis and applications in dicotyledonous crops.
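Motif groups such as AG/GA/CT/TC arise because the same repeat can be read from any starting position and from either strand. A minimal sketch of one common way to collapse such groups is given below; it assumes a motif is equivalent to all of its rotations and to all rotations of its reverse complement. The helper `canonical_motif` is hypothetical and is not the grouping code used in the study.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return the alphabetically smallest rotation of the motif or of its reverse complement."""
    motif = motif.upper()
    reverse_complement = motif.translate(_COMPLEMENT)[::-1]
    candidates = []
    for strand in (motif, reverse_complement):
        for i in range(len(strand)):
            # Rotation that starts reading the repeat at offset i.
            candidates.append(strand[i:] + strand[:i])
    return min(candidates)


di_group = {canonical_motif(m) for m in ["AG", "GA", "CT", "TC"]}
tri_group = {canonical_motif(m) for m in ["AAG", "AGA", "GAA", "CTT", "TTC", "TCT"]}
print(di_group, tri_group)
```

Under this assumption, every member of each group listed in the abstract maps to a single representative (AG and AAG, respectively), so counts can be pooled per group.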

4.
A new set of 148 apple microsatellite markers has been developed and mapped on the apple reference linkage map Fiesta x Discovery. One hundred and seventeen markers were developed from genomic libraries enriched with the repeats GA, GT, AAG, AAC and ATC; 31 were developed from EST sequences. Markers derived from sequences containing dinucleotide repeats were generally more polymorphic than those derived from sequences containing trinucleotide repeats. An additional eight SSRs from published apple, pear, and Sorbus torminalis SSRs, whose position on the apple genome was unknown, have also been mapped. Transfer of SSRs across Maloideae species was efficient, with 41% of the markers successfully transferred. For all 156 SSRs, the primer sequences, repeat type, map position, and quality of the amplification products are reported. Also presented are allele sizes, ranges, and number of SSRs found in a set of nine cultivars. All this information, and that of the previous CH-SSR series, can be searched at the apple SSR database (), to which updates and comments can be added. A large number of apple ESTs containing SSR repeats are available and should be used for the development of new apple SSRs. The apple SSR database is also meant to become an international platform for coordinating this effort. The increased coverage of the apple genome with SSRs allowed the selection of a set of 86 reliable, highly polymorphic SSRs that are well scattered across the apple genome. These SSRs cover about 85% of the genome with an average distance of one marker per 15 cM. E. Silfverberg-Dilworth and C. L. Matasci contributed equally to this work.

5.
Eucalyptus microsatellites mined in silico: survey and evaluation (cited 1 time: 0 self-citations, 1 by others)
Eucalyptus is an important short-rotation pulpwood plant, grown widely in the tropics. Many genomic programmes are now underway, leading to the accumulation of voluminous genomic and expressed sequence tag sequences in public databases. These sequences can be utilized for analysis of simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) available in the transcribed genes. In this study, in silico analysis of 15,285 sequences representing partial and full-length mRNA from Eucalyptus species was carried out for their use in developing SSRs or microsatellites. A total of 875 EST-SSRs were identified from 772 SSR-containing ESTs. Minimum repeat counts of 6 for dinucleotide motifs and 5 for trinucleotide, tetranucleotide, and pentanucleotide motifs were used in locating the microsatellites. The average frequency of identified SSRs was 12.9%. The dinucleotide repeats were the most abundant among the dinucleotide, trinucleotide and tetranucleotide motifs and accounted for 50.9% of the Eucalyptus genome. Primer design analysis showed that 571 sequences with SSRs had sufficient flanking regions for polymerase chain reaction (PCR) primer synthesis. Evaluation of the usefulness of the SSRs showed that EST-derived SSRs can generate polymorphic markers, as all the primers showed allelic diversity among the 16 provenances of E. tereticornis.

6.
Xin D, Sun J, Wang J, Jiang H, Hu G, Liu C, Chen Q. Molecular Biology Reports, 2012, 39(9): 9047-9057
Microsatellites, or simple sequence repeats (SSRs), are very useful molecular markers for a number of plant species. We used a new publicly available module (TROLL) to extract microsatellites from the public database of soybean expressed sequence tag (EST) sequences. A total of 12,833 sequences containing di- to penta-type SSRs were identified from 200,516 non-redundant soybean ESTs. On average, one SSR was found per 7.25 kb of EST sequence, with tri-nucleotide motifs being the most abundant. Primer sequences flanking the SSR motifs were successfully designed for 9,638 soybean ESTs using the software primer3.0, and only 59 of these pairs had been reported in earlier studies. We synthesized 124 primer pairs to determine polymorphism and heterozygosity among eight soybean genotypes, which represented a wide range of cultivated soybean cultivars. PCR amplification products with the anticipated SSRs were obtained with 81 primer pairs; 36 PCR products appeared to be homozygous, and the remaining 45 appeared to be heterozygous and displayed polymorphism among the eight cultivars. We further analysed the EST sequences containing the 45 polymorphic EST-SSR markers using the programs BLASTN and BLASTX. Sequence alignment showed that 29 ESTs have homologous sequences and that 15 ESTs could be classified into a UniGene cluster with comparatively convincing protein products. Among these 15 ESTs belonging to a UniGene cluster, 9 SSRs were located in the 3'-UTR, 4 in the intron region and 2 in the CDS region. None of these SSRs was located in the 5'-UTR. These novel SSRs identified in soybean ESTs provide useful information for gene mapping and cloning in future studies.

7.
Pineapple (Ananas comosus (L.) Merrill) is the second most important tropical fruit in terms of international trade. The availability of whole genomic sequences and expressed sequence tags (ESTs) offers an opportunity to identify and characterize microsatellite or simple sequence repeat (SSR) markers in pineapple. A total of 278,245 SSRs and 41,962 SSRs, with overall densities of 728.57 SSRs/Mb and 619.37 SSRs/Mb, were mined from genomic and EST sequences, respectively. 5′-untranslated regions (5′-UTRs) had the highest SSR density, 3.6–5.2-fold higher than other regions. For repeat length, 12 bp was the predominant length in both the assembled genome and ESTs. Class I SSRs were underrepresented compared with class II SSRs. For motif length, dinucleotide repeats were the most abundant in genomic sequences, whereas trinucleotides were the most common motif in ESTs. Tri- and hexanucleotides made up a larger share of total SSRs in ESTs than in the whole genome. The SSR frequency decreased dramatically as the number of repeats increased. AT was the most frequent single motif across the entire genome, while AG was the most abundant motif in ESTs. Across six examined plant species, the pineapple genome displayed the highest density, substantially more than the second-place cucumber. Annotation and expression analyses were also conducted for genes containing SSRs. This thorough analysis of SSR markers in pineapple provided valuable information on the frequency and distribution of SSRs in the pineapple genome. This genomic resource will expedite genomic research and pineapple improvement.

8.
Gene-derived simple sequence repeats (genic SSRs), also known as functional markers, are often preferred over random genomic markers because they represent variation in gene coding and/or regulatory regions. We characterized 544 genic SSR loci derived from 138 candidate genes involved in wood formation, distributed throughout the genome of Populus tomentosa, a key ecological and cultivated wood production species. Of these SSRs, three-quarters were located in the promoter or intron regions, and dinucleotide (59.7%) and trinucleotide repeat motifs (26.5%) predominated. By screening 15 wild P. tomentosa ecotypes, we identified 188 polymorphic genic SSRs with 861 alleles, 2–7 alleles for each marker. Transferability analysis of 30 random genic SSRs, testing whether these SSRs work in 26 genotypes from five sections of the genus Populus (outgroup, Salix matsudana), showed that 72% of the SSRs could be amplified in Turanga and 100% could be amplified in Leuce. Based on genotyping of these 26 genotypes, a neighbour-joining analysis showed the expected six phylogenetic groupings. In silico analysis of SSR variation in 220 sequences that are homologous between P. tomentosa and Populus trichocarpa suggested that genic SSR variations between relatives were predominantly affected by repeat motif variations or flanking sequence mutations. Inheritance tests and single-marker associations demonstrated the power of genic SSRs in family-based linkage mapping and candidate gene-based association studies, as well as marker-assisted selection and comparative genomic studies of P. tomentosa and related species.

9.
Simple sequence repeat (SSR) markers were developed from expressed sequence tags (ESTs) in the eastern oyster (Crassostrea virginica). ESTs of the eastern oyster were downloaded from GenBank and screened for SSRs with at least eight units of dinucleotide or five units of tri-, tetra-, penta-, and hexa-nucleotide repeats. The screening of 9101 ESTs identified 127 (1.4%) SSR-containing sequences. Primers were designed for 88 SSR-containing ESTs with good and sufficient flanking sequences. Polymerase chain reaction (PCR) amplification was successful for 71 primer pairs, including 19 (27%) pairs that amplified fragments longer than the expected sizes, probably due to introns. Sixty-six pairs that produced fragments shorter than 800 bp were screened for polymorphism in five oysters from three populations via polyacrylamide gels, and 53 of them (80%) were polymorphic. The fifty-three polymorphic SSRs were labeled and genotyped in 30 oysters from three populations via an automated sequencer. Five of the SSRs amplified more than two fragments per oyster, suggesting locus duplication. The remaining 48 SSRs had 2 alleles per individual, including 11 with null alleles. In the 30 oysters analyzed, the SSRs had an average of 9.3 alleles per locus, ranging from 2 to 24. Forty-three loci segregated in a family with 100 progeny, with nine showing significant deviation from Mendelian ratios (three after Bonferroni correction). Seventy percent of the loci were successfully amplified in C. rhizophorae and 34% in C. gigas. This study demonstrates that ESTs are valuable resources for the development of SSR markers in the eastern oyster, and EST-derived SSRs are more transferable across species than genomic SSRs.
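The screening rule stated in this abstract (at least eight dinucleotide units, or at least five units of tri- to hexa-nucleotide motifs) can be expressed with back-referencing regular expressions. The following is only a sketch of that rule for perfect repeats; the abstract does not name the software used, and the names `MIN_REPEATS`, `is_primitive` and `find_ssrs` are hypothetical.

```python
import re

# Minimum number of repeat units per motif length, as stated in the abstract.
MIN_REPEATS = {2: 8, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. not ATAT or AAA)."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One k-base motif followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


example = "TTGC" + "AG" * 9 + "CCGA" + "CTT" * 5 + "GA"
print(find_ssrs(example))
```

The primitive-motif check keeps a run such as ATATATAT from being reported a second time as a tetranucleotide repeat. Imperfect or compound repeats are outside the scope of this sketch.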

10.
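Distribution characteristics of microsatellites in melon EST sequences (cited 2 times: 0 self-citations, 2 by others)
After redundancy removal, the 35,547 melon ESTs in GenBank yielded 34,438 non-redundant ESTs with a total length of 250.3 Mb. These sequences contained 2813 microsatellites, or simple sequence repeats (SSRs), distributed across 2107 ESTs, giving an occurrence frequency of 8.16% and an average spacing of 8.90 kb. Trinucleotide repeats were the dominant repeat type, accounting for 47.14% of all SSRs, followed by dinucleotide and mononucleotide repeats at 20.72% and 16.99%, respectively. AAG/TTC was the predominant repeat motif, accounting for 29.26% of all microsatellites, while AG/CT and A/T accounted for 14.61% and 16.25%, respectively. Of all SSRs, 70.32% had 4-10 repeats and 51.12% were 12-20 bp long. The polymorphism potential of these SSRs was also evaluated.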

11.
The frequency, type and distribution of simple sequence repeats (SSRs) in Porphyra haitanensis genomes were investigated using expressed sequence tag (EST) data deposited in public databases. A total of 3,489 non-redundant P. haitanensis ESTs were screened for SSRs using SSRhunter software. From those, 224 SSRs in 210 ESTs were identified; trinucleotides were the most common type of SSR (64.29%), followed by dinucleotides (33.48%). Tetranucleotides, pentanucleotides, and hexanucleotides were not common. Among all identified motif types, CGG/CCG had the highest frequency (33.9%), followed by TC/AG (24.6%). From these EST-SSRs, 37 SSR primer pairs were designed and tested using common SSR reaction conditions with 15 P. haitanensis DNAs as templates. The results showed that 28 SSR primer pairs gave good amplification patterns. These were used to conduct SSR analyses of genetic variation in the 15 germplasm strains of P. haitanensis. A total of 224 alleles were detected, with the number of alleles ranging from 4 to 15. The effective number of alleles, expected heterozygosity, and polymorphism information content of the 15 germplasm strains of P. haitanensis were 2.81, 0.64, and 0.57, respectively. All of these parameters indicate that the 15 germplasm strains of P. haitanensis harbor rich genetic variation.
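The three diversity statistics named in this abstract are usually computed per locus from allele frequencies. The sketch below uses the commonly cited definitions: expected heterozygosity as one minus the sum of squared frequencies, effective number of alleles as the reciprocal of that sum, and polymorphism information content in the form given by Botstein and colleagues. The abstract does not state which formulas or software were used, so this is an assumption, and `diversity_stats` is a hypothetical name.

```python
from itertools import combinations


def diversity_stats(freqs):
    """Return (effective allele number, expected heterozygosity, PIC) for one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    sum_sq = sum(p * p for p in freqs)
    he = 1.0 - sum_sq      # expected heterozygosity
    ne = 1.0 / sum_sq      # effective number of alleles
    # PIC subtracts 2 * p_i^2 * p_j^2 for every unordered pair of alleles.
    pic = he - sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return ne, he, pic


ne, he, pic = diversity_stats([0.5, 0.5])
print(ne, he, pic)
```

For a locus with two equally frequent alleles this gives 2 effective alleles, an expected heterozygosity of 0.5 and a PIC of 0.375. Values reported for a marker set, such as those in the abstract, are typically averages over loci.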

12.
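Identification and analysis of SSRs in Eimeria tenella EST sequences (cited 1 time: 0 self-citations, 1 by others)
A bioinformatic analysis of Eimeria tenella EST-SSRs was carried out. A total of 34,074 E. tenella EST sequences with a combined length of 16.45 Mb were obtained; ESTs with SSRs of less than 12 bp numbered 7651, and from these 19,576 SSR sequences with a total length of 0.35 Mb were obtained. The frequency of EST-SSRs was 48.00%, with one SSR of at least 12 bp occurring on average every 840 bp. Among the nucleotide repeat motifs of E. tenella, repeats of 2, 3, 4, 5, 6 and 7 bp occurred as 11 types (472 sequences), 49 types (14,710 sequences), 31 types (525 sequences), 13 types (25 sequences), 21 types (43 sequences) and 15 types (400 sequences), respectively. Trinucleotide repeats were the most abundant repeat unit, accounting for 75.14% of the total. Among the SSRs, the most frequent of the G- and C-rich repeat units was GCA (28.63%), followed by AGC (17.59%), GCT (8.76%), TGC (7.62%) and CTG (7.15%).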

13.
Genomic resources for peach, a model species for Rosaceae, are being developed to accelerate gene discovery in other Rosaceae species by comparative mapping. Simple sequence repeats (SSRs) are an important tool for comparative mapping because of their high polymorphism and transportability. To accelerate the development of SSR markers, we analyzed publicly available Rosaceae expressed sequence tags (ESTs) for SSRs. A total of 17,284 ESTs from almond, peach and rose were assembled into putatively non-redundant EST sets. For comparison, 179,099 ESTs from Arabidopsis were also used in the analysis. About 4% of the assembled ESTs contained SSRs in Rosaceae, which was higher than the 2.4% found in Arabidopsis. About half of the SSRs were found in the putative UTR, and the estimated average distance between SSRs in the UTR was 5.5 kb in rose, 5.1 kb in almond, 7 kb in peach and 13 kb in Arabidopsis. In the putative coding region, the estimated average distance was two to four times longer than in the UTR. Rosaceae ESTs containing SSRs were functionally annotated using the GenBank nr database and further classified using the gene ontology terms associated with the matching sequences in the SwissProt database. The detailed data including the sequences and annotation results are available from .

14.
In the present study, 3217 UniGene sequences of Neurospora crassa downloaded from the National Center for Biotechnology Information (NCBI) were mined for the identification of microsatellites or simple sequence repeats (SSRs). A total of 287 SSRs were detected in the 4187.86 kb of sequence mined, a density of 1 SSR per 14.6 kb; only 250 (7.8%) of the sequences contained SSRs. Depending on the repeat unit, the length of SSRs ranged from 14 to 17 bp for mono-, 14 to 48 bp for di-, 18 to 90 bp for tri-, and 24 to 48 bp for tetra-nucleotide repeats, was 30 bp for penta-nucleotide repeats, and ranged from 42 to 48 bp for hexa-nucleotide repeats. Tri-nucleotide repeats were the most frequent repeat type (88.8%), followed by di-nucleotide repeats (5.9%). A bioinformatics approach was also used to design primer pairs for the identified SSRs; primers were found for only 239 sequences, and this part needs experimental validation. Annotation of the SSR-containing sequences was also carried out.

15.
The abundance of simple sequence repeats (SSRs), or microsatellites, and their inherent potential for variation make them a valuable source of genetic markers in eukaryotes. We describe the organization and abundance of SSRs in the fungus Fusarium graminearum (the causative agent of Fusarium head blight, or head scab, of wheat). We identified 1705 SSRs of various nucleotide repeat motifs in the sequence database of F. graminearum. Mononucleotide repeats (62%) were the most abundant, followed by di- (20%) and trinucleotide repeats (14%). Tetra-, penta- and hexanucleotide repeats accounted for only 4% of SSRs. The estimated frequency of Class I SSRs (perfect repeats ≥20 nucleotides) was one SSR per 124.5 kb, whereas the frequency of Class II SSRs (perfect repeats of more than 10 but fewer than 20 nucleotides) was one SSR per 25.6 kb. The dynamics of SSRs will be a powerful tool for taxonomic, phylogenetic, genome mapping and population genetic studies, as SSR-based markers show high levels of allelic variation, codominant inheritance and ease of analysis.
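The Class I / Class II split described in this abstract is a length threshold on perfect repeats. A minimal sketch follows, assuming the Class II range is read as more than 10 and fewer than 20 nucleotides; the function name `ssr_class` is hypothetical.

```python
def ssr_class(length_nt):
    """Classify a perfect SSR by its total length in nucleotides."""
    if length_nt >= 20:
        return "Class I"
    if 20 > length_nt > 10:
        return "Class II"
    return None  # too short to count under either class


print([ssr_class(n) for n in (24, 12, 10)])
```

For example, an (AG)12 repeat is 24 nucleotides long and falls in Class I, while (AG)6 at 12 nucleotides falls in Class II.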

16.
With the advent of high-throughput sequencing technology, sequences from many genomes are being deposited in public databases at a brisk rate. Open access to large amounts of expressed sequence tag (EST) data in the public databases has provided a powerful platform for simple sequence repeat (SSR) development in species where sequence information is not available. SSRs are markers of choice for their high reproducibility, abundant polymorphism and high inter-specific transferability. The mining of SSRs from ESTs requires several high-throughput computational tools that must be executed individually, which is computationally intensive and time consuming. To reduce the time lag and to streamline the cumbersome process of SSR mining from ESTs, we have developed a user-friendly, web-based EST-SSR pipeline, the "EST-SSR-MARKER PIPELINE (ESMP)". This pipeline integrates EST pre-processing, clustering, assembly and subsequent mining of SSRs from the assembled EST sequences. The mining of SSRs from ESTs provides valuable information on the abundance of SSRs in ESTs and will facilitate the development of markers for genetic analysis and related applications such as marker-assisted breeding. AVAILABILITY: The database is available for free at http://bioinfo.aau.ac.in/ESMP.

17.
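To develop functional EST-SSR markers in asparagus, 8590 asparagus (Asparagus officinalis L.) EST sequences from the NCBI public database were searched for simple sequence repeats (SSRs). After redundant sequences were removed, 8377 non-redundant sequences remained. A total of 469 EST-SSRs were mined from the non-redundant sequences, an average of one SSR every 14.80 kb. Among all repeat motifs, dinucleotide SSRs made up the largest share at 40.51% (190/469), followed by trinucleotide SSRs at 34.97% (164/469) and hexanucleotide SSRs at 21.11% (99/469). Among all motifs, CT/AG was the most frequent, occurring 62 times, or 13.22% (62/469) of all repeat motifs. Thirty SSR-containing EST sequences were selected, primers were designed with the primer5 software, and the SSR loci were amplified. Of these, 27 primer pairs yielded amplification products and 24 gave clear, reliable target bands, 80% of the primer pairs tested; the number of asparagus alleles detected was fairly high, averaging 4.93 per primer pair. The development of these EST-SSR markers will support research on population genetic diversity, genetic map construction, gene mapping, molecular markers and pedigree analysis in asparagus.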

18.
Expressed sequence tag (EST) derived simple sequence repeats (SSRs, microsatellites) were screened and identified from 3863 almond and 10,185 peach EST sequences, and the spectra of SSRs in the non-redundant EST sequences were investigated after sequence assembly. One hundred seventy-eight (12.07%) almond SSRs and 497 (9.97%) peach SSRs were detected. An EST-SSR occurs every 4.97 kb in almond ESTs and every 6.57 kb in peach, and SSRs with di- and trinucleotide repeat motifs are the most abundant in both almond and peach ESTs. Twenty-one EST-SSRs were then developed and used together with 7 genomic SSRs to study the genetic relationships among 36 almond (P. communis Fritsch.) cultivars from China and the Mediterranean area, as well as 8 accessions of other related species from the genus Prunus. Both EST-derived and genomic SSR markers showed high cross-species transferability in the genus. Out of the 112 polymorphic alleles detected in the 36 cultivated almonds, 28 are specific to Chinese cultivars and 25 to the others. The 44 accessions were clustered into 4 groups in the phylogenetic tree, and the 36 almond cultivars formed two distinct subgroups, one containing only Chinese cultivars and one of unknown origin and the other only those originating from the Mediterranean area, indicating that Chinese almond cultivars have a distinct evolutionary history from the Mediterranean almond. Our preliminary results indicated that common almond was more closely related to peach (P. persica (L.) Batsch.) than to the four wild species of almond (P. mongolica Maxim., P. ledebouriana Schleche, P. tangutica Batal., and P. triloba Lindl.). The implications of these SSR markers for evolutionary analysis and molecular mapping of Prunus species are discussed.

19.
Expressed sequence tags (ESTs) from Coffea canephora leaves and fruits were used to search for types and frequencies of simple sequence repeats (EST–SSRs) with a motif length of 1–6 bp. From a non-redundant (NR) EST set of 5,534 potential unigenes, 6.8% SSR-containing sequences were identified, with an average density of one SSR every 7.73 kb of EST sequences. Trinucleotide repeats were found to be the most abundant (34.34%), followed by di- (25.75%) and hexa-nucleotide (22.04%) motifs. The development of unique genic SSR markers was optimized by a computational approach which allowed us to eliminate redundancy in the original EST set and also to test the specificity of each pair of designed primers. Twenty-five EST–SSRs were developed and used to evaluate cross-species transferability in the Coffea genus. The orthology was supported by the amplicon sequence similarity and the amplification patterns. The >94% identity of flanking sequences revealed high sequence conservation across the Coffea genus. A high level of polymorphic loci was obtained regardless of the species considered (from 75% for C. liberica to 86% for C. canephora). Moreover, the polymorphism revealed by EST–SSR was similar to that exposed by genomic SSR. It is concluded that Coffea ESTs are a valuable resource for microsatellite mining. EST-SSR markers developed from C. canephora sequences can be easily transferred to other Coffea species for which very little molecular information is available. They constitute a set of conserved orthologous markers, which would be ideal for assessing genetic diversity in coffee trees as well as for cross-referencing transcribed sequences in comparative genomics studies.

20.