Similar literature
Found 20 similar articles (search time: 31 ms)
1.
Xin D, Sun J, Wang J, Jiang H, Hu G, Liu C, Chen Q. Molecular Biology Reports, 2012, 39(9): 9047-9057
Microsatellites, or simple sequence repeats (SSRs), are very useful molecular markers for a number of plant species. We used a new publicly available module (TROLL) to extract microsatellites from the public database of soybean expressed sequence tag (EST) sequences. A total of 12,833 sequences containing di- to penta-type SSRs were identified from 200,516 non-redundant soybean ESTs. On average, one SSR was found per 7.25 kb of EST sequence, with tri-nucleotide motifs being the most abundant. Primer sequences flanking the SSR motifs were successfully designed for 9,638 soybean ESTs using the software primer3.0, and only 59 of these primer pairs had been reported in earlier studies. We synthesized 124 primer pairs to determine polymorphism and heterozygosity among eight soybean genotypes, which represented a wide range of cultivated soybean cultivars. PCR amplification products with the anticipated SSRs were obtained with 81 primer pairs; 36 PCR products appeared to be homozygous, and the remaining 45 appeared to be heterozygous and displayed polymorphism among the eight cultivars. We further analysed the EST sequences containing the 45 polymorphic EST-SSR markers using the programs BLASTN and BLASTX. Sequence alignment showed that 29 ESTs have homologous sequences and 15 ESTs could be classified into a Uni-gene cluster with comparatively convincing protein products. Among these 15 ESTs, 9 SSRs were located in the 3'-UTR, 4 in intron regions, and 2 in the CDS region. None of these SSRs was located in the 5'-UTR. These novel SSRs identified in soybean ESTs provide useful information for gene mapping and cloning in future studies.
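
The abstract above does not say how TROLL decides what counts as an SSR, so the following is only a minimal sketch of the general idea: scanning a sequence for perfect di- to penta-nucleotide tandem repeats with regular expressions. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are illustrative assumptions, not the tool's algorithm or the study's thresholds.

```python
import re

# Minimum number of motif copies per motif length.
# These thresholds are assumptions for illustration only.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect di- to penta-nucleotide repeats."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(
                motif == motif[:d] * (motif_len // d)
                for d in range(1, motif_len)
                if motif_len % d == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


example = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "TT"
print(find_ssrs(example))
```

Real EST pipelines usually also handle imperfect and compound repeats and trim poly-A tails first; this sketch does neither.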

2.
Expressed Sequence Tag (EST) analysis has pioneered genome-wide gene discovery and expression profiling. In order to establish a gene expression index in the rice cultivar indica, we sequenced and analyzed 86,136 ESTs from nine rice cDNA libraries from the super hybrid cultivar LYP9 and its parental cultivars. We assembled these ESTs into 13,232 contigs, leaving 8,976 singletons. Overall, 7,497 sequences were found to be similar to existing sequences in GenBank and 14,711 were novel. These sequences were classified by molecular function, biological process and pathway according to the Gene Ontology. We compared our sequenced ESTs with the 95,000 publicly available ESTs from japonica and found little sequence variation, despite the large difference between the genome sequences. We then assembled the combined 173,000 rice ESTs for further analysis. Using the pooled ESTs, we compared gene expression in metabolic pathways between rice and Arabidopsis according to KEGG. We further profiled gene expression pattern

3.
Simple sequence repeats (SSRs) developed from expressed sequence tags (ESTs), known as EST-SSRs, are widely used and potentially valuable gene-based markers because of their high cross-taxon portability and their rapid, inexpensive development. The EST sequence information in publicly available databases is growing at an increasing rate, and emerging computational approaches offer a better way to develop SSR markers from ESTs than conventional methods. In the present study, 12,851 EST sequences of Camellia sinensis, downloaded from the National Center for Biotechnology Information (NCBI), were mined for microsatellites. Preprocessing and assembly with various computational tools yielded 6,148 non-redundant EST sequences (4,779 singletons and 1,369 contigs). Of the 3,822.68 kb of sequence examined, 1,636 EST sequences (26.61%) containing 2,371 SSRs were detected, a density of 1 SSR per 1.61 kb, leading to the development of 245 primer pairs. These EST-SSR markers will support further studies of variability, mapping, and evolutionary relationships in Camellia sinensis, and can also be applied in studies across species.
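
The summary figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only recomputes the reported density and percentage from the reported counts; the variable names are ours.

```python
# Values as reported in the abstract.
total_kb = 3822.68             # total sequence examined, in kb
n_ssrs = 2371                  # SSRs detected
n_nonredundant = 4779 + 1369   # singletons + contigs
n_with_ssr = 1636              # ESTs containing at least one SSR

kb_per_ssr = total_kb / n_ssrs
pct_with_ssr = 100 * n_with_ssr / n_nonredundant

print(n_nonredundant)            # 6148
print(round(kb_per_ssr, 2))      # 1.61
print(round(pct_with_ssr, 2))    # 26.61
```

The recomputed values agree with the abstract's 6,148 non-redundant sequences, 1 SSR per 1.61 kb, and 26.61%.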

4.
Human bone marrow stromal cells (HBMSC) are pluripotent cells with the potential to differentiate into osteoblasts, chondrocytes, myelosupportive stroma, and marrow adipocytes. We used high-throughput DNA sequencing analysis to generate 4258 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' (97) and 3' (4161) ends of human cDNA clones from an HBMSC cDNA library. Our goal was to obtain tag sequences from the maximum number of possible genes and to deposit them in the publicly accessible database for ESTs (dbEST of the National Center for Biotechnology Information). Comparisons of our EST sequencing data with nonredundant human mRNA and protein databases showed that the ESTs represent 1860 gene clusters. BLAST analysis against 3.0 million ESTs in NCBI's dbEST database showed 60 novel genes found only in this cDNA library. The BLAST search also identified ESTs that have close homology to known genes, which suggests that these may be newly recognized members of known gene families. The gene expression profile of this cell type is revealed by analyzing both the frequency with which a message is encountered and the functional categorization of expressed sequences. Comparing an EST sequence with the human genomic sequence database enables assignment of an EST to a specific chromosomal region (a process called digital gene localization) and often enables immediate partial determination of intron/exon boundaries within the genomic structure. It is expected that high-throughput EST sequencing and data mining analysis will greatly promote our understanding of gene expression in these cells and of growth and development of the skeleton.

5.
6.
To characterize genes whose expression is induced under carbon-stress conditions, 12,969 and 13,450 5'-end expressed sequence tags (ESTs) were generated from cells of the unicellular green alga Chlamydomonas reinhardtii grown under low-CO2 and high-CO2 conditions, respectively. These ESTs were clustered into 4436 and 3566 non-redundant EST groups, respectively. Comparison of their sequences with those of 3433 non-redundant ESTs previously generated from cells under the standard growth condition indicated that 2665 and 1879 EST groups occurred only in the low-CO2 and high-CO2 populations, respectively. It was also noted that 96.2% and 96.0% of the cDNA species obtained from the low-CO2 and high-CO2 conditions, respectively, had no similar EST sequence deposited in the public databases. The EST species identified only in the low-CO2-treated cells included genes previously reported to be expressed specifically in low-CO2-acclimatized cells, suggesting that the ESTs generated in this study will be a useful resource for analysis of genes related to carbon-stress acclimatization. The sequence information and search results for each clone will appear at the web site: http://www.kazusa.or.jp/en/plant/chlamy/EST/.

7.
8.
In order to study gene expression in a reproductive organ, we constructed a cDNA library of mature flower buds in Lotus japonicus and characterized expressed sequence tags (ESTs) of 842 randomly selected clones. The EST sequences were clustered into 718 non-redundant groups. In BLAST and FASTA searches of both protein and DNA databases, 58.5% of the EST groups showed significant sequence similarity to known genes. Several genes encoding these EST clones were identified as pollen-specific genes, such as pectin methylesterase, ascorbate oxidase, and polygalacturonase, and as homologues of genes involved in pollen-pistil interaction. Comparison of these EST sequences with those derived from the whole plant of L. japonicus revealed that 64.8% of the EST sequences from the flower buds were not found among the EST sequences of the whole plant. Taken together, the EST data from flower buds generated in this study are useful in dissecting gene expression in the floral organs of L. japonicus.

9.
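Using differential-display PCR, we isolated 63 fragments that were differentially expressed in soybean and cotton after glyphosate induction; sequencing showed that they correspond to 33 glyphosate-induced soybean and cotton EST sequences. Further alignment against GenBank showed that about 85% of these ESTs share more than 95% homology with ESTs from expression libraries induced by abiotic stresses such as salicylic acid, cold, wounding, and oxidation, which suggests that these genes take part in the plant response to abiotic stress. Obtaining these ESTs, which are highly expressed after glyphosate induction, will facilitate the isolation of related abiotic-stress-inducible genes and promoters and the study of their transcriptional regulation, and may make it possible to establish a glyphosate-inducible system. Such a system would address the problem that constitutive expression drives a foreign gene at all developmental stages and in all tissues of the plant, wasting the plant's energy.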

10.
Expressed sequence tags (ESTs) from the marine red alga Gracilaria gracilis (total citations: 2; self-citations: 0; citations by others: 2)
Expressed sequence tags (ESTs) are partial sequences of cDNAs, and can be used to characterize gene expression in organisms or tissues. We have constructed a 200-sequence EST database from vegetative thalli of Gracilaria gracilis, the first ESTs reported from any alga. This database contains recognizable ESTs corresponding to genes of carbohydrate metabolism (seven), amino acid metabolism (three), photosynthesis (five), nucleic acid synthesis, repair and processing (three), protein synthesis (14), protein degradation (six), cellular maintenance and stress response (three), other identifiable protein-coding genes (13) and 146 sequences for which significant matches were not found in existing sequence databases. We have already used this EST database to recover genes of carbohydrate biosynthesis from G. gracilis. This revised version was published online in August 2006 with corrections to the Cover Date.

11.
Expressed sequence tags (ESTs) represent 500-1000-bp-long sequences corresponding to mRNAs derived from different sources (cell lines, tissues, etc.). The human EST database contains over 8,000,000 sequences, with over 4,000,000,000 total nucleotides. RNA molecules are transcribed from a genomic DNA template; therefore, all ESTs should match the corresponding genome. Nevertheless, we have found in the human EST database approximately 11,000 ESTs that do not match sequences in the human genome database. The presence of "trash" ESTs (TESTs) in the EST database could result from DNA or RNA contamination of the laboratory equipment, tissues, or cell lines. TESTs could also represent sequences from unidentified human genes or from species inhabiting the human body. Here, we attempt to identify the sources of contamination in the human EST database. In particular, we discuss systematic contamination of the mammalian EST databases with plant sequences.

12.
A new EST clustering method (total citations: 11; self-citations: 0; citations by others: 11)
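This study developed an EST (expressed sequence tag) clustering method, ESTClustering, for analyzing the large volumes of data produced by large-scale EST sequencing in order to obtain high-quality, non-redundant expressed sequences. During clustering, the method uses the MEGABLAST tool to compare consensus sequences for homology and uses the phrap program to verify the assembly of each EST cluster. This strategy reduces the impact of sequencing errors, effectively identifies gene family members, and avoids interference from alternative splicing. Compared with the UniGene clustering method of NCBI (National Center for Biotechnology Information), the results of ESTClustering better reflect the diversity of expressed sequences. In a test on 112,256 Arabidopsis ESTs, ESTClustering produced 23,581 EST clusters, of which 13,597 correspond to coding sequences in the Arabidopsis genome, close to the number of predicted genes in that genome that are supported by EST evidence. Applied to a collected set of 147,191 rice EST sequences, the method produced 33,896 EST clusters.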
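
As a rough illustration of what similarity-based clustering means here, the sketch below groups sequences by single linkage using a union-find structure. It is not the ESTClustering method: shared k-mer counting stands in for MEGABLAST, there is no phrap assembly check, and the names `cluster_ests`, `k`, and `min_shared` are assumptions made for this example.

```python
from itertools import combinations


def kmers(seq, k=8):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_ests(ests, k=8, min_shared=5):
    """Single-linkage clustering: ESTs sharing >= min_shared k-mers are merged.

    Returns a sorted list of clusters, each a list of indices into ests.
    The k-mer test is a stand-in for a real alignment tool.
    """
    parent = list(range(len(ests)))

    def find(i):
        # Follow parent links to the cluster root, halving paths on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    profiles = [kmers(s.upper(), k) for s in ests]
    for a, b in combinations(range(len(ests)), 2):
        if len(profiles[a] & profiles[b]) >= min_shared:
            parent[find(a)] = find(b)

    clusters = {}
    for i in range(len(ests)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


ests = [
    "ACGTTGCAAGGCTTAACCGG",
    "GCAAGGCTTAACCGGTATCGAT",   # overlaps the 3' end of the first EST
    "TTTTTTTTTTCCCCCCCCCC",     # unrelated
]
print(cluster_ests(ests))
```

The all-against-all comparison is quadratic in the number of sequences, so it only suits toy inputs; single linkage can also chain unrelated sequences through chimeric reads, which is one reason an assembly check on each cluster is useful.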

13.
14.
Expressed Sequence Tags (ESTs) are short, usually unedited sequences obtained by single-pass sequencing of cDNA clones from any cDNA library. Analyzing and comparing ESTs can provide information on gene expression, function and evolution. Large-scale EST sequencing has become an attractive alternative to plant genome sequencing. Currently, plant EST collections comprise over 3.8 million sequences from about 200 species. They have proved to be a valuable tool for gene discovery and plant metabolism analysis. Several plant-specific EST databases have been created which provide access to sequence data and bioinformatics-based tools for data mining. Searching EST collections allows pre-selection of genes for preparing cDNA arrays targeted to yield maximum information on specialized processes, such as stress response and symbiotic nitrogen fixation. In addition, EST-based molecular markers such as SNPs, SSRs, and indels are fast-developing tools for breeders and researchers.

15.
16.
17.
18.
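Construction and application of a PC/Linux-based system for in silico elongation of nucleic acid sequences (total citations: 5; self-citations: 0; citations by others: 5)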
Obtaining the full-length cDNA sequence of a novel gene is often difficult for molecular biologists. The Human Genome Project and related projects have generated large numbers of expressed sequence tags (ESTs), and with suitable bioinformatics algorithms these ESTs can often be used to extend fragments of novel genes. Using a personal computer running the Linux operating system, the Blast and Phrap software, and an EST database, we built a system for in silico elongation of EST sequences. We tested it on 11,386 EST sequences and 511 cDNA insert sequences from human fetal liver: 8,373 ESTs and 389 cDNA sequences were elongated to varying degrees, and some of the results were confirmed by RACE experiments. The system elongates EST sequences efficiently and at scale, and can provide important leads for obtaining the full-length cDNA sequences of novel genes experimentally.
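
To make the idea of in silico elongation concrete, here is a toy sketch that extends a seed sequence at its 3' end using exact suffix/prefix overlaps with other reads. It is a simplification under stated assumptions: the system described above finds overlapping ESTs with Blast and assembles them with Phrap, both of which tolerate mismatches, whereas the hypothetical helpers `overlap` and `extend_3prime` below require perfect matches and ignore the reverse strand.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b, or 0 if < min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def extend_3prime(seed, ests, min_len=10):
    """Greedily extend seed at its 3' end with exactly overlapping ESTs."""
    contig = seed
    remaining = list(ests)
    extended = True
    while extended:
        extended = False
        for est in remaining:
            n = overlap(contig, est, min_len)
            # Only use reads that overlap and add at least one new base.
            if n and n < len(est):
                contig += est[n:]
                remaining.remove(est)
                extended = True
                break
    return contig


seed = "ATGGCGTACGTTAGC"
reads = ["GTACGTTAGCCTGAAATTCG", "CTGAAATTCGGGTACCA"]
print(extend_3prime(seed, reads))
```

Because each read is used at most once and every step lengthens the contig, the loop always terminates; a practical system would also extend the 5' end and check each extension against alternative overlaps.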

19.

Background  

While Expressed Sequence Tags (ESTs) have proven a viable and efficient way to sample genomes, particularly those for which whole-genome sequencing is impractical, phylogenetic analysis using ESTs remains difficult. Sequencing errors and orthology determination are the major problems when using ESTs as a source of characters for systematics. Here we develop methods to incorporate EST sequence information in a simultaneous analysis framework to address controversial phylogenetic questions regarding the relationships among the major groups of seed plants. We use an automated, phylogenetically derived approach to orthology determination called OrthologID, generate a phylogeny based on 43 process partitions, many of which are derived from ESTs, and examine several measures of support to assess the utility of EST data for phylogenies.

20.
For comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 22,983 5'-end expressed sequence tags (ESTs) were accumulated from normalized and size-selected cDNA libraries constructed from young (2-week-old) plants. The EST sequences were clustered into 7137 non-redundant groups. A similarity search against the public non-redundant protein database indicated that 3302 groups showed similarity to genes of known function and 1143 groups to hypothetical genes, while 2692 were novel sequences. Homologues of 5 nodule-specific genes that have been reported in other legume species were found among the collected ESTs, suggesting that the EST resource generated in this study will become a useful tool for identifying genes related to legume-specific biological processes. The sequence data of individual ESTs are available at the web site: http://www.kazusa.or.jp/en/plant/lotus/EST/.
