Simple sequence repeat (SSR) markers are widely used in many plant and animal genomes due to their abundance, hypervariability, and suitability for high-throughput analysis. Development of SSR markers using molecular methods is time-consuming, laborious, and expensive. Use of computational approaches to mine ever-increasing sequence collections, such as expressed sequence tags (ESTs) in public databases, permits rapid and economical discovery of SSRs. Most such efforts to date have focused on mining SSRs from monocotyledonous ESTs. In this study, we computationally mined and examined the abundance of SSRs in more than 1.54 million ESTs belonging to 55 dicotyledonous species. The frequency of ESTs containing SSRs among species ranged from 2.65% to 16.82%. Dinucleotide repeats were found to be the most abundant, followed by tri- or mono-nucleotide repeats. The motifs A/T, AG/GA/CT/TC, and AAG/AGA/GAA/CTT/TTC/TCT were the predominant mono-, di-, and tri-nucleotide SSRs, respectively. Most of the mononucleotide SSRs contained 15-25 repeats, whereas the majority of the di- and tri-nucleotide SSRs contained 5-10 repeats. The comprehensive SSR survey data presented here demonstrate the potential of in silico mining of ESTs for rapid development of SSR markers for genetic analysis and applications in dicotyledonous crops.
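As an illustration of the kind of in silico SSR mining described above, the following is a minimal sketch using regular-expression backreferences. It is not the pipeline used in the study: the function name `find_ssrs` and the minimum repeat thresholds in `MIN_REPEATS` are assumptions chosen for the example, loosely echoing the repeat ranges quoted in the abstract.

```python
import re

# Assumed thresholds (hypothetical, not from the study): minimum number of
# repeat units required for mono-, di- and tri-nucleotide motifs.
MIN_REPEATS = {1: 15, 2: 5, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One motif of length k followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs such as "AA" or "AAA": they are mononucleotide
            # runs and are handled by the k == 1 pass.
            if k > 1 and len(set(motif)) == 1:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence: six AG units, a run of 16 A, and five AAG units.
example = "TTC" + "AG" * 6 + "CCT" + "A" * 16 + "GC" + "AAG" * 5 + "T"
print(find_ssrs(example))
# [(3, 'AG', 6), (18, 'A', 16), (36, 'AAG', 5)]
```

The regex reports the leftmost phase of a repeat, so the same locus may be reported as AG or GA depending on the flanking base; grouping motifs into classes such as AG/GA/CT/TC, as the abstract does, would be a separate normalization step.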

Analysis of expressed sequence tags of Porphyra yezoensis   Cited by: 9 (self-citations: 0; citations by others: 9)
Lee EK, Seo SB, Kim TH, Sung SK, An G, Lee CH, Kim YJ. Molecules and Cells, 2000, 10(3): 338-342
Single direct partial sequencing of anonymous cDNA clones was performed to obtain genetic information on the red alga Porphyra yezoensis, for which genetic information is not available. This expressed sequence tag (EST) analysis revealed that 81 clones (42%) had significant homologies to known genes in GenBank. Of these clones, eight are related to known algal genes, whereas more than 90% of the EST clones were newly identified in algae. Putative functional categories of these clones showed that the most abundant genes were involved in stress and defense mechanisms and that the next most abundant genes were associated with protein synthetic pathways.

Analysis of Medicago truncatula nodule expressed sequence tags   Cited by: 2 (self-citations: 0; citations by others: 2)
Systematic sequencing of expressed sequence tags (ESTs) can give a global picture of the assembly of genes involved in the development and function of organs. Indeterminate nodules representing different stages of the developmental program are especially suited to the study of organogenesis. With the vector lambdaHybriZAP, a cDNA library was constructed from emerging nodules of Medicago truncatula induced by Sinorhizobium meliloti. The 5' ends of 389 cDNA clones were sequenced, and these ESTs were then analyzed both by sequence homology search and by studying their expression in roots and nodules. Two hundred fifty-six ESTs exhibited significant similarities to characterized database entries, and 40 of them represented 26 nodulin genes, while 133 had no similarity to sequences with known function. Only 60 out of the 389 cDNA clones corresponded to previously submitted M. truncatula EST sequences. For 117 cDNAs, reverse Northern (RNA) hybridization with root and nodule RNA probes revealed enhanced expression in the nodule: 48 clones are likely to code for novel nodulins, 33 cDNAs are clones of already known nodulin genes, and 36 clones exhibit similarity to other characterized genes. Thus, systematic analysis of the EST sequences and their expression patterns is a powerful way to identify nodule-specific and nodulation-related genes.

To isolate useful and interesting plant genes in large quantities, random sequencing of cDNA clones from a library of ethylene-treated potato leaves was performed. Partial sequences of 210 randomly selected clones with inserts longer than 500 base pairs (bp) and a poly(A) tail were compared with sequences in the GenBank, EMBL and DDBJ nucleic acid databases and yielded 193 expressed sequence tags (ESTs). The 210 cDNA clones identified are related to various aspects of metabolic pathways such as glycolysis, amino acid synthesis, translation mechanism, ribosome synthesis, hormone response, stress response, regulation of gene expression, and signal transduction. Among the 193 ESTs, 12 ESTs (29 cDNA clones) appeared more than once, and 181 ESTs appeared only once and were regarded as a solitary group. Out of 210 clones, 29 clones (13.8%) have no similarity to known nucleotide sequences and could serve as a potentially useful resource for studying particular genes in plant molecular biology. Nucleotide sequencing to generate more ESTs from ethylene-induced as well as non-induced potato leaves is in progress.

MicroRNAs (miRNAs) are a new family of small RNA molecules found in plants and animals. We developed a comprehensive strategy for identifying new miRNA homologues by mining the repository of available citrus expressed sequence tags (ESTs). By adopting a range of filtering criteria, we identified a total of 38 potential miRNAs (nine, five, nine and 15 miRNAs in Citrus trifoliata (ctr-miRNAs), C. clementina (ccl-miRNAs), C. reticulata (crt-miRNAs) and C. sinensis (csi-miRNAs), respectively) from more than 430,000 EST sequences in citrus. Using the potential miRNA sequences, we conducted a further BLAST search of the mRNA database and found six potential target genes in these citrus species. Eight miRNAs were selected in order to verify their existence in citrus using Northern blotting, and the functions of several miRNAs in miRNA-mediated gene regulation are suggested experimentally. It appears that all these miRNAs regulate expression of their target genes by cleavage, which is the most common situation in gene regulation mediated by plant miRNAs. Our achievement in identifying new miRNAs in citrus provides a powerful incentive for further studies on the important roles of these miRNAs.

Aims:  To elucidate the molecular mechanisms associated with mycoparasitism in Chaetomium cupreum, an effective biocontrol agent with activity against plant pathogenic fungi.
Methods and Results:  One cDNA library was constructed from conditions predicted to resemble the mycoparasitic process. A total of 1876 ESTs were generated and assembled into 1035 unigenes. A BLASTX search revealed that 585 unigenes had similarities with sequences available from public databases. Based on EST abundance, the MFS monosaccharide transporter was found to be the gene expressed at the highest level. A KEGG analysis allowed mapping of 60 metabolic pathways, well represented by glycolysis/gluconeogenesis, D-arginine and ornithine metabolism, and tryptophan metabolism. Genes related to mycoparasitism were detected.
Conclusions:  The results revealed that the cell walls of the fungal pathogen can simulate some aspects of the mycoparasitic interaction between C. cupreum and its targets.
Significance and Impact of the Study:  This is the first report to study gene expression under conditions associated with the mycoparasitic process. The findings contribute to elucidating the molecular mechanisms involved in mycoparasitism and will help to advance efforts to develop novel strategies for biocontrol of plant fungal diseases.

Simple sequence repeats (SSRs) or microsatellites are an important class of molecular markers for genome analysis and plant breeding applications. In this paper, the SSR distributions within ESTs from the legumes soybean (Glycine max, representing 135.86 Mb), medicago (Medicago truncatula, 121.1 Mb) and lotus (Lotus japonicus, 45.4 Mb) have been studied relative to the distributions in cereals such as sorghum (Sorghum bicolor, 98.9 Mb), rice (Oryza sativa, 143.9 Mb) and maize (Zea mays, 183.7 Mb). The relative abundance, density, composition and putative annotations of di-, tri-, tetra- and penta-nucleotide repeats have been compared, and SSR-containing ESTs (SSR-ESTs) have been clustered to give a non-redundant set of EST-SSRs, available in a database. Further, a subset of such candidate EST-SSRs from sorghum has been tested for its ability to detect polymorphism between the Striga-susceptible, stay-green, drought-tolerant mapping population parent 'E 36-1' and its Striga-resistant, non-stay-green counterpart 'N13'. Primer sets for 64% of the EST-SSRs tested produced a clear and specific PCR product band, and 34% of these detected scorable polymorphism between the N13 and E 36-1 parental lines. Over half of these markers have been genotyped on 94 RILs from the (N13 x E 36-1)-based mapping population, with 42 markers mapping onto the ten sorghum linkage groups. This establishes the value of this database as a resource of molecular markers for practical applications in cereal and legume genetics and breeding. The primer pairs for non-redundant EST-SSRs have been designed and are freely available through the database (http://intranet.icrisat.org/gt1/ssr/ssrdatabase.html).

The analysis of expressed sequences from a diverse set of plant species has fueled the increase in understanding of the complex molecular mechanisms underlying plant growth regulation. While representative data sets can be found for the major branches of plant evolution, fern species data are lacking. To further the availability of genetic information in pteridophytes, a normalized cDNA library of Adiantum capillus-veneris was constructed from prothallia grown under white light. A total of 10,420 expressed sequence tags (ESTs) were obtained and clustering of these sequences resulted in 7,100 nonredundant clusters. Of these, 1,608 EST clusters were found to be similar to sequences of known function and 1,092 EST clusters showed similarity to sequences of unknown function. Given the usefulness of Adiantum for developmental studies, the sequence data represented in this report stand to make a significant contribution to the understanding of plant growth regulation, particularly for pteridophytes.

A total of 880 expressed sequence tags (ESTs) originating from clones randomly selected from a Trypanosoma cruzi amastigote cDNA library have been analyzed. Of these, 40% (355 ESTs) have been identified by similarity to sequences in public databases and classified according to functional categorization of their putative products. About 11% of the mRNAs expressed in amastigotes are related to the translational machinery, and a large number of them (9% of the total number of clones in the library) encode ribosomal proteins. A comparative analysis with a previous study, where clones from the same library were selected using sera from patients with Chagas disease, revealed that ribosomal proteins also represent the largest class of antigen-coding genes expressed in amastigotes (54% of all immunoselected clones). However, although more than thirty classes of ribosomal proteins were identified by EST analysis, the results of the immunoscreening indicated that only a particular subset of them contains major antigenic determinants recognized by antibodies from Chagas disease patients.

To identify new vaccine candidates, Eimeria tenella expressed sequence tags (ESTs) from public databases were analysed for secretory molecules with a specially developed automated in silico strategy termed DNAsignalP. A total of 12,187 ESTs were clustered into 2881 contigs, followed by a blastx search, which resulted in a significant number of E. tenella contigs with homologies to entries in public databases. Amino acid sequences of appropriate homologous proteins were analysed for the occurrence of an N-terminal signal sequence using the algorithm signalP. The resulting list of 84 entries comprised 51 contigs whose deduced proteins showed homologies to proteins of apicomplexan parasites. Based on function or localisation, we selected candidate proteins classified as (i) secreted proteins of Apicomplexa parasites, (ii) secreted enzymes, and (iii) transport and signalling proteins. To verify our strategy experimentally, we used a functional complementation system in yeast. Five selected candidate proteins were found to be indeed secreted. Our approach thus represents an efficient method to identify secretory and surface proteins from EST databases.

To better understand the molecular basis of the defense response against the rice blast fungus (Magnaporthe grisea), a large-scale expressed sequence tag (EST) sequencing approach was used to identify genes involved in the early infection stages in rice (Oryza sativa). Six cDNA libraries were constructed using infected leaf tissues harvested from six conditions: resistant, partially resistant, and susceptible reactions at both 6 and 24 h after inoculation. Two additional libraries were constructed using uninoculated leaves and leaves from the lesion mimic mutant spl11. A total of 68,920 ESTs were generated from the eight libraries. Clustering and assembly analyses resulted in 13,570 unique sequences from 10,934 contigs and 2,636 singletons. Gene function classification showed that 42% of the ESTs were predicted to have putative gene function. Comparison of the pathogen-challenged libraries with the uninoculated control library revealed an increase in the percentage of genes in the functional categories of defense and signal transduction mechanisms and cell cycle control, cell division, and chromosome partitioning. In addition, hierarchical clustering analysis grouped the eight libraries based on their disease reactions. A total of 7,748 new and unique ESTs were identified from our collection compared with the KOME full-length cDNA collection. Interestingly, we found that rice ESTs are more closely related to sorghum (Sorghum bicolor) ESTs than to barley (Hordeum vulgare), wheat (Triticum aestivum), and maize (Zea mays) ESTs. The large cataloged collection of rice ESTs in this study provides a solid foundation for further characterization of the rice defense response and is a useful public genomic resource for rice functional genomics studies.

Plant genomics projects involving model species and many agriculturally important crops are resulting in a rapidly increasing database of genomic and expressed DNA sequences. The publicly available collection of expressed sequence tags (ESTs) from several grass species can be used in the analysis of both structural and functional relationships in these genomes. We analyzed over 260,000 EST sequences from five different cereals for their potential use in developing simple sequence repeat (SSR) markers. The frequency of SSR-containing ESTs (SSR-ESTs) in this collection varied from 1.5% for maize to 4.7% for rice. In addition, we identified several ESTs that are related to the SSR-ESTs by BLAST analysis. The SSR-ESTs and the related sequences were clustered within each species in order to reduce the redundancy and to produce a longer consensus sequence. The consensus and singleton sequences from each species were pooled and clustered to identify cross-species matches. Overall, a reduction in redundancy of 85% was observed when the resulting consensus and singleton sequences (3,569) were compared to the total number of SSR-EST and related sequences analyzed (24,606). This information can be useful for the development of SSR markers that can amplify across the grass genera for comparative mapping and genetics. Functional analysis may reveal their role in plant metabolism and gene evolution.

Red algae are distributed globally, and the group contains several commercially important species. Griffithsia okiensis is one of the most extensively studied red algal species. In this study, we conducted expressed sequence tag (EST) analysis and synonymous codon usage analysis using cultured G. okiensis samples. A total of 1,104 cDNA clones were sequenced using a cDNA library made from samples collected from Dolsan Island, on the southern coast of Korea. The clustering analysis of these sequences allowed for the identification of 1,048 unigene clusters consisting of 36 consensus and 1,012 singleton sequences. BLASTX searches generated 532 significant hits (E-value < 10^-4), and via further Gene Ontology analysis we constructed a functional classification of 434 unigenes. Our codon usage analysis showed that unigene clusters with more than three ESTs had higher GC contents (76.5%) at the third position of the codons than the singletons. Also, the majority of the optimal codons of G. okiensis and Chondrus crispus belonging to Bangiophycidae were C-ending, whereas those of Porphyra yezoensis belonging to Florideophycidae were G-ending. An orthologous gene search against the P. yezoensis EST database resulted in the identification of 39 unigenes commonly expressed in the two rhodophytes, which have putative functions in structural proteins, protein degradation, signal transduction, stress response, and physiological processes. Although the experiments were conducted on a limited scale, this study provides a material basis for the development of microarrays useful for gene expression studies, as well as useful information for the comparative genomic analysis of red algae.
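The third-position GC content mentioned above (often called GC3) can be computed from an in-frame coding sequence as sketched below. This is only an illustration of the statistic, not the authors' analysis; the function name `gc3` and the toy sequence are hypothetical, and the input is assumed to start at the first base of a codon.

```python
def gc3(cds):
    """Fraction of complete codons whose third base is G or C.

    Assumes `cds` is in frame (position 0 is the first base of a codon);
    any trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop a trailing partial codon
    thirds = cds[2:usable:3]           # every third base: codon position 3
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Codons ATG GCC AAG TTA have third bases G, C, G, A.
print(gc3("ATGGCCAAGTTA"))  # 0.75
```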

Teleost fish genome projects involving model species are resulting in a rapid accumulation of genomic and expressed DNA sequences in public databases. The expressed sequence tags (ESTs) collected in the databases can be mined for the analysis of both structural and functional genomics. In this study, we analyzed in silico 49,430 unigenes representing a total of 692,654 ESTs from four model fish for their potential use in developing simple sequence repeats (SSRs), or microsatellites. After bioinformatic mining, a total of 3,018 EST-derived SSRs (EST-SSRs) were identified in 2,335 SSR-containing ESTs (SSR-ESTs). The frequency of identified SSR-ESTs ranged from 1.5% for Xiphophorus to 7.3% for zebrafish. The dinucleotide repeat motif is the most abundant SSR, accounting for 47%, 52%, 64%, and 78% of the SSRs in medaka, Fundulus, zebrafish, and Xiphophorus, respectively. Simulation analysis suggests that a majority of these EST-SSRs have sufficient flanking sequences for polymerase chain reaction (PCR) primer design. Comparative DNA sequence analyses of SSR-ESTs identified several cross-species SSRs and sequences that may be used as cross-reference genes in comparative studies. For example, the flanking sequences of one SSR, (CTG)n, within the pituitary tumor-transforming gene (PTTG) 1 interacting protein (PTTGIP) showed conservation spanning the medaka, Fundulus, human, and mouse genomes. This study provides a large body of information on EST-SSRs that can be useful for the development of polymorphic markers, gene mapping, and comparative genome analysis. Functional analysis of these SSR-ESTs may reveal their role in metabolism and gene evolution of these model species.

Survey of plant short tandem DNA repeats   Cited by: 46 (self-citations: 0; citations by others: 46)
Length variations in simple sequence tandem repeats are being given increased attention in plant genetics. Some short tandem repeats (STRs) from a few plant species, mainly those at the dinucleotide level, have been demonstrated to show polymorphisms and Mendelian inheritance. In the study reported here, a search for all of the possible STRs ranging from mononucleotide up to tetranucleotide repeats was carried out on EMBL and GenBank DNA sequence databases of 3026 kb nuclear DNA and 1268 kb organelle DNA in 54 and 28 plant species (plus algae), respectively. An extreme rareness of STRs (4 STRs in 1268 kb DNA) was detected in organelle compared with nuclear DNA sequences. In nuclear DNA sequences, (AT)n sequences were the most abundant, followed by (A)n · (T)n, (AG)n · (CT)n, (AAT)n · (ATT)n, (AAC)n · (GTT)n, (AGC)n · (GCT)n, (AAG)n · (CTT)n, (AATT)n · (TTAA)n, (AAAT)n · (ATTT)n and (AC)n · (GT)n sequences. A total of 130 STRs were found, including 49 (AT)n sequences in 31 species, giving an average of 1 STR every 23.3 kb and 1 (AT)n STR every 62 kb. An abundance comparable to that for the dinucleotide repeat was observed for the tri- and tetranucleotide repeats together. On average, there was 1 STR every 64.6 kb DNA in monocotyledons versus 1 every 21.2 kb DNA in dicotyledons. The fraction of STRs that contained G-C basepairs increased as the G+C content went up from dicotyledons to monocotyledons to algae. While STRs of mono-, di- and tetranucleotide repeats were all located in noncoding regions, 57% of the trinucleotide STRs containing G-C basepairs resided in coding regions.
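The average spacing figures quoted above follow directly from the totals given in the abstract (130 STRs and 49 (AT)n repeats in 3026 kb of nuclear DNA). A short check, using a hypothetical helper named `kb_per_repeat`, might look like this:

```python
def kb_per_repeat(total_kb, n_repeats):
    """Average length of sequence (in kb) searched per repeat found."""
    if n_repeats <= 0:
        raise ValueError("n_repeats must be positive")
    return total_kb / n_repeats


NUCLEAR_KB = 3026  # nuclear DNA surveyed, from the abstract

print(round(kb_per_repeat(NUCLEAR_KB, 130), 1))  # 23.3 (all STRs)
print(round(kb_per_repeat(NUCLEAR_KB, 49)))      # 62   ((AT)n repeats only)
```

Both values agree with the spacings reported in the abstract.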
