Similar literature
 Found 20 similar articles; search took 24 ms
1.
The public EST (expressed sequence tag) databases represent an enormous but heterogeneous repository of sequences, including many from a broad selection of plant species and a wide range of distinct varieties. The significant redundancy within large EST collections makes them an attractive resource for rapid pre-selection of candidate sequence polymorphisms. Here we present a strategy that allows rapid identification of candidate SNPs in barley (Hordeum vulgare L.) using publicly available EST databases. Analysis of 271,630 EST sequences from different cDNA libraries, representing 23 different barley varieties, resulted in the generation of 56,302 tentative consensus sequences. In all, 8171 of these unigene sequences are members of clusters with six or more ESTs. By applying a novel SNP detection algorithm (SNiPpER) to these sequences, we identified 3069 candidate inter-varietal SNPs. In order to verify these candidate SNPs, we selected a small subset of 63 present in 36 ESTs. Of the 63 SNPs selected, we were able to validate 54 (86%) using a direct sequencing approach. For further verification, 28 ESTs were mapped to distinct loci within the barley genome. The polymorphism information content (PIC) and nucleotide diversity (π) values of the SNPs identified by the SNiPpER algorithm are significantly higher than those that were obtained by random sequencing. This demonstrates the efficiency of our strategy for SNP identification and the cost-efficient development of EST-based SNP markers. The first two authors contributed equally to this work.
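
As a rough illustration of the PIC values mentioned in this abstract, the sketch below computes two commonly used per-locus measures from allele frequencies. The abstract does not say which formula the authors used, so both the function names and the choice of formulas (the Botstein et al. 1980 PIC and the simpler expected heterozygosity, 1 − Σp²) are assumptions made for illustration only.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic_botstein(freqs):
    """PIC as defined by Botstein et al. (1980):
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical biallelic SNP with equal allele frequencies.
    snp = [0.5, 0.5]
    print(gene_diversity(snp), pic_botstein(snp))
```

For a biallelic SNP the first measure cannot exceed 0.5 and the second cannot exceed 0.375, so reported PIC values are only comparable when the same definition is used.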

2.
Pacific white shrimp (Litopenaeus vannamei) are of particular economic importance to the global shrimp aquaculture industry. However, limited genomics information is available for the penaeid species. We utilized the limited public information available, mainly single nucleotide polymorphisms (SNPs) and expressed sequence tags, to discover markers for the construction of the first SNP genetic map for Pacific white shrimp. In total, 1344 putative SNPs were discovered, and out of 825 SNPs genotyped, 418 SNP markers from 347 contigs were mapped onto 45 sex-averaged linkage groups, with approximate coverage of 2071 and 2130 cM for the female and male maps, respectively. The average squared correlation coefficient (r²), a measure of linkage disequilibrium, for markers located more than 50 cM apart on the same linkage group, was 0.15. Levels of r² increased with decreasing inter-marker distance from ~80 cM, and increased more rapidly from ~30 cM. A QTL for shrimp gender was mapped on linkage group 13. Comparative mapping to model organisms, Daphnia pulex and Drosophila melanogaster, revealed extensive rearrangement of genome architecture for L. vannamei, and that L. vannamei was more closely related to Daphnia pulex. This SNP genetic map lays the foundation for future shrimp genomics studies, especially the identification of genetic markers or regions for economically important traits.
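
The squared correlation coefficient used in this abstract as a measure of linkage disequilibrium can be sketched for a single pair of biallelic markers as follows. This is a minimal sketch that assumes phased haplotypes with alleles coded 0/1; the abstract does not describe how the authors estimated r², and the function name and input layout are hypothetical.

```python
def r_squared(haplotypes):
    """Squared allelic correlation between two biallelic markers.

    haplotypes: list of (a, b) tuples, alleles coded 0 or 1 (assumed phased).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at marker A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("a marker is monomorphic; r^2 is undefined")
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical haplotypes: complete association, then no association.
    print(r_squared([(0, 0), (1, 1), (0, 0), (1, 1)]))
    print(r_squared([(0, 0), (0, 1), (1, 0), (1, 1)]))
```

Averaging such pairwise values within bins of inter-marker distance would give a decay curve of the kind the abstract describes.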

3.
Grass carp, Ctenopharyngodon idellus (Valenciennes, 1844), is an economically important species widely cultured in the world, but its genome research resources are largely lacking. The objectives of this study were to construct normalized cDNA libraries for efficient EST analysis, to generate ESTs from these libraries, and to identify EST-related molecular markers such as microsatellites and single nucleotide polymorphisms (SNPs) for genetic analysis of this species. A total of 6,269 ESTs were generated representing 4,815 unique sequences, from which 105 putative microsatellites and 5,228 SNPs were identified. These genome resources provide the material basis for future genetic and functional analyses in this species.

4.
L D Chaves  J A Rowe  K M Reed 《Génome》2005,48(1):12-17
Genome characterization and analysis is an imperative step in identifying and selectively breeding for improved traits of agriculturally important species. Expressed sequence tags (ESTs) represent a transcribed portion of the genome and are an effective way to identify genes within a species. Downstream applications of EST projects include DNA microarray construction and interspecies comparisons. In this study, 694 ESTs were sequenced and analyzed from a library derived from a 24-day-old turkey embryo. The 437 unique sequences identified were divided into 76 assembled contigs and 361 singletons. The majority of significant comparative matches occurred between the turkey sequences and sequences reported from the chicken. Whole genome sequence from the chicken was used to identify potential exon-intron boundaries for selected turkey clones and intron-amplifying primers were developed for sequence analysis and single nucleotide polymorphism (SNP) discovery. Identified SNPs were genotyped for linkage analysis on two turkey reference populations. This study significantly increases the number of EST sequences available for the turkey.

5.
6.
Expressed sequence tags (ESTs) from Coffea canephora leaves and fruits were used to search for types and frequencies of simple sequence repeats (EST–SSRs) with a motif length of 1–6 bp. From a non-redundant (NR) EST set of 5,534 potential unigenes, 6.8% SSR-containing sequences were identified, with an average density of one SSR every 7.73 kb of EST sequences. Trinucleotide repeats were found to be the most abundant (34.34%), followed by di- (25.75%) and hexa-nucleotide (22.04%) motifs. The development of unique genic SSR markers was optimized by a computational approach which allowed us to eliminate redundancy in the original EST set and also to test the specificity of each pair of designed primers. Twenty-five EST–SSRs were developed and used to evaluate cross-species transferability in the Coffea genus. The orthology was supported by the amplicon sequence similarity and the amplification patterns. The >94% identity of flanking sequences revealed high sequence conservation across the Coffea genus. A high level of polymorphic loci was obtained regardless of the species considered (from 75% for C. liberica to 86% for C. canephora). Moreover, the polymorphism revealed by EST–SSR was similar to that exposed by genomic SSR. It is concluded that Coffea ESTs are a valuable resource for microsatellite mining. EST-SSR markers developed from C. canephora sequences can be easily transferred to other Coffea species for which very little molecular information is available. They constitute a set of conserved orthologous markers, which would be ideal for assessing genetic diversity in coffee trees as well as for cross-referencing transcribed sequences in comparative genomics studies.

7.
We report on the data mining of publicly available Litopenaeus vannamei expressed sequence tags (ESTs) to generate simple sequence repeat (SSR) markers and on their transferability between related Penaeid shrimp species. Repeat motifs were found in 3.8% of the evaluated ESTs at a frequency of one repeat every 7.8 kb of sequence data. A total of 206 primer pairs were designed, and 112 loci were amplified with the highest success in L. vannamei. A high percentage (69%) of EST-SSRs were transferable within the genus Litopenaeus. More than half of the amplified products were polymorphic in a small testing panel of L. vannamei. Evaluation of those primers in a larger testing panel showed that 72% of the markers fit Hardy-Weinberg equilibrium, which shows their utility for population genetic analysis. Additionally, a set of 26 of the EST-SSRs were evaluated for Mendelian segregation. A high percentage of monomorphic markers (46%) proved to be polymorphic by single-stranded conformational polymorphism analysis. Because of the high number of ESTs available in public databases, a data mining approach similar to the one outlined here might yield high numbers of SSR markers in many animal taxa.

8.
Mining single-nucleotide polymorphisms from hexaploid wheat ESTs.   Total citations: 20, self-citations: 0, citations by others: 20
Single-nucleotide polymorphisms (SNPs) represent a new form of functional marker, particularly when they are derived from expressed sequence tags (ESTs). A bioinformatics strategy was developed to discover SNPs within a large wheat EST database and to demonstrate the utility of SNPs in genetic mapping and genetic diversity applications. A collection of > 90000 wheat ESTs was assembled into contiguous sequences (contigs), and 45 random contigs were then visually inspected to identify primer pairs capable of amplifying specific alleles. We estimate that homoeologue sequence variants occurred 1 in 24 bp and the frequency of SNPs between wheat genotypes was 1 SNP/540 bp (theta = 0.0069). Furthermore, we estimate that one diagnostic SNP test can be developed from every contig with 10-60 EST members. Thus, EST databases are an abundant source of SNP markers. Polymorphism information content for SNPs ranged from 0.04 to 0.50 and ESTs could be mapped into a framework of microsatellite markers using segregating populations. The results showed that SNPs in wheat can be discovered in ESTs, validated, and applied to conventional genetic studies.

9.
10.
The analysis of expressed sequences from a diverse set of plant species has fueled the increase in understanding of the complex molecular mechanisms underlying plant growth regulation. While representative data sets can be found for the major branches of plant evolution, fern species data are lacking. To further the availability of genetic information in pteridophytes, a normalized cDNA library of Adiantum capillus-veneris was constructed from prothallia grown under white light. A total of 10,420 expressed sequence tags (ESTs) were obtained and clustering of these sequences resulted in 7,100 nonredundant clusters. Of these, 1,608 EST clusters were found to be similar to sequences of known function and 1,092 EST clusters showed similarity to sequences of unknown function. Given the usefulness of Adiantum for developmental studies, the sequence data represented in this report stand to make a significant contribution to the understanding of plant growth regulation, particularly for pteridophytes.

11.
Linkage mapping of gene-associated SNPs to pig chromosome 11   Total citations: 3, self-citations: 0, citations by others: 3
Single nucleotide polymorphisms (SNPs) were discovered in porcine expressed sequence tags (ESTs) orthologous to genes from human chromosome 13 (HSA13) and predicted to be located on pig chromosome 11 (SSC11). The SNPs were identified as sequence variants in clusters of EST sequences from pig cDNA libraries constructed in the Sino-Danish pig genome project. In total, 312 human gene sequences from HSA13 were used for similarity searches in our pig EST database. Pig ESTs showing significant similarity with HSA13 genes were clustered and candidate SNPs were identified. Allele frequencies for 26 SNPs were estimated in a group of 80 unrelated pigs from Danish commercial pig breeds: Duroc, Hampshire, Landrace and Large White. Eighteen of the 26 SNPs genotyped in the PiGMaP Reference Families were mapped by linkage analysis to SSC11. The EST-based SNPs published here are new genetic markers useful for linkage and association studies in commercial and experimental pig populations. This study represents the first gene-associated SNP linkage map of pig chromosome 11 and adds new comparative mapping information between SSC11 and HSA13. Furthermore, our data facilitate future studies aimed at the identification of interesting regions on pig chromosome 11, positional cloning and fine mapping of quantitative trait loci in pig.

12.
13.

Background  

Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only.

14.
The ridgetail white prawn Exopalaemon carinicauda is one of the most important commercial species in eastern China. However, little information on immune genes in E. carinicauda has been reported. To identify distinctive genes associated with immunity, an expressed sequence tag (EST) library was constructed from hemocytes of E. carinicauda. A total of 3411 clones were sequenced, yielding 2853 ESTs with an average sequence length of 436 bp. The cluster and assembly analysis yielded 1053 unique sequences including 329 contigs and 724 singletons. Blast analysis identified 593 (56.3%) of the unique sequences as orthologs of genes from other organisms (E-value < 1e-5). Based on the COG and Gene Ontology (GO), 593 unique sequences were classified. Through comparison with previous studies, 153 genes assembled from 367 ESTs have been identified as possibly involved in defense or immune functions. These genes are grouped into seven categories according to their putative functions in the shrimp immune system: antimicrobial peptides, prophenoloxidase activating system, antioxidant defense systems, chaperone proteins, clottable proteins, pattern recognition receptors and other immune-related genes. According to EST abundance, the major immune-related genes were thioredoxin (141, 4.94% of all ESTs) and calmodulin (14, 0.49% of all ESTs). The EST sequences of E. carinicauda hemocytes provide important information on the immune system and lay the groundwork for development of molecular markers related to disease resistance in prawn species.

15.
Expressed sequence tags (ESTs) from turmeric (Curcuma longa L.) were used for the screening of type and frequency of Class I (hypervariable) simple sequence repeats (SSRs). A total of 231 microsatellite repeats were detected from 12,593 EST sequences of turmeric after redundancy elimination. The average density of Class I SSRs amounts to one SSR per 17.96 kb of EST. Mononucleotides were the most abundant class of microsatellite repeat in turmeric ESTs followed by trinucleotides. A robust set of 17 polymorphic EST–SSRs were developed and used for evaluating 20 turmeric accessions. The number of alleles detected ranged from 3 to 8 per locus. The developed markers were also evaluated in 13 related species of C. longa confirming high rate (100%) of cross species transferability. The polymorphic microsatellite markers generated from this study could be used for genetic diversity analysis and resolving the taxonomic confusion prevailing in the genus.

16.
A system to use bovine EST data in conjunction with human genomic sequence to improve the bovine linkage map over the entire genome or on specific chromosomes was evaluated. Bovine EST sequence was used to provide primer sequences corresponding to bovine genes, while human genomic sequence directed primer design to flank introns and produce amplicons of appropriate size for efficient direct sequencing. The sequence tagged sites (STS) produced in this way from the four sires of the MARC reference families were examined for single nucleotide polymorphisms (SNPs) that could be used to map the corresponding genes. With this approach, along with a primer/extension mass spectrometry SNP genotyping assay, 100 ESTs were placed on the bovine genetic linkage map. The first 70 were chosen at random from bovine EST–human genomic comparisons. An additional 30 ESTs were successfully mapped to bovine Chromosome 19 (BTA19), and comparison of the resulting BTA19 map to the position of the corresponding human orthologs on the HSA17 draft sequences revealed differences in the spacing and order of genes. Over 80% of successful amplicons contained SNPs, indicating that this is an efficient approach to generating EST-associated genetic markers. We have demonstrated the feasibility of constructing a linkage map based on SNPs associated with ESTs and the plausibility of utilizing EST, comparative mapping information, and human sequence data to target regions of the bovine genome for SNP marker development.

17.
Mining and characterizing microsatellites from citrus ESTs   Total citations: 17, self-citations: 0, citations by others: 17
Freely available computer programs were arranged in a pipeline to extract microsatellites from public citrus EST sequences, retrieved from the NCBI. In total, 3,278 bi- to hexa-type SSR-containing sequences were identified from 56,199 citrus ESTs. On an average, one SSR was found per 5.2 kb of EST sequence, with the tri-nucleotide motifs as the most abundant. Primer sequences flanking SSR motifs were successfully identified from 2,295 citrus ESTs. Among those, a subset (100 pairs) were synthesized and tested to determine polymorphism and heterozygosity between/within two genera, sweet orange (C. sinensis) and Poncirus (P. trifoliata), which are the parents of the citrus core mapping population selected for an international citrus genomics effort. Eighty-seven pairs of primers gave PCR amplification to the anticipated SSRs, of which 52 and 35 appear to be homozygous and heterozygous, respectively, in sweet orange, and 67 and 20, respectively, in Poncirus. By pairing the loci between the two intergeneric species, it was found that 40 are heterozygous in at least one species with two alleles (9), three alleles (28), or four alleles (3), and the remaining 47 are homozygous in both species with either one allele (31) or two alleles (16). These EST-derived SSRs can be a resource used for understanding of the citrus SSR distribution and frequency, and development of citrus EST-SSR genetic and physical maps. These SSR primer sequences are available upon request. Electronic Supplementary Material Supplementary material is available for this article at and is accessible for authorized users.
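
The kind of motif search such a microsatellite-mining pipeline performs can be sketched with a regular expression. This is a minimal sketch and not the authors' pipeline: the abstract does not name the programs or thresholds used, so the function names and the minimum of five repeat units are assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for di- to hexanucleotide repeats.

    min_repeats is an assumed threshold, not a value taken from the abstract.
    """
    seq = seq.upper()
    # Lazy {2,6}? tries the shortest motif first, so ATAT... is reported as AT.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


def kb_per_ssr(total_bp, ssr_count):
    """Average spacing of SSRs, in kb of sequence per SSR."""
    return total_bp / 1000.0 / ssr_count


if __name__ == "__main__":
    # Hypothetical EST fragment containing five ATG repeat units.
    print(find_ssrs("GGCATGATGATGATGATGCC"))
```

Running the search over every EST and dividing the total sequence length by the number of hits gives a density figure of the "one SSR per N kb" form quoted in the abstract.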

18.
19.
20.
Eucalyptus globulus is the most commonly planted hardwood species for pulpwood in temperate regions. We aimed to develop and characterize functional molecular markers for population genetic analyses and molecular breeding in this model tree species. Public expressed sequence tag (EST) databases were screened for nonredundant sequences to predict putative gene functions and to discover simple sequence repeats (EST-SSRs), which were then validated in E. globulus and six other Eucalyptus species. A total of 4,924 nonredundant sequences were identified from 12,690 updated E. globulus ESTs. Approximately 19.3% (952) were unigenes and contained 1,140 EST-SSR markers, which were mainly trimeric (58.6%). A set of 979 primers for putative SSR markers was designed after bioinformatic analysis. The predicted functions of these ESTs containing SSR were classified according to their gene ontology (GO) categories (biological process, molecular function, and cellular component). GO categories were assigned to 226 ESTs (30.2%). Most ESTs containing SSR (78.7%) had significant matches (E ≤ 10⁻⁵) with the nonredundant protein database using BLASTX. From a set of 56 random primer pairs, 37 could be validated in eight E. globulus genotypes and were also tested for cross-transferability to six other Eucalyptus species (Eucalyptus grandis, Eucalyptus saligna, Eucalyptus dunnii, Eucalyptus viminalis, Eucalyptus camaldulensis, Eucalyptus tereticornis). Seventeen polymorphic EST-SSR markers for E. globulus were evaluated in 60 unrelated trees, being representative of the species' natural distribution. As a result, six highly informative markers were proposed for genetic diversity analyses, fingerprinting, and comparative population studies, between different species of E. globulus.
