Similar literature
 20 similar articles found (search time: 46 ms)
1.
2.
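Distribution characteristics of EST-SSRs in tea plant and primer development   Total citations: 10, self-citations: 1, citations by others: 10
To develop functional EST-SSR markers in tea plant, bioinformatic methods were used to characterize EST-SSRs in 3,288 tea plant (Camellia sinensis) EST sequences publicly available from NCBI. After redundant sequences were removed, 2,083 non-redundant sequences remained. Among these, 385 EST sequences contained SSRs with various repeat motifs, for a total of 486 EST-SSRs, or one SSR every 2.10 kb on average. Among repeat motifs of 2 to 6 bp, dinucleotide SSRs were the most frequent (51.97%), followed by trinucleotide SSRs (19.55%). Across all motif types, AG/CT accounted for the largest share (47.74%), followed by AT/TA (4.73%) and AAG/CTT (4.73%). Using Primer 5 software, 206 EST-SSR primer pairs were designed; 72 pairs were randomly chosen for SSR amplification, of which 31 yielded bands and 29 were polymorphic, a polymorphism rate of 93.5%. These EST-SSRs will aid genomic research in tea plant.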
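Motif classes such as AG/CT and AAG/CTT pool a motif with its reverse complement. The abstract does not say how the grouping was computed, so the following is only a minimal sketch of one common convention: treat all rotations of a motif and of its reverse complement as the same class, and label the class by its alphabetically smallest member. The function names `reverse_complement` and `canonical_motif` are hypothetical.

```python
# Hypothetical helpers: group SSR motifs into strand-independent classes.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif (A/C/G/T only)."""
    return motif.translate(_COMPLEMENT)[::-1]


def canonical_motif(motif):
    """Return the alphabetically smallest rotation of the motif or of its
    reverse complement, used here as the label of the motif class."""
    motif = motif.upper()
    candidates = []
    for strand in (motif, reverse_complement(motif)):
        # Every cyclic rotation describes the same tandem repeat.
        candidates.extend(strand[i:] + strand[:i] for i in range(len(strand)))
    return min(candidates)
```

Under this convention `canonical_motif("CT")` and `canonical_motif("GA")` both return `"AG"`, so counts for the two strands land in one AG/CT class. Studies differ on whether rotations are merged, so published class counts may not match this scheme exactly.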

3.
A collection of 5,659 expressed sequence tags (ESTs) from pineapple [Ananas comosus (L.) Merr.] was screened for simple sequence repeats (EST-SSRs) with motif lengths between 1 and 6 bp. Lower thresholds of 15, 7 and 5 repeat units were used to define microsatellites of the mono-, di-, and tri- to hexanucleotide repeat type, respectively. Based on these criteria, 696 SSRs were identified among 3,389 EST unigenes, together representing 2,840 kb. This corresponds to an average density of one SSR every 4.1 kb of non-redundant EST sequences. Dinucleotide repeats were most abundant (38.4% of all SSRs) followed by trinucleotide repeats (38.1%). Flanking primer pairs were designed for 537 EST-SSR loci, and 49 of these were screened for their functionality in 12 accessions of A. comosus, 14 accessions of 5 additional Ananas species and 1 species of Pseudananas. Distinct PCR products of the expected size range were obtained with 36 primer pairs. Eighteen loci analyzed in more detail were all polymorphic in pineapple, and primer pairs flanking these loci also generated PCR products from a wide range of genera and species from six subfamilies of the Bromeliaceae. The potential to reveal polymorphism in a heterologous target species was demonstrated in Deuterocohnia brevifolia (subfamily Pitcairnioideae).
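The thresholds and density figure in this abstract can be illustrated with a short sketch. It is not the tool the authors used: it is a regex-based search for perfect repeats only, with the minimum repeat counts (15, 7, 5) taken from the abstract and everything else (the names `find_ssrs`, `is_primitive`, `kb_per_ssr`) assumed for illustration.

```python
import re

# Minimum repeat counts per motif length, as stated in the abstract:
# 15 for mono-, 7 for di-, 5 for tri- to hexanucleotide motifs.
MIN_REPEATS = {1: 15, 2: 7, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Simplification: matches are non-overlapping per motif length, so a
    repeat directly abutting a run of another type may be missed.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. "AA" reported as a dimer
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


def kb_per_ssr(total_kb, n_ssrs):
    """Average spacing: kilobases of sequence per SSR."""
    return total_kb / n_ssrs
```

With the figures reported above, `kb_per_ssr(2840, 696)` gives about 4.08, consistent with the stated one SSR every 4.1 kb.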

4.
To identify EST-SSR molecular markers, 41,986 cattle UniGene sequences from NCBI were mined for SSRs. A total of 1,831 SSRs were identified from 1,666 ESTs, which represented an average density of one SSR per 19.88 kb. The frequency of EST-SSRs was 4.0%. The dinucleotide repeat motif was the most abundant SSR, accounting for 54%, followed by 22%, 13%, 7% and 4%, respectively, for tri-, hexa-, penta- and tetra-nucleotide repeats. Depending upon the length of the repeat unit, the length of microsatellites varied from 14 to 86 bp. Among the di- and tri-nucleotide repeats, AC/TG (57%) and AGC (12%) were the most abundant types. Annotation of EST-SSRs was also carried out. Three hundred primer pairs were randomly designed using the Primer Premier 5.0 program and Oligo 5.0 for further experimental validation.

5.
Exact Tandem Repeats Analyzer 1.0 (E-TRA) combines sequence motif searches with keywords such as ‘organs’, ‘tissues’, ‘cell lines’ and ‘development stages’ for finding simple exact tandem repeats as well as non-simple repeats. E-TRA has several advanced repeat search parameters/options compared to other repeat finder programs, as it not only accepts GenBank, FASTA and expressed sequence tag (EST) sequence files, but also analyzes multiple files with multiple sequences. The minimum and maximum tandem repeat motif lengths that E-TRA finds vary from one to one thousand. Advanced user-defined parameters/options let researchers use different minimum motif repeat search criteria for varying motif lengths simultaneously. One of the most interesting features of genomes is the presence of relatively short tandem repeats (TRs). These repeated DNA sequences are found in both prokaryotes and eukaryotes, distributed almost at random throughout the genome. Some of the tandem repeats play important roles in the regulation of gene expression whereas others do not have any known biological function as yet. Nevertheless, they have proven to be very beneficial in DNA profiling and genetic linkage analysis studies. To demonstrate the use of E-TRA, we used 5,465,605 human EST sequences derived from 18,814,550 GenBank EST sequences. Our results indicated that 12.44% (679,800) of the human EST sequences contained simple and non-simple repeat string patterns varying from one to 126 nucleotides in length. The results also revealed that human organs, tissues, cell lines and different developmental stages differed in number of repeats as well as repeat composition, indicating that the distribution of expressed tandem repeats among tissues or organs is not random, thus differing from the un-transcribed repeats found in genomes.

6.
7.
8.
The cellular and molecular biology of conifer embryogenesis   Total citations: 4, self-citations: 0, citations by others: 4
Gymnosperms and angiosperms are thought to have evolved from a common ancestor c. 300 million yr ago. The manner in which gymnosperms and angiosperms form seeds has diverged and, although broad similarities are evident, the anatomy and cell and molecular biology of embryogenesis in gymnosperms, such as the coniferous trees pine, spruce and fir, differ significantly from those in the most widely studied model angiosperm Arabidopsis thaliana. Molecular analysis of signaling pathways and processes such as programmed cell death and embryo maturation indicates that many developmental pathways are conserved between angiosperms and gymnosperms. Recent genomics research reveals that almost 30% of mRNAs found in developing pine embryos are absent from other conifer expressed sequence tag (EST) collections. These data show that the conifer embryo differs markedly from other gymnosperm tissues studied to date in terms of the range of genes transcribed. Approximately 72% of conifer embryo-expressed genes are found in the Arabidopsis proteome and conifer embryos contain mRNAs of very similar sequence to key genes that regulate seed development in Arabidopsis. However, 1388 loblolly pine (Pinus taeda) embryo ESTs (11.4% of the collection) are novel and, to date, have been found in no other plant. The data imply that, in gymnosperm embryogenesis, differences in structure and development are achieved by subtle molecular interactions, control of spatial and temporal gene expression and the regulating agency of a few unique proteins.

9.
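To investigate the polymorphism of EST-SSR markers in the silkworm Bombyx mori, 4,465 EST sequences retrieved for silkworm linkage group 12 were analyzed; after cleaning and assembly, 581 non-redundant EST sequences with a total length of about 480 kb were obtained. In 122 of these sequences, a total of 154 EST-SSRs were detected, accounting for 2.73% of the EST sequences studied, with one EST-SSR every 3.12 kb on average. Among the EST-SSRs detected, tri- and tetranucleotide repeats were the dominant types, accounting for 36.36% and 28.57% of the total, respectively, and most were perfect repeats; the mean repeat length was about 16.2 bp and the maximum was 30 bp. Homology analysis showed that 26 sequences had homologous sequences retrievable from NCBI; these contained 40 SSRs in total, of which 14 (35.0%) were located in the 5′-UTR, 11 (27.5%) in the 3′-UTR and 15 (37.5%) in the CDS. Eleven primer pairs were designed from the selected microsatellite sequences, of which 8 yielded clear amplification products; PCR amplification of 8 silkworm varieties with primer ES1204 showed polymorphism in all cases. The results indicate that mining SSR markers from the silkworm EST database is a feasible approach.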

10.
Conserved ortholog set (COS) markers are evolutionarily conserved, single-copy genes, identified from large databases of expressed sequence tags (ESTs). They are of particular use for constructing syntenic genetic maps among species. In this study, we identified a set of 1,813 putative single-copy COS markers between spruce and loblolly pine, then designed primers for 931 of these markers and tested these primers with DNA from spruce, pine, and Douglas fir. Of these 931 primers, 56% (524) amplified a product in both spruce and pine, and 71% (373) of these were single-banded; 224 amplicons were single-banded in all three species. Even though these COS markers were selected from large EST databases, a substantial proportion (20–30%) of amplicons displayed multiple bands or smears, suggesting significant paralogy. Sequencing of three single-banded amplicons showed high nucleotide similarities among 29 conifer species, suggesting orthology of single-banded amplicons. Screening for COS marker polymorphism in two pedigrees of white spruce and two pedigrees of loblolly pine revealed an average informativeness of 36% for spruce and 24% for pine (i.e., at least one parent was heterozygous for a single-nucleotide polymorphism within the entire amplified product). This corresponds to an average nucleotide heterozygosity of 0.05% and 0.03%, respectively, which is considerably lower than that found in other studies of spruce and pine. Thus, the advantages of COS markers for constructing syntenic maps are offset by their lower polymorphism.

11.
Traditionally, simple sequence repeat (SSR) markers have been developed from libraries of genomic DNA. However, the large, repetitive nature of conifer genomes makes development of robust, single-copy SSR markers from genomic DNA difficult. Expressed sequence tags (ESTs), or sequences of messenger RNA, offer the opportunity to exploit single, low-copy, conserved sequence motifs for SSR development. From a 20,275-unigene spruce EST set, we identified 44 candidate EST-SSR markers. Of these, 25 amplified and were polymorphic in white, Sitka, and black spruce; 20 amplified in all 23 spruce species tested; the remaining five amplified in all except one species. In addition, 101 previously described spruce SSRs (mostly developed from genomic DNA) were tested. Of these, 17 amplified across white, Sitka, and black spruce. The 25 EST-SSRs had approximately 9% less heterozygosity than the 17 genomic-derived SSRs (mean H=0.65 vs 0.72), but appeared to have fewer null alleles, as evidenced by much lower apparent inbreeding (mean F=0.046 vs 0.126). These robust SSRs are of particular use in comparative studies, and as the EST-SSRs are within the expressed portion of the genome, they are more likely to be associated with a particular gene of interest, improving their utility for quantitative trait loci mapping and allowing detection of selective sweeps at specific genes.

12.
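The western flower thrips, Frankliniella occidentalis, is a globally invasive insect that has been introduced into China in recent years and continues to spread. Studies of its population genetic structure based on simple sequence repeats (SSRs) are valuable for revealing its routes of dispersal. In this study, 13,839 EST sequences from western flower thrips were subjected to uni-EST assembly, EST-SSR analysis and marker screening, and EST-SSRs were compared with genomic SSRs for analyzing genetic diversity. The results showed that 2,623 SSR loci were found among 7,707 singlets, distributed across 1,930 uni-ESTs, with one SSR locus every 2.21 kb on average. Mononucleotide repeat units predominated (83.00%), followed by tetranucleotide units (11.17%), whereas di-, tri-, penta- and hexanucleotide units were less common (1.41%, 0.80%, 2.02% and 0.91%, respectively). Of the 22 EST-SSR primer pairs designed, 4 stably amplified clear target bands; fluorescence-labelled capillary electrophoresis showed that 3 of them were polymorphic. Polymorphism analysis of EST-SSRs and genomic SSRs showed that the polymorphism information content (PIC) revealed by these 3 polymorphic EST-SSR primer pairs was 0.48 to 0.69, slightly lower than the PIC revealed by 5 polymorphic genomic SSR primer pairs (0.88 to 0.92). These results should help future, more in-depth analyses of the population genetic structure of western flower thrips.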

13.
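Distribution characteristics of microsatellites in ginkgo EST sequences   Total citations: 5, self-citations: 0, citations by others: 5
Using 21,590 ginkgo EST sequences downloaded from NCBI, this paper analyzes the distribution of EST-SSRs (expressed sequence tag microsatellites) in ginkgo EST sequences and compares SSR characteristics among EST sequences of different lengths. After redundant and low-quality sequences were removed, 7,961 non-redundant EST sequences with a total length of 5,708.385 kb were obtained; 405 of these EST sequences (5.09%) contained 475 SSRs. EST sequences 400 to 1,000 bp long contained 445 SSR loci, 93.68% of all SSRs. Di- and trinucleotide motifs were the main EST-SSR types in ginkgo, accounting for 73.89% and 24.00% of all SSRs, respectively; the most common SSR motifs were (AT)_n, (AG)_n, (AC)_n, (AAG)_n and (AAT)_n. Mining the SSR locus information in ginkgo EST sequences lays a foundation for targeted EST-SSR primer design and the development of EST-SSR molecular markers in ginkgo.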

14.
15.
In order to construct a saturated genetic map and facilitate marker-assisted selection (MAS) breeding, it is necessary to enhance the current reservoir of known molecular markers in Gossypium. Microsatellites or simple sequence repeats (SSRs) occur in expressed sequence tags (ESTs) in plants (Kantety et al., Plant Mol Biol 48:501–510, 2002). Many ESTs are publicly available now and represent a good tool in developing EST-SSRs. From 13,505 ESTs developed from our two cotton fiber/ovule cDNA libraries constructed for Upland cotton, 966 (7.15%) contained one or more SSRs and from them, 489 EST-SSR primer pairs were developed. Among the EST-SSRs, 59.1% are trinucleotides, followed by dinucleotides (30%), tetranucleotides (6.4%), pentanucleotides (1.8%), and hexanucleotides (2.7%). AT/TA (18.4%) is the most frequent repeat, followed by CTT/GAA (5.3%), AG/TC (5.1%), AGA/TCT (4.9%), AGT/TCA (4.5%), and AAG/TTC (4.5%). One hundred and thirty EST-SSR loci were produced from 114 informative EST-SSR primer pairs, which generated polymorphism between our two mapping parents. Of these, 123 were integrated on our allotetraploid cotton genetic map, based on the cross [(TM-1×Hai7124)TM-1]. EST-SSR markers were distributed over 20 chromosomes and 6 linkage groups in the map. These EST-SSR markers can be used in genetic mapping, identification of quantitative trait loci (QTLs), and comparative genomics studies of cotton. Zhiguo Han and Changbiao Wang contributed equally to this work.

16.
17.
The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20–40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In-depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%.

18.
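To mine EST-SSR resources in the genus Ipomoea, 23,406 sweet potato (Ipomoea batatas (L.) Lam.) ESTs and 62,282 morning glory (Ipomoea nil (L.) Roth) ESTs were downloaded from the NCBI database. After preprocessing, redundancy removal and assembly with bioinformatics software, 12,812 non-redundant sweet potato ESTs (6.70 Mb) and 28,422 morning glory unique sequences (17.19 Mb) were obtained. An SSR search of these sequences yielded 328 SSR loci in sweet potato, a frequency of 2.56%, and 962 SSR loci in morning glory, a frequency of 3.38%. Sweet potato and morning glory EST-SSRs shared several features: dinucleotide repeats were the main type of SSR locus, followed by trinucleotide repeats; among dinucleotide repeats, the most common motif was AG/CT, followed by AT/AT; among trinucleotide repeats, the main motif was AAG/CCT; and SSR lengths were concentrated at 20 to 22 bp. The results indicate that the EST-SSRs identified are rich in motif types, have high potential for polymorphism, and are of considerable value for development and use.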

19.
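Analysis of 10,000 EST sequences from the genus Eucalyptus identified 1,775 microsatellite repeats in 1,499 of the sequences. ESTs containing microsatellites made up about 15% of all sequences. In addition, the rate of length variation of microsatellites in Eucalyptus ESTs was negatively correlated with repeat unit length, and microsatellite abundance was also negatively correlated with repeat unit length (except for trinucleotide repeats). Microsatellites with trinucleotide repeat units were the most abundant in Eucalyptus ESTs. Their overrepresentation may be due to selection imposed by the genetic code. With respect to microsatellite abundance and length variation, Eucalyptus ESTs and the annotated transcript sequences of the poplar (Populus trichocarpa) genome showed the same trends with repeat unit length, but the microsatellite frequency and the proportion of trinucleotide microsatellites were significantly lower in Eucalyptus ESTs, suggesting that genes containing microsatellites are very likely expressed at lower abundance than genes without them. Primers were designed for all microsatellite loci found and tested by PCR; the results showed that the designed primers had a very high amplification success rate.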

20.
Genome Biology, 2014, 15(3): R59

Background

The size and complexity of conifer genomes has, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination.

Results

We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly was used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome.

Conclusions

In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied.
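The Results above report an N50 scaffold size of 66.9 kbp. As a reminder of what that statistic measures, here is a minimal sketch of the usual definition: the length L such that scaffolds of length at least L together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative only and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example: total 290, so the halfway point is 145; 80 + 70 = 150.
toy_n50 = n50([80, 70, 50, 40, 30, 20])
```

Because N50 depends on the total used in the denominator, values are only comparable between assemblies when that total (for example, whether gaps are included) is defined the same way.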
