Similar literature
 19 similar articles found (search time: 203 ms)
1.
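Microsatellites, or simple sequence repeats (SSRs), are ubiquitous in eukaryotic, prokaryotic, and viral genomes and are widely used in genetic and evolutionary studies. In this study, complete genome sequences from four species of the genus Ebolavirus were downloaded from NCBI, and 36 of them were selected as study material. SSRs were extracted with the online tool IMEx, and the results were tallied with Python scripts to analyze the distribution of SSRs across the complete Ebola virus genomes. Dinucleotide SSRs were the most abundant, followed by mononucleotide SSRs; trinucleotide SSRs were scarce, tetranucleotide SSRs were rarer still, and no penta- or hexanucleotide SSRs were found. In all four species, SSRs containing A/T bases far outnumbered those containing C/G bases. Among mononucleotide SSRs, (A)n/(T)n far outnumbered (G)n/(C)n; no (GC/CG)n repeats occurred among the dinucleotide SSRs, and no (GGC/CGG/GCG/CCG/CGC/GCC)n repeats among the trinucleotide SSRs. These findings may be closely related to the pathogenic mechanism of Ebola virus. The analysis of SSRs in Ebola virus genomes provides additional reference material for studying the variation and pathogenesis of the virus.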
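The workflow in this abstract (extract SSRs, then tally them by repeat type with a script) can be illustrated with a minimal sketch. This is not IMEx and not the authors' script: it finds only perfect repeats with a regular expression, and the minimum copy numbers in `MIN_REPEATS` as well as the function names are illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical minimum copy numbers per motif length (1 to 6 nt).
# These thresholds are assumptions for illustration, not the study's settings.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k in sorted(min_repeats):
        # A k-nt motif followed by at least (min copies - 1) exact repeats.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats[k] - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so that a homopolymer run is not also reported as a dinucleotide SSR.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def count_by_type(hits):
    """Tally SSRs by motif length (1 = mononucleotide, 2 = dinucleotide, ...)."""
    return Counter(len(motif) for _, motif, _ in hits)


if __name__ == "__main__":
    demo = "GGC" + "A" * 7 + "CGC" + "AT" * 4 + "GCC" + "AAG" * 3 + "T"
    ssrs = find_perfect_ssrs(demo)
    print(ssrs)
    print(count_by_type(ssrs))
```

Because `finditer` scans without overlap, a degenerate match that is skipped can shift the reported phase of an adjacent repeat; a dedicated tool that also handles imperfect repeats is the better choice for real analyses.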

2.
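Citrus tristeza virus (CTV), a member of the family Closteroviridae, has the largest genome of any known plant virus, and the citrus tristeza disease it causes seriously affects the citrus industry worldwide. Using 32 full-length CTV genome sequences deposited in GenBank, this paper analyzes the distribution of simple sequence repeats (SSRs) in these genomes. SSRs were present in every CTV genome, with low repeat counts; dinucleotide SSRs predominated, and no penta- or hexanucleotide SSRs were found. Tetranucleotide SSRs were found in only 5 of the 32 full-length genomes. This is the first SSR analysis carried out on a citrus virus.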

3.
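Microsatellites (simple sequence repeats, SSRs) are widely distributed in prokaryotic and eukaryotic genomes, in both coding and non-coding regions, and are among the most commonly used molecular markers. Using bioinformatic methods, this paper searched for and counted perfect SSRs in the whole genomes of yak and water buffalo and compared their characteristics. The yak and buffalo genomes contained 968,134 and 1,052,443 SSRs, accounting for 5.80‰ and 5.69‰ of total genome length, respectively. Total SSR abundance (366.01 vs 371.07 per Mb) and total SSR density (5,686.00 vs 5,799.34 bp/Mb) were similar in the two genomes. In both genomes, SSR abundance followed the pattern mononucleotide > dinucleotide > trinucleotide > pentanucleotide > tetranucleotide > hexanucleotide; the characteristics of these six repeat types differed significantly from one another, whereas those of the same repeat type were largely consistent between the two species. In buffalo, chromosome 1 carried the most SSRs (72,934), followed by chromosomes 2, 3, and 4, while chromosomes 23 and 24 carried relatively few; SSR abundance did not differ significantly among chromosomes (p > 0.05). In both species, repeat copy number decreased as the number of nucleotides in the repeat unit increased. The predominant SSR motifs of each repeat type were largely the same across the yak genome and the individual buffalo chromosomes, and matched the predominant motifs in the cattle and sheep genomes.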
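The two summary statistics quoted here, abundance (SSRs per Mb) and density (bp of SSR per Mb), are simple ratios. The sketch below assumes the usual definitions implied by those units; the function names and the toy numbers in the usage lines are hypothetical, not values from the study.

```python
def relative_abundance(ssr_count, seq_length_bp):
    """Number of SSRs per megabase of analysed sequence."""
    return ssr_count / (seq_length_bp / 1e6)


def relative_density(ssr_total_bp, seq_length_bp):
    """Total SSR length in bp per megabase of analysed sequence."""
    return ssr_total_bp / (seq_length_bp / 1e6)


if __name__ == "__main__":
    # Hypothetical toy input: 700 SSRs covering 11,000 bp in a 2 Mb sequence.
    print(relative_abundance(700, 2_000_000))   # SSRs per Mb
    print(relative_density(11_000, 2_000_000))  # bp per Mb
```

Under these definitions a density in bp/Mb divided by 1,000 gives the per-mille share of the sequence occupied by SSRs.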

4.
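Using 31 genome sequences of human adenovirus subgroup B and 39 of subgroup D as study material, this paper systematically analyzed and compared the distribution of simple sequence repeats (SSRs) in these genomes with Imperfect Microsatellite Extractor and DNAMAN. The mean relative density of SSRs was very similar in the subgroup B and D genomes, but the distribution differed among SSR types. Dinucleotide SSRs were markedly more abundant in subgroup D than in subgroup B; among mononucleotide SSRs, (A)n and (T)n were relatively abundant in both subgroups, while among dinucleotide SSRs, (CG/GC)n was strongly favored in both. In multiple sequence alignments within each subgroup, subgroup D showed greater stability. This subgroup-specific distribution of SSRs may be related to the evolutionary mechanisms and pathogenicity of subgroups B and D.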

5.
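Using hemagglutinin (HA) nucleotide sequences of influenza A virus subtypes publicly available in GenBank, the HA segments of 76 influenza A strains from 49 regions across Asia, Africa, North America, South America, Europe, and Oceania were analyzed in terms of the distribution of simple sequence repeats (SSRs). The SSR distributions of all analyzed sequences were very similar; the relative abundance and relative density of mononucleotide repeats were higher than those of the other five repeat types. Compared with SSRs in HIV-1 [16] genes, SSRs in the influenza A HA segment showed higher relative abundance and relative density. These results suggest that SSRs in influenza A virus genes may be related to the rapid variation of the virus.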

6.
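Comparative analysis of simple sequence repeats between Arabidopsis thaliana and rice   Total citations: 3, self-citations: 0, citations by others: 3
A series of programs for identifying and analyzing simple sequence repeats (SSRs) was written in Perl and C and used to analyze, at the whole-genome level, the distribution of SSRs in Arabidopsis thaliana L. and the relationship between SSRs and genes. A total of 5,652 SSRs (≥20 bp) were found, about one per 20.6 kb. SSR density was largely uniform across the Arabidopsis chromosomes. Of the 27,480 Arabidopsis coding sequences, only 677 contained SSRs (725 in total), and most of the trinucleotide SSRs among them corresponded to small hydrophilic amino acids. In highly conserved genes shared by Arabidopsis and chromosome 4 of rice (Oryza sativa L.), the SSRs were nevertheless not conserved. Comparing SSRs between Arabidopsis and rice led to the following inferences: SSR density is higher in rice than in Arabidopsis both genome-wide and within genes, which may be one reason why the rice genome is larger, with SSRs making up 0.21% of the rice genome but only 0.13% of the Arabidopsis genome; and not only do the genomes of different species differ in their SSR preferences, but so do their genes. Some nested satellite sequences were found in both rice and Arabidopsis.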

7.
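A series of programs for identifying and analyzing simple sequence repeats (SSRs) was written in Perl and C and used to analyze, at the whole-genome level, the distribution of SSRs in Arabidopsis thaliana L. and the relationship between SSRs and genes. A total of 5,652 SSRs (≥20 bp) were found, about one per 20.6 kb. SSR density was largely uniform across the Arabidopsis chromosomes. Of the 27,480 Arabidopsis coding sequences, only 677 contained SSRs (725 in total), and most of the trinucleotide SSRs among them corresponded to small hydrophilic amino acids. In highly conserved genes shared by Arabidopsis and chromosome 4 of rice (Oryza sativa L.), the SSRs were nevertheless not conserved. Comparing SSRs between Arabidopsis and rice led to the following inferences: SSR density is higher in rice than in Arabidopsis both genome-wide and within genes, which may be one reason why the rice genome is larger, with SSRs making up 0.21% of the rice genome but only 0.13% of the Arabidopsis genome; and not only do the genomes of different species differ in their SSR preferences, but so do their genes. Some nested satellite sequences were found in both rice and Arabidopsis.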

8.
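In this study, perfect SSRs across the whole silkworm genome were obtained with MSDB v2.4 and bioinformatic methods, and their distribution patterns were compared. The genome contained 141,311 SSRs, with a relative abundance of 209.01 per Mb and a total length of 2.41 Mb. Across the six repeat types, number and density followed the pattern mononucleotide > tetranucleotide > trinucleotide > dinucleotide > pentanucleotide > hexanucleotide, showing that mononucleotide repeats are the main type genome-wide; of the six types, pentanucleotide SSRs had the highest G-C content. Analysis of SSRs in the 3' untranslated region (3'UTR), 5' untranslated region (5'UTR), coding sequences (CDs), introns, and intergenic regions showed that introns held the most SSRs (125,178) and the 5'UTR the fewest (278), in the order introns > intergenic regions > 3'UTR > CDs > 5'UTR. Total SSR counts by repeat type differed considerably among the five regions: trinucleotide repeats had the largest total in coding sequences, whereas mononucleotide repeats were the most numerous in the other four regions. When the six repeat classes were tallied in each of the five regions, the motifs with the highest total counts (or frequencies) were A; AC, AG, AT; AAT, CCG; AAAT, AAAC; AAATC, AAACT; and TAAGTT, GAATTT, AATTAA. Total counts of repeat types in introns and intergenic regions were significantly higher than in the 3'UTR, CDs, and 5'UTR. Copy numbers ranged from 4 to 100 and were concentrated between 4 and 30. These results lay a foundation for systematic screening of SSR markers and genetic analysis in silkworm.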

9.
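Identification and analysis of SSRs in EST sequences of Eimeria tenella   Total citations: 1, self-citations: 0, citations by others: 1
A bioinformatic analysis of EST-SSRs in Eimeria tenella was carried out. A total of 34,074 E. tenella EST sequences with a combined length of 16.45 Mb were obtained; 7,651 ESTs contained SSRs of at least 12 bp, from which 19,576 SSRs with a combined length of 0.35 Mb were obtained. The EST-SSR frequency was 48.00%, and an SSR of at least 12 bp occurred on average once every 840 bp. Among the nucleotide repeat motifs in E. tenella, repeats with units of 2, 3, 4, 5, 6, and 7 bp comprised 11 types (472 SSRs), 49 types (14,710 SSRs), 31 types (525 SSRs), 13 types (25 SSRs), 21 types (43 SSRs), and 15 types (400 SSRs), respectively; trinucleotide repeats were the most abundant, accounting for 75.14% of the total. Among G/C-rich repeat units, GCA was the most frequent (28.63%), followed by AGC (17.59%), GCT (8.76%), TGC (7.62%), and CTG (7.15%).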

10.
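Microsatellite analysis in honeybee ESTs   Total citations: 5, self-citations: 1, citations by others: 4
Li Bin, Xia Qingyou, Lu Cheng, Zhou Zeyang. Acta Genetica Sinica (《遗传学报》), 2004, 31(10): 1089-1094
To accelerate the use of molecular markers in studies of honeybee genetics, evolution, and behavior, the frequency and density of simple sequence repeats (SSRs) in honeybee ESTs were analyzed. The honeybee EST dataset comprised 15,869 sequences with a total length of 7.9 Mb. The frequency of SSRs in honeybee ESTs was one per 0.52 kb; hexanucleotide motifs were the most abundant repeat unit, accounting for 45.0% of all SSRs, while di-, mono-, tri-, tetra-, and pentanucleotide motifs accounted for 17.9%, 14.1%, 11.6%, 9.2%, and 2.2%, respectively. Among the repeat units, A-rich motifs predominated, for example A, AT, AG, AC, AAT, AAG, AAC, AAAT, AAAG, AAAAG, AAAAT, AATAT, AAAAAG, and AAAAAT, whereas G-rich repeat units were scarce in gene coding regions. Further analysis showed that the frequency and density of honeybee SSRs were similar in the redundant and non-redundant EST datasets, with only minimal deviation, indicating that effective microsatellite markers can readily be obtained from the partial EST data currently available.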

11.
All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.

12.
Microsatellites, or simple sequence repeats (SSRs), are highly polymorphic and universally distributed in eukaryotes. SSRs have been used extensively as sequence tagged markers in genetic studies. Recently, the functional and evolutionary importance of SSRs has received considerable attention. Here we report the mining and characterization of the SSRs in the papaya genome. We analyzed SSRs from 277.4 Mb of whole genome shotgun (WGS) sequences, 51.2 Mb of bacterial artificial chromosome (BAC) end sequences (BES), and 13.4 Mb of expressed sequence tag (EST) sequences. The papaya SSR density was one SSR per 0.7 kb of DNA sequence in the WGS, which was higher than that in BES and EST sequences. SSR abundance was dramatically reduced as the repeat length increased. According to SSR motif length, dinucleotide repeats were the most common motif in class I, whereas hexanucleotides were the most copious in class II SSRs. The tri- and hexanucleotide repeats of both classes were more frequent in EST sequences than in genomic sequences. Among class I SSRs, AT and AAT were the most frequent motifs in BES and WGS sequences. By contrast, AG and AAG were the most abundant in EST sequences. For SSR marker development, 9,860 primer pairs were surveyed for amplification and polymorphism. Successful amplification and polymorphic rates were 66.6% and 17.6%, respectively. The highest polymorphic rates were achieved by AT, AG, and ATG motifs. The genome-wide analysis of microsatellites revealed their frequency and distribution in the papaya genome, which varies among plant genomes. This complete set of SSR markers throughout the genome will assist diverse genetic studies in papaya and related species.

13.
Brassica rapa ssp. pekinensis (Chinese cabbage) is an economically important crop and a model plant for studies on polyploidization and phenotypic evolution. To gain an insight into the structure of the B. rapa genome we analyzed 12,017 BAC-end sequences for the presence of transposable elements (TEs), SSRs, centromeric satellite repeats and genes, and similarity to the closely related genome of Arabidopsis thaliana. TEs were estimated to occupy 14% of the genome, with 12.3% of the genome represented by retrotransposons. It was estimated that the B. rapa genome contains 43,000 genes, 1.6 times greater than the genome of A. thaliana. A number of centromeric satellite sequences, representing variations of a 176-bp consensus sequence, were identified. This sequence has undergone rapid evolution within the B. rapa genome and has diverged among the related species of Brassicaceae. A study of SSRs demonstrated a non-random distribution with a greater abundance within predicted intergenic regions. Our results provide an initial characterization of the genome of B. rapa and provide the basis for detailed analysis through whole-genome sequencing.

14.
We explored the possible role of SSR density in the genome in generating biological information. We examined the status of simple sequence repeats (SSRs) in virulent and non-virulent genes of enteric bacteria to see whether SSR distribution contributes to virulence. Genome, plasmid, and virulence gene sequences in FASTA format were downloaded from NCBI GenBank and VFDB and subjected to SSR analysis using the software tool ssr.exe. The resulting data were transferred to an Excel sheet and analyzed for the percentage of each type of SSR. Higher nucleotide repeats were observed in our study. Overall, a high density of SSRs can enhance antigenic variance of the pathogen population in a strategy that counteracts the host immune response. The frequency of A and T repeats is higher in the chromosome, the plasmid, and the virulence genes. Among dinucleotide repeats, however, GC/CG repeats are more frequent in the genome, whereas the plasmid has more AT/TA repeats. The genome has trinucleotide repeats composed predominantly of G and C, whereas the plasmid has trinucleotide repeats composed predominantly of A and T. The number and percentage of repeats are higher in virulence genes than in other gene families. Owing to this large number of SSRs, the organism has enormous potential for generating genomic and phenotypic diversity.

15.
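Sun Gaofei, He Shoupu, Pan Zhao'e, Du Xiongming. Hereditas (Beijing) (《遗传》), 2015, 37(2): 192-203
Simple sequence repeats (SSRs) are short tandemly repeated DNA sequences found widely in plant and animal genomes and are important genomic molecular markers. Comparing homologous SSRs across genomes helps in understanding the evolution of closely related species. Using the whole-genome sequences of Gossypium raimondii (D5) and Gossypium arboreum (A2) and restriction-digest genome sequencing data for Gossypium hirsutum (AD1), this paper performed genome-wide SSR scans, compared the SSR distributions of the A and D genomes, and identified homologous SSRs among the three genomes to compare differences in their repeat sequences. The distribution of homologous SSRs was very similar in the A and D genomes, but homologous SSRs between A and AD were more conserved than those between D and AD. Relative to their homologs in AD, SSRs with longer repeat tracts were about 5 times as numerous as SSRs with shorter repeat tracts in the A genome, and about 3 times as numerous in the D genome. It can be inferred that, during the parallel evolution of the tetraploid AD genome alongside the A and D genomes, genome merger caused SSR repeat length to change at a rate different from that in the diploid A and D genomes, and that this difference may have led SSR repeat length in AD to tend to shorten relative to the diploids. This is the first systematic comparison of homologous SSRs across three cotton genomes; it reveals significant differences in homologous SSRs between the tetraploid and diploid Gossypium genomes and provides a basis for further work on the evolution of Gossypium genomes.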

16.
17.
Simple sequence repeats (SSRs) are omnipresent in prokaryotes and eukaryotes and are found throughout the genome, in both protein-coding and noncoding regions. In the present study, the whole genome sequences of seven chromosomes (Shigella flexneri 2a str. 301 and 2457T, Shigella sonnei, Escherichia coli K12, Mycobacterium tuberculosis, Mycobacterium leprae, and Staphylococcus saprophyticus) were downloaded from the GenBank database to identify the abundance, distribution, and composition of SSRs and to determine the difference between tandem repeats in the real genomes and in randomized genomes (generated with a sequence shuffling tool) of these organisms. The data obtained show that: (i) tandem repeats are widely distributed throughout the genomes; (ii) SSRs are differentially distributed among coding and noncoding regions in the investigated Shigella genomes; (iii) the total frequency of SSRs is higher in noncoding regions than in coding regions; (iv) in all investigated chromosomes, the ratio of trinucleotide SSRs is much higher in the real genomes than in the randomized genomes, while that of dinucleotide SSRs is lower; (v) the ratio of total and mononucleotide SSRs is higher in the real genome than in the randomized genome in E. coli K12, S. flexneri str. 301, and S. saprophyticus, lower in S. flexneri str. 2457T, S. sonnei, and M. tuberculosis, and approximately the same in M. leprae; (vi) the frequency of codon repetitions varies considerably depending on the type of encoded amino acid.
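The real-versus-randomized comparison described in this abstract can be sketched as follows. The abstract does not name its shuffling tool or its counting thresholds, so everything here is an assumption: the sequence is shuffled base by base (preserving composition only), SSRs are counted as perfect repeats with a regular expression, and all function names are hypothetical.

```python
import random
import re


def count_ssrs(seq, motif_len, min_copies):
    """Count non-overlapping perfect tandem repeats with a motif of motif_len nt.

    Note: for motif_len > 1 this also counts homopolymer runs that are long
    enough to match (e.g. "AAAAAA" matches as "AA" x 3).
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_copies - 1))
    return sum(1 for _ in pattern.finditer(seq.upper()))


def shuffled_counts(seq, motif_len, min_copies, n_shuffles=100, seed=0):
    """SSR counts in n_shuffles base-shuffled copies of seq (same composition)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    bases = list(seq.upper())
    counts = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        counts.append(count_ssrs("".join(bases), motif_len, min_copies))
    return counts


def real_to_random_ratio(seq, motif_len, min_copies, n_shuffles=100, seed=0):
    """Ratio of the real SSR count to the mean count over shuffled sequences."""
    counts = shuffled_counts(seq, motif_len, min_copies, n_shuffles, seed)
    mean_random = sum(counts) / len(counts)
    if mean_random == 0:
        return float("nan")  # ratio undefined when no shuffled copy has an SSR
    return count_ssrs(seq, motif_len, min_copies) / mean_random
```

A ratio above 1 would indicate more SSRs of that type than expected from base composition alone. Shuffling single bases is the simplest null model; shuffles that preserve dinucleotide or codon frequencies are stricter alternatives.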

18.
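Simple sequence repeats, also known as microsatellites, have been applied successfully in genomic and evolutionary studies of many eukaryotes, prokaryotes, and viruses, but microsatellites in bacteriophages have rarely been studied. A comprehensive analysis was therefore carried out of microsatellites and compound microsatellites (two or more directly adjacent microsatellites) in 60 Caudovirales genomes, in which a total of 11,874 microsatellites and 449 compound microsatellites were observed. Correlation analysis showed that the number of microsatellites was positively and linearly correlated with genome size (ρ = 0.899, P < 0.01). The reference sequences contained fewer microsatellites than the corresponding randomized sequences, an anomaly due mainly to the reference sequences containing fewer mononucleotide and dinucleotide repeats. A/T and AT/TA repeats were the predominant types among mononucleotide and dinucleotide repeats, so the GC content of mononucleotide repeats was markedly lower than that of the corresponding sequences; by contrast, the GC content of dinucleotide and trinucleotide repeats did not differ notably from that of the corresponding reference sequences. These results for Caudovirales genomes differ somewhat from those for the genomes of other organisms, and they help in understanding the distribution, evolution, and biological function of microsatellites in Caudovirales.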

19.
MOTIVATION: Simple sequence repeats (SSRs) are abundant across genomes. However, the significance of SSRs in the organellar genomes of rice has not been completely understood. The availability of organellar genome sequences allows us to understand the organization of SSRs in their genic and intergenic regions. RESULTS: We analyzed SSRs in the mitochondrial and chloroplast genomes of rice. We identified 2528 SSRs in the mitochondrial genome and an average of 870 SSRs in the chloroplast genomes. About 8.7% of the mitochondrial and 27.5% of the chloroplast SSRs were observed in the genic region. Dinucleotides were the most abundant repeats in genic and intergenic regions of the mitochondrial genome, while mononucleotides were predominant in the chloroplast genomes. The rps and nad gene clusters of mitochondria had the maximum repeats, while the rpo and ndh gene clusters of chloroplast had the maximum repeats. We identified SSRs in both organellar genomes and validated them in different cultivars and species.
