Similar literature
1.
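[Objective] To replace the traditional radiolabeled primer extension technique with the internal-standard method of a genetic fragment analyzer for transcription start site (TSS) analysis, to supply the prior prediction and subsequent evaluation steps that primer extension lacks when applied to unknown samples, and thereby to establish a complete technical protocol for analyzing the TSS of unknown samples with the fragment analyzer internal-standard method. [Methods] The double-copy groEL genes of Myxococcus DK1622 were used as the test material. First, promoters and transcription start sites were predicted from databases. Second, fluorescently labeled primers were designed and synthesized according to the predictions and used for reverse transcription of the target mRNA. Third, the internal-standard method of genetic fragment analysis was used to identify the TSSs of the double-copy groEL genes and to measure their abundance. Finally, normal distribution theory was applied to evaluate the identification results. [Results] The number of transcription start sites, their transcript abundance and the most probable TSS positions were determined. In the Myxococcus DK1622 genome, the groEL1 copy has one promoter, with its TSS at TSS_286; the groEL2 copy has two promoters, with TSSs at TSS_548 and TSS_502. The transcript abundance of TSS_548 is 13.8 times that of TSS_502, and the abundance of groEL1 TSS_286 is 14.3 times that of groEL2 TSS_548. [Conclusion] The predictions defined the scope of the experimental design, replacing the traditional radiolabeling method with fragment analyzer internal-standard detection made the experiment simpler, safer, more automated and more accurate, and normal distribution theory further assessed the reliability of the results. Combined, the three form a complete protocol for identifying transcription start sites.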

2.
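Eukaryotic genomes consist of genes and intergenic regions. During transcription, each gene forms an independent transcription unit that runs from its transcription start site to its transcription termination site. A small number of reports, however, show that transcription sometimes reads through the intergenic region, producing a fusion transcript that contains the upstream gene, the intergenic region and the downstream gene. The fusion transcript becomes a functional mature transcript through intergenic splicing. This review covers the intergenic splicing patterns, mechanisms of formation and significance of transcription-induced fusion genes in eukaryotes.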

3.
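[Background] Breast milk is an important reservoir for screening probiotics, and Lactobacillus plantarum is a widely used, highly adaptable probiotic found in it. Different strains, however, have different functions, and existing physiological and biochemical methods reveal little about their potential probiotic properties, so high-throughput methods are needed to find high-quality, population-specific probiotics. [Objective] To predict the potential functions of two L. plantarum strains from whole-genome sequencing and analysis combined with their biochemical characteristics, focusing on genes related to intestinal fluid tolerance and bacteriocin synthesis, that is, to examine the strains' phenotypes at the level of genome structure. [Methods] Two L. plantarum strains of breast-milk origin, MP55 and MP37, were isolated and screened. Their whole genomes were sequenced on an Illumina Genome Analyzer and annotated with Prokka. Functional annotation used the Carbohydrate-Active enZYmes (CAZy), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Clusters of Orthologous Genes (COG) databases. Coding sequences and ribosomal RNAs were predicted with tools such as Prodigal and RNAmmer, and circular genome maps were drawn with CGView. [Results] Genome assembly yielded whole-genome information for both strains. The genomes of L. plantarum MP37 and MP55 are 3,204,421 bp and 3,299,180 bp in size, have (G+C) mol% contents of 44.36% and 44.46%, and contain 3,012 and 3,101 coding sequences, respectively. Combined with the strains' biochemical characteristics, four genes related to intestinal fluid tolerance and one bacteriocin synthesis gene cluster were found in the genomes. The raw sequence data and assemblies have been submitted to the "gcMeta" platform. [Conclusion] High-throughput sequencing analysis revealed, at the genome level, possible mechanisms behind the intestinal survival and pathogen inhibition of L. plantarum MP55 and MP37. The two strains are potential probiotic candidates, and the results provide a genetic basis for further clarifying the functional mechanisms of their probiotic properties.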

4.
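[Objective] To improve the genome annotation of the Chinese oak silkworm, Antheraea pernyi, and extend its use in comparative genomics and breed improvement research. [Methods] Full-length transcriptome sequencing was performed on A. pernyi. Full-length transcripts were aligned to the reference genome to identify new genes and new transcripts, which were then functionally annotated and screened for long non-coding RNAs (lncRNAs). The large set of protein-coding transcripts and lncRNAs was used to revise gene structures in the A. pernyi genome, and a corrected gene annotation was then produced. [Results] A total of 1,997 new protein-coding genes and 3,399 new lncRNA genes were found, supported by 2,402 and 3,574 full-length transcripts, respectively. The A. pernyi genome was found to contain 25,021 genes, of which 19,825 are protein-coding, including 7 juvenile hormone acid methyltransferase genes. [Conclusion] This study improves the understanding of gene annotation in the A. pernyi genome and provides a useful data resource for functional and comparative genomics of A. pernyi and related species.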

5.
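The features of the sequences flanking the start and stop codons of eukaryotic genes are important for determining the open reading frame (ORF) of a cDNA and for predicting coding sequences (CDS) in genomic sequence. Using the high-quality RefSeq database, the "Kozak rule" followed by sequences flanking the start codon was analyzed statistically on a large dataset, and differences between species were found. The statistical features of the sequences flanking the different stop codons were also analyzed, and corresponding regular expressions are given. Because in-frame start codons and in-frame stop codons were found to be used in tandem in many genes, this situation is discussed as well.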
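The kind of flanking-sequence statistic described in item 5 can be sketched as a per-position nucleotide count around annotated start codons. This is only an illustration: the function name `flank_counts`, the window sizes and the toy sequences are hypothetical, and the abstract does not state how the authors actually computed their statistics.

```python
from collections import Counter

def flank_counts(cdnas, start_positions, upstream=6, downstream=1):
    """Count nucleotides at each position flanking an ATG start codon.

    Positions use the usual Kozak numbering: the A of ATG is +1, so the
    base just before it is -1 and the base just after the codon is +4.
    `start_positions` holds the 0-based index of the A in each sequence.
    """
    counts = {}
    for seq, start in zip(cdnas, start_positions):
        seq = seq.upper()
        # Skip records whose annotated start is not ATG.
        if seq[start:start + 3] != "ATG":
            continue
        # Skip records without enough flanking sequence on either side.
        if start < upstream or start + 3 + downstream > len(seq):
            continue
        for offset in range(-upstream, 0):
            counts.setdefault(offset, Counter())[seq[start + offset]] += 1
        for k in range(downstream):
            counts.setdefault(4 + k, Counter())[seq[start + 3 + k]] += 1
    return counts

# Toy input (hypothetical sequences, start codon at index 6 in each).
toy = ["GCCACCATGGCG", "TTTACCATGGAA", "GCCGCCATGTCC"]
table = flank_counts(toy, [6, 6, 6])
for position in sorted(table):
    print(position, dict(table[position]))
```

Run on a real RefSeq-derived set, a table like this would show which bases dominate at positions such as -3 and +4 in each species, which is where between-species differences would become visible.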

6.
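[Objective] To study how the DNA double-strand breaks (DSBs) produced by the genome-editing tools CRISPR/Cas9 and CRISPR/Cpf1 damage Saccharomyces cerevisiae DNA and what repair responses they trigger, to compare them with the genomic DNA damage and repair caused by the chemical methyl methanesulfonate (MMS), and to clarify the changes in edited cells at the cellular and transcriptional levels. [Methods] Two kinds of starting cells were used: cells without cell-cycle synchronization and cells synchronized to G0/G1 phase with α-factor. The growth of edited cells was measured after CRISPR/Cas9 and CRISPR/Cpf1 treatment. Cell-cycle arrest in edited cells was measured by flow cytometry. Changes in the transcript levels of key DNA damage response genes in edited cells and in MMS-treated cells were measured by quantitative fluorescent PCR. [Results] Whether or not the starting cells were synchronized, genome editing inhibited their growth, lowered cell survival and arrested the cell cycle at G2/M phase, whereas MMS treatment caused arrest in S phase. In addition, the mutation rate rose and cell survival fell as editing time increased. Both the mutation rate and the survival rate of CRISPR/Cpf1-edited cells were lower than those of CRISPR/Cas9-edited cells, indicating that CRISPR/Cpf1 damages cells more severely than CRISPR/Cas9. Both editing systems significantly up-regulated transcription of the key yeast DNA damage response genes RNR3 and HUG1; the up-regulation mediated by CRISPR/Cpf1 was larger than that mediated by CRISPR/Cas9, but both were lower than that caused by MMS treatment. [Conclusion] This study characterized the DNA damage and repair responses caused by CRISPR/Cas9- and CRISPR/Cpf1-mediated genome editing at the cellular and transcriptional levels and gave a first view of how strongly S. cerevisiae responds to different types of DSB damage, providing an important basis for improving the editing capability of genome-editing tools and for assessing the safety of gene editing.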

7.
Wang Yi, Ye Jiang, Zhang Huizhan. Acta Microbiologica Sinica, 2012, 52(5): 566-572
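[Objective] To investigate the activity of the yigP gene promoter and analyze this transcriptional regulatory sequence. [Methods] With lacZ as the reporter gene, promoter fragments were cloned into a promoter-probe plasmid and promoter activity was judged from β-galactosidase activity. The promoter region was located by cloning a series of progressively shortened promoter fragments. Site-directed mutagenesis was used to mutate important promoter sequences and examine the effect on promoter activity. [Results] The region of the yigP promoter was determined and its -10 and -35 regions were identified. A negative regulatory sequence was found upstream of the promoter, and preliminary study of it showed that part of the sequence forms the core of this negative regulation. [Conclusion] The transcriptional regulatory sequence of the yigP gene was identified, adding to our understanding of gene transcriptional regulation.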

8.
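[Objective] To study the mechanism by which AphA regulates transcription of vopT in Vibrio parahaemolyticus. [Methods] Total RNA was extracted from the wild-type strain (WT) and the aphA mutant (ΔaphA). Primer extension was used to map the transcription start site of vopT, and the regulatory relationship with AphA was judged from differences in product abundance. Total RNA from WT and ΔaphA was reverse-transcribed into cDNA, and real-time quantitative RT-PCR was used to further examine AphA regulation of the target gene. The vopT promoter region was cloned upstream of the β-galactosidase gene in plasmid pHRP309 to build a LacZ recombinant plasmid, which was transferred into WT and ΔaphA; the regulatory relationship was then assessed by measuring and comparing β-galactosidase activity in the two strains. The entire promoter region of the target gene was amplified by PCR, His-AphA protein was purified, and an electrophoretic mobility shift assay (EMSA) was used to test whether His-AphA binds the promoter region directly. [Results] vopT has a single transcription start site, A (-86), and its transcriptional activity is indirectly repressed by AphA. RT-PCR and EMSA results showed that AphA also indirectly represses transcription of vtrA. [Conclusion] AphA indirectly represses vopT transcription, and this indirect repression is independent of VtrA.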

9.
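The plant mitochondrial genome, one of the three genetic systems in the plant cell, has many distinctive features in transcription and post-transcriptional processing. In genome structure, plant mitochondrial genomes are relatively large and vary considerably between species. Their transcription has many particular features; for example, it can start at multiple sites within a coding region. In higher plants, most of the mitochondrial introns found so far are group II introns, and these introns sometimes encode proteins. C-to-U conversion is a very prominent feature of post-transcriptional editing of plant mitochondrial genes. In mitochondria, polyadenylation tends to destabilize transcripts, whereas in the nucleus polyadenylation of RNA enhances transcript stability. This review covers the features of plant mitochondrial genome structure and of post-transcriptional editing, splicing and polyadenylation, together with research progress in these areas.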

10.
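[Objective] To characterize the structure of the mitochondrial genome of Limenitis helmanni and its molecular phylogeny. [Methods] The complete mitochondrial genome sequence of L. helmanni was determined by PCR walking and analyzed. A phylogenetic tree of 66 lepidopteran species was built from the nucleotide sequences of the 13 protein-coding genes and 2 rRNA genes of the mitochondrial genome. [Results] The mitochondrial genome of L. helmanni is 15,178 bp long (GenBank accession no. KY290566) and comprises 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and a 346 bp A+T-rich region; the gene order is the same as in other known closely related insects. The genome has a very high A+T content (81.1%). Among the 13 protein-coding genes, COI uses CGA as its start codon and ND5 uses GTT; all others use the typical insect start codon ATN. COII and ND4 use the incomplete stop codon T, and the remaining genes use the typical stop codon TAA. Of the 22 tRNA genes, all except tRNASer(AGN), which lacks the DHU arm, can form the typical cloverleaf structure. As in most other Lepidoptera, the A+T-rich region of L. helmanni contains a conserved 20 bp poly-T stretch led by the motif ATAGA, along with scattered tandem repeat units of varying length. The phylogenetic tree gives the following subfamily-level relationships within Nymphalidae: (Calinaginae + Satyrinae) + ((Nymphalinae + Apaturinae) + (Heliconiinae + Limenitidinae)). [Conclusion] Limenitidini is closely related to Euthaliini, and Parthenini is an early-diverging lineage of the subfamily. The phylogenetic relationships among Limenitidinae species inferred from mitochondrial genomes are inconsistent with the conclusions of traditional morphological taxonomy.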

11.
A gene in a genome is defined as putative alien (pA) if its codon usage difference from the average gene exceeds a high threshold and codon usage differences from ribosomal protein genes, chaperone genes and protein-synthesis-processing factors are also high. pA gene clusters in bacterial genomes are relevant for detecting genomic islands (GIs), including pathogenicity islands (PAIs). Four other analyses appropriate to this task are G+C genome variation (the standard method); genomic signature divergences (dinucleotide bias); extremes of codon bias; and anomalies of amino acid usage. For example, the cagA domain of Helicobacter pylori is highly deviant in its genome signature and codon bias from the rest of the genome. Using these methods we can detect two potential PAIs in the Neisseria meningitidis genome, which contain hemagglutinin and/or hemolysin-related genes. Additionally, G+C variation and genome signature differences of the Mycobacterium tuberculosis genome indicate two pA gene clusters.
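As a rough illustration of the "genomic signature (dinucleotide bias)" idea in item 11, the sketch below computes dinucleotide relative abundances, the observed dinucleotide frequency divided by the product of the two mononucleotide frequencies, and the mean absolute difference between two such profiles. It is a simplified single-strand version under stated assumptions: the function names are hypothetical, only A, C, G and T are counted, and no strand symmetrization or thresholds from the paper are reproduced.

```python
from collections import Counter
from itertools import product

def dinucleotide_signature(seq):
    """Return rho[xy] = f(xy) / (f(x) * f(y)) for all 16 dinucleotides."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = len(seq)
    n_di = len(seq) - 1
    signature = {}
    for x, y in product("ACGT", repeat=2):
        fx = mono[x] / n_mono
        fy = mono[y] / n_mono
        fxy = di[x + y] / n_di
        # A base that never occurs gives an undefined ratio; report 0.0.
        signature[x + y] = fxy / (fx * fy) if fx > 0 and fy > 0 else 0.0
    return signature

def signature_difference(seq_a, seq_b):
    """Mean absolute difference between two dinucleotide signatures."""
    sig_a = dinucleotide_signature(seq_a)
    sig_b = dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16
```

Comparing a candidate region against the rest of its genome with `signature_difference` gives one number per region; deciding how large a value counts as "highly deviant" is left open here.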

12.
13.
Recognizing the pseudogenes in bacterial genomes
Pseudogenes are now known to be a regular feature of bacterial genomes and are found in particularly high numbers within the genomes of recently emerged bacterial pathogens. As most pseudogenes are recognized by sequence alignments, we use newly available genomic sequences to identify the pseudogenes in 11 genomes from 4 bacterial genera, each of which contains at least 1 human pathogen. The numbers of pseudogenes range from 27 in Staphylococcus aureus MW2 to 337 in Yersinia pestis CO92 (i.e. 1–8% of the annotated genes in the genome). Most pseudogenes are formed by small frameshifting indels, but because stop codons are A + T-rich, the two low-G + C Gram-positive taxa (Streptococcus and Staphylococcus) have relatively high fractions of pseudogenes generated by nonsense mutations when compared with more G + C-rich genomes. Over half of the pseudogenes are produced from genes whose original functions were annotated as ‘hypothetical’ or ‘unknown’; however, several broadly distributed genes involved in nucleotide processing, repair or replication have become pseudogenes in one of the sequenced Vibrio vulnificus genomes. Although many of our comparisons involved closely related strains with broadly overlapping gene inventories, each genome contains a largely unique set of pseudogenes, suggesting that pseudogenes are formed and eliminated relatively rapidly from most bacterial genomes.

14.
Although non-coding RNA (ncRNA) genes do not encode proteins, they play vital roles in cells by producing functionally important RNAs. In this paper, we present a novel method for predicting ncRNA genes based on compositional features extracted directly from gene sequences. Our method consists of two Support Vector Machine (SVM) models: a Codon model, which uses codon usage features derived from ncRNA genes and protein-coding genes, and a Kmer model, which uses nucleotide and dinucleotide frequency features extracted from ncRNA genes and from randomly chosen genome sequences, respectively. The 10-fold cross-validation accuracy of the two models is 92% and 91%, respectively. Thus, we can predict ncRNA genes in a genome automatically, without manual filtration of protein-coding genes. After applying our method to the Sulfolobus solfataricus genome, 25 sets of predictions were generated from 25 cut-off pairs. We have also applied the approach to E. coli and found our results comparable to those of previous studies. In general, our method enables automatic identification of ncRNA genes in newly sequenced prokaryotic genomes.
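The nucleotide and dinucleotide frequency features that item 14 feeds to its Kmer model can be pictured as a fixed-length vector per sequence. The sketch below builds a 20-element vector (4 mononucleotide plus 16 dinucleotide frequencies). The names `FEATURES` and `kmer_features` are hypothetical, the feature order is an arbitrary choice, and the SVM training step itself is deliberately left out because the abstract gives no details about kernels or parameters.

```python
from itertools import product

BASES = "ACGT"
# Feature order: A, C, G, T, then AA, AC, ..., TT (assumed for illustration).
FEATURES = list(BASES) + ["".join(p) for p in product(BASES, repeat=2)]

def kmer_features(seq):
    """Return overlapping 1-mer and 2-mer frequencies as a list of floats."""
    seq = seq.upper()
    vector = []
    for kmer in FEATURES:
        k = len(kmer)
        windows = len(seq) - k + 1
        if windows <= 0:
            # Sequence too short to contain any k-mer of this size.
            vector.append(0.0)
            continue
        hits = sum(1 for i in range(windows) if seq[i:i + k] == kmer)
        vector.append(hits / windows)
    return vector
```

Vectors like this, computed for known ncRNA genes and for background sequences, would form the two labelled classes a classifier is trained on.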

15.
Compositional distributions in the three codon positions of the coding sequences of 12 fully sequenced prokaryotic genomes, which are publicly available, were investigated. A universal compositional correlation was observed in most of the genomes under investigation irrespective of their overall genomic GC contents. In all the genomes, the GC contents at the first codon positions are always greater than the overall GC contents of the genomes whereas the reverse is true in the case of second codon positions. GC contents at the third codon positions are higher than the overall genomic GC contents in high-GC genomes, and the opposite situation was found in the case of low-GC genomes except for Helicobacter pylori. In high-GC genomes, the GC contents at the first + second codon positions are less than the GC contents at the third codon positions, and they are low in low-GC genomes except for Helicobacter pylori. The distributions of four bases at the three different positions were also investigated for all 12 organisms. It was observed that in high-GC genomes G is the most dominant base and in low-GC genomes A is the most dominant base in the first codon positions. But purine bases, i.e., (A + G), predominantly occur in the first codon position. In the second codon position, A is the most dominant base in most of the organisms and G is the least dominant base in all the organisms. There is no unique regular pattern of individual bases at the third codon positions; however, there are significant differences in the occurrences of (G + C) contents in the third codon positions among the different organisms. Calculations of dinucleotide frequencies in 12 different organisms indicate that in GC-rich genomes GG, GC, CC, and CG dinucleotides are the most dominant whereas the reverse is true in the case of low-GC genomes. Biological implications of these results are discussed in this paper.
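The per-position GC contents compared in item 15 (often written GC1, GC2 and GC3) are straightforward to compute for a single coding sequence. The following is a minimal sketch; the function name `gc_by_codon_position` and the toy sequence are hypothetical, and the input is assumed to be an in-frame coding sequence.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3): the G+C fraction at each codon position.

    Trailing bases that do not complete a codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    n_codons = usable // 3
    if n_codons == 0:
        return (0.0, 0.0, 0.0)
    gc = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            gc[i % 3] += 1
    return tuple(count / n_codons for count in gc)

# Toy coding sequence: ATG GCC AAA TAA
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAATAA")
print(gc1, gc2, gc3)  # 0.25 0.25 0.5
```

Averaging these values over all coding sequences of a genome and setting them beside the overall genomic GC content gives the kind of comparison the abstract reports.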

16.
Coding information is the main source of heterogeneity (non-randomness) in the sequences of microbial genomes. The heterogeneity corresponds to a cluster structure in triplet distributions of relatively short genomic fragments (200-400 bp). We found a universal 7-cluster structure in microbial genomic sequences and explained its properties. We show that codon usage of bacterial genomes is a multi-linear function of their genomic G+C-content with high accuracy. Based on the analysis of 143 completely sequenced bacterial genomes available in GenBank in August 2004, we show that four "pure" types of the 7-cluster structure are observed. Animated 3D scatter plots of the cluster structure for all 143 genomes are collected in a database which is made available on our web-site (http://www.ihes.fr/~zinovyev/7clusters). The findings can be readily introduced into software for gene prediction, sequence alignment or classification of microbial genomes.

17.
The Z curve database: a graphic representation of genome sequences
MOTIVATION: Genome projects for many prokaryotic and eukaryotic species have been completed and more new genome projects are currently underway. The availability of a large number of genomic sequences for researchers creates a need to find graphic tools to study genomes in a perceivable form. The Z curve is one such tool available for visualizing genomes. The Z curve is a unique three-dimensional curve representation for a given DNA sequence in the sense that each can be uniquely reconstructed given the other. The Z curve database for more than 1000 genomes has been established here. RESULTS: The database contains the Z curves for archaea, bacteria, eukaryota, organelles, phages, plasmids, viroids and viruses, whose genomic sequences are currently available. All the 3-dimensional Z curves and their three component curves are stored in the database. The applications of the Z curve database on comparative genomics, gene prediction, computation of G+C content with a windowless technique, prediction of replication origins and terminations of bacterial and archaeal genomes and study of local deviations from the Chargaff Parity Rule 2 etc. are presented in detail. The Z curve database reported here is a treasure trove in which biologists could find useful biological knowledge.
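The statement in item 17 that a sequence and its Z curve can each be reconstructed from the other is easy to check in code. The sketch below uses the commonly cited definition of the three components (purine vs pyrimidine, amino vs keto, weak vs strong hydrogen bonding); the abstract itself does not spell these formulas out, so treat them, and the function names, as assumptions.

```python
# Per-base step in (x, y, z):
#   x: purine (A, G) +1, pyrimidine (C, T) -1
#   y: amino  (A, C) +1, keto       (G, T) -1
#   z: weak   (A, T) +1, strong     (G, C) -1
STEP = {"A": (1, 1, 1), "C": (-1, 1, -1), "G": (1, -1, -1), "T": (-1, -1, 1)}

def z_curve(seq):
    """Return the cumulative (x, y, z) point after each base of seq."""
    x = y = z = 0
    points = []
    for base in seq.upper():
        dx, dy, dz = STEP[base]  # raises KeyError on non-ACGT symbols
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points

def sequence_from_z_curve(points):
    """Invert z_curve: recover the DNA sequence from its Z-curve points."""
    inverse = {step: base for base, step in STEP.items()}
    previous = (0, 0, 0)
    bases = []
    for point in points:
        step = tuple(a - b for a, b in zip(point, previous))
        bases.append(inverse[step])
        previous = point
    return "".join(bases)
```

Because each base maps to a distinct step vector, the round trip is lossless. The z component alone tracks the running excess of A+T over G+C, which is what makes a windowless G+C computation possible.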

18.
We present a simple method to detect pathogenicity islands and anomalous gene clusters in bacterial genomes. The method uses iterative discriminant analysis to define genomic regions that deviate most from the rest of the genome in three compositional criteria: G+C content, dinucleotide frequency and codon usage. Using this method, we identify many virulence-related gene islands, e.g. encoding protein secretion systems, adhesins, toxins, and other anomalous gene clusters, such as prophages. The program and the whole dataset, including the catalogs of genes in the detected anomalous segments, are publicly available at http://compbio.sibsnet.org/projects/pai-ida/. This program can be used in searching for virulence-related factors in newly sequenced bacterial genomes.

19.
The Horizontal Gene Transfer DataBase (HGT-DB) is a genomic database that includes statistical parameters such as G+C content, codon and amino-acid usage, as well as information about which genes deviate in these parameters for prokaryotic complete genomes. Under the hypothesis that genes from distantly related species have different nucleotide compositions, these deviated genes may have been acquired by horizontal gene transfer. The current version of the database contains 88 bacterial and archaeal complete genomes, including multiple chromosomes and strains. For each genome, the database provides statistical parameters for all the genes, as well as averages and standard deviations of G+C content, codon usage, relative synonymous codon usage and amino-acid content. It also provides information about correspondence analyses of the codon usage, plus lists of extraneous groups of genes in terms of G+C content and lists of putatively acquired genes. With this information, researchers can explore the G+C content and codon usage of a gene when they find incongruities in sequence-based phylogenetic trees. A search engine that allows searches for gene names or keywords for a specific organism is also available. HGT-DB is freely accessible at http://www.fut.es/~debb/HGT.
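One way to picture the "genes that deviate in G+C content" lists in item 19 is a simple screen against the genome-wide mean and standard deviation. This is a sketch under stated assumptions, not the database's actual procedure: the function names are hypothetical, the cut-off `n_sd=1.5` is an illustrative default, and the codon usage and amino-acid criteria mentioned in the abstract are not modelled.

```python
from statistics import mean, pstdev

def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def flag_gc_outliers(genes, n_sd=1.5):
    """Return sorted names of genes whose G+C content lies more than
    n_sd population standard deviations from the mean over all genes.

    `genes` maps gene name -> nucleotide sequence.
    """
    values = {name: gc_content(seq) for name, seq in genes.items()}
    observed = list(values.values())
    if not observed:
        return []
    mu = mean(observed)
    sd = pstdev(observed)
    if sd == 0:
        return []  # every gene has the same G+C content
    return sorted(n for n, v in values.items() if abs(v - mu) > n_sd * sd)
```

A gene flagged this way is only a candidate: atypical composition can have causes other than horizontal transfer, which is why the abstract speaks of "putatively acquired" genes.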

20.
E A Groisman, M H Saier Jr, H Ochman. The EMBO Journal, 1992, 11(4): 1309-1316
The genomes of Escherichia coli and Salmonella typhimurium are similar with respect to base composition, chromosome size, and the order, orientation and spacing of genes, but differ with respect to some 29 'loops', regions unique to one species. To evaluate the genetic basis for the structure and organization of the enteric bacterial genomes, we examined the gene encoding a non-specific acid phosphatase (phoN) which maps to a loop at 96 min on the S. typhimurium chromosome. We detected atypical base composition, codon usage pattern and trinucleotide frequencies. The 1.4 kb region containing phoN had an overall base composition of 43% G+C, while the G+C content at the third positions of codons in the phoN reading frame is only 39%, much lower than the Salmonella chromosome which averages 52%. Non-specific acid phosphatase activity, assayed in 14 Gram-negative species, was detected only in Morganella morganii and Providencia stuartii, organisms with low genomic G+C contents. Upstream of the phoN gene in Salmonella is a sequence with high similarity to the oriT region of incFII plasmids, suggesting that the phoN gene, and perhaps the entire loop structure, was acquired by lateral transmission in a plasmid-mediated event.
