Similar literature
 20 similar documents found (search time: 156 ms)
1.
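Using Phred/Phrap/Consed, cross_match, RepeatMasker, BLAST and other software together with in-house programs, an EST sequence analysis system for forest trees was built on the Linux operating system. The system converts sequencing chromatograms into nucleotide sequences, removes vector sequences, identifies repetitive sequences, clusters and assembles EST sequences, performs functional annotation and functional classification of ESTs, and mines SSRs and SNPs. Scripts written in Perl with BioPerl modules automate the analysis, so that large batches of forest-tree EST data can be analysed rapidly, providing useful information for functional genomics research on forest trees.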

2.
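Objective: To study and develop methods for filling gaps during whole-genome assembly of high-throughput sequencing data. Methods: The algorithms of assembly software were studied, a program for automatic gap filling was written in Perl, and a whole-genome assembly pipeline was established. Results: An end-extension method for gap filling was proposed and implemented in Perl; in assembling high-throughput sequencing data from Rickettsia, these methods greatly reduced the number of gaps. Conclusion: The end-extension method proposed in this study efficiently fills the gaps that arise during whole-sequence assembly and is highly practical.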

3.
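BioPerl is a collection of Perl tools and function modules dedicated to bioinformatics. It is the collective work of Perl developers around the world in bioinformatics, genomics and other life-science fields, and it serves biologists and computer scientists who study biological problems. This paper introduces BioPerl in detail and uses several application examples from research to illustrate its important role in bioinformatics research.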

4.
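Applying the basic theory of bioinformatics and using information technology, a convenient and efficient biological sequence analysis platform was developed. The system supports nucleic acid and protein sequence statistics, property analysis, PCR primer design, alignment and homology analysis. Using PCR primers designed with the system, a sequence fragment of 3-hydroxy-3-methylglutaryl coenzyme A reductase from Botrytis cinerea was cloned and subjected to homology analysis, which showed that the system is easy to operate and gives reliable results.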

5.
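MGAP, a microbial genome annotation system   Total citations: 6, self-citations: 0, citations by others: 6
Using bioinformatics methods and tools, a microbial genome annotation system (Microbial genome annotation package, MGAP) was developed and applied to the genome annotation of cyanobacterium PCC7002. The system consists of two parts: a genome annotation system and a Web-based user interface. The genome annotation system integrates several gene-finding, function-prediction and sequence-analysis programs, together with a protein sequence database, a protein information resource system and a database of orthologous protein families. The user interface includes a circular genome map display, distribution maps of genes and open reading frames on the chromosome, and a tool for searching annotation information. The system runs on PC hardware under Linux, with MySQL as the database management system, Apache as the Web server and Perl scripts for the application programming interface; all of this software is freely available.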

6.
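A set of Perl modules (named SLtools) automates SAGEmap sequence expression profile analysis and chromosomal localization of nucleic acid sequences, making high-throughput analysis of these two important biological functions possible. The program is portable and adaptable, and it can be used directly under Windows and Linux without any modification. By writing a simple Perl program that submits nucleic acid sequences or sequence accession numbers to the modules, the user obtains a series of corresponding analysis results. The modules can be downloaded free of charge: http://bioinfor.cicams.ac.cn/SLtools.tar.gz.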

7.
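New genes were sought directly from EST sequences: computer-based homology and identity analysis was used to find ESTs of interest and to build contigs containing those ESTs, followed by ORF determination and protein homology analysis. Using these bioinformatics techniques, the porcine APOBEC3F sequence was finally cloned.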

8.
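cDNA cloning and tissue-specific expression analysis of the VEGF164 gene in the Inner Mongolia white cashmere goat   Total citations: 1, self-citations: 0, citations by others: 1
The aim was to clone the vascular endothelial growth factor (VEGF164) gene of the Inner Mongolia white cashmere goat and to analyse its basic expression pattern. The gene was cloned by RT-PCR, and the resulting cDNA sequence and its encoded amino acid sequence were analysed bioinformatically. Tissue expression was examined by semi-quantitative RT-PCR. The full-length coding-region cDNA of the Inner Mongolia white cashmere goat VEGF164 gene was obtained; the amplified fragment was 573 bp long, contained a complete ORF and encoded 190 amino acid residues. The nucleotide sequence shared 99% homology with the sheep VEGF164 gene (EU857623.1), and the corresponding amino acid sequence homology was also 99%. SMART analysis showed that the protein encoded by the ORF has a signal peptide sequence and a platelet-derived and vascular endothelial growth factor family (PDGF, VEGF) domain. Psite analysis identified one protein kinase C phosphorylation site and four casein kinase phosphorylation sites. ProtComp Version 9.0 predicted an extracellular localization. RT-PCR showed that the VEGF164 gene is expressed in the brain, heart, testis, pancreas, spleen, kidney and lung of the cashmere goat.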

9.
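The rapid growth in the number of cucumber EST sequences in international public databases now provides a very convenient resource for developing SSR markers. In this study, 513,801 cucumber ESTs were downloaded from the Cucurbit Genomics Database and preprocessed with the EST-trimmer software and the CD-HIT program, yielding 381,022 non-redundant ESTs. The Perl program MISA found 15,665 SSR loci, a detection rate of 4.11%. Primer3 was used to design 9,145 pairs of cucumber EST-SSR primers successfully. Ten primer pairs were drawn at random and tested for polymorphism on five cucumber varieties; only two pairs detected polymorphism. These data lay a foundation for the further development of new cucumber EST-SSR markers.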
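The SSR search and the detection rate reported in the abstract above can be illustrated with a minimal Python sketch. This is not MISA itself: the regular-expression approach, the function names `find_ssrs` and `detection_rate`, and the minimum repeat counts in `MIN_REPEATS` are all assumptions chosen for illustration, and the abstract does not state which thresholds the study used.

```python
import re

# Minimum number of repeats per motif length.
# Illustrative assumption only; not taken from the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif of motif_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT").
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


def detection_rate(n_ssr, n_sequences):
    """SSR loci found per 100 sequences searched."""
    return 100.0 * n_ssr / n_sequences


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCGT" + "AGC" * 5 + "TT"
    print(find_ssrs(example))
    print(round(detection_rate(15665, 381022), 2))
```

The last line recomputes the rate from the two counts given in the abstract (15,665 loci in 381,022 non-redundant ESTs). Mononucleotide repeats and compound SSRs are deliberately left out of this sketch.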

10.
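Objective: To clone the cDNA of an ethylene-responsive element binding protein (EREBP) from super hybrid rice. Methods: The cDNA encoding an ethylene-responsive element binding protein in the model plant Arabidopsis thaliana was used to search the available rice genome database, which returned one highly homologous unknown sequence. After bioinformatics analysis of the nucleotide sequence, the protein sequence and the structure, properties and function of this unknown sequence, a pair of degenerate primers was designed from it; with super hybrid rice as the material, the target was amplified by RT-PCR and then T-A cloned. Results: The bioinformatics analysis indicated that the unknown sequence should be a rice cDNA encoding EREBP. Cloning and sequencing yielded a 915 bp cDNA, and BLAST showed that this cDNA shared 99% homology with part of the nucleotide sequence of the unknown sequence. The sequence was submitted to NCBI GenBank and accepted under accession number EF507537. Conclusion: Building on bioinformatics analysis and combining RT-PCR with T-A cloning, the EREBP cDNA was successfully cloned from super hybrid rice.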

11.
SEGMENT: identifying compositional domains in DNA sequences   Total citations: 2, self-citations: 0, citations by others: 2
MOTIVATION: DNA sequences are formed by patches or domains of different nucleotide composition. In a few simple sequences, domains can simply be identified by eye; however, most DNA sequences show a complex compositional heterogeneity (fractal structure), which cannot be properly detected by current methods. Recently, a computationally efficient segmentation method to analyse such nonstationary sequence structures, based on the Jensen-Shannon entropic divergence, has been described. Specific algorithms implementing this method are now needed. RESULTS: Here we describe a heuristic segmentation algorithm for DNA sequences, which was implemented in a Windows program (SEGMENT). The program divides a DNA sequence into compositionally homogeneous domains by iterating a local optimization procedure at a given statistical significance. Once a sequence is partitioned into domains, a global measure of sequence compositional complexity (SCC), accounting for both the sizes and compositional biases of all the domains in the sequence, is derived. SEGMENT computes SCC as a function of the significance level, which provides a multiscale view of sequence complexity.
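To make the Jensen-Shannon step in the abstract above concrete, here is a minimal Python sketch of a single segmentation step: it scores every cut point by the length-weighted Jensen-Shannon divergence between the nucleotide compositions of the two sides and returns the best one. The function names are hypothetical. The significance test and the iteration that SEGMENT applies are not reproduced here, because the abstract does not specify them.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits) of the symbol composition of seq."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def js_divergence(left, right):
    """Length-weighted Jensen-Shannon divergence between two subsequences."""
    n = len(left) + len(right)
    return (shannon_entropy(left + right)
            - len(left) / n * shannon_entropy(left)
            - len(right) / n * shannon_entropy(right))


def best_split(seq):
    """Return (cut position, divergence) for the cut that maximises divergence."""
    best = max(range(1, len(seq)),
               key=lambda i: js_divergence(seq[:i], seq[i:]))
    return best, js_divergence(seq[:best], seq[best:])


if __name__ == "__main__":
    print(best_split("A" * 20 + "G" * 20))
```

A full segmentation would accept a cut only if its divergence is statistically significant, then repeat the search inside each resulting domain. Recomputing the entropy at every cut point makes this sketch quadratic in sequence length, which is acceptable for illustration only.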

12.
Cyanogen bromide cleavage of reductively alkylated homogeneous rat liver dihydropteridine reductase afforded several peptide fragments identifiable by polyacrylamide electrophoresis of which 6 (CB-1 to CB-6) could be individually isolated by C8 reverse phase HPLC. Each was characterised by N-terminal amino acid analysis and sequence information was derived for CB-1, CB-4 and CB-6. The blocked N-terminal of the holoenzyme was identified as pyroglutamate and the C-terminal sequence was obtained by sequential degradation.

13.
14.
15.
Summary: The wheat rDNA clone pTA250 was examined in detail to provide a restriction enzyme map and the nucleotide sequence of two of the eleven 130 bp repeating units found within the spacer region. The 130 bp units showed some sequence heterogeneity. The sequence difference between the two 130 bp units analysed (130.6 and 130.8) was at 7 positions and could be detected as a 4 °C shift in Tm when heterologous and homologous hybrids were compared. This corresponded to a 1.2% change in nucleotide sequence per Tm of 1 °C. The sensitivity of the Tm analysis using cloned sequences facilitated the analysis of small sequence variations in the spacer region of different Triticum aestivum cultivars and natural populations of T. turgidum ssp. dicoccoides (referred to as T. dicoccoides). In addition, spacer length variation was assayed by restriction enzyme digestion and hybridization with spacer sequence probes. Extensive polymorphism was observed for the spacer region in various cultivars of T. aestivum, although within each cultivar the rDNA clusters were homogeneous and could be assigned to particular chromosomes. Within natural populations of T. dicoccoides polymorphism was also observed but, once again, within any one individual the rDNA clusters appeared to be homogeneous. The polymorphism at the sequence level (assayed by Tm analysis) was not so great as to prevent the use of spacer sequence variation as a probe for evolutionary relationships. The length variation as assayed by restriction enzyme digestion did not appear to be as useful in this regard, since its range of variation was extensive even within populations of a species.

16.
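Mblast: a multiple sequence alignment program based on the Blast program   Total citations: 3, self-citations: 0, citations by others: 3
Similarity analysis of nucleic acid and protein sequences is increasingly becoming the core of bioinformatics research. NCBI's Blast program is the most powerful tool for this kind of analysis. Although it offers a preliminary scheme for aligning several sequences together, the practical results are far from satisfactory. After careful analysis of the output of the Blast program, and following the idea of "seeking common ground while preserving differences", we wrote a multiple sequence alignment program, Mblast. Compared with currently popular multiple sequence alignment programs, it detects homologous regions of sequences more easily.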

17.
A new program, PSI Protein Classifier, generalizing the results of both successive and independent iterations of the PSI-BLAST program was developed. The technical opportunities of the program are described and illustrated by two examples. An iterative screening of the amino acid sequence database detected potential evolutionary relationships between GH5, GH13, GH27, GH31, GH36, GH66, GH101 and GH114 families of glycoside hydrolases. Analysis of the statistically significant sequence similarity (E-value analysis) allowed us to divide the family GH31 into 38 subfamilies.

18.
DNA Strider is a new integrated DNA and Protein sequence analysis program written in the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for the day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can handle simultaneously 6 sequences of any type up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated such as graphic restriction map, hydrophobicity profile and the CAI plot (codon adaptation index according to Sharp and Li). The restriction sites search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can therefore be computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds and its multiline restriction map obtained in a scrolling window within 5 seconds.

19.
20.
Analytical DNA ultracentrifugation revealed that eukaryotic genomes are mosaics of isochores: long DNA segments (>300 kb on average) relatively homogeneous in G+C. Important genome features are dependent on this isochore structure, e.g. genes are found predominantly in the GC-richest isochore classes. However, no reliable method is available to rigorously partition the genome sequence into relatively homogeneous regions of different composition, thereby revealing the isochore structure of chromosomes at the sequence level. Homogeneous regions are currently ascertained by plain statistics on moving windows of arbitrary length, or simply by eye on G+C plots. On the contrary, the entropic segmentation method is able to divide a DNA sequence into relatively homogeneous, statistically significant domains. An early version of this algorithm only produced domains having an average length far below the typical isochore size. Here we show that an improved segmentation method, specifically intended to determine the most statistically significant partition of the sequence at each scale, is able to identify the boundaries between long homogeneous genome regions displaying the typical features of isochores. The algorithm precisely locates classes II and III of the human major histocompatibility complex region, two well-characterized isochores at the sequence level, the boundary between them being the first isochore boundary experimentally characterized at the sequence level. The analysis is then extended to a collection of human large contigs. The relatively homogeneous regions we find show many of the features (G+C range, relative proportion of isochore classes, size distribution, and relationship with gene density) of the isochores identified through DNA centrifugation. Isochore chromosome maps, with many potential applications in genomics, are then drawn for all the completely sequenced eukaryotic genomes available.
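For contrast with the entropic segmentation described above, the baseline the abstract mentions (plain G+C statistics on moving windows of arbitrary length) can be sketched in a few lines of Python. The function names and the window and step sizes are illustrative assumptions and do not come from the paper.

```python
def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_windows(seq, window, step):
    """Return (start, G+C fraction) for each full window along seq."""
    return [(start, gc_content(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    # Hypothetical toy sequence: an AT-rich stretch followed by a GC-rich one.
    toy = "AT" * 50 + "GC" * 50
    for start, gc in gc_windows(toy, window=50, step=50):
        print(start, gc)
```

Because the window length is chosen arbitrarily, a compositional boundary that falls inside a window is blurred. This is the limitation that the abstract says the segmentation approach is meant to overcome.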
