18 similar articles found (search time: 46 ms)
1.
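BLAST, a database search system based on local sequence similarity alignment, is one of the most widely used tools in bioinformatics. This article first introduces the basic concepts of database similarity searching, including scoring matrices, gap penalties, and sensitivity and specificity. Using the alpha and beta subunits of hemoglobin as an example, it explains the basic BLAST search strategy: splitting the query into seed words, determining neighborhood words, searching for high-scoring pairs, extending high-scoring pairs, and computing expectation values. It discusses how seed word length, scoring matrix and gap penalties affect search results. It introduces the four general-purpose BLAST programs blastp, blastx, blastn and tblastn, as well as specialized programs such as SmartBlast, Primer-Blast and Global Align. It closes with a brief account of the main uses of BLAST, lists several international and Chinese BLAST websites, and introduces other database search programs such as FASTA, BLAT and HMMER.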
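The seed, search, extend and expectation-value steps named in this abstract can be illustrated with a minimal sketch. It is a simplification and not the BLAST implementation: it uses exact word matches on nucleotide-style strings instead of neighborhood words scored with a substitution matrix, performs ungapped extension only, and the scoring values, the X-drop cutoff and all function names are assumptions chosen for illustration. The expectation value uses the standard Karlin-Altschul form E = K·m·n·exp(-λS), where the caller supplies λ and K.

```python
import math
from collections import defaultdict


def seed_words(query, w):
    """Split the query into all overlapping words (seeds) of length w."""
    return [(i, query[i:i + w]) for i in range(len(query) - w + 1)]


def index_subject(subject, w):
    """Map every word of length w in the subject to its start positions."""
    index = defaultdict(list)
    for j in range(len(subject) - w + 1):
        index[subject[j:j + w]].append(j)
    return index


def extend_one_side(query, subject, qi, sj, step, x_drop, match, mismatch):
    """Ungapped extension in one direction; stop when the running score
    falls more than x_drop below the best score seen so far."""
    best = score = 0
    best_len = length = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += match if query[qi] == subject[sj] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > x_drop:
            break
        qi += step
        sj += step
    return best, best_len


def extend_seed(query, subject, qi, sj, w, x_drop=5, match=1, mismatch=-2):
    """Extend an exact seed hit left and right into a high-scoring pair.
    Returns (score, query_start, subject_start, length)."""
    r_score, r_len = extend_one_side(query, subject, qi + w, sj + w, 1,
                                     x_drop, match, mismatch)
    l_score, l_len = extend_one_side(query, subject, qi - 1, sj - 1, -1,
                                     x_drop, match, mismatch)
    return w * match + l_score + r_score, qi - l_len, sj - l_len, w + l_len + r_len


def best_hsp(query, subject, w=4):
    """Return the best-scoring ungapped pair found from any seed, or None."""
    index = index_subject(subject, w)
    best = None
    for qi, word in seed_words(query, w):
        for sj in index.get(word, []):
            hsp = extend_seed(query, subject, qi, sj, w)
            if best is None or hsp[0] > best[0]:
                best = hsp
    return best


def expect_value(score, m, n, lam, k):
    """Karlin-Altschul expectation value E = K * m * n * exp(-lambda * S)."""
    return k * m * n * math.exp(-lam * score)
```

A shorter word length w produces more seed hits and so raises sensitivity at the cost of more extension work, which is the trade-off the abstract refers to when it discusses seed word length.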
2.
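This study collected samples of 51 common fish species, belonging to 6 orders, 10 families and 41 genera, from the Three Gorges Reservoir area, sequenced a 652 bp homologous fragment of the DNA barcode COI gene, and carried out a preliminary analysis of barcode sequence features, K2P genetic distances at different taxonomic levels, and phylogenetic relationships. The mean interspecific genetic distance within genera was 0.061, the mean intergeneric distance within families was 0.110, and the mean interfamily distance within orders was 0.206. DNA barcoding showed a high species identification rate...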
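For readers unfamiliar with the K2P (Kimura two-parameter) distance used in this abstract, the following is a minimal sketch of the standard formula d = -1/2 · ln((1 - 2P - Q) · sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions between two aligned sequences. The function name and the choice to skip gaps and ambiguous bases are assumptions for illustration; the abstract does not say which software the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Positions where either sequence has a gap or an ambiguous base
    are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the distance to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Averaging this pairwise distance over all species pairs within a genus, all genus pairs within a family, and so on gives mean distances of the kind reported above.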
3.
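The nitrogen cycle is a biogeochemical cycle mediated by microbial and chemical processes. Using gene sequencing to study the microbial communities, microorganisms and functional genes involved in the environmental nitrogen cycle is a major research focus in environmental genomics and microbial ecology. In recent years, many types of databases have been developed and applied to functional analysis. Drawing on the latest research, this article focuses on the functional genes of six microbially driven inorganic nitrogen cycling pathways: assimilatory nitrate reduction, dissimilatory nitrate reduction, denitrification, nitrogen fixation, nitrification (including complete ammonia oxidation) and anaerobic ammonium oxidation. It compares the design philosophy and features of the National Center for Biotechnology Information (NCBI), Integrated Microbial Genomes (IMG), Universal Protein (UniProt), Kyoto Encyclopedia of Genes and Genomes (KEGG), Protein Families (Pfam), Functional Gene (FunGene), Clusters of Orthologous Groups (COG) and NCycDB databases. Taking into account factors such as environmental medium, marker genes, analysis methods and alignment methods, it analyzes how these databases are selected and applied in annotating nitrogen cycle functional genes, and looks ahead to future directions for nitrogen cycle gene databases, with the aim of helping researchers understand nitrogen cycle gene families and choose a suitable data analysis platform.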
4.
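Integrated database systems for gene transcription regulation and their applications (Total citations: 1; self-citations: 0; citations by others: 1)
This article describes internet-accessible integrated database systems for gene transcription regulation and their applications. It covers the nature, composition and functions of databases and database systems for regulatory regions (3' and 5' regulatory regions, intron and exon regulatory regions, etc.), regulatory elements (promoters, enhancers, silencers, etc.) and transcription factor binding sites. It also introduces how to query and search these databases and systems, along with related software tools. These bioinformatics resources are a useful reference for teaching and research by institutions and individuals working in bioinformatics, molecular biology, genetic engineering, gene function, biotechnology, metabolic engineering, drug design, pathology and pharmacology.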
5.
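A bioinformatics study of photosynthetic system genes in cyanobacteria and plant chloroplasts (Total citations: 1; self-citations: 0; citations by others: 1)
BLAST was used to compare the nucleotide sequence homology of genes encoding photosynthetic system proteins in cyanobacteria and chloroplasts. The cyanobacteria were Synechocystis 6803 and Nostoc 7120; the chloroplasts came from liverwort, tobacco, rice, Euglena, black pine, maize, Porphyra, Arabidopsis and others. With the Synechocystis 6803 nucleotide sequence as the reference (100%), homology was compared against the other species. Among photosystem I genes, psaC had the highest homology (90.14%) and psaJ the lowest (52.24%). Among photosystem II genes, psbD had the highest (83.71%) and psbN the lowest (49.70%). Among ATP synthase genes, atpB had the highest (79.58%) and atpF the lowest (26.69%). Among cytochrome b6/f complex genes, petB had the highest (81.66%) and petA the lowest (55.27%). These data may provide some evidence on the origin and evolution of chloroplasts.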
6.
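Data mining of gene patents in China (Total citations: 1; self-citations: 0; citations by others: 1)
Statistics and data mining were carried out on the Chinese patented gene database (NASDAP, http://nasdap.generank.org/), giving an overall picture of gene patents in China and revealing the hot spots and weak areas in China's gene patent applications. Analysis of clustering results for the life cycles of patented genes yielded strategies for secondary innovation around a patented gene and showed China's attitude toward granting such applications. These results can serve as an important reference for formulating intellectual property strategy in high-tech life science fields such as drug development and disease diagnosis in China.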
7.
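Objective: To construct a BLAST database of cyp51A gene sequences from Aspergillus fumigatus in China. Methods: The cyp51A gene was sequenced in 143 drug-resistant and non-resistant clinical A. fumigatus isolates collected by our group, and the BLAST toolkit was used to compile the sequencing results into a local database. Results: The BLAST database contains cyp51A gene sequences from 143 resistant and non-resistant A. fumigatus isolates. Searching and aligning the cyp51A sequences of one resistant and one non-resistant test isolate against the database determined their cyp51A genotypes. Conclusion: The database is expected to be useful for searching and aligning cyp51A genes and for identifying cyp51A resistance genotypes of A. fumigatus.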
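The abstract does not give the commands the authors ran, so the following is only a hedged sketch of how a local nucleotide database is commonly built and queried with the BLAST+ command-line tools `makeblastdb` and `blastn`. The file names, the database name and the E-value cutoff are hypothetical. The helper functions only assemble argument lists and parse one line of the default tabular output (`-outfmt 6`); running the commands requires BLAST+ to be installed.

```python
# Column names of the default BLAST+ tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def makeblastdb_command(fasta_path, db_name):
    """Argument list that builds a nucleotide BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl", "-out", db_name]


def blastn_command(query_path, db_name, out_path, evalue=1e-5):
    """Argument list that searches a query FASTA against a local database."""
    return ["blastn", "-query", query_path, "-db", db_name,
            "-outfmt", "6", "-evalue", str(evalue), "-out", out_path]


def parse_tabular_line(line):
    """Parse one line of -outfmt 6 output into a dict with typed values."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(OUTFMT6_COLUMNS):
        raise ValueError("expected 12 tab-separated fields")
    hit = dict(zip(OUTFMT6_COLUMNS, fields))
    for key in ("pident", "evalue", "bitscore"):
        hit[key] = float(hit[key])
    for key in ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"):
        hit[key] = int(hit[key])
    return hit


if __name__ == "__main__":
    # Hypothetical file names; pass each list to subprocess.run(cmd, check=True)
    # to execute it on a machine where BLAST+ is installed.
    print(makeblastdb_command("cyp51A_isolates.fasta", "cyp51A_db"))
    print(blastn_command("test_isolate.fasta", "cyp51A_db", "hits.tsv"))
```

In a workflow like the one described, the top hit for a test isolate (highest bit score, or highest percent identity over the full gene length) would point to the database sequence whose genotype it most closely matches.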
8.
9.
10.
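A personal bioinformatics database platform based on Gene Ontology (Total citations: 3; self-citations: 0; citations by others: 3)
This paper describes BIO, a personal bioinformatics database platform based on Gene Ontology. To meet the needs of the authors' own research, the platform annotates gene sequence information with standardized Gene Ontology terms, so that users can query biological sequences from a Gene Ontology perspective. Because the terminology describing gene sequences in internet biological databases is largely inconsistent and unstandardized, and much of the information is redundant, this approach makes search results as precise as possible. The paper details the platform's research background, query functions and maintenance.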
11.
12.
Background
First-pass methods based on BLAST matches are commonly used as an initial step to separate the different phylogenetic histories of genes in microbial genomes and to target putative horizontal gene transfer (HGT) events. This will continue to be necessary given the rapid growth of genomic data and the technical difficulties in conducting large-scale explicit phylogenetic analyses. However, these methods often produce misleading results due to their inability to resolve indirect phylogenetic links and their vulnerability to stochastic events.
Results
A new computational method for rapid, exhaustive and genome-wide detection of HGT was developed, featuring the systematic analysis of BLAST hit distribution patterns in the context of a priori defined hierarchical evolutionary categories. Genes that fall beyond a series of statistically determined thresholds are identified as not adhering to the typical vertical history of the organisms in question, but instead having a putative horizontal origin. Tests on simulated genomic data suggest that this approach effectively targets atypically distributed genes that are highly likely to be HGT-derived, and exhibits robust performance compared to conventional BLAST-based approaches. This method was further tested on real genomic datasets, including Rickettsia genomes, and was compared to previous studies. Results show consistency with currently employed categories of HGT prediction methods. In-depth analysis of both simulated and real genomic data suggests that the method is notably insensitive to stochastic events such as gene loss, rate variation and database error, which are common challenges to the current methodology. An automated pipeline was created to implement this approach and was made publicly available at: https://github.com/DittmarLab/HGTector. The program is versatile, easily deployed, and has a low requirement for computational resources.
Conclusions
HGTector is an effective tool for initial or standalone large-scale discovery of candidate HGT-derived genes.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-717) contains supplementary material, which is available to authorized users.
13.
14.
Signature sequences are contiguous patterns of amino acids 10-50 residues long that are associated with a particular structure or function in proteins. These may be of three types (by our nomenclature): superfamily signatures, remnant homologies, and motifs. We have performed a systematic search through a database of protein sequences to automatically and preferentially find remnant homologies and motifs. This was accomplished in three steps: 1. We generated a nonredundant sequence database. 2. We used BLAST3 (Altschul and Lipman, Proc. Natl. Acad. Sci. U.S.A. 87:5509-5513, 1990) to generate local pairwise and triplet sequence alignments for every protein in the database vs. every other. 3. We selected "interesting" alignments and grouped them into clusters. We find that most of the clusters contain segments from proteins which share a common structure or function. Many of them correspond to signatures previously noted in the literature. We discuss three previously recognized motifs in detail (FAD/NAD-binding, ATP/GTP-binding, and cytochrome b5-like domains) to demonstrate how the alignments generated by our procedure are consistent with previous work and make structural and functional sense. We also discuss two signatures (for N-acetyltransferases and glycerol-phosphate binding) which to our knowledge have not been previously recognized.
15.
Basic Local Alignment Search Tool (BLAST) allows the comparison of one or more query sequences to a database of sequences and identifies those sequences that are similar to the query above a user-defined threshold. We have developed a user-friendly web application, MULTBLAST, that runs a series of BLAST searches on a user-supplied list of proteins against one or more target protein or nucleotide databases. The application pre-processes the data, launches each individual BLAST search on the University of Nevada, Reno's TimeLogic DeCypher® system (available from Active Motif, Inc.) and retrieves and combines all the results into a simple, easy-to-read output file. The output file presents the list of the query proteins, followed by the BLAST results for the matching sequences from each target database in consecutive columns. This format is especially useful for either comparing the results from the different target databases, or analyzing the results while keeping the identification of each target database separate.
Availability
The application is available at the URL http://blastpipe.biochem.unr.edu/
16.
Sansom C. Briefings in Bioinformatics, 2000, 1(1): 22-32
This review of sequence database searching aims to set out current practice in the area, in order to give practical guidelines to the experimental biologist. It describes the basic principles behind the programs and enumerates the range of databases available in the public domain. Of these, the most important are the equivalent DNA databases European Molecular Biology Laboratory (EMBL), GenBank and DNA Databank of Japan (DDBJ), and the protein databases Swiss-Prot and TrEMBL. The commonly used BLAST and FASTA algorithms are described in detail and alternative approaches mentioned briefly. Scoring matrices used to compare amino acid types during protein database searches are compared, with an emphasis on the PAM and BLOSUM series of observed substitution matrices.
17.
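Web-based BLAST programs (Total citations: 1; self-citations: 0; citations by others: 1)
NCBI BLAST (Basic Local Alignment Search Tool) is an excellent tool that facilitates research. This article introduces the web-based BLAST programs to deepen users' understanding of BLAST and help them make good use of it.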
18.
Zhi-li Pei, Xiao-hu Shi, Meng Niu, Xu-ning Tang, Li-sha Liu, Ying Kong, Yan-chun Liang. Journal of Bionic Engineering (English Edition), 2007, 4(3): 177-184
In bioinformatics, it is very important to use computers to annotate the function of newly sequenced biological sequences. Based on the GO database and the BLAST program, a novel method for the function annotation of new biological sequences is presented using variable-precision rough set theory. The proposed method is applied to real data in the GO database to examine its effectiveness. Numerical results show that the proposed method has better precision, recall rate and harmonic mean value than existing methods.
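The three evaluation measures named in this abstract can be made concrete with a short sketch. It assumes that predicted and true annotations for a sequence are each represented as a set of term identifiers, and that the "harmonic mean value" is the harmonic mean of precision and recall (the F1 score); the abstract does not spell out either point, and the function name and the term labels in the usage note are placeholders.

```python
def precision_recall_f1(predicted, actual):
    """Precision, recall and their harmonic mean for two sets of annotations."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, if a method predicts the terms {term_a, term_b, term_c, term_d} for a sequence whose true annotations are {term_a, term_b, term_e}, two of the four predictions are correct (precision 0.5) and two of the three true terms are recovered (recall 2/3), giving a harmonic mean of 4/7.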