Similar literature
1.
Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success in detecting coding regions in a gene. More recently, these methods have been used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence, which is then used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
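The abstract does not spell out the ICD transformation, so the following is only a minimal sketch of the generic DSP step it builds on: mapping a DNA sequence to four 0/1 indicator sequences and summing their discrete Fourier power spectra. The function name `power_spectrum` is hypothetical, and this is not the ICD method itself. A sequence with a strict three-base repeat shows the period-3 peak that DSP-based coding-region detectors look for.

```python
import cmath


def power_spectrum(seq):
    """Sum, over the bases A, C, G, T, of |DFT of the base's 0/1 indicator sequence|^2.

    This is a generic illustration, not the ICD transformation from the abstract.
    """
    n = len(seq)
    spectrum = [0.0] * n
    for base in "ACGT":
        # Indicator sequence: 1 where the base occurs, 0 elsewhere.
        indicator = [1.0 if ch == base else 0.0 for ch in seq]
        for k in range(n):
            # Direct O(n^2) DFT; fine for short illustrative sequences.
            coeff = sum(
                x * cmath.exp(-2j * cmath.pi * k * m / n)
                for m, x in enumerate(indicator)
            )
            spectrum[k] += abs(coeff) ** 2
    return spectrum


if __name__ == "__main__":
    spec = power_spectrum("ACG" * 4)  # length 12, exact period 3
    # The period-3 component sits at index n / 3 = 4.
    print(round(spec[4], 6), round(spec[1], 6))
```

For real sequences a fast Fourier transform would replace the direct sum; the direct form is used here only to keep the sketch dependency-free.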

2.
Summary: A measure of sequence similarity, d_t, that does not require prior sequence alignment gave correct results for a variety of computer-generated model sequences, without and with gaps, for all degrees of substitution, s. The measure was the squared Euclidean distance between vectors of counts of t-tuplets of characters in the two sequences. In models without gaps and without Needleman-Wunsch alignment, average d_t was very closely equal to twice the average conventional mismatch count, m. In these models one of each of the conditions on the Jukes-Cantor model was violated in turn: (1) both descendant lineages receive the same number of substitutions, (2) all sites are equally likely to be substituted, (3) all different replacement characters are equally likely to be chosen, and (4) all original characters are equally likely to be substituted. In Jukes-Cantor models with gaps, Needleman-Wunsch alignment was necessarily performed, a procedure that generally produced incorrect values of m. For these models average d_t was found to be very closely equal to twice the average m estimated from the known value of s using the inverted Jukes-Cantor formula.
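The measure described above, the squared Euclidean distance between t-tuple count vectors, can be sketched in a few lines. This is a minimal illustration assuming overlapping tuples; the function names are hypothetical and the abstract does not say whether the original counts overlap.

```python
from collections import Counter


def tuple_counts(seq, t):
    """Count every overlapping t-tuple (substring of length t) in seq."""
    return Counter(seq[i:i + t] for i in range(len(seq) - t + 1))


def squared_euclidean_distance(seq_a, seq_b, t):
    """Squared Euclidean distance between the t-tuple count vectors of two sequences."""
    counts_a = tuple_counts(seq_a, t)
    counts_b = tuple_counts(seq_b, t)
    # A Counter returns 0 for tuples it has not seen, so the union of keys suffices.
    return sum(
        (counts_a[w] - counts_b[w]) ** 2
        for w in set(counts_a) | set(counts_b)
    )


if __name__ == "__main__":
    # One substitution (m = 1) between the two sequences, tuples of length 1.
    print(squared_euclidean_distance("AAAA", "AAAT", 1))  # 2
```

With t = 1 and a single substitution the distance is 2, a small instance of the "twice the mismatch count" relation reported in the abstract.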

3.
The EMBL nucleotide sequence database   Cited by: 1 (self: 0, others: 1)
The European Molecular Biology Laboratory Nucleotide Sequence Database receives sequence and sequence annotation data from genome projects, sequencing centers, individual scientists, and patent offices. Data may be most efficiently submitted to the database using the Internet-based submission tool WEBIN or via previously established genome project accounts. Biologist curators will review the data and provide accession numbers within two working days. Non-confidential data are exchanged daily in an international collaboration between EMBL, DDBJ (the DNA Databank of Japan) and GenBank (USA) and may be accessed and retrieved via the Internet with the Sequence Retrieval System (SRS). Sequence database searching algorithms (e.g., Blitz, Fasta, Blast) are available for comparison of query to database sequences.

4.
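Sequence alignment is an important task in gene sequence analysis. Using human and mouse genes as examples, this paper introduces the sequence alignment methods in the MATLAB 7.X Bioinformatics Toolbox, including retrieving sequence information from databases, finding the open reading frames of a sequence, translating nucleotide sequences into amino acid sequences, drawing dot plots to compare two amino acid sequences, aligning with the Needleman-Wunsch and Smith-Waterman algorithms, and calculating the identity of two sequences.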
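The paper works with the MATLAB toolbox, whose calls are not reproduced here. As a language-neutral illustration of two of the listed steps, the following is a minimal Python sketch of Needleman-Wunsch global alignment followed by an identity calculation. The scoring values (match +1, mismatch -1, linear gap -1) and the choice of alignment length as the identity denominator are assumptions; real tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    al_a, al_b, s = needleman_wunsch("ACGT", "AGT")
    print(al_a, al_b, s, percent_identity(al_a, al_b))
```

Smith-Waterman local alignment differs mainly in clamping cell scores at zero and starting the traceback from the highest-scoring cell.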

5.
With the advent of high-throughput sequencing technology, sequences from many genomes are being deposited in public databases at a brisk rate. Open access to a large amount of expressed sequence tag (EST) data in the public databases has provided a powerful platform for simple sequence repeat (SSR) development in species where sequence information is not available. SSRs are markers of choice for their high reproducibility, abundant polymorphism and high inter-specific transferability. The mining of SSRs from ESTs requires several high-throughput computational tools that must be executed individually, which is computationally intensive and time-consuming. To reduce the time lag and to streamline the cumbersome process of SSR mining from ESTs, we have developed a user-friendly, web-based EST-SSR pipeline, "EST-SSR-MARKER PIPELINE (ESMP)". This pipeline integrates EST pre-processing, clustering, assembly and the subsequent mining of SSRs from assembled EST sequences. The mining of SSRs from ESTs provides valuable information on the abundance of SSRs in ESTs and will facilitate the development of markers for genetic analysis and related applications such as marker-assisted breeding. AVAILABILITY: The database is available for free at http://bioinfo.aau.ac.in/ESMP.

6.
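Genomic DNA of Lactococcus lactis was extracted by phenol extraction. A DNA fragment of about 1.6 kb containing the malolactic enzyme gene (mle) was amplified by PCR from the genomic DNA of the lactic acid bacterium and separated on a 1% agarose gel, and the target gene was recovered with a kit. The recovered gene was ligated into the pGEM-T vector to construct the mle-T vector, which was transformed into Escherichia coli DH5α; positive clones (white colonies) were picked, verified by restriction digestion, and sequenced. mle-T was digested with SalI, and the mle DNA fragment was recovered and ligated into the expression vector pET-28a to construct an Escherichia coli expression vector.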

7.
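Two important factors affect the speed of sequence comparison: (1) the time needed to start disk reads and writes, and (2) the time needed for disk seek and positioning and for data decoding. Corresponding improvements are proposed to increase comparison speed. Taking the identification of PCR amplification products as an example, the method is shown to have been applied successfully to the identification of unknown sequences.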

8.
We report here on the cloning, characterization and radiation hybrid mapping of the canine basic keratin gene KRT2p. The gene spans 8.3 kb, consists of nine exons and eight introns, and is characterized by the typical features of both basic keratins and keratins in general, including glycine-rich head and tail domains, which flank an α-helical rod domain of approximately 310 amino acids. Comparisons of sequence and structure reveal that canine KRT2p is strikingly similar to human KRT2p. Alignment of the predicted amino acid sequences for human and dog reveals greater than 80% identity. In the rod domain, the amino acid identity exceeds 90%. We note, however, that canine KRT2p encodes a protein 21 residues longer than human K2p due to the insertion of a glycine repeat motif, GG(G)X, in the head and tail domains of the canine gene. This is the first report of the nearly complete genome sequence for KRT2p of any organism. Radiation hybrid mapping of canine KRT2p to chromosome 27 of the dog is also reported.

9.
The utility of engineering enzyme activity is expanding with the development of biotechnology. Conventional methods have limited applicability because they require high-throughput screening or three-dimensional structures to identify target residues for activity control. An alternative method uses the sequence evolution produced by natural selection: over the course of evolution, a repertoire of mutations was selected that fine-tunes enzyme activities to adapt to varying environments. Here, we devised a strategy called sequence co-evolutionary analysis to control the efficiency of enzyme reactions (SCANEER), which scans the evolution of protein sequences and directs the mutation strategy to improve enzyme activity. We hypothesized that amino acid pairs for various enzyme activities were encoded in the evolutionary history of protein sequences, whereas loss-of-function mutations were avoided since those are depleted during evolution. SCANEER successfully predicted the enzyme activities of beta-lactamase and aminoglycoside 3′-phosphotransferase. SCANEER was further experimentally validated to control the activities of three different enzymes of great interest in chemical production: cis-aconitate decarboxylase, α-ketoglutaric semialdehyde dehydrogenase, and inositol oxygenase. Activity-enhancing mutations that improve substrate-binding affinity or turnover rate were found at sites distal from known active sites or ligand-binding pockets. We provide SCANEER to control desired enzyme activity through a user-friendly webserver.

10.
11.
The study of sequence diversity under phylogenetic models is now classic. Theoretical studies of diversity under the Kingman coalescent appeared shortly after the introduction of the coalescent. In this paper we revisit this topic under the multispecies coalescent, an extension of the single population model to multiple populations. We derive exact formulas for the sequence dissimilarity of two sequences drawn at random under a basic multispecies setup. The multispecies model uses three parameters: the species tree birth rate under the pure birth process (Yule), the species effective population size and the mutation rate. We also discuss the effects of relaxing some of the model assumptions.

12.
The metallochromic indicator murexide has been used to monitor calcium concentration changes during the dextran-induced, phosphatidylserine-dependent degranulation of rat peritoneal mast cells. The dextran-induced Ca2+ uptake showed an absolute dependence on the presence of phosphatidylserine. The extent of Ca2+ uptake increased with phosphatidylserine in a concentration-dependent manner. At 25 °C the half-life of the uptake process was 35 ± 5 s. Exposure of the mast cells to dextran in the presence of Ca2+, but in the absence of phosphatidylserine, desensitized the cells. The subsequent addition of phosphatidylserine failed to restore the Ca2+ uptake activity. However, the Ca2+ ionophore A23187 did promote Ca2+ uptake by the cells without phosphatidylserine.

13.
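Cloning and sequence analysis of prothymosin α genes   Cited by: 1 (self: 0, others: 1)
Four prothymosin α (ProTα) genes were cloned by RT-PCR from normal adult peripheral blood and fetal thymus. Sequence analysis showed that the nucleotide sequences of the four cloned ProTα genes are not identical. Compared with previously reported prothymosin α genes, the gene cloned from fetal thymus is almost unchanged, whereas those cloned from adult peripheral blood vary considerably, although the variable regions follow a pattern: all lack the bases GGGAATGCTAAT at positions 109-120, and the amino acids that change most often are aspartic acid and glutamic acid, while the first 28 amino acids of prothymosin α, the central acidic region and the terminal nuclear localization signal region vary little. These results provide information for studying the structure, function and evolution of prothymosin α.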

14.
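Cloning and sequence analysis of the yak CAPN1 gene   Cited by: 1 (self: 0, others: 1)
CAPN1 is a candidate gene for a quantitative trait locus (QTL) affecting meat tenderness. Specific primers were designed from the cattle CAPN1 gene sequence published in GenBank, and cDNA from Tianzhu white yak was used as the template for PCR amplification in segments, followed by cloning and sequencing. The sequencing results were assembled with the software BioEdit, giving a 2267 bp fragment of yak CAPN1 cDNA that contains a complete open reading frame (ORF) of 2151 bp together with partial sequences of the 3' and 5' untranslated regions (77 bp and 166 bp). The analysis showed that the coding region of the yak CAPN1 gene is 2151 bp long and encodes 716 amino acids. Compared with the reported cattle, pig, human and mouse sequences, nucleotide homology was 99.3%, 93.9%, 90.0% and 85.5%, and predicted amino acid homology was 99.4%, 96.1%, 94.6% and 89.0%, respectively. NCBI BLAST searches of each of the four yak CAPN1 domains showed that all four domains are well conserved in these four species, with domain IV the most conserved (>96%). Of the 14 nucleotide differences between yak and cattle, 3 result in amino acid changes, all of them in domain III. The molecular phylogenetic tree that was constructed clusters the species in agreement with traditional taxonomy.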

15.
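DNA or oligonucleotides containing immunostimulatory sequences can broadly activate the immune system of vertebrates. This article reviews research progress on the structural basis and the cellular and molecular mechanisms of the immunostimulatory action of DNA.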

16.
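Sequence analysis of a 12S rRNA gene fragment of Araneus ventricosus   Cited by: 2 (self: 1, others: 2)
A partial sequence of the 12S rRNA gene of Araneus ventricosus was determined, yielding a sequence of 276 bp in which the contents of A, T, G and C were 104 bp (37.68%), 96 bp (34.78%), 33 bp (11.95%) and 43 bp (15.57%), respectively. The work is intended to provide some data for molecular systematic studies of the order Araneae in China.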
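The base composition reported above is simple arithmetic on base counts. The following minimal sketch (the function name `base_composition` is hypothetical) shows the calculation; the example input is a synthetic sequence built only to have the reported counts, not the actual 12S rRNA fragment.

```python
from collections import Counter


def base_composition(seq):
    """Return {base: (count, percent rounded to 2 decimals)} for A, T, G and C."""
    seq = seq.upper()
    counts = Counter(seq)
    total = len(seq)
    return {
        base: (counts[base], round(100.0 * counts[base] / total, 2))
        for base in "ATGC"
    }


if __name__ == "__main__":
    # Synthetic stand-in with the reported counts (104 A, 96 T, 33 G, 43 C).
    synthetic = "A" * 104 + "T" * 96 + "G" * 33 + "C" * 43
    for base, (count, percent) in base_composition(synthetic).items():
        print(base, count, percent)
```

Rounding to two decimals gives 37.68%, 34.78%, 11.96% and 15.58%; the figures of 11.95% and 15.57% in the abstract are consistent with truncating rather than rounding.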

17.
Protein trafficking or protein sorting in eukaryotes is a complicated process and is carried out based on the information contained in the protein. Many methods have been reported for predicting the subcellular location of proteins from sequence information. However, most of these prediction methods use a flat structure or parallel architecture to perform prediction. In this work, we introduce ensemble classifiers with features that are extracted directly from full-length protein sequences to predict locations in the protein-sorting pathway hierarchically. Sequence-driven features, sequence-mapped features and sequence autocorrelation features were tested with ensemble learners and their performances were compared. When evaluated by independent data testing, ensemble-based bagging algorithms with the sequence feature composition, transition and distribution (CTD) successfully classified two datasets with accuracies greater than 90%. We compared our results with similar published methods, and our method performed as well as the others at two levels in the secreted pathway. This study shows that the feature CTD extracted from protein sequences is effective in capturing biological features among compartments in secreted pathways.

18.
We recently introduced a new molecular evolution model called the IDIS model (Insertion Deletion Independent of Substitution). In the IDIS model, the three independent processes of substitution, insertion and deletion of residues have constant rates. In order to control genome expansion during evolution, we generalize the IDIS model here by introducing an insertion rate that decreases as the sequence grows and tends to 0 at a maximum sequence length n_max.

19.
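This article reviews research progress on the basis for applications of immunostimulatory sequences (ISS). ISS-DNA is not only expected to become a new generation of highly effective, low-toxicity immune adjuvants, but also has very promising prospects in anti-tumor and anti-infection therapy, promoting hematopoiesis, treating immunodeficiency, and preventing and treating asthma.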

20.
BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences   Cited by: 49 (self: 0, others: 49)
'BLAST 2 Sequences', a new BLAST-based tool for aligning two protein or nucleotide sequences, is described. While the standard BLAST program is widely used to search for homologous sequences in nucleotide and protein databases, one often needs to compare only two sequences that are already known to be homologous, coming from related species or, e.g., different isolates of the same virus. In such cases searching the entire database would be unnecessarily time-consuming. 'BLAST 2 Sequences' utilizes the BLAST algorithm for pairwise DNA-DNA or protein-protein sequence comparison. A World Wide Web version of the program can be used interactively at the NCBI WWW site (http://www.ncbi.nlm.nih.gov/gorf/bl2.html). The resulting alignments are presented in both graphical and text form. The variants of the program for PC (Windows), Mac and several UNIX-based platforms can be downloaded from the NCBI FTP site (ftp://ncbi.nlm.nih.gov).

