Similar articles
1.
A program package is described for the management and the analysis of DNA sequence data. The programs, with the exception of a few Fortran routines, are written in the programming language APL. They are best used interactively, although batch processing is possible. The package has been in constant use for about 3 years and contains programs for most of the routine problems presently found in a DNA sequencing laboratory.

2.
A strategy of DNA sequencing employing computer programs. Total citations: 65 (self-citations: 31, citations by others: 34)
With modern fast sequencing techniques and suitable computer programs it is now possible to sequence whole genomes without the need of restriction maps. This paper describes computer programs that can be used to order both sequence gel readings and clones. A method of coding for uncertainties in gel readings is described. These programs are available on request.

3.
Informatics for protein identification by mass spectrometry. Total citations: 3 (self-citations: 0, citations by others: 3)
High throughput protein analysis (i.e., proteomics) first became possible when sensitive peptide mass mapping techniques were developed, thereby allowing for the possibility of identifying and cataloging most 2D gel electrophoresis spots. Shortly thereafter a few groups pioneered the idea of identifying proteins by using peptide tandem mass spectra to search protein sequence databases. Hence, it became possible to identify proteins from very complex mixtures. One drawback to these latter techniques is that it is not entirely straightforward to make matches using tandem mass spectra of peptides that are modified or have sequences that differ slightly from what is present in the sequence database that is being searched. This has been part of the motivation behind automated de novo sequencing programs that attempt to derive a peptide sequence regardless of its presence in a sequence database. The sequence candidates thus generated are then subjected to homology-based database search programs (e.g., BLAST or FASTA). These homology search programs, however, were not developed with mass spectrometry in mind, and it became necessary to make minor modifications such that mass spectrometric ambiguities can be taken into account when comparing query and database sequences. Finally, this review will discuss the important issue of validating protein identifications. All of the search programs will produce a top ranked answer; however, only the credulous are willing to accept them carte blanche.

4.
MOTIVATION: Multiple sequence alignments (MSAs) are at the heart of bioinformatics analysis. Recently, a number of multiple protein sequence alignment benchmarks (i.e. BAliBASE, OXBench, PREFAB and SMART) have been released to evaluate new and existing MSA applications. These databases have been well received by researchers and help to quantitatively evaluate MSA programs on protein sequences. Unfortunately, analogous DNA benchmarks are not available, making evaluation of MSA programs difficult for DNA sequences. RESULTS: This work presents the first known multiple DNA sequence alignment benchmarks that are (1) composed of protein-coding portions of DNA and (2) based on biological features such as the tertiary structure of encoded proteins. These reference DNA databases contain a total of 3545 alignments, comprising 68 581 sequences. Two versions of the database are available: mdsa_100s and mdsa_all. The mdsa_100s version contains the alignments of the data sets for which TBLASTN found 100% sequence identity for each sequence. The mdsa_all version includes all hits with an E-value score above the threshold of 0.001. A primary use of these databases is to benchmark the performance of MSA applications on DNA data sets. The first such case study is included in the Supplementary Material.

5.
Computer programs for the assembly of DNA sequences. Total citations: 26 (self-citations: 20, citations by others: 6)
A collection of user-interactive computer programs is described which aid in the assembly of DNA sequences. This is achieved by searching for the positions of overlapping common nucleotide sequences within the blocks of sequence obtained as primary data. Such overlapping segments are then melded into one continuous string of nucleotides. Strategies for determining the accuracy of the sequence being analyzed and reducing the error rate resulting from the manual manipulation of sequence data are discussed. Sequences mapping from 97.3 to 100% of the Ad2 virus genome were used to demonstrate the performance of these programs.
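
The overlap-and-meld idea in this abstract can be illustrated with a minimal Python sketch. It is not the programs described in the paper: the function names (`overlap_length`, `meld`), the exact-match rule and the `min_overlap` parameter are all assumptions chosen for illustration, and real assembly software must also handle sequencing errors and reverse complements.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that equals a prefix of `right`.

    Hypothetical helper: only exact matches are considered.
    """
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first and shrink until one matches.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def meld(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two fragments on their shared overlap, or return None if there is none."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    # Keep `left` whole and append only the part of `right` past the overlap.
    return left + right[size:]


# Made-up fragments that share the six bases "CGTTAG".
fragment_a = "ATGCGTACGTTAG"
fragment_b = "CGTTAGGCTAAC"
contig = meld(fragment_a, fragment_b)
print(contig)
```

Trying the longest overlap first means the two fragments are joined on their largest shared stretch, which is the conservative choice when short repeats are present.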

6.
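This paper presents worked examples of the sequence analysis programs in EMBOSS, the European Molecular Biology Open Software Suite. Section 1 briefly introduces the EMBOSS package and its basic usage. Section 2 covers common sequence-handling programs for format conversion, sequence extraction, sequence transformation and sequence display. Section 3 covers sequence alignment programs, including pairwise alignment, multiple alignment and dot-plot programs. Section 4 covers common nucleic acid sequence analysis programs, which can be used for nucleotide composition statistics, open reading frame analysis, C...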

7.
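This paper describes a computer program system for nucleic acid sequence analysis implemented on a microcomputer (IBM PC). The system consists of three levels and 18 functional modules; menus and interactive dialogue let users learn and use it quickly. The programming uses data-structure techniques such as tree structures, last-in-first-out stacks and sparse matrices; statistical methods such as the Bayes method; and graph-theoretic methods such as Kruskal's algorithm and Floyd's algorithm. The release of this software system should be of some benefit to molecular biology research.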

8.
MicroRNAs (miRNAs) are small non-coding RNA molecules that regulate mRNAs through a sequence-specific mechanism. By virtue of their structure and mechanism of action, computational methods have been devised to investigate the encoding of miRNA genes and the targets of miRNA action. A variety of assumptions have predicated the implementation of these various computational solutions. Evolutionary sequence conservation, secondary structure, and folding energetics are some of the assumptions that have been used. The success of these different computational solutions has been evaluated for both elucidation of new miRNAs and deducing targets of miRNA action. While the focus is on search techniques for new miRNAs, we have compared the programs miRseeker, miRScan, PalGrade, ProMiR, and miRAlign as examples of implementation of these techniques. For these programs, a benchmark comparison between theoretical estimation and actual identification is possible. We have also compared the target prediction programs TargetScanS, PicTar, DIANA-microT, miRanda, and RNAhybrid. However, it is difficult to rigorously assess the benchmark performance of these programs due to the difficulty in confirming their theoretical predictions.

9.
Macintosh sequence analysis software. Total citations: 3 (self-citations: 0, citations by others: 3)
The analysis of information in nucleotide and amino acid sequence data from an investigator’s own laboratory, or from the ever-growing worldwide databases, is critically dependent on well planned and written software. Although the most powerful packages previously have been confined to workstations, there has been a dramatic increase over the last few years in the sophistication of the programs available for personal computers, as the speed and power of these have increased. A wide choice of software is available for the Macintosh, including the LaserGene suite of programs from DNAStar. This review assesses the strengths and weaknesses of LaserGene and concludes that it provides a useful and comprehensive range of sequence analysis tools.

10.
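This paper reports a series of working programs for nucleic acid data processing implemented on an Apple II microcomputer. With these programs one can store nucleic acid data, modify the structure of specified nucleic acid data, search for restriction endonuclease recognition sites, translate nucleic acid sequences into protein sequences, compare the homology of related nucleic acid and protein sequences, tabulate codon usage frequencies, and make a preliminary exploration of gene promoter structure.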

11.
Microcomputer programs for DNA sequence analysis. Total citations: 21 (self-citations: 5, citations by others: 16)
Computer programs are described which allow (a) analysis of DNA sequences to be performed on a laboratory microcomputer or (b) transfer of DNA sequences between a laboratory microcomputer and another computer system, such as a DNA library. The sequence analysis programs are interactive, do not require prior experience with computers and in many other respects resemble programs which have been written for larger computer systems (1-7). The user enters sequence data into a text file, accesses this file with the programs, and is then able to (a) search for restriction enzyme sites or other specified sequences, (b) translate in one or more reading frames in one or both directions in order to find open reading frames, or (c) determine codon usage in the sequence in one or more given reading frames. The results are given in table format and a restriction map is generated. The modem program permits collection of large amounts of data from a sequence library into a permanent file on the microcomputer disc system, or transfer of laboratory data in the reverse direction to a remote computer system.
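
Two of the analyses listed in this abstract, searching for a restriction enzyme site and counting codon usage in a given reading frame, can be sketched in a few lines of Python. This is an illustration only and not the original microcomputer programs: the function names `find_sites` and `codon_usage` and the example sequence are invented here, and only the forward strand is examined. GAATTC is the recognition sequence of EcoRI.

```python
from collections import Counter
from typing import List


def find_sites(sequence: str, site: str) -> List[int]:
    """Return the 1-based start position of every occurrence of `site`.

    Overlapping occurrences are reported too, because the search resumes
    one base after each hit.
    """
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(site, start + 1)
    return positions


def codon_usage(sequence: str, frame: int = 0) -> Counter:
    """Count complete codons in reading frame 0, 1 or 2 of the forward strand."""
    codons = (sequence[i:i + 3] for i in range(frame, len(sequence) - 2, 3))
    return Counter(codons)


# Made-up example sequence containing two EcoRI sites (GAATTC).
dna = "ATGGAATTCGCTGAATTCTAA"
ecori_sites = find_sites(dna, "GAATTC")
usage = codon_usage(dna)

print(ecori_sites)
for codon, count in sorted(usage.items()):
    print(codon, count)
```

Reporting 1-based positions follows the usual convention for sequence coordinates; a tabular listing like the loop at the end loosely mirrors the table-format results the abstract mentions.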

12.
An overview of some commonly used software in genomics research. Total citations: 1 (self-citations: 0, citations by others: 1)
Wu Qingfa (吴清发). Yi Chuan (《遗传》), 2003, 25(6): 708-712
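Genomics is a foundational discipline that takes the complete genetic information of a species as its object of study and examines, at the whole-genome level, the underlying mechanisms of that information's molecular composition, organization, expression regulation and evolution. Storing, managing and retrieving the massive data sets of genomics research, and mining those data, requires bioinformatics methods. Many mature software packages are now widely used in genomics research, and most can be accessed or obtained free of charge over the Internet. This paper introduces the principles of some software commonly used in the Human Genome Project, including sequence alignment, sequence assembly, repeat identification and gene prediction, and illustrates each with typical programs. Abstract: Genomics is a novel subject that has developed alongside the progress of the Human Genome Project. Genomics deals with the chemical composition, structural organization and evolution of the genome at a global level. Because genomics involves huge amounts of data, bioinformatics plays an important role in data production, data management and data mining. At present, many reliable programs have been used successfully in genomic research, and they are usually freely accessible and downloadable. We address here the principles of some programs widely used in genomics, such as sequence alignment, sequence assembly, repeat identification and gene prediction, each exemplified with typical programs.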

13.
ACNUC is a database structure and retrieval software for use with either the GenBank or EMBL nucleic acid sequence data collections. The nucleotide and textual data furnished by both collections are each restructured into a database that allows sequence retrieval on a multi-criterion basis. The main selection criteria are: species (or higher order taxon), keyword, reference, journal, author, and organelle; all logical combinations of these criteria can be used. Direct access to sequence regions that code for a specific product (protein, tRNA or rRNA) is provided. A versatile extraction procedure copies selected sequences, or fragments of them, from the database to user files suitable to be analysed by user-supplied application programs. A detailed help mechanism is provided to aid the user at any time during the retrieval session. All software has been written in FORTRAN 77 which guarantees a high degree of transportability to minicomputers or mainframes. Received on May 1, 1985; accepted on June 13, 1985

14.
15.
De novo design of the hydrophobic cores of proteins. Total citations: 22 (self-citations: 17, citations by others: 5)
We have developed and experimentally tested a novel computational approach for the de novo design of hydrophobic cores. A pair of computer programs has been written, the first of which creates a "custom" rotamer library for potential hydrophobic residues, based on the backbone structure of the protein of interest. The second program uses a genetic algorithm to globally optimize for a low energy core sequence and structure, using the custom rotamer library as input. Success of the programs in predicting the sequences of native proteins indicates that they should be effective tools for protein design. Using these programs, we have designed and engineered several variants of the phage 434 cro protein, containing five, seven, or eight sequence changes in the hydrophobic core. As controls, we have produced a variant consisting of a randomly generated core with six sequence changes but equal volume relative to the native core and a variant with a "minimalist" core containing predominantly leucine residues. Two of the designs, including one with eight core sequence changes, have thermal stabilities comparable to the native protein, whereas the third design and the minimalist protein are significantly destabilized. The randomly designed control is completely unfolded under equivalent conditions. These results suggest that rational de novo design of hydrophobic cores is feasible, and stress the importance of specific packing interactions for the stability of proteins. A surprising aspect of the results is that all of the variants display highly cooperative thermal denaturation curves and reasonably dispersed NMR spectra. This suggests that the non-core residues of a protein play a significant role in determining the uniqueness of the folded structure.

16.
A computer package written in Fortran-IV for the PDP-11 minicomputer is described. The package's novel features are: software for voice-entry of sequence data; a less memory intensive algorithm for optimal sequence alignment; and programs that fit statistical models to nucleic acid and protein sequences.
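
The abstract does not say how the package reduces memory use during alignment. One common way to do so, shown here purely as an assumed illustration and not as the package's algorithm, is to compute the optimal global alignment score while keeping only two rows of the dynamic-programming table. The function name `alignment_score` and the scoring values (match +1, mismatch -1, gap -1) are arbitrary choices for this sketch.

```python
def alignment_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Return the optimal global alignment score of `a` and `b`.

    Only the previous and current rows of the scoring table are kept, so
    memory grows with len(b) rather than with len(a) * len(b). The score
    is returned without the alignment itself.
    """
    # Row 0: aligning the empty prefix of `a` against each prefix of `b`.
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of `b`.
        current = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current.append(max(
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            ))
        previous = current
    return previous[-1]


print(alignment_score("ACGT", "AGT"))
```

Recovering the alignment itself in linear space needs an extra divide-and-conquer step (as in Hirschberg's method), which this sketch leaves out.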

17.
A new computer search strategy has been devised for high-resolution nucleotide sequence analysis. The strategy differs from those used by earlier sequence analysing programs in that it is exhaustive and capable of detecting all possible homologies and other types of relationships between or within sequences irrespective of the pattern of matches and mismatches encountered. The implementation of this strategy into a working algorithm is described. Received on March 1, 1985; accepted on April 24, 1985

18.

Background  

Accurate sequence alignments are essential for homology searches and for building three-dimensional structural models of proteins. Since structure is better conserved than sequence, structure alignments have been used to guide sequence alignments and are commonly used as the gold standard for sequence alignment evaluation. Nonetheless, as far as we know, there is no report of a systematic evaluation of pairwise structure alignment programs in terms of the sequence alignment accuracy.

19.
We describe the further development of a widely used package of DNA and protein sequence analysis programs for microcomputers (1,2,3). The package now provides a screen oriented user interface, and an enhanced working environment with powerful formatting, disk access, and memory management tools. The new GenBank floppy disk database is supported transparently to the user and a similar version of the NBRF protein database is provided. The programs can use sequence file annotation to automatically annotate printouts and translate or extract specified regions from sequences by name. The sequence comparison programs can now perform a 5000 X 5000 bp analysis in 12 minutes on an IBM PC. A program to locate potential protein coding regions in nucleic acids, a digitizer interface, and other additions are also described.

20.
Further procedures for sequence analysis by computer. Total citations: 80 (self-citations: 41, citations by others: 39)
A previous paper (1) described programs for sequence data handling and analysis by computer. The facilities of this basic set are extended by further easily used programs.
