Similar articles
1.
Applications of simulated annealing to peptides   Total citations: 2, self-citations: 0, citations by others: 2
S R Wilson, W L Cui. Biopolymers, 1990, 29(1): 225-235
We report the application of a new conformation searching algorithm called simulated annealing to the location of the global minimum energy conformation of peptides. Simulated annealing is a Metropolis Monte Carlo approach to conformation generation in which both the energy and temperature dependence of the Boltzmann distribution guide the search for the global minimum. Both uphill and downhill moves are possible, which allows the molecule to escape from local minima. Applications to the 20 natural amino acid "dipeptide models" as well as to polyalanines up to Ala80 are very successful in finding the lowest energy conformation. A history file of the simulated annealing process allows reconstruction and examination of the random walk around conformation space. A separate program, Conf-Gen, reads the history file and extracts all low-energy conformations visited during the run.
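
The Metropolis acceptance rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' program: the one-dimensional energy function supplied by the caller, the geometric cooling schedule and every name (`metropolis_accept`, `simulated_annealing`, `history`) are assumptions made for illustration, and the `history` list only stands in for the history file mentioned above.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Accept downhill moves always, uphill moves with probability exp(-dE/T)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def simulated_annealing(energy, x0, step=0.5, t_start=5.0, t_end=1e-3,
                        cooling=0.95, moves_per_t=50, seed=0):
    """Toy annealer on a 1-D coordinate (assumed stand-in for a conformation)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    history = [(x, e)]          # every accepted state, in visiting order
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            candidate = x + rng.uniform(-step, step)   # random local move
            e_candidate = energy(candidate)
            if metropolis_accept(e_candidate - e, t, rng):
                x, e = candidate, e_candidate
                history.append((x, e))
                if e < best_e:
                    best_x, best_e = x, e
        t *= cooling            # geometric cooling (an assumed schedule)
    return best_x, best_e, history
```

Because uphill moves are accepted with a probability that shrinks as the temperature falls, the walk can leave a local minimum early in the run and settles later; scanning `history` afterwards for low-energy entries mirrors, in spirit, what the abstract says Conf-Gen does with the history file.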

2.
Next‐generation sequencing technologies are extensively used in the field of molecular microbial ecology to describe taxonomic composition and to infer functionality of microbial communities. In particular, the so‐called barcode or metagenetic applications that are based on PCR amplicon library sequencing are very popular at present. One of the problems related to the use of data from these libraries is the analysis of read quality and the removal (trimming) of low‐quality segments, while retaining sufficient information for subsequent analyses (e.g. taxonomic assignment). Here, we present StreamingTrim, a DNA read trimming software, written in Java, with which researchers are able to analyse the quality of DNA sequences in fastq files and to search for low‐quality zones in a very conservative way. This software has been developed with the aim of providing a tool capable of trimming amplicon library data while retaining as much taxonomic information as possible. The software is equipped with a graphical user interface for user‐friendly operation. Moreover, from a computational point of view, StreamingTrim reads and analyses sequences one by one from an input fastq file, without keeping anything in memory, which allows the computation to run on a normal desktop PC or even a laptop. Trimmed sequences are saved in an output file, and a statistics summary is displayed that contains the mean and standard deviation of the length and quality of the whole sequence file. Compiled software, a manual and example data sets are available under the BSD‐2‐Clause License at the GitHub repository at https://github.com/GiBacci/StreamingTrim/ .
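
The record-by-record processing described in this abstract can be sketched in a few lines. This is a generic illustration, not StreamingTrim's algorithm (which is written in Java and whose trimming rule is not given here): the simple 3'-end cut-off, the Phred+33 encoding, the strict four-line FASTQ layout and the names `read_fastq`, `trim_3prime` and `trim_stream` are all assumptions.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) one record at a time.

    Assumes the plain four-line FASTQ layout; nothing is accumulated in memory.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()                      # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def trim_3prime(seq, qual, min_q=20, offset=33):
    """Drop trailing bases whose Phred score is below min_q (assumed rule)."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def trim_stream(handle, min_q=20):
    """Lazily trim each record of an open FASTQ handle."""
    for header, seq, qual in read_fastq(handle):
        yield (header, *trim_3prime(seq, qual, min_q))


if __name__ == "__main__":
    demo = io.StringIO("@r1\nACGTACGT\n+\nIIIII###\n")
    for record in trim_stream(demo):
        print(record)
```

Using generators means that only one read is held at a time, which is the property the abstract highlights as making the analysis feasible on a desktop PC or laptop.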

3.
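Construction and application of a PC/Linux-based nucleic acid sequence analysis system   Total citations: 13, self-citations: 2, citations by others: 11
Using a PC running the Linux operating system together with the Phred/Phrap/Consed and Blast software, we built a system for large-scale automated analysis of nucleic acid sequences. The system automatically converts sequencing chromatograms into nucleic acid sequences, removes vector sequences, assembles sequences, identifies repeat sequences and performs sequence similarity analysis, which can speed up the analysis and use of large-scale sequencing data.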

4.
MOSIX is a cluster management system that supports preemptive process migration. This paper presents the MOSIX Direct File System Access (DFSA), a provision that can improve the performance of cluster file systems by allowing a migrated process to directly access files in its current location. This capability, when combined with an appropriate file system, could substantially increase the I/O performance and reduce the network congestion by migrating an I/O intensive process to a file server rather than the traditional way of bringing the file's data to the process. DFSA is suitable for clusters that manage a pool of shared disks among multiple machines. With DFSA, it is possible to migrate parallel processes from a client node to file servers for parallel access to different files. Any consistent file system can be adjusted to work with DFSA. To test its performance, we developed the MOSIX File-System (MFS) which allows consistent parallel operations on different files. The paper describes DFSA and presents the performance of MFS with and without DFSA.

5.
Microcomputer programs for DNA sequence analysis.   Total citations: 21, self-citations: 5, citations by others: 16
Computer programs are described which allow (a) analysis of DNA sequences to be performed on a laboratory microcomputer or (b) transfer of DNA sequences between a laboratory microcomputer and another computer system, such as a DNA library. The sequence analysis programs are interactive, do not require prior experience with computers and in many other respects resemble programs which have been written for larger computer systems (1-7). The user enters sequence data into a text file, accesses this file with the programs, and is then able to (a) search for restriction enzyme sites or other specified sequences, (b) translate in one or more reading frames in one or both directions in order to find open reading frames, or (c) determine codon usage in the sequence in one or more given reading frames. The results are given in table format and a restriction map is generated. The modem program permits collection of large amounts of data from a sequence library into a permanent file on the microcomputer disc system, or transfer of laboratory data in the reverse direction to a remote computer system.
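
The three analyses listed in this abstract (site search, translation in a chosen reading frame and direction, codon usage) can be sketched as follows. This is an illustrative Python reconstruction, not the original microcomputer programs; the function names are assumptions, and the codon table is the standard genetic code written in the usual TCAG order.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, site):
    """Return 0-based start positions of every (possibly overlapping) match."""
    hits, start = [], seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits


def codons(seq, frame=0):
    """Split seq into complete codons starting at offset frame (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def translate(seq, frame=0):
    """Translate one reading frame; unknown codons become 'X', stops '*'."""
    return "".join(CODON_TABLE.get(c, "X") for c in codons(seq, frame))


def codon_usage(seq, frame=0):
    """Count how often each codon occurs in the given frame."""
    return Counter(codons(seq, frame))
```

Translating `reverse_complement(seq)` in frames 0 to 2 covers the "both directions" case, and stretches between `*` characters in the output are the candidate open reading frames.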

6.
A web application for identifying and visualising protein regions encoded by exons is presented. The Exon Visualiser can be used for visualisation at different levels of protein structure: at the primary (sequence) level, at the level of secondary structures, and at the level of tertiary protein structure. The programme is suitable for processing data for all genes whose protein products are deposited in the PDB database. The application implements the following steps: I) loading exon sequences and their coordinates from a GenBank file, as well as protein sequences (the CDS from GenBank and the amino acid sequence from PDB); II) creating a consensus sequence (comparing the amino acid sequence from the PDB file with the CDS sequence from the GenBank file); III) matching exon coordinates; IV) visualisation in 2D and 3D protein structures. Among other features, the web tool provides a colour-coded graphical display of protein sequences and of chains in three-dimensional protein structures, correlated with the corresponding exons.

Availability

http://149.156.12.53/ExonVisualiser/

7.
The partial amino acid sequences of 121 rice proteins separated by two-dimensional gel electrophoresis (2D-PAGE) were determined for a protein sequence data file. In the Rice Genome Research Program (RGP), more than 20,000 cDNA clones randomly selected from rice cDNA libraries have been sequenced to construct a cDNA catalog. Complementary DNAs encoding about 30% of proteins in the protein sequence data file could be identified in the catalog by computer search. It was deduced that 20,000–40,000 genes are present in the rice genome. Only half of about 20,000 cDNAs sequenced in the RGP, corresponding to 1/4–1/2 of genes present in the entire rice genome, should have unique sequences after considering gene redundancy. This is consistent with the fact that the cDNAs encoding about 30% of the sequenced proteins could be identified in the catalog. If the size of the cDNA catalog is enlarged further, cDNAs encoding all proteins separated by 2D-PAGE could be easily identified from the catalog by using the protein sequence data.

8.
The interaction between rapidly evolving centromere sequences and conserved kinetochore machinery appears to be mediated by centromere-binding proteins. A recent theory proposes that the independent evolution of centromere-binding proteins in isolated populations may be a universal cause of speciation among eukaryotes. In Drosophila the centromere-specific histone, Cid (centromere identifier), shows extensive sequence divergence between D. melanogaster and the D. simulans clade, indicating that centromere machinery incompatibilities may indeed be involved in reproductive isolation and speciation. However, it is presently unclear whether the adaptive evolution of Cid was a cause of the divergence between these species, or merely a product of postspeciation adaptation in the separate lineages. Furthermore, the extent to which divergent centromere identifier proteins provide a barrier to reproduction remains unknown. Interestingly, a small number of rescue lines from both D. melanogaster and D. simulans can restore hybrid fitness. Through comparisons of cid sequence between nonrescue and rescue strains, we show that cid is not involved in restoring hybrid viability or female fertility. Further, we demonstrate that divergent cid alleles are not sufficient to cause inviability or female sterility in hybrid crosses. Our data do not dispute the rapid divergence of cid or the coevolution of centromeric components in Drosophila; however, they do suggest that cid underwent adaptive evolution after D. melanogaster and D. simulans diverged and, consequently, is not a speciation gene.

9.
We have selected 210 mutants able to grow on sucrose in the presence of 2-deoxyglucose. We identified recessive mutations in three major complementation groups that cause constitutive (glucose-insensitive) secreted invertase synthesis. Two groups comprise alleles of the previously identified HXK2 and REG1 genes, and the third group was designated cid1 (constitutive invertase derepression). The effect of cid1 on SUC2 expression is mediated by the SUC2 upstream regulatory region, as judged by the constitutive expression of a SUC2-LEU2-lacZ fusion in which the LEU2 promoter is under control of SUC2 upstream sequences. A cid1 mutation also causes glucose-insensitive expression of maltase. The previously isolated constitutive mutation ssn6 is epistatic to cid1, reg1 and hxk2 for very high level constitutive invertase expression. Mutations in SNF genes that prevent derepression of invertase are epistatic to cid1, reg1 and hxk2; we have previously shown that ssn6 has different epistasis relationships with snf mutations. The constitutive mutation tup1 was found to resemble ssn6 in its genetic interactions with snf mutations. These findings suggest that CID1, REG1 and HXK2 are functionally distinct from SSN6 and TUP1.

10.
A computer software package called 'FasParser' was developed for manipulating sequence data. It can be used on personal computers to perform a series of analyses, including counting and viewing differences between two sequences at both DNA and codon levels, identifying overlapping regions between two alignments, sorting sequences according to their IDs or lengths, concatenating sequences of multiple loci for a particular set of samples, translating nucleotide sequences to amino acids, and constructing alignments in several different formats, as well as extracting and filtering data for a particular FASTA file. The majority of these functions can be run in a batch mode, which is very useful for analyzing large data sets. This package can be used by a broad audience, and is designed for researchers who do not have programming experience in sequence analyses. The GUI version of FasParser can be downloaded from https://github.com/Sun-Yanbo/FasParser, free of charge.
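
Two of the operations listed in this abstract, sorting sequences by length and concatenating multiple loci per sample, can be sketched in Python. This is not FasParser's code: the in-memory dictionaries, the gap padding for samples missing from a locus and the names `parse_fasta`, `sort_by_length` and `concatenate_loci` are assumptions, and each locus is assumed to be a non-empty alignment of well-formed FASTA records.

```python
def parse_fasta(text):
    """Parse FASTA text into {sample_id: sequence}, keeping input order."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]     # ID = first word of the header
            records[name] = ""
        elif line and name is not None:
            records[name] += line
    return records


def sort_by_length(records):
    """Return (id, sequence) pairs, longest first, ties broken by ID."""
    return sorted(records.items(), key=lambda kv: (-len(kv[1]), kv[0]))


def concatenate_loci(loci, gap="-"):
    """Join aligned loci sample by sample.

    A sample absent from a locus is padded with gaps of that locus's width
    (an assumed convention, so that all concatenated rows stay equal length).
    """
    samples = sorted({s for locus in loci for s in locus})
    out = {s: "" for s in samples}
    for locus in loci:
        width = len(next(iter(locus.values())))
        for s in samples:
            out[s] += locus.get(s, gap * width)
    return out
```

Running such functions over a directory of files in a loop would correspond to the batch mode the abstract mentions.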

11.
SPLICE, a software tool for the extraction of sequences from files in GenBank tape format, has been developed. The program can analyze the features table in this format and use any of the information provided to write the corresponding sequences into a standard sequence file format suitable for use with sequence analysis programs. Sequences that are present as several subsequent fragments in a single GenBank file, such as those encoding a peptide, can be spliced together by the program. Further, sequences that are present in more than one GenBank file, such as an exon which spans several different files, can also be spliced into one sequence. SPLICE runs under the MS/DOS and Unix operating systems, can be called as a sub-process by other programs and can process batches of files. Received on December 26, 1989; accepted on May 30, 1990

12.
Typically, detection of protein sequences in collision-induced dissociation (CID) tandem MS (MS2) datasets is performed by mapping identified peptide ions back to the protein sequence using a protein database search (PDS) engine. Finding a particular peptide sequence of interest in CID MS2 records very often requires manual evaluation of the spectrum, regardless of whether the peptide-associated MS2 scan is identified by the PDS algorithm or not. We have developed a compact cross-platform database-free command-line utility, pepgrep, which helps to find an MS2 fingerprint for a selected peptide sequence by pattern-matching of modelled MS2 data using a Peptide-to-MS2 scoring algorithm. pepgrep can incorporate dozens of mass offsets corresponding to a variety of post-translational modifications (PTMs) into the algorithm. Decoy peptide sequences are used with the tested peptide sequence to reduce false-positive results. The engine is capable of screening an MS2 data file at a high rate when using a cluster computing environment. The matched MS2 spectrum can be displayed by using the built-in graphical application programming interface (API) or optionally recorded to file. Using this algorithm, we were able to find extra peptide sequences in the studied CID spectra that were missed by PDS identification. We also found pepgrep especially useful for examining CID data from small fractions of peptides resulting from, for example, affinity purification techniques. The peptide sequences in such samples are less likely to be positively identified by the routine protein-centric algorithms implemented in PDS. The software is freely available at http://bsproteomics.essex.ac.uk:8080/data/download/pepgrep-1.4.tgz.

13.
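With the rapid growth of influenza virus genome sequencing data, mining the biological information contained in these large genomic data sets has become a research focus. Building an automated, integrated, information-driven sequence database system based on data on the epidemiological characteristics of influenza viruses in China is of great practical value for the rapid batch translation, annotation, storage, querying and analysis of influenza virus genomes. By integrating a series of software packages and toolkits and combining them with other functions developed in-house, our group added fine-grained screening rules for selecting the best-matching translation and annotation information on top of two key reference data sets maintained at the bottom layer, and built an automated system that supports storage of influenza virus genome information, automated translation, accurate annotation of protein sequences, homologous sequence alignment and phylogenetic tree analysis. The results show that when an influenza virus gene sequence in fasta format is submitted through the Web front end, the system runs a Blast homology search against the reference segment data set (blastdb.fasta) and identifies the virus type (A, B or C), subtype and gene segment (segments 1 to 8). On this basis, it queries the gene-segment reference data set used for translation and annotation in the underlying database to obtain a set of peptides, and then calls the ProSplign software in a loop to make predictions with them. Combined with the fine-grained screening and admission rules, the translated product that best matches the input sequence is selected as the predicted protein for that sequence and is output as files in common formats such as gbk, asn and fasta, together with the sequence length, whether the sequence is full length, and the virus type, subtype and segment. Building on this work, we also developed additional functions for the system, such as phylogenetic tree analysis and display and genome data storage, resulting in a Web-service-based automated translation and annotation system for influenza virus genomes. This study indicates that the system tightly integrates a series of software packages with its own annotation and translation database files, covers the workflow from sequence storage, translation and annotation to sequence analysis and display, and can fully meet China's need for shared, localised, integrated and automated handling of high-throughput genetic testing data.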

14.
TRFMA provides a Web environment for analyzing T-RFLP results based on molecular weights of the fragments, rather than the numbers of nucleotides, to increase accuracy. The 16S rRNA data are saved as an XML file containing around 650 sequences (light version) and a MySQL database containing around 50 000 sequences (full version), which are connected to a Web server via PHP5 and manipulated in an Internet browser. AVAILABILITY: TRFMA is freely available at http://myamagu.dent.kyushu-u.ac.jp/bioinformatics/trfma/index.html and can be downloaded from the same site.

15.
The concentrations of glycolytic intermediates, acid components and adenosine nucleotides were determined at half-weekly intervals during development and ripening of grape berries. Based on distinctive non-equilibrium conditions and enzymic activities which are not controlled by substrate availability at the levels of phosphoenolpyruvate/pyruvate and fructose-6-phosphate/fructose diphosphate it is concluded that these two sites represent the major control points in the reaction sequences between sugar and acid pools in this fruit.

16.
Java editor for biological pathways   Total citations: 1, self-citations: 0, citations by others: 1
SUMMARY: A visual Java-based tool for drawing and annotating biological pathways was developed. This tool integrates the possibilities of charting elements with different attributes (size, color, labels), drawing connections between elements with distinct characteristics (color, structure, width, arrows), as well as adding links to molecular biology databases, promoter sequences, information on the function of the genes or gene products, and references. It is easy to use and system independent. The result of the editing process is a PNG (portable network graphics) file for the images and an XML (extensible markup language) file for the appropriate links.

17.
Java-Dotter (JDotter) is a platform-independent Java interactive interface for the Linux version of Dotter, a widely used program for generating dotplots of large DNA or protein sequences. JDotter runs as a client-server application and can send new sequences to the Dotter program for alignment as well as rapidly access a repository of preprocessed dotplots. JDotter also interfaces with a sequence database or file system to display supplementary feature data. Thus, JDotter greatly simplifies access to dotplot data in laboratories that deal with large numbers of genomes and have a multi-platform organization. AVAILABILITY: Currently, JDotter is used via Java Web Start by the Poxvirus Bioinformatics Resource for examining dotplots of complete poxvirus genomes; http://athena.bioc.uvic.ca/pbr/jdotter/. The software is available for download from the same location. SUPPLEMENTARY INFORMATION: Installation instructions, the User's Manual, screenshots and examples are available at the JDotter home page http://athena.bioc.uvic.ca/pbr/jdotter/. The software and source code are free for non-commercial applications.

18.
Synthetic oligonucleotides have proven to be extremely useful probes for screening cDNA and genomic libraries. Selection of the appropriate probe can be more easily and accurately achieved with the use of the computer program PROBFIND. The user enters the amino acid sequence from a file or from the keyboard, and selects the minimum length allowed for the probe and the maximum allowable degeneracy. The computer prints, to the screen and to a file, a list of the sequences of potential probes which meet these minimum specifications, together with the location of the corresponding sequence in the protein. The user may modify the specifications for length and degeneracy at any time during the output of data, which allows for rapid selection of the desired probe. The program is interactive, accepts any file format with only a single modification of the file, is written in BASIC, and requires less than 6 kbytes of memory. This makes the program easy to use and adaptable even to unsophisticated microcomputers.
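
The selection criterion described in this abstract, a minimum probe length combined with a maximum degeneracy, can be sketched as a sliding-window scan. This is a Python illustration rather than the BASIC program itself: measuring length in amino acid residues, scanning fixed-size windows only, and the names `degeneracy` and `candidate_probes` are assumptions. The codon counts per amino acid are those of the standard genetic code.

```python
# Number of codons encoding each amino acid in the standard genetic code.
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that can encode the peptide."""
    total = 1
    for residue in peptide:
        total *= CODONS_PER_AA[residue]
    return total


def candidate_probes(protein, min_residues, max_degeneracy):
    """Return (start, peptide, degeneracy) for every acceptable window.

    start is the 0-based position of the window in the protein; windows are
    exactly min_residues long (an assumed simplification).
    """
    hits = []
    for start in range(len(protein) - min_residues + 1):
        window = protein[start:start + min_residues]
        d = degeneracy(window)
        if d <= max_degeneracy:
            hits.append((start, window, d))
    return hits
```

Stretches rich in methionine and tryptophan, which have a single codon each, score lowest, which is why such regions are favoured when designing mixed oligonucleotide probes.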

19.
Combined bisulfite restriction analysis (COBRA) is one of the most commonly used methylation quantification methods. However, it focuses on relatively few restriction enzymes. Here, we present Methyl-Typing, a web-based software that provides restriction enzyme mining data for methyl-cytosine-containing sequences following bisulfite-conversion. Gene names, accession numbers, sequences, PCR primers, and file upload are accessible for input. Promoter sequences and restriction enzymes for CpG- and GpC-containing recognition sites are retrieved. Four representative enzymes were tested successfully by COBRA in experimental work. Therefore, the Methyl-Typing tool provides comprehensive COBRA restriction enzyme mining. It is freely available at http://bio.kuas.edu.tw/methyl-typing.
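
The idea behind mining enzymes for COBRA can be sketched as follows: simulate bisulfite conversion of a sequence in the fully methylated and fully unmethylated states, then keep the enzymes whose recognition site is present in only one of the two. This Python sketch is an assumption-laden illustration, not Methyl-Typing's implementation: it handles CpG methylation on one strand only, ignores GpC sites and PCR primers, and the names `bisulfite_convert` and `discriminating_enzymes` are hypothetical.

```python
def bisulfite_convert(seq, methylated_cpg):
    """Return the sequence as read after bisulfite treatment and PCR.

    Every C becomes T, except a C followed by G when methylated_cpg is True
    (methylated cytosines are protected from conversion).
    """
    out = []
    for i, base in enumerate(seq):
        is_cpg = base == "C" and seq[i + 1:i + 2] == "G"
        if base == "C" and not (is_cpg and methylated_cpg):
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def discriminating_enzymes(seq, enzymes):
    """Names of enzymes that cut only one of the two converted templates.

    enzymes maps an enzyme name to its recognition sequence.
    """
    methylated = bisulfite_convert(seq, True)
    unmethylated = bisulfite_convert(seq, False)
    return sorted(
        name for name, site in enzymes.items()
        if (site in methylated) != (site in unmethylated)
    )
```

For example, the sites of BstUI (CGCG) and TaqI (TCGA) survive conversion only when their CpG cytosines are methylated, so digestion of the PCR product reports on the methylation state of the template.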

20.
Expressed sequence tags (ESTs) are generated and deposited in the public domain as redundant, unannotated, single-pass reactions, with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA-sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling and rewriting consensus sequences of redundant input files into a unigene EST data set and finally through translation, amino acid sequence similarity searches, annotation of public databases and functional data. PipeOnline generates an annotated database, retaining the processed unigene sequence, clone/file history, alignments with similar sequences, and proposed functional classification, if available. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records are examined through a function-ordered browser or keyword queries with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and an alert service. PipeOnline is available at http://stress-genomics.org.
