Similar articles
1.
MOTIVATION: A simple and fast algorithm is described that calculates a measure of protrusion (cx) for atoms in protein structures; its output is directly usable with the common molecular graphics programs. RESULTS: A sphere of predetermined radius is centered on each non-hydrogen atom, and the volume occupied by the protein and the free volume within the sphere (the internal and external volumes, respectively) are calculated. Atoms in protruding regions have a high ratio (cx) between the external and the internal volume. The program reads a PDB file and writes its output in the same format, with cx values in the B factor field. Output structure files can be displayed directly with standard molecular graphics programs such as RASMOL, MOLMOL and Swiss-PDB Viewer, and colored according to cx values. We show the potential use of this program in the analysis of two protein-protein complexes and in the prediction of limited proteolysis sites in native proteins. AVAILABILITY: The algorithm is implemented in a standalone program written in C, and its source is freely available at ftp.icgeb.trieste.it/pub/CX or on request from the authors.
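
The ratio described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the internal volume is measured, so the sketch assumes it can be approximated by counting atoms inside the sphere and multiplying by a mean atomic volume. The function name `cx_values` and the default values for `radius` and `atom_volume` are illustrative assumptions, not values taken from the abstract.

```python
import math


def cx_values(coords, radius=10.0, atom_volume=20.1):
    """Return one protrusion ratio (external / internal volume) per atom.

    coords: list of (x, y, z) tuples for non-hydrogen atoms, in angstroms.
    radius, atom_volume: assumed defaults, chosen only for illustration.
    """
    sphere_volume = 4.0 / 3.0 * math.pi * radius ** 3
    r2 = radius ** 2
    ratios = []
    for (xi, yi, zi) in coords:
        # Count atoms (including the central one) that fall inside the sphere.
        n_inside = sum(
            1
            for (xj, yj, zj) in coords
            if (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2 <= r2
        )
        v_int = n_inside * atom_volume             # assumed internal volume
        v_ext = max(sphere_volume - v_int, 0.0)    # remaining free volume
        ratios.append(v_ext / v_int)
    return ratios
```

With this approximation, an isolated atom has few neighbors inside its sphere and therefore a high ratio, while a buried atom has many neighbors and a low one. Writing the values into the B factor field of a PDB file is a separate formatting step that is not shown here.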

2.
MOTIVATION: The program ESPript (Easy Sequencing in PostScript) allows the rapid visualization, via PostScript output, of sequences aligned with popular programs such as CLUSTAL-W or GCG PILEUP. It can read secondary structure files (such as those created by the program DSSP) to produce a synthesis of both sequence and structural information. RESULTS: ESPript can be run via a command file or a friendly HTML-based user interface. The program calculates a homology score for each column of residues and can sort this calculation by groups of sequences. It offers a palette of markers to highlight important regions in the alignment. ESPript can also paste information on residue conservation into coordinate files, for subsequent visualization with a graphics program. AVAILABILITY: ESPript can be accessed on its Web site at http://www.ipbs.fr/ESPript. Sources and help files can be downloaded via anonymous ftp from ftp.ipbs.fr. A tar file is held in the directory pub/ESPript.

3.
4.
SUMMARY: Chimera allows the construction of chimeric protein or nucleic acid sequence files by concatenating sequences from two or more sequence files in PHYLIP format. It allows the user to interactively select genes and species from the input files. The concatenated result is stored in a single output file in PHYLIP or NEXUS format. AVAILABILITY: The computer program, including supporting files and example files, is available from http://www.dalicon.com/chimera/.
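
The concatenation step described in this abstract can be sketched in Python as follows. This is not Chimera's own code. It assumes a relaxed, sequential PHYLIP layout in which each record is a name, whitespace, and the sequence on one line; the helper names `parse_phylip`, `concatenate` and `to_phylip` are hypothetical.

```python
def parse_phylip(text):
    """Parse a relaxed sequential PHYLIP alignment into {name: sequence}."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    n_taxa, n_chars = (int(x) for x in lines[0].split())
    records = {}
    for ln in lines[1:1 + n_taxa]:
        name, seq = ln.split(None, 1)
        seq = seq.replace(" ", "")
        if len(seq) != n_chars:
            raise ValueError(f"{name}: expected {n_chars} characters, got {len(seq)}")
        records[name] = seq
    return records


def concatenate(alignments, species):
    """Join the selected species' sequences across alignments, in order."""
    return {sp: "".join(aln[sp] for aln in alignments) for sp in species}


def to_phylip(records):
    """Write {name: sequence} back out in the same relaxed PHYLIP layout."""
    length = len(next(iter(records.values())))
    lines = [f"{len(records)} {length}"]
    lines += [f"{name} {seq}" for name, seq in records.items()]
    return "\n".join(lines)
```

For example, concatenating a four-column and a three-column alignment for the same two species yields one seven-column alignment per species. Selecting genes corresponds to choosing which alignments to pass in, and selecting species corresponds to the `species` list.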

5.
The program HBAT is a tool to automate the analysis of potential hydrogen bonds and similar types of weak interactions, such as halogen bonds and non-canonical interactions, in macromolecular structures available in Brookhaven Protein Data Bank (PDB) file format. HBAT is written in PERL and TK. The program generates a Microsoft Excel-compatible output file for statistical analysis. HBAT identifies potential interactions based on geometrical criteria. A series of analysis reports, such as frequency tables, geometry distribution tables and lists of furcations, is generated. A user-friendly GUI offers freedom to select several parameters and options. Graphviz-based visualization of hydrogen bond networks in 2D helps to study cooperativity and anticooperativity geometry in hydrogen bonds. HBAT supports post-docking interaction analysis between any target/receptor (in PDB files) and docked ligands/poses (in SDF). This tool can be applied to active site interaction analysis, structure-based drug design and molecular dynamics simulations.

6.
This paper presents a pipeline, implemented in an open-source program called GB→TNT (GenBank-to-TNT), for creating large molecular matrices, starting from GenBank files and finishing with TNT matrices that incorporate taxonomic information in the terminal names. GB→TNT is designed to retrieve a defined genomic region from a large set of sequences included in a GenBank file. The user defines the genomic region to be retrieved and several filters (genome, length of the sequence, taxonomic group, etc.); each genomic region represents a different data block in the final TNT matrix. GB→TNT first generates Fasta files from the input GenBank files, then creates an alignment for each of those (by calling an alignment program), and finally merges all the aligned files into a single TNT matrix. The new version of TNT can make use of the taxonomic information contained in the terminal names, allowing easy diagnosis of results, evaluation of fit between the trees and the taxonomy, and automatic labelling or colouring of tree branches with the taxonomic groups they represent. © The Willi Hennig Society 2012.

7.
SPLICE, a software tool for the extraction of sequences from files in GenBank tape format, has been developed. The program can analyze the features table in this format and use any of the information provided to write the corresponding sequences into a standard sequence file format suitable for use with sequence analysis programs. Sequences that are present as several subsequent fragments in a single GenBank file, such as those encoding a peptide, can be spliced together by the program. Further, sequences that are present in more than one GenBank file, such as an exon which spans several different files, can also be spliced into one sequence. SPLICE runs under the MS/DOS and Unix operating systems, can be called as a sub-process by other programs and can process batches of files. Received on December 26, 1989; accepted on May 30, 1990

8.
Protein identification using MS is an important technique in proteomics as well as a major generator of proteomics data. We have designed the protein identification data object model (PDOM) and developed a parser based on this model to facilitate the analysis and storage of these data. The parser works with HTML or XML files saved or exported from MASCOT MS/MS ions search in peptide summary report or MASCOT PMF search in protein summary report. The program creates PDOM objects, eliminates redundancy in the input file, and has the capability to output any PDOM object to a relational database. This program facilitates additional analysis of MASCOT search results and aids the storage of protein identification information. The implementation is extensible and can serve as a template to develop parsers for other search engines. The parser can be used as a stand-alone application or can be driven by other Java programs. It is currently being used as the front end for a system that loads HTML and XML result files of MASCOT searches into a relational database. The source code is freely available at http://www.ccbm.jhu.edu and the program uses only free and open-source Java libraries.

9.
Battye F. Cytometry, 2001, 43(2): 143-149
BACKGROUND: The obvious benefits of centralized data storage notwithstanding, the size of modern flow cytometry data files discourages their transmission over commonly used telephone modem connections. The proposed solution is to install at the central location a web servlet that can extract compact data arrays, of a form dependent on the requested display type, from the stored files and transmit them to a remote client computer program for display. METHODS: A client program and a web servlet, both written in the Java programming language, were designed to communicate over standard network connections. The client program creates familiar numerical and graphical display types and allows the creation of gates from combinations of user-defined regions. Data compression techniques further reduce transmission times for data arrays that are already much smaller than the data file itself. RESULTS: For typical data files, network transmission times were reduced more than 700-fold for extraction of one-dimensional (1-D) histograms, between 18- and 120-fold for 2-D histograms, and 6-fold for color-coded dot plots. Numerous display formats are possible without further access to the data file. CONCLUSIONS: This scheme enables telephone modem access to centrally stored data without restricting flexibility of display format or preventing comparisons with locally stored files.

10.
There are many FTP or HTTP servers storing data required for biological research. While some download applications are available, there is no user-friendly download application with a graphical interface specifically designed and adapted to meet the requirements of bioinformatics. BioDownloader is a program for downloading and updating files from FTP and HTTP servers. It is optimized to work robustly with large numbers of files. It allows the selective retrieval of only the required files (batch downloads, multiple file masks, ls-lR file parsing, recursive search, recent updates, etc.). BioDownloader has a built-in repository containing the settings for common bioinformatics file-synchronization needs, including the Protein Data Bank (PDB) and National Center for Biotechnology Information (NCBI) databases. It can post-process downloaded files, including archive extraction and file conversions. AVAILABILITY: The program can be installed from http://dunbrack.fccc.edu/BioDownloader. The software is freely available for both non-commercial and commercial users under the BSD license.

11.
SUMMARY: SCide is a program to identify stabilization centers from known protein structures. These are residues involved in cooperative long-range contacts, which can be formed between various regions of a single polypeptide chain, or they can belong to different peptides or polypeptides in a complex. The server takes a PDB file as an input, and the result is presented in graphical or text format. AVAILABILITY: SCide is available on the web at http://www.enzim.hu/scide. The source code can be obtained from the authors on request.

12.
A BASIC program has been devised for the hydropathic analysis of protein sequences according to the method of Kyte and Doolittle (1982). The program uses sequence data from input files that are created with a word processor and produces two types of output file: one contains a bar graph of the hydropathic profile in a format that can be easily edited; the other is a tabulation of hydropathic indices along a protein's sequence that can be used as input by the program for the production of a bar graph or as input into other graphics and analysis software. An MS-DOS microcomputer, operating under IBM BASICA or GWBASIC, and a dot matrix printer with block graphics capabilities are the only hardware requirements for graphic display of hydropathy profiles. The program is capable of unattended analysis from a list of up to 15 input files. Accepted on March 10, 1986
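
The Kyte and Doolittle method averages per-residue hydropathy indices over a sliding window. A minimal Python sketch of that calculation follows; it is not a port of the BASIC program. The index values are the published Kyte-Doolittle scale, while the window length of 7, the function names and the text bar layout are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy indices for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Return the mean hydropathy index of every full window along seq."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def bar_graph(profile, scale=4):
    """Render the profile as editable text: position, value, signed bar."""
    lines = []
    for i, value in enumerate(profile, start=1):
        bar = "#" * round(abs(value) * scale)
        sign = "+" if value >= 0 else "-"
        lines.append(f"{i:4d} {value:6.2f} {sign}{bar}")
    return "\n".join(lines)
```

The list returned by `hydropathy_profile` plays the role of the tabulated indices, and `bar_graph` turns the same list into a plain-text plot, mirroring the two output types the abstract describes.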

13.
We describe a program (and a website) to reformat ClustalX/ClustalW outputs into a format that is widely used in the presentation of sequence alignment data in SNP analysis and molecular systematic studies. This program, CLOURE (CLustal OUtput REformatter), takes as input the multiple sequence alignment file (nucleic acid or protein) generated by Clustal. The CLOURE-D format presents the Clustal alignment in a form that highlights only the nucleotides/residues that differ from the first query sequence. The program has been written in Visual Basic and will run on a Windows platform. The downloadable program, as well as a web-based server which has also been developed, can be accessed at http://imtech.res.in/~anand/cloure.html.
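
The idea of showing only the differences relative to the first sequence can be sketched in a few lines of Python. This is an illustration of the general technique rather than CLOURE's actual output format: the use of "." for matching positions and the function name `differences_only` are assumptions.

```python
def differences_only(alignment, match_char="."):
    """Mask positions that match the first (reference) sequence.

    alignment: dict mapping sequence name to aligned sequence, in input
    order; all sequences are assumed to have the same aligned length.
    """
    names = list(alignment)
    reference = alignment[names[0]]
    masked = {names[0]: reference}  # the reference is shown in full
    for name in names[1:]:
        masked[name] = "".join(
            match_char if residue == ref else residue
            for residue, ref in zip(alignment[name], reference)
        )
    return masked
```

For instance, with a reference `ACGT`, the aligned sequence `ACGA` is shown as `...A`, so a single substitution stands out immediately.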

14.
Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (such as predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. Useful features include the use of pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
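
Decoding with an HMM means recovering the most plausible sequence of hidden states for an observed sequence. The abstract does not say which decoding algorithm MAMOT uses, so the following Python sketch shows Viterbi decoding, one standard choice, under that assumption. The function name and the dictionary-based model layout are illustrative and do not reflect MAMOT's model file format.

```python
import math


def _log(p):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for the observed symbols."""
    # Log score of the best path ending in each state at the first symbol.
    score = {s: _log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in states}
    back = []  # one {state: best predecessor} map per later position
    for symbol in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + _log(trans_p[r][s]))
            score[s] = prev[best] + _log(trans_p[best][s]) + _log(emit_p[s][symbol])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

Working in log space avoids numerical underflow on long sequences, which matters for the protein and DNA examples mentioned in the abstract.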

15.
Predict7, a program for protein structure prediction   Cited 4 times in total (0 self-citations, 4 citations by others)
We describe a program for protein sequence analysis which runs on IBM PC computers. Protein sequences are loaded from files in Mount-Conrad and Lipman-Pearson format. Seven features are analyzed: hydrophilicity, hydropathy, surface probability, side chain flexibility, antigenicity, secondary structure and N-glycosylation sites. Numeric results can be shown, printed or stored in files exportable to other programs. Graphs of up to four predictions can be displayed on the screen, printed out or plotted, with several definable options. This program has been designed to be fast and user-friendly, and to be shared with the scientific community.
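
Of the seven features listed, N-glycosylation site detection is the simplest to illustrate. The abstract does not describe how Predict7 finds these sites; the Python sketch below assumes the commonly used consensus sequon N-X-S/T, where X is any residue except proline. The function name `n_glyc_sites` is hypothetical.

```python
import re

# Asparagine followed by any residue except proline, then serine or threonine.
# The lookahead keeps matches zero-width after N so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glyc_sites(seq):
    """Return 1-based positions of asparagines in an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]
```

A sequon only marks a potential site; whether it is actually glycosylated depends on factors that a sequence scan cannot capture.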

16.
Ecological research relies increasingly on the use of previously collected data. Use of existing datasets allows questions to be addressed more quickly, more generally, and at larger scales than would otherwise be possible. As a result of large-scale data collection efforts, and an increasing emphasis on data publication by journals and funding agencies, a large and ever-increasing amount of ecological data is now publicly available via the internet. Most ecological datasets do not adhere to any agreed-upon standards in format, data structure or method of access. Some are broken up across multiple files, stored in compressed archives, or violate basic principles of data structure. As a result, acquiring and utilizing available datasets can be a time-consuming and error-prone process. The EcoData Retriever is an extensible software framework which automates the tasks of discovering, downloading, and reformatting ecological data files for storage in a local data file or relational database. The automation of these tasks saves significant time for researchers and substantially reduces the likelihood of errors resulting from manual data manipulation and unfamiliarity with the complexities of individual datasets.

17.
A computer program has been designed to aid development of synthetic strategies for oligonucleotides produced by solid-phase chemical techniques. The program reduces the time required to develop a strategy and a data file from hours to minutes. The program contains inventories, provides cost analyses, and generates and stores other associated data. The program searches an inventory of sequences for the requested sequence to avoid duplicate synthesis. If the sequence is not in the inventory, the program devises a synthetic strategy and calculates the amounts of reagents and the labor costs necessary to complete the synthetic oligonucleotide. The program also deducts the reagents from inventory files. Physical data are also calculated. A file is generated in a sequence inventory for storage of these data as well as other data that will be generated during the purification processes. All variable parameters can be easily edited. The programs were designed to provide a cross-referencing feature for data analysis and can use several parameters as a constant.

18.
19.
Most existing Mass Spectra (MS) analysis programs are automatic and provide limited opportunity for editing during the interpretation. Furthermore, they rely entirely on publicly available databases for interpretation. VEMS (Virtual Expert Mass Spectrometrist) is a program for interactive analysis of peptide MS/MS spectra imported in text file format. Peaks are annotated, the monoisotopic peaks retained, and the b- and y-ion series identified in an interactive manner. The called peptide sequence is searched against a local protein database for sequence identity and peptide mass. The report compares the calculated and the experimental mass spectrum of the called peptide. The program package includes four accessory programs. VEMStrans creates protein databases in FASTA format from EST or cDNA sequence files. VEMSdata creates a virtual peptide database from FASTA files. VEMSdist displays the distribution of masses up to 5000 Da. VEMSmaldi searches singly charged peptide masses against the local database.
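
Comparing a calculated spectrum with an experimental one requires the theoretical b- and y-ion masses of the called peptide. The Python sketch below computes singly charged, monoisotopic b and y ions for an unmodified peptide. It is not VEMS code; the function name `by_ions` is hypothetical, and modifications, higher charge states and neutral losses are deliberately left out.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton
WATER = 18.010565   # monoisotopic mass of H2O


def by_ions(peptide):
    """Return (b, y) lists of singly charged fragment m/z values.

    b[i-1] is the b ion holding the first i residues; y[i-1] is the
    y ion holding the last i residues, for i from 1 to len(peptide) - 1.
    """
    masses = [MONO[aa] for aa in peptide.upper()]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y
```

A useful sanity check is that each complementary pair, b(i) and y(n-i), sums to the neutral peptide mass plus two protons.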

20.
The program phase is widely used for Bayesian inference of haplotypes from diploid genotypes; however, manually creating phase input files from sequence alignments is an error-prone and time-consuming process, especially when dealing with numerous variable sites and/or individuals. Here, a web tool called seqphase is presented that generates phase input files from FASTA sequence alignments and converts phase output files back into FASTA. During the production of the phase input file, several consistency checks are performed on the dataset, and suitable command-line options for the actual phase data analysis are suggested. seqphase was written in Perl and is freely accessible over the Internet at http://www.mnhn.fr/jfflot/seqphase.

