Similar articles
Found 20 similar articles (search time: 46 ms)
1.
The use of bioinformatics tools requires different sequence formats at various stages. Every tool uses a specific set of formats for processing, and a sequence stored in one format is often required in another. Thus, there is a need for sequence format conversion. A number of such tools are available in the public domain. Here, we describe BIOFFORC, a file format converter. The tool is developed in PERL with a graphical user interface.

Availability

http://www.winningpath.com/biofforc/
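
To make the idea of sequence format conversion concrete, the following is a minimal Python sketch that converts FASTA text into a tab-delimited listing. It is not BIOFFORC code (the tool itself is written in PERL), and the function names `parse_fasta`, `to_tabular` and `fasta_to_tabular` are hypothetical names chosen for this illustration.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (identifier, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            # Assumption: the first word of the header is the identifier.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_tabular(records):
    """Write records as one 'identifier<TAB>sequence' line per record."""
    return "\n".join(f"{name}\t{seq}" for name, seq in records)


def fasta_to_tabular(text):
    """Convert FASTA text to the tab-delimited form in one step."""
    return to_tabular(parse_fasta(text))
```

Parsing into a neutral in-memory list first, and writing the target format second, means that each additional format needs only one reader and one writer rather than a converter for every pair of formats.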

2.
3.
With continued efforts towards a single mass spectrometry imaging (MSI) data format, data conversion routines must be made universally available. The benefits of a common imaging format, imzML, are slowly becoming more widely appreciated, but the format is still used by only a small proportion of imaging groups. Increased awareness amongst researchers and continued support from major MS vendors in providing tools for converting proprietary formats into imzML are likely to result in a rapidly increasing uptake of the format. It is important that this does not lead to the exclusion of researchers using older or unsupported instruments. We describe an open source converter, imzMLConverter, to guard against this. We propose that proprietary formats first be converted to mzML using one of the widely available converters, such as msconvert, and then converted from mzML to imzML using imzMLConverter. This will allow a wider audience to benefit from the imzML format immediately.

4.
Puah WC  Cheok LP  Biro M  Ng WT  Wasser M 《BioTechniques》2011,51(1):49-50, 52-3
Automated microscopy enables in vivo studies in developmental biology over long periods of time. Time-lapse recordings in three or more dimensions to study the dynamics of developmental processes can produce huge data sets that extend into the terabyte range. However, depending on the available computational resources and software design, downstream processing of very large image data sets can become highly inefficient, if not impossible. To address the lack of available open source and commercial software tools to efficiently reorganize time-lapse data on a desktop computer with limited system resources, we developed TLM-Converter. The software either fragments oversized files or concatenates multiple files representing single time frames and saves the output files in open standard formats. Our application is undemanding on system resources as it does not require the whole data set to be loaded into the system memory. We tested our tool on time-lapse data sets of live Drosophila specimens recorded by laser scanning confocal microscopy. Image data reorganization dramatically enhances the productivity of time-lapse data processing and allows the use of downstream image analysis software that is unable to handle large data sets of ≥2 GB. In addition, saving the outputs in open standard image file formats enables data sharing between independently developed software tools.
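
The memory-saving idea described above, fragmenting or concatenating files without loading the whole data set, can be sketched in Python as follows. This is not TLM-Converter code. It assumes a hypothetical headerless raw file made of fixed-size frames, and `split_frames` and `join_parts` are names invented for this illustration; real microscopy formats carry headers and metadata that a real converter must handle.

```python
import os


def split_frames(path, out_dir, frame_bytes, frames_per_part):
    """Split a raw file of fixed-size frames into smaller part files.

    Only one frame is held in memory at a time.
    """
    if frame_bytes <= 0 or frames_per_part <= 0:
        raise ValueError("frame_bytes and frames_per_part must be positive")
    os.makedirs(out_dir, exist_ok=True)
    parts = []
    with open(path, "rb") as src:
        frame = src.read(frame_bytes)
        while frame:
            part_path = os.path.join(out_dir, f"part{len(parts):04d}.raw")
            with open(part_path, "wb") as dst:
                written = 0
                # Copy frames until this part is full or the input ends.
                while frame and written < frames_per_part:
                    dst.write(frame)
                    written += 1
                    frame = src.read(frame_bytes)
            parts.append(part_path)
    return parts


def join_parts(parts, out_path, buffer_size=1024 * 1024):
    """Concatenate part files into one file, streaming in small blocks."""
    with open(out_path, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                while True:
                    block = src.read(buffer_size)
                    if not block:
                        break
                    dst.write(block)
```

Because both functions read and write in bounded blocks, peak memory use depends on the frame or buffer size rather than on the size of the data set.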

5.
Falkner JA  Hill JA  Andrews PC 《Proteomics》2008,8(9):1756-1757
A FASTA file archive and reference resource has been added to ProteomeCommons.org. Motivation for this new functionality derives from two primary sources. The first is the recent FASTA standardization work done by the Human Proteome Organization's Proteomics Standards Initiative (HUPO-PSI). The second is the general lack of a uniform mechanism to properly cite FASTA files used in a study, and to publicly access such FASTA files post-publication. An extension to the Tranche data sharing network has been developed that includes web pages, documentation, and tools for facilitating the use of FASTA files. These include conversion to the new HUPO-PSI format, and provisions for both citing and publicly archiving FASTA files. This new resource is available immediately, free of charge, and can be accessed at http://www.proteomecommons.org/data/fasta/. Source code for related tools is also freely available under the BSD license.

6.
Nmrglue, an open source Python package for working with multidimensional NMR data, is described. When used in combination with other Python scientific libraries, nmrglue provides a highly flexible and robust environment for spectral processing, analysis and visualization and includes a number of common utilities such as linear prediction, peak picking and lineshape fitting. The package also enables existing NMR software programs to be readily tied together, currently facilitating the reading, writing and conversion of data stored in Bruker, Agilent/Varian, NMRPipe, Sparky, SIMPSON, and Rowland NMR Toolkit file formats. In addition to standard applications, the versatility offered by nmrglue makes the package particularly suitable for tasks that include manipulating raw spectrometer data files, automated quantitative analysis of multidimensional NMR spectra with irregular lineshapes such as those frequently encountered in the context of biomacromolecular solid-state NMR, and rapid implementation and development of unconventional data processing methods such as covariance NMR and other non-Fourier approaches. Detailed documentation, install files and source code for nmrglue are freely available at http://nmrglue.com. The source code can be redistributed and modified under the New BSD license.

7.
The application of mass spectrometry imaging (MS imaging) is rapidly growing with a constantly increasing number of different instrumental systems and software tools. The data format imzML was developed to allow the flexible and efficient exchange of MS imaging data between different instruments and data analysis software. imzML data is divided into two files which are linked by a universally unique identifier (UUID). Experimental details are stored in an XML file which is based on the HUPO-PSI format mzML. Information is provided in the form of a 'controlled vocabulary' (CV) in order to unequivocally describe the parameters and to avoid redundancy in nomenclature. Mass spectral data are stored in a binary file in order to allow efficient storage. imzML is supported by a growing number of software tools. Users will no longer be limited to proprietary software, but are able to use the processing software best suited for a specific question or application. MS imaging data from different instruments can be converted to imzML and displayed with identical parameters in one software package for easier comparison. All technical details necessary to implement imzML and additional background information are available at www.imzml.org.
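
The two-file design described above (XML metadata plus a binary data file, linked by a UUID) can be illustrated with a short Python sketch. This is a deliberately simplified stand-in, not the imzML schema: the element names `dataset` and `spectrum`, the functions `write_pair` and `read_pair`, and the choice to store the UUID in the first 16 bytes of the binary file are assumptions made for this example. The actual layout is specified at www.imzml.org.

```python
import struct
import uuid
import xml.etree.ElementTree as ET


def write_pair(xml_path, ibd_path, spectra):
    """Write metadata (XML) and intensities (binary) that share one UUID."""
    file_id = uuid.uuid4()
    root = ET.Element("dataset")  # hypothetical element name
    ET.SubElement(root, "cvParam",
                  name="universally unique identifier", value=str(file_id))
    with open(ibd_path, "wb") as ibd:
        ibd.write(file_id.bytes)  # assumption: UUID leads the binary file
        for intensities in spectra:
            offset = ibd.tell()
            # Little-endian 32-bit floats, one block per spectrum.
            ibd.write(struct.pack(f"<{len(intensities)}f", *intensities))
            ET.SubElement(root, "spectrum",
                          offset=str(offset), length=str(len(intensities)))
    ET.ElementTree(root).write(xml_path, encoding="utf-8",
                               xml_declaration=True)
    return file_id


def read_pair(xml_path, ibd_path):
    """Read spectra back, refusing files whose UUIDs do not match."""
    root = ET.parse(xml_path).getroot()
    expected = uuid.UUID(root.find("cvParam").get("value"))
    spectra = []
    with open(ibd_path, "rb") as ibd:
        found = uuid.UUID(bytes=ibd.read(16))
        if found != expected:
            raise ValueError("XML and binary files do not belong together")
        for spec in root.findall("spectrum"):
            ibd.seek(int(spec.get("offset")))
            n = int(spec.get("length"))
            spectra.append(list(struct.unpack(f"<{n}f", ibd.read(4 * n))))
    return spectra
```

Keeping offsets in the XML lets a reader jump directly to one spectrum in the binary file, while the UUID check catches the case where the two files have been separated and paired with the wrong partner.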

8.
9.
During a meeting of the SYSGENET working group 'Bioinformatics', currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future compatibility of software and exchange of already developed software modules, a strong recommendation was made by the group to integrate HAPPY and R/qtl analysis toolboxes, GeneNetwork and XGAP database platforms, and TIQS and xQTL processing platforms. R should be used as the principal computer language for QTL data analysis in all platforms and a 'cloud' should be used for software dissemination to the community. Furthermore, the working group recommended that all data models and software source code should be made visible in public repositories to allow a coordinated effort on the use of common data structures and file formats.

10.
The analysis of genetic data often requires a combination of several approaches using different and sometimes incompatible programs. In order to facilitate data exchange and file conversions between population genetics programs, we introduce PGDSpider, a Java program that can read 27 different file formats and export data into 29, partially overlapping, other file formats. The PGDSpider package includes both an intuitive graphical user interface and a command-line version allowing its integration in complex data analysis pipelines. AVAILABILITY: PGDSpider is freely available under the BSD 3-Clause license at http://cmpg.unibe.ch/software/PGDSpider/.

11.
SUMMARY: Affymetrix GeneChip microarrays are increasingly used in gene expression studies, and in greater numbers. A software library was developed that supports Affymetrix file formats and implements two popular summary algorithms (MAS5.0 and RMA). The library is modular in design for integration into larger systems and processing pipelines. Additionally, a graphical interface (GENE) was developed to allow end-user access to the functionality within the library. AVAILABILITY: libaffy is free to use under the GNU GPL license. The source code and Windows binaries can be freely accessed from the website http://src.moffitt.usf.edu/libaffy. Additional API documentation and a user manual are available.

12.
13.
MOTIVATION: Effective use of proteomics data, specifically mass spectrometry data, relies on the ability to read and write the many mass spectrometer file formats. Even with mass spectrometer vendor-specific libraries and vendor-neutral file formats, such as mzXML and mzData, it can be difficult to extract raw data files in a form suitable for batch processing and basic research. Introduced here is the ProteomeCommons.org Input and Output Framework, abbreviated to IO Framework, which is designed to abstractly represent mass spectrometry data. This project is a public, open-source, free-to-use framework that supports most of the mass spectrometry data formats, including current formats, legacy formats and proprietary formats that require a vendor-specific library in order to operate. The IO Framework includes an on-line tool for non-programmers and a set of libraries that developers may use to convert between various proteomics file formats. AVAILABILITY: The current source code and documentation for the ProteomeCommons.org IO Framework are freely available at http://www.proteomecommons.org/current/531/

14.
15.
SUMMARY: Chimera allows the construction of chimeric protein or nucleic acid sequence files by concatenating sequences from two or more sequence files in PHYLIP formats. It allows the user to interactively select genes and species from the input files. The concatenated result is stored in a single output file in PHYLIP or NEXUS format. AVAILABILITY: The computer program, including supporting files and example files, is available from http://www.dalicon.com/chimera/.
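
The concatenation step described above can be sketched in Python as follows. This is not the Chimera program: it assumes simple sequential PHYLIP input with one "name sequence" line per species, the names `parse_phylip`, `concatenate` and `to_phylip` are hypothetical, and padding a species missing from one gene with gap characters is a design choice made for this example.

```python
def parse_phylip(text):
    """Parse sequential PHYLIP text into a dict of name -> sequence."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ntax, nchar = (int(x) for x in lines[0].split())
    records = {}
    for line in lines[1:1 + ntax]:
        name, seq = line.split(None, 1)
        records[name] = seq.replace(" ", "")
    for name, seq in records.items():
        if len(seq) != nchar:
            raise ValueError(f"{name}: expected {nchar} sites, got {len(seq)}")
    return records


def concatenate(alignments, gap="-"):
    """Join alignments side by side, matching sequences by species name."""
    names = []
    for aln in alignments:
        for name in aln:
            if name not in names:
                names.append(name)
    merged = {name: "" for name in names}
    for aln in alignments:
        width = len(next(iter(aln.values())))
        for name in names:
            # Assumption: a species absent from this gene is filled with gaps.
            merged[name] += aln.get(name, gap * width)
    return merged


def to_phylip(records):
    """Write a dict of name -> sequence as sequential PHYLIP text."""
    nchar = len(next(iter(records.values())))
    lines = [f"{len(records)} {nchar}"]
    lines += [f"{name} {seq}" for name, seq in records.items()]
    return "\n".join(lines) + "\n"
```

Matching by name rather than by line position means the input files may list species in different orders and still produce correctly aligned chimeric sequences.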

16.
The increasing role of metabolomics in systems biology is driving the development of tools for comprehensive analysis of high-resolution NMR spectral datasets. This task is quite challenging since, unlike datasets from other 'omics' fields, substantial preprocessing of the data is needed to allow successful identification of spectral patterns associated with relevant biological variability. HiRes is a unique stand-alone software tool that combines standard NMR spectral processing functionalities with techniques for multi-spectral dataset analysis, such as principal component analysis and non-negative matrix factorization. In addition, HiRes contains extensive abilities for data cleansing, such as baseline correction, solvent peak suppression, removal of frequency shifts owing to experimental conditions as well as auxiliary information management. Integration of these components together with multivariate analytical procedures makes HiRes very capable of addressing the challenges for assessment and interpretation of large metabolomic datasets, greatly simplifying this otherwise lengthy and difficult process and assuring optimal information retrieval. AVAILABILITY: HiRes is freely available for research purposes at http://hatch.cpmc.columbia.edu/highresmrs.html

17.
BACKGROUND: Ongoing innovation in phylogenetics and evolutionary biology has been accompanied by a proliferation of software tools, data formats, analytical techniques and web servers. This brings with it the challenge of integrating phylogenetic and other related biological data found in a wide variety of formats, and underlines the need for reusable software that can read, manipulate and transform this information into the various forms required to build computational pipelines. RESULTS: We built a Python software library for working with phylogenetic data that is tightly integrated with Biopython, a broad-ranging toolkit for computational biology. Our library, Bio.Phylo, is highly interoperable with existing libraries, tools and standards, and is capable of parsing common file formats for phylogenetic trees, performing basic transformations and manipulations, attaching rich annotations, and visualizing trees. We unified the modules for working with the standard file formats Newick, NEXUS and phyloXML behind a consistent and simple API, providing a common set of functionality independent of the data source. CONCLUSIONS: Bio.Phylo meets a growing need in bioinformatics for working with heterogeneous types of phylogenetic data. By supporting interoperability with multiple file formats and leveraging existing Biopython features, this library simplifies the construction of phylogenetic workflows. We also provide examples of the benefits of building a community around a shared open-source project. Bio.Phylo is included with Biopython, available through the Biopython website, http://biopython.org.
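
As a rough illustration of what converting between two of the tree formats named above involves, the following standard-library Python sketch wraps a Newick string in a minimal NEXUS file. It does not use or reproduce the Bio.Phylo API; `newick_leaf_names` and `newick_to_nexus` are hypothetical helpers, and the regular expression assumes unquoted leaf labels without whitespace.

```python
import re


def newick_leaf_names(newick):
    """Return leaf labels in order of appearance.

    A leaf label directly follows '(' or ','. Labels that follow ')' belong
    to internal nodes and are skipped. Quoted labels are not supported.
    """
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def newick_to_nexus(newick, tree_name="tree1"):
    """Wrap one Newick tree in a minimal NEXUS file with TAXA and TREES blocks."""
    newick = newick.strip()
    if not newick.endswith(";"):
        newick += ";"
    taxa = newick_leaf_names(newick)
    lines = [
        "#NEXUS",
        "BEGIN TAXA;",
        f"  DIMENSIONS NTAX={len(taxa)};",
        "  TAXLABELS " + " ".join(taxa) + ";",
        "END;",
        "BEGIN TREES;",
        f"  TREE {tree_name} = {newick}",
        "END;",
    ]
    return "\n".join(lines) + "\n"
```

A full library parses the tree into an object model first, so that the same tree can be written to Newick, NEXUS or phyloXML and annotated along the way; this sketch skips that step and only handles the simplest case.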

18.
SUMMARY: Accurate and complete mapping of short-read sequencing to a reference genome greatly enhances the discovery of biological results and improves statistical predictions. We recently presented RNA-MATE, a pipeline for the recursive mapping of RNA-Seq datasets. With the rapid increase in genome re-sequencing projects, progression of available mapping software and the evolution of file formats, we now present X-MATE, an updated version of RNA-MATE, capable of mapping both RNA-Seq and DNA datasets and with improved performance, output file formats, configuration files, and flexibility in core mapping software. AVAILABILITY: Executables, source code, junction libraries, test data and results and the user manual are available from http://grimmond.imb.uq.edu.au/X-MATE/.

19.
HGVbase (Human Genome Variation database; http://hgvbase.cgb.ki.se, formerly known as HGBASE) is an academic effort to provide a high quality and non-redundant database of available genomic variation data of all types, mostly comprising single nucleotide polymorphisms (SNPs). Records include neutral polymorphisms as well as disease-related mutations. Online search tools facilitate data interrogation by sequence similarity and keyword queries, and searching by genome coordinates is now being implemented. Downloads are freely available in XML, Fasta, SRS, SQL and tagged-text file formats. Each entry is presented in the context of its surrounding sequence and many records are related to neighboring human genes and affected features therein. Population allele frequencies are included wherever available. Thorough semi-automated data checking ensures internal consistency and addresses common errors in the source information. To keep pace with recent growth in the field, we have developed tools for fully automated annotation. All variants have been uniquely mapped to the draft genome sequence and are referenced to positions in EMBL/GenBank files. Data utility is enhanced by provision of genotyping assays and functional predictions. Recent data structure extensions allow the capture of haplotype and genotype information, and a new initiative (along with BiSC and HUGO-MDI) aims to create a central repository for the broad collection of clinical mutations and associated disease phenotypes of interest.

20.