Similar articles
 20 similar articles found (search time: 609 ms)
1.
The analysis of genetic data often requires a combination of several approaches using different and sometimes incompatible programs. In order to facilitate data exchange and file conversions between population genetics programs, we introduce PGDSpider, a Java program that can read 27 different file formats and export data into 29, partially overlapping, other file formats. The PGDSpider package includes both an intuitive graphical user interface and a command-line version allowing its integration in complex data analysis pipelines. AVAILABILITY: PGDSpider is freely available under the BSD 3-Clause license on http://cmpg.unibe.ch/software/PGDSpider/.

2.
BioJava: an open-source framework for bioinformatics (total citations: 1; self-citations: 0; citations by others: 1)
SUMMARY: BioJava is a mature open-source project that provides a framework for processing of biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language. AVAILABILITY: BioJava is an open-source project distributed under the Lesser GPL (LGPL). BioJava can be downloaded from the BioJava website (http://www.biojava.org). BioJava requires Java 1.5 or higher. All queries should be directed to the BioJava mailing lists. Details are available at http://biojava.org/wiki/BioJava:MailingLists.

3.
SUMMARY: Combo is a comparative genome browser that provides a dynamic view of whole genome alignments along with their associated annotations. Combo provides two different visualization perspectives. The perpendicular (dot plot) view provides a dot plot of genome alignments synchronized with a display of genome annotations along each axis. The parallel view displays two genome annotations horizontally, synchronized through a panel displaying local alignments as trapezoids. Users can zoom to any resolution, from whole chromosomes to individual bases. They can select, highlight and view detailed information from specific alignments and annotations. Combo is organism-agnostic and can import data from a variety of file formats. AVAILABILITY: Combo is integrated as part of the Argo Genome Browser, which also provides single-genome browsing and editing capabilities. Argo is written in Java, runs on multiple platforms and is freely available for download at http://www.broad.mit.edu/annotation/argo/.

4.
We present a Java application programming interface (API), jmzIdentML, for the Human Proteome Organisation (HUPO) Proteomics Standards Initiative (PSI) mzIdentML standard for peptide and protein identification data. The API combines the power of Java Architecture for XML Binding (JAXB) and an XPath-based random-access indexer to allow a fast and efficient mapping of extensible markup language (XML) elements to Java objects. The internal references in the mzIdentML files are resolved in an on-demand manner, where the whole file is accessed as a random-access swap file, and only the relevant piece of XML is selected for mapping to its corresponding Java object. The API is highly efficient in its memory usage and can handle files of arbitrary sizes. The API follows the official release of the mzIdentML (version 1.1) specifications and is available in the public domain under a permissive licence at http://www.code.google.com/p/jmzidentml/.

5.
We here present the jmzReader library: a collection of Java application programming interfaces (APIs) to parse the most commonly used peak list and XML-based mass spectrometry (MS) data formats: DTA, MS2, MGF, PKL, mzXML, mzData, and mzML (based on the already existing API jmzML). The library is optimized to be used in conjunction with mzIdentML, the recently released standard data format for reporting protein and peptide identifications, developed by the HUPO proteomics standards initiative (PSI). mzIdentML files do not contain spectra data but contain references to different kinds of external MS data files. As a key functionality, all parsers implement a common interface that supports the various methods used by mzIdentML to reference external spectra. Thus, when developing software for mzIdentML, programmers no longer have to support multiple MS data file formats but only this one interface. The library (which includes a viewer) is open source and, together with detailed documentation, can be downloaded from http://code.google.com/p/jmzreader/.

6.
SUMMARY: Accurate and complete mapping of short-read sequencing to a reference genome greatly enhances the discovery of biological results and improves statistical predictions. We recently presented RNA-MATE, a pipeline for the recursive mapping of RNA-Seq datasets. With the rapid increase in genome re-sequencing projects, progression of available mapping software and the evolution of file formats, we now present X-MATE, an updated version of RNA-MATE, capable of mapping both RNA-Seq and DNA datasets and with improved performance, output file formats, configuration files, and flexibility in core mapping software. AVAILABILITY: Executables, source code, junction libraries, test data and results and the user manual are available from http://grimmond.imb.uq.edu.au/X-MATE/.

7.
Storing biological sequence databases in relational form (total citations: 2; self-citations: 0; citations by others: 2)
SUMMARY: We have created a set of applications using Perl and Java in combination with XML technology to install biological sequence databases into an Oracle RDBMS. An easy-to-use interface using Java has been created for database query and other tools developed to integrate with our in-house bioinformatics applications. AVAILABILITY: The database schema, DTD file, and source codes are available from the authors via email. CONTACT: guochun_xie@merck.com

8.
Mass spectrometry-based proteomics is increasingly being used in biomedical research. These experiments typically generate a large volume of highly complex data, and the volume and complexity are only increasing with time. There exist many software pipelines for analyzing these data (each typically with its own file formats), and as technology improves, these file formats change and new formats are developed. Files produced from these myriad software programs may accumulate on hard disks or tape drives over time, with older files being rendered progressively more obsolete and unusable with each successive technical advancement and data format change. Although initiatives exist to standardize the file formats used in proteomics, they do not address the core failings of a file-based data management system: (1) files are typically poorly annotated experimentally, (2) files are "organically" distributed across laboratory file systems in an ad hoc manner, (3) file formats become obsolete, and (4) searching the data and comparing and contrasting results across separate experiments is very inefficient (if possible at all). Here we present a relational database architecture and accompanying web application dubbed Mass Spectrometry Data Platform that is designed to address the failings of the file-based mass spectrometry data management approach. The database is designed such that the output of disparate software pipelines may be imported into a core set of unified tables, with these core tables being extended to support data generated by specific pipelines. Because the data are unified, they may be queried, viewed, and compared across multiple experiments using a common web interface. Mass Spectrometry Data Platform is open source and freely available at http://code.google.com/p/msdapl/.

9.
MOTIVATION: BLAST programs are very efficient in finding similarities for sequences. However, for large datasets such as ESTs, manual extraction of the information from the batch BLAST output is needed. This can be time consuming, insufficient, and inaccurate. Therefore, implementation of a parser application would be extremely useful in extracting information from BLAST outputs. RESULTS: We have developed a Java application, Batch Blast Extractor, with a user-friendly graphical interface to extract information from BLAST output. The application generates a tab-delimited text file that can be easily imported into any statistical package such as Excel or SPSS for further analysis. For each BLAST hit, the program obtains and saves the essential features from the BLAST output file that would allow further analysis. The program was written in Java and is therefore OS independent. It works on both Windows and Linux OS with Java 1.4 and higher. It is freely available from: http://mcbc.usm.edu/BatchBlastExtractor/

10.
We describe PerlMAT, a Perl microarray toolkit providing easy-to-use object-oriented methods for the simplified manipulation, management and analysis of microarray data. The toolkit provides objects for the encapsulation of microarray spots and reporters, several common microarray data file formats and GAL files. In addition, an analysis object provides methods for data processing, and an image object enables the visualisation of microarray data. This important addition to the Perl developer's library will facilitate more widespread use of Perl for microarray application development within the bioinformatics community. The coherent interface and well-documented code enable rapid analysis by even inexperienced Perl developers. AVAILABILITY: Software is available at http://sourceforge.net/projects/perlmat

11.
SUMMARY: ACGT (a comparative genomics tool) is a genomic DNA sequence comparison viewer and analyzer. It can read a pair of DNA sequences in GenBank, EMBL or FASTA formats, with or without a comparison file, and provide users with many options to view and analyze the similarities between the input sequences. It is written in Java and can be run on Unix, Linux and Windows platforms. AVAILABILITY: The ACGT program is freely available with documentation and examples at http://db.systemsbiology.net/projects/local/mhc/acgt/

12.
Java-Dotter (JDotter) is a platform-independent Java interactive interface for the Linux version of Dotter, a widely used program for generating dotplots of large DNA or protein sequences. JDotter runs as a client-server application and can send new sequences to the Dotter program for alignment as well as rapidly access a repository of preprocessed dotplots. JDotter also interfaces with a sequence database or file system to display supplementary feature data. Thus, JDotter greatly simplifies access to dotplot data in laboratories that deal with large numbers of genomes and have a multi-platform organization. AVAILABILITY: Currently, JDotter is used via Java Web Start by the Poxvirus Bioinformatics Resource for examining dotplots of complete poxvirus genomes; http://athena.bioc.uvic.ca/pbr/jdotter/. The software is available for download from the same location. SUPPLEMENTARY INFORMATION: Installation instructions, the User's Manual, screenshots and examples are available at the JDotter home page http://athena.bioc.uvic.ca/pbr/jdotter/. The software and source code are free for non-commercial applications.

13.
SUMMARY: TOPALi is a new Java graphical analysis application that allows the user to identify recombinant sequences within a DNA multiple alignment (either automatically or via manual investigation). TOPALi allows a choice of three statistical methods to predict the positions of breakpoints due to past recombination. The breakpoint predictions are then used to identify putative recombinant sequences and their relationships to other sequences. In addition to its sophisticated interface, TOPALi can import many sequence formats, estimate and display phylogenetic trees and allow interactive analysis and/or automatic HTML report generation. AVAILABILITY: TOPALi is freely available from http://www.bioss.ac.uk/software.html

14.
15.
We describe multiple methods for accessing and querying the complex and integrated cellular data in the BioCyc family of databases: access through multiple file formats, access through Application Program Interfaces (APIs) for LISP, Perl and Java, and SQL access through the BioWarehouse relational database.

16.
MOTIVATION: Effective use of proteomics data, specifically mass spectrometry data, relies on the ability to read and write the many mass spectrometer file formats. Even with mass spectrometer vendor-specific libraries and vendor-neutral file formats, such as mzXML and mzData, it can be difficult to extract raw data files in a form suitable for batch processing and basic research. Introduced here is the ProteomeCommons.org Input and Output Framework, abbreviated to IO Framework, which is designed to abstractly represent mass spectrometry data. This project is a public, open-source, free-to-use framework that supports most of the mass spectrometry data formats, including current formats, legacy formats and proprietary formats that require a vendor-specific library in order to operate. The IO Framework includes an on-line tool for non-programmers and a set of libraries that developers may use to convert between various proteomics file formats. AVAILABILITY: The current source code and documentation for the ProteomeCommons.org IO Framework are freely available at http://www.proteomecommons.org/current/531/

17.
MOTIVATION: The availability of increasing amounts of sequence data about completely sequenced genomes spurs the development of new methods in the fields of automated annotation, and of comparative genomics. Tools allowing the visualization of results produced by analysis methods, superimposed on possibly annotated sequence data, and enabling synchronized navigation in multiple genomes, provide new means for interactive genome exploration. This kind of visual inspection can be used as a basis to assess the quality of new analysis algorithms, or to discover genome portions to be subjected to in-depth studies. RESULTS: We propose a software package, MuGeN, built for navigating through multiple annotated genomes. It is capable of retrieving annotated sequences in several formats, stored in local files, or available in databases over the network. From these, it then generates an interactive display, or an image file, in most common formats suitable for printing, further editing or integrating in Web pages. Genome maps may be mixed with computer analysis results loaded from XML files, whose format is generic enough to be adapted to a majority of sequence oriented analysis methods. AVAILABILITY: MuGeN is available at http://www-mig.jouy.inra.fr/bdsi/MuGeN.

18.

Background  

DNA microarrays have become the standard method for large-scale analyses of gene expression and epigenomics. The increasing complexity and inherent noisiness of the generated data make visual data exploration ever more important. Fast deployment of new methods as well as a combination of predefined, easy-to-apply methods with programmer's access to the data are important requirements for any analysis framework. Mayday is an open source platform with emphasis on visual data exploration and analysis. Many built-in methods for clustering, machine learning and classification are provided for dissecting complex datasets. Plugins can easily be written to extend Mayday's functionality in a large number of ways. As a Java program, Mayday is platform-independent and can be used as a Java WebStart application without any installation. Mayday can import data from several file formats, and database connectivity is included for efficient data organization. Numerous interactive visualization tools, including box plots, profile plots, principal component plots and a heatmap, are available; they can be enhanced with metadata and exported as publication-quality vector files.

19.
COMPAM is a tool for visualizing relationships among multiple whole genomes by combining all pairwise genome alignments. It displays shared conserved regions (blocks) and where these blocks occur (edges) as block relation graphs which can be explored interactively. An unannotated genome, for example, can then be explored using information from well-annotated genomes, COG-based genome annotation and genes. COMPAM can run either as a stand-alone application or through an applet that is provided as a service to PLATCOM, a toolset for whole genome comparative analysis, where a wide variety of genomes can be easily selected. Features provided by COMPAM include the ability to export genome relationship information into file formats that can be used by other existing tools. AVAILABILITY: http://bio.informatics.indiana.edu/projects/compam/

20.
The design of Jemboss: a graphical user interface to EMBOSS (total citations: 2; self-citations: 0; citations by others: 2)
DESIGN: Jemboss is a graphical user interface (GUI) for the European Molecular Biology Open Software Suite (EMBOSS). It is being developed at the MRC UK HGMP-RC as part of the EMBOSS project. This paper explains the technical aspects of the Jemboss client-server design. The client-server model optionally allows a Jemboss user to have an account on the remote server. The Jemboss client is written in Java and is downloaded automatically to a user's workstation via Java Web Start using the HTTP protocol. The client then communicates with the remote server using SOAP (Simple Object Access Protocol). A Tomcat server listens on the remote machine and communicates the SOAP requests to a Jemboss server, again written in Java. This Java server interprets the client requests and executes them through Java Native Interface (JNI) code written in the C language. Another C program having setuid privilege, jembossctl, is called by the JNI code to perform the client requests under the user's account on the server. The commands include execution of EMBOSS applications, file management and project management tasks. Jemboss allows the use of JSSE for encryption of communication between the client and server. The GUI parses the EMBOSS Ajax Command Definition language for form generation and maximum input flexibility. Jemboss interacts directly with the EMBOSS libraries to allow dynamic generation of application default settings. RESULTS: This interface is part of the EMBOSS distribution and has attracted much interest. It has been set up at many other sites globally as well as being used at the HGMP-RC for registered users. AVAILABILITY: The software, EMBOSS and Jemboss, is freely available to academic and commercial users under the GPL licence. It can be downloaded from the EMBOSS ftp server: http://www.uk.embnet.org/Software/EMBOSS/, ftp://ftp.uk.embnet.org/pub/EMBOSS/. Registered HGMP-RC users can access an installed server from: http://www.uk.embnet.org/Software/EMBOSS/Jemboss/

