Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
MOTIVATION: Effective use of proteomics data, specifically mass spectrometry data, relies on the ability to read and write the many mass spectrometer file formats. Even with mass spectrometer vendor-specific libraries and vendor-neutral file formats, such as mzXML and mzData, it can be difficult to extract raw data files in a form suitable for batch processing and basic research. Introduced here is the ProteomeCommons.org Input and Output Framework, abbreviated to IO Framework, which is designed to abstractly represent mass spectrometry data. This project is a public, open-source, free-to-use framework that supports most of the mass spectrometry data formats, including current formats, legacy formats and proprietary formats that require a vendor-specific library in order to operate. The IO Framework includes an on-line tool for non-programmers and a set of libraries that developers may use to convert between various proteomics file formats. AVAILABILITY: The current source code and documentation for the ProteomeCommons.org IO Framework are freely available at http://www.proteomecommons.org/current/531/

2.
We developed JVirGel, a collection of tools for the simulation and analysis of proteomics data. The software creates and visualizes virtual two-dimensional (2D) protein gels based on the migration behaviour of proteins as a function of their theoretical molecular weights and calculated isoelectric points. Using all proteins of an organism of interest, as deduced from the genes of the corresponding genome project, and eliminating obvious membrane proteins permits the creation of an optimized calculated proteome map. The electrophoretic separation behaviour of single proteins is accessible interactively in a Java(TM) applet (a small application in a web browser) by selecting a pI/MW range and an electrophoretic timescale of interest. The calculated pattern of protein spots helps to identify unknown proteins and to localize known proteins during experimental proteomics approaches. Differences between the experimentally observed and the calculated migration behaviour of certain proteins provide first indications of potential protein modification events. Where possible, the protein spots are linked directly via a mouse click to the public databases SWISS-PROT and PRODORIC. Additionally, we provide tools for the serial calculation and visualization of specific protein properties such as pH-dependent charge curves and hydrophobicity profiles. These values are helpful for the rational establishment of protein purification procedures. The proteomics tools are available on the World Wide Web at http://prodoric.tu-bs.de/proteomics.php.
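The pH-dependent charge curve and calculated isoelectric point mentioned in this abstract can be illustrated with a minimal sketch. It is not JVirGel's implementation: the pKa values below are rough textbook-style assumptions (published pKa sets differ), and the names `net_charge` and `isoelectric_point` are hypothetical. The charge of each ionizable group follows the Henderson-Hasselbalch relation, and the pI is found by bisection because the net charge decreases monotonically with pH.

```python
# Illustrative pKa values (assumptions; published sets differ from one another).
PKA_POS = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 3.1}


def net_charge(sequence, ph):
    """Estimate the net charge of a protein sequence at a given pH."""
    # Free N-terminus (positive when protonated) and C-terminus (negative when deprotonated).
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_POS["N_TERM"]))
    charge -= 1.0 / (1.0 + 10 ** (PKA_NEG["C_TERM"] - ph))
    for aa in sequence:
        if aa in PKA_POS:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POS[aa]))
        elif aa in PKA_NEG:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEG[aa] - ph))
    return charge


def isoelectric_point(sequence, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH at which the estimated net charge crosses zero (bisection)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid  # still positive: the pI lies at a higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


# A charge curve is simply net_charge evaluated over a pH grid.
curve = [(ph / 2.0, net_charge("MKDEHRYC", ph / 2.0)) for ph in range(0, 29)]
```

Plotting `curve` gives the kind of charge profile that can guide the choice of an ion-exchange pH; the absolute pI values depend strongly on the pKa set chosen.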

3.
The mzQuantML standard from the HUPO Proteomics Standards Initiative has recently been released, capturing quantitative data about peptides and proteins, following analysis of MS data. We present a Java application programming interface (API) for mzQuantML called jmzQuantML. The API provides robust bridges between Java classes and elements in mzQuantML files and allows random access to any part of the file. The API provides read and write capabilities, and is designed to be embedded in other software packages, enabling mzQuantML support to be added to proteomics software tools (http://code.google.com/p/jmzquantml/). The mzQuantML standard is designed around a multilevel validation system to ensure that files are structurally and semantically correct for different proteomics quantitative techniques. In this article, we also describe a Java software tool (http://code.google.com/p/mzquantml‐validator/) for validating mzQuantML files, which is a formal part of the data standard.

4.
SUMMARY: Besides classical clustering methods such as hierarchical clustering, in recent years biclustering has become a popular approach to analyze biological data sets, e.g. gene expression data. The Biclustering Analysis Toolbox (BicAT) is a software platform for clustering-based data analysis that integrates various biclustering and clustering techniques within a common graphical user interface. Furthermore, BicAT provides different facilities for data preparation, inspection and postprocessing, such as discretization, filtering of biclusters according to specific criteria, or gene pair analysis for constructing gene interconnection graphs. The possibility to use different biclustering algorithms inside a single graphical tool allows the user to compare clustering results and choose the algorithm that best fits a specific biological scenario. The toolbox is described in the context of gene expression analysis, but is also applicable to other types of data, e.g. data from proteomics or synthetic lethal experiments. AVAILABILITY: The BicAT toolbox is freely available at http://www.tik.ee.ethz.ch/sop/bicat and runs on all operating systems. The Java source code of the program and a developer's guide are provided on the website as well, so users may modify the program and add further algorithms or extensions.

5.
MOTIVATION: After the publication of JVirGel 1.0 in 2003, we received many requests and suggestions from the proteomics community to improve the performance of the software and to add new features. RESULTS: The integration of the PrediSi algorithm for the prediction of signal peptides for the Sec-dependent protein export into JVirGel 2.0 allows the exclusion of most exported preproteins from calculated proteomic maps and provides the basis for the calculation of Sec-based secretomes. A tool for the identification of proteins carrying transmembrane helices (JCaMelix) and the prediction of the corresponding membrane proteome was added. Finally, in order to directly compare experimental and calculated proteome data, a function to overlay and evaluate predicted and experimental two-dimensional gels was included. AVAILABILITY: JVirGel 2.0 is freely available as a precompiled package for installation on Windows or Linux operating systems. Furthermore, a completely platform-independent Java version is available for download. Additionally, we provide a Java Server Pages based version of JVirGel 2.0 which can be operated in nearly all web browsers. All versions are accessible at http://www.jvirgel.de

6.
Many top-down proteomics experiments focus on identifying and localizing PTMs and other potential sources of "mass shift" on a known protein sequence. A simple application to match ion masses and facilitate the iterative hypothesis testing of PTM presence and location would assist with the data analysis in these experiments. ProSight Lite is a free software tool for matching a single candidate sequence against a set of mass spectrometric observations. Fixed or variable modifications, including both PTMs and a select number of glycosylations, can be applied to the amino acid sequence. The application reports multiple scores and a matching fragment list. Fragmentation maps can be exported for publication in either portable network graphic (PNG) or scalable vector graphic (SVG) format. ProSight Lite can be freely downloaded from http://prosightlite.northwestern.edu, installs and updates from the web, and requires Windows 7 or a higher version.
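The idea of matching one candidate sequence, with optional mass shifts, against observed fragment masses can be sketched as follows. This is an illustration only, not ProSight Lite's algorithm or scoring: it assumes singly protonated b and y ions, standard monoisotopic residue masses, and a ppm tolerance; the function names and the `mods` argument are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_mz(sequence, mods=None):
    """Return singly charged b- and y-ion m/z values for a sequence.

    mods maps a 0-based residue index to a mass shift in Da (hypothetical
    convention used only for this sketch).
    """
    mods = mods or {}
    masses = [MONO[aa] + mods.get(i, 0.0) for i, aa in enumerate(sequence)]
    ions = {}
    for i in range(1, len(masses)):
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        ions[f"y{i}"] = sum(masses[-i:]) + WATER + PROTON
    return ions


def match_fragments(observed, ions, ppm_tol=10.0):
    """List (ion name, theoretical m/z, observed m/z) within the tolerance."""
    hits = []
    for name, theo in ions.items():
        for mz in observed:
            if abs(mz - theo) / theo * 1e6 <= ppm_tol:
                hits.append((name, theo, mz))
    return hits


# Hypothesis test: does a +79.96633 Da shift on T4 explain the data better?
plain = fragment_mz("PEPTIDE")
phospho = fragment_mz("PEPTIDE", mods={3: 79.96633})
```

Comparing how many observed peaks match `plain` versus `phospho` mirrors the iterative hypothesis testing described above: only fragments that contain the modified residue shift in mass, which is what localizes the modification.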

7.
8.
9.
In shotgun proteomics, raw tandem MS data are processed with extraction tools to produce condensed peak lists that can be uploaded to database search engines. Many extraction tools are available, but to our knowledge a systematic comparison of such tools has not yet been carried out. Using raw data containing more than 400,000 tandem MS spectra acquired using an Orbitrap Velos, we compared 9 tandem MS extraction tools, both freely available and commercial. We compared the tools with respect to the number of extracted MS/MS events, fragment ion information, number of matches, precursor mass accuracies and agreement between tools. Processing a primary data set with 9 different tandem MS extraction tools resulted in a low overlap of identified peptides. The tools differ in the assigned charge states of precursors and in precursor and fragment ion masses, and we show that peptides identified very confidently using one extraction tool might not be matched when using another tool. We also found a bias towards peptides of lower charge state when extracting fragment ion data from higher resolution raw data without deconvolution. Collecting and comparing the extracted data from the same raw data allows adjusting parameters and expectations and selecting the right tool for extraction of tandem MS data.
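The "overlap of identified peptides" between tools can be quantified in several ways; the abstract does not say which measure was used. A minimal sketch, assuming each tool's identifications are available as a set of peptide sequences and using the Jaccard index as the (assumed) overlap measure, with hypothetical tool names:

```python
from itertools import combinations


def pairwise_overlap(ids_by_tool):
    """Jaccard overlap of identified peptides for every pair of tools."""
    result = {}
    for a, b in combinations(sorted(ids_by_tool), 2):
        set_a, set_b = ids_by_tool[a], ids_by_tool[b]
        union = set_a | set_b
        result[(a, b)] = len(set_a & set_b) / len(union) if union else 0.0
    return result


def core_peptides(ids_by_tool):
    """Peptides identified regardless of which extraction tool was used."""
    sets = list(ids_by_tool.values())
    return set.intersection(*sets) if sets else set()


# Hypothetical identifications from three extraction tools.
ids = {
    "toolA": {"PEPTIDER", "ELVISK"},
    "toolB": {"PEPTIDER", "LIVESK"},
    "toolC": {"PEPTIDER"},
}
overlap = pairwise_overlap(ids)
core = core_peptides(ids)
```

A small core set relative to the per-tool totals is the pattern the abstract describes as low overlap.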

10.
New developments in proteomics enable scientists to examine hundreds to thousands of proteins in parallel. Quantitative proteomics allows the comparison of different proteomes of cells, tissues, or body fluids with each other. Analyzing and especially organizing these data sets is often a Herculean task. Pathway analysis software tools aim to take over this task based on present knowledge. Companies promise that their algorithms help to understand the significance of scientists' data, but the benefit remains questionable, and a fundamental systematic evaluation of the potential of such tools has not been performed until now. Here, we tested the commercial Ingenuity Pathway Analysis tool as well as the freely available software STRING using a well-defined study design with regard to the applicability and value of their results for proteome studies. Our goal was to cover a wide range of scientific issues by simulating different established pathways including mitochondrial apoptosis, tau phosphorylation, and Insulin-, App-, and Wnt-signaling. In addition to a general assessment and comparison of the pathway analysis tools, we provide recommendations for users as well as for software developers to improve the added value of a pathway study implementation in proteomic pipelines.

11.
SUMMARY: AVA (Array Visual Analyzer) is a Java program that provides a graphical environment for visualization and analysis of gene expression microarray data. Together with its interactive visualization tools and a variety of built-in data analysis and filtration methods, AVA effectively integrates microarray data normalization, quality assessment, and data mining into one application. AVAILABILITY: The software is freely available for academic users on request from the authors.

12.
Protein identification using MS is an important technique in proteomics as well as a major generator of proteomics data. We have designed the protein identification data object model (PDOM) and developed a parser based on this model to facilitate the analysis and storage of these data. The parser works with HTML or XML files saved or exported from MASCOT MS/MS ions search in peptide summary report or MASCOT PMF search in protein summary report. The program creates PDOM objects, eliminates redundancy in the input file, and has the capability to output any PDOM object to a relational database. This program facilitates additional analysis of MASCOT search results and aids the storage of protein identification information. The implementation is extensible and can serve as a template to develop parsers for other search engines. The parser can be used as a stand-alone application or can be driven by other Java programs. It is currently being used as the front end for a system that loads HTML and XML result files of MASCOT searches into a relational database. The source code is freely available at http://www.ccbm.jhu.edu and the program uses only free and open-source Java libraries.

13.
Recent improvements in proteomic technologies have collectively yielded data sets that far exceed the capabilities of typical low-throughput interpretation strategies. Unfortunately, tools designed to leverage the "peptide-centric" content of MS-based proteomics lag the current rate of data production. Here, we describe Pathway Palette (http://blaispathways.dfci.harvard.edu), a freely accessible internet application that enables researchers to easily transition from peptides to biological pathways, while simultaneously retaining the qualitative and quantitative aspects of the underlying MS data.

14.
Methods for treating MS/MS data to achieve accurate peptide identification are currently the subject of much research activity. In this study we describe a new method for filtering MS/MS data and refining precursor masses that provides highly accurate analyses of massive sets of proteomics data. This method, termed "postexperiment monoisotopic mass filtering and refinement" (PE-MMR), consists of several data processing steps: 1) generation of lists of all monoisotopic masses observed in a whole LC/MS experiment, 2) clustering of the monoisotopic masses of a peptide into unique mass classes (UMCs) based on their masses and LC elution times, 3) matching the precursor masses of the MS/MS data to a representative mass of a UMC, and 4) filtration of the MS/MS data based on the presence of corresponding monoisotopic masses and refinement of the precursor ion masses by the UMC mass. PE-MMR increases the throughput of proteomics data analysis by efficiently removing "garbage" MS/MS data prior to database searching, and improves the mass measurement accuracies (i.e. 0.05 +/- 1.49 ppm for yeast data (from 4.46 +/- 2.81 ppm) and 0.03 +/- 3.41 ppm for glycopeptide data (from 4.8 +/- 7.4 ppm)) for an increased number of identified peptides. In proteomics analyses of glycopeptide-enriched samples, PE-MMR processing greatly reduces the degree of false glycopeptide identification by correctly assigning the monoisotopic masses for the precursor ions prior to database searching. By applying this technique to analyses of proteome samples of varying complexities, we demonstrate herein that PE-MMR is an effective and accurate method for treating massive sets of proteomics data.
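Steps 2 to 4 of the procedure described above can be sketched in a few lines. This is a simplified illustration, not the published PE-MMR implementation: the greedy clustering rule, the use of the mean as the representative UMC mass, the tolerances, and all names (`Feature`, `cluster_umcs`, `refine_precursors`) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    mass: float  # monoisotopic mass in Da
    rt: float    # LC elution time in minutes


def ppm(observed, reference):
    """Signed mass error in parts per million."""
    return (observed - reference) / reference * 1e6


def cluster_umcs(features, ppm_tol=10.0, rt_tol=1.0):
    """Step 2: greedily group features that are close in mass and elution time."""
    umcs = []
    for f in sorted(features, key=lambda x: (x.mass, x.rt)):
        for umc in umcs:
            last = umc[-1]
            if abs(ppm(f.mass, last.mass)) <= ppm_tol and abs(f.rt - last.rt) <= rt_tol:
                umc.append(f)
                break
        else:
            umcs.append([f])  # no existing class fits: start a new UMC
    return umcs


def umc_mass(umc):
    """Representative mass of a UMC (here simply the mean; an assumption)."""
    return sum(f.mass for f in umc) / len(umc)


def refine_precursors(precursors, umcs, ppm_tol=10.0, rt_tol=1.0):
    """Steps 3-4: drop precursors with no matching UMC, refine the rest."""
    refined = []
    for p in precursors:
        for umc in umcs:
            rep = umc_mass(umc)
            rt_lo = min(f.rt for f in umc) - rt_tol
            rt_hi = max(f.rt for f in umc) + rt_tol
            if abs(ppm(p.mass, rep)) <= ppm_tol and rt_lo <= p.rt <= rt_hi:
                refined.append(Feature(rep, p.rt))  # precursor mass replaced by UMC mass
                break
    return refined


# Toy data: one peptide seen in three scans, plus an unrelated feature.
features = [Feature(1000.000, 10.0), Feature(1000.002, 10.2),
            Feature(1000.004, 10.4), Feature(1500.000, 20.0)]
umcs = cluster_umcs(features)
kept = refine_precursors([Feature(1000.006, 10.1), Feature(1200.000, 10.0)], umcs)
```

In the toy data, the precursor at 1200.000 Da has no supporting monoisotopic mass and is filtered out, while the precursor at 1000.006 Da is kept with its mass replaced by the UMC mass.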

15.
MOTIVATION: Many tools have been developed to visualize protein structures. Tools based on Java 3D(TM) are compatible across different systems and can be run remotely through web browsers. However, using Java 3D for visualization raises performance issues. The primary concerns about molecular visualization tools based on Java 3D are their slow interaction speed and their inability to load large molecules. This behavior is especially apparent when the number of atoms to be displayed is huge, or when several proteins are to be displayed simultaneously for comparison. RESULTS: In this paper we present techniques for organizing a Java 3D scene graph to tackle these problems. We have developed a protein visualization system based on Java 3D and these techniques. We demonstrate the effectiveness of the proposed method by comparing the visualization component of our system with two other Java 3D based molecular visualization tools. In particular, for van der Waals display mode, with the efficient organization of the scene graph, we could achieve up to an eightfold improvement in rendering speed and could load molecules three times as large as the previous systems could. AVAILABILITY: EPV is freely available with source code at the following URL: http://www.cs.ucsb.edu/~tcan/fpv/

16.
KEGG: Kyoto Encyclopedia of Genes and Genomes.   Total citations: 14, self-citations: 0, citations by others: 14
Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database, which consists of graphical diagrams of biochemical pathways, including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by the ortholog group tables summarizing orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with a graphical genome map of chromosomal locations that is displayed by a Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as tools for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from gene expression profiles. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

17.
The availability of user-friendly software to annotate biological datasets and experimental details is becoming essential in data management practices, both in local storage systems and in public databases. The Ontology Lookup Service (OLS, http://www.ebi.ac.uk/ols) is a popular centralized service to query, browse and navigate biomedical ontologies and controlled vocabularies. Recently, the OLS framework has been completely redeveloped (version 3.0), including enhancements in the data model, like the added support for Web Ontology Language based ontologies, among many other improvements. However, the new OLS is not backwards compatible and new software tools are needed to enable access to this widely used framework now that the previous version is no longer available. We here present the OLS Client as a free, open-source Java library to retrieve information from the new version of the OLS. It enables rapid tool creation by providing a robust, pluggable programming interface and common data model to programmatically access the OLS. The library has already been integrated and is routinely used by several bioinformatics resources and related data annotation tools. Secondly, we also introduce an updated version of the OLS Dialog (version 2.0), a Java graphical user interface that can be easily plugged into Java desktop applications to access the OLS. The software and related documentation are freely available at https://github.com/PRIDE-Utilities/ols-client and https://github.com/PRIDE-Toolsuite/ols-dialog.

18.
PRIDE: the proteomics identifications database   Total citations: 2, self-citations: 0, citations by others: 2
The advent of high-throughput proteomics has enabled the identification of ever increasing numbers of proteins. Correspondingly, the number of publications centered on these protein identifications has increased dramatically. With the first results of the HUPO Plasma Proteome Project being analyzed and many other large-scale proteomics projects about to disseminate their data, this trend is not likely to flatten out any time soon. However, the publication mechanism of these identified proteins has lagged behind in technical terms. Often very long lists of identifications are either published directly with the article, resulting in both a voluminous and rather tedious read, or are included on the publisher's website as supplementary information. In either case, these lists are typically only provided as portable document format documents with a custom-made layout, making it practically impossible for computer programs to interpret them, let alone efficiently query them. Here we propose the proteomics identifications (PRIDE) database (http://www.ebi.ac.uk/pride) as a means to finally turn publicly available data into publicly accessible data. PRIDE offers a web-based query interface, a user-friendly data upload facility, and a documented application programming interface for direct computational access. The complete PRIDE database, source code, data, and support tools are freely available for web access or download and local installation.

19.
Complete analysis of the phosphorylation of serine and threonine residues directly from biological extracts is still at an early stage and will remain a challenging goal for many years. Analysis of phosphorylated proteins and identification of the phosphorylated sites in a crude biological extract is a major topic in proteomics, since phosphorylation plays a dominant role in post-translational protein modification. Beta elimination of the serine/threonine-bound phosphate by alkali action generates (methyl)dehydroalanine. The reactivity of this group, which is susceptible to nucleophilic attack, might be used as a tool for phosphoproteome analysis. Most of the known serine/threonine kinases recognize motifs in protein targets that are rich in lysine(s) and/or arginine(s). The (methyl)dehydroalanine resulting from beta elimination of the serine/threonine-bound phosphate by alkali action is likely to react with the amino groups of these neighboring amino acids. Furthermore, the addition reaction of dehydroalanine-peptides with a nucleophilic group most likely generates diastereoisomeric derivatives. The internal cyclic bonds and/or the stereoisomeric peptide derivatives thus generated confer resistance to trypsin cleavage and/or constitute stop signals for exopeptidases such as carboxypeptidase. This might form the basis of a method to facilitate the systematic identification of phosphorylated peptides.

20.
Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data, we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its user-friendly graphical interface, which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms, and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available, and most analysis functions in GProX create customizable high-quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis, providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.
