Similar articles
 Found 20 similar articles (search time: 22 ms)
1.
Producing gene fusions through genomic structural rearrangements is a major mechanism for tumor evolution. Therefore, accurately detecting gene fusions and the originating rearrangements is of great importance for personalized cancer diagnosis and targeted therapy. We present a tool, BreakTrans, that systematically maps predicted gene fusions to structural rearrangements. Thus, BreakTrans not only validates both types of predictions, but also provides mechanistic interpretations. BreakTrans effectively validates known fusions and discovers novel events in a breast cancer cell line. Applying BreakTrans to 43 breast cancer samples in The Cancer Genome Atlas identifies 90 genomically validated gene fusions. BreakTrans is available at http://bioinformatics.mdanderson.org/main/BreakTrans

2.
Introduction: Mass spectrometry (MS)-based proteomics has become an indispensable tool for the characterization of the proteome and its post-translational modifications (PTM). In addition to standard protein sequence databases, proteogenomics strategies search the spectral data against the theoretical spectra obtained from customized protein sequence databases. To date, there are no published proteogenomics studies on acute myeloid leukemia (AML) samples.

Areas covered: Proteogenomics involves the understanding of genomic and proteomic data. The intersection of both datatypes requires advanced bioinformatics skills. A standard proteogenomics workflow that could be used for the study of AML samples is described. The generation of customized protein sequence databases as well as bioinformatics tools and pipelines commonly used in proteogenomics are discussed in detail.

Expert commentary: Drawing on evidence from recent cancer proteogenomics studies and taking into account the public availability of AML genomic data, the interpretation of present and future MS-based AML proteomic data using AML-specific protein sequence databases could discover new biological mechanisms and targets in AML. However, proteogenomics workflows including bioinformatics guidelines can be challenging for the wide AML research community. It is expected that further automation and simplification of the bioinformatics procedures might attract AML investigators to adopt the proteogenomics strategy.


3.
Introduction: The accurate and comprehensive determination of peptide hormones from biological fluids has represented a considerable challenge to analytical chemists for decades. Besides long-established bioanalytical ligand binding assays (ELISA, RIA, etc.), more and more mass spectrometry-based methods have been developed recently for purposes commonly referred to as targeted proteomics. Eventually the combination of both, analyte extraction by immunoaffinity and subsequent detection by mass spectrometry, has been shown to synergistically enhance the test methods' performance characteristics.

Areas covered: The review provides an overview of the current state of existing methods and applications concerning the analysis of endogenous peptide hormones. Here, special focus is on recent developments considering the extraction procedures with immobilized antibodies, the subsequent separation of target analytes, and their detection by mass spectrometry.

Expert commentary: Key aspects of procedures aiming at the detection and/or quantification of peptidic analytes in biological matrices have experienced considerable improvements in the last decade, particularly in terms of the assays' sensitivity, the option of multiplexing target compounds, automation, and high throughput operation. Despite these advances and the progress expected in the near future, immunoaffinity purification coupled to mass spectrometry is not yet a standard procedure in routine analysis compared to ELISA/RIA.


4.
In recent years, mass spectrometry has become one of the core technologies for high throughput proteomic profiling in biomedical research. However, reproducibility of the results using this technology was in question. It has been realized that sophisticated automatic signal processing algorithms using advanced statistical procedures are needed to analyze high resolution and high dimensional proteomic data, e.g., Matrix-Assisted Laser Desorption/Ionization Time-of-Flight (MALDI-TOF) data. In this paper we present an R-based software package, pkDACLASS, which provides a complete data analysis solution for users of MALDI-TOF raw data. Complete data analysis comprises data preprocessing, monoisotopic peak detection through statistical model fitting and testing, alignment of the monoisotopic peaks for multiple samples and classification of the normal and diseased samples through the detected peaks. The software gives users the flexibility to run the complete, integrated analysis in one step or to use it as a flexible platform and inspect the results at every step of the analysis. AVAILABILITY: The package is available for free at http://cran.r-project.org/web/packages/pkDACLASS/index.html.

5.
Warp2D is a novel time alignment approach, which uses the overlapping peak volume of the reference and sample peak lists to correct misleading peak shifts. Here, we present an easy-to-use web interface for high-throughput Warp2D batch processing time alignment service using the Dutch Life Science Grid, reducing processing time from days to hours. This service provides the warping function, the sample chromatogram peak list with adjusted retention times and normalized quality scores based on the sum of overlapping peak volume of all peaks. Heat maps before and after time alignment are created from the arithmetic mean of the sum of overlapping peak area rearranged with hierarchical clustering, allowing the quality control of the time alignment procedure. Taverna workflow and command line tool are provided for remote processing of local user data. AVAILABILITY: online data processing service is available at http://www.nbpp.nl/warp2d.html. Taverna workflow is available at myExperiment with title '2D Time Alignment-Webservice and Workflow' at http://www.myexperiment.org/workflows/1283.html. Command line tool is available at http://www.nbpp.nl/Warp2D_commandline.zip. CONTACT: p.l.horvatovich@rug.nl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

6.
Background: Mass spectrometry (MS) is becoming the gold standard for biomarker discovery. Several MS-based bioinformatics methods have been proposed for this application, but the divergence of the findings by different research groups on the same MS data suggests that the definition of a reliable method has not been achieved yet. In this work, we propose an integrated software platform, MASCAP, intended for comparative biomarker detection from MALDI-TOF MS data. Results: MASCAP integrates denoising and feature extraction algorithms, which have already been shown to provide consistent peaks across mass spectra; furthermore, it relies on statistical analysis and graphical tools to compare the results between groups. The effectiveness in mass spectrum processing is demonstrated using MALDI-TOF data, as well as SELDI-TOF data. The usefulness in detecting potential protein biomarkers is shown comparing MALDI-TOF mass spectra collected from serum and plasma samples belonging to the same clinical population. Conclusions: The analysis approach implemented in MASCAP may simplify biomarker detection, by assisting the recognition of proteomic expression signatures of the disease. A MATLAB implementation of the software and the data used for its validation are available at http://www.unich.it/proteomica/bioinf.

7.
Mass spectrometry data are often corrupted by noise. It is very difficult to simultaneously detect low-abundance peaks and reduce false-positive peak detection caused by noise. In this paper, we propose to improve peak detection using an additional constraint: the consistent appearance of similar true peaks across multiple spectra. We observe that false-positive peaks in general do not repeat themselves well across multiple spectra. When we align all the identified peaks (including false-positive ones) from multiple spectra together, those false-positive peaks are not as consistent as true peaks. Thus, we propose to use information from other spectra in order to reduce false-positive peaks. The new method improves the detection of peaks over the traditional single spectrum based peak detection methods. Consequently, the discovery of cancer biomarkers also benefits from this improvement. Source code and additional data are available at: http://www.ece.ust.hk/~eeyu/mspeak.htm.
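The cross-spectrum consistency idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the alignment or scoring procedure, so the function names, the tolerance `tol`, and the support threshold `min_support` are assumptions made for illustration. Candidate peaks from all spectra are pooled, grouped by m/z proximity, and a group is kept only if enough distinct spectra contribute to it.

```python
from typing import List, Sequence


def local_maxima(mz: Sequence[float], intensity: Sequence[float],
                 min_height: float) -> List[float]:
    """Return m/z positions of simple local maxima at or above min_height."""
    peaks = []
    for i in range(1, len(intensity) - 1):
        if (intensity[i] >= min_height
                and intensity[i - 1] < intensity[i] >= intensity[i + 1]):
            peaks.append(mz[i])
    return peaks


def consistent_peaks(peak_lists: List[List[float]], tol: float,
                     min_support: int) -> List[float]:
    """Keep candidate peaks that recur in at least min_support spectra.

    Hypothetical helper: candidates are pooled, sorted by m/z, and chained
    into a group while the gap to the previous candidate is <= tol.
    """
    pooled = sorted((p, idx) for idx, peaks in enumerate(peak_lists)
                    for p in peaks)
    groups: List[List[tuple]] = []
    for p, idx in pooled:
        if groups and p - groups[-1][-1][0] <= tol:
            groups[-1].append((p, idx))
        else:
            groups.append([(p, idx)])

    consensus = []
    for group in groups:
        support = len({idx for _, idx in group})  # distinct spectra in group
        if support >= min_support:
            consensus.append(sum(p for p, _ in group) / len(group))
    return consensus
```

For example, with candidate lists `[[100.0, 250.0], [100.2, 400.0], [99.9, 250.1]]`, `tol=0.5` and `min_support=2`, the isolated candidate at 400.0 is dropped as a likely false positive, while the groups near 100 and 250 are kept.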

8.
SUMMARY: New methods are presented for processing and visualizing mass spectrometry based molecular profile data, implemented as part of the recently introduced MZmine software. They include new features and extensions such as support for the mzXML data format, the capability to perform batch processing for large numbers of files, support for parallel processing, new methods for calculating peak areas using a post-alignment peak picking algorithm and implementation of Sammon's mapping and curvilinear distance analysis for data visualization and exploratory analysis. AVAILABILITY: MZmine is available under GNU Public license from http://mzmine.sourceforge.net/.

9.
Introduction: Despite the rapid evolution of proteomic methods, protein interactions and their participation in protein complexes, an important aspect of their function, have rarely been investigated on the proteome-wide level. Disease states, such as muscular dystrophy or viral infection, are induced by interference in protein-protein interactions within complexes. The purpose of this review is to describe the current methods for global complexome analysis and to critically discuss the challenges and opportunities for the application of these methods in biomedical research.

Areas covered: We discuss advancements in experimental techniques and computational tools that facilitate profiling of the complexome. The main focus is on the separation of native protein complexes via size exclusion chromatography and gel electrophoresis, which has recently been combined with quantitative mass spectrometry, for a global protein-complex profiling. The development of this approach has been supported by advanced bioinformatics strategies and fast and sensitive mass spectrometers that have allowed the analysis of whole cell lysates. The application of this technique to biomedical research is assessed, and future directions are anticipated.

Expert commentary: The methodology is quite new, and has already shown great potential when combined with complementary methods for detection of protein complexes.


10.
MOTIVATION: Surface-enhanced laser desorption and ionization (SELDI) time of flight (TOF) is a mass spectrometry technology. The key features in a mass spectrum are its peaks. In order to locate the peaks and quantify their intensities, several pre-processing steps are required. Though different approaches to perform pre-processing have been proposed, there is no systematic study that compares their performance. RESULTS: In this article, we present the results of a systematic comparison of various popular packages for pre-processing of SELDI-TOF data. We evaluate their performance in terms of two of their primary functions: peak detection and peak quantification. Regarding peak quantification, the performance of the algorithms is measured in terms of reproducibility. For peak detection, the comparison is based on sensitivity and false discovery rate. Our results show that for spectra generated with low laser intensity, the software developed by Ciphergen Biosystems (ProteinChip Software 3.1 with the additional tool Biomarker Wizard) produces relatively good results for both peak quantification and detection. On the other hand, for the data produced with either medium or high laser intensity, none of the methods shows uniformly better performance under both criteria. Our analysis suggests that an advantageous combination is the use of the packages MassSpecWavelet and PROcess, the former for peak detection and the latter for peak quantification.
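The two peak-detection criteria named in this abstract, sensitivity and false discovery rate, can be sketched as follows. The abstract does not say how detected peaks were matched to reference peaks, so the one-to-one matching within a mass tolerance `tol` and the function name `evaluate_detection` are assumptions for illustration only.

```python
from typing import Sequence, Tuple


def evaluate_detection(detected: Sequence[float], truth: Sequence[float],
                       tol: float) -> Tuple[float, float]:
    """Return (sensitivity, false discovery rate) for a detected peak list.

    Assumed matching rule: a detected peak counts as a true positive if it
    lies within tol of a not-yet-matched reference peak (two-pointer sweep
    over both sorted lists).
    """
    det, ref = sorted(detected), sorted(truth)
    i = j = true_pos = 0
    while i < len(det) and j < len(ref):
        if abs(det[i] - ref[j]) <= tol:
            true_pos += 1
            i += 1
            j += 1
        elif det[i] < ref[j]:
            i += 1  # detected peak with no reference partner
        else:
            j += 1  # reference peak that was missed
    sensitivity = true_pos / len(ref) if ref else 0.0
    fdr = (len(det) - true_pos) / len(det) if det else 0.0
    return sensitivity, fdr
```

With detected peaks `[1000.2, 1500.0, 2000.4, 2600.0]`, reference peaks `[1000.0, 2000.0, 3000.0]` and `tol=0.5`, two of three reference peaks are recovered and two of four detections are spurious.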

11.
Shotgun proteomics experiments are dependent upon database search engines to identify peptides from tandem mass spectra. Many of these algorithms score potential identifications by evaluating the number of fragment ions matched between each peptide sequence and an observed spectrum. These systems, however, generally do not distinguish between matching an intense peak and matching a minor peak. We have developed a statistical model to score peptide matches that is based upon the multivariate hypergeometric distribution. This scorer, part of the "MyriMatch" database search engine, places greater emphasis on matching intense peaks. The probability that the best match for each spectrum has occurred by random chance can be employed to separate correct matches from random ones. We evaluated this software on data sets from three different laboratories employing three different ion trap instruments. Employing a novel system for testing discrimination, we demonstrate that stratifying peaks into multiple intensity classes improves the discrimination of scoring. We compare MyriMatch results to those of Sequest and X!Tandem, revealing that it is capable of higher discrimination than either of these algorithms. When minimal peak filtering is employed, performance plummets for a scoring model that does not stratify matched peaks by intensity. On the other hand, we find that MyriMatch discrimination improves as more peaks are retained in each spectrum. MyriMatch also scales well to tandem mass spectra from high-resolution mass analyzers. These findings may indicate limitations for existing database search scorers that count matched peaks without differentiating them by intensity. This software and source code are available under Mozilla Public License at this URL: http://www.mc.vanderbilt.edu/msrc/bioinformatics/.
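To make the scoring idea concrete, the sketch below evaluates the multivariate hypergeometric probability mass function for a set of intensity classes. It shows only the textbook distribution, not MyriMatch's implementation: how the engine defines its classes, treats unmatched fragments, or converts the probability to a score is not stated in the abstract, and the names `class_sizes` and `matches` are hypothetical.

```python
from math import comb, log
from typing import Sequence


def mvh_probability(class_sizes: Sequence[int], matches: Sequence[int]) -> float:
    """Multivariate hypergeometric probability of a given match pattern.

    class_sizes[i]: number of items (e.g. peaks) in intensity class i.
    matches[i]:     number of those items matched by chance draws.
    P = prod_i C(K_i, k_i) / C(sum K_i, sum k_i)
    """
    if len(class_sizes) != len(matches):
        raise ValueError("class_sizes and matches must have equal length")
    numerator = 1
    for size, k in zip(class_sizes, matches):
        numerator *= comb(size, k)  # comb returns 0 when k > size
    return numerator / comb(sum(class_sizes), sum(matches))


def mvh_score(class_sizes: Sequence[int], matches: Sequence[int]) -> float:
    """Negative log-probability, so rarer match patterns score higher."""
    return -log(mvh_probability(class_sizes, matches))
```

With classes of size `[2, 3, 5]` (most intense first), matching both intense peaks plus one medium peak, `[2, 1, 0]`, has chance probability 0.025, whereas matching one medium and two weak peaks, `[0, 1, 2]`, has probability 0.25. The same number of matches therefore scores higher when the matched peaks are intense, which is the behaviour the abstract describes.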

12.
Meta-DP: domain prediction meta-server   Total citations: 1 (self-citations: 0; citations by others: 1)
SUMMARY: Meta-DP, a domain prediction meta-server, provides a simple interface to predict domains in a given protein sequence using a number of domain prediction methods. The Meta-DP is a convenient resource because through accessing a single site, users automatically obtain the results of the various domain prediction methods along with a consensus prediction. The Meta-DP is currently coupled to 10 domain prediction servers and can be extended to include any number of methods. Meta-DP can thus become a centralized repository of available methods. Meta-DP was also used to evaluate the performance of 13 domain prediction methods in the context of CAFASP-DP. AVAILABILITY: The Meta-DP server is freely available at http://meta-dp.bioinformatics.buffalo.edu and the CAFASP-DP evaluation results are available at http://cafasp4.bioinformatics.buffalo.edu/dp/update.html CONTACT: hkaur@bioinformatics.buffalo.edu SUPPLEMENTARY INFORMATION: Available at http://cafasp4.bioinformatics.buffalo.edu/dp/update.html.

13.
Introduction: Structural characterization of low molecular weight heparin (LMWH) is critical to meet biosimilarity standards. In this context, the review focuses on structural analysis of labile sulfates attached to the side-groups of LMWH using mass spectrometry. A comprehensive review of this topic will help readers to identify key strategies for tackling the problem related to sulfate loss. At the same time, various mass spectrometry techniques are presented to facilitate compositional analysis of LMWH, mainly enoxaparin.

Areas covered: This review summarizes findings on mass spectrometry application for LMWH, including modulation of sulfates, using enzymology and sample preparation approaches. Furthermore, popular open-source software packages for automated spectral data interpretation are also discussed. Successful use of LC/MS can decipher structural composition for LMWH and help evaluate their sameness or biosimilarity with the innovator molecule. Overall, the literature has been searched using PubMed by typing various search queries such as ‘enoxaparin’, ‘mass spectrometry’, ‘low molecular weight heparin’, ‘structural characterization’, etc.

Expert commentary: This section highlights clinically relevant areas that need improvement to achieve satisfactory commercialization of LMWHs. It also primarily emphasizes the advancements in instrumentation related to mass spectrometry, and discusses building automated software for data interpretation and analysis.


14.
Introduction: Integral membrane proteins and lipids constitute the bilayer membranes that surround cells and sub-cellular compartments, and modulate movements of molecules and information between them. Since membrane protein drug targets represent a disproportionately large segment of the proteome, technical developments need timely review.

Areas covered: Publicly available resources such as PubMed were surveyed. Bottom-up proteomics analyses now allow efficient extraction and digestion such that membrane protein coverage is essentially complete, making up around one third of the proteome. However, this coverage relies upon hydrophilic loop regions while transmembrane domains are generally poorly covered in peptide-based strategies. Top-down mass spectrometry where the intact membrane protein is fragmented in the gas phase gives good coverage in transmembrane regions, and membrane fractions are yielding to high-throughput top-down proteomics. Exciting progress in native mass spectrometry of membrane protein complexes is providing insights into subunit stoichiometry and lipid binding, and cross-linking strategies are contributing critical in-vivo information.

Expert commentary: It is clear from the literature that integral membrane proteins have yielded to advanced techniques in protein chemistry and mass spectrometry, with applications limited only by the imagination of investigators. Key advances toward translation to the clinic are emphasized.


15.
Clustering millions of tandem mass spectra   Total citations: 1 (self-citations: 0; citations by others: 1)
Tandem mass spectrometry (MS/MS) experiments often generate redundant data sets containing multiple spectra of the same peptides. Clustering of MS/MS spectra takes advantage of this redundancy by identifying multiple spectra of the same peptide and replacing them with a single representative spectrum. Analyzing only representative spectra results in significant speed-up of MS/MS database searches. We present an efficient clustering approach for analyzing large MS/MS data sets (over 10 million spectra) with a capability to reduce the number of spectra submitted to further analysis by an order of magnitude. The MS/MS database search of clustered spectra results in fewer spurious hits to the database and increases the number of peptide identifications as compared to regular nonclustered searches. Our open source software MS-Clustering is available for download at http://peptide.ucsd.edu or can be run online at http://proteomics.bioprojects.org/MassSpec.
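The general idea of replacing redundant spectra with a representative can be sketched as a simple greedy clustering. This is not the MS-Clustering algorithm, whose similarity measure and data structures are not described in the abstract; the cosine similarity on binned spectra, the `threshold` parameter, and the summed-intensity representative are all assumptions chosen for brevity. A practical tool would also compare precursor masses before comparing spectra.

```python
from math import sqrt
from typing import Dict, List

Spectrum = Dict[int, float]  # binned m/z -> intensity (assumed representation)


def cosine(a: Spectrum, b: Spectrum) -> float:
    """Cosine similarity between two sparse binned spectra."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_cluster(spectra: List[Spectrum], threshold: float) -> List[dict]:
    """Assign each spectrum to the first cluster whose representative is
    similar enough; otherwise start a new cluster.

    Each cluster is {"rep": summed spectrum, "members": [input indices]}.
    """
    clusters: List[dict] = []
    for i, spec in enumerate(spectra):
        for cluster in clusters:
            if cosine(spec, cluster["rep"]) >= threshold:
                cluster["members"].append(i)
                for k, v in spec.items():  # fold spectrum into representative
                    cluster["rep"][k] = cluster["rep"].get(k, 0.0) + v
                break
        else:
            clusters.append({"rep": dict(spec), "members": [i]})
    return clusters
```

Only the `rep` spectrum of each cluster would then be submitted to the database search, which is where the speed-up described in the abstract comes from.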

16.

Metabolomics Ion-based Data Extraction Algorithm (MET-IDEA) is a computer program for processing large-scale metabolomics data. MET-IDEA utilizes network Common Data Form (netCDF) data files available from a diversity of chromatographically coupled mass spectrometry (MS) systems, utilizes the sensitivity and selectivity associated with selected ion quantification, and greatly reduces the time and effort necessary to obtain large-scale organized data. This article reports on recent improvements to MET-IDEA which include new visualization of peak integrations, display of mass spectra associated with integrated peaks, and optional manual peak integration. The computational performance of MET-IDEA has also been improved to avoid memory overflow during the processing of large data sets and the software made compatible with 64 bit CPUs and operating systems. These new functions improve the performance of MET-IDEA, and they allow users to visualize peak integrations and curate the results through manual integration if desired. The improved version of MET-IDEA better facilitates the quantitative analysis of complex MS-based metabolomics data. MET-IDEA is freely available for academic and noncommercial use at (http://bioinfo.noble.org/gateway/index.php?option=com_wrapper&Itemid=57). Commercial use is available via licensing agreement.


17.
In the Arabidopsis thaliana regulatory element analyzer (AtREA) server, we have integrated sequence data, genome-wide expression data and functional annotation data in three application modules which will be useful to identify major regulatory targets of a user-provided cis-regulatory element (CRE), study different features of CRE distribution and evaluate the role of a set of CREs in the regulation of gene expression, independently as well as in combination with other user-provided CREs. AVAILABILITY: AtREA is freely available at http://www.bioinformatics.org/grn/atrea.html.

18.
RATIONALE: Modern molecular biology is generating data of unprecedented quantity and quality. Particularly exciting for biochemical pathway modeling and proteomics are comprehensive, time-dense profiles of metabolites or proteins that are measurable, for instance, with mass spectrometry, nuclear magnetic resonance or protein kinase phosphorylation. These profiles contain a wealth of information about the structure and dynamics of the pathway or network from which the data were obtained. The retrieval of this information requires a combination of computational methods and mathematical models, which are typically represented as systems of ordinary differential equations. RESULTS: We show that, for the purpose of structure identification, the substitution of differentials with estimated slopes in non-linear network models reduces the coupled system of differential equations to several sets of decoupled algebraic equations, which can be processed efficiently in parallel or sequentially. The estimation of slopes for each time series of the metabolic or proteomic profile is accomplished with a 'universal function' that is computed directly from the data by cross-validated training of an artificial neural network (ANN). CONCLUSIONS: Without preprocessing, the inverse problem of determining structure from metabolic or proteomic profile data is challenging and computationally expensive. The combination of system decoupling and data fitting with universal functions simplifies this inverse problem very significantly. Examples show successful estimations and current limitations of the method. AVAILABILITY: A preliminary Web-based application for ANN smoothing is accessible at http://bioinformatics.musc.edu/webmetabol/. S-systems can be interactively analyzed with the user-friendly freeware PLAS (http://correio.cc.fc.ul.pt/~aenf/plas.html) or with the MATLAB module BSTLab (http://bioinformatics.musc.edu/bstlab/), which is currently being beta-tested.
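The decoupling step can be written out for one common model class. The abstract mentions S-systems only in its availability note, so the S-system form below is an assumption used for illustration, as are the symbols: $X_i$ for the measured quantities, $S_i(t_k)$ for the slope estimated at time point $t_k$, and $\alpha_i, \beta_i, g_{ij}, h_{ij}$ for the unknown parameters.

```latex
% S-system model for n variables (assumed form, coupled through the X_j):
\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{n} X_j^{g_{ij}}
                - \beta_i \prod_{j=1}^{n} X_j^{h_{ij}},
\qquad i = 1, \dots, n

% Substituting the estimated slope S_i(t_k) for the derivative at each of
% the K measured time points gives, for every i separately:
S_i(t_k) \approx \alpha_i \prod_{j=1}^{n} X_j(t_k)^{g_{ij}}
               - \beta_i \prod_{j=1}^{n} X_j(t_k)^{h_{ij}},
\qquad k = 1, \dots, K
```

Because the $X_j(t_k)$ are now measured (smoothed) values rather than unknown trajectories, each index $i$ yields its own set of $K$ algebraic equations in only its own parameters, so the $n$ sets can be fitted independently, in parallel or one after another, without numerically integrating the differential equations.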

19.
MOTIVATION: There has been much interest in using patterns derived from surface-enhanced laser desorption and ionization (SELDI) protein mass spectra from serum to differentiate samples from patients both with and without disease. Such patterns have been used without identification of the underlying proteins responsible. However, there are questions as to the stability of this procedure over multiple experiments. RESULTS: We compared SELDI proteomic spectra from serum from three experiments by the same group on separating ovarian cancer from normal tissue. These spectra are available on the web at http://clinicalproteomics.steem.com. In general, the results were not reproducible across experiments. Baseline correction prevents reproduction of the results for two of the experiments. In one experiment, there is evidence of a major shift in protocol mid-experiment which could bias the results. In another, structure in the noise regions of the spectra allows us to distinguish normal from cancer, suggesting that the normals and cancers were processed differently. Sets of features found to discriminate well in one experiment do not generalize to other experiments. Finally, the mass calibration in all three experiments appears suspect. Taken together, these and other concerns suggest that much of the structure uncovered in these experiments could be due to artifacts of sample processing, not to the underlying biology of cancer. We provide some guidelines for design and analysis in experiments like these to ensure more reproducible, biologically meaningful results. AVAILABILITY: The MATLAB and Perl code used in our analyses is available at http://bioinformatics.mdanderson.org

20.
Motivation: As the use of microarrays in human studies continues to increase, stringent quality assurance is necessary to ensure accurate experimental interpretation. We present a formal approach for microarray quality assessment that is based on dimension reduction of established measures of signal and noise components of expression followed by parametric multivariate outlier testing. Results: We applied our approach to several data resources. First, as a negative control, we found that the Affymetrix and Illumina contributions to MAQC data were free from outliers at a nominal outlier flagging rate of α=0.01. Second, we created a tunable framework for artificially corrupting intensity data from the Affymetrix Latin Square spike-in experiment to allow investigation of sensitivity and specificity of quality assurance (QA) criteria. Third, we applied the procedure to 507 Affymetrix microarray GeneChips processed with RNA from human peripheral blood samples. We show that exclusion of arrays by this approach substantially increases inferential power, or the ability to detect differential expression, in large clinical studies. Availability: http://bioconductor.org/packages/2.3/bioc/html/arrayMvout.html and http://bioconductor.org/packages/2.3/bioc/html/affyContam.html affyContam (credentials: readonly/readonly) Contact: aasare{at}immunetolerance.org; stvjc{at}channing.harvard.edu The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors. Associate Editor: Trey Ideker
