
1.

Accurate Assignment of Significance to Neuropeptide Identifications Using Monte Carlo K-Permuted Decoy Databases

Malik N. Akhtar Bruce R. Southey Per E. Andrén Jonathan V. Sweedler Sandra L. Rodriguez-Zas 《PloS one》2014,9(10)

In support of accurate neuropeptide identification in mass spectrometry experiments, novel Monte Carlo permutation testing was used to compute significance values. Testing was based on k-permuted decoy databases, where k denotes the number of permutations. These databases were integrated with a range of peptide identification indicators from three popular open-source database search programs (OMSSA, Crux, and X! Tandem) to assess the statistical significance of neuropeptide spectra matches. Significance p-values were computed as the fraction of the sequences in the database with a match indicator value better than or equal to that of the true target spectrum. When applied to a test-bed of all known manually annotated mouse neuropeptides, permutation tests with k-permuted decoy databases identified up to 100% of the neuropeptides at p-value < 10⁻⁵. The permutation test p-values using hyperscore (X! Tandem), E-value (OMSSA) and Sp score (Crux) match indicators outperformed all other match indicators. The robust performance of the intuitive indicator “number of matched ions between the experimental and theoretical spectra” in detecting peptides highlights the importance of considering this indicator when the p-value is borderline significant. Our findings suggest permutation decoy databases of size 1×10⁵ are adequate to accurately detect neuropeptides and this can be exploited to increase the speed of the search. The straightforward Monte Carlo permutation testing (comparable to a zero order Markov model) can be easily combined with existing peptide identification software to enable accurate and effective neuropeptide detection. The source code is available at http://stagbeetle.animal.uiuc.edu/pepshop/MSMSpermutationtesting.
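The p-value definition in this abstract (the fraction of permuted decoy sequences whose match indicator is at least as good as the target's) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the fixed random seed, and the convention that a higher score is better are assumptions made for illustration, and the `score` callable stands in for whatever indicator a real search engine would report.

```python
import random


def permute_sequence(seq, rng):
    """Return a random permutation of the residues in seq (zero-order shuffle)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def permutation_p_value(target, score, k, seed=0):
    """Fraction of k permuted decoys scoring at least as well as the target.

    `score` is a hypothetical callable mapping a peptide string to a number,
    where larger means a better match (an assumption; an E-value would need
    the comparison reversed).
    """
    rng = random.Random(seed)
    observed = score(target)
    hits = 0
    for _ in range(k):
        decoy = permute_sequence(target, rng)
        if score(decoy) >= observed:
            hits += 1
    return hits / k


if __name__ == "__main__":
    peptide = "YGGFLRRIRPK"  # illustrative sequence only
    # Toy indicator: number of positions agreeing with the target peptide.
    toy_score = lambda pep: sum(a == b for a, b in zip(pep, peptide))
    print(permutation_p_value(peptide, toy_score, k=1000))
```

With k decoys the smallest nonzero p-value this estimator can return is 1/k, so resolving p < 10⁻⁵ needs on the order of 10⁵ permutations, which is consistent with the database size the abstract reports as adequate.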

2.

Meta-analysis of untargeted metabolomic data from multiple profiling experiments

Patti GJ Tautenhahn R Siuzdak G 《Nature protocols》2012,7(3):508-516

metaXCMS is a software program for the analysis of liquid chromatography/mass spectrometry-based untargeted metabolomic data. It is designed to identify the differences between metabolic profiles across multiple sample groups (e.g., 'healthy' versus 'active disease' versus 'inactive disease'). Although performing pairwise comparisons alone can provide physiologically relevant data, these experiments often result in hundreds of differences, and comparison with additional biologically meaningful sample groups can allow for substantial data reduction. By performing second-order (meta-) analysis, metaXCMS facilitates the prioritization of interesting metabolite features from large untargeted metabolomic data sets before the rate-limiting step of structural identification. Here we provide a detailed step-by-step protocol for going from raw mass spectrometry data to metaXCMS results, visualized as Venn diagrams and exported Microsoft Excel spreadsheets. There is no upper limit to the number of sample groups or individual samples that can be compared with the software, and data from most commercial mass spectrometers are supported. The speed of the analysis depends on computational resources and data volume, but will generally be less than 1 d for most users. metaXCMS is freely available at http://metlin.scripps.edu/metaxcms/.

3.

TBI Server: A Web Server for Predicting Ion Effects in RNA Folding

Yuhong Zhu Zhaojian He Shi-Jie Chen 《PloS one》2015,10(3)

Background

Metal ions play a critical role in the stabilization of RNA structures. Therefore, accurate prediction of the ion effects in RNA folding can have a far-reaching impact on our understanding of RNA structure and function. Multivalent ions, especially Mg²⁺, are essential for RNA tertiary structure formation. These ions can possibly become strongly correlated in the close vicinity of RNA surface. Most of the currently available software packages, which have widespread success in predicting ion effects in biomolecular systems, however, do not explicitly account for the ion correlation effect. Therefore, it is important to develop a software package/web server for the prediction of ion electrostatics in RNA folding by including ion correlation effects.

Results

The TBI web server http://rna.physics.missouri.edu/tbi_index.html provides predictions for the total electrostatic free energy, the different free energy components, and the mean number and the most probable distributions of the bound ions. A novel feature of the TBI server is its ability to account for ion correlation and ion distribution fluctuation effects.

Conclusions

By accounting for the ion correlation and fluctuation effects, the TBI server is a unique online tool for computing ion-mediated electrostatic properties for given RNA structures. The results can provide important data for in-depth analysis for ion effects in RNA folding including the ion-dependence of folding stability, ion uptake in the folding process, and the interplay between the different energetic components.

4.

Kraken: ultrafast metagenomic sequence classification using exact alignments

Derrick E Wood Steven L Salzberg 《Genome biology》2014,15(3):R46

Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster abundance estimation programs, which only classify small subsets of metagenomic data. Using exact alignment of k-mers, Kraken achieves classification accuracy comparable to the fastest BLAST program. In its fastest mode, Kraken classifies 100 base pair reads at a rate of over 4.1 million reads per minute, 909 times faster than Megablast and 11 times faster than the abundance estimation program MetaPhlAn. Kraken is available at http://ccb.jhu.edu/software/kraken/.
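The idea of classifying reads by exact k-mer matches can be sketched in a few lines. This is a deliberately simplified illustration and not Kraken's algorithm or API: it uses a plain majority vote over k-mer hits, and the function names, toy reference sequences, and labels are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping substrings of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references, k):
    """Map each k-mer to the set of labels whose reference contains it."""
    index = {}
    for label, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(label)
    return index


def classify(read, index, k):
    """Return the label with the most exact k-mer hits, or None if no k-mer hits."""
    votes = Counter()
    for kmer in kmers(read, k):
        for label in index.get(kmer, ()):
            votes[label] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy references; real databases hold whole genomes and much longer k-mers.
    refs = {"taxA": "ACGTACGTGGA", "taxB": "TTTTGGGGCCCC"}
    idx = build_index(refs, k=4)
    print(classify("ACGTACG", idx, k=4))
```

Because every lookup is an exact dictionary hit rather than an alignment, the cost per read grows only with the number of k-mers in the read, which gives some intuition for why this family of methods can be fast.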

5.

MyriMatch: highly accurate tandem mass spectral peptide identification by multivariate hypergeometric analysis

Tabb DL Fernando CG Chambers MC 《Journal of proteome research》2007,6(2):654-661

Shotgun proteomics experiments are dependent upon database search engines to identify peptides from tandem mass spectra. Many of these algorithms score potential identifications by evaluating the number of fragment ions matched between each peptide sequence and an observed spectrum. These systems, however, generally do not distinguish between matching an intense peak and matching a minor peak. We have developed a statistical model to score peptide matches that is based upon the multivariate hypergeometric distribution. This scorer, part of the "MyriMatch" database search engine, places greater emphasis on matching intense peaks. The probability that the best match for each spectrum has occurred by random chance can be employed to separate correct matches from random ones. We evaluated this software on data sets from three different laboratories employing three different ion trap instruments. Employing a novel system for testing discrimination, we demonstrate that stratifying peaks into multiple intensity classes improves the discrimination of scoring. We compare MyriMatch results to those of Sequest and X!Tandem, revealing that it is capable of higher discrimination than either of these algorithms. When minimal peak filtering is employed, performance plummets for a scoring model that does not stratify matched peaks by intensity. On the other hand, we find that MyriMatch discrimination improves as more peaks are retained in each spectrum. MyriMatch also scales well to tandem mass spectra from high-resolution mass analyzers. These findings may indicate limitations for existing database search scorers that count matched peaks without differentiating them by intensity. The software and source code are available under the Mozilla Public License at this URL: http://www.mc.vanderbilt.edu/msrc/bioinformatics/.
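The multivariate hypergeometric distribution named in this abstract gives the chance of drawing a particular number of items from each of several classes when sampling without replacement. The sketch below computes only that probability mass function. How MyriMatch defines its intensity classes and turns the probability into a score is not stated here, so the class sizes and matched counts in the example are hypothetical.

```python
from math import comb


def mvh_probability(class_sizes, matched_counts):
    """Multivariate hypergeometric probability mass.

    class_sizes[i]    -- number of items available in class i
    matched_counts[i] -- number of items drawn (matched) from class i
    Returns the probability of exactly this outcome when
    sum(matched_counts) items are drawn without replacement.
    """
    if len(class_sizes) != len(matched_counts):
        raise ValueError("class_sizes and matched_counts must have equal length")
    numerator = 1
    for size, matched in zip(class_sizes, matched_counts):
        numerator *= comb(size, matched)
    return numerator / comb(sum(class_sizes), sum(matched_counts))


if __name__ == "__main__":
    # Hypothetical spectrum: 10 intense, 20 medium, 70 minor peak positions;
    # a candidate peptide matches 4 intense, 2 medium, and 1 minor peak.
    print(mvh_probability([10, 20, 70], [4, 2, 1]))
```

A low probability means the observed pattern of matches, weighted toward the small intense class, would rarely arise by chance, which is the intuition behind scoring intense-peak matches more heavily.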

6.

BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions

Kasper D Hansen Benjamin Langmead Rafael A Irizarry 《Genome biology》2012,13(10):R83

DNA methylation is an important epigenetic modification involved in gene regulation, which can now be measured using whole-genome bisulfite sequencing. However, cost, complexity of the data, and lack of comprehensive analytical tools are major challenges that keep this technology from becoming widely applied. Here we present BSmooth, an alignment, quality control and analysis pipeline that provides accurate and precise results even with low coverage data, appropriately handling biological replicates. BSmooth is open source software, and can be downloaded from http://rafalab.jhsph.edu/bsmooth.

7.

A Daily-Updated Database and Tools for Comprehensive SARS-CoV-2 Mutation-Annotated Trees

Jakob McBroome Bryan Thornlow Angie S Hinrichs Alexander Kramer Nicola De Maio Nick Goldman David Haussler Russell Corbett-Detig Yatish Turakhia 《Molecular biology and evolution》2021,38(12):5819

The vast scale of SARS-CoV-2 sequencing data has made it increasingly challenging to comprehensively analyze all available data using existing tools and file formats. To address this, we present a database of SARS-CoV-2 phylogenetic trees inferred with unrestricted public sequences, which we update daily to incorporate new sequences. Our database uses the recently proposed mutation-annotated tree (MAT) format to efficiently encode the tree with branches labeled with parsimony-inferred mutations, as well as Nextstrain clade and Pango lineage labels at clade roots. As of June 9, 2021, our SARS-CoV-2 MAT consists of 834,521 sequences and provides a comprehensive view of the virus’ evolutionary history using public data. We also present matUtils, a command-line utility for rapidly querying, interpreting, and manipulating the MATs. Our daily-updated SARS-CoV-2 MAT database and matUtils software are available at http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/UShER_SARS-CoV-2/ and https://github.com/yatisht/usher, respectively.

8.

TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions

Daehwan Kim Geo Pertea Cole Trapnell Harold Pimentel Ryan Kelley Steven L Salzberg 《Genome biology》2013,14(4):R36


9.

CGAL: computing genome assembly likelihoods

Atif Rahman Lior Pachter 《Genome biology》2013,14(1):R8

Assembly algorithms have been extensively benchmarked using simulated data so that results can be compared to ground truth. However, in de novo assembly, only crude metrics such as contig number and size are typically used to evaluate assembly quality. We present CGAL, a novel likelihood-based approach to assembly assessment in the absence of a ground truth. We show that likelihood is more accurate than other metrics currently used for evaluating assemblies, and describe its application to the optimization and comparison of assembly algorithms. Our methods are implemented in software that is freely available at http://bio.math.berkeley.edu/cgal/.

10.

DBGC: A Database of Human Gastric Cancer

Chao Wang Jun Zhang Mingdeng Cai Zhenggang Zhu Wenjie Gu Yingyan Yu Xiaoyan Zhang 《PloS one》2015,10(11)


11.

MUSiCC: a marker genes based framework for metagenomic normalization and accurate profiling of gene abundances in the microbiome

Ohad Manor Elhanan Borenstein 《Genome biology》2015,16(1)

Functional metagenomic analyses commonly involve a normalization step, where measured levels of genes or pathways are converted into relative abundances. Here, we demonstrate that this normalization scheme introduces marked biases both across and within human microbiome samples, and identify sample- and gene-specific properties that contribute to these biases. We introduce an alternative normalization paradigm, MUSiCC, which combines universal single-copy genes with machine learning methods to correct these biases and to obtain an accurate and biologically meaningful measure of gene abundances. Finally, we demonstrate that MUSiCC significantly improves downstream discovery of functional shifts in the microbiome. MUSiCC is available at http://elbo.gs.washington.edu/software.html.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0610-8) contains supplementary material, which is available to authorized users.
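The contrast this abstract draws between relative-abundance normalization and normalization anchored on universal single-copy genes can be sketched as follows. This covers only the scaling idea and is not the MUSiCC implementation: using the median of the marker genes as the scaling factor is an assumption of this sketch, the machine-learning correction step is omitted, and the gene names are hypothetical.

```python
from statistics import median


def relative_abundance(abundances):
    """Conventional normalization: divide each gene by the sample total."""
    total = sum(abundances.values())
    if total == 0:
        raise ValueError("sample has zero total abundance")
    return {gene: value / total for gene, value in abundances.items()}


def normalize_by_marker_genes(abundances, marker_genes):
    """Scale a sample so a typical single-copy marker gene has abundance 1.

    The scaling factor is the median abundance of the marker genes present
    in the sample (an assumption of this sketch).
    """
    markers = [abundances[g] for g in marker_genes if g in abundances]
    if not markers:
        raise ValueError("no marker genes found in sample")
    scale = median(markers)
    if scale == 0:
        raise ValueError("median marker abundance is zero")
    return {gene: value / scale for gene, value in abundances.items()}


if __name__ == "__main__":
    sample = {"uscg1": 10, "uscg2": 20, "uscg3": 30, "geneX": 40}
    print(normalize_by_marker_genes(sample, ["uscg1", "uscg2", "uscg3"]))
```

Under this scaling a value of 2.0 for a gene reads as roughly two copies per genome on average in the sample, whereas a relative abundance depends on everything else present in the sample.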

12.

SCNProDB: A database for the identification of soybean cyst nematode proteins

Savithiry Natarajan Mona Tavakolan Nadim W Alkharouf Benjamin F Matthews 《Bioinformation》2014,10(6):387-389

Soybean cyst nematode (Heterodera glycines, SCN) is the most destructive pathogen of soybean around the world. Crop rotation and resistant cultivars are used to mitigate the damage of SCN, but these approaches are not completely successful because of the varied SCN populations. Thus, the limitations of these practices with soybean dictate investigation of other avenues of protection of soybean against SCN, perhaps through genetic engineering of broad resistance to SCN. For better understanding of the consequences of genetic manipulation, elucidation of SCN protein composition at the subunit level is necessary. We have conducted studies in our laboratory to determine the composition of SCN proteins using a proteomics approach, with two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) to separate SCN proteins and mass spectrometry to characterize the proteins further. Our analysis resulted in the identification of several hundred proteins. In this investigation, we developed a web-based database (SCNProDB) containing protein information obtained from our previously published studies. This database will be useful to scientists who wish to develop SCN-resistant soybean varieties through genetic manipulation and breeding efforts. The database is freely accessible from: http://bioinformatics.towson.edu/Soybean_SCN_proteins_2D_Gel_DB/Gel1.aspx

13.

SFESA: a web server for pairwise alignment refinement by secondary structure shifts

Jing Tong Jimin Pei Nick V. Grishin 《BMC bioinformatics》2015,16(1)

Background

Protein sequence alignment is essential for a variety of tasks such as homology modeling and active site prediction. Alignment errors remain the main cause of low-quality structure models. A bioinformatics tool to refine alignments is needed to make protein alignments more accurate.

Results

We developed the SFESA web server to refine pairwise protein sequence alignments. Compared to the previous version of SFESA, which required a set of 3D coordinates for a protein, the new server will search a sequence database for the closest homolog with an available 3D structure to be used as a template. For each alignment block defined by secondary structure elements in the template, SFESA evaluates alignment variants generated by local shifts and selects the best-scoring alignment variant. A scoring function that combines the sequence score of profile-profile comparison and the structure score of template-derived contact energy is used for evaluation of alignments. PROMALS pairwise alignments refined by SFESA are more accurate than those produced by current advanced alignment methods such as HHpred and CNFpred. In addition, SFESA also improves alignments generated by other software.

Conclusions

SFESA is a web-based tool for alignment refinement, designed for researchers to compute, refine, and evaluate pairwise alignments with a combined sequence and structure scoring of alignment blocks. To our knowledge, the SFESA web server is the only tool that refines alignments by evaluating local shifts of secondary structure elements. The SFESA web server is available at http://prodata.swmed.edu/sfesa.

14.

Metabolomic and transcriptomic analysis of the rice response to the bacterial blight pathogen Xanthomonas oryzae pv. oryzae

Theodore R. Sana Steve Fischer Gert Wohlgemuth Anjali Katrekar Ki-hong Jung Pam C. Ronald Oliver Fiehn 《Metabolomics : Official journal of the Metabolomic Society》2010,6(3):451-465


15.

Minimalistic Predictor of Protein Binding Energy: Contribution of Solvation Factor to Protein Binding

Jeong-Mo Choi Adrian W.R. Serohijos Sean Murphy Dennis Lucarelli Leo L. Lofranco Andrew Feldman Eugene I. Shakhnovich 《Biophysical journal》2015,108(4):795-798

It has long been known that solvation plays an important role in protein-protein interactions. Here, we use a minimalistic solvation-based model for predicting protein binding energy to estimate quantitatively the contribution of the solvation factor in protein binding. The factor is described by a simple linear combination of buried surface areas according to amino-acid types. Even without structural optimization, our minimalistic model demonstrates a predictive power comparable to more complex methods, making the proposed approach the basis for high throughput applications. Application of the model to a proteomic database shows that receptor-substrate complexes involved in signaling have lower affinities than enzyme-inhibitor and antibody-antigen complexes, and they differ by chemical compositions on interfaces. Also, we found that protein complexes with components that come from the same genes generally have lower affinities than complexes formed by proteins from different genes, but in this case the difference originates from different interface areas. The model was implemented in Python, and the source code can be found on the Shakhnovich group webpage: http://faculty.chemistry.harvard.edu/shakhnovich/software.

16.

GMEnzy: A Genetically Modified Enzybiotic Database

Hongyu Wu Jinjiang Huang Hairong Lu Guodong Li Qingshan Huang 《PloS one》2014,9(8)

GMEs are genetically modified enzybiotics created through molecular engineering approaches to deal with the increasing problem of antibiotic resistance prevalence. We present a fully manually curated database, GMEnzy, which focuses on GMEs and their design strategies, production and purification methods, and biological activity data. GMEnzy collects and integrates all available GMEs and their related information into one web-based database. Currently GMEnzy holds 186 GMEs from published literature. The GMEnzy interface is easy to use, and allows users to rapidly retrieve data according to desired search criteria. GMEnzy’s construction will increase the efficiency and convenience of improving these bioactive proteins for specific requirements, and will expand the arsenal available for researchers to control drug-resistant pathogens. This database will prove valuable for researchers interested in genetically modified enzybiotics studies. GMEnzy is freely available on the Web at http://biotechlab.fudan.edu.cn/database/gmenzy/.

17.

Image Alignment for Tomography Reconstruction from Synchrotron X-Ray Microscopic Images

Chang-Chieh Cheng Chia-Chi Chien Hsiang-Hsin Chen Yeukuang Hwu Yu-Tai Ching 《PloS one》2014,9(1)

A synchrotron X-ray microscope is a powerful imaging apparatus for taking high-resolution and high-contrast X-ray images of nanoscale objects. A sufficient number of X-ray projection images from different angles is required for constructing 3D volume images of an object. Because a synchrotron light source is immobile, a rotational object holder is required for tomography. At a resolution of 10 nm per pixel, the vibration of the holder caused by rotating the object cannot be disregarded if tomographic images are to be reconstructed accurately. This paper presents a computer method to compensate for the vibration of the rotational holder by aligning neighboring X-ray images. This alignment process involves two steps. The first step is to match the “projected feature points” in the sequence of images. The matched projected feature points in the - plane should form a set of sine-shaped loci. The second step is to fit the loci to a set of sine waves to compute the parameters required for alignment. The experimental results show that the proposed method outperforms two previously proposed methods, Xradia and SPIDER. The developed software system can be downloaded from the URL, http://www.cs.nctu.edu.tw/~chengchc/SCTA or http://goo.gl/s4AMx.
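The second step described in this abstract, fitting a locus of matched feature points to a sine wave, can be posed as a linear least-squares problem. The sketch below is one possible formulation and not the authors' method: it assumes the locus is a horizontal position x recorded at known rotation angles theta (the axis names are missing from the text above, so these symbols are assumptions) and models it as x(theta) = a·cos(theta) + b·sin(theta) + c, which is linear in a, b, and c.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_sine(angles, positions):
    """Least-squares fit of x(theta) = a*cos(theta) + b*sin(theta) + c.

    angles are in radians; returns the tuple (a, b, c).
    """
    rows = [(math.cos(t), math.sin(t), 1.0) for t in angles]
    # Normal equations (A^T A) p = A^T x, solved with Cramer's rule.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atx = [sum(r[i] * x for r, x in zip(rows, positions)) for i in range(3)]
    d = det3(ata)
    if abs(d) < 1e-12:
        raise ValueError("angles do not constrain the fit")
    params = []
    for col in range(3):
        m = [row[:] for row in ata]
        for i in range(3):
            m[i][col] = atx[i]
        params.append(det3(m) / d)
    return tuple(params)


def residuals(angles, positions, params):
    """Observed minus fitted position for each projection image."""
    a, b, c = params
    return [x - (a * math.cos(t) + b * math.sin(t) + c)
            for t, x in zip(angles, positions)]
```

Under these assumptions, the residual for each image is a candidate estimate of that image's horizontal offset from the ideal sine locus, and averaging residuals over several loci would be one way to reduce noise.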

18.

Evaluation of de novo transcriptome assemblies from RNA-Seq data

Bo Li Nathanael Fillmore Yongsheng Bai Mike Collins James A Thomson Ron Stewart Colin N Dewey 《Genome biology》2014,15(12)


19.

THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data

Layla Oesper Ahmad Mahmoody Benjamin J Raphael 《Genome biology》2013,14(7):R80

Tumor samples are typically heterogeneous, containing admixture by normal, non-cancerous cells and one or more subpopulations of cancerous cells. Whole-genome sequencing of a tumor sample yields reads from this mixture, but does not directly reveal the cell of origin for each read. We introduce THetA (Tumor Heterogeneity Analysis), an algorithm that infers the most likely collection of genomes and their proportions in a sample, for the case where copy number aberrations distinguish subpopulations. THetA successfully estimates normal admixture and recovers clonal and subclonal copy number aberrations in real and simulated sequencing data. THetA is available at http://compbio.cs.brown.edu/software/.

20.

De novo assembly of bacterial transcriptomes from RNA-seq data

Brian Tjaden 《Genome biology》2015,16(1)
