Similar literature
20 similar articles found
1.
Paul A. Rudnick. Proteomics 2013, 13(22): 3247-3250
Spectral library searching has many advantages over sequence database searching, yet it has not been widely adopted. One possible reason for this is that users are unsure exactly how to interpret the similarity scores (e.g., “dot products” are not probability‐based scores). Methods to create decoys have been proposed but, as developers caution, they may produce proxies that are not equivalent to reversed sequences. In this issue, Shao et al. (Proteomics 2013, 13, 3273–3283) report advances in spectral library searching where the focus is not on improving the performance of their search engine, SpectraST, but is instead on improving the statistical meaningfulness of its discriminant score and removing the need for decoys. The results in their paper indicate that by “standardizing” the input and library spectra, sensitivity is not lost but is, surprisingly, gained. Their tests also show that false discovery rate (FDR) estimates derived from their new score track better with “ground truth” than decoy searching. It is possible that their work strikes a good balance between the theory of library searching and its application. As such, they hope to have removed a major entrance barrier for some researchers previously unwilling to try library searching.

2.
Hu Y, Li Y, Lam H. Proteomics 2011, 11(24): 4702-4711
Spectral library searching is a promising alternative to sequence database searching in peptide identification from MS/MS spectra. The key advantage of spectral library searching is the utilization of more spectral features to improve score discrimination between good and bad matches, and hence sensitivity. However, the coverage of reference spectral libraries is limited by current experimental and computational methods. We developed a computational approach to expand the coverage of spectral libraries with semi-empirical spectra predicted from perturbing known spectra of similar sequences, such as those with single amino acid substitutions. We hypothesized that peptides of similar sequences should produce similar fragmentation patterns, at least in most cases. Our results confirm our hypothesis and specify when this approach can be applied. In actual spectral searching of real data sets, the sensitivity advantage of spectral library searching over sequence database searching can be mostly retained even when all real spectra are replaced by semi-empirical ones. We demonstrated the applicability of this approach by detecting several known non-synonymous single-nucleotide polymorphisms in three large human data sets by spectral searching.

3.
Spectral library searching is an emerging approach to peptide identification from tandem mass spectra, a critical step in proteomic data analysis. In spectral library searching, a spectral library is first meticulously compiled from a large collection of previously observed peptide MS/MS spectra that are conclusively assigned to their corresponding amino acid sequences. An unknown spectrum is then identified by comparing it to all the candidates in the spectral library for the most similar match. This review discusses the basic principles of spectral library building and searching, describes its advantages and limitations, and provides a primer for researchers interested in adopting this new approach in their data analysis. It also discusses the future outlook on the evolution and utility of spectral libraries in the field of proteomics.
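
The matching step described in this abstract (compare an unknown spectrum to every library candidate and keep the most similar one) can be sketched as follows. This is only an illustration under stated assumptions, not the method of any particular search engine: spectra are represented as dictionaries mapping a binned m/z value to an intensity, similarity is a normalized dot product, and the names `normalize`, `dot_product`, `best_match` and the example peptides are hypothetical.

```python
import math


def normalize(peaks):
    """Scale a {binned_mz: intensity} spectrum to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in peaks.values()))
    return {mz: v / norm for mz, v in peaks.items()} if norm else {}


def dot_product(query, reference):
    """Normalized dot product over the m/z bins shared by both spectra."""
    q, r = normalize(query), normalize(reference)
    return sum(q[mz] * r[mz] for mz in q.keys() & r.keys())


def best_match(query, library):
    """Return (name, score) of the library spectrum most similar to the query.

    Assumes a non-empty library; max() raises ValueError otherwise.
    """
    scored = ((name, dot_product(query, spec)) for name, spec in library.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical two-entry library and one query spectrum.
library = {
    "PEPTIDEK": {100: 10.0, 200: 50.0, 300: 5.0},
    "SAMPLER": {150: 20.0, 200: 5.0, 400: 40.0},
}
query = {100: 9.0, 200: 48.0, 300: 6.0}
name, score = best_match(query, library)
print(name, round(score, 3))
```

A raw dot product of this kind is a similarity, not a probability, which is the interpretation problem raised in entry 1.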

4.
A notable inefficiency of shotgun proteomics experiments is the repeated rediscovery of the same identifiable peptides by sequence database searching methods, which often are time-consuming and error-prone. A more precise and efficient method, in which previously observed and identified peptide MS/MS spectra are catalogued and condensed into searchable spectral libraries to allow new identifications by spectral matching, is seen as a promising alternative. To that end, an open-source, functionally complete, high-throughput and readily extensible MS/MS spectral searching tool, SpectraST, was developed. A high-quality spectral library was constructed by combining the high-confidence identifications of millions of spectra taken from various data repositories and searched using four sequence search engines. The resulting library consists of over 30,000 spectra for Saccharomyces cerevisiae. Using this library, SpectraST vastly outperforms the sequence search engine SEQUEST in terms of speed and the ability to discriminate good and bad hits. A unique advantage of SpectraST is its full integration into the popular Trans Proteomic Pipeline suite of software, which facilitates user adoption and provides important functionalities such as peptide and protein probability assignment, quantification, and data visualization. This method of spectral library searching is especially suited for targeted proteomics applications, offering superior performance to traditional sequence searching.

5.
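Mass spectrometry-based proteomics is developing rapidly, and protein mass spectrometry data are growing exponentially. Finding identification methods that are fast, accurate, and reproducible is an important task in the field. The spectral library search strategy directly compares experimental spectra with real spectra in a spectral library and makes full use of peak abundances, unconventional fragmentation patterns, and other spectral features; this makes searching faster and more accurate, and the strategy has become one of the mainstream identification methods in proteomics. This article introduces spectral library-based strategies for identifying proteomic mass spectrometry data, examines in depth the two key steps (spectral library construction and spectral library searching), and discusses the progress and challenges of the spectral library strategy.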

6.
Searching spectral libraries in MS/MS is an important new approach to improving the quality of peptide and protein identification. The idea relies on the observation that ion intensities in an MS/MS spectrum of a given peptide are generally reproducible across experiments, and thus, matching between spectra from an experiment and the spectra of previously identified peptides stored in a spectral library can lead to better peptide identification compared to the traditional database search. However, the use of libraries is greatly limited by their coverage of peptide sequences: even for well‐studied organisms a large fraction of peptides have not been previously identified. To address this issue, we propose to expand spectral libraries by predicting the MS/MS spectra of peptides based on the spectra of peptides with similar sequences. We first demonstrate that the intensity patterns of dominant fragment ions between similar peptides tend to be similar. In accordance with this observation, we develop a neighbor‐based approach that first selects peptides that are likely to have spectra similar to the target peptide and then combines their spectra using a weighted K‐nearest neighbor method to accurately predict fragment ion intensities corresponding to the target peptide. This approach has the potential to predict spectra for every peptide in the proteome. When rigorous quality criteria are applied, we estimate that the method increases the coverage of spectral libraries available from the National Institute of Standards and Technology by 20–60%, although the values vary with peptide length and charge state. We find that the overall best search performance is achieved when spectral libraries are supplemented by the high quality predicted spectra.
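
The weighted K‐nearest neighbor combination mentioned in this abstract can be illustrated with a generic sketch. The abstract does not give the weighting scheme or the neighbor selection rule, so the following assumes that each neighbor comes with a precomputed similarity weight and an intensity vector over the same ordered set of fragment ions, and that the prediction is the similarity-weighted average of the top K neighbors. The function name `predict_intensities` and the example values are hypothetical.

```python
def predict_intensities(neighbors, k=3):
    """Predict fragment ion intensities as a weighted average of the K most
    similar neighbors.

    neighbors: list of (similarity, intensities) pairs, where every
    intensities list has the same length and fragment ion order
    (an assumption of this sketch).
    """
    # Keep the K neighbors with the highest similarity weight.
    top = sorted(neighbors, key=lambda n: n[0], reverse=True)[:k]
    total = sum(weight for weight, _ in top)
    if not top or total == 0:
        raise ValueError("need at least one neighbor with non-zero weight")
    length = len(top[0][1])
    return [
        sum(weight * spectrum[i] for weight, spectrum in top) / total
        for i in range(length)
    ]


# Hypothetical neighbors: (similarity to the target peptide, intensities).
neighbors = [
    (0.9, [1.0, 0.0]),
    (0.1, [0.0, 1.0]),
    (0.0, [5.0, 5.0]),
]
predicted = predict_intensities(neighbors, k=2)
print(predicted)
```

With `k=2` the third neighbor is ignored, and the prediction leans toward the spectrum of the most similar neighbor.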

7.
Here we describe the updated MolProbity rotamer‐library distributions derived from an order‐of‐magnitude larger and more stringently quality‐filtered dataset of about 8000 (vs. 500) protein chains, and we explain the resulting changes and improvements to model validation as seen by users. To include only side‐chains with satisfactory justification for their given conformation, we added residue‐specific filters for electron‐density value and model‐to‐density fit. The combined new protocol retains a million residues of data, while cleaning up false‐positive noise in the multi‐datapoint distributions. It enables unambiguous characterization of conformational clusters nearly 1000‐fold less frequent than the most common ones. We describe examples of local interactions that favor these rare conformations, including the role of authentic covalent bond‐angle deviations in enabling presumably strained side‐chain conformations. Further, along with favored and outlier, an allowed category (0.3–2.0% occurrence in reference data) has been added, analogous to Ramachandran validation categories. The new rotamer distributions are used for current rotamer validation in MolProbity and PHENIX, and for rotamer choice in PHENIX model‐building and refinement. The multi‐dimensional distributions and Top8000 reference dataset are freely available on GitHub. These rotamers are termed “ultimate” because data sampling and quality are now fully adequate for this task, and also because we believe the future of conformational validation should integrate side‐chain with backbone criteria. Proteins 2016; 84:1177–1189. © 2016 Wiley Periodicals, Inc.

9.
Searching a spectral library for the identification of protein MS/MS data has proven to be a fast and accurate method, while yielding a high identification rate. We investigated the potential to increase peptide discovery rate, with little increase in computational time, by constructing a workflow based on a sequence search with Phenyx followed by a library search with SpectraST. Searching a consensus library compiled from the search results of the prior Phenyx search increased the number of confidently matched spectra by up to 156%. The spectra additionally matched by SpectraST included noisy spectra, spectra representing peptides with missed cleavages, and spectra from post‐translationally modified peptides.

10.
As proteomic data sets increase in size and complexity, the necessity for database‐centric software systems able to organize, compare, and visualize all the proteomic experiments in a lab grows. We recently developed an integrated platform called high‐throughput autonomous proteomic pipeline (HTAPP) for the automated acquisition and processing of quantitative proteomic data, and integration of proteomic results with existing external protein information resources within a lab‐based relational database called PeptideDepot. Here, we introduce the peptide validation software component of this system, which combines relational database‐integrated electronic manual spectral annotation in Java with a new software tool in the R programming language for the generation of logistic regression spectral models from user‐supplied validated data sets and flexible application of these user‐generated models in automated proteomic workflows. This logistic regression spectral model uses both variables computed directly from SEQUEST output and deterministic variables based on expert manual validation criteria of spectral quality. In the case of linear quadrupole ion trap (LTQ) or LTQ‐FTICR LC/MS data, our logistic spectral model outperformed both XCorr (242% more peptides identified on average) and the X!Tandem E‐value (87% more peptides identified on average) at a 1% false discovery rate estimated by a decoy database approach.
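
The comparison in this abstract is made at a 1% false discovery rate estimated with a decoy database. The abstract does not state which estimator was used; a common simple choice, assumed here purely for illustration, is the ratio of decoy matches to target matches at or above a score threshold. The function name `estimate_fdr` and the example scores are hypothetical.

```python
def estimate_fdr(psms, threshold):
    """Estimate the FDR among peptide-spectrum matches scoring >= threshold.

    psms: list of (score, is_decoy) pairs.
    Assumed estimator for this sketch: number of decoy hits divided by
    number of target hits above the threshold (0.0 if no target passes).
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Hypothetical matches: (score, matched a decoy sequence?).
psms = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
print(estimate_fdr(psms, 0.75))  # only target hits pass this threshold
print(estimate_fdr(psms, 0.65))  # one decoy hit and two target hits pass
```

In practice the threshold is lowered until the estimate reaches the desired level (for example 1%), and the number of target matches above it is reported.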

11.
pyOpenMS is an open‐source, Python‐based interface to the C++ OpenMS library, providing facile access to a feature‐rich, open‐source algorithm library for MS‐based proteomics analysis. It contains Python bindings that allow raw access to the data structures and algorithms implemented in OpenMS, specifically those for file access (mzXML, mzML, TraML, mzIdentML among others), basic signal processing (smoothing, filtering, de‐isotoping, and peak‐picking) and complex data analysis (including label‐free, SILAC, iTRAQ, and SWATH analysis tools). pyOpenMS thus allows fast prototyping and efficient workflow development in a fully interactive manner (using the interactive Python interpreter) and is also ideally suited for researchers not proficient in C++. In addition, our code to wrap a complex C++ library is completely open‐source, allowing other projects to create similar bindings with ease. The pyOpenMS framework is freely available at https://pypi.python.org/pypi/pyopenms while the autowrap tool to create Cython code automatically is available at https://pypi.python.org/pypi/autowrap (both released under the 3‐clause BSD licence).

12.
The quantification of changes in protein abundance in complex biological specimens is essential for proteomic studies in basic and applied research. Here we report on the development and validation of the DeepQuanTR software for identification and quantification of differentially expressed proteins using LC‐MALDI‐MS. Following enzymatic digestion, HPLC peptide separation and normalization of MALDI‐MS signal intensities to those of internal standards, the software extracts peptide features, adjusts differences in HPLC retention times and performs a relative quantification of features. The annotation of multiple peptides to the corresponding parent protein allows the definition of a Protein Quant Value, which is related to protein abundance and which allows inter‐sample comparisons. The performance of DeepQuanTR was evaluated by analyzing 24 samples deriving from human serum spiked with different amounts of four proteins and eight complex samples of vascular proteins, derived from surgically resected human kidneys with cancer following ex vivo perfusion with a reactive ester biotin derivative. The identification and experimental validation of proteins that were differentially regulated in cancerous lesions as compared with normal kidney were used to demonstrate the power of DeepQuanTR. This software, which can easily be used with established proteomic methodologies, facilitates the relative quantification of proteins derived from a wide variety of different samples.

13.
Protein identification by MS is an important technique in both gel‐based and gel‐free proteome studies. The Open Mass Spectrometry Search Algorithm (OMSSA) ( http://pubchem.ncbi.nlm.nih.gov/omssa ) is an open‐source search engine that can be used to identify MS/MS spectra acquired in these experiments. We here present a lightweight, open‐source Java software library, OMSSA Parser ( http://code.google.com/p/omssa‐parser ), which parses OMSSA omx result files into easily accessible and fully functional object models. In addition, we also provide examples illustrating the usage of our library.

14.
Identification of proteins by MS plays an important role in proteomics. A crucial step concerns the identification of peptides from MS/MS spectra. The X!Tandem Project ( http://www.thegpm.org/tandem ) supplies an open‐source search engine for this purpose. In this study, we present an open‐source Java library called XTandem Parser that parses X!Tandem XML result files into an easily accessible and fully functional object model ( http://xtandem‐parser.googlecode.com ). In addition, a graphical user interface is provided that functions as a usage example and an end‐user visualization tool.

16.
Designed ankyrin repeat proteins (DARPins) are well‐established binding molecules based on a highly stable nonantibody scaffold. Building on 13 crystal structures of DARPin‐target complexes and stability measurements of DARPin mutants, we have generated a new DARPin library containing an extended randomized surface. To counteract the enrichment of unspecific hydrophobic binders during selections against difficult targets containing hydrophobic surfaces such as membrane proteins, the frequency of apolar residues at diversified positions was drastically reduced and substituted by an increased number of tyrosines. Ribosome display selections against two human caspases and membrane transporter AcrB yielded highly enriched pools of unique and strong DARPin binders which were mainly monomeric. We noted a prominent enrichment of tryptophan residues during binder selections. A crystal structure of a representative of this library in complex with caspase‐7 visualizes the key roles of both tryptophans and tyrosines in providing target contacts. These aromatic and polar side chains thus substitute the apolar residues valine, leucine, isoleucine, methionine, and phenylalanine of the original DARPins. Our work describes biophysical and structural analyses required to extend existing binder scaffolds and simplifies an existing protocol for the assembly of highly diverse synthetic binder libraries.

18.
An important part of understanding the evolution of behavior is understanding how and why behavior develops and changes throughout ontogeny. Patterns of behavior are shaped by an animal's capabilities as well as its motivations, both of which are subject to selection. We ran an experiment to see how spiders' efforts to recover lost prey change with age and to determine the relative contributions of shifts in capability and motivation. We found that as spiders mature, they spend less time searching to recover lost prey, and they discriminate less between prey of different sizes. We also found that even the youngest, least experienced spiders are cognitively equipped to search for lost prey. Thus, predatory behavior in spiders fluctuated primarily with each age group's motivations to capture and consume prey, and did not seem to be hindered by behavioral or cognitive limitations at young ages.

20.
The relation of vitiligo/non‐segmental vitiligo (NSV) to Koebner's phenomenon is variably appreciated. Our objective was to develop and validate a simple clinical score for Koebner's phenomenon (KP) in patients with vitiligo/NSV. The study population was composed of 351 individuals in the development sample and 285 patients in the validation sample. Seven variables were independently associated with the presence of KP: disease duration of more than 3 yr, forehead + scalp areas, eyelids, wrists, genital + belt areas, knees and tibial crests. The score computed by the weighted sum of the rounded coefficients of these seven variables ranged from 0 to 56 (mean 38.39 ± 22.93). The probability of having KP was computed as follows: exp(−2.37 + 0.1*score)/[1 + exp(−2.37 + 0.1*score)]. When applying the score to each patient in the validation and the development sample, the score maintained adequate discrimination and calibration (AUC‐ROC = 0.78), arguing that KP can be adequately predicted using our score. Further studies should evaluate KP assessed by the K‐VSCOR in clinical practice with the aim to determine its association with clinical profile, course and treatment response of vitiligo.
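
The probability formula in this abstract has the shape of a logistic function of the score. A minimal sketch follows, assuming the standard logistic form exp(x)/(1 + exp(x)) with x = −2.37 + 0.1*score and the coefficients quoted in the abstract; the function name `kp_probability` is hypothetical, and the sketch is not a validated clinical tool.

```python
import math


def kp_probability(score):
    """Probability of Koebner's phenomenon for a given clinical score.

    Assumes the standard logistic form exp(x) / (1 + exp(x)) with
    x = -2.37 + 0.1 * score, using the coefficients quoted in the abstract.
    """
    x = -2.37 + 0.1 * score
    return math.exp(x) / (1.0 + math.exp(x))


# The abstract reports scores ranging from 0 to 56.
for score in (0, 23.7, 56):
    print(score, round(kp_probability(score), 3))
```

Under this form the probability rises monotonically with the score and crosses 0.5 where 0.1*score equals 2.37, that is, at a score of 23.7.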
