1.

Algorithms for the de novo sequencing of peptides from tandem mass spectra

Allmer J 《Expert review of proteomics》2011,8(5):645-657

Proteomics is the study of proteins, their time- and location-dependent expression profiles, as well as their modifications and interactions. Mass spectrometry is useful to investigate many of the questions asked in proteomics. Database search methods are typically employed to identify proteins from complex mixtures. However, databases are not often available or, despite their availability, some sequences are not readily found therein. To overcome this problem, de novo sequencing can be used to directly assign a peptide sequence to a tandem mass spectrometry spectrum. Many algorithms have been proposed for de novo sequencing and a selection of them are detailed in this article. Although a standard accuracy measure has not been agreed upon in the field, relative algorithm performance is discussed. The current state of the de novo sequencing is assessed thereafter and, finally, examples are used to construct possible future perspectives of the field.

2.

Algorithms for the de novo sequencing of peptides from tandem mass spectra

《Expert review of proteomics》2013,10(5):645-657

Proteomics is the study of proteins, their time- and location-dependent expression profiles, as well as their modifications and interactions. Mass spectrometry is useful to investigate many of the questions asked in proteomics. Database search methods are typically employed to identify proteins from complex mixtures. However, databases are not often available or, despite their availability, some sequences are not readily found therein. To overcome this problem, de novo sequencing can be used to directly assign a peptide sequence to a tandem mass spectrometry spectrum. Many algorithms have been proposed for de novo sequencing and a selection of them are detailed in this article. Although a standard accuracy measure has not been agreed upon in the field, relative algorithm performance is discussed. The current state of the de novo sequencing is assessed thereafter and, finally, examples are used to construct possible future perspectives of the field.

3.

ADEPTS: advanced peptide de novo sequencing with a pair of tandem mass spectra

He L Ma B 《Journal of bioinformatics and computational biology》2010,8(6):981-994

De novo sequencing is an important task in proteomics to identify novel peptide sequences. Traditionally, only one MS/MS spectrum is used for the sequencing of a peptide; however, the use of multiple spectra of the same peptide with different types of fragmentation has the potential to significantly increase the accuracy and practicality of de novo sequencing. Research into the use of multiple spectra is in a nascent stage. We propose a general framework to combine the two different types of MS/MS data. Experiments demonstrate that our method significantly improves the de novo sequencing of existing software.

4.

De novo peptide sequencing and identification with precision mass spectrometry (total citations: 1; self-citations: 0; cited by others: 1)

Frank AM Savitski MM Nielsen ML Zubarev RA Pevzner PA 《Journal of proteome research》2007,6(1):114-123

The recent proliferation of novel mass spectrometers such as Fourier transform, QTOF, and OrbiTrap marks a transition into the era of precision mass spectrometry, providing a 2 orders of magnitude boost to the mass resolution, as compared to low-precision ion-trap detectors. We investigate peptide de novo sequencing by precision mass spectrometry and explore some of the differences when compared to analysis of low-precision data. We demonstrate how the dramatically improved performance of de novo sequencing with precision mass spectrometry paves the way for novel approaches to peptide identification that are based on direct sequence lookups, rather than comparisons of spectra to a database. With the direct sequence lookup, it is not only possible to search a database very efficiently, but also to use the database in novel ways, such as searching for products of alternative splicing or products of fusion proteins in cancer. Our de novo sequencing software is available for download at http://peptide.ucsd.edu/.

5.

SPIDER: software for protein identification from sequence tags with de novo sequencing error

Han Y Ma B Zhang K 《Journal of bioinformatics and computational biology》2005,3(3):697-716

For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then used to match, accounting for amino acid mutations, the sequences in a protein database. If the de novo sequencing gives correct tags, the homologs of the proteins can be identified by this approach and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same mass. We developed a new efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on the Internet for free public use. This paper describes the algorithms and features of the SPIDER software.

6.

A suboptimal algorithm for de novo peptide sequencing via tandem mass spectrometry. (total citations: 4; self-citations: 0; cited by others: 4)

Bingwen Lu Ting Chen 《Journal of computational biology》2003,10(1):1-12

Tandem mass spectrometry has emerged to be one of the most powerful high-throughput techniques for protein identification. Tandem mass spectrometry selects and fragments peptides of interest into N-terminal ions and C-terminal ions, and it measures the mass/charge ratios of these ions. The de novo peptide sequencing problem is to derive the peptide sequences from given tandem mass spectral data of k ion peaks without searching against protein databases. By transforming the spectral data into a matrix spectrum graph G = (V, E), where |V| = O(k^2) and |E| = O(k^3), we give the first polynomial time suboptimal algorithm that finds all the suboptimal solutions (peptides) in O(p|E|) time, where p is the number of solutions. The algorithm has been implemented and tested on experimental data. The program is available at http://hto-c.usc.edu:8000/msms/menu/denovo.htm.
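The graph view of de novo sequencing described in this abstract can be illustrated with a much simpler construction. The sketch below is not the matrix spectrum graph or the O(p|E|) algorithm of Lu and Chen: it assumes an idealized list of prefix (N-terminal) fragment masses, uses only a five-residue subset of the standard monoisotopic residue masses, and enumerates paths exhaustively. The function names and the tolerance value are illustrative assumptions.

```python
# Monoisotopic residue masses in Da (deliberately a small subset for illustration).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
}


def build_spectrum_graph(prefix_masses, tol=0.01):
    """Connect two masses when their difference matches one residue mass."""
    nodes = sorted(prefix_masses)
    edges = {i: [] for i in range(len(nodes))}
    for i, low in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            diff = nodes[j] - low
            for residue, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol:
                    edges[i].append((j, residue))
    return nodes, edges


def enumerate_peptides(nodes, edges):
    """List every residue string spelled by a path from the first to the last node."""
    last = len(nodes) - 1
    results = []

    def walk(index, prefix):
        if index == last:
            results.append(prefix)
            return
        for nxt, residue in edges[index]:
            walk(nxt, prefix + residue)

    walk(0, "")
    return results


# Prefix masses of the toy peptide G-A-S plus one noise peak at 100.0 Da.
nodes, edges = build_spectrum_graph([0.0, 57.02146, 100.0, 128.05857, 215.09060])
print(enumerate_peptides(nodes, edges))  # ['GAS']
```

The noise peak is ignored because no residue mass in the subset links it to the start or end node; with real spectra, missing peaks and isobaric residues make the path search far more ambiguous than this toy case suggests.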

7.

Towards de novo identification of metabolites by analyzing tandem mass spectra

Böcker S Rasche F 《Bioinformatics (Oxford, England)》2008,24(16):i49-i55


8.

Enhancing TOF/TOF-based de novo sequencing capability for high throughput protein identification with amino acid-coded mass tagging

Shui W Liu Y Fan H Bao H Liang S Yang P Chen X 《Journal of proteome research》2005,4(1):83-90

Because of the intrinsic physical properties of single- or double-charged ions, MALDI-based CID on these peptide precursor ions tends to be incomplete, resulting in a large number of MS/MS spectra unassigned or ambiguously identified. Consequently, the TOF/TOF high throughput capability may not be fully explored and utilized. Here, we describe a novel method for de novo sequence assignment of those MALDI TOF/TOF MS/MS spectra with incomplete or weak fragment ion series. In this approach, the deuterium-labeled lysine and leucine precursors were used in parallel to mass-tag the proteome of a metastatic human hepatocellular carcinoma (HCC) cell line during in vivo cell culturing. These stable isotope precursor markers not only position at terminal but at internal MS/MS fragment ions with the characteristic isotope pattern induced by multiple mass tagging in parallel. This enhanced signal specificity evidently resolved ambiguities in those sparse poor-quality TOF/TOF spectra by providing critical sequential links among MS/MS fragment ions. Our data-dependent approach was able to reduce many false-positives in current genome sequence-based peptide sequencing. With developing new algorithms accordingly, our approach is amenable for automation that will lead to more comprehensive and reliable identification for proteomes.

9.

Rapid and accurate peptide identification from tandem mass spectra

Park CY Klammer AA Käll L MacCoss MJ Noble WS 《Journal of proteome research》2008,7(7):3022-3027

Mass spectrometry, the core technology in the field of proteomics, promises to enable scientists to identify and quantify the entire complement of proteins in a complex biological sample. Currently, the primary bottleneck in this type of experiment is computational. Existing algorithms for interpreting mass spectra are slow and fail to identify a large proportion of the given spectra. We describe a database search program called Crux that reimplements and extends the widely used database search program Sequest. For speed, Crux uses a peptide indexing scheme to rapidly retrieve candidate peptides for a given spectrum. For each peptide in the target database, Crux generates shuffled decoy peptides on the fly, providing a good null model and, hence, accurate false discovery rate estimates. Crux also implements two recently described postprocessing methods: a p value calculation based upon fitting a Weibull distribution to the observed scores, and a semisupervised method that learns to discriminate between target and decoy matches. Both methods significantly improve the overall rate of peptide identification. Crux is implemented in C and is distributed with source code freely to noncommercial users.
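The idea of shuffled decoys as a null model for false discovery rate estimation can be sketched in a few lines. This is not Crux's implementation: keeping the C-terminal residue fixed during shuffling, the simple decoy/target ratio, and all function names are assumptions made for illustration.

```python
import random


def shuffled_decoy(peptide: str, rng: random.Random) -> str:
    """Shuffle every residue except the last one (assumed convention)."""
    if len(peptide) < 3:
        return peptide
    body = list(peptide[:-1])
    rng.shuffle(body)
    return "".join(body) + peptide[-1]


def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR as (decoys >= threshold) / (targets >= threshold), capped at 1."""
    n_target = sum(score >= threshold for score in target_scores)
    n_decoy = sum(score >= threshold for score in decoy_scores)
    if n_target == 0:
        return 0.0
    return min(1.0, n_decoy / n_target)


rng = random.Random(0)
decoy = shuffled_decoy("PEPTIDEK", rng)
# The decoy has the same residues (hence the same mass) and the same last residue.
print(sorted(decoy) == sorted("PEPTIDEK"), decoy[-1])  # True K

# Three targets and one decoy score at or above 2.5, so the estimate is 1/3.
print(estimate_fdr([5.0, 4.0, 3.0, 2.0, 1.0], [3.0, 1.0, 0.5], 2.5))
```

Because a shuffled decoy has the same amino acid composition as its target, it falls in the same precursor mass window, which is what makes it a reasonable stand-in for an incorrect match.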

10.

Faster SEQUEST searching for peptide identification from tandem mass spectra

Diament BJ Noble WS 《Journal of proteome research》2011,10(9):3871-3879

Computational analysis of mass spectra remains the bottleneck in many proteomics experiments. SEQUEST was one of the earliest software packages to identify peptides from mass spectra by searching a database of known peptides. Though still popular, SEQUEST performs slowly. Crux and TurboSEQUEST have successfully sped up SEQUEST by adding a precomputed index to the search, but the demand for ever-faster peptide identification software continues to grow. Tide, introduced here, is a software program that implements the SEQUEST algorithm for peptide identification and that achieves a dramatic speedup over Crux and SEQUEST. The optimization strategies detailed here employ a combination of algorithmic and software engineering techniques to achieve speeds up to 170 times faster than a recent version of SEQUEST that uses indexing. For example, on a single Xeon CPU, Tide searches 10,000 spectra against a tryptic database of 27,499 Caenorhabditis elegans proteins at a rate of 1550 spectra per second, which compares favorably with a rate of 8.8 spectra per second for a recent version of SEQUEST with index running on the same hardware.

11.

Modeling and characterization of multi-charge mass spectra for peptide sequencing

Chong KF Ning K Leong HW Pevzner P 《Journal of bioinformatics and computational biology》2006,4(6):1329-1352

Peptide sequencing using tandem mass spectrometry data is an important and challenging problem in proteomics. We address the problem of peptide sequencing for multi-charge spectra. Most peptide sequencing algorithms currently consider only charge one or two ions even for higher-charge spectra. We give a characterization of multi-charge spectra by generalizing existing models. Using our models, we analyzed spectra from Global Proteome Machine (GPM) [Craig R, Cortens JP, Beavis RC, J Proteome Res 3:1234-1242, 2004.] (with charges 1-5), Institute for Systems Biology (ISB) [Keller A, Purvine S, Nesvizhskii AI, Stolyar S, Goodlett DR, Kolker E, OMICS 6:207-212, 2002.] and Orbitrap (both with charges 1-3). Our analysis for the GPM dataset shows that higher charge peaks contribute significantly to prediction of the complete peptide. They also help to explain why existing algorithms do not perform well on multi-charge spectra. Based on these analyses, we claim that peptide sequencing algorithms can achieve higher sensitivity results if they also consider higher charge ions. We verify this claim by proposing a de novo sequencing algorithm called the greedy best strong tag (GBST) algorithm that is simple but considers higher charge ions based on our new model. Evaluation on multi-charge spectra shows that our simple GBST algorithm outperforms Lutefisk and PepNovo, especially for the GPM spectra of charge three or more.

12.

De novo analysis of peptide tandem mass spectra by spectral graph partitioning.

Marshall Bern David Goldberg 《Journal of computational biology》2006,13(2):364-378

We report on a new de novo peptide sequencing algorithm that uses spectral graph partitioning. In this approach, relationships between m/z peaks are represented by attractive and repulsive springs, and the vibrational modes of the spring system are used to infer information about the peaks (such as "likely b-ion" or "likely y-ion"). We demonstrate the effectiveness of this approach by comparison with other de novo sequencers on test sets of ion-trap and QTOF spectra, including spectra of mixtures of peptides. On all datasets, we outperform the other sequencers. Along with spectral graph theory techniques, the new de novo sequencer EigenMS incorporates another improvement of independent interest: robust statistical methods for recalibration of time-of-flight mass measurements. Robust recalibration greatly outperforms simple least-squares recalibration, achieving about three times the accuracy for one QTOF dataset.

13.

Selection of the peptide mass tolerance value for protein identification with peptide mass fingerprinting

A. L. Chernobrovkin O. P. Trifonova N. A. Petushkova E. A. Ponomarenko A. V. Lisitsa 《Russian Journal of Bioorganic Chemistry》2011,37(1):119-122

Peptide mass fingerprinting (PMF) is widely used for protein identification while studying proteome via time-of-flight mass spectrometer or via 1D or 2D electrophoresis. Peptide mass tolerance indicating the fit of theoretical peptide mass to an experimental one significantly influences protein identification. The role of peptide mass tolerance could be estimated by counting the number of correctly identified proteins for the reference set of mass spectra. The reference set of 400 Ultraflex (Bruker Daltonics, Germany) protein mass spectra was obtained for liver microsomes slices hydrolyzed via 1D gel electrophoresis. Using a Mascot server for protein identification, the peptide mass tolerance value varied within 0.02–0.40 Da with a step of 0.01 Da. The number of identified proteins changed up to 10 times depending on the tolerance. The maximal number of identified proteins was reported for the tolerance value of 0.15 Da (120 ppm) known to be 1.5–2-fold higher than the recommended values for such a type of mass spectrometer. The software program PMFScan was developed to obtain the dependence between the number of identified proteins and the tolerance values.
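The pairing of an absolute tolerance (0.15 Da) with a relative one (120 ppm) in this abstract follows from ppm = tolerance / mass × 10^6. The sketch below works through that arithmetic and lists the scanned tolerance grid. The 1250 Da reference mass is an assumption inferred from the two quoted values, not a figure stated in the abstract, and the function names are illustrative; this is not PMFScan.

```python
def da_to_ppm(tol_da: float, mass_da: float) -> float:
    """Convert an absolute mass tolerance in Da to ppm at a given peptide mass."""
    return tol_da / mass_da * 1e6


def ppm_to_da(tol_ppm: float, mass_da: float) -> float:
    """Convert a relative tolerance in ppm to Da at a given peptide mass."""
    return tol_ppm * mass_da / 1e6


def within_tolerance(theoretical: float, observed: float, tol_da: float) -> bool:
    """True when the observed mass lies within +/- tol_da of the theoretical mass."""
    return abs(theoretical - observed) <= tol_da


# Tolerances from 0.02 to 0.40 Da in 0.01 Da steps, built from integers
# to avoid accumulating floating-point error.
tolerance_grid = [(2 + i) / 100 for i in range(39)]

ASSUMED_MASS_DA = 1250.0  # hypothetical reference mass, see the note above
print(round(da_to_ppm(0.15, ASSUMED_MASS_DA), 6))  # 120.0
print(len(tolerance_grid), tolerance_grid[0], tolerance_grid[-1])  # 39 0.02 0.4
```

The same absolute tolerance corresponds to a larger ppm value for lighter peptides and a smaller one for heavier peptides, which is why a Da setting and a ppm setting are only equivalent at one mass.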

14.

CIRI: an efficient and unbiased algorithm for de novo circular RNA identification (total citations: 2; self-citations: 0; cited by others: 2)

Yuan Gao Jinfeng Wang Fangqing Zhao 《Genome biology》2015,16(1)


15.

Highly informative proteome analysis by combining improved N-terminal sulfonation for de novo peptide sequencing and online capillary reverse-phase liquid chromatography/tandem mass spectrometry (total citations: 2; self-citations: 0; cited by others: 2)

Lee YH Kim MS Choie WS Min HK Lee SW 《Proteomics》2004,4(6):1684-1694

Recently, various chemical modifications of peptides have been incorporated into mass spectrometric analyses of proteome samples, predominantly in conjunction with matrix-assisted laser desorption/ionization mass spectrometry (MALDI MS), to facilitate de novo sequencing of peptides. In this work, we investigate systematically the utility of N-terminal sulfonation of tryptic peptides by 4-sulfophenyl isothiocyanate (SPITC) for proteome analysis by capillary reverse-phase liquid chromatography/tandem mass spectrometry (cRPLC/MS/MS). The experimental conditions for the sulfonation were carefully adjusted so that SPITC reacts selectively with the N-terminal amino groups, even in the presence of the epsilon-amino groups of lysine residues. Mass spectrometric analyses of the modified peptides by cRPLC/MS/MS indicated that SPITC derivatization proceeded toward near completion under the experimental conditions employed here. The SPITC-derivatized peptides underwent facile fragmentation, predominantly resulting in y-series ions in the MS/MS spectra. Combining SPITC derivatization and cRPLC/MS/MS analyses facilitated the acquisition of sequence information for lysine-terminated tryptic peptides as well as arginine-terminated peptides without the need for additional peptide pretreatment, such as guanidination of lysine amino group. This process alleviated the biased detection of arginine-terminated peptides that is often observed in MALDI MS experiments. We will discuss the utility of the technique as a viable method for proteome analyses and present examples of its application in analyzing samples having different levels of complexity.

16.

Cloneless genomic DNA analysis: an efficient and simple methods for de novo genomic sequencing projects and gap filling

Nguyen G Bukanov N Oshimura M Smith CL 《Biomolecular engineering》2005,21(6):135-144

The utility of using genomic DNA directly in agarose, i.e. cloneless libraries, in place of large clone libraries, radiation hybrid panels, or chromosome dissection was demonstrated. The advantage of the cloneless library approach is that, in principle, a targeted genomic resource can be developed rapidly for any genomic region using any genomic DNA sample. Here, a human chromosome 20 Not I fragment library was generated by slicing a pulsed field gel lane containing fractionating Not I cleaved DNA from a monosomic hybrid cell line into 2 mm pieces. A reliable PCR method using agarose embedded DNA was developed. InterAlu PCR generated unique patterns of products from adjacent slices (e.g. fractions). Further, the specificity of the interAlu products was demonstrated by FISH analysis and in other hybridization experiments to arrayed interAlu products. STS content mapping was used to order the fractions and also demonstrate the unique content of the library fractions.

17.

Selection of the peptide mass tolerance value for the protein identification with peptide mass fingerprinting

Chernobrovkin AL Trifonova OP Petushkova NA Ponomarenko EA Lisitsa AV 《Bioorganicheskaia khimiia》2011,37(1):132-136

Peptide mass fingerprinting is widely used for protein identification while studying the proteome with the use of 1D or 2D electrophoresis. Peptide mass tolerance indicates the fit of the theoretical peptide mass with the experimental measurements, and the choice of this parameter significantly influences protein identification. The role of peptide mass tolerance was estimated by counting the number of identified proteins for the reference set of mass spectra. The reference set of 400 Ultraflex (Bruker Daltonics, Germany) mass spectra was obtained for the slices of a 1D gel of liver microsomes. Using a Mascot server for protein identification, the peptide mass tolerance value was varied in the range from 0.02 to 0.40 Da with a step of 0.01 Da. Depending on the tolerance, the number of identified proteins changes up to 10-fold. The maximal number of identified proteins was reported for the tolerance value of 0.15 Da (120 ppm), which is 1.5-2 times higher than the recommended values for this type of mass spectrometer. The software program PMFScan was developed to obtain the dependence of the number of identified proteins on the tolerance values.

18.

Sequencing from compomers: using mass spectrometry for DNA de novo sequencing of 200+ nt.

Sebastian Böcker 《Journal of computational biology》2004,11(6):1110-1134

One of the main endeavors in today's life science remains the efficient sequencing of long DNA molecules. Today, most de novo sequencing of DNA is still performed using the electrophoresis-based Sanger concept of 1977, in spite of certain restrictions of this method. Methods using mass spectrometry to acquire the Sanger sequencing data are limited by short sequencing lengths of 15-25 nt. We propose a new method for DNA sequencing using base-specific cleavage and mass spectrometry that appears to be a promising alternative to classical DNA sequencing approaches. A single stranded DNA or RNA molecule is cleaved by a base-specific (bio-)chemical reaction using, for example, RNAses. The cleavage reaction is modified such that not all, but only a certain percentage of bases are cleaved. The resulting mixture of fragments is then analyzed using MALDI-TOF mass spectrometry, whereby we acquire the molecular masses of fragments. For every peak in the mass spectrum, we calculate those base compositions that will potentially create a peak of the observed mass and, repeating the cleavage reaction for all four bases, finally try to uniquely reconstruct the underlying sequence from these observed spectra. This leads us to the combinatorial problem of sequencing from compomers and, finally, to the graph-theoretical problem of finding a walk in a subgraph of the de Bruijn graph. Application of this method to simulated data indicates that it might be capable of sequencing DNA molecules with 200+ nt.

19.

Antimicrobial peptides from the skin of the Asian frog, Odorrana jingdongensis: De novo sequencing and analysis of tandem mass spectrometry data

Liu J Jiang J Wu Z Xie F 《Journal of Proteomics》2012,75(18):5807-5821

Eight intact antimicrobial peptides were identified from the skin of Odorrana jingdongensis by de novo sequencing following low energy ESI CID Q-TOF MS/MS in positive-mode with the help of Edman degradation and structural similarity analysis. We devised exact mass measurements to discriminate the K/Q amino acid residue in the peptides between 2.0 kDa and 3.8 kDa. Moreover, the cleavage at the C-S bond at the side chain of Met was observed in all the spectra of the peptides containing a Met residue. We also found unusual cleavages within the intramolecular disulfide loop with high frequency. Our data revealed that the cleavage pathways are significantly different from those reported previously, which are similar to the cyclic peptide cleavage mode followed by the secondary cleavage at the C-S bond on oxidized Cys. Thus, our results strongly suggest that ion series generated from the cleavages within the intramolecular disulfide loop should be considered in both the top-down sequencing and the disulfide bridge location with the presence of a relatively high intensity of MH(+)-28 ion marker. Furthermore, our activity data implied that different AMPs may use different strategies to kill microbes.

20.

Mapping the proteome of thylakoid membranes by de novo sequencing of intermembrane peptide domains

Granvogl B Reisinger V Eichacker LA 《Proteomics》2006,6(12):3681-3695

The proteome of a membrane compartment has been investigated by de novo sequence analysis after tryptic in-gel digestion. Protein complexes and corresponding protein subunits were separated by a 2-D Blue Native (BN)/SDS-PAGE system. The transmembrane proteins of thylakoid membranes from a higher plant (Hordeum vulgare L.) were identified by the primary sequence of hydrophilic intermembrane peptide domains using nano ESI-MS/MS-analysis. Peptide analysis revealed that lysine residues of membrane proteins are primarily situated in the intermembrane domains. We concluded that esterification of lysine residues with fluorescent dyes may open the opportunity to label membrane proteins still localized in native protein complexes within the membrane phase. We demonstrate that covalent labelling of membrane proteins with the fluorescent dye Cy3 allows highly sensitive visualization of protein complexes after 2-D BN/SDS-PAGE. We show that pre-electrophoretic labelling of protein subunits supplements detection of proteins by post-electrophoretic staining with silver and CBB and assists in completing the identification of the membrane proteome.