Similar articles
 20 similar articles found (search time: 31 ms)
1.
Reversed-phase liquid chromatography (LC) directly coupled with electrospray-tandem mass spectrometry (MS/MS) is a successful choice to obtain a large number of product ion spectra from a complex peptide mixture. We describe a search validation program, ScoreRidge, developed for analysis of LC-MS/MS data. The program validates peptide assignments to product ion spectra resulting from usual probability-based searches against primary structure databases. The validation is based only on correlation between the measured LC elution time of each peptide and the deduced elution time from the amino acid sequence assigned to product ion spectra obtained from the MS/MS analysis of the peptide. Sufficient numbers of probable assignments gave a highly correlative curve. Any peptide assignments within a certain tolerance from the correlation curve were accepted for the following arrangement step to list identified proteins. Using this data validation program, host protein candidates responsible for interaction with human hepatitis B virus core protein were identified from a partially purified protein mixture. The present simple and practical program complements protein identification from usual product ion search algorithms and reduces manual interpretation of the search result data. It will lead to more explicit protein identification from complex peptide mixtures such as whole proteome digests from tissue samples.
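
The elution-time check described in this abstract can be sketched as follows. This is only an illustration, not ScoreRidge itself: the abstract does not say how elution times are deduced from sequences or what shape the correlation curve has, so the sketch assumes a straight-line fit and takes the predicted times as given. The names `fit_line`, `validate_by_elution_time` and the `tolerance` parameter are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def validate_by_elution_time(assignments, tolerance):
    """Keep assignments whose measured time lies within `tolerance` of the fit.

    `assignments` is a list of (peptide, predicted_time, measured_time).
    Assumption: the curve is fitted to all assignments at once; a real tool
    would more likely fit it to the most probable assignments only.
    """
    predicted = [a[1] for a in assignments]
    measured = [a[2] for a in assignments]
    slope, intercept = fit_line(predicted, measured)
    return [
        peptide
        for peptide, pred, meas in assignments
        if abs(meas - (slope * pred + intercept)) <= tolerance
    ]


# Hypothetical data: four assignments follow a common trend, one does not.
example = [
    ("pepA", 10, 20),
    ("pepB", 20, 40),
    ("pepC", 30, 60),
    ("pepD", 40, 80),
    ("pepE", 25, 90),
]
accepted = validate_by_elution_time(example, tolerance=10)
```

Because the outlier also pulls on the fit, a more robust regression, or a fit restricted to high-scoring assignments, would probably be preferable in practice.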

2.
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId's knowledge database to include proteotypic information, utilized RAId's statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId's programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those that occurred within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless of whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches.

3.
A novel hybrid methodology for the automated identification of peptides via de novo integer linear optimization, local database search, and tandem mass spectrometry is presented in this article. A modified version of the de novo identification algorithm PILOT is utilized to construct accurate de novo peptide sequences. A modified version of the local database search tool FASTA is used to query these de novo predictions against the nonredundant protein database to resolve any low-confidence amino acids in the candidate sequences. The computational burden associated with performing several alignments is alleviated with the use of distributive computing. Extensive computational studies are presented for this new hybrid methodology, as well as comparisons with MASCOT for a set of 38 quadrupole time-of-flight (QTOF) and 380 OrbiTrap tandem mass spectra. The results for our proposed hybrid method for the OrbiTrap spectra are also compared with a modified version of PepNovo, which was trained for use on high-precision tandem mass spectra, and the tag-based method InsPecT. The de novo sequences of PILOT and PepNovo are also searched against the nonredundant protein database using CIDentify to compare with the alignments achieved by our modifications of FASTA. The comparative studies demonstrate the excellent peptide identification accuracy gained from combining the strengths of our de novo method, which is based on integer linear optimization, and database-driven search methods.

4.
Xia D, Ghali F, Gaskell SJ, O'Cualain R, Sims PF, Jones AR. Proteomics, 2012, 12(12): 1912-1916
The development of ion mobility (IM) MS instruments has the capability to provide an added dimension to peptide analysis pipelines in proteomics, but, as yet, there are few software tools available for analysing such data. IM can be used to provide additional separation of parent ions or product ions following fragmentation. In this work, we have created a set of software tools that are capable of converting three-dimensional IM data generated from analysis of fragment ions into a variety of formats used in proteomics. We demonstrate that IM can be used to calculate the charge state of a fragment ion, demonstrating the potential to improve peptide identification by excluding non-informative ions from a database search. We also provide preliminary evidence of structural differences between b and y ions for certain peptide sequences but not others. All software tools and data sets are made available in the public domain at http://code.google.com/p/ion-mobility-ms-tools/.

5.
Recently, we carried out a statistical analysis of a 'tryptic' peptide tandem mass spectrometry database in order to identify sequence-dependent patterns for the gas-phase fragmentation behavior of protonated peptide ions, and to improve the models for peptide fragmentation currently incorporated into peptide sequencing and database search algorithms [Kapp, E. A., Schutz, F., Reid, G. E., Eddes, J. S., Moritz, R. L., O'Hair, R. A. J., Speed, T. P. and Simpson, R. J. Anal. Chem. 2003, 75, 6251-6264.]. Here, we have reexamined this database in order to determine the effect of a common post-translational or process-induced modification, methionine oxidation, on the appearance and relative abundances of the product ions formed by low-energy collision-induced dissociation of peptide ions containing this modification. The results from this study indicate that the structurally diagnostic neutral loss of methane sulfenic acid (CH3SOH, 64 Da) from the side chain of methionine sulfoxide residues is the dominant fragmentation process for methionine sulfoxide-containing peptide ions under conditions of low proton mobility, i.e., when ionizing proton(s) are sequestered at strongly basic amino acids such as arginine, lysine or histidine. The product ion abundances resulting from this neutral loss were found to be approximately 2-fold greater than those resulting from the cleavage C-terminal to aspartic acid, which has previously been shown to be enhanced under the same conditions. In close agreement with these statistical trends, experimental and theoretical studies, employing synthetic 'tryptic' peptides and model methionine sulfoxide-containing peptide ions, have determined that the mechanism for enhanced methionine sulfoxide side chain cleavage proceeds primarily via a 'charge remote' process. However, the mechanism for dissociation of the side chain for these ions was observed to change as a function of proton mobility. Finally, the transition state barrier for the charge remote side chain cleavage mechanism is predicted to be energetically more favorable than that for charge remote cleavage C-terminal to aspartic acid.

6.
Chromatographed peptide signals form the basis of further data processing that eventually results in functional information derived from data‐dependent bottom‐up proteomics assays. We seek to rank LC/MS parent ions by the quality of their extracted ion chromatograms. Ranked extracted ion chromatograms act as an intuitive physical/chemical preselection filter to improve the quality of MS/MS fragment scans submitted for database search. We identify more than 4900 proteins when considering detector shifts of less than 7 ppm. High‐quality parent ions for which the database search yields no hits become candidates for subsequent unrestricted analysis for PTMs. Following this rational approach, we prioritize identification of more than 5000 spectrum matches from modified peptides and confirm the presence of acetaldehyde‐modified His/Lys. We present a logical workflow that scores data‐dependent selected ion chromatograms and leverages information about the semianalytical LC/LC dimension prior to MS. Our method can be successfully used to identify unexpected modifications in peptides with excellent chromatography characteristics, independent of fragmentation pattern and activation methods. We illustrate analysis of ion chromatograms detected in two different modes by RF linear ion trap and electrostatic field orbitrap.

7.
Protein identification has been greatly facilitated by database searches against protein sequences derived from product ion spectra of peptides. This approach is primarily based on the use of fragment ion mass information contained in a MS/MS spectrum. Unambiguous protein identification from a spectrum with low sequence coverage or poor spectral quality can be a major challenge. We present a two-dimensional (2D) mass spectrometric method in which the numbers of nitrogen atoms in the molecular ion and the fragment ions are used to provide additional discriminating power for much improved protein identification and de novo peptide sequencing. The nitrogen number is determined by analyzing the mass difference of corresponding peak pairs in overlaid spectra of 15N-labeled and unlabeled peptides. These peptides are produced by enzymatic or chemical cleavage of proteins from cells grown in 15N-enriched and normal media, respectively. It is demonstrated that, using 2D information, i.e., m/z and its associated nitrogen number, this method can not only confirm protein identification results generated by MS/MS database searching, but also identify peptides that are not possible to identify by database searching alone. Examples are presented of analyzing Escherichia coli K12 extracts that yielded relatively poor MS/MS spectra, presumably from the digests of low abundance proteins, which can still give positive protein identification using this method. Additionally, this 2D MS method can facilitate spectral interpretation for de novo peptide sequencing and identification of posttranslational or other chemical modifications. We envision that this method should be particularly useful for proteome expression profiling of organelles or cells that can be grown in 15N-enriched media.
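
The nitrogen-counting step lends itself to a short worked example. The sketch below assumes complete 15N incorporation and a known charge state, neither of which the abstract specifies; the function name `nitrogen_count` is hypothetical. It relies only on the mass difference between 15N and 14N, about 0.997035 Da.

```python
# Mass difference between the 15N and 14N isotopes, in Da.
N15_N14_MASS_DIFF = 0.997035


def nitrogen_count(mz_unlabeled, mz_labeled, charge):
    """Number of nitrogen atoms implied by the shift of a peak pair.

    Assumes full 15N labeling, so every nitrogen adds the same mass
    increment, and that both peaks carry the same charge.
    """
    mass_shift = (mz_labeled - mz_unlabeled) * charge
    return round(mass_shift / N15_N14_MASS_DIFF)


# Hypothetical doubly charged peak pair separated by 2.991105 m/z units:
# 2.991105 * 2 / 0.997035 corresponds to six nitrogen atoms.
n = nitrogen_count(500.000000, 502.991105, charge=2)
```

With incomplete labeling the observed shift falls short of an integer multiple of the increment, so a real implementation would presumably need a tolerance or an isotope-envelope fit rather than simple rounding.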

8.
Protein sequence database searching of tandem mass spectrometry data is commonly employed to identify post-translational modifications (PTMs) to peptides in global proteomic studies. In these studies, the accurate identification of these modified peptides relies on strategies to ensure high-confidence results from sequence database searching in which differential mass shift parameters are employed to identify PTMs to specific amino acids. Using lysine acetylation as an example PTM, we have observed that the inclusion of differential modification information in sequence database searching dramatically increases the potential for false-positive sequence matches to modified peptides, making the confident identification of true sequence matches difficult. In a proof-of-principle study of whole cell yeast lysates, we demonstrate the combination of preparative isoelectric focusing using free-flow electrophoresis, and an adjusted peptide isoelectric point prediction algorithm, as an effective means to increase the confidence of lysine-acetylated peptide identification. These results demonstrate the potential utility of this general strategy for improving the identification of PTMs which cause a shift to the intrinsic isoelectric point of peptides.

9.
Mass spectrometers that provide high mass accuracy such as FT-ICR instruments are increasingly used in proteomic studies. Although the importance of accurately determined molecular masses for the identification of biomolecules is generally accepted, its role in the analysis of shotgun proteomic data has not been thoroughly studied. To gain insight into this role, we used a hybrid linear quadrupole ion trap/FT-ICR (LTQ FT) mass spectrometer for LC-MS/MS analysis of a highly complex peptide mixture derived from a fraction of the yeast proteome. We applied three data-dependent MS/MS acquisition methods. The FT-ICR part of the hybrid mass spectrometer was either not exploited, used only for survey MS scans, or also used for acquiring selected ion monitoring scans to optimize mass accuracy. MS/MS data were assigned with the SEQUEST algorithm, and peptide identifications were validated by estimating the number of incorrect assignments using the composite target/decoy database search strategy. We developed a simple mass calibration strategy exploiting polydimethylcyclosiloxane background ions as calibrant ions. This strategy allowed us to substantially improve mass accuracy without reducing the number of MS/MS spectra acquired in an LC-MS/MS run. The benefits of high mass accuracy were greatest for assigning MS/MS spectra with low signal-to-noise ratios and for assigning phosphopeptides. Confident peptide identification rates from these data sets could be doubled by the use of mass accuracy information. It was also shown that improving mass accuracy at a cost to the MS/MS acquisition rate substantially lowered the sensitivity of LC-MS/MS analyses. The use of FT-ICR selected ion monitoring scans to maximize mass accuracy reduced the number of protein identifications by 40%.
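
The composite target/decoy estimate mentioned in this abstract can be illustrated with a few lines of code. This is a generic sketch of the commonly used estimator, in which each decoy hit above a score threshold is taken to stand for one equally likely incorrect target hit; the abstract does not give the exact formula or thresholds used, and the function name `estimated_fdr` and the scores are hypothetical.

```python
def estimated_fdr(psms, score_threshold):
    """Estimate the false discovery rate among accepted matches.

    `psms` is a list of (score, is_decoy) pairs from a search against a
    concatenated target + decoy database. Assumption: incorrect matches
    hit target and decoy sequences equally often, so the number of
    incorrect assignments is about twice the number of decoy hits.
    """
    accepted = [is_decoy for score, is_decoy in psms if score >= score_threshold]
    if not accepted:
        return 0.0
    decoy_hits = sum(accepted)
    return 2 * decoy_hits / len(accepted)


# Hypothetical search results: (score, matched a decoy sequence?)
results = [(5.0, False), (4.5, False), (4.0, True), (3.5, False), (1.0, True)]
fdr_loose = estimated_fdr(results, score_threshold=3.0)   # 1 decoy among 4 accepted
fdr_strict = estimated_fdr(results, score_threshold=4.2)  # no decoys accepted
```

Raising the threshold until the estimate drops below a chosen level, for example 1%, is the usual way such an estimate is turned into a filter.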

10.
Two-dimensional gel electrophoresis-separated and excised haptoglobin alpha2-chain protein spots were subjected to in-gel digestion with trypsin. Previously unassigned peptide ion signals observed in mass spectrometric fingerprinting experiments were sequenced using the matrix-assisted laser desorption/ionization-quadrupole ion trap-time of flight (MALDI-QIT-TOF) mass spectrometer and showed that the haptoglobin alpha-chain derivative under study was cleaved by trypsin unspecifically. Abundant cleavages occurred C-terminal to histidine residues at H23, H28, and H87. In addition, mild acidic hydrolysis leading to cleavage after aspartic acid residues at D13 was observed. The uninterpreted tandem mass spectrometry (MS/MS) spectrum of the peptide with ion signal at 2620.19 was submitted to database search and yielded the identification of the corresponding peptide sequence comprising amino acids (aa) aa65-87 from the haptoglobin alpha-chain protein. Also, the presence of a mixture of two tryptic peptides (mass to charge ratio m/z 1708.8; aa40-54, and aa99-113, respectively), that is caused by a tiny sequence variation between the two repeats in the haptoglobin alpha2-chain protein was resolved by MS/MS fragmentation using the MALDI-QIT-TOF mass spectrometer instrument. Advantageous features such as (i) easy parent ion creation, (ii) minimal sample consumption, and (iii) real collision induced dissociation conditions, were combined successfully to determine the amino acid sequences of the previously unassigned peptides. Hence, the novel mass spectrometric sequencing method applied here has proven effective for identification of distinct molecular protein structures.

11.
MOTIVATION: Tandem mass spectrometry combined with sequence database searching is one of the most powerful tools for protein identification. As thousands of spectra are generated by a mass spectrometer in one hour, the speed of database searching is critical, especially when searching against a large sequence database, when the peptide is generated by some unknown or non-specific enzyme, or even when the target peptides have post-translational modifications (PTMs). In practice, about 70-90% of the spectra have no match in the database. Many believe that a significant portion of them are due to peptides of non-specific digestions by unknown enzymes or amino acid modifications. In another case, scientists may choose to use some non-specific enzymes such as pepsin or thermolysin for proteolysis in proteomic studies, in that not all proteins are amenable to digestion by site-specific enzymes, and furthermore many digested peptides may not fall within the range of molecular weight suitable for mass spectrometry analysis. Interpreting mass spectra of these kinds costs database search engines a lot of computational time. OVERVIEW: The present study was designed to speed up the database searching process for both cases. More specifically, we employed an approach combining the suffix tree data structure and the spectrum graph. The suffix tree is used to preprocess the protein sequence database, while the spectrum graph is used to preprocess the tandem mass spectrum. We then search the suffix tree against the spectrum graph for candidate peptides. We design an efficient algorithm to compute a matching threshold with some statistical significance level, e.g. p = 0.01, for each spectrum, and use it to select candidate peptides. Then we rank these peptides using a SEQUEST-like scoring function. The algorithms were implemented and tested on experimental data. For post-translational modifications, we allow an arbitrary number of any modification to a protein. AVAILABILITY: The executable program and other supplementary materials are available online at: http://hto-c.usc.edu:8000/msms/suffix/.

12.
Proteomic identifications hinge on the measurement of both parent and fragment masses and matching these to amino acid sequences via database search engines. The correctness of the identifications is assessed by statistical means. Here we present an experimental approach to test identifications. Chemical modification of all peptides in a sample leads to shifts in masses depending on the chemical properties of each peptide. The identification of a native peptide sequence and its perturbed version with a different parent mass and fragment ion masses provides valuable information. Labeling all peptides using reductive alkylation with formaldehyde is one such perturbation where the ensemble of peptides shifts mass depending on the number of reactive amine groups. Matching covalently perturbed fragmentation patterns from the same underlying peptide sequence increases confidence in the assignments and can salvage low scoring post‐translationally modified peptides. Applying this strategy to bovine alpha‐crystallin, we identify 9 lysine acetylation sites, 4 O‐GlcNAc sites and 13 phosphorylation sites.
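
The dependence of the mass shift on the number of reactive amines can be made concrete. The sketch below assumes dimethylation with unlabeled formaldehyde, which adds C2H4 (about 28.0313 Da) per primary amine, a free peptide N-terminus, and unmodified lysine side chains; the abstract does not state these details, and the function names and example sequence are hypothetical.

```python
# Net monoisotopic mass added per primary amine by reductive dimethylation
# with unlabeled formaldehyde (two methyl groups replace two hydrogens: C2H4).
DIMETHYL_SHIFT = 28.0313


def reactive_amines(sequence):
    """Count primary amines: the N-terminus plus each lysine side chain.

    Assumption: the N-terminus is free and no lysine is already modified
    (an acetylated lysine, for instance, would not react).
    """
    return 1 + sequence.count("K")


def labeled_mass(native_mass, sequence):
    """Expected mass of the fully labeled peptide."""
    return native_mass + DIMETHYL_SHIFT * reactive_amines(sequence)


# Hypothetical peptide with two lysines: three reactive amines,
# so the expected shift is 3 * 28.0313 = 84.0939 Da.
shifted = labeled_mass(1000.0, "AKDEK")
```

A native and a labeled identification of the same sequence that differ by exactly this predicted shift corroborate each other, which is the kind of cross-check the abstract describes.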

13.
MOTIVATION: The identification of T-cell epitopes can be crucial for vaccine development. An epitope is a peptide segment that binds to both a T-cell receptor and a major histocompatibility complex (MHC) molecule. Predicting which peptide segments bind MHC molecules is the first step in epitope prediction. RESULTS: An iterative stepwise discriminant analysis meta-algorithm explores a large molecular database to derive quantitative motifs for peptide binding. The applications presented here demonstrate the algorithm's versatility by producing four closely related models for HLA-DR1. Two models use an expert initial estimate and two do not; two models use amino acid residues as the only predictors and two use amino acid groupings as additional predictors. Each model correctly classifies >90% of the peptides in the database. AVAILABILITY: Software is available commercially; data are free over the Internet.

14.
In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision-induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly-charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model, when analyzing triply-charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.
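
To make the Naive model concrete, the sketch below enumerates b- and y-ion m/z values for every peptide bond and every fragment charge below the precursor charge. It is an illustration of that baseline only, not of Basophile, whose regression equation is not given in the abstract. The function name `naive_fragments` is hypothetical, and treating a singly charged precursor as yielding singly charged fragments is an assumption.

```python
PROTON = 1.007276   # Da
WATER = 18.010565   # Da

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def naive_fragments(sequence, precursor_charge):
    """Return {(ion_type, index, charge): m/z} for all b and y ions.

    Every peptide bond is cleaved, and each fragment is predicted at all
    charges from 1 up to precursor_charge - 1 (at least charge 1).
    """
    max_charge = max(1, precursor_charge - 1)
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    fragments = {}
    for i in range(1, len(sequence)):
        b_neutral = sum(masses[:i])            # N-terminal fragment
        y_neutral = sum(masses[-i:]) + WATER   # C-terminal fragment
        for z in range(1, max_charge + 1):
            fragments[("b", i, z)] = (b_neutral + z * PROTON) / z
            fragments[("y", i, z)] = (y_neutral + z * PROTON) / z
    return fragments


# Hypothetical triply charged 8-residue peptide:
# 7 bonds x 2 ion types x 2 charges = 28 predicted fragments.
predicted = naive_fragments("PEPTIDEK", precursor_charge=3)
```

The count grows linearly with precursor charge, which shows why a model that prunes implausible charge states, as the abstract describes, reduces the number of predicted fragments for highly charged precursors.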

15.
A prototype linear octopole ion trap/ion mobility/tandem mass spectrometer has been coupled with a nanoflow liquid chromatography separation approach and used to separate and characterize a complicated peptide mixture from digestion of soluble proteins extracted from human urine. In this approach, two dimensions of separation (nanoflow liquid chromatography and ion mobility) are followed by collision induced dissociation (CID) and mass spectrometry (MS) analysis. From a preliminary analysis of the most intense CID-MS features in a part of the dataset, it is possible to assign 27 peptide ions which correspond to 13 proteins. The data contain many additional CID-MS features for less intense ions. A limited discussion of these features and their potential utility in identifying complicated peptide mixtures required for proteomics study is presented.

16.
This report examines the analytical benefits of high-field asymmetric waveform ion mobility spectrometry (FAIMS) coupled to liquid chromatography mass spectrometry (LC-MS) for phosphoproteomics analyses. The ability of FAIMS to separate multiply charged peptide ions from chemical interferences confers a unique advantage in phosphoproteomics by enhancing the detection of low abundance phosphopeptides. LC-FAIMS-MS experiments performed on TiO2-enriched tryptic digests from Drosophila melanogaster provided a 50% increase in phosphopeptide identification compared to conventional LC-MS analysis. Also, FAIMS can be used to select different populations of multiply charged phosphopeptide ions prior to their activation with either collision activated dissociation (CAD) or electron transfer dissociation (ETD). Importantly, FAIMS enabled the resolution of coeluting phosphoisomers of different abundances to facilitate their unambiguous identification using conventional database search engines. The benefits of FAIMS in large-scale phosphoproteomics of D. melanogaster are further investigated using label-free quantitation to identify differentially regulated phosphoproteins in response to insulin stimulation.

17.
The assignments of individual magnetic resonances of backbone nuclei of a larger protein, ribonuclease H from Escherichia coli, which consists of 155 amino acid residues and has a molecular mass of 17.6 kDa, are presented. To remove the problem of degenerate chemical shifts, which is inevitable in proteins of this size, three-dimensional NMR was applied. The strategy for the sequential assignment was as follows. First, resonance peaks of amides were classified into 15 amino acid types by 1H-15N HMQC experiments with samples in which specific amino acids were labeled with 15N. Second, the amide 1H-15N peaks were connected along the amino acid sequence by tracing intraresidue and sequential NOE cross peaks. In order to obtain unambiguous NOE connectivities, four types of heteronuclear 3D NMR techniques, 1H-15N-1H 3D NOESY-HMQC, 1H-15N-1H 3D TOCSY-HMQC, 13C-1H-1H 3D HMQC-NOESY, and 13C-1H-1H 3D HMQC-TOCSY, were applied to proteins uniformly labeled either with 15N or with 13C. This method gave a systematic way to assign backbone nuclei (N, NH, CαH, and Cα) of larger proteins. Results of the sequential assignments and identification of secondary structure elements that were revealed by NOE cross peaks among backbone protons are reported.

18.
Analysing proteomic data   (Total citations: 5; self-citations: 0; citations by others: 5)
The rapid growth of proteomics has been made possible by the development of reproducible 2D gels and biological mass spectrometry. However, despite technical improvements, 2D gels are still less than perfectly reproducible and gels have to be aligned so spots for identical proteins appear in the same place. Gels can be warped by a variety of techniques to make them concordant. When gels are manipulated to improve registration, information is lost, so direct methods for gel registration which make use of all available data for spot matching are preferable to indirect ones. In order to identify proteins from gel spots, a property or combination of properties that is unique to that protein is required. These can then be used to search databases for possible matches. Molecular mass, pI, amino acid composition and short sequence tags can all be used in database searches. Currently the method of choice for protein identification is mass spectrometry. Proteins are eluted from the gels and cleaved with specific endoproteases to produce a series of peptides of different molecular mass. In peptide mass fingerprinting, the peptide profile of the unknown protein is compared with theoretical peptide libraries generated from sequences in the different databases. Tandem mass spectrometry (MS/MS) generates short amino acid sequence tags for the individual peptides. These partial sequences combined with the original peptide masses are then used for database searching, greatly improving specificity. Increasingly, protein identification from MS/MS data is being fully or partially automated. When working with organisms that do not have sequenced genomes (the case with most helminths), protein identification by database searching becomes problematic. A number of approaches to cross-species protein identification have been suggested, but if the organism being studied is only distantly related to any organism with a sequenced genome then the likelihood of protein identification remains small. The dynamic nature of the proteome means that there really is no such thing as a single representative proteome and a complete set of metadata (data about the data) is going to be required if the full potential of database mining is to be realised in the future.

19.
FMRFamide-like peptide (FLP) amino acid sequences have been collected and statistically analyzed. FLP amino acid composition as a function of position in the peptide is graphically presented for several major phyla. Results of total amino acid composition and frequencies of pairs of FLP amino acids have been computed and compared with corresponding values from the entire GenBank protein sequence database. The data for pairwise distributions of amino acids should help in future structure-function studies of FLPs. To aid in future peptide discovery, a computer program and search protocol were developed to identify FLPs from the GenBank protein database without the use of keywords.

20.
We demonstrate a new approach to the determination of amino acid composition from tandem mass spectrometrically fragmented peptides using both experimental and simulated data. The approach has been developed to be used as a search-space filter in a protein identification pipeline with the aim of increased performance above that which could be attained by using immonium ion information. Three automated methods have been developed and tested: one based upon a simple peak traversal, in which all intense ion peaks are treated as being either a b- or y-ion using a wide mass tolerance; a second which uses a much narrower tolerance and does not perform transformations of ion peaks to the complementary type; and the unique fragments method which allows for b- or y-ion type to be inferred and corroborated using a scan of the other ions present in each peptide spectrum. The combination of these methods is shown to provide a high-accuracy set of amino acid predictions using both experimental and simulated data sets. These high quality predictions, with an accuracy of over 85%, may be used to identify peptide fragments that are hard to identify using other methods. The data simulation algorithm is also shown a posteriori to be a good model of noiseless tandem mass spectrometric peptide data.
