Similar literature
 20 similar documents found (search time: 15 ms)
1.
Tandem mass spectrometry has emerged as one of the most powerful high-throughput techniques for protein identification. It selects and fragments peptides of interest into N-terminal and C-terminal ions and measures the mass/charge ratios of these ions. The de novo peptide sequencing problem is to derive the peptide sequences from given tandem mass spectral data of k ion peaks without searching against protein databases. By transforming the spectral data into a matrix spectrum graph G = (V, E), where |V| = O(k^2) and |E| = O(k^3), we give the first polynomial time suboptimal algorithm that finds all the suboptimal solutions (peptides) in O(p|E|) time, where p is the number of solutions. The algorithm has been implemented and tested on experimental data. The program is available at http://hto-c.usc.edu:8000/msms/menu/denovo.htm.
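The graph idea can be illustrated with a minimal Python sketch. It is not the matrix spectrum graph of the abstract; it is the simpler, generic spectrum graph, in which two peaks are connected when their mass difference matches one amino acid residue. The function names, the four-residue mass table, the 0.01 Da tolerance, and the assumption that the peaks have already been converted to N-terminal prefix masses are all illustrative choices, not details from the paper.

```python
# Monoisotopic residue masses in Da for a few amino acids (illustrative subset).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841}


def build_spectrum_graph(prefix_masses, tol=0.01):
    """Connect peak i to peak j when their mass difference matches one residue."""
    masses = sorted(prefix_masses)
    edges = {i: [] for i in range(len(masses))}
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            diff = masses[j] - masses[i]
            for residue, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol:
                    edges[i].append((j, residue))
    return masses, edges


def enumerate_peptides(edges, start, end):
    """List every residue string spelled by a path from start to end."""
    if start == end:
        return [""]
    peptides = []
    # Edges only point to higher-mass peaks, so the recursion always terminates.
    for nxt, residue in edges[start]:
        for tail in enumerate_peptides(edges, nxt, end):
            peptides.append(residue + tail)
    return peptides


# Hypothetical prefix masses for the tripeptide G-A-S.
masses, edges = build_spectrum_graph([0.0, 57.02146, 128.05857, 215.09060])
peptides = enumerate_peptides(edges, 0, len(masses) - 1)
```

Enumerating all paths this way can take exponential time on noisy spectra, which is why an output-sensitive bound such as the O(p|E|) quoted in the abstract matters.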

2.
We report an isotope labeling shotgun proteome analysis strategy to validate the spectrum-to-sequence assignments generated by using sequence-database searching for the construction of a more reliable MS/MS spectral library. This strategy is demonstrated in the analysis of the E. coli K12 proteome. In the workflow, E. coli cells were cultured in normal and 15N-enriched media. The differentially labeled proteins from the cell extracts were subjected to trypsin digestion and two-dimensional liquid chromatography quadrupole time-of-flight tandem mass spectrometry (2D-LC QTOF MS/MS) analysis. The MS/MS spectra of the two samples were individually searched using Mascot against the E. coli proteome database to generate lists of peptide sequence matches. The two data sets were compared by overlaying the spectra of unlabeled and labeled matches of the same peptide sequence for validation. Two cutoff filters, one based on the number of common fragment ions and another one on the similarity of intensity patterns among the common ions, were developed and applied to the overlaid spectral pairs to reject the low quality or incorrectly assigned spectra. By examining 257,907 and 245,156 spectra acquired from the unlabeled and 15N-labeled samples, respectively, an experimentally validated MS/MS spectral library of tryptic peptides was constructed for E. coli K12 that consisted of 9,302 unique spectra with unique sequence and charge state, representing 7,763 unique peptide sequences. This E. coli spectral library could be readily expanded, and the overall strategy should be applicable to other organisms. Even with this relatively small library, it was shown that more peptides could be identified with higher confidence using the spectral search method than by sequence-database searching.

3.
In collision-induced dissociation (CID) of peptides, it has been observed that rearrangement processes can take place that appear to permute/scramble the original primary structure, which may in principle adversely affect peptide identification. Here, an analysis of sequence permutation in tandem mass spectra is presented for a previously published proteomics study on P. aeruginosa (Scherl et al., J. Am. Soc. Mass Spectrom. 2008, 19, 891) conducted using an LTQ-Orbitrap. Overall, 4878 precursor ions are matched by considering the accurate mass (i.e., <5 ppm) of the precursor ion and at least one fragment ion that confirms the sequence. The peptides are then grouped into higher- and lower-confidence data sets, using five fragment ions as a cutoff for higher-confidence identification. It is shown that the propensity for sequence permutation increases with the length of the tryptic peptide in both data sets. A higher charge state (i.e., 3+ vs 2+) also appears to correlate with a higher appearance of permuted masses for larger peptides. The ratio of these permuted sequence ions, compared to all tandem mass spectral peaks, reaches ~25% in the higher-confidence data set, compared to an estimated incidence of false positives for permuted masses (maximum ~8%), based on a null-hypothesis decoy data set.
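The <5 ppm accurate-mass criterion mentioned in this abstract is a relative tolerance. A small Python sketch of the usual definition follows; the function names and the example m/z values are hypothetical, and the abstract does not say how the authors implemented the check.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=5.0):
    """True when the absolute mass error is below tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < tol_ppm


# At m/z 1000, a 5 ppm window corresponds to about +/- 0.005.
close_match = within_tolerance(1000.004, 1000.0)
far_match = within_tolerance(1000.006, 1000.0)
```

Because the tolerance is relative, the absolute window widens with m/z: 5 ppm is about 0.0025 at m/z 500 and about 0.01 at m/z 2000.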

4.
MHCPEP, a database of MHC-binding peptides: update 1997.   Total citations: 10 (self-citations: 1; citations by others: 10)
MHCPEP (http://wehih.wehi.edu.au/mhcpep/) is a curated database comprising over 13 000 peptide sequences known to bind MHC molecules. Entries are compiled from published reports as well as from direct submissions of experimental data. Each entry contains the peptide sequence, its MHC specificity and where available, experimental method, observed activity, binding affinity, source protein and anchor positions, as well as publication references. The present format of the database allows text string matching searches but can easily be converted for use in conjunction with sequence analysis packages. The database can be accessed via Internet using WWW or FTP.

5.
Product ion mass spectral data of [M + H]+ ions of oligosaccharides, mainly tetra- and pentasaccharides, as their dipalmitoyl phosphatidylethanolamine derivatives were obtained using both liquid secondary ion mass spectrometry with B/E linked scanning and fast atom bombardment ionization with collision-induced dissociation/tandem mass spectrometry. Both methods give similar positive product ion spectra of equivalent high sensitivity (detection limits of approximately 50 pmol) that principally contain glycosidic cleavage ions retaining the reducing end of the molecule from which monosaccharide sequence can be deduced. A series of ions from fission of the phosphate ester bond together with glycosidic cleavage are present in the tandem mass spectra and B/E linked scan spectra when helium collision gas is used. Monosaccharide linkage position of isomeric molecules is reflected in the intensity of glycosidic fragmentation, without retention of the oxygen atom, with decreasing cleavage in the order 1-3 > 1-4 > 1-6 linkage. Fucose and N-acetylhexosamines show an increased degree of fragmentation over hexose sugars. The application of product ion spectra of derivatized oligosaccharides is demonstrated for characterizing mixed samples and also the acquisition of spectra directly from the silica surface of high-performance thin-layer chromatography plates.

6.
The individual haplotyping problem is the computational problem of reconstructing two haplotypes for an individual from that individual's fragment sequencing data, according to one of several optimality criteria. This paper exploits the fact that the length of a fragment and the number of fragments covering a SNP (single nucleotide polymorphism) site are both very small compared with the length of the sequenced region and the total number of fragments, and introduces the parameterized haplotyping problems. With m fragments whose maximum length is k_1, n SNP sites, and no more than k_2 fragments covering any SNP site, our algorithms solve the gapless MSR (Minimum SNP Removal) and MFR (Minimum Fragment Removal) problems in time O(n k_1 k_2 + m log m + n k_2 + m k_1) and O(m k_2^2 + m k_1 k_2 + m log m + n k_2 + m k_1), respectively. Since the values of k_1 and k_2 are both small (about 10) in practice, our algorithms are more efficient and applicable than the algorithms of V. Bafna et al., which have time complexity O(mn^2) and O(m^2 n + m^3), respectively.

7.
MOTIVATION: This paper is concerned with algorithms for aligning two whole genomes so as to identify regions that possibly contain conserved genes. Motivated by existing heuristic-based software tools, we initiate the study of an optimization problem that attempts to uncover conserved genes with a global concern. Another interesting feature in our formulation is the tolerance of noise, which also complicates the optimization problem. A brute-force approach takes time exponential in the noise level. RESULTS: We show how an insight into the optimization structure can lead to a drastic improvement in the time and space requirements [precisely, to O(k^2 n^2) and O(k^2 n), respectively, where n is the size of the input and k is the noise level]. The reduced space requirement allows us to implement the new algorithm, called MaxMinCluster, on a PC. When tested with different real data sets, MaxMinCluster consistently uncovers a high percentage of conserved genes that have been published by GenBank. Its performance compares favorably with that of MUMmer (perhaps the most popular software tool for uncovering conserved genes on a whole-genome scale). AVAILABILITY: The source code is available from the website http://www.csis.hku.hk/~colly/maxmincluster/; detailed proofs of the propositions can also be found there.

8.
Oxidative metabolites of the anticoagulant, warfarin [4-hydroxy-3-(3-oxo-1-phenylbutyl)-2H-1-benzopyran-2-one], produced by the actions of cytochromes P450 were analyzed by thermospray high-performance liquid chromatography/mass spectrometry. Warfarin, dehydrowarfarin, and the 6-, 7-, 8-, and 4'-hydroxy derivatives of warfarin were found to ionize well by the thermospray process in the presence of ammonium acetate. Thermospray mass spectra of these compounds were generally dominated by the protonated molecule, (M + H)+, and ions formed by the loss of water from the protonated molecule, (M + H - H2O)+. Fragment ions arising from the hydroxycoumarin, benzylhydroxycoumarin, and phenylbutanone portions of the molecules were observed, and the relative intensity of these fragment ions was greatly increased with filament ionization and application of a high repeller potential (100-130 V). Selected-ion monitoring of the (M + H)+ and (M + H - H2O)+ ions provided sensitivities for these compounds in the 2 to 10 ng range. A method employing thermospray HPLC/MS with selected-ion monitoring and internal standard quantitation for the analysis of the oxidative metabolites of warfarin is described.

9.
pProRep is a web application integrating electrophoretic and mass spectral data from proteome analyses into a relational database. The graphical web-interface allows users to upload, analyse and share experimental proteome data. It offers researchers the possibility to query all previously analysed datasets and can visualize selected features, such as the presence of a certain set of ions in a peptide mass spectrum, on the level of the two-dimensional gel. AVAILABILITY: The pProRep package and instructions for its use can be downloaded from http://www.ptools.ua.ac.be/pProRep. The application requires a web server that runs PHP 5 (http://www.php.net) and MySQL. Some (non-essential) extensions need additional freely available libraries: details are described in the installation instructions.

10.
We describe the application of a previously reported peptide retention time prediction model for reversed phase liquid chromatography (RPLC) (Petritis et al. Anal. Chem. 2003, 75, 1039) to improve peptide identification. The model uses peptide sequence information to generate a theoretical (predicted) elution time that can be compared with the observed elution time. Using data from a set of known proteins, the retention time parameter was incorporated into a discriminant function for use with tandem mass spectrometry (MS/MS) data analyzed with the peptide/protein identification program SEQUEST. For singly charged ions, the number of confident identifications increased by 12% when the elution time metric was included compared to when mass spectral data were the sole source of information in the context of a Drosophila melanogaster database. A 3-4% improvement was obtained for doubly and triply charged ions for the same biological system. Application to the larger Rattus norvegicus (rat) and human proteome databases resulted in an 8-9% overall increase in the number of confident identifications when both the discriminant function and elution time are used. The effect of adding "runner-up" hits (peptide matches that are not the highest scoring for a spectrum) from SEQUEST is also explored, and we find that the number of confident identifications is further increased by 1% when these hits are also considered. Finally, application of the discriminant functions derived in this work to approximately 2.2 million spectra from over three hundred LC-MS/MS analyses of peptides from human plasma protein resulted in a 16% increase in confident peptide identifications (9022 vs 7779) using elution time information. Further improvements from the use of elution time information can be expected as both the experimental control of elution time reproducibility and the predictive capability are improved.

11.
The increasing use of multistage tandem mass spectrometry (MS/MS and MS^3) methods for comprehensive phosphoproteome analysis studies, as well as the emerging application of in silico spectral intensity prediction algorithms in enhanced database search analysis strategies, necessitates the development of an improved understanding of the mechanisms and other factors that affect the gas-phase fragmentation reactions of phosphorylated peptide ions. To address this need, we have examined the multistage collision-induced dissociation (CID) behavior of a set of singly and doubly charged phosphoserine- and phosphothreonine-containing peptide ions, as well as their regioselectively or uniformly deuterated derivatives, in a quadrupole ion trap mass spectrometer. Consistent with previous reports, the neutral loss of phosphoric acid (H3PO4) was observed as a dominant reaction pathway upon MS/MS. The magnitude of this loss was found to be highly dependent on the proton mobility of the precursor ion for both phosphoserine- and phosphothreonine-containing peptides. In contrast to that currently accepted in the literature, however, the results obtained in this study unequivocally demonstrate that the loss of H3PO4 does not predominantly occur via a "charge-remote" beta-elimination reaction. The observation of product ions corresponding to the loss of formaldehyde (CH2O, 30 Da, or CD2O, 32 Da) or acetaldehyde (CH3CHO, 44 Da) upon MS^3 dissociation of the [M + nH - H3PO4]n+ product ions from phosphoserine- and phosphothreonine-containing peptide ions, respectively, provides experimental evidence for a "charge-directed" mechanism involving an SN2 neighboring group participation reaction, resulting in the formation of a cyclic product ion. Potentially, these "diagnostic" MS^3 product ions may provide additional information to facilitate the characterization of phosphopeptides containing multiple potential phosphorylation sites.

12.
Identifying conserved gene clusters is an important step toward understanding the evolution of genomes and predicting the functions of genes. A famous model to capture the essential biological features of a conserved gene cluster is called the gene-team model. The problem of finding the gene teams of two general sequences is the focus of this paper. For this problem, He and Goldwasser had an efficient algorithm that requires O(mn) time using O(m + n) working space, where m and n are, respectively, the numbers of genes in the two given sequences. In this paper, a new efficient algorithm is presented. Assume m ≤ n. Let C = Σ_{α∈Σ} o_1(α) o_2(α), where Σ is the set of distinct genes, and o_1(α) and o_2(α) are, respectively, the numbers of copies of α in the two given sequences. Our new algorithm requires O(min{C lg n, mn}) time using O(m + n) working space. As compared with He and Goldwasser's algorithm, our new algorithm is more practical, as C is likely to be much smaller than mn in practice. In addition, our new algorithm is output sensitive. Its running time is O(lg n) times the size of the output. Moreover, our new algorithm can be efficiently extended to find the gene teams of k general sequences in O(k C lg(n_1 n_2 ... n_k)) time, where n_i is the number of genes in the ith input sequence.
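To make the quantity C concrete, the following Python sketch computes it for two toy gene sequences and compares it with mn. It only evaluates the sum of copy-number products defined in the abstract and does not implement the gene-team algorithm. The function name and the example sequences are hypothetical.

```python
from collections import Counter


def occurrence_product_sum(seq1, seq2):
    """Sum over distinct genes of (copies in seq1) * (copies in seq2)."""
    copies1, copies2 = Counter(seq1), Counter(seq2)
    # Genes missing from either sequence contribute zero, so only shared genes matter.
    shared = copies1.keys() & copies2.keys()
    return sum(copies1[gene] * copies2[gene] for gene in shared)


seq1 = ["a", "b", "a", "c"]        # m = 4 genes
seq2 = ["a", "b", "b", "d", "a"]   # n = 5 genes
C = occurrence_product_sum(seq1, seq2)
mn = len(seq1) * len(seq2)
```

Here C = 2·2 + 1·2 = 6 while mn = 20. C equals mn only when both sequences consist of copies of a single shared gene.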

13.
Electrospray ionization mass spectrometry (ESI-MS) was used to measure the binding of Cu2+ ions to synthetic peptides corresponding to sections of the sequence of the mature prion protein (PrP). ESI-MS demonstrates that Cu2+ is unique among divalent metal ions in binding to PrP and defines the location of the major Cu2+ binding site as the octarepeat region in the N-terminal domain, containing multiple copies of the repeat ProHisGlyGlyGlyTrpGlyGln. The stoichiometries of the complexes measured directly by ESI-MS are pH dependent: a peptide containing four octarepeats chelates two Cu2+ ions at pH 6 but four at pH 7.4. At the higher pH, the binding of multiple Cu2+ ions occurs with a high degree of cooperativity for peptides C-terminally extended to incorporate a fifth histidine. Dissociation constants for each Cu2+ ion binding to the octarepeat peptides, reported here for the first time, are mostly in the low micromolar range; for the addition of the third and fourth Cu2+ ions to the extended peptides at pH 7.4, the K_D values are <100 nM. N-terminal acetylation of the peptides caused some reduction in the stoichiometry of binding at both pH values. Cu2+ also binds to a peptide corresponding to the extreme N-terminus of PrP that precedes the octarepeats, arguing that this region of the sequence may also make a contribution to the Cu2+ complexation. Although the structure of the four-octarepeat peptide is not affected by pH changes in the absence of Cu2+, as judged by circular dichroism, Cu2+ binding induces a modest change at pH 6 and a major structural perturbation at pH 7.4. It is possible that PrP functions as a Cu2+ transporter by binding Cu2+ ions from the extracellular medium under physiologic conditions and then releasing some or all of this metal upon exposure to acidic pH in endosomes or secondary lysosomes.

14.
ToxoDB: accessing the Toxoplasma gondii genome   Total citations: 1 (self-citations: 0; citations by others: 1)
ToxoDB (http://ToxoDB.org) provides a genome resource for the protozoan parasite Toxoplasma gondii. Several sequencing projects devoted to T. gondii have been completed or are in progress: an EST project (http://genome.wustl.edu/est/index.php?toxoplasma=1), a BAC clone end-sequencing project (http://www.sanger.ac.uk/Projects/T_gondii/) and an 8X random shotgun genomic sequencing project (http://www.tigr.org/tdb/e2k1/tga1/). ToxoDB was designed to provide a central point of access for all available T. gondii data, and a variety of data mining tools useful for the analysis of unfinished, un-annotated draft sequence during the early phases of the genome project. In later stages, as more and different types of data become available (microarray, proteomic, SNP, QTL, etc.) the database will provide an integrated data analysis platform facilitating user-defined queries across the different data types.

15.
Oxidative inactivation is a common problem for enzymatic reactions that proceed via iron oxo intermediates. In an investigation of the inactivation of a viral prolyl-4-hydroxylase (26 kD), electrospray mass spectrometry (MS) directly shows the degree of oxidation under varying experimental conditions, but indicates the addition of at most three oxygen atoms per molecule. Thus, molecular ion masses (M + nO) of one sample indicate the oxygen atom adducts n = 0, 1, 2, 3, and 4 of 35, 41, 19, 5 ± 3, and <2%, respectively; "top-down" MS/MS of these ions shows oxidation at the sites R28-V31, E95-F107, and K216 of 22%, 28%, and 34%, respectively, but with a possible (approximately 4%) fourth site at V125-D150. However, for the doubly oxidized molecular ions (increasing the precursor oxygen content from 0.94 to 2), MS/MS showed an easily observable approximately 13% oxygen at the V125-D150 site. For the "bottom-up" approach, detection of the approximately 4% oxidation at the V125-D150 site by MS analysis of a proteolysis mixture would have been very difficult. The unmodified peptide containing this site would represent a few percent of the proteolysis mixture; the oxidized peptide not only would be just approximately 4% of this, but the uniqueness of its mass value (approximately 1-2 kD) would be far less than the 11,933 Dalton value used here. Using different molecular ion precursors for top-down MS/MS also provides kinetic data from a single sample, that is, from molecular ions with 0.94 and 2 oxygens. Little oxidation occurs at V125-D150 until K216 is oxidized, suggesting that these are competitively catalyzed by the iron center; among several prolyl-4-hydroxylases, K216, H137, and D139 are conserved residues.
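The precursor oxygen content of 0.94 quoted in this abstract is consistent with the abundance-weighted mean of the reported adduct distribution. A short Python check follows. It assumes the n = 4 adduct (reported only as <2%) contributes 0% and ignores the ±3 uncertainty on the n = 3 value; the function name is hypothetical.

```python
def mean_oxygen_content(adduct_percent):
    """Abundance-weighted mean number of added oxygen atoms per molecule."""
    total = sum(adduct_percent.values())
    return sum(n * percent for n, percent in adduct_percent.items()) / total


# Adduct number n -> relative abundance in percent, as listed in the abstract.
# Assumption: the n = 4 species (<2%) is treated as 0%.
distribution = {0: 35, 1: 41, 2: 19, 3: 5, 4: 0}
mean_n = mean_oxygen_content(distribution)
```

The weighted sum is (0·35 + 1·41 + 2·19 + 3·5)/100 = 94/100, which gives 0.94.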

16.
We present a method for automatically extracting groups of orthologous genes from a large set of genomes by a new clustering algorithm on a weighted multipartite graph. The method assigns a score to an arbitrary subset of genes from multiple genomes to assess the orthologous relationships between genes in the subset. This score is computed using sequence similarities between the member genes and the phylogenetic relationship between the corresponding genomes. An ortholog cluster is found as the subset with the highest score, so ortholog clustering is formulated as a combinatorial optimization problem. The algorithm for finding an ortholog cluster runs in time O(|E| + |V| log |V|), where V and E are the sets of vertices and edges, respectively, in the graph. However, if we discretize the similarity scores into a constant number of bins, the runtime improves to O(|E| + |V|). The proposed method was applied to seven complete eukaryote genomes on which the manually curated database of eukaryotic ortholog clusters, KOG, is constructed. A comparison of our results with the manually curated ortholog clusters shows that our clusters are well correlated with the existing clusters.

17.
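Spectroscopic analyses of protoporphyrin IX disodium salt thin-film samples implanted with 110 keV Fe ions show that low-energy iron ion beam irradiation can cause damage to, and chemical modification of, biomolecules. They also provide preliminary confirmation that the implanted iron ions, after slowing down and being deposited in the sample molecules, form iron-containing metal complexes, that is, mass deposition of the implanted iron ions.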

18.
Given a text of length n, a pattern of length m and an integer k, we present an algorithm for finding all occurrences of the pattern in the text, each with at most k substitutions. The algorithm runs in O(k(m log m + n)) time, and requires O(nk) space. This algorithm has direct implications for nucleotide and amino acid sequence comparisons.
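The problem statement, though not the algorithm of this abstract, can be illustrated by a brute-force Python sketch that checks every alignment and counts substitutions. This baseline takes O(nm) time in the worst case rather than the O(k(m log m + n)) bound quoted above. The function name and the DNA example are hypothetical.

```python
def k_mismatch_occurrences(text, pattern, k):
    """Return start positions where pattern matches text with at most k substitutions."""
    n, m = len(text), len(pattern)
    hits = []
    for start in range(n - m + 1):
        mismatches = 0
        for offset in range(m):
            if text[start + offset] != pattern[offset]:
                mismatches += 1
                if mismatches > k:
                    break  # too many substitutions at this alignment
        else:
            # The inner loop finished without exceeding k mismatches.
            hits.append(start)
    return hits


one_mismatch_hits = k_mismatch_occurrences("ACGTACGA", "ACGT", 1)
exact_hits = k_mismatch_occurrences("ACGTACGA", "ACGT", 0)
```

With k = 1 the pattern is found at positions 0 (exact) and 4 (ACGA, one substitution); with k = 0 only position 0 remains.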

19.
Hydrogen/deuterium exchange reactions of protonated and sodium cationized peptide molecules have been studied in the gas phase with a MALDI/quadrupole ion trap mass spectrometer. Unit-mass selected precursor ions were allowed to react with deuterated ammonia introduced into the trap cell by a pulsed valve. The reactant gas pressure, reaction time, and degree of the internal excitation of reactant ions were varied to explore the kinetics of the gas phase isotope exchange. Protonated peptide molecules exhibited a high degree of reactivity, some showing complete exchange of all labile hydrogen atoms. On the contrary, peptide molecules cationized with sodium exhibited only very limited reactivity, indicating a vast difference between the gas phase structures of the two ions. © 1997 Wiley-Liss Inc.

20.
A set of 10 different recombinant human parvalbumins was used to establish a method for the investigation of the Ca2+-binding properties of proteins by electrospray ionization mass spectrometry (ESI-MS). Human PVWT was found to bind 2 mol Ca2+ ions/mol of protein, whereas its mutants (PVE101V, PVD90A, PVE62V, PVD51A, PVD90A,E101V, PVE62V,E101V, PVD51A,D90A, PVD51A,E62V, PVD51A,E62V,D90A,E101V) containing inactivating substitutions in the Ca2+-binding loops bind 0 or 1 Ca2+ ion per protein molecule, depending on the degree of inactivation. These findings fully agree with previously reported results obtained by flow dialysis experiments. The RP-HPLC desalted metal-free proteins were analyzed in 10 mM ammonium acetate at pH 7.0. The experimental conditions were optimized with the recombinant parvalbumin test system before analyzing the Ca2+-binding properties of rat and murine parvalbumins in muscle tissue extracts. ESI-MS revealed that (i) rat and murine alpha-parvalbumins each bind specifically two Ca2+ ions per protein molecule and (ii) both extracted parvalbumins were found to be posttranslationally modified; each protein is acetylated at the N-terminus. Finally, during our investigations of the murine parvalbumin a sequencing error was detected at the C-terminus where the amino acid at position 109 is Ser and not Thr as mentioned in the SwissProt data base (Accession No. P32848). This work demonstrates the great potential of the ESI-MS technique as a sensitive, specific, and rapid method for direct identification and determination of the stoichiometry of Ca2+-binding proteins and other metalloproteins.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP filing no. 京ICP备09084417号