1.
The promise of mass spectrometry as a tool for probing signal-transduction is predicated on reliable identification of post-translational modifications. Phosphorylations are key mediators of cellular signaling, yet are hard to detect, partly because of unusual fragmentation patterns of phosphopeptides. In addition to being accurate, MS/MS identification software must be robust and efficient to deal with increasingly large spectral data sets. Here, we present a new scoring function for the Inspect software for phosphorylated peptide tandem mass spectra for ion-trap instruments, without the need for manual validation. The scoring function was modeled by learning fragmentation patterns from 7677 validated phosphopeptide spectra. We compare our algorithm against SEQUEST and X!Tandem on testing and training data sets. At a 1% false positive rate, Inspect identified the greatest total number of phosphorylated spectra, 13% more than SEQUEST and 39% more than X!Tandem. Spectra identified by Inspect tended to score better in several spectral quality measures. Furthermore, Inspect runs much faster than either SEQUEST or X!Tandem, making desktop phosphoproteomics feasible. Finally, we used our new models to reanalyze a corpus of 423,000 LTQ spectra acquired for a phosphoproteome analysis of Saccharomyces cerevisiae DNA damage and repair pathways and discovered 43% more phosphopeptides than the previous study.
2.
3.
Statistically meaningful comparison/combination of peptide identification results from various search methods is impeded by the lack of a universal statistical standard. Providing an E-value calibration protocol, we demonstrated earlier the feasibility of translating either the score or heuristic E-value reported by any method into the textbook-defined E-value, which may serve as the universal statistical standard. This protocol, although robust, may lose spectrum-specific statistics and might require a new calibration when changes in experimental setup occur. To mitigate these issues, we developed a new MS/MS search tool, RAId_aPS, that is able to provide spectrum-specific E-values for additive scoring functions. Given a selection of scoring functions out of RAId score, K-score, Hyperscore and XCorr, RAId_aPS generates the corresponding score histograms of all possible peptides using dynamic programming. Using these score histograms to assign E-values enables a calibration-free protocol for accurate significance assignment for each scoring function. RAId_aPS features four different modes: (i) compute the total number of possible peptides for a given molecular mass range, (ii) generate the score histogram given a MS/MS spectrum and a scoring function, (iii) reassign E-values for a list of candidate peptides given a MS/MS spectrum and the scoring functions chosen, and (iv) perform database searches using selected scoring functions. In modes (iii) and (iv), RAId_aPS is also capable of combining results from different scoring functions using spectrum-specific statistics. The web link is http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid_aps/index.html. Relevant binaries for Linux, Windows, and Mac OS X are available from the same page.
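As a rough illustration of mode (i), counting all possible peptides in a mass range, the sketch below uses dynamic programming over integer masses. It is not the RAId_aPS implementation: the nominal (integer) residue masses, the fixed water mass of 18, and the function names `residue_sum_counts` and `count_peptides_in_range` are simplifying assumptions made here for illustration.

```python
# Nominal (integer) residue masses of the 20 standard amino acids.
# Assumption: integer masses are used only to keep the sketch small;
# a real tool would work at much finer mass resolution.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99,
    "T": 101, "C": 103, "L": 113, "I": 113, "N": 114,
    "D": 115, "Q": 128, "K": 128, "E": 129, "M": 131,
    "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}
WATER = 18  # nominal mass of H2O added to the residue sum of a peptide


def residue_sum_counts(max_sum):
    """counts[s] = number of amino acid sequences whose residue masses sum to s."""
    counts = [0] * (max_sum + 1)
    counts[0] = 1  # the empty sequence
    for s in range(1, max_sum + 1):
        # A sequence with sum s ends in some residue r; the rest sums to s - r.
        counts[s] = sum(counts[s - r] for r in RESIDUE_MASS.values() if r <= s)
    return counts


def count_peptides_in_range(lo, hi):
    """Number of non-empty peptide sequences with nominal mass in [lo, hi]."""
    max_sum = hi - WATER
    if max_sum < 1:
        return 0
    counts = residue_sum_counts(max_sum)
    first = max(lo - WATER, 1)  # skip the empty sequence at sum 0
    return sum(counts[first:max_sum + 1])
```

The same recurrence extends from counting to score histograms if each table cell holds a histogram of additive scores instead of a single count, which is the general idea the abstract attributes to dynamic programming.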
4.
5.
6.
The accurate mass values of all immonium, y(1), y(2), a(2), and b(2) ions of tryptic peptides composed of the 20 standard amino acids were calculated. The differences between adjacent masses in this data set are greater than 10 mDa for more than 80% of the values. Using this mass list, the majority of low mass ions in quadrupole-time of flight tandem mass spectra of peptides from tryptic digests and from an elastase digest could be assigned. Besides the a(2)/b(2) ions, which carry residues 1-2 from the N-terminus, a variety of internal dipeptide b ions were regularly observed. In case internal proline was present, corresponding dipeptide b ions carrying proline at the N-terminal position occurred. By assigning the dipeptide b ions on the basis of their accurate mass, bidirectional or unidirectional sequence information was obtained, which is localized to the peptide N-terminus (a(2)/b(2) ions) or not localized (internal b ions). Identification of the y(1) and y(2) ions by their accurate mass provides unidirectional sequence information localized to the peptide C-terminus. It is shown that this patchwork-type sequence information extractable from accurate mass data of low-mass ions is highly efficient for protein identification.
7.
8.
Partial nucleotide sequences of 634 cDNAs randomly isolated from a feline uterine cDNA library (Stratagene) were determined by single pass sequencing. Homology search of the sequences against the non-redundant nucleotide databases revealed that 83% of the cDNAs matched registered feline or non-feline genes. Based on the gene identifications, these genes were predicted to be related to immunological, biochemical and regulatory functions in cats. Interestingly, the remaining 17% of the cDNAs did not show homology to any gene or EST sequence present in the nucleotide and protein databases, suggesting that these cDNAs include novel genes expressed only in the Felidae. This large-scale sequencing of uterine cDNA will provide a useful molecular resource for research not only on health and disease conditions in cats but also in other fields of science where genetic information from cats is of interest.
9.
The relevance of libraries of annotated MS/MS spectra is growing with the amount of proteomic data generated in high-throughput experiments. These reference libraries provide a fast and accurate way to identify newly acquired MS/MS spectra. In the context of multiple hypothesis testing, the number of false-positive identifications expected in the final result list is controlled by calculating the false discovery rate (FDR). In a classical sequence search where experimental MS/MS spectra are compared with the theoretical peptide spectra calculated from a sequence database, the FDR is estimated by searching randomized or decoy sequence databases. Despite ongoing discussion on how exactly the FDR has to be calculated, this method is widely accepted in the proteomic community. Recently, similar approaches to control the FDR of spectrum library searches were discussed. We present in this paper a detailed analysis of the similarity between spectra of distinct peptides to set the basis of our own solution for decoy library creation (DeLiberator). It differs from the previously published results in some key points, mainly in implementing new methods that prevent decoy spectra from being too similar to the original library spectra while keeping important features of real MS/MS spectra. Using different proteomic data sets and library creation methods, we evaluate our approach and compare it with alternative methods.
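The decoy-based FDR estimate mentioned above can be sketched in a few lines. This is only one common estimator (decoy hits divided by target hits above a score threshold); the abstract itself notes that the exact calculation is debated, and the function name `decoy_fdr` and the `(score, is_decoy)` input layout are assumptions made here, not part of DeLiberator.

```python
def decoy_fdr(matches, threshold):
    """Estimate the FDR among target matches scoring at or above `threshold`.

    `matches` is an iterable of (score, is_decoy) pairs, one per spectrum match.
    Assumption: the simple decoys/targets ratio is used as the estimator.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in matches:
        if score < threshold:
            continue
        if is_decoy:
            decoys += 1
        else:
            targets += 1
    if targets == 0:
        return 0.0  # no accepted target matches, so nothing to be wrong about
    return decoys / targets


# Hypothetical search results: three target hits and two decoy hits.
example = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
strict = decoy_fdr(example, 0.8)   # only the two best target hits pass
lenient = decoy_fdr(example, 0.6)  # three targets and one decoy pass
```

Lowering the threshold admits more identifications at the cost of a higher estimated FDR, which is the trade-off a decoy library is meant to quantify.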
10.
11.
Improving LC-MS sensitivity through increases in chromatographic performance: comparisons of UPLC-ES/MS/MS to HPLC-ES/MS/MS (total citations: 1; self-citations: 0; citations by others: 1)
Churchwell MI Twaddle NC Meeker LR Doerge DR 《Journal of chromatography. B, Analytical technologies in the biomedical and life sciences》2005,825(2):134-143
Recent technological advances have made available reverse phase chromatographic media with a 1.7 µm particle size along with a liquid handling system that can operate such columns at much higher pressures. This technology, termed ultra performance liquid chromatography (UPLC), offers significant theoretical advantages in resolution, speed, and sensitivity for analytical determinations, particularly when coupled with mass spectrometers capable of high-speed acquisitions. This paper explores the differences in LC-MS performance by conducting a side-by-side comparison of UPLC for several methods previously optimized for HPLC-based separation and quantification of multiple analytes with maximum throughput. In general, UPLC produced significant improvements in method sensitivity, speed, and resolution. Sensitivity increases with UPLC, which were found to be analyte-dependent, were as large as 10-fold and improvements in method speed were as large as 5-fold under conditions of comparable peak separations. Improvements in chromatographic resolution with UPLC were apparent from generally narrower peak widths and from a separation of diastereomers not possible using HPLC. Overall, the improvements in LC-MS method sensitivity, speed, and resolution provided by UPLC show that further advances can be made in analytical methodology to add significant value to hypothesis-driven research.
12.
Ligating adapters with unique synthetic oligonucleotide sequences (sequence tags) onto individual DNA samples before massively parallel sequencing is a popular and efficient way to obtain sequence data from many individual samples. Tag sequences should be numerous and sufficiently different to ensure sequencing, replication, and oligonucleotide synthesis errors do not cause tags to be unrecoverable or confused. However, many design approaches only protect against substitution errors during sequencing and extant tag sets contain too few tag sequences. We developed an open-source software package to validate sequence tags for conformance to two distance metrics and design sequence tags robust to indel and substitution errors. We use this software package to evaluate several commercial and non-commercial sequence tag sets, design several large sets (maxcount = 7,198) of edit metric sequence tags having different lengths and degrees of error correction, and integrate a subset of these edit metric tags to polymerase chain reaction (PCR) primers and sequencing adapters. We validate a subset of these edit metric tagged PCR primers and sequencing adapters by sequencing on several platforms and subsequent comparison to commercially available alternatives. We find that several commonly used sets of sequence tags or design methodologies used to produce sequence tags do not meet the minimum expectations of their underlying distance metric, and we find that PCR primers and sequencing adapters incorporating edit metric sequence tags designed by our software package perform as well as their commercial counterparts. We suggest that researchers evaluate sequence tags prior to use or evaluate tags that they have been using. The sequence tag sets we design improve on extant sets because they are large, valid across the set, and robust to the suite of substitution, insertion, and deletion errors affecting massively parallel sequencing workflows on all currently used platforms.
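Validating a tag set against an edit metric amounts to checking the smallest pairwise distance in the set. The sketch below is not the authors' software package; it assumes the edit metric is plain Levenshtein distance (substitutions, insertions, and deletions each cost 1), and the names `edit_distance` and `min_pairwise_distance` are introduced here for illustration.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = cur
    return prev[-1]


def min_pairwise_distance(tags):
    """Smallest edit distance between any two tags; needs at least two tags."""
    return min(edit_distance(a, b) for a, b in combinations(tags, 2))


# A shifted tag differs at every position, yet is only two edits away.
shifted = edit_distance("ACGT", "CGTA")
weakest = min_pairwise_distance(["AAAA", "TTTT", "AATT"])
```

The shifted pair shows why a set designed only against substitutions can still be fragile: one deletion plus one insertion turns `ACGT` into `CGTA`. Under the usual coding-theory argument, a set with minimum distance d can correct up to (d - 1) // 2 errors.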
13.
Cerqueira GC DaRocha WD Campos PC Zouain CS Teixeira SM 《Memórias do Instituto Oswaldo Cruz》2005,100(4):385-389
A total of 880 expressed sequence tags (ESTs) originating from clones randomly selected from a Trypanosoma cruzi amastigote cDNA library have been analyzed. Of these, 40% (355 ESTs) have been identified by similarity to sequences in public databases and classified according to functional categorization of their putative products. About 11% of the mRNAs expressed in amastigotes are related to the translational machinery, and a large number of them (9% of the total number of clones in the library) encode ribosomal proteins. A comparative analysis with a previous study, where clones from the same library were selected using sera from patients with Chagas disease, revealed that ribosomal proteins also represent the largest class of antigen coding genes expressed in amastigotes (54% of all immunoselected clones). However, although more than thirty classes of ribosomal proteins were identified by EST analysis, the results of the immunoscreening indicated that only a particular subset of them contains major antigenic determinants recognized by antibodies from Chagas disease patients.
14.
Shi-Hua Chen Shan Li Guo Zeng Lan Wang Ji Qiang Zhao Yan Xiu Zhao Hui Zhang 《DNA sequence》2007,18(1):61-67
Halophytes can grow under high-salinity conditions. As in glycophytes, their salt tolerance is genetically complex. There are many morphological and physiological studies on halophytes, but very little information is available at the molecular level on why they are salt-tolerant. Limonium sinense is a salt-secreting halophyte that excretes salts through multicellular glands. Here, we report the construction and sequence analysis of a cDNA library made from leaf tissue of L. sinense. Among the 1082 expressed sequence tags (ESTs) obtained, 684 unique genes were identified: 429 showed homology to previously identified genes, and 255 matched uncharacterized genes. Compared with other EST databases, some characteristic features were observed, such as an abundance of genes related to the cytoskeleton, intracellular traffic, and membrane transport, which may be specific to halophytes.
15.
16.
Nakagawa T Nakatsuka A Yano K Yasugahira S Nakamura R Sun N Itai A Suzuki T Itamura H 《Plant cell reports》2008,27(5):931-938
Persimmon (Diospyros kaki Thunb.) is an important fruit in Asian countries, where it is eaten as a fresh fruit and is also used for many other purposes. To understand the molecular mechanism of fruit development and ripening in persimmon, we generated a total of 9,952 expressed sequence tags (ESTs) from randomly selected clones of two different cDNA libraries. One cDNA library was derived from fruit of “Saijo” persimmon at an early stage of development, and the other from ripening fruit. These ESTs were clustered into 6,700 non-redundant sequences. Of the 6,700 non-redundant sequences evaluated, the deduced amino acid sequences of 4,356 (65%) showed significant homology to known proteins, and 2,344 (35%) showed no significant similarity to any known proteins in Arabidopsis databases. We report comparison of genes identified in the two cDNA libraries and describe some putative genes involved in proanthocyanidin and carotenoid synthesis. This study provides the first global overview of a set of genes that are expressed during fruit development and ripening in persimmon.
17.
BIRGITT OESER FRANÇOIS BEAUSSART THOMAS HAARMANN NICOLE LORENZ EVA NATHUES YVONNE ROLKE JAN SCHEFFER JANUARY WEINER PAUL TUDZYNSKI 《Molecular Plant Pathology》2009,10(5):665-684
The ascomycete Claviceps purpurea (ergot) is a biotrophic flower pathogen of rye and other grasses. The deleterious toxic effects of infected rye seeds on humans and grazing animals have been known since the Middle Ages. To gain further insight into the molecular basis of this disease, we generated about 10 000 expressed sequence tags (ESTs), about 25% originating from axenic fungal culture and about 75% from tissues collected 6–20 days after infection of rye spikes. The pattern of axenic vs. in planta gene expression was compared. About 200 putative plant genes were identified within the in planta library. A high percentage of these were predicted to function in plant defence against the ergot fungus and other pathogens, for example pathogenesis-related proteins. Potential fungal pathogenicity and virulence genes were found via comparison with the pathogen–host interaction database (PHI-base; http://www.phi-base.org) and with genes known to be highly expressed in the haustoria of the bean rust fungus. Comparative analysis of Claviceps and two other fungal flower pathogens (necrotrophic Fusarium graminearum and biotrophic Ustilago maydis) highlighted similarities and differences in their lifestyles, for example all three fungi have signalling components and cell wall-degrading enzymes in their arsenal. In summary, the analysis of axenic and in planta ESTs yielded a collection of candidate genes to be evaluated for functional roles in this plant–microbe interaction.
18.
Mario Cannataro Giovanni Cuda Marco Gaspari Sergio Greco Giuseppe Tradigo Pierangelo Veltri 《BMC bioinformatics》2007,8(1):255
Background
Isotope-coded affinity tags (ICAT) is a method for quantitative proteomics based on differential isotopic labeling, sample digestion and mass spectrometry (MS). The method allows the identification and relative quantification of proteins present in two samples and consists of the following phases. First, cysteine residues are either labeled using the ICAT Light or ICAT Heavy reagent (having identical chemical properties but different masses). Then, after whole sample digestion, the labeled peptides are captured selectively using the biotin tag contained in both ICAT reagents. Finally, the simplified peptide mixture is analyzed by nanoscale liquid chromatography-tandem mass spectrometry (LC-MS/MS). Nevertheless, the ICAT LC-MS/MS method still suffers from insufficient sample-to-sample reproducibility on peptide identification. In particular, the number and the type of peptides identified in different experiments can vary considerably and, thus, the statistical (comparative) analysis of sample sets is very challenging. Low information overlap at the peptide and, consequently, at the protein level, is very detrimental in situations where the number of samples to be analyzed is high.
19.
Breitwieser FP Müller A Dayon L Köcher T Hainard A Pichler P Schmidt-Erfurth U Superti-Furga G Sanchez JC Mechtler K Bennett KL Colinge J 《Journal of proteome research》2011,10(6):2758-2766
Quantitative comparison of the protein content of biological samples is a fundamental tool of research. The TMT and iTRAQ isobaric labeling technologies allow the comparison of 2, 4, 6, or 8 samples in one mass spectrometric analysis. Sound statistical models that scale with the most advanced mass spectrometry (MS) instruments are essential for their efficient use. Through the application of robust statistical methods, we developed models that capture variability from individual spectra to biological samples. Classical experimental designs with a distinct sample in each channel as well as the use of replicates in multiple channels are integrated into a single statistical framework. We have prepared complex test samples including controlled ratios ranging from 100:1 to 1:100 to characterize the performance of our method. We demonstrate its application to actual biological data sets originating from three different laboratories and MS platforms. Finally, test data and an R package, named isobar, which can read Mascot, Phenyx, and mzIdentML files, are made available. The isobar package can also be used as standalone software that requires very little or no R programming skill.
20.
We developed and evaluated simple sequence repeat (SSR) markers derived from expressed sequence tags (ESTs) of Liriodendron tulipifera. Characteristics of 15 EST-SSR loci were investigated using 33 L. tulipifera individuals. The number of alleles per locus ranged from two to five. The expected and observed heterozygosities ranged from 0.216 to 0.751 and from 0.182 to 0.97, respectively. These loci were further tested for their cross-species transferability to Liriodendron chinense. Because of their high level of polymorphism and transferability, our 15 single-locus EST-SSR markers will be valuable tools for research on mating system, population genetics and systemic evolution of Liriodendron.
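For readers unfamiliar with the two heterozygosity statistics reported above, the sketch below computes both for a single locus. The abstract does not say which estimator was used, so this assumes the basic definitions: observed heterozygosity is the fraction of heterozygous individuals, and expected heterozygosity is 1 minus the sum of squared allele frequencies, with no small-sample correction. The function name `heterozygosity` and the genotype data are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (expected, observed) heterozygosity for one diploid locus.

    `genotypes` is a non-empty list of (allele1, allele2) pairs, one per individual.
    """
    n = len(genotypes)
    # Observed: share of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Expected: 1 - sum of squared allele frequencies across 2n allele copies.
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    expected = 1 - sum((count / total) ** 2 for count in allele_counts.values())
    return expected, observed


# Hypothetical locus with two alleles scored in four individuals.
sample = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "B")]
expected_h, observed_h = heterozygosity(sample)
```

In this made-up sample both alleles have frequency 0.5 and half the individuals are heterozygous, so the two values coincide; a large gap between them at a real locus is what prompts questions about mating system or null alleles.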