Similar literature
Found 20 similar articles (search time: 31 ms)
1.
Protein identification has been greatly facilitated by database searches against protein sequences derived from product ion spectra of peptides. This approach is primarily based on the use of fragment ion mass information contained in an MS/MS spectrum. Unambiguous protein identification from a spectrum with low sequence coverage or poor spectral quality can be a major challenge. We present a two-dimensional (2D) mass spectrometric method in which the numbers of nitrogen atoms in the molecular ion and the fragment ions are used to provide additional discriminating power for much improved protein identification and de novo peptide sequencing. The nitrogen number is determined by analyzing the mass difference of corresponding peak pairs in overlaid spectra of ¹⁵N-labeled and unlabeled peptides. These peptides are produced by enzymatic or chemical cleavage of proteins from cells grown in ¹⁵N-enriched and normal media, respectively. It is demonstrated that, using 2D information, i.e., m/z and its associated nitrogen number, this method can not only confirm protein identification results generated by MS/MS database searching, but also identify peptides that cannot be identified by database searching alone. Examples are presented of analyzing Escherichia coli K12 extracts that yielded relatively poor MS/MS spectra, presumably from the digests of low abundance proteins, which can still give positive protein identification using this method. Additionally, this 2D MS method can facilitate spectral interpretation for de novo peptide sequencing and identification of posttranslational or other chemical modifications. We envision that this method should be particularly useful for proteome expression profiling of organelles or cells that can be grown in ¹⁵N-enriched media.
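
The nitrogen-counting step described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' software: it assumes complete ¹⁵N labeling and a known charge state, and the function name `nitrogen_count` and the example m/z values are hypothetical.

```python
# Mass gained per nitrogen atom when 14N is replaced by 15N, in Da
# (difference of the two monoisotopic masses, about 0.99703).
N15_SHIFT = 15.0001089 - 14.0030740


def nitrogen_count(mz_unlabeled: float, mz_labeled: float, charge: int = 1) -> int:
    """Estimate the number of nitrogen atoms from a 14N/15N peak pair.

    Assumes complete 15N incorporation, so that the mass shift is an
    integer multiple of N15_SHIFT.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Convert the m/z difference back to a mass difference.
    mass_shift = (mz_labeled - mz_unlabeled) * charge
    return round(mass_shift / N15_SHIFT)


# Hypothetical peak pairs for an ion containing 12 nitrogen atoms.
singly_charged = nitrogen_count(1000.50, 1012.4644)
doubly_charged = nitrogen_count(500.75, 506.73218, charge=2)
```

Each candidate peptide from a database search could then be checked against both the measured m/z and this nitrogen count, which is the second "dimension" the abstract refers to.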

2.
3.
Lack of genomic sequence data and the relatively high cost of tandem mass spectrometry have hampered proteomic investigations into helminths, such as resolving the mechanism underpinning globally reported anthelmintic resistance. Whilst detailed mechanisms of resistance remain unknown for the majority of drug-parasite interactions, gene mutations and changes in gene and protein expression are proposed key aspects of resistance. Comparative proteomic analysis of drug-resistant and -susceptible nematodes may reveal protein profiles reflecting drug-related phenotypes. Using the gastro-intestinal nematode Haemonchus contortus as a case study, we report the application of freely available expressed sequence tag (EST) datasets to support proteomic studies in unsequenced nematodes. EST datasets were translated to theoretical protein sequences to generate a searchable database. In conjunction with matrix-assisted laser desorption ionisation time-of-flight mass spectrometry (MALDI-TOF-MS), Peptide Mass Fingerprint (PMF) searching of databases enabled a cost-effective protein identification strategy. The effectiveness of this approach was verified in comparison with MS/MS de novo sequencing with searching of the same EST protein database and subsequent searches of the NCBInr protein database using the Basic Local Alignment Search Tool (BLAST) to provide protein annotation. Of 100 proteins from 2-DE gel spots, 62 were identified by MALDI-TOF-MS and PMF searching of the EST database. Twenty randomly selected spots were analysed by electrospray MS/MS and MASCOT Ion Searches of the same database. The resulting sequences were subjected to BLAST searches of the NCBI protein database to provide annotation of the proteins and confirm concordance in protein identity from both approaches. Further confirmation of protein identifications from the MS/MS data was obtained by de novo sequencing of peptides, followed by FASTS algorithm searches of the EST putative protein database.
This study demonstrates the cost-effective use of available EST databases and inexpensive, accessible MALDI-TOF MS in conjunction with PMF for reliable protein identification in unsequenced organisms.
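
As an illustration of how EST reads can be turned into a searchable set of theoretical protein sequences, the sketch below performs a six-frame translation with the standard genetic code. The abstract does not say which translation procedure was used, so six-frame translation is an assumption here, and the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(est: str) -> dict[str, str]:
    """Return the three forward and three reverse-complement translations."""
    forward = est.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


# Hypothetical 12-base read, used only to show the call.
frames = six_frame_translation("ATGGCCAAGTAA")
```

In practice the translated frames would be split at stop codons and short fragments discarded before the sequences are written to a database file for PMF or MS/MS searching.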

4.
Homology-driven proteomics promises to reveal functional biology in insects with sparse genome sequence information. A proteomics study comparing plant virus transmission competent and refractive genotypes of the aphid Schizaphis graminum isolated numerous candidate proteins involved in virus transmission, but limited genome sequence information hampered their identification. The complete genome of the pea aphid, Acyrthosiphon pisum, released in 2008, enabled us to double the number of protein identifications beyond what was possible using available EST libraries and other insect sequences. This was concomitant with a dramatic increase in the number of MS and MS/MS peptide spectra matching the genome-derived protein sequence. LC-MS/MS proved to be the most robust method of peptide detection. Cross-matching spectral data to multiple EST sequences and error tolerant searching to identify amino acid substitutions enhanced the percent coverage of the Schizaphis graminum proteins. 2-D electrophoresis provided the protein pI and MW, which enabled the refinement of the candidate protein selection and provided a measure of protein abundance when coupled to the spectral data. Thus, the homology-based proteomics pipeline for insects should include efforts to maximize the number of peptide matches to the protein to increase certainty in protein identification and relative protein abundance.

5.
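Proteomic analysis of species with unsequenced genomes and limited protein sequence databases is currently one of the bottlenecks in proteomics research on non-model organisms. MS BLAST, a BLAST method based on homology searching, is a recently developed search tool for identifying proteins from organisms with unsequenced genomes and has been applied successfully to protein identification in many such species. The SPITC chemically assisted method is an improved de novo mass spectrometric sequencing method established in our laboratory. MS BLAST was used to identify 19 goldfish embryo proteins that could not be identified by database searching with Mascot; 12 of these proteins were identified by MS BLAST searching after direct sequencing, and the other 7 were identified by combining MS BLAST with SPITC derivatization. The results show that cross-species protein identification with MS BLAST is feasible and reliable, and that it offers a new route to cross-species protein identification.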

6.
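Protein identification based on biological mass spectrometry has become one of the supporting technologies of proteomics research. The data produced are processed mainly by database searching, a major drawback of which is that proteins not contained in the database cannot be identified. Making full use of mass spectrometry data is therefore of great importance to proteome research, and the identification of novel proteins is an important part of it. Novel protein identification is one aspect of protein identification, and novel proteins are defined at three levels according to how much is known about their sequence and function. Building on the methods of protein identification, current methods for identifying novel proteins fall into two broad classes: methods that combine de novo sequencing with similar-sequence searching, and methods that search nucleic acid databases such as EST and genome databases. Each class has advantages and disadvantages, its own problems, and corresponding strategies for dealing with them. Researchers can apply and develop different identification methods according to their specific aims, and novel protein identification will become more refined as proteomics research develops.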

7.
Although peptide mass fingerprinting is currently the method of choice to identify proteins, the number of proteins available in databases is increasing constantly, and hence having sequence data on a selected peptide, which increases the effectiveness of database searching, is becoming more crucial. Until recently, the ability to identify proteins based on the peptide sequence was essentially limited to the use of electrospray ionization tandem mass spectrometry (MS) methods. The recent development of new instruments with matrix-assisted laser desorption/ionization (MALDI) sources and true tandem mass spectrometry (MS/MS) capabilities creates the capacity to obtain high quality tandem mass spectra of peptides. In this work, using the new high resolution tandem time of flight MALDI-(TOF/TOF) mass spectrometer from Applied Biosystems, examples of successful identification and characterization of bovine heart proteins (SWISS-PROT entries: P02192, Q9XSC6, P13620) separated by two-dimensional electrophoresis and blotted onto polyvinylidene difluoride membrane are described. Tryptic protein digests were analyzed by MALDI-TOF to identify peptide masses subsequently used for MS/MS. High energy MALDI-TOF/TOF collision-induced dissociation spectra were then recorded on selected ions. All data, both MS and MS/MS, were recorded on the same instrument. Tandem mass spectra were submitted to database searching using MS-Tag or were manually de novo sequenced. An interesting modification of a tryptophan residue, a "double oxidation", came to light during these analyses.

8.
While tandem mass spectrometry (MS/MS) is routinely used to identify proteins from complex mixtures, certain types of proteins present unique challenges for MS/MS analyses. The major wheat gluten proteins, gliadins and glutenins, are particularly difficult to distinguish by MS/MS. Each of these groups contains many individual proteins with similar sequences that include repetitive motifs rich in proline and glutamine. These proteins have few cleavable tryptic sites, often resulting in only one or two tryptic peptides that may not provide sufficient information for identification. Additionally, there are fewer than 14,000 complete protein sequences from wheat in the current NCBInr release. In this paper, MS/MS methods were optimized for the identification of the wheat gluten proteins. Chymotrypsin and thermolysin as well as trypsin were used to digest the proteins, and the collision energy was adjusted to improve fragmentation of chymotryptic and thermolytic peptides. Specialized databases were constructed that included protein sequences derived from contigs from several assemblies of wheat expressed sequence tags (ESTs), including contigs assembled from ESTs of the cultivar under study. Two different search algorithms were used to interrogate the database, and the results were analyzed and displayed using a commercially available software package (Scaffold). We examined the effect of protein database content and size on the false discovery rate. We found that as database size increased above 30,000 sequences there was a decrease in the number of proteins identified. Also, the type of decoy database influenced the number of proteins identified. Using three enzymes, two search algorithms and a specialized database allowed us to greatly increase the number of detected peptides and distinguish proteins within each gluten protein group.

9.
10.
Tandem mass spectrometry (MS/MS) combined with database searching is currently the most widely used method for high-throughput peptide and protein identification. Many different algorithms, scoring criteria, and statistical models have been used to identify peptides and proteins in complex biological samples, and many studies, including our own, describe the accuracy of these identifications using at best generic terms such as "high confidence." False positive identification rates for these criteria can vary substantially with changing organisms under study, growth conditions, sequence databases, experimental protocols, and instrumentation; therefore, study-specific methods are needed to estimate the accuracy (false positive rates) of these peptide and protein identifications. We present and evaluate methods for estimating false positive identification rates based on searches of randomized databases (reversed and reshuffled). We examine the use of separate searches of a forward then a randomized database and combined searches of a randomized database appended to a forward sequence database. Estimated error rates from randomized database searches are first compared against actual error rates from MS/MS runs of known protein standards. These methods are then applied to biological samples of the model microorganism Shewanella oneidensis strain MR-1. Based on the results obtained in this study, we recommend the use of combined searches of a reshuffled database appended to a forward sequence database as a means of providing quantitative estimates of false positive identification rates of peptides and proteins. This will allow researchers to set criteria and thresholds to achieve a desired error rate and provide the scientific community with direct and quantifiable measures of peptide and protein identification accuracy as opposed to vague assessments such as "high confidence."
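
The combined forward-plus-randomized search described in this abstract can be sketched as follows. The abstract does not give its estimator, so the formula used here (twice the number of decoy hits divided by all accepted hits) is an assumption based on a commonly used convention for concatenated searches; the `DECOY_` prefix, the function names, and the example sequences and scores are hypothetical.

```python
import random


def make_decoys(proteins: dict[str, str], mode: str = "reshuffle",
                seed: int = 0) -> dict[str, str]:
    """Build one decoy per protein by reversing or reshuffling its sequence."""
    rng = random.Random(seed)
    decoys = {}
    for name, sequence in proteins.items():
        if mode == "reverse":
            decoy = sequence[::-1]
        elif mode == "reshuffle":
            residues = list(sequence)
            rng.shuffle(residues)
            decoy = "".join(residues)
        else:
            raise ValueError("mode must be 'reverse' or 'reshuffle'")
        decoys["DECOY_" + name] = decoy
    return decoys


def combined_database(proteins: dict[str, str], mode: str = "reshuffle",
                      seed: int = 0) -> dict[str, str]:
    """Append the randomized database to the forward database."""
    return {**proteins, **make_decoys(proteins, mode, seed)}


def estimate_false_positive_rate(hits: list[tuple[str, float]],
                                 threshold: float) -> float:
    """Estimate the false positive rate among hits scoring >= threshold.

    Assumes incorrect matches fall on forward and decoy entries equally
    often, so every accepted decoy hit implies one hidden forward error.
    """
    accepted = [name for name, score in hits if score >= threshold]
    if not accepted:
        return 0.0
    decoy_hits = sum(name.startswith("DECOY_") for name in accepted)
    return min(1.0, 2 * decoy_hits / len(accepted))


# Hypothetical two-protein database and search results.
proteins = {"P1": "MKTAYIAK", "P2": "GAVLIMFW"}
database = combined_database(proteins, mode="reverse")
hits = [("P1", 50.0)] * 19 + [("DECOY_P2", 42.0)]
rate = estimate_false_positive_rate(hits, threshold=40.0)
```

Raising or lowering `threshold` until the estimated rate reaches a chosen target is one way to set the study-specific score criteria the abstract argues for.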

11.
The wool proteome has been largely uncharted due to a lack of database coverage, poor protein extractability and dynamic range issues. Yet, investigating correlations between wool physical properties and protein content, or characterising UV-, heat- or processing-induced protein damage requires the availability of an identifiable and identified proteome. In this study we have achieved unprecedented wool proteome identification through a strategy of comprehensive data acquisition, iterative protein identification/validation and concurrent augmentation of the sequence database. Data acquisition comprised a range of different hyphenated MS techniques including LC–MS/MS, LC–MALDI, 2D-LC–MS/MS and SDS-PAGE LC–MS. Using iterative searching of databases and search result combination using ProteinScape, a systematic expansion of identifiable proteins in the sequence database was achieved. This was followed by extensive validation and rationalisation of the protein identifications. In total, 72 complete and 30 partial ovine-specific protein sequences were added to the database, and 113 wool proteins were identified. Enhanced access to ovine-specific protein identification and characterisation will facilitate all wool fibre protein chemistry and proteomics research.

12.
Abbaraju NV, Cai Y, Rees BB. Proteomics, 2011, 11(21): 4257-4261
Reliable proteomic analysis of biological tissues requires sampling approaches that preserve proteins as close to their in vivo state as possible. In the current study, the patterns of protein abundance in one-dimensional (1-D) gels were assessed for five tissues of the gulf killifish, Fundulus grandis, following snap-freezing of tissues in liquid nitrogen or immersion of fresh tissues in RNAlater®. In liver and heart, the protein profiles in 1-D gels were better preserved by snap-freezing, while in gill, the 1-D protein profile was better preserved by immersion in RNAlater®. In skeletal muscle and brain, the two approaches yielded similar patterns of protein abundance. LC-MS/MS analyses and database searching resulted in the identification of 17 proteins in liver and 12 proteins in gill. Identified proteins include enzymes of energy metabolism, structural proteins, and proteins serving other biological functions. These protein identifications for a species without a sequenced genome demonstrate the utility of F. grandis as a model organism for environmental proteomic studies in vertebrates.

13.
Kim SI, Kim JY, Kim EA, Kwon KH, Kim KW, Cho K, Lee JH, Nam MH, Yang DC, Yoo JS, Park YM. Proteomics, 2003, 3(12): 2379-2392
As an initial step to the comprehensive proteomic analysis of Panax ginseng C. A. Meyer, protein mixtures extracted from the cultured hairy root of Panax ginseng were separated by two-dimensional polyacrylamide gel electrophoresis (2-DE). The protein spots were analyzed and identified by peptide fingerprinting and internal amino acid sequencing by matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) and electrospray ionization quadrupole-time of flight mass spectrometry (ESI Q-TOF MS), respectively. More than 300 protein spots were detected on silver stained two-dimensional (2-D) gels using pH 3-10, 4-7, and 4.5-5.5 gradients. Major protein spots (159) were analyzed by peptide fingerprinting or de novo sequencing and the functions of 91 of these proteins were identified. Protein identification was achieved using the expressed sequence tag (EST) database from Panax ginseng and the protein databases of plants such as Arabidopsis thaliana and Oryza sativa. However, peptide mass fingerprinting by MALDI-TOF MS alone was insufficient for protein identification because of the lack of a genome database for Panax ginseng. Only 17 of the 159 protein spots were verified by peptide mass fingerprinting using MALDI-TOF MS, whereas 87 out of 102 protein spots, which included 13 of the 17 proteins identified by MALDI-TOF MS, were identified by internal amino acid sequencing using tandem mass spectrometry analysis by ESI Q-TOF MS. When the internal amino acid sequences were used as identification markers, the identification rate exceeded 85.3%, suggesting that a combination of internal sequencing and EST data analysis was an efficient identification method for proteome analysis of plants having incomplete genome data like ginseng. The 2-D patterns of the main root and leaves of Panax ginseng differed from that of the cultured hairy root, suggesting that some proteins are exclusively expressed by different tissues for specific cellular functions.
Proteome analysis will undoubtedly be helpful for understanding the physiology of Panax ginseng.

14.
Analysing proteomic data   Total citations: 5, self-citations: 0, citations by others: 5
The rapid growth of proteomics has been made possible by the development of reproducible 2D gels and biological mass spectrometry. However, despite technical improvements, 2D gels are still less than perfectly reproducible and gels have to be aligned so that spots for identical proteins appear in the same place. Gels can be warped by a variety of techniques to make them concordant. When gels are manipulated to improve registration, information is lost, so direct methods for gel registration which make use of all available data for spot matching are preferable to indirect ones. In order to identify proteins from gel spots, a property or combination of properties that is unique to that protein is required. These can then be used to search databases for possible matches. Molecular mass, pI, amino acid composition and short sequence tags can all be used in database searches. Currently the method of choice for protein identification is mass spectrometry. Proteins are eluted from the gels and cleaved with specific endoproteases to produce a series of peptides of different molecular mass. In peptide mass fingerprinting, the peptide profile of the unknown protein is compared with theoretical peptide libraries generated from sequences in the different databases. Tandem mass spectrometry (MS/MS) generates short amino acid sequence tags for the individual peptides. These partial sequences combined with the original peptide masses are then used for database searching, greatly improving specificity. Increasingly, protein identification from MS/MS data is being fully or partially automated. When working with organisms that do not have sequenced genomes (the case with most helminths), protein identification by database searching becomes problematic.
A number of approaches to cross-species protein identification have been suggested, but if the organism being studied is only distantly related to any organism with a sequenced genome then the likelihood of protein identification remains small. The dynamic nature of the proteome means that there really is no such thing as a single representative proteome, and a complete set of metadata (data about the data) is going to be required if the full potential of database mining is to be realised in the future.
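
Peptide mass fingerprinting as outlined in this abstract can be illustrated with a short sketch: digest a candidate sequence in silico, compute theoretical peptide masses, and count how many observed masses they explain. The trypsin rule (cleave after K or R, but not before P), the 0.2 Da tolerance, the function names, and the example sequences are assumptions made for illustration, and real search engines use more elaborate scoring.

```python
import re

# Monoisotopic residue masses in Da for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(protein: str) -> list[str]:
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def count_matches(observed: list[float], protein: str,
                  tolerance: float = 0.2) -> int:
    """Count observed masses explained by the protein's tryptic peptides."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(protein)]
    return sum(
        any(abs(mass - t) <= tolerance for t in theoretical) for mass in observed
    )


# Hypothetical database and peak list; the peaks come from candidate "A".
database = {"A": "MKRPAGK", "B": "GGGGK"}
observed = [peptide_mass("MK"), peptide_mass("RPAGK")]
ranking = sorted(database, key=lambda name: count_matches(observed, database[name]),
                 reverse=True)
```

For an organism without a sequenced genome, the same comparison is run against sequences from related species or translated ESTs, which is why the match count drops as evolutionary distance grows.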

15.
MS/MS and database searching has emerged as a valuable technology for rapidly analyzing protein expression, localization, and post-translational modifications. The probability-based search engine Mascot has found widespread use as a tool to correlate tandem mass spectra with peptides in a sequence database. Although the Mascot scoring algorithm provides a probability-based model for peptide identification, the independent peptide scores do not correlate with the significance of the proteins to which they match. Herein, we describe a heuristic method for organizing proteins identified at a specified false-discovery rate using Mascot-matched peptides. We call this method PROVALT; it uses peptide matches from a random database to calculate false-discovery rates for protein identifications and reduces a complex list of peptide matches to a nonredundant list of homologous protein groups. This method was evaluated using Mascot-identified peptides from a Trypanosoma cruzi epimastigote whole-cell lysate, which was separated by multidimensional LC and analyzed by MS/MS. PROVALT was then compared with the two traditional methods of protein identification when using Mascot, the single peptide score and cumulative protein score methods, and was shown to be superior to both with regard to the number of proteins identified and the inclusion of lower scoring nonrandom peptide matches.

16.
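Identification of peptides and proteins by capillary zone electrophoresis coupled with tandem mass spectrometry   Total citations: 11, self-citations: 3, citations by others: 8
A highly sensitive method for identifying peptides and proteins by capillary zone electrophoresis coupled with tandem mass spectrometry (CZE/MS/MS) was established. A mixture of Met-enkephalin and Leu-enkephalin was analyzed, and the sequence of each was verified by CZE/MS/MS. The tryptic digest of cytochrome c was likewise subjected to peptide mass spectrometric analysis by CZE/MS/MS; the sequences of almost all peptide segments and their positions in the molecule were determined, and accurate identification was obtained by searching a protein sequence database with the SEQUEST software. The amount of sample consumed was in every case in the low pico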

17.
18.
19.
Human saliva contains a large number of proteins and peptides (the salivary proteome) that help maintain homeostasis in the oral cavity. Global analysis of the human salivary proteome is important for understanding oral health and disease pathogenesis. In this study, large-scale identification of salivary proteins was demonstrated by using shotgun proteomics and two-dimensional gel electrophoresis-mass spectrometry (2-DE-MS). For the shotgun approach, whole saliva proteins were prefractionated according to molecular weight. The smallest fraction, presumably containing salivary peptides, was directly separated by capillary liquid chromatography (LC). The large protein fractions, however, were digested into peptides for subsequent LC separation. Separated peptides were analyzed by on-line electrospray tandem mass spectrometry (MS/MS) using a quadrupole-time of flight mass spectrometer, and the obtained spectra were automatically processed to search a human protein sequence database for protein identification. Additionally, 2-DE was used to map out the proteins in whole saliva. A total of 105 protein spots were excised and digested in-gel, and the resulting peptide fragments were measured by matrix-assisted laser desorption/ionization-mass spectrometry and sequenced by LC-MS/MS for protein identification. In total, we cataloged 309 proteins from human whole saliva by using these two proteomic approaches.

20.
Mass spectrometry (MS) was used in this study for the large-scale proteomic identification and verification of protein-encoding genes present in the silkworm (Bombyx mori) genome. Peptide sequences identified by MS were compared with those from an open reading frame (ORF) library of the B. mori genome and a cDNA library, to validate the coding attributes of ORFs. Two databases were created. The first was based on a 9× draft sequence of the silkworm genome and contained 14,632 putative proteins. The second was based on a B. mori pupal cDNA library containing 3,187 putative proteins of at least 30 amino acid residues in length. A total of 81,000 peptide sequences with a threshold score of 60% were generated by the MS/MS analysis, and 55,400 of these were chosen for a sequence alignment. By searching these two databases, 6,649 and 250 proteins were matched, which accounted for approximately 45.4% and 7.8% of the putative proteins in the respective databases. Further analyses carried out with several bioinformatic tools suggested that the matches included proteins with predicted transmembrane domains (1,393) and preproteins with a signal peptide (976). These results provide a fundamental understanding of the expression and function of silkworm proteins.
