Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Informatics for protein identification by mass spectrometry   Cited by: 3 (self-citations: 0, citations by others: 3)
High throughput protein analysis (i.e., proteomics) first became possible when sensitive peptide mass mapping techniques were developed, thereby allowing for the possibility of identifying and cataloging most 2D gel electrophoresis spots. Shortly thereafter a few groups pioneered the idea of identifying proteins by using peptide tandem mass spectra to search protein sequence databases. Hence, it became possible to identify proteins from very complex mixtures. One drawback to these latter techniques is that it is not entirely straightforward to make matches using tandem mass spectra of peptides that are modified or have sequences that differ slightly from what is present in the sequence database that is being searched. This has been part of the motivation behind automated de novo sequencing programs that attempt to derive a peptide sequence regardless of its presence in a sequence database. The sequence candidates thus generated are then subjected to homology-based database search programs (e.g., BLAST or FASTA). These homology search programs, however, were not developed with mass spectrometry in mind, and it became necessary to make minor modifications such that mass spectrometric ambiguities can be taken into account when comparing query and database sequences. Finally, this review will discuss the important issue of validating protein identifications. All of the search programs will produce a top ranked answer; however, only the credulous are willing to accept them carte blanche.
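The idea of taking mass spectrometric ambiguities into account when comparing a de novo query with a database sequence can be illustrated with a small sketch. It is not the modification made to BLAST or FASTA; it only assumes two commonly cited ambiguities (isoleucine and leucine have the same mass, and lysine and glutamine differ by roughly 0.036 Da) and treats each pair as equivalent in an ungapped comparison. The names `AMBIGUOUS`, `canonical` and `ms_identity` are hypothetical.

```python
# Assumed ambiguity classes for illustration: I/L are isobaric, and K/Q differ
# by roughly 0.036 Da, which low-resolution instruments may not resolve.
AMBIGUOUS = {"I": "L", "L": "L", "K": "Q", "Q": "Q"}


def canonical(seq: str) -> str:
    """Map every residue to a representative of its ambiguity class."""
    return "".join(AMBIGUOUS.get(res, res) for res in seq.upper())


def ms_identity(query: str, target: str) -> float:
    """Fraction of positions that agree once ambiguous residues are merged.

    Ungapped comparison of equal-length sequences only; a real homology
    search would use alignment and a substitution matrix instead.
    """
    if len(query) != len(target):
        raise ValueError("sequences must have equal length")
    if not query:
        return 0.0
    q, t = canonical(query), canonical(target)
    return sum(a == b for a, b in zip(q, t)) / len(q)
```

For example, `ms_identity("PEPTIDEK", "PEPTLDEQ")` treats the I/L and K/Q differences as matches, whereas a plain string comparison would count two mismatches.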

2.
Analysing proteomic data   Cited by: 5 (self-citations: 0, citations by others: 5)
The rapid growth of proteomics has been made possible by the development of reproducible 2D gels and biological mass spectrometry. However, despite technical improvements 2D gels are still less than perfectly reproducible and gels have to be aligned so spots for identical proteins appear in the same place. Gels can be warped by a variety of techniques to make them concordant. When gels are manipulated to improve registration, information is lost, so direct methods for gel registration which make use of all available data for spot matching are preferable to indirect ones. In order to identify proteins from gel spots, a property or combination of properties unique to that protein is required. These can then be used to search databases for possible matches. Molecular mass, pI, amino acid composition and short sequence tags can all be used in database searches. Currently the method of choice for protein identification is mass spectrometry. Proteins are eluted from the gels and cleaved with specific endoproteases to produce a series of peptides of different molecular mass. In peptide mass fingerprinting, the peptide profile of the unknown protein is compared with theoretical peptide libraries generated from sequences in the different databases. Tandem mass spectrometry (MS/MS) generates short amino acid sequence tags for the individual peptides. These partial sequences combined with the original peptide masses are then used for database searching, greatly improving specificity. Increasingly protein identification from MS/MS data is being fully or partially automated. When working with organisms that do not have sequenced genomes (the case with most helminths), protein identification by database searching becomes problematic. A number of approaches to cross species protein identification have been suggested, but if the organism being studied is only distantly related to any organism with a sequenced genome then the likelihood of protein identification remains small. The dynamic nature of the proteome means that there really is no such thing as a single representative proteome and a complete set of metadata (data about the data) is going to be required if the full potential of database mining is to be realised in the future.
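The peptide mass fingerprinting step described in this abstract (comparing measured peptide masses with theoretical peptides generated from database sequences) can be sketched roughly as follows. This is a minimal illustration under several assumptions that are not in the abstract: trypsin as the endoprotease, no missed cleavages, unmodified residues, neutral monoisotopic masses, a fixed tolerance of 0.2 Da, and a score that simply counts matched masses. The function names and the toy database are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def digest(protein: str) -> list[str]:
    """Cleave after every K or R (assumed tryptic rule, no missed cleavages)."""
    peptides, start = [], 0
    for i, res in enumerate(protein):
        if res in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[res] for res in peptide) + WATER


def pmf_score(observed: list[float], protein: str, tol: float = 0.2) -> int:
    """Number of observed masses explained by some theoretical peptide."""
    theoretical = [peptide_mass(p) for p in digest(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def rank(observed: list[float], database: dict[str, str], tol: float = 0.2):
    """Rank database entries by how many observed masses they explain."""
    scores = {name: pmf_score(observed, seq, tol) for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank(masses, {"protA": "GASKPVTRCLM", "protB": "WWWKYYYR"})` with the masses of `GASK` and `PVTR` would place `protA` first. Real search engines use probabilistic scoring rather than a raw match count.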

3.
A proteomics-based approach was used for characterizing wheat gliadins from an Italian common wheat (Triticum aestivum) cultivar. A two-dimensional gel electrophoresis (2-DE) map of roughly 40 spots was obtained by submitting the 70% alcohol-soluble crude protein extract to isoelectric focusing on immobilized pH gradient strips across two pH gradient ranges, i.e., pH 3-10 or pH 6-11, and to sodium dodecyl sulfate-polyacrylamide electrophoresis in the second dimension. The chymotryptic digest of each spot was characterized by matrix-assisted laser desorption/ionization-time of flight mass spectrometry and nano electrospray ionization-tandem mass spectrometry (MS/MS) analysis, providing a "peptide map" for each digest. The measured masses were subsequently sought in databases for sequences. For accurate identification of the parent protein, it was necessary to determine de novo sequences by MS/MS experiments on the peptides. By partial mass fingerprinting, we identified protein molecules such as alpha/beta-, gamma-, omega-gliadin, and high molecular weight-glutenin. The single spots along the 2-DE map were discriminated on the basis of their amino acid sequence traits. alpha-Gliadin, the most represented wheat protein in databases, was highly conserved as the relative N-terminal sequence of the components from the 2-DE map contained only a few silent amino acid substitutions. The other closely related gliadins were identified by sequencing internal peptide chains. The results gave insight into the complex nature of gliadin heterogeneity. This approach has provided us with sound reference data for differentiating gliadins amongst wheat varieties.

4.
Delahunty CM, Yates JR. BioTechniques, 2007, 43(5): 563, 565, 567 passim
Large-scale biology emerged out of the efforts to sequence genomes of important organisms. Based on resources created by whole genome sequencing, large-scale analyses of messenger RNA (mRNA) and protein expression are now possible. With the availability of large amounts of genomic sequence information, a convenient method for the identification and analysis of proteins based on proteolytic digestion into peptides emerged. Processes to fragment peptides using collision-activated dissociation (CAD) in tandem mass spectrometers and computer algorithms to match the tandem mass spectra of peptides to sequences in databases enable rapid identification of amino acid sequences, and hence proteins, present in mixtures. The inherent complexity of the peptide mixtures has necessitated improvements in methodology for mass spectrometry (MS) analysis of peptides.

5.
Genomic studies have shown that there are four abundant type I and type II intermediate filament proteins (IFPs) in wool. When separated using 2D-PAGE, the type I IFPs separated into four clearly defined major rows. The type II IFPs separated into two distinct staggered rows. The large number of spots seen by 2D-PAGE has previously been attributed to charge heterogeneity caused by post-translational modification of the protein. However, analysis of wool IFPs by 2D-PAGE techniques and mass spectrometry suggested an absence of phosphorylation or glycosylation modifications. Investigations with both the type I and type II IFPs showed that when single protein spots from a 2D-PAGE separation are eluted, re-focused and re-electrophoresed, several spots are formed on both the acidic and basic side of the original spot. Amino acid analysis, mass spectrometry and Ellman's assay support the hypothesis that the proteins have the same sequence but vary in isoelectric charge, due to differences in exposure of charged residues on the molecular surface. The cause of IFP charge heterogeneity is thus proposed to be a conformational equilibrium between several different forms of the same protein in the rehydration solution used for the first dimension.

6.
The triatomine bugs are obligatory haematophagous organisms that act as vectors of Chagas disease by transmitting the protozoan Trypanosoma cruzi. Their feeding success is strongly related to salivary proteins that allow these insects to access blood by counteracting host haemostatic mechanisms. Proteomic studies were performed on saliva from the Amazonian triatomine bugs Rhodnius brethesi and R. robustus, species epidemiologically relevant in the transmission of T. cruzi. Initially, salivary proteins were separated by two-dimensional gel electrophoresis (2-DE). The average numbers of spots in the R. brethesi and R. robustus saliva samples were 129 and 135, respectively. The 2-DE profiles were very similar between the two species. Identification of spots by peptide mass fingerprinting afforded limited efficiency, since very few species-specific salivary protein sequences are available in public sequence databases. Therefore, peptide fragmentation and de novo sequencing using a MALDI-TOF/TOF mass spectrometer were applied for similarity-driven identifications, which generated very positive results. The data revealed mainly lipocalin-like proteins which promote blood feeding of these insects. The redundancy of saliva sequence identification suggested multiple isoforms caused by gene duplication followed by gene modification and/or post-translational modifications. In the first experimental assay, these proteins were predominantly phosphorylated, suggesting functional phosphoregulation of the lipocalins.

7.
Low molecular weight peptides were isolated from the chromatin of wheat sprouts. Following gel filtration the peptide fraction shows a sharp inhibiting activity on the growth of HeLa cancer cells. Infrared (IR) spectroscopy and mass spectrometry have been utilized to characterize the wheat sprout peptides in an attempt to recognize the peptide sequence involved in the control of cell growth. The quantitative presence of a peptide with MH+=572 appears proportional to the cell growth inhibition activity. This compound has been subjected to extensive mass spectrometry analysis. The automatic computational analysis of the ions of second, third and fourth generations indicates a peptide sequence, AcHis-Asp-Ser-Glu-, that binds a molecule of ethanolamine at the C-terminus. Moreover, the results show that some sequences of the wheat sprout peptide family are present in the peptide fractions isolated from several other tissues, thus supporting the hypothesis of ubiquitous regulatory peptides.

8.
Separation of proteins by two-dimensional gel electrophoresis (2-DE) coupled with identification of proteins through peptide mass fingerprinting (PMF) by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is a widely used technique for proteomic analysis. This approach relies, however, on the presence of the proteins studied in publicly accessible protein databases or the availability of annotated genome sequences of an organism. In this work, we investigated the reliability of using raw genome sequences for identifying proteins by PMF without the need for additional information such as amino acid sequences. The method is demonstrated for proteomic analysis of Klebsiella pneumoniae grown anaerobically on glycerol. Of 197 spots excised from 2-DE gels and submitted for mass spectrometric analysis, 164 spots were clearly identified as 122 individual proteins. 95% of the 164 spots can be successfully identified merely by using peptide mass fingerprints and a strain-specific protein database (ProtKpn) constructed from the raw genome sequences of K. pneumoniae. Cross-species protein searching in the public databases resulted in the identification of 57% of the 66 highly expressed protein spots, in comparison to 97% by using the ProtKpn database. Ten dha regulon related proteins that are essential for the initial enzymatic steps of anaerobic glycerol metabolism were successfully identified using the ProtKpn database, whereas none of them could be identified by cross-species searching. In conclusion, the use of a strain-specific protein database constructed from raw genome sequences makes it possible to reliably identify most of the proteins from 2-DE analysis simply through peptide mass fingerprinting.

9.
López JL, Marina A, Alvarez G, Vázquez J. Proteomics, 2002, 2(12): 1658-1665
In this work, a novel approach based on proteomics is applied for the analysis of the three European marine mussel species: Mytilus edulis (ME), Mytilus galloprovincialis (MG) and Mytilus trossulus (MT), which are of interest in biotechnology and the food industry. The proteomes of these species are poorly described in databases, are difficult to diagnose, and have a controversial taxonomy. To characterise species-specific peptides, we compared 51 matrix-assisted laser desorption/ionization-time of flight peptide mass maps generated from 6 randomly selected prominent spots derived from the two-dimensional electrophoresis analysis of foot protein extracts from several individuals. Minor species-specific differences in the peptide maps were detected in only one of the spots, corresponding to tropomyosin. Two peptides were unique to ME and MG individuals, whereas another peptide was present only in MT individuals. The sequence of these peptides was characterised by nanoelectrospray ionization-ion trap (nanoESI-IT) tandem mass spectrometry (MS/MS) analysis followed by database searching and de novo sequence interpretation. We detected a single T to D amino acid substitution in MT tropomyosin. Unambiguous and highly specific species identification was then demonstrated by analysing peptide extracts from tropomyosin spots by micro high-performance liquid chromatography (microHPLC) ESI-IT mass spectrometry using the selected ion monitoring configuration, focused on these peptides, in continuous MS/MS operation. Our results suggest that proteomics may be successfully applied for the identification of species whose proteome is not present in databases.

10.
The characterization by de novo peptide sequencing of the different nucleoside diphosphate kinase B (NDK B) proteins from all the commercial hakes and grenadiers belonging to the family Merlucciidae is reported. A classical proteomics approach, consisting of two-dimensional gel electrophoresis, tryptic in-gel digestion of the excised spots, MALDI-TOF MS, LC-MS/MS, and nanoESI-MS/MS analyses, was followed for the purification and characterization of the different isoforms of the NDK B. Fragmentation spectra were used for de novo peptide sequencing. A high degree of homology was found between the sequences of all the species studied and the NDK B sequence from Gillichthys mirabilis, which is accessible in the protein databases. Particular attention was paid to the differential characterization of species-specific peptides that could be used for fish authentication purposes. These findings allowed us to propose a rapid and effective classification method, based on the detection of these biomarker peptides using the selective ion reaction monitoring (SIRM) scan mode in mass spectrometry.

11.
12.
Proteins in the small subunit of the mammalian mitochondrial ribosome were separated by two-dimensional polyacrylamide gel electrophoresis. Four individual proteins were subjected to in-gel Endoprotease Lys-C digestion. The sequences of selected proteolytic peptides were obtained by electrospray tandem mass spectrometry. Peptide sequences obtained from in-gel digestion of individual spots were used to screen human, mouse, and rat expressed sequence tag databases, and complete consensus cDNAs for these species were deduced in silico. The corresponding protein sequences were characterized by comparison to known ribosomal proteins in protein databases. Four different classes of mammalian mitochondrial small subunit ribosomal proteins were identified. Only two of these proteins have significant sequence similarities to ribosomal proteins from prokaryotes. These proteins are homologs of the Escherichia coli S9 and S5 proteins. The presence of these newly identified mitochondrial ribosomal proteins is also investigated in Drosophila melanogaster, Caenorhabditis elegans, and the genomes of several fungi.

13.
De novo interpretation of tandem mass spectrometry (MS/MS) spectra provides sequences for searching protein databases when limited sequence information is present in the database. Our objective was to define a strategy for this type of homology-tolerant database search. Homology searches, using MS-Homology software, were conducted with 20, 10, or 5 of the most abundant peptides from 9 proteins, based either on precursor trigger intensity or on total ion current, and allowing for 50%, 30%, or 10% mismatch in the search. Protein scores were corrected by subtracting a threshold score that was calculated from random peptides. The highest (p < .01) corrected protein scores (i.e., above the threshold) were obtained by submitting 20 peptides and allowing 30% mismatch. Using these criteria, protein identification based on ion mass searching using MS/MS data (i.e., Mascot) was compared with that obtained using homology search. The highest-ranking protein was the same using Mascot, homology search using the 20 most intense peptides, or homology search using all peptides, for 63.4% of 112 spots from two-dimensional polyacrylamide gel electrophoresis gels. For these proteins, the percent coverage was greatest using Mascot compared with the use of all or just the 20 most intense peptides in a homology search (25.1%, 18.3%, and 10.6%, respectively). Finally, 35% of de novo sequences completely matched the corresponding known amino acid sequence of the matching peptide. This percentage increased when the search was limited to the 20 most intense peptides (44.0%). After identifying the protein using MS-Homology, a peptide mass search may increase the percent coverage of the protein identified.

14.
The complete amino acid sequence of exogastrula-inducing peptide C from embryos of the sea urchin, Anthocidaris crassispina has been determined by analysis of the amino acid sequences in the S-pyridylethylated peptide C and the peptides generated after digestion of the peptide C with arginyl endopeptidase. Exogastrula-inducing peptide C was composed of 58 amino acid residues and its molecular weight was calculated to be 6464. The sequence was DTKGGCERATNNCNGHGDCVQGRWGQYYCKCTLPYRVGGSESSCYMPKDKEEDVEIET.
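The residue count and molecular weight quoted in this abstract can be checked against the printed sequence with a short calculation. The sketch below assumes standard average residue masses and a linear, unmodified, fully reduced peptide; the abstract does not say how its figure of 6464 was calculated, so a difference of a few daltons (for example from disulfide bonds, which this sketch ignores) would not be surprising. The names `AVERAGE_MASS`, `EIP_C` and `average_mass` are hypothetical.

```python
# Standard average residue masses in Da (rounded to four decimals).
AVERAGE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_AVERAGE = 18.0153  # terminal H and OH of the free peptide

# Sequence exactly as printed in the abstract.
EIP_C = "DTKGGCERATNNCNGHGDCVQGRWGQYYCKCTLPYRVGGSESSCYMPKDKEEDVEIET"


def average_mass(sequence: str) -> float:
    """Average molecular mass of a linear, unmodified, reduced peptide."""
    return sum(AVERAGE_MASS[res] for res in sequence) + WATER_AVERAGE


if __name__ == "__main__":
    print("residues:", len(EIP_C))
    print("average mass (reduced): %.2f" % average_mass(EIP_C))
    print("cysteine residues:", EIP_C.count("C"))
```

The cysteine count is printed because each disulfide bond would lower the mass by about 2 Da relative to the reduced form, which is one plausible source of a small discrepancy.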

15.
16.
We describe two novel sequence similarity search algorithms, FASTS and FASTF, that use multiple short peptide sequences to identify homologous sequences in protein or DNA databases. FASTS searches with peptide sequences of unknown order, as obtained by mass spectrometry-based sequencing, evaluating all possible arrangements of the peptides. FASTF searches with mixed peptide sequences, as generated by Edman sequencing of unseparated mixtures of peptides. FASTF deconvolutes the mixture, using a greedy heuristic that allows rapid identification of high scoring alignments while reducing the total number of explored alternatives. Both algorithms use the heuristic FASTA comparison strategy to accelerate the search but use alignment probability, rather than similarity score, as the criterion for alignment optimality. Statistical estimates are calculated using an empirical correction to a theoretical probability. These calculated estimates were accurate within a factor of 10 for FASTS and 1000 for FASTF on our test dataset. FASTS requires only 15-20 total residues in three or four peptides to robustly identify homologues sharing 50% or greater protein sequence identity. FASTF requires about 25% more sequence data than FASTS for equivalent sensitivity, but additional sequence data are usually available from mixed Edman experiments. Thus, both algorithms can identify homologues that diverged 100 to 500 million years ago, allowing proteomic identification from organisms whose genomes have not been sequenced.

17.
Several peptides were isolated from the protein silk fibroin of Bombyx mori by means of ion-exchange chromatography of a chymotryptic digest. The sequences of three of the peptides, Gly-Ala-Gly-Tyr, Gly-Val-Gly-Tyr and Gly-Ala-Gly-Ala-Gly-Ala-Gly-Tyr, were known from previous chemical work, but the sequence of the fourth, Gly-Ala-Gly-Val-Gly-Ala-Gly-Tyr, was previously only partially known. The necessary volatility for mass-spectrometric examination of the peptides was achieved by permethylation of the N-acetyl-peptide methyl ester derivatives. From the mass spectra it was possible to confirm the known sequences and to establish that of the partially known one. In one instance it was possible to deduce from the same mass spectrum the sequence of a main peptide component and that of a small amount of contaminating peptide. These results demonstrate for the first time the use of mass spectrometry in the determination of the amino acid sequences in peptides from a protein hydrolysate.

18.
The interaction of bilirubin with collagen, and its significance for the incidence of jaundice, has been reported and investigated previously. In the present study, novel peptide sequences containing the bilirubin-binding domain were identified and located to provide a basis for further studies of the interactions of collagen with bilirubin. The interaction between bilirubin and collagen was characterized and the binding domain was established using in-gel digestion and LC–MS/MS analysis based on collagen sequencing and peptide mass fingerprinting. Biotinylated bilirubin derivatives bind to the α1(I) chain but not to the α2(I) chain, which indicates that bilirubin has greater affinity for the α1 chains of collagen. The intact proteins collected after analyzing the resulting complex mixture of peptides were used for peptide mapping. Using the electrospray method, among the other peptide sequence information obtained, the collagen alpha-2(I) chain was found to have a molecular weight of 130 kDa, 1,364 amino acid residues and the greater pI value (9.14), and the collagen alpha-1(I) chain to have a molecular weight of 138.9 kDa and 1,463 amino acid residues. This information helps to locate the exact sequence of these helices, with a focus on domain identification. The total charge of the peptide domain sequences implies that bilirubin interacts electrostatically with the collagen peptide. Moreover, other modes of interaction, such as hydrogen bonding, covalent interactions and hydrophobic interactions, are possible.

19.
Trypsin cleaves exclusively C-terminal to arginine and lysine residues   Cited by: 2 (self-citations: 0, citations by others: 2)
Almost all large-scale projects in mass spectrometry-based proteomics use trypsin to convert protein mixtures into more readily analyzable peptide populations. When searching peptide fragmentation spectra against sequence databases, potentially matching peptide sequences can be required to conform to tryptic specificity, namely, cleavage exclusively C-terminal to arginine or lysine. In many published reports, however, significant numbers of proteins are identified by non-tryptic peptides. Here we use the sub-parts per million mass accuracy of a new ion trap Fourier transform mass spectrometer to achieve more than a 100-fold increased confidence in peptide identification compared with typical ion trap experiments and show that trypsin cleaves solely C-terminal to arginine and lysine. We find that non-tryptic peptides occur only as the C-terminal peptides of proteins and as breakup products of fully tryptic peptides N-terminal to an internal proline. Simulating lower mass accuracy led to a large number of proteins erroneously identified with non-tryptic peptide hits. Our results indicate that such peptide hits in previous studies should be re-examined and that peptide identification should be based on strict trypsin specificity.
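The requirement that candidate peptides conform to tryptic specificity can be expressed as a simple check on both ends of a peptide within its parent protein. The sketch below is only an illustration of that rule as stated in the abstract (cleavage C-terminal to K or R, with the protein's own termini also accepted); the function name and the example protein are hypothetical, and only the first occurrence of the peptide in the protein is examined.

```python
def is_fully_tryptic(protein: str, peptide: str) -> bool:
    """Return True if both peptide termini are consistent with trypsin.

    N-terminus: the peptide starts the protein or is preceded by K or R.
    C-terminus: the peptide ends with K or R or ends the protein.
    Only the first occurrence of the peptide in the protein is checked.
    """
    if not peptide:
        raise ValueError("peptide must not be empty")
    start = protein.find(peptide)
    if start < 0:
        raise ValueError("peptide does not occur in protein")
    end = start + len(peptide)
    n_term_ok = start == 0 or protein[start - 1] in "KR"
    c_term_ok = end == len(protein) or peptide[-1] in "KR"
    return n_term_ok and c_term_ok
```

With the made-up protein `MAGKPEPTIDERLAST`, the peptide `PEPTIDER` passes the check, while `EPTIDER` fails because the residue before it is P rather than K or R. A search engine applying strict specificity would discard candidates that fail this test.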

20.
Salivary agglutinin is a high molecular mass component of human saliva that binds Streptococcus mutans, an oral bacterium implicated in dental caries. To study its protein sequence, we isolated the agglutinin from human parotid saliva. After trypsin digestion, a portion was analyzed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS), which gave the molecular mass of 14 unique peptides. The remainder of the digest was subjected to high performance liquid chromatography, and the separated peptides were analyzed by MALDI-TOF/post-source decay; the spectra gave the sequences of five peptides. The molecular mass and peptide sequence information showed that salivary agglutinin peptides were identical to sequences in lung (lavage) gp-340, a member of the scavenger receptor cysteine-rich protein family. Immunoblotting with antibodies that specifically recognized either lung gp-340 or the agglutinin confirmed that the salivary agglutinin was gp-340. Immunoblotting with an antibody specific to the sialyl Le(x) carbohydrate epitope detected expression on the salivary but not the lung glycoprotein, possible evidence of different glycoforms. The salivary agglutinin also interacted with Helicobacter pylori, implicated in gastritis and peptic ulcer disease, Streptococcus agalactiae, implicated in neonatal meningitis, and several oral commensal streptococci. These results identify the salivary agglutinin as gp-340 and suggest it binds bacteria that are important determinants of either the oral ecology or systemic diseases.
