Similar literature
 20 similar articles found (search time: 31 ms)
1.
MOTIVATION: Although many methods are available for the identification of structural domains from protein three-dimensional structures, accurate definition of protein domains and the curation of such data for a large number of proteins are often possible only after manual intervention. The availability of domain definitions for protein structural entries is useful for the sequence analysis of aligned domains, structure comparison, fold recognition procedures and understanding protein folding, domain stability and flexibility. RESULTS: We have improved our method of domain identification starting from the concept of clustering secondary structural elements, but with the intention of reducing the number of discontinuous segments in identified domains. The results of our modified and automatic approach have been compared with the domain definitions from other databases. On a test data set of 55 proteins, this method achieves high agreement (88%) in the number of domains with the crystallographers' definition and resources such as the SCOP, CATH, DALI, 3Dee and PDP databases. This method also obtains a 98% overlap score with the other resources in the definition of domain boundaries of the 55 proteins. We have examined the domain arrangements of 4592 non-redundant protein chains using the improved method to include 5409 domains, leading to an update of the structural domain database. AVAILABILITY: The latest version of the domain database and online domain identification methods are available from http://www.ncbs.res.in/~faculty/mini/ddbase/ddbase.html SUPPLEMENTARY INFORMATION: http://www.ncbs.res.in/~faculty/mini/ddbase/supplementary/supplementary.html

2.
The identification of unknown amino acid sequences of peptides, as well as protein identification, is of great significance in proteomics. Here, we present a publicly available web application that facilitates high-resolution mapping of measured molecular masses to peptides and proteins, irrespective of the enzyme or digestion method used. Furthermore, multiple filters may be applied to refine the results: measured mass tolerance, molecular mass and isoelectric point range, and pattern matching. This approach complements existing solutions for protein identification and offers insight into novel peptide discovery and protein identification in cases where the identification scores from other approaches fall below the significance threshold. Peptide Finder has proven useful in proteomics procedures with experimental data from MALDI-TOF. AVAILABILITY: The Peptide Finder web application is available at http://bioserver-1.bioacademy.gr/Bioserver/PeptideFinder/.
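The enzyme-independent mass mapping described above can be illustrated with a minimal sketch. It is not Peptide Finder's implementation: the function names `peptide_mass` and `find_peptides`, the parameters `tol_da`, `min_len` and `max_len`, and the choice of neutral monoisotopic masses are all assumptions made for illustration. The sketch scans every contiguous subsequence of a protein and keeps those whose mass lies within a tolerance of the measured mass.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added for the free N- and C-termini


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def find_peptides(protein, measured_mass, tol_da=0.5, min_len=2, max_len=50):
    """Return (start, peptide, mass) for every subsequence within tol_da.

    Hypothetical helper: no cleavage rule is assumed, so every contiguous
    subsequence of the protein is a candidate peptide.
    """
    hits = []
    n = len(protein)
    for start in range(n):
        mass = WATER
        for end in range(start, min(n, start + max_len)):
            mass += RESIDUE_MASS[protein[end]]
            if mass > measured_mass + tol_da:
                break  # masses only grow as the peptide is extended
            length = end - start + 1
            if length >= min_len and abs(mass - measured_mass) <= tol_da:
                hits.append((start, protein[start:end + 1], mass))
    return hits
```

Calling `find_peptides("MGAK", 146.069, tol_da=0.01)` scans the toy sequence `MGAK` for subsequences matching a measured mass of 146.069 Da. Further filters such as an isoelectric point range or a sequence pattern could be applied to the returned hits in the same way.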

3.
Introduction: Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) is increasingly utilized as a rapid technique to identify microorganisms including pathogenic bacteria. However, little attention has been paid to the significant proteomic information encoded in the MS peaks that collectively constitute the MS ‘fingerprint’. This review/perspective is intended to explore this topic in greater detail in the hopes that it may spur interest and further research in this area.

Areas covered: This paper examines the recent literature on utilizing MALDI-TOF for bacterial identification. Critical works highlighting protein biomarker identification of bacteria, arguments for and against protein biomarker identification, proteomic approaches to biomarker identification, emergence of MALDI-TOF-TOF platforms and their use for top-down proteomic identification of bacterial proteins, protein denaturation and its effect on protein ion fragmentation, collision cross-sections and energy deposition during desorption/ionization are also explored.

Expert commentary: MALDI-TOF and TOF-TOF mass spectrometry platforms will continue to provide chemical analyses that are rapid, cost-effective and high throughput. These instruments have proven their utility in the taxonomic identification of pathogenic bacteria at the genus and species level and are poised to more fully characterize these microorganisms to the benefit of clinical microbiology, food safety and other fields.


4.
Chen S. Proteomics 2006, 6(1):16-25
Current protein identification techniques are largely based on MALDI-TOF mass fingerprinting and LC-ESI MS/MS sequence tag analysis. Here we describe an improved method for rapid protein identification that uses direct infusion nanoelectrospray quadrupole time-of-flight (nanoESI QTOF) MS. Protein digests were analyzed without LC separation using nanoESI on a QSTAR XL MS/MS system in information-dependent data acquisition mode. The protein identification conditions and parameters were extensively evaluated with in-solution and in-gel digested protein samples. Rapid identification of proteins was achieved and compared directly to the results obtained on the same samples using nanoflow HPLC-MS/MS on the QSTAR system. The increased throughput, reproducibility, high data quality, and ease of use make the direct infusion system an efficient and affordable technique for protein identification analysis.

5.
Matrix-assisted laser desorption/ionization (MALDI) imaging of proteolytic peptides from formalin-fixed paraffin-embedded (FFPE) tissue sections could be integrated into the portfolio of molecular pathologists for protein localization and tissue classification. However, protein identification can be very tedious using MALDI-time-of-flight (TOF) and post-source decay (PSD)-based fragmentation. Here, we implemented an R package and Shiny app to exploit liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based proteomic biomarker discovery data for more specific identification of peaks observed in bottom-up MALDI imaging data. The package is made available under the GPL 3 license. The Shiny app can be used directly at the following address: https://biosciences.shinyapps.io/Maldimid.

6.
Zhao Song, Luonan Chen, Dong Xu. Proteomics 2009, 9(11):3090-3099
Protein identification using Peptide Mass Fingerprinting (PMF) data remains an important yet only partially solved problem. Current computational methods may lead to false positive identifications, since the top hit from a database search may not be the target protein. In addition, the identification scores assigned by a single scoring function (raw scores) are not normalized, so a ranking based on raw scores may be biased. To address these issues, we have developed a statistical model to evaluate the confidence of the raw score and to improve the ranking of proteins for identification. The results show that the statistical model ranks the correct protein better than the raw scores do. Our study provides a new method to enhance the accuracy of protein identification using PMF data. We incorporated the method into our software package “Protein‐Decision” together with a user‐friendly graphical interface. A standalone version of Protein‐Decision is freely available at http://digbio.missouri.edu/ProteinDecision/.

7.
8.
We present results from a novel strategy that enables concurrent identification of protein-protein interactions and topologies in living cells without specific antibodies or genetic manipulations for immuno-/affinity purifications. The strategy consists of (i) a chemical cross-linking reaction: intact cell labeling with a novel class of chemical cross-linkers, protein interaction reporters (PIRs); (ii) two-stage mass spectrometric analysis: stage 1 identification of PIR-labeled proteins and construction of a restricted database by two-dimensional LC/MSMS and stage 2 analysis of PIR-labeled peptides by multiplexed LC/FTICR-MS; and (iii) data analysis: identification of cross-linked peptides and proteins of origin using accurate mass and other constraints. The primary advantage of the PIR approach and its distinction from current technology is that protein interactions together with topologies are detected in native biological systems by stabilizing protein complexes with new covalent bonds while the proteins are present in the original cellular environment. Thus, weak or transient interactions or interactions that require properly folded, localized, or membrane-bound proteins can be labeled and identified through the PIR approach. This strategy was applied to Shewanella oneidensis bacterial cells, and initial studies resulted in identification of a set of protein-protein interactions and their contact/binding regions. Furthermore, most identified interactions involved membrane proteins, suggesting that the PIR approach is particularly suited for studies of membrane protein-protein interactions, an area under-represented with current widely used approaches.

9.
Recently, the first investigation of nucleoli using mass spectrometry led to the identification of 271 proteins. This represents a rich resource for a comprehensive investigation of nucleolus evolution. We applied a protocol for the identification of known and novel conserved protein domains of the nucleolus, resulting in the identification of 115 known and 91 novel domain profiles. The phyletic distribution of nucleolar protein domains in a collection of complete proteomes of selected organisms from all domains of life confirms the archaebacterial origin of the core machinery for ribosome maturation and assembly, but also reveals substantial eubacterial and eukaryotic contributions to nucleolus evolution. We predict that, in different phases of nucleolus evolution, protein domains with different biochemical functions were recruited to the nucleolus. We suggest a model for the late and continuous evolution of the nucleolus in early eukaryotes and argue against an endosymbiotic origin of the nucleolus and the nucleus. Supplementary material for this article can be found on the BioEssays website at http://www.interscience.wiley.com/jpages/0265-9247/suppmat/index.html.

10.
To interpret LC-MS/MS data in proteomics, most popular protein identification algorithms primarily use predicted fragment m/z values to assign peptide sequences to fragmentation spectra. The intensity information is often undervalued, because it is not as easy to predict and incorporate into algorithms. Nevertheless, the use of intensity to assist peptide identification is an attractive prospect and can potentially improve the confidence of matches and generate more identifications. On the basis of our previously reported study of fragmentation intensity patterns, we developed a protein identification algorithm, SeQuence IDentification (SQID), that makes use of the coarse intensity from a statistical analysis. The scoring scheme was validated by comparison with Sequest and X!Tandem using three data sets, and the results indicate an improvement in the number of identified peptides, including unique peptides that are not identified by Sequest or X!Tandem. The software and source code are available under the GNU GPL license at http://quiz2.chem.arizona.edu/wysocki/bioinformatics.htm.

11.
12.
Protein identification is a key step in mass spectrometry (MS)-based proteome research. To date, there are many protein identification strategies that employ either MS data or MS/MS data for database searching. While MS-based methods provide wider coverage than MS/MS-based methods, their identification accuracy is lower since MS data carry less information than MS/MS data. Thus, it is desirable to design more sophisticated algorithms that achieve higher identification accuracy using MS data. Peptide Mass Fingerprinting (PMF) has been widely used for many years to identify single purified proteins from MS data. In this paper, we extend this technology to protein mixture identification. First, we formulate the problem of protein mixture identification as a Partial Set Covering (PSC) problem. Then, we present several algorithms that can solve the PSC problem efficiently. Finally, we extend the partial set covering model to both MS/MS data and the combination of MS data and MS/MS data. The experimental results on simulated data and real data demonstrate the advantages of our method: 1) it outperforms previous MS-based approaches significantly; 2) it is useful in MS/MS-based protein inference; and 3) it combines MS data and MS/MS data in a unified model such that the identification performance is further improved.
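To make the partial set covering formulation concrete, the sketch below applies the textbook greedy heuristic: repeatedly pick the protein that explains the most still-unexplained peaks until a required number of peaks is covered. This is an assumption chosen for illustration, not one of the algorithms from the paper, and the names `greedy_partial_cover`, `candidates` and `min_covered` are hypothetical.

```python
def greedy_partial_cover(peaks, candidates, min_covered):
    """Greedy heuristic for partial set cover (illustrative only).

    peaks       -- iterable of observed peak identifiers
    candidates  -- dict mapping protein name -> set of peaks it can explain
    min_covered -- number of observed peaks that must be explained
    Returns (chosen protein names in selection order, set of covered peaks).
    """
    universe = set(peaks)
    covered = set()
    chosen = []
    while len(covered) < min_covered:
        best_name, best_gain = None, set()
        for name, explained in candidates.items():
            # Only peaks that were observed and are not yet covered count.
            gain = (explained & universe) - covered
            if len(gain) > len(best_gain):
                best_name, best_gain = name, gain
        if best_name is None:
            break  # no candidate adds coverage; the target is unreachable
        chosen.append(best_name)
        covered |= best_gain
    return chosen, covered
```

With ten observed peaks and a target of eight, a protein explaining five peaks is chosen first, followed by whichever remaining protein adds the most new peaks. The greedy choice is fast but is not guaranteed to return the smallest possible protein set.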

13.
14.
SUMMARY: ASAP is a web tool designed to search for specific dipeptides, tripeptides and tetrapeptides in a protein sequence database. The server allows for: (a) identification of frequent and infrequent peptides and the creation of peptide probability tables for a given database of sequences (GenerNet program); (b) determination of the compatibility of an amino-acid sequence with the given peptide probability tables (ClonErrNet program); and (c) comparison of different protein databases based on peptide composition (CompNet program). The ASAP server can be useful in protein engineering and/or protein classification studies.
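A peptide probability table of the kind described above can be sketched as a simple sliding-window count. The code below is not the GenerNet program; the function names `peptide_counts` and `peptide_probabilities` are hypothetical, and the probability is assumed to be the plain relative frequency of each overlapping k-residue peptide (k = 2, 3 or 4 for dipeptides, tripeptides and tetrapeptides).

```python
from collections import Counter


def peptide_counts(sequences, k):
    """Count every overlapping k-residue peptide across the sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def peptide_probabilities(sequences, k):
    """Relative frequency of each k-residue peptide (empty dict if none)."""
    counts = peptide_counts(sequences, k)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pep: n / total for pep, n in counts.items()}
```

For example, `peptide_probabilities(["AAGA", "GA"], 2)` tabulates the dipeptides of two toy sequences. Frequent and infrequent peptides can then be read off by sorting the table by value.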

15.
Zhang J, Xu X, Gao M, Yang P, Zhang X. Proteomics 2007, 7(4):500-512
The current "shotgun" proteomic analysis, the strong cation exchange-RPLC-MS/MS system, is a widely used method for proteome research. Currently, it is not suitable for the analysis of complicated protein samples, such as mammalian tissues or cells. To increase the confidence and number of protein identifications, an additional separation dimension for sample fractionation must be coupled upstream of the current multi-dimensional protein identification technology (MudPIT). In this work, SEC was selected and applied for sample prefractionation because it is unbiased toward the sample and compatible with a variety of mobile phases. The global lysate of a normal human liver tissue sample provided by the China Human Liver Proteome Project was analyzed to compare the proteome coverage, sequence coverage (peptides per protein identification) and protein identification efficiency of MudPIT and of the 3-D LC-MS/MS identification strategy with preproteolytic and postproteolytic fractionation. It was demonstrated that 3-D LC-MS/MS utilizing protein-level fractionation was the most effective method. A MASCOT search using the MS/MS results acquired by QSTAR(XL) identified 1622 proteins from the 3-D LC-MS/MS identification approaches. A primary analysis of the molecular weight, pI and grand average hydrophobicity value distributions of the proteins identified by the different approaches was made to further evaluate the 3-D LC-MS/MS analysis strategy.

16.
Two-dimensional liquid chromatography (2D-LC) coupled on-line with electrospray ionization tandem mass spectrometry (2D-LC-ESI-MS/MS) is a new platform for the analysis and identification of the proteome. Peptides are separated by 2D-LC and then analyzed by tandem MS. The MS/MS data are searched against a database for protein identification. In one 2D-LC-ESI-MS/MS run, we obtained not only the structural information of peptides directly from MS/MS, but also the retention time of peptides eluted from LC. Information on the chromatographic behavior of peptides can assist protein identification in the new platform for proteomics. The retention time of the matching peptides of the identified protein was predicted from the hydrophobic contribution of each amino acid on reversed-phase liquid chromatography (RPLC). Using this strategy, proteins were identified by four types of information: peptide mass fingerprinting (PMF), sequence query, MS/MS ion search and the predicted retention time. This additional information obtained from LC could assist protein identification at no extra experimental cost.

17.
We have investigated the use of a top-down liquid chromatography/mass spectrometric (LC/MS) approach for the identification of specific protein biomarkers useful for differentiation of closely related strains of bacteria. The sequence information derived from the protein biomarker was then used to develop specific polymerase chain reaction primers useful for rapid identification of the strains. Shiga-toxigenic Escherichia coli (STEC) strains were used for this evaluation. The expressed protein profiles of two closely related strains of serotype O157:H7, the predominant serotype implicated in illness worldwide, and the nonpathogenic E. coli K-12 strain were compared with each other in an attempt to identify new protein markers that could be used to distinguish the O157:H7 strains from each other and from the E. coli K-12 strain. Sequencing of a single protein unique to one of the O157:H7 strains identified it as a cytolethal distending toxin, a potential virulence marker. The protein sequence information enabled the derivation of genetic sequence information for this toxin, thus allowing the development of specific polymerase chain reaction primers for its detection. In addition, the top-down LC/MS technique was able to identify other unique biomarkers and differentiate nearly identical O157:H7 strains, which exhibited identical phenotypic, serologic, and genetic traits. The results of these studies demonstrate that this approach can be expanded to other serotypes of interest and provide a rational approach to identifying new molecular targets for detection.

18.
We examine differential protein expression in Euhalothece sp. BAA001, an extremely halotolerant and unsequenced cyanobacterium, under adaptation to low (0% w/v), medium (3% w/v), high (6% w/v) and very high (9% w/v) salt concentrations using cross-species protein identification tools. We combine stable isotope labelling with 1-D SDS-PAGE, and MASCOT protein identification software with MS-driven BLAST searches, to produce an accurate method for protein identification and quantitation. The use of metabolic labelling to improve the confidence in identification of proteins in cross-species proteomics is demonstrated. Three hundred and eighty-three unique proteins were identified, and 72 were deemed to be differentially expressed (average CV for quantitations was 0.10 +/- 0.08), belonging to 24 functional groups. Responses to low salt as well as high salt are discussed in terms of adaptation and evidence shows that Euhalothece cells display 'stress' responses in nonsaline conditions as well as higher salt environments.

19.
MOTIVATION: The rapid increase in the amount of protein sequence data has created a need for an automated identification of evolutionarily related subgroups from large datasets. The existing methods typically require a priori specification of the number of putative groups, which defines the resolution of the classification solution. RESULTS: We introduce a Bayesian model-based approach to simultaneous identification of evolutionary groups and conserved parts of the protein sequences. The model-based approach provides an intuitive and efficient way of determining the number of groups from the sequence data, in contrast to the ad hoc methods often exploited for similar purposes. Our model recognizes the areas in the sequences that are relevant for the clustering and regards other areas as noise. We have implemented the method using a fast stochastic optimization algorithm which yields a clustering associated with the estimated maximum posterior probability. The method has been shown to have high specificity and sensitivity in simulated and real clustering tasks. With real datasets the method also highlights the residues close to the active site. AVAILABILITY: Software 'kPax' is available at http://www.rni.helsinki.fi/jic/softa.html

20.
MOTIVATION: One bottleneck in high-throughput protein crystallography is interpreting an electron-density map, that is, fitting a molecular model to the 3D picture crystallography produces. Previously, we developed ACMI (Automatic Crystallographic Map Interpreter), an algorithm that uses a probabilistic model to infer an accurate protein backbone layout. Here, we use a sampling method known as particle filtering to produce a set of all-atom protein models. We use the output of ACMI to guide the particle filter's sampling, producing an accurate, physically feasible set of structures. RESULTS: We test our algorithm on 10 poor-quality experimental density maps. We show that particle filtering produces accurate all-atom models, resulting in fewer chains, lower sidechain RMS error and reduced R factor, compared to simply placing the best-matching sidechains on ACMI's trace. We show that our approach produces a more accurate model than three leading methods (Textal, Resolve and ARP/WARP) in terms of main chain completeness, sidechain identification and crystallographic R factor. AVAILABILITY: Source code and experimental density maps are available at http://ftp.cs.wisc.edu/machine-learning/shavlik-group/programs/acmi/
