Similar articles
1.

Background

Recent advances in liquid chromatography-mass spectrometry (LC-MS) technology have led to more effective approaches for measuring changes in peptide/protein abundances in biological samples. Label-free LC-MS methods have been used for extraction of quantitative information and for detection of differentially abundant peptides/proteins. However, difference detection by analysis of data derived from label-free LC-MS methods requires various preprocessing steps including filtering, baseline correction, peak detection, alignment, and normalization. Although several specialized tools have been developed to analyze LC-MS data, determining the most appropriate computational pipeline remains challenging partly due to lack of established gold standards.

Results

This paper is an initial study that uses spike-in experiments to build a simple model with a "presence" or "absence" condition and to identify these "true differences" with available software tools. In addition to the preprocessing pipelines, choosing appropriate statistical tests and determining critical values are important. We observe that individual statistical tests can lead to different results because they rest on different assumptions and metrics. It is therefore preferable to incorporate several statistical tests for either exploratory or confirmatory purposes.

Conclusions

The LC-MS data from our spike-in experiment can be used to develop and optimize LC-MS data preprocessing algorithms and to evaluate workflows implemented in existing software tools. Our current work is a stepping stone towards optimizing LC-MS data acquisition and testing the accuracy and validity of computational tools for difference detection. Future studies will focus on spiking peptides of diverse physicochemical properties at different concentrations to better represent biomarker discovery of differentially abundant peptides/proteins.
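The point about combining several statistical tests can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names, the choice of tests (an exact permutation test on the mean difference and on a Mann-Whitney-style rank statistic), and the example intensities are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def mean_difference(a, b):
    """Intensity-scale statistic: difference of group means."""
    return mean(a) - mean(b)


def centered_u(a, b):
    """Rank-based statistic: Mann-Whitney U for group a, centered at its null mean."""
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    return u - len(a) * len(b) / 2


def exact_permutation_pvalue(a, b, statistic):
    """Two-sided exact permutation p-value (enumerates every relabeling).

    Only practical for small groups, e.g. a handful of replicate LC-MS runs.
    """
    pooled = list(a) + list(b)
    observed = abs(statistic(a, b))
    total = extreme = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(statistic(group_a, group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical peak intensities for one feature; one replicate is an outlier.
spiked = [1, 2, 3, 100]
control = [4, 5, 6, 7]
p_mean = exact_permutation_pvalue(spiked, control, mean_difference)
p_rank = exact_permutation_pvalue(spiked, control, centered_u)
print(f"mean-difference p = {p_mean:.3f}, rank-based p = {p_rank:.3f}")
```

The two statistics respond differently to the single outlying replicate, which is the kind of disagreement that motivates reporting more than one test.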

2.
Identification of fusion proteins has contributed significantly to our understanding of cancer progression, yielding important predictive markers and therapeutic targets. While fusion proteins can potentially be identified by mass spectrometry, all previously found fusion proteins were identified using genomic (rather than mass spectrometry) technologies. This lack of MS/MS applications in studies of fusion proteins is caused by the lack of computational tools that can interpret mass spectra from peptides covering unknown fusion breakpoints (fusion peptides). Indeed, the number of potential fusion peptides is so large that existing MS/MS database search tools become impractical even for small genomes. We explore computational approaches to identifying fusion peptides, propose an algorithm for solving the fusion peptide identification problem, and analyze the performance of this algorithm on simulated data. We further illustrate how this approach can be modified for human exon prediction.

3.
Tilted peptides are short sequence fragments (10-20 residues long) that possess an asymmetric hydrophobicity gradient along their sequence when they are helical. Due to this gradient, they adopt a tilted orientation towards a single lipid/water interface and destabilize the lipids. We have detected those peptides in many different proteins with various functions. Although they all adopt a tilted orientation at a single lipid/water interface, no consensus sequence is evident. In order to better understand the relationships between their lipid-destabilizing activity and their properties, we used IMPALA to classify the tilted peptides. This method allows the study of interactions between a peptide and a modeled lipid bilayer using simple restraint functions designed to mimic some of the membrane properties. We predict that tilted peptides have access to a wide conformational space in membranes, in contrast to transmembrane and amphipathic helices. In agreement with previous studies, we suggest that those metastable configurations could perturb the organization of the acyl chains and could be a general mechanism for lipid destabilization. Our results further suggest that tilted peptides fall into two classes: those from proteins acting on membranes behave differently from destabilizing fragments of interfacial proteins. While the former have equal access to the two layers of the membrane, the latter are confined within a single lipid layer. This could be related to the organization of the lipid substrate on which the peptides physiologically act.

4.
Differential quantification of proteins and peptides by LC-MS is a promising method to acquire knowledge about biological processes, and for finding drug targets and biomarkers. However, differential protein analysis using LC-MS has been held back by the lack of suitable software tools. Large amounts of experimental data are easily generated in protein and peptide profiling experiments, but data analysis is time-consuming and labor-intensive. Here, we present a fully automated method for scanning LC-MS/MS data for biologically significant peptides and proteins, including support for interactive confirmation and further profiling. By studying peptide mixtures of known composition, we demonstrate that peptides present in different amounts in different groups of samples can be automatically screened for using statistical tests. A linear response can be obtained over almost 3 orders of magnitude, facilitating further profiling of peptides and proteins of interest. Furthermore, we apply the method to study the changes of endogenous peptide levels in mouse brain striatum after administration of reserpine, a classical model drug for inducing Parkinson disease symptoms.

5.
Luminita Moruz  Lukas Käll 《Proteomics》2014,14(12):1464-1466
Here we present GradientOptimizer, an intuitive, lightweight graphical user interface for designing nonlinear gradients for the separation of peptides by reversed-phase liquid chromatography. The software calculates three types of nonlinear gradients, each optimizing a particular retention time distribution of interest. GradientOptimizer is straightforward to use, requires minimal processing of the input files, and is supported on Windows, Linux, and OS X platforms. The software is open-source and can be downloaded under an Apache 2.0 license at https://github.com/statisticalbiotechnology/NonlinearGradientsUI .

6.
The automation of protein structure determination using NMR is coming of age. The tedious processes of resonance assignment, followed by assignment of NOE (nuclear Overhauser enhancement) interactions (now intertwined with structure calculation), assembly of input files for structure calculation, intermediate analyses of incorrect assignments and bad input data, and finally structure validation are all being automated with sophisticated software tools. The robustness of the different approaches continues to deal with problems of completeness and uniqueness; nevertheless, the future is very bright for automation of NMR structure generation to approach the levels found in X-ray crystallography. Currently, near completely automated structure determination is possible for small proteins, and the prospect for medium-sized and large proteins is good.

7.
Physical properties of membranes, such as fluidity, charge or curvature, influence their function. Proteins and peptides can modulate those properties and, conversely, the lipids can affect the activity and/or the structure of the former. Tilted peptides are short hydrophobic protein fragments characterized by an asymmetric distribution of their hydrophobic residues when helical. They were detected in viral fusion proteins and in proteins involved in different biological processes that need membrane destabilization. Those peptides and non-lamellar lipids such as PE or PA appear to cooperate in the lipid destabilization process by enhancing the formation of negatively curved domains. Such highly bent lipidic structures could favour the formation of the viral fusion pore intermediates or that of toroidal pores. Structural flexibility appears to be another crucial property for the interaction of peptides with membranes. Computational analysis of another kind of lipid-interacting peptide, i.e. cell penetrating peptides (CPP), suggests that conformationally polymorphic peptides should be more prone to traverse the bilayer. Future investigations on the intrinsic structural properties of tilted peptides and the influence of CPP on the bilayer organization using the techniques described in this chapter should help to further understand the molecular determinants of the peptide/lipid inter-relationships.

8.
The major histocompatibility complex (MHC) peptide repertoire of cancer cells serves both as a source for new tumor antigens for development of cancer immunotherapy and as a rich information resource about the protein content of the cancer cells (their proteome). Thousands of different MHC peptides are normally displayed by each cell, where most of them are derived from different proteins and thus represent most of the cellular proteome. However, in contrast to standard proteomics, which surveys the cellular protein contents, analyses of the MHC peptide repertoire correspond more to the rapidly degrading proteins in the cells (i.e. the transient proteome). MHC peptides can be efficiently purified by affinity chromatography from membranal MHC molecules, or preferably following transfection of vectors for expression of recombinant soluble MHC molecules. The purified peptides are resolved and analyzed by capillary high-pressure liquid chromatography-electrospray ionization-tandem mass spectrometry, and the data are deciphered with new software tools enabling the creation of large databanks of MHC peptides displayed by different cell types and by different MHC haplotypes. These lists of identified MHC peptides can now be used for searching new tumor antigens, and for identification of proteins whose rapid degradation is significant to cancer progression and metastasis. These lists can also be used for identification of new proteins of yet unknown function that are not detected by standard proteomics approaches. This review focuses on the presentation, identification and analysis of MHC peptides significant for cancer immunotherapy. It is also concerned with the aspects of human proteomics observed through large-scale analyses of MHC peptides.

9.
To further our understanding of the biology of the thermophilic bacterium Geobacillus thermoleovorans T80, we now report the first proteomic analysis of the insoluble subproteome of this isolate. A combination of shotgun and multidimensional methodologies was used, and a total of 8628 peptides were initially identified by automated MS/MS identification software. Curation of these peptides led to a final list of 184 positive protein identifications. The proteins from this insoluble subproteome were functionally classified, and physicochemical characterization was carried out. Of 15 hypothetical conserved proteins identified, we have assigned function to all but four. A total of 31 proteins were predicted to possess signal peptides. In silico investigation of these proteins allowed us to identify four of the five bacterial classes of signal peptide, namely, (i) twin-arginine translocation; (ii) Sec-type; (iii) lipoprotein; and (iv) ABC transport. In addition, a number of proteins were identified that are known to be involved in the transport of compatible solutes, which are important in microbial stress responses.

10.
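To obtain the secreted proteins of Azorhizobium caulinodans ORS571 and gain a deeper understanding of its symbiotic nitrogen fixation, this study used SignalP, TMHMM, PSORTb, TargetP, LipoP, TatP, and SecretomeP to analyze and predict all 4717 protein sequences of this bacterium. In total, 653 secreted proteins were identified: 54 with secretory-type signal peptides, 1 with an RR-motif signal peptide, 2 with lipoprotein signal peptides, and 596 non-classical secreted proteins. Secreted proteins carrying a signal peptide accounted for only 1.2% of all proteins, a lower share than in other nitrogen-fixing bacteria. Six nucleases, including endonucleases and ribonucleases, were identified among the secreted proteins. They may take part in degrading the genetic material of the host plant and interfering with host genetic metabolism, and thus play an important role during infection of the host plant. In addition, four antioxidant enzymes were identified, including superoxide dismutase, catalase, and glutathione S-transferase. They may take part in scavenging reactive oxygen species to protect nitrogenase and are important participants in the nitrogen fixation process of this bacterium.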

11.
The increase in prevalence of antimicrobial resistance makes the search for new antibiotic agents imperative. Antimicrobial peptides (AMPs) from natural resources have been recognized as suitable tools to combat antibiotic-resistant bacteria. The liver fluke Clonorchis sinensis, which lives in germ-filled environments, could be a good source of antimicrobials. Here, we report the use of a rational protocol that combines AMP predictions based on physicochemical properties and in vivo stability to discover AMP candidates from the entire genome of C. sinensis. To screen AMP candidates, in silico analyses were performed based on the physicochemical properties of known AMPs, such as length, charge, isoelectric point, and in vitro and in vivo aggregation values. To enhance in vivo stability, proteins having proteolytic cleavage sites were excluded. As a consequence, four high-activity, high-stability peptides were identified. These peptides could be potential starting materials for the development of new AMPs via structural modification and optimization. Thus, this study proposes a refined computational method to develop new AMPs and identifies four AMP candidates, which could serve as templates for further development of peptide antibiotics.
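A property-based screen of the kind described above might be sketched as follows. This is only an illustration, not the authors' protocol: the thresholds, the hydrophobic residue set, the crude charge rule, and the `forbidden_motifs` parameter are all hypothetical, and isoelectric point and aggregation predictions are left out.

```python
def net_charge(seq):
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E (His and termini ignored)."""
    return sum(seq.count(r) for r in "KR") - sum(seq.count(r) for r in "DE")


def hydrophobic_fraction(seq):
    """Fraction of residues in an assumed hydrophobic set."""
    hydrophobic = set("AILMFVWC")  # assumption, not taken from the study
    return sum(r in hydrophobic for r in seq) / len(seq)


def passes_screen(seq, min_len=10, max_len=50, min_charge=2,
                  min_hydrophobic=0.3, forbidden_motifs=()):
    """Return True if a candidate passes every illustrative filter.

    forbidden_motifs stands in for predicted proteolytic cleavage sites;
    real cleavage-site prediction would need a dedicated tool.
    """
    if not min_len <= len(seq) <= max_len:
        return False
    if net_charge(seq) < min_charge:
        return False
    if hydrophobic_fraction(seq) < min_hydrophobic:
        return False
    return not any(motif in seq for motif in forbidden_motifs)


candidates = ["KKLLKKLLKKLL", "DDEEDDEEDDEE", "KLAK"]
print([seq for seq in candidates if passes_screen(seq)])
```

Keeping each property in its own small function makes it easy to swap in a better estimator (for example a proper isoelectric point calculation) without touching the filter logic.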

12.
This study aims to design epitope-based peptides for vaccine development by targeting Glycoprotein 2 (GP2) and Viral Protein 24 (VP24) of the Ebola virus (EBOV), which, respectively, facilitate attachment and fusion of EBOV with host cells. Using various databases and tools, immune parameters of conserved sequences from GP2 and VP24 proteins of different strains of EBOV were tested to predict probable epitopes. Binding analyses of the peptides with major histocompatibility complex (MHC) class I and class II molecules, population coverage, and linear B cell epitope prediction were performed. Predicted peptides interacted with multiple MHC alleles and showed maximal population coverage for both GP2 and VP24 proteins. The predicted class I nonamers FLYDRLAST, LFLRATTEL and NYNGLLSSI were found to cover the maximum number of MHC I alleles and showed interactions with binding energies of -7.8, -8.5 and -7.7 kcal/mol, respectively. The highest scoring class II MHC binding peptides were EGAFFLYDRLASTVI and SPLWALRVILAAGIQ, with binding energies of -6.2 and -5.6 kcal/mol. Putative B cell epitopes were also found on four conserved regions in GP2 and two conserved regions in VP24. Our in silico analysis suggests that the predicted epitopes could be a better choice as a universal vaccine component against EBOV irrespective of strain and should be subjected to in vitro and in vivo analyses for further research and development.

13.
P Y Muller  E Studer  A R Miserez 《BioTechniques》2001,31(6):1306, 1308, 1310-1313
In all fields of molecular biology, researchers are increasingly challenged by experiments planned and evaluated on the basis of nucleic acid and protein sequence data generally retrieved from public databases. Despite the wide spectrum of available Web-based software tools for sequence analysis, the routine use of these tools has disadvantages, particularly because of the elaborate and heterogeneous ways of data input, output, and storage. Here we present a Visual Basic-encoded Microsoft Word Add-In, the Molecular BioComputing Suite (MBCS), available at the BioTechniques Software Library (www.BioTechniques.com). The MBCS software aims to manage and expedite a wide range of sequence analyses and manipulations using an integrated text editor environment including menu-guided commands. Its independence of sequence formats enables MBCS to be used as a pivotal application between other software tools for sequence analysis, manipulation, annotation, and editing.

14.
The soluble and peripheral proteins in the thylakoids of pea were systematically analyzed by using two-dimensional electrophoresis, mass spectrometry, and N-terminal Edman sequencing, followed by database searching. After correcting to eliminate possible isoforms and post-translational modifications, we estimated that there are at least 200 to 230 different lumenal and peripheral proteins. Sixty-one proteins were identified; for 33 of these proteins, a clear function or functional domain could be identified, whereas for 10 proteins, no function could be assigned. For 18 proteins, no expressed sequence tag or full-length gene could be identified in the databases, despite experimental determination of a significant amount of amino acid sequence. Nine previously unidentified proteins with lumenal transit peptides are presented along with their full-length genes; seven of these proteins possess the twin arginine motif that is characteristic for substrates of the TAT pathway. Logoplots were used to provide a detailed analysis of the lumenal targeting signals, and all nuclear-encoded proteins identified on the two-dimensional gels were used to test predictions for chloroplast localization and transit peptides made by the software programs ChloroP, PSORT, and SignalP. A combination of these three programs was found to provide a useful tool for evaluating chloroplast localization and transit peptides and also could reveal possible alternative processing sites and dual targeting. The potential of proteomics for plant biology and homology-based searching with mass spectrometry data is discussed.

15.
Systematic investigation of cellular processes by mass spectrometric detection of peptides obtained from protein digestion or directly from immuno-purification can be a powerful tool when used appropriately. The true sequence of these peptides is defined by the interpretation of spectral data using a variety of available algorithms. However, peptide match algorithm scoring is typically based on some, but not all, of the mechanisms of peptide fragmentation. Although algorithm rules for soft ionization techniques generally fit tryptic peptides very well, manual validation of spectra is often required for endogenous peptides such as those from MHC class I molecules, where traditional trypsin digest techniques are not used. This study summarizes data mining and manual validation of hundreds of peptide sequences from MHC class I molecules in publicly available data files. We describe several important features that improve and quantify manual validation of these endogenous peptides after automated algorithm searching. Important fragmentation patterns are discussed for the studied MHC class I peptides. These findings lead to practical rules that are helpful when performing manual validation. Furthermore, these observations may be useful for improving current peptide search algorithms or developing novel software tools.

16.
Gay S  Binz PA  Hochstrasser DF  Appel RD 《Proteomics》2002,2(10):1374-1391
Matrix-assisted laser desorption/ionization-time of flight mass spectrometry has become a valuable tool in proteomics. With the increasing acquisition rate of mass spectrometers, one of the major issues is the development of accurate, efficient and automatic peptide mass fingerprinting (PMF) identification tools. Current tools are mostly based on counting the number of experimental peptide masses that match theoretical masses. Almost all of them use additional criteria such as isoelectric point, molecular weight, PTMs, taxonomy or enzymatic cleavage rules to enhance prediction performance. However, these identification tools seldom use peak intensities as a parameter, as there is currently no model predicting the intensities from the physicochemical properties of peptides. In this work, we used standard data mining methods such as classification and regression to find correlations between peak intensities and the properties of the peptides composing a PMF spectrum. These methods were applied to a dataset comprising a series of PMF experiments involving 157 proteins. We found that the C4.5 method gave the most informative results for the classification task (prediction of the presence or absence of a peptide in a spectrum) and M5' for the regression task (prediction of the normalized intensity of a peptide peak). The C4.5 result correctly classified 88% of the theoretical peaks, whereas the M5' peak intensities had a correlation coefficient of 0.6743 with the experimental peak intensities. These methods enabled us to obtain decision and model trees that can be directly used for prediction and identification of PMF results. This work lays the foundations of a method to analyze factors influencing the peak intensity of PMF spectra. A simple extension of this analysis using a larger dataset could improve the accuracy of the results. Additional peptide characteristics or even PMF experimental parameters can also be taken into account in the data mining process to analyze their influence on the peak intensity. Furthermore, this data mining approach can certainly be extended to the tandem mass spectrometry domain or other mass spectrometry derived methods.
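The match-counting score that current PMF tools rely on can be sketched in a few lines. This is a simplified illustration, not any specific tool: it assumes trypsin with no missed cleavages, neutral monoisotopic masses rather than [M+H]+ values, an arbitrary mass tolerance, and hypothetical function names and sequences.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(protein):
    """Cleave after K or R unless the next residue is P; no missed cleavages."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        is_last = i == len(protein) - 1
        if is_last or (residue in "KR" and protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[r] for r in peptide) + WATER


def count_matches(observed_masses, protein, tol=0.2):
    """Number of observed masses within tol Da of any theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical)
               for m in observed_masses)


protein = "AAKRPGGKLL"  # hypothetical sequence
observed = [peptide_mass("AAK"), 500.0]
print(tryptic_peptides(protein), count_matches(observed, protein))
```

A score of this kind ignores peak intensity entirely, which is the gap the classification and regression models in the abstract are meant to address.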

17.
Budisa N  Pal PP 《Biological chemistry》2004,385(10):893-904
Fluorescence methods are now well-established and powerful tools to study biological macromolecules. The canonical amino acid tryptophan (Trp), encoded by a single UGG triplet, is the main reporter of intrinsic fluorescence properties of most natural proteins and peptides and is thus an attractive target for tailoring their spectral properties. Recent advances in research have provided substantial evidence that the natural protein translational machinery can be genetically reprogrammed to introduce a large number of non-coded (i.e. noncanonical) Trp analogues and surrogates into various proteins. Especially attractive targets for such an engineering approach are fluorescent proteins in which the chromophore is formed post-translationally from an amino acid sequence, like the green fluorescent protein from Aequorea victoria. With the currently available translationally active fluoro-, hydroxy-, amino-, halogen-, and chalcogen-containing Trp analogues and surrogates, the traditional methods for protein engineering and design can be supplemented or even fully replaced by these novel approaches. Future research will provide a further increase in the number of Trp-like amino acids that are available for redesign (by engineering of the genetic code) of native Trp residues and enable novel strategies to generate proteins with tailored spectral properties.

18.
We have developed a software package named PEAS to facilitate analyses of large data sets of single nucleotide polymorphisms (SNPs) for population genetics and molecular phylogenetics studies. PEAS reads SNP data in various formats as input and is versatile in data formatting; using PEAS, it is easy to create input files for many popular packages, such as STRUCTURE, frappe, Arlequin, Haploview, LDhat, PLINK, EIGENSOFT, PHASE, fastPHASE, MEGA and PHYLIP. In addition, PEAS fills several analysis gaps in currently available computer programs for population genetics and molecular phylogenetics. Notably, (i) it calculates genetic distance matrices with bootstrapping for both individuals and populations from genome-wide high-density SNP data, and the output can be passed directly to the MEGA and PHYLIP programs for further processing; (ii) it calculates genetic distances from STRUCTURE output and generates a MEGA file to reconstruct component trees; (iii) it provides tools to conduct haplotype sharing analysis for phylogenetic studies based on high-density SNP data. To our knowledge, these analyses are not available in any other computer program. PEAS for Windows is freely available for academic users from http://www.picb.ac.cn/~xushua/index.files/Download_PEAS.htm.

19.
Novel approaches for the qualitative and quantitative proteomics analysis by nanoscale LC-MS applied to the study of protein expression response in depleted and undepleted serum of Gaucher patients undergoing enzyme replacement therapy are presented. Particular emphasis is given to the method reproducibility of these LC-MS experiments without the use of isotopic labels. The level of chitotriosidase, an established Gaucher biomarker, was assessed by means of an absolute concentration determination technique for alternate scanning LC-MS generated data. Disease associated proteins, including fibrinogens, complement cascade proteins, and members of the high density lipoprotein serum content, were recognized by various clustering methods and sorting and intensity profile grouping of identified peptides. Condition-unique LC-MS protein signatures could be generated utilizing the measured serum protein concentrations and are presented for all investigated conditions. The clustering results of the study were also used as input for gene ontology searches to determine the correlation between the molecular functions of the identified peptides and proteins.

20.
The study of the protein-protein interactions (PPIs) of unique ORFs is a strategy for deciphering the biological roles of unique ORFs of interest. For uniform reference, we define unique ORFs as those for which no matching protein is found after a PDB-BLAST search with default parameters. The uniqueness of the ORFs generally precludes the straightforward use of structure-based approaches in the design of experiments to explore PPIs. Many open-source bioinformatics tools, from the commonly used to the relatively esoteric, have been built and validated to perform analyses and/or predictions of various sorts on proteins. How can these available tools be combined into a protocol that helps the non-expert bioinformaticist researcher to design experiments to explore the PPIs of their unique ORF? Here we define a pragmatic protocol based on accessibility of software to achieve this, and we make it concrete by applying it to two proteins, the ImuB and ImuA' proteins from Mycobacterium tuberculosis. The protocol is pragmatic in that decisions are made largely based on the availability of easy-to-use freeware. We define the following basic and user-friendly software pathway to build testable PPI hypotheses for a query protein sequence: PSI-PRED → MUSTER → metaPPISP → ASAView and ConSurf. Where possible, other analytical and/or predictive tools may be included. Our protocol combines the software predictions and analyses with general bioinformatics principles to arrive at consensus, prioritised and testable PPI hypotheses.
