1.
Background
Many studies have provided algorithms or methods for assessing statistical significance in quantitative proteomics when multiple replicates of a protein sample and of an LC/MS analysis are available. However, confidence is still lacking in using datasets without protein sample replicates for biological interpretation. Although a fold-change threshold is conventionally used when there are no sample replicates, it does not provide an assessment of statistical significance such as a false discovery rate (FDR), which is an important indicator of the reliability of identifying differentially expressed proteins. In this work, we investigate whether differentially expressed proteins can be detected with statistical significance from a pair of unlabeled protein samples without replicates and with only duplicate LC/MS injections per sample. An FDR is used to gauge the statistical significance of the differentially expressed proteins.
2.
False discovery rate (FDR) analyses of protein and peptide identification results using decoy database searching conventionally report aggregate or global FDRs for a whole set of identifications, which are often not very informative about the error rates of individual members in the set. We describe a nonlinear curve fitting method for calculating the local FDR, which estimates the chance that an individual protein (or peptide) is incorrect, and present a simple tool that implements this analysis. The goal of this method is to offer a simple extension to the now commonplace decoy database searching, providing additional valuable information.
3.
fdrtool: a versatile R package for estimating local and tail area-based false discovery rates (cited by: 1 total; 0 self-citations; 1 by others)
Strimmer K 《Bioinformatics (Oxford, England)》2008,24(12):1461-1462
False discovery rate (FDR) methodologies are essential in the study of high-dimensional genomic and proteomic data. The R package 'fdrtool' facilitates such analyses by offering a comprehensive set of procedures for FDR estimation. Its distinctive features include: (i) many different types of test statistics are allowed as input data, such as P-values, z-scores, correlations and t-scores; (ii) simultaneously, both local FDR and tail area-based FDR values are estimated for all test statistics and (iii) empirical null models are fit where possible, thereby taking account of potential over- or underdispersion of the theoretical null. In addition, 'fdrtool' provides readily interpretable graphical output, and can be applied to very large scale (in the order of millions of hypotheses) multiple testing problems. Consequently, 'fdrtool' implements a flexible FDR estimation scheme that is unified across different test statistics and variants of FDR. AVAILABILITY: The program is freely available from the Comprehensive R Archive Network (http://cran.r-project.org/) under the terms of the GNU General Public License (version 3 or later). CONTACT: strimmer@uni-leipzig.de.
4.
Development of statistical methods for assessing the significance of peptide assignments to tandem mass spectra obtained using database searching remains an important problem. In the past several years, several different approaches have emerged, including the concept of expectation values, target-decoy strategy, and the probability mixture modeling approach of PeptideProphet. In this work, we provide a background on statistical significance analysis in the field of mass spectrometry-based proteomics, and present our perspective on the current and future developments in this area.
5.
Victor B Gabriël S Kanobana K Mostovenko E Polman K Dorny P Deelder AM Palmblad M 《Journal of proteome research》2012,11(3):1991-1995
Tandem mass spectrometry is commonly used to identify peptides, typically by comparing their product ion spectra with those predicted from a protein sequence database and scoring these matches. The most reported quality metric for a set of peptide identifications is the false discovery rate (FDR), the fraction of expected false identifications in the set. This metric has so far only been used for completely sequenced organisms or known protein mixtures. We have investigated whether FDR estimations are also applicable in the case of partially sequenced organisms, where many high-quality spectra fail to identify the correct peptides because the latter are not present in the searched sequence database. Using real data from human plasma and simulated partial sequence databases derived from two complete human sequence databases with different levels of redundancy, we could demonstrate that the mixture model approach in PeptideProphet is robust for partial databases, particularly if used in combination with decoy sequences. We therefore recommend using this method when estimating the FDR and reporting peptide identifications from incompletely sequenced organisms.
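The abstract above defines the FDR as the expected fraction of false identifications in an accepted set and mentions decoy sequences. As a rough illustration only, and not the PeptideProphet mixture model the authors recommend, the simplest decoy-based estimate divides the number of decoy hits above a score threshold by the number of target hits above it. The function name `decoy_fdr`, the score values, and the assumption of equally sized target and decoy databases are all hypothetical choices made for this sketch.

```python
def decoy_fdr(scores, is_decoy, threshold):
    """Estimate the FDR among target hits scoring >= threshold.

    Assumes (hypothetically) that target and decoy databases are the same
    size, so each accepted decoy hit stands in for one false target hit.
    """
    targets = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and not d)
    decoys = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and d)
    if targets == 0:
        return 0.0  # nothing accepted, so there is no error rate to report
    return min(1.0, decoys / targets)  # cap at 1, since it estimates a proportion


# Made-up search scores; True marks a match to a decoy sequence.
scores = [9.1, 8.7, 8.2, 7.5, 6.9, 6.1, 5.8, 5.0]
is_decoy = [False, False, False, False, True, False, True, False]
print(decoy_fdr(scores, is_decoy, threshold=6.0))
```

With these made-up values, five target hits and one decoy hit pass the threshold of 6.0, so the estimate is 1/5. Raising the threshold trades identifications for a lower estimated error rate.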
6.
A hypothesis was formed that it would be possible to isolate an adequate amount of protein from a patient, having normal renal function, to identify biological markers of a particular disease state using a variety of proteomics techniques. To support this hypothesis, three samples of urine were collected from a volunteer: first when healthy, later when experiencing acute inflammation due to a pilonidal abscess, and again later still after successful recovery from the condition. The urine from these samples was processed by solid-phase extraction to concentrate and desalt the endogenous proteins and peptides. The proteins and peptides from these urine samples were analyzed in three different experiments: (1) traditional two-dimensional gel electrophoresis followed by proteolysis and mass spectrometric identification of various protein spots, (2) whole mixture proteolysis followed by one-dimensional packed capillary liquid chromatography and tandem mass spectrometry, (3) whole mixture proteolysis followed by two-dimensional capillary liquid chromatography and tandem mass spectrometry. In all three cases, a set of proteins was identified representing putative biomarkers. Each of these proteins was then found to have been previously linked in the scientific literature to inflammation. One acute phase reactant in particular, orosomucoid, was readily observed in all three experiments to dramatically increase in abundance, thereby supporting the hypothesis.
7.
Higdon R Reiter L Hather G Haynes W Kolker N Stewart E Bauman AT Picotti P Schmidt A van Belle G Aebersold R Kolker E 《Journal of Proteomics》2011,75(1):116-121
In high-throughput mass spectrometry proteomics, peptides and proteins are not simply identified as present or not present in a sample, rather the identifications are associated with differing levels of confidence. The false discovery rate (FDR) has emerged as an accepted means for measuring the confidence associated with identifications. We have developed the Systematic Protein Investigative Research Environment (SPIRE) for the purpose of integrating the best available proteomics methods. Two successful approaches to estimating the FDR for MS protein identifications are the MAYU and our current SPIRE methods. We present here a method to combine these two approaches to estimating the FDR for MS protein identifications into an integrated protein model (IPM). We illustrate the high quality performance of this IPM approach through testing on two large publicly available proteomics datasets. MAYU and SPIRE show remarkable consistency in identifying proteins in these datasets. Still, IPM results in a more robust FDR estimation approach and additional identifications, particularly among low abundance proteins. IPM is now implemented as a part of the SPIRE system.
8.
A variety of methods have been described in the literature for assigning statistical significance to peptides identified via tandem mass spectrometry. Here, we explain how two types of scores, the q-value and the posterior error probability, are related and complementary to one another.
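The abstract above does not spell out the relation between the two scores, so the following is only a sketch of one commonly used connection, not necessarily the authors' own presentation. The posterior error probability (PEP) is the chance that one particular identification is wrong, while the q-value describes the error rate of the whole set accepted down to that identification. If the PEPs are taken as given, the expected FDR of the best k identifications is the mean of their PEPs, and that running mean serves as a q-value. The function name `q_values_from_peps` and the input values are hypothetical.

```python
def q_values_from_peps(peps):
    """Convert posterior error probabilities into q-value estimates.

    Identifications are ranked from most to least confident (ascending PEP);
    the q-value of the k-th ranked item is the mean PEP of the top k items.
    Results are returned in the original input order.
    """
    order = sorted(range(len(peps)), key=lambda i: peps[i])
    q = [0.0] * len(peps)
    running_total = 0.0
    for rank, i in enumerate(order, start=1):
        running_total += peps[i]
        q[i] = running_total / rank  # mean PEP of everything accepted so far
    return q


# Made-up PEPs for four peptide identifications.
print(q_values_from_peps([0.5, 0.01, 0.1, 0.03]))
```

Because the PEPs are processed in ascending order, the running mean never decreases, so no separate monotonicity correction is needed in this sketch. An identification's q-value is never larger than its own PEP, which is one way the two scores complement each other: the PEP judges a single item, the q-value a whole list.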
9.
SUMMARY: twilight is a Bioconductor compatible package for analysing the statistical significance of differentially expressed genes. It is based on the concept of the local false discovery rate (FDR), a generalization of the frequently used global FDR. twilight implements the heuristic search algorithm for estimating the local FDR introduced in our earlier work. In addition to the raw significance measures, it produces diagnostic plots, which provide insight into the extent of differential expression across genes. AVAILABILITY: http://www.bioconductor.org CONTACT: stefanie.scheid@molgen.mpg.de SUPPLEMENTARY INFORMATION: Please visit our software webpage on http://compdiag.molgen.mpg.de/software.
10.
A mixture model for estimating the local false discovery rate in DNA microarray analysis (cited by: 3 total; 0 self-citations; 3 by others)
MOTIVATION: Statistical methods based on controlling the false discovery rate (FDR) or positive false discovery rate (pFDR) are now well established for identifying differentially expressed genes in DNA microarray studies. Several authors have recently raised the important issue that FDR or pFDR may give misleading inference when specific genes are of interest because they average the genes under consideration with genes that show stronger evidence for differential expression. The paper proposes a flexible and robust mixture model for estimating the local FDR, which quantifies how plausible it is that each specific gene is differentially expressed. RESULTS: We develop a special mixture model tailored to multiple testing by requiring the P-value distribution for the differentially expressed genes to be stochastically smaller than the P-value distribution for the non-differentially expressed genes. A smoothing mechanism is built in. The proposed model gives robust estimation of local FDR for any reasonable underlying P-value distributions. It also provides a single framework for estimating the proportion of differentially expressed genes, pFDR, negative predictive values, sensitivity and specificity. A cervical cancer study shows that the local FDR gives a more specific and relevant quantification of the evidence for differential expression, one that can be substantially different from pFDR. AVAILABILITY: An R function implementing the proposed model is available at http://www.geocities.com/jg_liao/software
11.
Urine is an important source of biomarkers. This article reviews current advances, major challenges, and future prospects
in the field of urinary proteomics. Because the practical clinical problem is to distinguish diseases with similar symptoms,
merely comparing samples from patients of a particular disease to those of healthy individuals is inadequate for finding biomarkers
with sufficient diagnostic power. In addition, the variation of expression levels of urinary proteins among healthy individuals
and individuals under different physiological conditions adds to the difficulty in identifying biomarkers. We propose that
establishing the natural variation in urinary protein expression among a healthy population can serve as a reference to help
identify protein abundance changes that are caused by disease, not by individual variations or physiological changes. We also
discuss that comparing protein expression levels between urine and plasma may reveal the physiological function of the kidney
and that may facilitate biomarker discovery. Finally, we propose that establishing a data-sharing platform for data collection
and integrating results from all urinary biomarker studies will help promote the development of urinary proteomics.
12.
13.
Godovac-Zimmermann J Mulvey C Konstantoulaki M Sainsbury R Brown LR 《Expert review of proteomics》2007,4(2):161-173
Proteomics has lacked adequate methods for handling the complexity (hundreds of thousands of different proteins) and range of protein concentrations (≥10^6) of eukaryotic proteomes. New multiphoton-detection methods for ultrasensitive detection of proteins produce 10,000-fold gains in sensitivity and allow highly quantitative, linear detection of 50 zmol (30,000 molecules) to 500 fmol of proteins in complex samples. The potential of multiphoton detection in top-down proteomics analyses is illustrated with applications in monitoring proteomes in very small numbers of cells, in identifying and monitoring complex functional isoforms of cancer-related proteins, and in super-sensitive immunoassays of serum proteins for high-performance detection of cancer.
14.
The identification and clinical use of more sensitive and specific biomarkers in the field of solid organ transplantation is an urgent need in medicine. Solid organ transplantation has seen improvements in the short-term survival of transplanted organs due to recent advancements in immunosuppressive therapy. However, the currently available methods of allograft monitoring are not optimal. Recent advancements in assaying methods for biomolecules such as genes, mRNA and proteins have helped to identify surrogate biomarkers that can be used to monitor the transplanted organ. These high-throughput 'omic' methods can help researchers to significantly speed up the identification and validation steps, which are crucial for biomarker discovery efforts. Still, progress towards identifying more sensitive and specific biomarkers remains a great deal slower than expected. In this article, we evaluate the current status of biomarker discovery using proteomics tools in different solid organ transplants in recent years. This article summarizes recent reports and the current status, along with the hurdles to efficient discovery of protein biomarkers using proteomics approaches. Finally, we touch upon personalized medicine as a future direction for better management of transplanted organs, and provide what we think could be a recipe for success in this field.
15.
《Expert review of proteomics》2013,10(2):161-173
Proteomics has lacked adequate methods for handling the complexity (hundreds of thousands of different proteins) and range of protein concentrations (≥10^6) of eukaryotic proteomes. New multiphoton-detection methods for ultrasensitive detection of proteins produce 10,000-fold gains in sensitivity and allow highly quantitative, linear detection of 50 zmol (30,000 molecules) to 500 fmol of proteins in complex samples. The potential of multiphoton detection in top-down proteomics analyses is illustrated with applications in monitoring proteomes in very small numbers of cells, in identifying and monitoring complex functional isoforms of cancer-related proteins, and in super-sensitive immunoassays of serum proteins for high-performance detection of cancer.
16.
OBJECTIVES: To develop a method for designing studies to find disease mutations that can achieve a set of goals with respect to proportions of false and true discoveries with the minimum amount of genotyping. METHODS: Derivation of an analytical framework supplemented with simulation techniques. The approach is illustrated for a fine mapping study and a whole-genome linkage disequilibrium scan. RESULTS: The use of multiple stages where earlier stages are characterized by very high false discovery rates (FDR) followed by an abrupt change to the required FDR in the final stage results in a 50-75% reduction in genotyping. The proportion of true discoveries is a much more important determinant of the genotyping burden than the FDR. Neither sample size nor controlling the false discoveries will present major problems in whole-genome LD scans but the amount of genotyping will be extremely large even if the study is completely designed to minimize genotyping. CONCLUSIONS: The proposed statistical framework presents a simple and flexible approach to determine the design parameters (e.g. sample size, p values at which tests need to be performed at each stage) that minimize the genotyping burden given a set of goals for the percentage of true and false discoveries.
17.
Robust estimation of the false discovery rate (cited by: 2 total; 0 self-citations; 2 by others)
MOTIVATION: Presently available methods that use p-values to estimate or control the false discovery rate (FDR) implicitly assume that p-values are continuously distributed and based on two-sided tests. Therefore, it is difficult to reliably estimate the FDR when p-values are discrete or based on one-sided tests. RESULTS: A simple and robust method to estimate the FDR is proposed. The proposed method does not rely on implicit assumptions that tests are two-sided or yield continuously distributed p-values. The proposed method is proven to be conservative and to have desirable large-sample properties. In addition, the proposed method was among the best performers across a series of 'real data simulations' comparing the performance of five currently available methods. AVAILABILITY: Libraries of S-plus and R routines to implement the method are freely available from www.stjuderesearch.org/depts/biostats.
18.
L H Oliver R S Poulsen G T Toussaint 《The journal of histochemistry and cytochemistry》1977,25(7):696-701
The performance of a cell recognition system on unknown data is often estimated in terms of its error rates on a test set. This paper investigates methods for producing estimates of error rates in cervical cell classification. Classification performance curves calculated using these methods are given for several classification schemes used to classify 1500 cervical cells.
19.
A primary component of next-generation sequencing analysis is to align short reads to a reference genome, with each read aligned
independently. However, reads that observe the same non-reference DNA sequence are highly correlated and can be used to better
model the true variation in the target genome. A novel short-read micro re-aligner, SRMA, that leverages this correlation
to better resolve a consensus of the underlying DNA sequence of the targeted genome is described here.
20.
The use of proteomics in the discovery of serum biomarkers from patients with severe acute respiratory syndrome (cited by: 1 total; 0 self-citations; 1 by others)
Ren Y He QY Fan J Jones B Zhou Y Xie Y Cheung CY Wu A Chiu JF Peiris JS Tam PK 《Proteomics》2004,4(11):3477-3484
Severe acute respiratory syndrome (SARS) is a new infectious disease with a global impact. Understanding its pathogenesis and developing specific diagnostic methods for its early diagnosis are crucial for the effective management and control of this disease. By using proteomic technology, truncated forms of alpha(1)-antitrypsin (TF-alpha(1)-AT) were found to increase significantly and consistently in sera of SARS patients compared to control subjects. The result showed a sensitivity of 100% for SARS patients and a specificity of 92.8% for controls. Furthermore, the levels of these proteins significantly correlated with certain clinico-pathological parameters. The dramatic increase in TF-alpha(1)-AT may be the result of degradation of alpha(1)-AT. As alpha(1)-AT plays an important role in the protection of lung function, its degradation may be an important factor in the pathogenesis of SARS. These findings indicate that increased TF-alpha(1)-AT may be therapeutically relevant, and may also be a useful biological marker for the diagnosis of SARS.