Abstract: | Proteomics based on tandem mass spectrometry is a powerful tool for identifying novel biomarkers and drug targets. Until recently, a major bottleneck in high-throughput proteomics was that the computational techniques needed to reliably identify proteins from proteomic data lagged behind the ability to collect the immense quantities of data generated. This is no longer the case: fully automated pipelines for peptide and protein identification now exist and are accessible both publicly and privately. Such pipelines can automatically and rapidly generate high-confidence protein identifications from large datasets, in a searchable format covering multiple experimental runs. The main challenge for the community now is to make use of these existing resources, taking full advantage of the pooling of information, so that the next barrier in our understanding of biology may be broken. There are currently two pipelines in the public domain that offer this potential: PeptideAtlas and the Genome Annotating Proteomic Pipeline. This review introduces their features in the context of high-throughput proteomics and gives an indication of their usefulness and usability through a side-by-side comparison of the results obtained when processing a set of human plasma samples. |