Similar literature
1.
Colland F, Daviet L. Biochimie, 2004, 86(9-10): 625-632
Functional proteomics is a promising technique for the rational identification of novel therapeutic targets by elucidation of the function of newly identified proteins in disease-relevant cellular pathways. Of the recently described high-throughput approaches for analyzing protein-protein interactions, the yeast two-hybrid (Y2H) system has turned out to be one of the most suitable for genome-wide analysis. However, this system presents a challenging technical problem: the high prevalence of false positives and false negatives in datasets due to intrinsic limitations of the technology and the use of a high-throughput, genetic assay. We discuss here the different experimental strategies applied to Y2H assays, their general limitations and advantages. We also address the issue of the contribution of protein interaction mapping to functional biology, especially when combined with complementary genomic and proteomic analyses. Finally, we illustrate how the combination of protein interaction maps with relevant functional assays can provide biological support to large-scale protein interaction datasets and contribute to the identification and validation of potential therapeutic targets.

2.
Proteomics has rapidly become an important tool for life science research, allowing the integrated analysis of global protein expression from a single experiment. To accommodate the complexity and dynamic nature of any proteome, researchers must use a combination of disparate protein biochemistry techniques, often a highly involved and time-consuming process. Whilst highly sophisticated, individual technologies for each step in studying a proteome are available, true high-throughput proteomics that provides a high degree of reproducibility and sensitivity has been difficult to achieve. The development of high-throughput proteomic platforms, encompassing all aspects of proteome analysis and integrated with genomics and bioinformatics technology, therefore represents a crucial step for the advancement of proteomics research. ProteomIQ (Proteome Systems) is the first fully integrated, start-to-finish proteomics platform to enter the market. Sample preparation and tracking, centralized data acquisition and instrument control, and direct interfacing with genomics and bioinformatics databases are combined into a single suite of integrated hardware and software tools, facilitating high reproducibility and rapid turnaround times. This review will highlight some features of ProteomIQ, with particular emphasis on the analysis of proteins separated by 2D polyacrylamide gel electrophoresis.

4.
One of the major challenges for large-scale proteomics research is the quality evaluation of results. Protein identification from complex biological samples or experimental setups is often a manual and subjective task that lacks rigorous statistical evaluation. Such manual evaluation is not feasible for high-throughput proteomic experiments, which result in large datasets of thousands of peptides and proteins and their corresponding mass spectra. To improve the quality, reliability and comparability of scientific results, an estimation of the rate of erroneously identified proteins is advisable. Moreover, scientific journals increasingly stipulate that articles containing considerable MS data should be subject to stringent statistical evaluation. We present a newly developed, easy-to-use software tool that enables quality evaluation by generating composite target-decoy databases usable with all relevant protein search engines. Used in conjunction with relevant statistical quality criteria, this tool allows even inexperienced users (e.g. laboratory staff, researchers without programming knowledge) to reliably identify peptides and proteins of high quality. Different strategies for building decoy databases are implemented, and the resulting databases are characterized and compared. The quality of protein identification in high-throughput proteomics is usually measured by the false positive rate (FPR), but it is shown that the false discovery rate (FDR) delivers a more meaningful, robust and comparable value.
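
The target-decoy idea in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool the authors describe: the abstract does not name its decoy strategies or its FDR formula, so sequence reversal and the decoys-over-targets estimate are simply common conventions chosen here, and the names `DECOY_PREFIX`, `build_target_decoy` and `estimate_fdr` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical tag used to mark decoy entries; not taken from the abstract.
DECOY_PREFIX = "DECOY_"


def build_target_decoy(targets: Dict[str, str]) -> Dict[str, str]:
    """Return a composite database: each target plus its reversed sequence.

    Reversal is one common decoy strategy, assumed here for illustration.
    """
    composite = dict(targets)
    for name, seq in targets.items():
        composite[DECOY_PREFIX + name] = seq[::-1]
    return composite


def estimate_fdr(hits: List[Tuple[str, float]], threshold: float) -> float:
    """Estimate the FDR among hits scoring >= threshold.

    Uses the ratio of accepted decoy hits to accepted target hits,
    an assumed convention for searches against a composite database.
    """
    accepted = [name for name, score in hits if score >= threshold]
    decoys = sum(1 for name in accepted if name.startswith(DECOY_PREFIX))
    n_targets = len(accepted) - decoys
    return decoys / n_targets if n_targets else 0.0
```

In use, a search engine would score spectra against the composite database, and `estimate_fdr` would then be evaluated at several score thresholds to pick one that meets a chosen FDR level.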

5.
The emerging field of bioinformatics in proteomics is introducing new algorithms to handle large and heterogeneous datasets and to improve the knowledge-discovery process. Management systems, software construction and application, database population and leverage, as well as computed prediction, have made bioinformatics a valuable tool for basic research. Human reproduction is one of many fields that proteomics has studied extensively over the last decade, accumulating complex experimental data at a rate far exceeding the ability to assimilate them. Transforming the rapidly proliferating quantities of experimental information into a usable form that facilitates their analysis is a challenging task. To this end, bioinformatics, an essential part of proteomics research, aims to put these data into a form that is easier to manipulate, handle and understand, and thereby to expand existing knowledge.

6.
As an emerging field, MS-based proteomics still requires software tools for efficiently storing and accessing experimental data. In this work, we focus on the management of LC–MS data, which are typically made available in standard XML-based portable formats. The structures that are currently employed to manage these data can be highly inefficient, especially when dealing with high-throughput profile data. LC–MS datasets are usually accessed through 2D range queries. Optimizing this type of operation could dramatically reduce the complexity of data analysis. We propose a novel data structure for LC–MS datasets, called mzRTree, which embodies a scalable index based on the R-tree data structure. mzRTree can be efficiently created from the XML-based data formats and it is suitable for handling very large datasets. We experimentally show that, on all range queries, mzRTree outperforms other known structures used for LC–MS data, even on those queries these structures are optimized for. Besides, mzRTree is also more space efficient. As a result, mzRTree reduces data analysis computational costs for very large profile datasets.
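
To make the 2D range query mentioned in the abstract above concrete, here is a small Python sketch. It is deliberately not an R-tree and not mzRTree: it is a simpler baseline that sorts peaks by retention time, narrows the retention-time window by binary search, and then filters by m/z. The class name `SortedPeakIndex` and the `(retention_time, mz, intensity)` tuple layout are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

# Assumed peak layout: (retention_time, mz, intensity)
Peak = Tuple[float, float, float]


class SortedPeakIndex:
    """Baseline index: peaks sorted by retention time, then filtered by m/z."""

    def __init__(self, peaks: List[Peak]) -> None:
        # Sorting tuples orders them by retention time first.
        self._peaks = sorted(peaks)
        self._rts = [p[0] for p in self._peaks]

    def range_query(
        self, rt_min: float, rt_max: float, mz_min: float, mz_max: float
    ) -> List[Peak]:
        """Return all peaks inside the closed rectangle [rt_min, rt_max] x [mz_min, mz_max]."""
        lo = bisect_left(self._rts, rt_min)   # first peak with rt >= rt_min
        hi = bisect_right(self._rts, rt_max)  # one past the last peak with rt <= rt_max
        return [p for p in self._peaks[lo:hi] if mz_min <= p[1] <= mz_max]
```

This baseline is fast when the retention-time window is narrow but degrades to a linear scan when the window is wide and the m/z window is narrow, which is the kind of asymmetry a two-dimensional index such as an R-tree is meant to avoid.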

7.
KEGGanim: pathway animations for high-throughput data (total citations: 1; self-citations: 0; citations by others: 1)
MOTIVATION: Gene expression analysis with microarrays has become one of the most widely used high-throughput methods for gathering genome-wide functional data. Emerging -omics fields such as proteomics and interactomics introduce new information sources. With the rise of systems biology, researchers need to concentrate on entire complex pathways that guide individual genes and related processes. Bioinformatics methods are needed to link the existing knowledge about pathways with the growing amounts of experimental data. RESULTS: We present KEGGanim, a novel web-based tool for visualizing experimental data in biological pathways. KEGGanim produces animations and images of KEGG pathways using public or user-uploaded high-throughput data. Pathway members are coloured according to experimental measurements, and animated over experimental conditions. KEGGanim visualization highlights dynamic changes over conditions and allows the user to observe important modules and key genes that influence the pathway. The simple user interface of KEGGanim provides options for filtering genes and experimental conditions. KEGGanim may be used with public or private data for 14 organisms with a large collection of public microarray data readily available. Most common gene and protein identifiers and microarray probesets are accepted for visualization input. AVAILABILITY: http://biit.cs.ut.ee/KEGGanim/.

8.
Proteomics based on tandem mass spectrometry is a powerful tool for identifying novel biomarkers and drug targets. Previously, a major bottleneck in high-throughput proteomics has been that the computational techniques needed to reliably identify proteins from proteomic data lagged behind the ability to collect the immense quantity of data generated. This is no longer the case, as fully automated pipelines for peptide and protein identification exist, and these are publicly and privately accessible. Such pipelines can automatically and rapidly generate high-confidence protein identifications from large datasets in a searchable format covering multiple experimental runs. However, the main challenge for the community now is to use these resources as they are, by taking full advantage of the pooling of information, so that the next barrier in our understanding of biology may be broken. There are currently two pipelines in the public domain that provide such potential: PeptideAtlas and the Genome Annotating Proteomic Pipeline. This review will introduce their features in the context of high-throughput proteomics, and provide indicative results as to their usefulness and usability through a side-by-side comparison of results obtained when processing a set of human plasma samples.

10.
11.
An important component of proteomic research is the high-throughput discovery of novel proteins and protein-protein interactions that control molecular events that contribute to critical cellular functions and human disease. The interactions of proteins are essential for cellular functions. Identifying perturbation of normal cellular protein interactions is vital for understanding the disease process and intervening to control the disease. A second area of proteomics research is the discovery of proteins that will serve as biomarkers for the early detection, diagnosis and drug treatment response for specific diseases. These studies have been referred to as clinical proteomics. To discover biomarkers, proteomics research employs the quantitative comparison of peptide and protein expression in body fluids and tissues from diseased individuals (case) versus normal individuals (control). Methods that couple 2D capillary liquid chromatography (LC) and tandem mass spectrometry (MS/MS) analysis have greatly facilitated this discovery science. Coupling 2D-LC/MS/MS analysis with automated genome-assisted spectra interpretation allows a direct, high-throughput and high-sensitivity identification of thousands of individual proteins from complex biological samples. The systematic comparison of experimental conditions and controls allows protein function or disease states to be modeled. This review discusses the different purification and quantification strategies that have been developed and used in combination with 2D-LC/MS/MS and computational analysis to examine regulatory protein networks and clinical samples.

12.
The iTRAQ labeling method combined with shotgun proteomic techniques represents a new dimension in multiplexed quantitation for relative protein expression measurement in different cell states. To expedite the analysis of vast amounts of spectral data, we present a fully automated software package, called Multi-Q, for multiplexed iTRAQ-based quantitation in protein profiling. Multi-Q is designed as a generic platform that can accommodate various input data formats from search engines and mass spectrometer manufacturers. To calculate peptide ratios, the software automatically processes iTRAQ's signature peaks, including peak detection, background subtraction, isotope correction, and normalization to remove systematic errors. Furthermore, Multi-Q allows users to define their own data-filtering thresholds based on semiempirical values or statistical models so that the computed results of fold changes in peptide ratios are statistically significant. This feature facilitates the use of Multi-Q with various instrument types with different dynamic ranges, which is an important aspect of iTRAQ analysis. The performance of Multi-Q is evaluated with a mixture of 10 standard proteins and human Jurkat T cells. The results are consistent with expected protein ratios and thus demonstrate the high accuracy, full automation, and high-throughput capability of Multi-Q as a large-scale quantitation proteomics tool. These features allow rapid interpretation of output from large proteomic datasets without the need for manual validation. Executable Multi-Q files are available on Windows platform at http://ms.iis.sinica.edu.tw/Multi-Q/.

14.
MOTIVATION: Enrichment tests are used in high-throughput experimentation to measure the association between gene or protein expression and membership in groups or pathways. The Fisher's exact test is commonly used. We specifically examined the associations produced by the Fisher test between protein identification by mass spectrometry discovery proteomics, and their Gene Ontology (GO) term assignments in a large yeast dataset. We found that direct application of the Fisher test is misleading in proteomics due to the bias in mass spectrometry to preferentially identify proteins based on their biochemical properties. False inference about associations can be made if this bias is not corrected. Our method adjusts Fisher tests for these biases and produces associations more directly attributable to protein expression rather than experimental bias. RESULTS: Using logistic regression, we modeled the association between protein identification and GO term assignments while adjusting for identification bias in mass spectrometry. The model accounts for five biochemical properties of peptides: (i) hydrophobicity, (ii) molecular weight, (iii) transfer energy, (iv) beta turn frequency and (v) isoelectric point. The model was fit on 181 060 peptides from 2678 proteins identified in 24 yeast proteomics datasets with a 1% false discovery rate. In analyzing the association between protein identification and their GO term assignments, we found that 25% (134 out of 544) of Fisher tests that showed significant association (q-value ≤0.05) were non-significant after adjustment using our model. Simulations generating yeast protein sets enriched for identification propensity show that unadjusted enrichment tests were biased while our approach worked well.
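
For readers unfamiliar with the unadjusted enrichment test that the abstract above takes as its starting point, the following Python sketch computes a one-sided Fisher exact p-value from the hypergeometric distribution. It illustrates only the plain test that the authors report as biased; their logistic-regression adjustment is not reproduced here. The function name `fisher_enrichment_p` and its parameter names are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher exact p-value for over-representation.

    N: proteins in the background set
    K: background proteins annotated with the GO term
    n: proteins identified in the experiment
    k: identified proteins annotated with the GO term

    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    total = comb(N, n)      # ways to draw n proteins from the background
    upper = min(n, K)       # largest possible overlap
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

A small p-value says that the overlap `k` is larger than expected if identified proteins were drawn at random from the background, which is exactly the assumption the abstract argues is violated in mass spectrometry data.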

15.
16.
The availability of the results of high-throughput analyses coming from ‘omic’ technologies has been one of the major driving forces of pathway biology. Analytical pathway biology strives to design a ‘pathway search engine’, where the input is the ‘omic’ data and the output is the list of activated or dominant pathways in a given sample. Here we describe the first attempt to design and validate such a pathway search engine using as input expression proteomics data. The engine represents a specific workflow in computational tools developed originally for mRNA analysis (BMC Bioinformatics 2006, 7 (Suppl 2), S13). Using our own datasets as well as data from recent proteomics literature we demonstrate that different dominant pathways (EGF, TGFβ, stress, and Fas pathways) can be correctly identified even from limited datasets. Pathway search engines can find application in a variety of proteomics-related fields, from fundamental molecular biology to search for novel types of disease biomarkers.

17.
Recently a number of computational approaches have been developed for the prediction of protein–protein interactions. Complete genome sequencing projects have provided the vast amount of information needed for these analyses. These methods utilize the structural, genomic, and biological context of proteins and genes in complete genomes to predict protein interaction networks and functional linkages between proteins. Given that experimental techniques remain expensive, time-consuming, and labor-intensive, these methods represent an important advance in proteomics. Some of these approaches utilize sequence data alone to predict interactions, while others combine multiple computational and experimental datasets to accurately build protein interaction maps for complete genomes. These methods represent a complementary approach to current high-throughput projects whose aim is to delineate protein interaction maps in complete genomes. We will describe a number of computational protocols for protein interaction prediction based on the structural, genomic, and biological context of proteins in complete genomes, and detail methods for protein interaction network visualization and analysis.

18.

Background  

Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible.

19.
Proteomics: the industrialization of protein chemistry (total citations: 7; self-citations: 0; citations by others: 7)
Establishing a proteomics platform in the industrial setting initially required implementation of a series of robotic systems to allow a high-throughput approach to analysis and identification of differences observed on 2-D electrophoresis gels. Now, a simpler alternative approach employing chromatography-based systems is emerging for identification of many components of complex mixtures, which can also provide quantitative comparisons through the use of a new labeling methodology.

20.
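The Xinjiang Tianshan snow lotus (Saussurea involucrata) is highly tolerant of extreme low temperatures, making it an excellent model plant for studying the mechanisms of low-temperature tolerance. The Xinjiang Tianshan Snow Lotus Transcriptome Annotation Knowledge Base (http://www.shengtingbiology.com/Saussurea KBase/index.jsp) is a comprehensive web-based database consisting of a front-end interface written in HTML, Perl, Perl CGI/DBD/DBI, Java and JavaScript, and a PostgreSQL back-end database management system used for data access, annotation and management. The knowledge base contains genomic data, raw transcriptome data, quality-control data, GC content, functional gene sequences and annotations, metabolic pathways of functional genes, annotation statistics for functional genes, comparative transcriptome or genome analyses between the snow lotus and other species, and bioanalysis software packages. The database not only supports research on low-temperature functional genomics and the mechanisms of low-temperature tolerance, but also provides a gene resource platform and a theoretical basis for the molecular breeding of species with cold-tolerance traits.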
