Similar literature
1.
Geometric algorithms for the analysis of 2D-electrophoresis gels.   Cited by: 1 (self-citations: 0, citations by others: 1)
In proteomics, two-dimensional gel electrophoresis (2-DE) is a separation technique for proteins. The resulting protein spots can be identified either by using picking robots and subsequent mass spectrometry or by visual cross inspection of a new gel image with an already analyzed master gel. Difficulties especially arise from inherent noise and irregular geometric distortions in 2-DE images. Aiming at the automated analysis of large series of 2-DE images, or at the even more difficult interlaboratory gel comparisons, the bottleneck is to solve the two most basic algorithmic problems with high quality: Identifying protein spots and computing a matching between two images. For the development of the analysis software CAROL at Freie Universität Berlin, we have reconsidered these two problems and obtained new solutions which rely on methods from computational geometry. Their novelties are: 1. Spot detection is also possible for complex regions formed by several "merged" (usually saturated) spots; 2. User-defined landmarks are not necessary for the matching. Furthermore, images for comparison are allowed to represent different parts of the entire protein pattern, which only partially "overlap." The implementation is done in a client-server architecture to allow queries via the internet. We also discuss and point to related theoretical questions in computational geometry.

2.
Dowsey AW, Dunn MJ, Yang GZ. Proteomics 2004, 4(12): 3800-3812
The quest for high-throughput proteomics has revealed a number of critical issues. Whilst improved two-dimensional gel electrophoresis (2-DE) sample preparation, staining and imaging issues are being actively pursued by industry, reliable high-throughput spot matching and quantification remains a significant bottleneck in the bioinformatics pipeline, thus restricting the flow of data to mass spectrometry through robotic spot excision and protein digestion. To this end, it is important to establish a full multi-site Grid infrastructure for the processing, archival, standardisation and retrieval of proteomic data and metadata. Particular emphasis needs to be placed on large-scale image mining and statistical cross-validation for reliable, fully automated differential expression analysis, and the development of a statistical 2-DE object model and ontology that underpins the emerging HUPO PSI GPS (Human Proteome Organization Proteomics Standards Initiative General Proteomics Standards). The first step towards this goal is to overcome the computational and communications burden entailed by the image analysis of 2-DE gels with Grid-enabled cluster computing. This paper presents the proTurbo framework as part of the ProteomeGRID, which utilises Condor cluster management combined with CORBA communications and JPEG-LS lossless image compression for task farming. A novel probabilistic eager scheduler has been developed to minimise make-span, where tasks are duplicated in response to the likelihood of the Condor machines' owners evicting them. A 60-gel experiment was pair-wise image registered (3540 tasks) on a 40-machine Linux cluster. Real-world performance and network overhead were gauged, and Poisson distributed worker evictions were simulated. Our results show a 4:1 lossless and 9:1 near-lossless image compression ratio, so network overhead did not affect other users. With 40 workers a 32x speed-up was seen (80% resource efficiency), and the eager scheduler reduced the impact of evictions by 58%.
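The task count and efficiency figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes that the 3540 tasks correspond to ordered (source, target) pairs of the 60 gels, which is an inference from 60 × 59 = 3540 and is not stated in the abstract; the function names are illustrative only.

```python
def ordered_pair_count(n_images: int) -> int:
    """Number of ordered (source, target) pairs among n_images distinct images."""
    return n_images * (n_images - 1)


def parallel_efficiency(speedup: float, workers: int) -> float:
    """Fraction of the ideal linear speed-up achieved by `workers` machines."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


tasks = ordered_pair_count(60)            # 60 * 59 ordered pairs
efficiency = parallel_efficiency(32, 40)  # 32x speed-up on 40 workers
print(tasks, f"{efficiency:.0%}")         # prints: 3540 80%
```

Under that assumption both reported numbers are mutually consistent: 3540 registration tasks and 80% resource efficiency.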

3.
Analysis of images obtained from two-dimensional gel electrophoresis (2D-GE) is a topic of utmost importance in bioinformatics research, since the commercial and academic software currently available has proven to be neither completely effective nor fully automatic, often requiring manual revision and refinement of computer-generated matches. In this work, we present an effective technique for the detection and the reconstruction of over-saturated protein spots. Firstly, the algorithm reveals overexposed areas, where spots may be truncated, and plateau regions caused by smeared and overlapping spots. Next, it reconstructs the correct distribution of pixel values in these overexposed areas and plateau regions, using a two-dimensional least-squares fitting based on a generalized Gaussian distribution. Pixel correction in saturated and smeared spots allows more accurate quantification, providing more reliable image analysis results. The method is validated for processing highly exposed 2D-GE images, comparing reconstructed spots with the corresponding non-saturated image, demonstrating that the algorithm enables correct spot quantification.
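The idea of reconstructing a clipped spot by least-squares fitting can be illustrated with a much simpler case than the one in the abstract. The sketch below is not the authors' method: it assumes a one-dimensional profile and an ordinary Gaussian (not a two-dimensional generalized Gaussian), and fits a parabola to the logarithm of the unsaturated samples only. All names and the synthetic data are hypothetical.

```python
import math


def solve3(m, v):
    """Solve a 3x3 linear system m @ x = v by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("singular system; need at least 3 distinct positions")
    solution = []
    for col in range(3):
        # Replace one column of m by the right-hand side v.
        replaced = [[v[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det(replaced) / d)
    return solution


def fit_clipped_gaussian(xs, ys, saturation):
    """Fit A*exp(-(x-mu)^2 / (2*sigma^2)) using only unsaturated, positive samples.

    On the log scale the model is a parabola a*x^2 + b*x + c, so ordinary
    least squares reduces to a 3x3 system of normal equations.
    """
    pts = [(x, math.log(y)) for x, y in zip(xs, ys) if 0 < y < saturation]
    s = [sum(x ** k for x, _ in pts) for k in range(5)]        # sums of x^0..x^4
    t = [sum((x ** k) * z for x, z in pts) for k in range(3)]  # sums of x^k*log(y)
    a, b, c = solve3([[s[4], s[3], s[2]],
                      [s[3], s[2], s[1]],
                      [s[2], s[1], s[0]]], [t[2], t[1], t[0]])
    if a >= 0:
        raise ValueError("profile is not peak-shaped")
    sigma = math.sqrt(-1.0 / (2.0 * a))
    mu = -b / (2.0 * a)
    amplitude = math.exp(c - b * b / (4.0 * a))
    return amplitude, mu, sigma


# Synthetic spot profile (hypothetical): true peak 200, detector clips at 150.
xs = list(range(11))
clipped = [min(200.0 * math.exp(-((x - 5) ** 2) / 8.0), 150.0) for x in xs]
amplitude, mu, sigma = fit_clipped_gaussian(xs, clipped, saturation=150.0)
print(round(amplitude, 3), round(mu, 3), round(sigma, 3))
```

Because the saturated samples are excluded from the fit, the recovered amplitude can exceed the clipping level, which is what makes quantification of overexposed spots possible in principle. Real gel images are noisy, so a log-scale fit like this one would weight faint pixels too heavily; it is meant only to show the principle.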

4.
Motivation: The quest for high-throughput proteomics has revealed a number of challenges in recent years. Whilst substantial improvements in automated protein separation with liquid chromatography and mass spectrometry (LC/MS), aka ‘shotgun’ proteomics, have been achieved, large-scale open initiatives such as the Human Proteome Organization (HUPO) Brain Proteome Project have shown that maximal proteome coverage is only possible when LC/MS is complemented by 2D gel electrophoresis (2-DE) studies. Moreover, both separation methods require automated alignment and differential analysis to relieve the bioinformatics bottleneck and so make high-throughput protein biomarker discovery a reality. The purpose of this article is to describe a fully automatic image alignment framework for the integration of 2-DE into a high-throughput differential expression proteomics pipeline. Results: The proposed method is based on robust automated image normalization (RAIN) to circumvent the drawbacks of traditional approaches. These use symbolic representation at the very early stages of the analysis, which introduces persistent errors due to inaccuracies in modelling and alignment. In RAIN, a third-order volume-invariant B-spline model is incorporated into a multi-resolution schema to correct for geometric and expression inhomogeneity at multiple scales. The normalized images can then be compared directly in the image domain for quantitative differential analysis. Through evaluation against an existing state-of-the-art method on real and synthetically warped 2D gels, the proposed analysis framework demonstrates substantial improvements in matching accuracy and differential sensitivity. High-throughput analysis is established through an accelerated GPGPU (general purpose computation on graphics cards) implementation.
Availability: Supplementary material, software and images used in the validation are available at http://www.proteomegrid.org/rain/ Contact: g.z.yang{at}imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. Associate Editor: David Rocke

5.
Since their origins in academic endeavours in the 1970s, computational analysis tools have matured into a number of established commercial packages that underpin research in expression proteomics. In this paper we describe the image analysis pipeline for the established 2-DE technique of protein separation, and by first covering signal analysis for MS, we also explain the current image analysis workflow for the emerging high-throughput 'shotgun' proteomics platform of LC coupled to MS (LC/MS). The bioinformatics challenges for both methods are illustrated and compared, whereas existing commercial and academic packages and their workflows are described from both a user's and a technical perspective. Attention is given to the importance of sound statistical treatment of the resultant quantifications in the search for differential expression. Despite wide availability of proteomics software, a number of challenges have yet to be overcome regarding algorithm accuracy, objectivity and automation, generally due to deterministic spot-centric approaches that discard information early in the pipeline, propagating errors. We review recent advances in signal and image analysis algorithms in 2-DE, MS, LC/MS and Imaging MS. Particular attention is given to wavelet techniques, automated image-based alignment and differential analysis in 2-DE, Bayesian peak mixture models, and functional mixed modelling in MS, and group-wise consensus alignment methods for LC/MS.

6.
Efficient analysis of protein expression by using two-dimensional electrophoresis (2-DE) data relies on the use of automated image processing techniques. The overall success of this research depends critically on the accuracy and the reliability of the analysis software. In addition, the software has a profound effect on the interpretation of the results obtained, and the amount of user intervention demanded during the analysis. The choice of analysis software that best meets specific needs is therefore of interest to the research laboratory. In this paper we compare two advanced analysis software packages, PDQuest and Progenesis. Their evaluation is based on quantitative tests at three different levels of standard 2-DE analysis: spot detection, gel matching and spot quantitation. As test materials we use three gel sets previously used in a similar comparison of Z3 and Melanie, and three sets of gels from our own research. It was observed that the quality of the test gels critically influences the spot detection and gel matching results. Both packages were sensitive to the parameter or filter settings with respect to the tendency of finding true positive and false positive spots. Quantitation results were very accurate for both analysis software packages.

7.
8.
Zhan X, Desiderio DM. Proteomics 2003, 3(5): 699-713
In order to compare the proteomes from different cell types of pituitary adenomas for our long-term goal to clarify the molecular mechanisms that participate in the formation of pituitary adenoma, and to detect any tumor-related marker for an "early-stage" diagnosis, the two-dimensional gel electrophoresis (2-DE) reference map of a pituitary adenoma tissue proteome is described here. A vertical, two-dimensional (2-D) polyacrylamide gel electrophoresis system and PDQuest image analysis software have been used to provide a high level of between-gel reproducibility and to accurately array each protein expressed in a pituitary adenoma tissue. Mass spectrometry (matrix-assisted laser desorption/ionization-time of flight MALDI-TOF and liquid chromatography-electrospray ionization-quadrupole-ion trap LC-ESI-Q-IT) and protein databases were used to characterize each protein in the 2-D gel. The results demonstrate that a good reproducibility of the 2-D gel pattern was attained. The position deviation of matched spots among four 2-D gels was 1.95 +/- 0.45 mm in the isoelectric focusing direction, and 1.70 +/- 0.53 mm in the sodium dodecyl sulfate-polyacrylamide gel electrophoresis direction. A total of ca. 1000 protein spots were separated by 2-DE, and 135 protein spots that represent 111 proteins were characterized with mass spectrometry (96 spots for MALDI-TOF, 39 spots for LC-ESI-Q-IT). The characterized proteins include pituitary hormones, cellular signals, enzymes, cellular-defense proteins, cell-structure proteins, transport proteins, etc. Those proteins were located in the cytoplasmic, cellular membrane, mitochondrial, endoplasmic reticulum, nuclear, ribonucleosome, extracellular fractions, or were secreted in plasma, etc. Those identified proteins contribute to a functional profile of the pituitary adenoma proteome. These data will be used to expand the proteome database of the human pituitary, which can be accessed at the website http://www.utmem.edu/proteomics.

9.
The role of bioinformatics in two-dimensional gel electrophoresis   Cited by: 1 (self-citations: 0, citations by others: 1)
Dowsey AW, Dunn MJ, Yang GZ. Proteomics 2003, 3(8): 1567-1596
Over the last two decades, two-dimensional electrophoresis (2-DE) gel has established itself as the de facto approach to separating proteins from cell and tissue samples. Due to the sheer volume of data and its experimental geometric and expression uncertainties, quantitative analysis of these data with image processing and modelling has become an actively pursued research topic. The results of these analyses include accurate protein quantification, isoelectric point and relative molecular mass estimation, and the detection of differential expression between samples run on different gels. Systematic errors such as current leakage and regional expression inhomogeneities are corrected for, followed by each protein spot in the gel being segmented and modelled for quantification. To assess differential expression of protein spots in different samples run on a series of two-dimensional gels, a number of image registration techniques for correcting geometric distortion have been proposed. This paper provides a comprehensive review of the computation techniques used in the analysis of 2-DE gels, together with a discussion of current and future trends in large scale analysis. We examine the pitfalls of existing techniques and highlight some of the key areas that need to be developed in the coming years, especially those related to statistical approaches based on multiple gel runs and image mining techniques through the use of parallel processing based on cluster computing and the grid technology.

10.
Two-dimensional gel electrophoresis (2-DE) image analysis is conventionally used for comparative proteomics. However, there are a number of technical difficulties associated with 2-DE protein separation that limit the depth of proteome coverage, and the image analysis steps are typically labor-intensive and low-throughput. Recently, mass spectrometry-based quantitation strategies have been described as alternative differential proteome analysis techniques. In this study, we investigated changes in protein expression using an ovarian cancer cell line, OVMZ6, 24 h post-stimulation with the relatively weak agonist, urokinase-type plasminogen activator (uPA). Quantitative protein profiles were obtained by MALDI-TOF/TOF from stable isotope-labeled cells in culture (SILAC), and these results were compared to the quantitative ratios obtained using 2-DE gel image analysis. MALDI-TOF/TOF mass spectrometry showed that differential quantitation using SILAC was highly reproducible (approximately 8% coefficient of variation (CV)), and this variance was considerably lower than that achieved using automated 2-DE image analysis strategies (CV approximately 25%). Both techniques revealed subtle alterations in cellular protein expression following uPA stimulation. However, due to the lower variances associated with the SILAC technique, smaller changes in expression of uPA-inducible proteins could be found with greater certainty.
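The coefficient of variation (CV) used in this comparison is the sample standard deviation divided by the mean. The following minimal sketch shows the calculation; the replicate values are hypothetical and are not data from the study.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (a dimensionless ratio)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return statistics.stdev(values) / mean


# Hypothetical replicate measurements of one protein's expression ratio.
replicates = [9.0, 10.0, 11.0]
cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.1%}")  # prints: CV = 10.0%
```

A lower CV means that replicate measurements scatter less around their mean, which is why a smaller fold change can be distinguished from noise with the same number of replicates.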

11.
Cutler P, Heald G, White IR, Ruan J. Proteomics 2003, 3(4): 392-401
Separation of complex mixtures of proteins by two-dimensional gel electrophoresis (2-DE) is a fundamental component of current proteomic technology. Quantitative analysis of the images generated by digitization of such gels is critical for the identification of alterations in protein expression within a given biological system. Despite the availability of several commercially available software packages designed for this purpose, image analysis is extremely resource intensive, subjective and remains a major bottleneck. In addition to reducing throughput, the requirement for manual intervention results in the introduction of operator subjectivity, which can limit the statistical significance of the numerical data generated. A key requirement of image analysis is the accurate definition of protein spot boundaries using a suitable method of image segmentation. We describe a method of spot detection applicable to 2-DE image files using a segmentation method involving pixel value collection via serial analysis of the image through its range of density levels. This algorithm is reproducible, sensitive, accurate and primarily designed to be automatic, removing operator subjectivity. Furthermore, it is believed that this method may offer the potential for improved spot detection over currently available software.
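The abstract gives no algorithmic detail beyond "serial analysis of the image through its range of density levels", so the sketch below is only one plausible reading and not the authors' algorithm. It sweeps a threshold from the highest density level to the lowest, labels 4-connected regions at each level, and records a new spot whenever a region appears that contains no previously found spot. Function names and the toy image are hypothetical.

```python
from collections import deque


def components(mask):
    """Return the 4-connected components of the True cells in a 2D boolean grid."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            comp, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            found.append(comp)
    return found


def detect_spots(image, background=0):
    """Sweep density levels from high to low and return one seed pixel per spot."""
    levels = sorted({v for row in image for v in row if v > background},
                    reverse=True)
    seeds = []
    for level in levels:
        mask = [[v >= level for v in row] for row in image]
        for comp in components(mask):
            members = set(comp)
            # A region with no earlier seed is a newly emerged spot.
            if not any(seed in members for seed in seeds):
                seeds.append(max(comp, key=lambda p: image[p[0]][p[1]]))
    return seeds


# Toy density image (hypothetical): two touching peaks and one isolated peak.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 9, 4, 8, 0, 0, 0],
    [0, 0, 0, 0, 0, 6, 0],
]
print(detect_spots(image))  # prints: [(1, 1), (1, 3), (2, 5)]
```

Because the two touching peaks are seeded at higher levels before they merge at level 4, they are reported as separate spots without any user-set threshold, which is the kind of operator-independent behaviour the abstract emphasises.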

12.
One of the key limitations for proteomic studies using two-dimensional (2D) gels is the lack of automatic, fast, robust, and reliable methods for detecting, matching, and quantifying protein spots. Although there are commercial software packages for 2D gel image analysis, extensive human intervention is still needed for spot detection and matching, which is time-consuming and error-prone. Moreover, the commercial software packages are usually expensive and not open source. Thus, it is very beneficial for researchers to have free software that is fast, fully automatic, and robust. In this paper, we review and compare two recently developed and publicly available software packages, RegStatGel and Pinnacle, for analyzing 2D gel images. These two software packages share some common features and also have some fundamental differences in spot detection and quantification. Based on our experience, RegStatGel is much better in terms of spot detection and matching. It also contains more advanced statistical tools and is more user-friendly. In contrast, Pinnacle is quite sensitive to background noise and relies on external statistical software packages for statistical analysis.

13.
Image analysis of two-dimensional protein electrophoresis   Cited by: 12 (self-citations: 0, citations by others: 12)
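With the Human Genome Project nearing completion, proteome research has become a new focus. High-resolution two-dimensional gel electrophoresis (2-DE) makes comprehensive analysis of the entire proteome of a tissue or cell possible. In recent years this technique has been greatly improved; image analysis systems in particular now offer more advanced algorithms, increasingly powerful functionality and simpler operation, providing a good tool for large-scale studies. Using a new-generation 2D image analysis system, we carried out a preliminary analysis of the 2-DE results for protein samples from Schwann cells cultured in vitro, examined the technical issues arising in each step (image scanning, spot detection, background subtraction, matching, result reporting and data analysis), and report our experience with 2D image analysis.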

14.
15.
Clark BN, Gutstein HB. Proteomics 2008, 8(6): 1197-1203
Many software packages have been developed to process and analyze 2-D gel images. Some programs have been touted as automated, high-throughput solutions. We tested five commercially available programs using 18 replicate gels of a rat brain protein extract. We determined computer processing time, approximate spot editing time, time required to correct spot mismatches, as well as total processing time. We also determined the number of spots automatically detected, number of spots kept after manual editing, and the percentage of automatically generated correct matches. We also determined the effect of increasing the number of replicate gels on spot matching efficiency for two of the programs. We found that for all programs tested, less than 3% of the total processing time was automated. The remainder of the time was spent in manual, subjective editing of detected spots and computer generated matches. Total processing time for 18 gels varied from 22 to 84 h. The percentage of correct matches generated automatically varied from 1 to 62%. Increasing the number of gels in an experiment dramatically reduced the percentage of automatically generated correct matches. Our results demonstrate that these 2-D gel analysis programs are not automatic or rapid, and also suggest that matching accuracy decreases as experiment size increases.

16.
MOTIVATION: The analysis of metabolic processes is becoming increasingly important to our understanding of complex biological systems and disease states. Nuclear magnetic resonance spectroscopy (NMR) is a particularly relevant technology in this respect, since the NMR signals provide a quantitative measure of the metabolite concentrations. However, due to the complexity of the spectra typical of biological samples, the demands of clinical and high-throughput analysis will only be fully met by a system capable of reliable, automatic processing of the spectra. An initial step in this direction has been taken by Targeted Profiling (TP), employing a set of known and predicted metabolite signatures fitted against the signal. However, an accurate fitting procedure for ¹H NMR data is complicated by shift uncertainties in the peak systems caused by measurement imperfections. These uncertainties have a large impact on the accuracy of identification and quantification and currently require compensation by very time consuming manual interactions. Here, we present an approach, termed Extended Targeted Profiling (ETP), that estimates shift uncertainties based on a genetic algorithm (GA) combined with a least squares optimization (LSQO). The estimated shifts are used to correct the known metabolite signatures leading to significantly improved identification and quantification. In this way, use of the automated system significantly reduces the effort normally associated with manual processing and paves the way for reliable, high-throughput analysis of complex NMR spectra. RESULTS: The results indicate that using simultaneous shift uncertainty correction and least squares fitting significantly improves the identification and quantification results for ¹H NMR data in comparison to the standard targeted profiling approach and compares favorably with the results obtained by manual expert analysis. Preservation of the functional structure of the NMR spectra makes this approach more realistic than simple binning strategies.

17.
MOTIVATION: In this paper, we propose a fully automatic block and spot indexing algorithm for microarray image analysis. A microarray is a device which enables a parallel experiment on tens to hundreds of thousands of test genes in order to measure gene expression. Due to the huge size of the experimental data, automated image analysis is gaining importance in microarray image processing systems. Currently, most automated microarray image processing systems require manual block indexing and, in some cases, manual spot indexing. If the microarray image is large and contains a lot of noise, this is very laborious work. In this paper, we show it is possible to locate the addresses of blocks and spots by applying the Nearest Neighbors Graph Model. We also propose an analytic model for the feasibility of block addressing. Our analytic model is validated by a large body of experimental results. RESULTS: We demonstrate the features of automatic block detection, automatic spot addressing, and correction of the distortion and skew of each microarray image.

18.
Along with productivity and physiology, morphological growth behavior is the key parameter in bioprocess design for filamentous fungi. Despite complex interactions between fungal morphology, broth viscosity, mixing kinetics, transport characteristics and process productivity, morphology is still commonly tackled only by empirical trial-and-error techniques during strain selection and process development procedures. In fact, morphological growth characteristics are investigated by computational analysis of only a limited number of pre-selected microscopic images or via manual evaluation of images, which causes biased results and does not allow any automation or high-throughput quantification. To overcome the lack of tools for fast, reliable and quantitative morphological analysis, this work introduces a method enabling statistically verified quantification of fungal morphology in accordance with Quality by Design principles. The novel, high-throughput method presented here interlinks fully automated recording of microscopic images with a newly developed evaluation approach reducing the need for manual intervention to a minimum. Validity of results is ensured by concomitantly testing the acquired sample for representativeness by statistical inference via bootstrap analysis. The novel approach for statistical verification can be equally applied as control logic to automatically proceed with morphological analysis of a consecutive sample once user defined acceptance criteria are met. Hence, analysis time can be reduced to an absolute minimum. The quantitative potential of the developed methodology is demonstrated by characterizing the morphological growth behavior of two industrial Penicillium chrysogenum production strains in batch cultivation.
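The use of bootstrap analysis as an acceptance check can be sketched as follows. This is an assumption-laden illustration and not the published procedure: it takes a single morphological parameter, computes a percentile bootstrap confidence interval for its mean, and accepts the sample once the interval is narrow enough relative to the mean. The 10% threshold, the function names and the measurements are all hypothetical.

```python
import random
import statistics


def bootstrap_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def is_representative(sample, max_relative_width=0.10, **kwargs):
    """Accept the sample when the CI width is small relative to the mean.

    The acceptance criterion is a hypothetical stand-in for the
    "user defined acceptance criteria" mentioned in the abstract.
    """
    centre = statistics.fmean(sample)
    if centre == 0:
        raise ValueError("relative width is undefined for a zero mean")
    lower, upper = bootstrap_ci(sample, **kwargs)
    return (upper - lower) / abs(centre) <= max_relative_width


# Hypothetical measurements of one morphological parameter.
too_few = [1.0, 100.0]
uniform = [5.0] * 20
print(is_representative(too_few), is_representative(uniform))
```

Used as control logic, a check like this would let image acquisition continue until the criterion is met and then move on to the next sample, which is how the abstract describes keeping analysis time to a minimum.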

19.
Classical proteomics combined two-dimensional gel electrophoresis (2-DE) for the separation and quantification of proteins in a complex mixture with mass spectrometric identification of selected proteins. More recently, the combination of liquid chromatography (LC), stable isotope tagging, and tandem mass spectrometry (MS/MS) has emerged as an alternative quantitative proteomics technology. We have analyzed the proteome of Mycobacterium tuberculosis, a major human pathogen comprising about 4,000 genes, by (i) 2-DE and mass spectrometry (MS) and by (ii) the isotope-coded affinity tag (ICAT) reagent method and MS/MS. The data obtained by either technology were compared with respect to their selectivity for certain protein types and classes and with respect to the accuracy of quantification. Initial datasets of 60,000 peptide MS/MS spectra and 1,800 spots for the ICAT-LC/MS and 2-DE/MS methods, respectively, were reduced to 280 and 108 conclusively identified and quantified proteins, respectively. ICAT-LC/MS showed a clear bias for high M(r) proteins and was complemented by the 2-DE/MS method, which showed a preference for low M(r) proteins and also identified cysteine-free proteins that were transparent to the ICAT-LC/MS method. Relative quantification between two strains of the M. tuberculosis complex also revealed that the two technologies provide complementary quantitative information; whereas the ICAT-LC/MS method quantifies the sum of the protein species of one gene product, the 2-DE/MS method quantifies at the level of resolved protein species, including post-translationally modified and processed polypeptides. Our data indicate that different proteomic technologies applied to the same sample provide complementary types of information that contribute to a more complete understanding of the biological system studied.

20.
Selected reaction monitoring (SRM) is a targeted mass spectrometric method that is increasingly used in proteomics for the detection and quantification of sets of preselected proteins at high sensitivity, reproducibility and accuracy. Currently, data from SRM measurements are mostly evaluated subjectively by manual inspection on the basis of ad hoc criteria, precluding the consistent analysis of different data sets and an objective assessment of their error rates. Here we present mProphet, a fully automated system that computes accurate error rates for the identification of targeted peptides in SRM data sets and maximizes specificity and sensitivity by combining relevant features in the data into a statistical model.
