Similar articles
1.
Two-dimensional gel electrophoresis (2DE) offers high-resolution separation for intact proteins. However, variability in the appearance of spots can limit the ability to identify true differences between conditions. Variability can occur at a number of levels. Individual samples can differ because of biological variability. Technical variability can occur during protein extraction, processing, or storage. Another potential source of variability arises during analysis of the gels, independently of the causes named above. We performed a study designed to focus only on the variability caused by analysis. We separated three aliquots of rat left ventricle and analyzed differences in protein abundance on the replicate 2D gels. As the samples loaded on each gel were identical, differences in protein abundance are caused by variability in separation or interpretation of the gels. Protein spots were compared across gels by quantile values to determine differences. Fourteen percent of spots had a maximum difference in intensity of 0.4 quantile values or more between replicates. We then looked individually at the spots to determine the cause of differences between the measured intensities. Reasons for differences were: failure to identify a spot (59%), differences in spot boundaries (13%), difference in the peak height (6%), and a combination of these factors (21%). This study demonstrates that spot identification and characterization make major contributions to variability seen with 2DE. Methods to highlight why measured protein spot abundance is different could reduce these errors.
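As a rough illustration of the quantile comparison described in this abstract, the sketch below converts each gel's spot intensities to within-gel quantile values (rank scaled to the range 0 to 1) and flags spots whose maximum difference across replicates is 0.4 or more. The function names, the ranking convention, and the toy intensities are assumptions made for illustration; the abstract does not specify the exact quantile procedure the authors used.

```python
def quantile_values(intensities):
    """Map each spot's intensity to its within-gel quantile in [0, 1]."""
    n = len(intensities)
    order = sorted(range(n), key=lambda i: intensities[i])
    q = [0.0] * n
    for rank, idx in enumerate(order):
        q[idx] = rank / (n - 1) if n > 1 else 0.0
    return q


def flag_variable_spots(gels, threshold=0.4):
    """Return indices of spots whose quantile range across gels is >= threshold."""
    per_gel = [quantile_values(g) for g in gels]
    flagged = []
    for spot in range(len(gels[0])):
        values = [q[spot] for q in per_gel]
        if max(values) - min(values) >= threshold:
            flagged.append(spot)
    return flagged


# Toy data (hypothetical): three replicate gels, five matched spots each.
gels = [
    [10.0, 20.0, 30.0, 40.0, 50.0],
    [11.0, 19.0, 31.0, 42.0, 49.0],
    [12.0, 45.0, 29.0, 41.0, 52.0],
]
variable = flag_variable_spots(gels)  # only spot 1 changes rank substantially
```

Working on quantiles rather than raw intensities makes the comparison insensitive to overall staining differences between gels, which is presumably why a rank-based measure was chosen.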

2.
Assumptions that need to be considered prior to statistical analysis of protein spot volumes from two-dimensional gel electrophoresis (2-DE) data are studied using replicate gels of the same sample. The most important observation is that the data tables of protein spot volumes from 2-DE images contain a large number of missing values, which are not consistent with the presence or absence of the proteins. This implies both loss of information and problems for the subsequent statistical analysis. Challenges with 2-DE protein spot volumes are viewed in light of multiple gel comparisons and multivariate data analysis.

3.
Although two-dimensional gel electrophoresis (2-DE) has long been a favorite experimental method to screen proteomes, its reproducibility is seldom analyzed with the assistance of quantitative error models. The lack of models of residual distributions that can be used to assign likelihood to differential expression reflects the difficulty in tackling the combined effect of variability in spot intensity and uncertain recognition of the same spot in different gels. In this report we have analyzed a series of four triplicate two-dimensional gels of chicken embryo heart samples at two distinct developmental stages to produce such a model of residual distribution. In order to achieve this reference error model, a nonparametric procedure for consistent spot intensity normalization had to be established, and is also reported here. In addition to variability in normalized intensity due to various sources, the residual variation between replicates was observed to be compounded by failure to identify the spot itself (gel alignment). The mixed effect is reflected by variably skewed bimodal density distributions of residuals. The extraction of a global error model that accommodated such distributions was achieved empirically by machine learning, specifically by bootstrapped artificial neural networks. The model described is being used to assign confidence values to observed variations in arbitrary 2-DE gels in order to quantify the degree of over-expression and under-expression of protein spots.

4.
Quantitative proteomic comparisons require a sufficient number of samples to reach an acceptable level of significance. However, 2D gel electrophoresis commonly results in incomplete data sets due to spots with missing values, thereby reducing the number of parallel measurements for individual proteins. Here we investigated how many missing values per spot can be tolerated. The number of spots in common between all gels was found to decrease with the number of parallel gels in a non-linear fashion. Increasing numbers of missing values were associated with a moderate increase in the quantitative variation of spot volumes. Based on the missing value pattern in 20 gels we performed an analysis of the multiple testing power for the hypothetical scenario of a comparative 2DE study with six or twelve parallel gels. The calculation considered the statistical power of the individual spot as well as the number of spots included in the analysis. The power increased with inclusion of spots with a higher number of missing values and showed an optimum at a specific minimum number of spot replicates. The results suggest that proteins with missing values can be included in a univariate analysis as long as a sufficient number of parallel gels are made.
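The observation that the number of spots common to all gels shrinks as gels are added can be made concrete with a small counting sketch. The presence matrix below is invented toy data and the function name is hypothetical; it only shows how such a count could be computed, not the authors' analysis.

```python
def common_spot_counts(presence):
    """For k = 1..n gels, count the spots detected on all of the first k gels.

    presence[g][s] is True when spot s was detected on gel g.
    """
    still_common = [True] * len(presence[0])
    counts = []
    for gel in presence:
        # A spot stays "common" only if it is also present on this gel.
        still_common = [c and p for c, p in zip(still_common, gel)]
        counts.append(sum(still_common))
    return counts


# Toy data (hypothetical): four gels, six spots, one missing value per gel.
presence = [
    [True, True, True, True, True, False],
    [True, True, True, False, True, True],
    [True, False, True, True, True, True],
    [True, True, True, True, False, True],
]
counts = common_spot_counts(presence)
```

Even though each gel here misses only one spot, requiring complete data across all four gels discards four of the six spots, which illustrates why tolerating some missing values per spot can increase the number of spots available for testing.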

5.
MOTIVATION: We present statistical methods for determining the number of per gene replicate spots required in microarray experiments. The purpose of these methods is to obtain an estimate of the sampling variability present in microarray data, and to determine the number of replicate spots required to achieve a high probability of detecting a significant fold change in gene expression, while maintaining a low error rate. Our approach is based on data from control microarrays, and involves the use of standard statistical estimation techniques. RESULTS: After analyzing two experimental data sets containing control array data, we were able to determine the statistical power available for the detection of significant differential expression given differing levels of replication. The inclusion of replicate spots on microarrays not only allows more accurate estimation of the variability present in an experiment, but more importantly increases the probability of detecting genes undergoing significant fold changes in expression, while substantially decreasing the probability of observing fold changes due to chance rather than true differential expression.

6.

Background: In comparative proteomics studies, the large number of images generated by 2D gels is currently compared using spot matching algorithms. Unfortunately, differences in gel migration and sample variability make efficient spot alignment very difficult to obtain, and, as a consequence, most software alignments return noisy gel matching which needs to be manually adjusted by the user.

7.
The role of bioinformatics in two-dimensional gel electrophoresis
Dowsey AW, Dunn MJ, Yang GZ. Proteomics, 2003, 3(8): 1567-1596
Over the last two decades, two-dimensional gel electrophoresis (2-DE) has established itself as the de facto approach to separating proteins from cell and tissue samples. Due to the sheer volume of data and its experimental geometric and expression uncertainties, quantitative analysis of these data with image processing and modelling has become an actively pursued research topic. The results of these analyses include accurate protein quantification, isoelectric point and relative molecular mass estimation, and the detection of differential expression between samples run on different gels. Systematic errors such as current leakage and regional expression inhomogeneities are corrected for, followed by each protein spot in the gel being segmented and modelled for quantification. To assess differential expression of protein spots in different samples run on a series of two-dimensional gels, a number of image registration techniques for correcting geometric distortion have been proposed. This paper provides a comprehensive review of the computation techniques used in the analysis of 2-DE gels, together with a discussion of current and future trends in large scale analysis. We examine the pitfalls of existing techniques and highlight some of the key areas that need to be developed in the coming years, especially those related to statistical approaches based on multiple gel runs and image mining techniques through the use of parallel processing based on cluster computing and the grid technology.

8.
One of the main applications of electrophoretic 2-D gels is the analysis of differential responses between different conditions. For this reason, specific spots are present in one of the images, but not in the other. On other occasions, the same experiment is repeated between 2 and 12 times in order to increase statistical significance. In both situations, one of the major difficulties of these analyses is that 2-D gels are affected by spatial distortions due to run-time differences and dye-front deformations, resulting in images that are significantly dissimilar not only because of their content, but also because of their geometry. In this technical brief, we show how to use free, state-of-the-art image registration and fusion algorithms developed by us for solving the problem of comparing differential expression profiles, or computing an "average" image from a series of virtually identical gels.

9.
10.
Two-dimensional SDS-PAGE gel electrophoresis using post-run staining is widely used to measure the abundances of thousands of protein spots simultaneously. Usually, the protein abundances of two or more biological groups are compared using biological and technical replicates. After gel separation and staining, the spots are detected, spot volumes are quantified, and spots are matched across gels. There are almost always many missing values in the resulting data set. The missing values arise either because the corresponding proteins have very low abundances (or are absent) or because of experimental errors such as incomplete focusing or over-focusing in the first dimension or varying run times in the second dimension, as well as faulty spot detection and matching. In this study, we show that the probability for a spot to be missing can be modeled by a logistic regression function of the logarithm of the volume. Furthermore, we present an algorithm that takes a set of gels with technical and biological replicates as input and estimates the average protein abundances in the biological groups from the number of missing spots and measured volumes of the present spots using a maximum likelihood approach. Confidence intervals for abundances and p-values for differential expression between two groups are calculated using bootstrap sampling. The algorithm is compared to two standard approaches, one that discards missing values and one that sets all missing values to zero. We have evaluated this approach in two different gel data sets of different biological origin. An R-program, implementing the algorithm, is freely available at http://bioinfo.thep.lu.se/MissingValues2Dgels.html.
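To illustrate the idea of modeling the probability of a missing spot as a logistic function of log volume, here is a minimal sketch that fits P(missing) = sigmoid(a + b * log_volume) by plain gradient ascent on the log-likelihood. The parameterization, the fitting routine, the function names, and the toy data are all assumptions for illustration; this is not the authors' R program, and it omits the maximum likelihood abundance estimation and the bootstrap step they describe.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_missing_model(log_volumes, missing, lr=0.1, steps=5000):
    """Fit P(missing) = sigmoid(a + b * log_volume) by gradient ascent."""
    a, b = 0.0, 0.0
    n = len(log_volumes)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(log_volumes, missing):
            err = y - sigmoid(a + b * x)  # gradient of the Bernoulli log-likelihood
            grad_a += err
            grad_b += err * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


# Toy data (hypothetical): log spot volumes and whether the spot was missing (1) or present (0).
log_volumes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
missing = [1, 1, 1, 0, 1, 0, 0, 0]

a, b = fit_missing_model(log_volumes, missing)
p_low = sigmoid(a + b * 1.0)   # estimated missing probability for a faint spot
p_high = sigmoid(a + b * 8.0)  # estimated missing probability for a strong spot
```

A negative slope `b` corresponds to the behavior the abstract describes: faint spots are more likely to go undetected than strong ones, so a missing value carries information about abundance rather than being a purely random gap.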

11.
Data extraction from composite oligonucleotide microarrays
Microarray or DNA chip technology is revolutionizing biology by empowering researchers in the collection of broad-scope gene information. It is well known that microarray-based measurements exhibit a substantial amount of variability due to a number of possible sources, ranging from hybridization conditions to image capture and analysis. In order to make reliable inferences and carry out quantitative analysis with microarray data, it is generally advisable to have more than one measurement of each gene. The availability of both between-array and within-array replicate measurements is essential for this purpose. Although statistical considerations call for increasing the number of replicates of both types, the latter is particularly challenging in practice due to a number of limiting factors, especially for in-house spotting facilities. We propose a novel approach to design so-called composite microarrays, which allow more replicates to be obtained without increasing the number of printed spots.

12.
Luhn S, Berth M, Hecker M, Bernhardt J. Proteomics, 2003, 3(7): 1117-1127
Databases for two-dimensional protein gels pose new challenges in extracting meaningful information from large numbers of experiments. In order to create expression profiles, positions of corresponding protein spots across all gel images have to be established. In larger gel sets errors may accumulate rapidly during this spot matching process, effectively limiting the number of samples available for data mining. Here we present a novel approach for organizing spot data based on the concept of a standard position for a protein species. Standard positions are meaningful average positions that are determined using all occurrences of a protein species. They can be extended to spots that are not annotated via interpolation. The standard position of a spot can serve as a unifying index across all gels in a database, thus allowing creation and analysis of expression profiles that span the whole collection. The standard position gives a much more accurate estimation of a spot's position on a gel than can be obtained using theoretical isoelectric point and molecular weight. Positional indexing is a complement to a priori identifications (e.g. by mass spectrometry or Edman degradation). Moreover it can be used in advance to select spots that are worth identifying because they show relevant expression profiles. Furthermore, we show how to combine all spots that occur on any of the gels into one synthetic but nevertheless realistic-looking image. This composite image is produced such that all spots have their standard positions. It can serve as a proteome reference map for an organism. As an application, we have computed a reference map from 23 gel images of Bacillus subtilis, using an enhanced prerelease version of the gel analysis software Delta2D (DECODON, Greifswald, Germany).
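The notion of a standard position as an average over all occurrences of a protein species can be sketched very simply. The version below takes an unweighted mean of (x, y) coordinates, which is an assumption: the abstract does not say how the averaging is performed or how gels are brought into a common coordinate system first. The function name, the protein labels, and the coordinates are hypothetical.

```python
from collections import defaultdict


def standard_positions(occurrences):
    """Average the (x, y) position of each protein species over all gels.

    occurrences is an iterable of (protein, x, y) tuples, one per annotated
    spot on any gel, with coordinates assumed to share one reference frame.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for protein, x, y in occurrences:
        entry = sums[protein]
        entry[0] += x
        entry[1] += y
        entry[2] += 1
    return {p: (sx / n, sy / n) for p, (sx, sy, n) in sums.items()}


# Toy data (hypothetical): two protein species annotated on several gels.
occurrences = [
    ("spotA", 100.0, 200.0),
    ("spotA", 104.0, 196.0),
    ("spotA", 102.0, 204.0),
    ("spotB", 300.0, 50.0),
    ("spotB", 310.0, 54.0),
]
positions = standard_positions(occurrences)
```

Using the resulting position as a key gives every spot the same index on every gel, which is what lets expression profiles span the whole collection without chaining pairwise matches.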

13.
Rogers M, Graham J, Tonge RP. Proteomics, 2003, 3(6): 887-896
In image analysis of two-dimensional electrophoresis gels, individual spots need to be identified and quantified. Two classes of algorithms are commonly applied to this task. Parametric methods rely on a model, making strong assumptions about spot appearance, but are often insufficiently flexible to adequately represent all spots that may be present in a gel. Nonparametric methods make no assumptions about spot appearance and consequently impose few constraints on spot detection, allowing more flexibility but reducing robustness when image data is complex. We describe a parametric representation of spot shape that is both general enough to represent unusual spots, and specific enough to introduce constraints on the interpretation of complex images. Our method uses a model of shape based on the statistics of an annotated training set. The model allows new spot shapes, belonging to the same statistical distribution as the training set, to be generated. To represent spot appearance we use the statistically derived shape convolved with a Gaussian kernel, simulating the diffusion process in spot formation. We show that the statistical model of spot appearance and shape is able to fit to image data more closely than the commonly used spot parameterizations based solely on Gaussian and diffusion models. We show that improvements in model fitting are gained without degrading the specificity of the representation.

14.
Efficient analysis of protein expression by using two-dimensional electrophoresis (2-DE) data relies on the use of automated image processing techniques. The overall success of this research depends critically on the accuracy and the reliability of the analysis software. In addition, the software has a profound effect on the interpretation of the results obtained, and the amount of user intervention demanded during the analysis. The choice of analysis software that best meets specific needs is therefore of interest to the research laboratory. In this paper we compare two advanced analysis software packages, PDQuest and Progenesis. Their evaluation is based on quantitative tests at three different levels of standard 2-DE analysis: spot detection, gel matching and spot quantitation. As test materials we use three gel sets previously used in a similar comparison of Z3 and Melanie, and three sets of gels from our own research. It was observed that the quality of the test gels critically influences the spot detection and gel matching results. Both packages were sensitive to the parameter or filter settings with respect to the tendency of finding true positive and false positive spots. Quantitation results were very accurate for both analysis software packages.

15.
Clark BN, Gutstein HB. Proteomics, 2008, 8(6): 1197-1203
Many software packages have been developed to process and analyze 2-D gel images. Some programs have been touted as automated, high-throughput solutions. We tested five commercially available programs using 18 replicate gels of a rat brain protein extract. We determined computer processing time, approximate spot editing time, time required to correct spot mismatches, as well as total processing time. We also determined the number of spots automatically detected, number of spots kept after manual editing, and the percentage of automatically generated correct matches. We also determined the effect of increasing the number of replicate gels on spot matching efficiency for two of the programs. We found that for all programs tested, less than 3% of the total processing time was automated. The remainder of the time was spent in manual, subjective editing of detected spots and computer generated matches. Total processing time for 18 gels varied from 22 to 84 h. The percentage of correct matches generated automatically varied from 1 to 62%. Increasing the number of gels in an experiment dramatically reduced the percentage of automatically generated correct matches. Our results demonstrate that these 2-D gel analysis programs are not automatic or rapid, and also suggest that matching accuracy decreases as experiment size increases.

16.
The complexity of human plasma presents a number of challenges to the efficient and reproducible proteomic analysis of differential expression in response to disease. Before individual variation and disease-specific protein biomarkers can be identified from human plasma, the experimental variability inherent in the protein separation and detection techniques must be quantified. We report on the variation found in two-dimensional difference gel electrophoresis (2-D DIGE) analysis of human plasma. Eight aliquots of a human plasma sample were depleted of the six most abundant proteins and were subsequently analyzed in triplicate for a total of 24 DIGE samples on 12 gels. Spot-wise standard deviation estimates indicated that fold changes greater than 2 can be detected with a manageable number of replicates in simple ANOVA experiments with human plasma. Mixed-effects statistical modeling quantified the effect of the dyes, and segregated the spot-wise variance into components of sample preparation, gel-to-gel differences, and random error. The gel-to-gel component was found to be the largest source of variation, followed by the sample preparation step. An improved protocol for the depletion of the six most abundant proteins is suggested, which, along with the use of statistical modeling and future improvements in gel quality and image processing, can further reduce the variation and increase the efficiency of 2-D DIGE proteomic analysis of human plasma.

17.
18.
Cutler P, Heald G, White IR, Ruan J. Proteomics, 2003, 3(4): 392-401
Separation of complex mixtures of proteins by two-dimensional gel electrophoresis (2-DE) is a fundamental component of current proteomic technology. Quantitative analysis of the images generated by digitization of such gels is critical for the identification of alterations in protein expression within a given biological system. Despite the availability of several commercial software packages designed for this purpose, image analysis is extremely resource intensive and subjective, and remains a major bottleneck. In addition to reducing throughput, the requirement for manual intervention results in the introduction of operator subjectivity, which can limit the statistical significance of the numerical data generated. A key requirement of image analysis is the accurate definition of protein spot boundaries using a suitable method of image segmentation. We describe a method of spot detection applicable to 2-DE image files using a segmentation method involving pixel value collection via serial analysis of the image through its range of density levels. This algorithm is reproducible, sensitive, accurate and primarily designed to be automatic, removing operator subjectivity. Furthermore, it is believed that this method may offer the potential for improved spot detection over currently available software.

19.
We have developed and refined a system for quantitative computer analysis of two-dimensional polyacrylamide gel electrophoretograms. The system, named Elsie 4, is based on one described by Vo et al. (Anal. Biochem. 112, 258 (1981)). It is highly automated. Elsie 4 can find, and measure the intensity of, almost any spot resolvable on two-dimensional gels, including spots visible only as shoulders off larger spots and spots so close together that there is no "valley" between them. It can automatically match the spot patterns of different gels, potentially without the need for a user to provide landmark matches. The matches between paired gels let us follow the synthesis of any spot through a set of gels. Information about a group of matched spots can be obtained by referring to any spot in the group. There is generally no need for a standard or reference gel. Data for two experiments can be combined and compared by matching any gel in one experiment with any gel in the other. There are ways to automatically find possible mismatches in sets of gels. Scans and the results of the analysis can be shown on an image displayer. The programs use function libraries; this helps ensure consistency and increase portability. The programs and functions can be linked together in many ways; this lets users build custom programs for analysis of specific experiments.

20.
Dowsey AW, Dunn MJ, Yang GZ. Proteomics, 2004, 4(12): 3800-3812
The quest for high-throughput proteomics has revealed a number of critical issues. Whilst improved two-dimensional gel electrophoresis (2-DE) sample preparation, staining and imaging issues are being actively pursued by industry, reliable high-throughput spot matching and quantification remains a significant bottleneck in the bioinformatics pipeline, thus restricting the flow of data to mass spectrometry through robotic spot excision and protein digestion. To this end, it is important to establish a full multi-site Grid infrastructure for the processing, archival, standardisation and retrieval of proteomic data and metadata. Particular emphasis needs to be placed on large-scale image mining and statistical cross-validation for reliable, fully automated differential expression analysis, and the development of a statistical 2-DE object model and ontology that underpins the emerging HUPO PSI GPS (Human Proteome Organization Proteomics Standards Initiative General Proteomics Standards). The first step towards this goal is to overcome the computational and communications burden entailed by the image analysis of 2-DE gels with Grid enabled cluster computing. This paper presents the proTurbo framework as part of the ProteomeGRID, which utilises Condor cluster management combined with CORBA communications and JPEG-LS lossless image compression for task farming. A novel probabilistic eager scheduler has been developed to minimise make-span, where tasks are duplicated in response to the likelihood of the Condor machines' owners evicting them. A 60 gel experiment was pair-wise image registered (3540 tasks) on a 40 machine Linux cluster. Real-world performance and network overhead was gauged, and Poisson distributed worker evictions were simulated. Our results show a 4:1 lossless and 9:1 near lossless image compression ratio and so network overhead did not affect other users. With 40 workers a 32x speed-up was seen (80% resource efficiency), and the eager scheduler reduced the impact of evictions by 58%.
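The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The interpretation that the 3540 tasks correspond to ordered gel pairs (each gel registered against every other gel in both directions) is an inference from the numbers, not something the abstract states; the variable names are illustrative.

```python
gels = 60
workers = 40
speedup = 32

# Ordered pairs: every gel registered against every other gel, both directions.
ordered_pairs = gels * (gels - 1)

# Unordered pairs, for comparison, would be half as many.
unordered_pairs = ordered_pairs // 2

# Resource efficiency is the achieved speed-up divided by the number of workers.
efficiency = speedup / workers
```

Here `ordered_pairs` evaluates to 3540, matching the reported task count, and `efficiency` evaluates to 0.8, matching the reported 80% resource efficiency.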
