Similar Articles
20 similar articles found.
1.
Ongoing optimization of proteomic methodologies seeks to improve both the coverage and the confidence of protein identifications. Optimizing sample preparation and including technical replicates (repeated instrumental analysis of the same sample) and biological replicates (multiple individual samples) are crucial in proteomic studies to avoid the pitfalls of single-point analysis and under-sampling. Phosphopeptides were isolated from HeLa cells and analyzed by nano-reversed-phase liquid chromatography electrospray ionization tandem mass spectrometry (nano-RP-LC-MS/MS). We observed that a detergent-based protein extraction approach, followed by additional steps for nucleic acid removal, provided a simple alternative to the widely used Trizol extraction. The evaluation of four technical replicates demonstrated measurement reproducibility, with a low percent variance in peptide responses of approximately 3%, and additional peptide identifications were made with each added technical replicate. The inclusion of six technical replicates for moderately complex protein extracts (approximately 4000 uniquely identified peptides per data set) affords the optimal collection of peptide information.
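A minimal sketch of the two calculations this abstract rests on: percent variance (CV) of peptide responses across technical replicates, and the cumulative gain in unique identifications as replicates are added. All data below are simulated placeholders, not the study's measurements; the ~3% noise level and the per-run identification rate are assumptions chosen only to mirror the reported figures.

    import numpy as np

    rng = np.random.default_rng(0)
    n_peptides, n_reps = 4000, 4
    base = rng.lognormal(mean=10, sigma=1, size=(n_peptides, 1))
    intensities = base * rng.normal(1.0, 0.03, size=(n_peptides, n_reps))  # ~3% noise

    cv = intensities.std(axis=1, ddof=1) / intensities.mean(axis=1) * 100
    print(f"median percent variance across technical replicates: {np.median(cv):.1f}%")

    # unique peptide IDs accumulated as replicates are added; each run samples
    # a random subset of the identifiable peptides (under-sampling)
    ids_per_run = [set(rng.choice(n_peptides, size=3000, replace=False))
                   for _ in range(n_reps)]
    seen = set()
    for i, ids in enumerate(ids_per_run, 1):
        seen |= ids
        print(f"after {i} technical replicate(s): {len(seen)} unique peptides")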

2.
A variance statistic was used to partition the total variance into the portion attributable to each step of a TMEN (nitrogen-corrected true metabolizable energy) assay procedure. Estimation of the TMEN of wheat was used as an example. The variance statistic can also be used to optimize the design of a TMEN experiment with respect to the cost of the experiment and the desired accuracy of the result. Experimental design optimization is accomplished by providing a functional relationship between the accuracy of the estimate, the number of replicates of feed, the number of birds used in the experiment, and the cost of each step. The variance statistic is also a useful tool for identifying and removing outliers and highly variable measurements; this feature was demonstrated with the chosen example data. Gross energy of the feed explains approximately 50% of the variance of the TMEN estimate, depending on how many replicates are evaluated. Nitrogen content of the feed sample explains approximately 40% of the total variance, so it is recommended to replicate this measurement as many times as possible; ten replicates were recommended for the example data. The energy content of excreta from fed birds represented the next largest source of variance, at approximately 4% of the total. If within-bird variance is large, better homogenization of the sample and more replicates are recommended; if among-bird variance is significant, more birds should be used. Nitrogen content of excreta from fed birds represented less than 2.5% of the total variance, and the energy and nitrogen content of excreta from unfed birds combined represented less than 2%, suggesting that the number of unfed birds and the number of excreta sub-samples may be reduced without adversely affecting the accuracy of the TMEN estimate. Variance due to the amount of excreta collected from the fed birds, and variance due to the amount of feed consumed by the birds, are expected to be small. This result suggested that force-feeding may not be necessary for accurate TMEN estimates.
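The per-step partitioning can be illustrated with first-order (delta-method) error propagation: each measured input contributes roughly (dTMEN/dx)^2 * sigma_x^2 / n_x to the total variance. The simplified TMEN-like expression below and all means, standard deviations, and replicate counts are illustrative assumptions, not the paper's actual model.

    import numpy as np

    def tmen(ge_feed, n_feed, e_fed, e_unfed):
        # hypothetical simplified TMEN-like quantity per gram of feed
        return ge_feed - 8.22 * n_feed - (e_fed - e_unfed)

    means  = dict(ge_feed=16.0, n_feed=0.35, e_fed=2.0, e_unfed=0.8)
    sigmas = dict(ge_feed=0.20, n_feed=0.02, e_fed=0.15, e_unfed=0.05)
    n_reps = dict(ge_feed=3,    n_feed=3,    e_fed=5,    e_unfed=5)

    contrib = {}
    for name in means:
        hi, lo = dict(means), dict(means)
        hi[name] += 1e-6
        lo[name] -= 1e-6
        deriv = (tmen(**hi) - tmen(**lo)) / 2e-6      # numerical dTMEN/dx
        contrib[name] = deriv ** 2 * sigmas[name] ** 2 / n_reps[name]

    total = sum(contrib.values())
    for name, v in contrib.items():
        print(f"{name}: {100 * v / total:.1f}% of total variance")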

3.
If biological questions are to be answered using quantitative proteomics, it is essential to design experiments with sufficient power to detect changes in expression. Sample subpooling is a strategy that can reduce variance while still allowing studies to encompass biological variation. Underlying sample pooling strategies is the biological-averaging assumption: that the measurements taken on the pool equal the average of the measurements taken on the individuals. This study finds no evidence of a systematic bias triggered by sample pooling for DIGE, and pooling can be useful in reducing biological variation. For the first time in quantitative proteomics, the two sources of variance were decoupled, and it was found that technical variance predominates for mouse brain while biological variance predominates for human brain. A power analysis found that as the number of individuals pooled increased, the number of replicates needed declined but the number of biological samples required increased. Repeated measures of biological samples decreased the number of samples required but increased the number of gels needed. An example cost-benefit analysis demonstrates how researchers can optimise their experiments while taking into account the available resources.
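A sketch of the pooling trade-off under the biological-averaging assumption: pooling k individuals divides the biological variance component by k, so fewer gels reach the same power but more individuals are consumed. The normal-approximation power formula and all variance and effect values are illustrative assumptions, not the study's estimates.

    import math

    def power(n_per_group, pool_size, effect, var_bio, var_tech, z_alpha=1.96):
        # normal-approximation power for a two-group comparison of means
        se = math.sqrt(2 * (var_bio / pool_size + var_tech) / n_per_group)
        z = abs(effect) / se - z_alpha
        return 0.5 * (1 + math.erf(z / math.sqrt(2)))   # Phi(z)

    for k in (1, 2, 4, 8):
        n = next(n for n in range(2, 500)
                 if power(n, k, effect=0.5, var_bio=0.4, var_tech=0.1) >= 0.8)
        print(f"pool size {k}: {n} gels per group for 80% power "
              f"({n * k} individuals per group)")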

4.
INTRODUCTION: Microarray experiments often have complex designs that include sample pooling, biological and technical replication, sample pairing and dye-swapping. This article demonstrates how statistical modelling can illuminate issues in the design and analysis of microarray experiments, and how this information can then be used to plan effective studies. METHODS: A very detailed statistical model for microarray data is introduced to show the possible sources of variation present in even the simplest microarray experiments. Based on this model, the efficacy of common experimental designs, normalisation methodologies and analyses is determined. RESULTS: When the cost of the arrays is high compared with the cost of samples, sample pooling and spot replication are shown to be efficient variance-reduction methods, whereas technical replication of whole arrays is demonstrated to be very inefficient. Dye-swap designs can use biological replicates rather than technical replicates to improve efficiency and simplify analysis. When the cost of samples is high and technical variation is a major portion of the error, technical replication can be cost-effective. Normalisation by centring on a small number of spots may reduce array effects but can introduce considerable variation in the results; centring using the bulk of spots on the array is less variable. Similarly, normalisation based on regression methods can introduce variability. Except for normalisation methods based on spiking controls, all normalisation requires that most genes are not differentially expressed. Methods based on spatial location and/or intensity also require that the non-differentially expressed genes are distributed at random with respect to location and intensity. Spotting designs should be planned carefully so that spot replicates are widely spaced on the array and genes with similar expression patterns are not clustered. DISCUSSION: The tools of statistical design of experiments can be applied to microarray experiments to improve both the efficiency and the validity of the studies. Given the high cost of microarray experiments, the benefits of statistical input prior to running the experiment cannot be over-emphasised.
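A toy calculation in the spirit of the RESULTS paragraph: under an additive model with biological (sample) and array (technical) variance components, the variance of a treatment mean can be compared across designs that spend the same fixed array budget. The variance components and budget are illustrative assumptions, not values from the article.

    var_bio, var_array = 0.30, 0.10   # assumed variance components (log scale)
    n_arrays = 8                      # fixed array budget

    def var_mean(pool_size, arrays_per_sample):
        # each array measures a pool of `pool_size` biological samples;
        # each sample (pool) is hybridised to `arrays_per_sample` arrays
        n_samples = n_arrays // arrays_per_sample
        return var_bio / (n_samples * pool_size) + var_array / n_arrays

    for label, pool, tech in [("one sample per array", 1, 1),
                              ("pools of 4 per array", 4, 1),
                              ("technical duplicates", 1, 2)]:
        print(f"{label}: Var(treatment mean) = {var_mean(pool, tech):.4f}")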

5.
The experimental variance of enzymic steady-state kinetic experiments depends on velocity as approximated by a power function, Var(v) = K1 * v^alpha (Askelöf, P., Korsfeldt, M. and Mannervik, B. (1976) Eur. J. Biochem. 69, 61-67). The values of the constants (K1, alpha) can be estimated by making replicate measurements of velocity, and the inverse of the function can then be used as a weighting factor. In order to avoid measurement of a large number of replicates to establish the error structure of a kinetic data set, a different approach was tested. After a preliminary regression using a 'good model', which satisfies reasonable goodness-of-fit criteria, the residuals were taken to represent the experimental error. The neighbouring residuals were grouped together and the sum of their mean squared values was used as a measure of the variance in the neighbourhood of the corresponding measurements. The values of the constants obtained in this way agreed with those obtained by replicates.
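A sketch of the residual-grouping approach the abstract describes: fit a preliminary model, group neighbouring residuals, use their mean squares as local variances, then fit log Var = log K1 + alpha * log v to recover the error model and the weights 1/Var(v). The Michaelis-Menten data are simulated with a true alpha of 2; all numbers are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    Vmax, Km = 10.0, 2.0
    s = np.linspace(0.2, 20, 40)                  # substrate concentrations
    v_true = Vmax * s / (Km + s)
    v_obs = v_true + rng.normal(0, 0.05 * v_true)  # sd proportional to v

    # residuals of the preliminary 'good model' stand in for experimental error
    resid = v_obs - v_true

    group = 5
    order = np.argsort(v_true)
    v_mid, var_local = [], []
    for i in range(0, len(s), group):
        idx = order[i:i + group]
        v_mid.append(v_true[idx].mean())
        var_local.append(np.mean(resid[idx] ** 2))  # local mean-square variance

    # fit log Var = log K1 + alpha * log v
    alpha, logK1 = np.polyfit(np.log(v_mid), np.log(var_local), 1)
    K1 = np.exp(logK1)
    print(f"estimated K1 = {K1:.4f}, alpha = {alpha:.2f}")
    weights = 1.0 / (K1 * v_true ** alpha)   # weighting factors for the final fit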

6.
Experiments using quantitative real-time PCR to test hypotheses are limited by technical and biological variability; we seek to minimise sources of confounding variability through optimum use of biological and technical replicates. The quality of an experiment design is commonly assessed by calculating its prospective power. Such calculations rely on knowledge of the expected variances of the measurements of each group of samples and the magnitude of the treatment effect, the estimation of which is often uninformed and unreliable. Here we introduce a method that exploits a small pilot study to estimate the biological and technical variances in order to improve the design of a subsequent large experiment. We measure the variance contributions at several ‘levels’ of the experiment design and provide a means of using this information to predict both the total variance and the prospective power of the assay. A validation of the method is provided through a variance analysis of representative genes in several bovine tissue types. We also discuss the effect of normalisation to a reference gene in terms of the measured variance components of the gene of interest. Finally, we describe a software implementation of these methods, powerNest, which gives the user the opportunity to input data from a pilot study and interactively modify the design of the assay. The software automatically calculates expected variances, statistical power, and the optimal design of the larger experiment. powerNest enables the researcher to minimise the total confounding variance and maximise prospective power for a specified maximum cost for the large study.
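A minimal sketch of the underlying idea, not the powerNest implementation: method-of-moments estimates of biological and technical variance from a nested pilot (animals x technical qPCR replicates), then the predicted variance of a group mean for candidate designs. The pilot layout and all Cq numbers are simulated assumptions.

    import numpy as np

    rng = np.random.default_rng(2)
    b, t = 4, 3                                # pilot: 4 animals, 3 reps each
    cq = (rng.normal(24, 0.8, size=(b, 1))     # biological sd of 0.8 cycles
          + rng.normal(0, 0.3, size=(b, t)))   # technical sd of 0.3 cycles

    animal_means = cq.mean(axis=1)
    ms_within  = np.mean(cq.var(axis=1, ddof=1))   # technical mean square
    ms_between = t * animal_means.var(ddof=1)      # among-animal mean square
    var_tech = ms_within
    var_bio  = max((ms_between - ms_within) / t, 0.0)
    print(f"technical variance {var_tech:.3f}, biological variance {var_bio:.3f}")

    def var_group_mean(n_animals, n_tech):
        return var_bio / n_animals + var_tech / (n_animals * n_tech)

    for design in [(6, 2), (12, 1), (6, 4)]:   # (animals, technical reps)
        print(design, f"Var(group mean) = {var_group_mean(*design):.4f}")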

7.
8.
Microarray experiments are being increasingly used in molecular biology. A common task is to detect genes with differential expression across two experimental conditions, such as two different tissues or the same tissue at two time points of biological development. To take proper account of statistical variability, some statistical approaches based on the t-statistic have been proposed. In constructing the t-statistic, one needs to estimate the variance of gene expression levels. With a small number of replicated array experiments, the variance estimation can be challenging. For instance, although the sample variance is unbiased, it may have large variability, leading to a large mean squared error. For duplicated array experiments, a new approach based on simple averaging has recently been proposed in the literature. Here we consider two more general approaches based on nonparametric smoothing. Our goal is to assess the performance of each method empirically. The three methods are applied to a colon cancer data set containing 2,000 genes. Using two arrays, we compare the variance estimates obtained from the three methods. We also consider their impact on the t-statistics. Our results indicate that the three methods give variance estimates close to each other. Due to its simplicity and generality, we recommend the use of the smoothed sample variance for data with a small number of replicates.
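A sketch of one smoothed-sample-variance scheme in this spirit: each gene's variance is replaced by a running average over genes with similar mean expression, then used in the t-statistic. The running-mean smoother, window size, and data are illustrative assumptions, not the paper's exact estimator.

    import numpy as np

    rng = np.random.default_rng(3)
    n_genes, n_reps = 2000, 2
    mu = rng.normal(8, 2, size=(n_genes, 1))
    x = mu + rng.normal(0, 0.5, (n_genes, n_reps))   # condition 1, duplicates
    y = mu + rng.normal(0, 0.5, (n_genes, n_reps))   # condition 2, duplicates

    mean_expr = np.concatenate([x, y], axis=1).mean(axis=1)
    raw_var = 0.5 * (x.var(axis=1, ddof=1) + y.var(axis=1, ddof=1))

    # running-mean smoothing of variance over genes ordered by mean expression
    # (window edges are biased; acceptable for a sketch)
    order = np.argsort(mean_expr)
    window = 101
    smoothed = np.empty(n_genes)
    smoothed[order] = np.convolve(raw_var[order],
                                  np.ones(window) / window, mode="same")

    t = (x.mean(axis=1) - y.mean(axis=1)) / np.sqrt(2 * smoothed / n_reps)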

9.
Applying appropriate error models and conservative estimates to microarray data helps to reduce the number of false predictions and allows one to focus on biologically relevant observations. Several key conclusions have been drawn from the statistical analysis of global gene expression data: it is worth keeping core information for each experiment, including raw and processed data; biological and technical replicates are needed; careful experimental design makes the analysis simpler and more powerful; the choice of the similarity measure is nontrivial and depends on the goal of an experiment; array information must be complemented with other data; and gene expression studies are 'hypothesis generators'.

10.
The validity of limiting dilution assays can be compromised or negated by the use of statistical methodology which does not consider all issues surrounding the biological process. This study critically evaluates statistical methods for estimating the mean frequency of responding cells in multiple-sample limiting dilution assays. We show that methods that pool limiting dilution assay data, or samples, are unable to estimate the variance appropriately. In addition, we use Monte Carlo simulations to evaluate an unweighted mean of the maximum likelihood estimator, an unweighted mean based on the jackknife estimator, and a log transform of the maximum likelihood estimator. For small culture replicate sizes, the log transform outperforms both unweighted mean procedures. For moderate culture replicate sizes, the unweighted mean based on the jackknife produces the most acceptable results. This study also addresses the important issue of experimental design in multiple-sample limiting dilution assays. In particular, we demonstrate that optimization of multiple-sample limiting dilution assays is achieved by increasing the number of biological samples at the expense of repeat cultures.
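For context, a sketch of the single-hit Poisson model behind limiting dilution assays: the fraction of negative cultures estimates exp(-f * cells), giving a per-sample maximum likelihood estimate of the responding-cell frequency f, and per-sample estimates are combined on the log scale as the study recommends for small replicate sizes. All counts and the single-dilution simplification are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(4)
    true_freq = 1 / 5000          # responding cells per cell plated
    cells_per_well, wells = 10000, 24
    n_samples = 6                 # biological samples, each a small assay

    log_estimates = []
    for _ in range(n_samples):
        p_neg = np.exp(-true_freq * cells_per_well)          # Poisson zero term
        negatives = max(int(rng.binomial(wells, p_neg)), 1)  # guard against log(0)
        f_hat = -np.log(negatives / wells) / cells_per_well  # per-sample MLE
        log_estimates.append(np.log(f_hat))

    f_combined = np.exp(np.mean(log_estimates))  # combine on the log scale
    print(f"combined frequency estimate: 1 in {1 / f_combined:.0f} cells")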

11.
Variance stabilization is a step in the preprocessing of microarray data that can greatly benefit the performance of subsequent statistical modeling and inference. Due to the often limited number of technical replicates for Affymetrix and cDNA arrays, achieving variance stabilization can be difficult. Although the Illumina microarray platform provides a larger number of technical replicates on each array (usually over 30 randomly distributed beads per probe), these replicates have not been leveraged in the current log2 data transformation process. We devised a variance-stabilizing transformation (VST) method that takes advantage of the technical replicates available on an Illumina microarray. We compared VST with the log2 transformation and variance-stabilizing normalization (VSN) using the Kruglyak bead-level data (2006) and the Barnes titration data (2005). The results on the Kruglyak data suggest that VST stabilizes the variances of bead replicates within an array. The results on the Barnes data show that VST can improve the detection of differentially expressed genes and reduce false-positive identifications. We conclude that although both VST and VSN are built upon the same model of measurement noise, VST stabilizes the variance better and more efficiently for the Illumina platform by leveraging the larger number of within-array replicates. The algorithms and Supplementary Data are included in the lumi package of Bioconductor, available at www.bioconductor.org.
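A sketch of a generalized-log transform built on the additive-plus-multiplicative noise model that VST and VSN share, with the transform constant estimated from within-array bead replicates. This is an illustration of the idea only, not the lumi implementation; all data and the quadratic mean-variance fit are assumptions.

    import numpy as np

    rng = np.random.default_rng(5)
    n_probes, n_beads = 1000, 30
    mu = rng.lognormal(6, 1, size=(n_probes, 1))
    beads = (mu * rng.lognormal(0, 0.1, (n_probes, n_beads))   # multiplicative
             + rng.normal(0, 20, (n_probes, n_beads)))          # additive noise

    probe_mean = beads.mean(axis=1)
    probe_var = beads.var(axis=1, ddof=1)

    # fit Var = a + b * mean^2; then c^2 = a / b sets the glog offset
    b_coef, a_coef = np.polyfit(probe_mean ** 2, probe_var, 1)
    c2 = max(a_coef / b_coef, 1.0)

    t = np.log2(beads + np.sqrt(beads ** 2 + c2))   # generalized log
    low = probe_mean < np.median(probe_mean)
    print(f"within-probe variance after transform, low vs high intensity: "
          f"{t[low].var(axis=1, ddof=1).mean():.3f} vs "
          f"{t[~low].var(axis=1, ddof=1).mean():.3f}")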

12.
13.
Two-dimensional difference gel electrophoresis (2-D DIGE) allows reliable quantification of global protein abundance changes. The threshold of significance for protein abundance changes depends on the experimental variation (biological and technical). This study estimates the biological, technical and total variation inherent to 2-D DIGE analysis of environmental bacteria, using the model organisms "Aromatoleum aromaticum" EbN1 and Phaeobacter gallaeciensis DSM 17395. For both bacteria, the soluble proteomes of replicate cultures were analyzed. For strains EbN1 and DSM 17395, respectively, coefficients of variation revealed a total variation below 19% and 15%, an average technical variation of 12% and 7%, and an average biological variation of 18% and 17%. Multivariate analysis of variance confirmed that the dominance of biological over technical variance was significant in most cases. To visualize the variances, the complex protein data were plotted with a multidimensional scaling technique. Furthermore, comparison of different treatment groups (different substrate conditions) demonstrated that variability within groups is significantly smaller than the differences caused by treatment.
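A sketch of separating technical from biological spot variance when replicate gels are run from replicate cultures, assuming the components add so that total CV^2 is roughly biological CV^2 plus technical CV^2. The nested layout and noise levels are simulated assumptions, not the study's data.

    import numpy as np

    rng = np.random.default_rng(10)
    n_spots, n_cult, n_gels = 500, 3, 2   # 3 cultures, 2 gels per culture
    mu = rng.lognormal(10, 1, size=(n_spots, 1, 1))
    abundance = (mu
                 * rng.lognormal(0, 0.17, (n_spots, n_cult, 1))       # biological
                 * rng.lognormal(0, 0.07, (n_spots, n_cult, n_gels)))  # technical

    spot_mean = abundance.mean(axis=(1, 2))
    tech_cv = abundance.std(axis=2, ddof=1).mean(axis=1) / spot_mean
    total_cv = abundance.std(axis=(1, 2), ddof=1) / spot_mean
    bio_cv = np.sqrt(np.clip(total_cv ** 2 - tech_cv ** 2, 0, None))
    print(f"median CV: technical {np.median(tech_cv):.0%}, "
          f"biological {np.median(bio_cv):.0%}, total {np.median(total_cv):.0%}")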

14.
We carried out a series of replicate experiments on DNA microarrays using two cell lines and two technologies: the Agilent Human 1A Microarray and the GE Amersham Codelink Uniset Human 20K I Bioarray. We demonstrated that quantifying the noise level as a function of signal strength allows identification of the absolute and differential mRNA expression levels at which biological variability can be resolved above measurement noise. This represents a new formulation of a sensitivity threshold that can be used to compare platforms. It was found that the correlation in expression level between platforms is considerably worse than the correlation between replicate measurements taken on the same platform. In addition, we carried out replicate measurements at different stages of sample processing. This novel approach enables us to quantify the noise introduced into the measurements at each step of the experimental protocol. We demonstrated how this information can be used to determine the most efficient means of using replicates to reduce experimental uncertainty.
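A sketch of a signal-dependent sensitivity threshold: bin replicate log-ratios by mean intensity, estimate the noise standard deviation per bin, and flag the intensities at which a 2-fold change exceeds twice the noise. The intensity-dependent noise curve and the 2-sd criterion are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(11)
    n = 5000
    mean_log = rng.uniform(4, 16, n)                   # mean log2 intensity
    noise_sd = 0.1 + 4.0 * 2.0 ** (-0.5 * mean_log)    # noise grows at low signal
    rep_diff = rng.normal(0.0, np.sqrt(2) * noise_sd)  # replicate log-ratios

    bins = np.linspace(4, 16, 7)
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (mean_log >= lo) & (mean_log < hi)
        sd = rep_diff[mask].std(ddof=1)
        tag = "  (2-fold change resolvable)" if 2 * sd < 1.0 else ""
        print(f"log2 intensity {lo:4.1f}-{hi:<4.1f}: noise sd {sd:.2f}{tag}")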

15.
1. When rigorous standards of collecting and analysing data are maintained, biological monitoring adds valuable information to water resource assessments. Decisions, from study design and field methods to laboratory procedures and data analysis, affect assessment quality. Subsampling, a laboratory procedure in which researchers count and identify a random subset of field samples, is widespread yet controversial. What are the consequences of subsampling?
2. To explore this question, random subsamples were computer-generated for subsample sizes ranging from 100 to 1000 individuals and compared with the results of counting whole samples. The study was done on benthic invertebrate samples collected from five Puget Sound lowland streams near Seattle, WA, USA. For each replicate subsample, values for 10 biological attributes (e.g. total number of taxa) and for the 10-metric benthic index of biological integrity (B-IBI) were computed.
3. Variance of each metric and of B-IBI for each subsample size was compared with the variance of fully counted samples, generated using a bootstrap algorithm. From these measures of variance, we computed the maximum number of distinguishable classes of stream condition as a function of subsample size for each metric and for B-IBI.
4. Subsampling significantly decreased the maximum number of distinguishable stream classes for B-IBI, from 8.2 classes for fully counted samples to 2.8 for 100-organism subsamples. For subsamples containing 100–300 individuals, discriminatory power was low enough to mislead water resource decision makers.
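A sketch of how fixed-count subsampling inflates metric variance: a fully counted sample is resampled at several subsample sizes, full counts are bootstrapped, and a crude "distinguishable classes" figure is taken as the metric's range divided by four standard deviations. That definition, the taxon-abundance pool, and the metric range are illustrative assumptions, not necessarily the paper's exact formulas.

    import numpy as np

    rng = np.random.default_rng(6)
    # hypothetical fully counted sample: individuals labelled by taxon
    taxa_pool = np.repeat(np.arange(60), rng.integers(1, 50, size=60))

    def richness(sample):
        return len(np.unique(sample))

    metric_range = 60.0   # assumed attainable range of the total-taxa metric
    for size in (100, 300, 1000):
        reps = [richness(rng.choice(taxa_pool, size=size, replace=False))
                for _ in range(200)]
        sd = np.std(reps, ddof=1)
        print(f"subsample {size:>4}: mean richness {np.mean(reps):5.1f}, "
              f"~{metric_range / (4 * sd):.1f} distinguishable classes")

    # fully counted samples: bootstrap resampling, as in the study
    boot = [richness(rng.choice(taxa_pool, size=taxa_pool.size, replace=True))
            for _ in range(200)]
    print(f"full count: ~{metric_range / (4 * np.std(boot, ddof=1)):.1f} classes")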

16.
We assess the reliability of isobaric tags for relative and absolute quantitation (iTRAQ), based on different types of replicate analyses that take technical, experimental, and biological variation into account. In total, 10 iTRAQ experiments were analyzed across three domains of life, involving Saccharomyces cerevisiae KAY446, Sulfolobus solfataricus P2, and Synechocystis sp. PCC 6803. The coverage of protein expression in iTRAQ analysis increases as the variation tolerance increases. In brief, a cutoff at ±50% variation (±0.50) would yield 88% coverage in quantification based on an analysis of biological replicates, whereas technical replicate analysis produces a higher coverage of 95% at a lower cutoff of ±30% variation. Experimental (iTRAQ) variation behaves similarly to biological variation, which suggests that most of the measurable deviation comes from biological variation. These findings underline the importance of replicate analysis as a validation tool and benchmarking technique in protein expression analysis.
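A sketch of the coverage calculation: the fraction of proteins whose replicate ratios agree within a given tolerance cutoff. The ratio spreads below are simulated and were chosen only to roughly reproduce the abstract's coverage figures; the real values came from the replicate experiments.

    import numpy as np

    rng = np.random.default_rng(7)
    n_proteins = 1500
    bio = rng.normal(1.0, 0.32, n_proteins)    # biological replicate ratios
    tech = rng.normal(1.0, 0.15, n_proteins)   # technical replicate ratios

    for label, ratios, cutoff in [("biological", bio, 0.50),
                                  ("technical", tech, 0.30)]:
        coverage = np.mean(np.abs(ratios - 1.0) <= cutoff)
        print(f"{label}: {coverage:.0%} coverage at a cutoff of +/-{cutoff:.2f}")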

17.
Traditional analyses of feeding experiments that test consumer preference for an array of foods suffer from several defects. We have modified the experimental design to incorporate into a multivariate analysis the variance due to autogenic change in control replicates. Our design allows the multiple foods to be physically paired with their control counterparts. This physical proximity of the multiple food choices in control/experimental pairs ensures that the variance attributable to external environmental factors jointly affects all combinations within each replicate. Our variance term, therefore, is not a contrived estimate, as it is under the random-pairing strategy proposed by previous studies. The statistical analysis then proceeds using standard multivariate statistical tests. We conducted a multiple-choice feeding experiment using our experimental design and used a Monte Carlo analysis to compare our results with those obtained from an experimental design that employed the random-pairing strategy. Our experimental design allowed detection of moderate differences among feeding means when the random design did not.
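A sketch of the autogenic-change correction in a paired multiple-choice design: each food is physically paired with a caged control, and consumption is the treatment mass change corrected by the paired control's change; corrected rows then feed a standard multivariate test. All masses, the correction form, and the food effects are simulated assumptions.

    import numpy as np

    rng = np.random.default_rng(12)
    n_reps, n_foods = 10, 4
    t_init = np.full((n_reps, n_foods), 5.0)               # treatment initial (g)
    c_init = np.full((n_reps, n_foods), 5.0)               # paired control initial
    auto = rng.normal(0.95, 0.03, (n_reps, n_foods))       # shared autogenic change
    eaten = np.array([1.2, 0.8, 0.8, 0.7]) + rng.normal(0, 0.2, (n_reps, n_foods))

    c_final = c_init * auto
    t_final = t_init * auto - eaten

    # control-corrected consumption per food within each paired replicate
    consumption = t_init * (c_final / c_init) - t_final
    print("mean corrected consumption per food:", consumption.mean(axis=0).round(2))
    # rows of `consumption` feed a standard multivariate test (e.g. Hotelling's T^2)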

18.
In studies designed to compare different methods of measurement where more than two methods are compared or replicate measurements by each method are available, standard statistical approaches such as computation of limits of agreement are not directly applicable. A model is presented for comparing several methods of measurement in the situation where replicate measurements by each method are available. Measurements are viewed as classified by method, subject and replicate. Models assuming exchangeable as well as non-exchangeable replicates are considered. A fitting algorithm is presented that allows the estimation of linear relationships between methods as well as relevant variance components. The algorithm only uses methods already implemented in most statistical software.
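A moments-based sketch of limits of agreement for two methods with replicates: work from per-subject means, then add back the within-subject (replicate) variance to recover the variance of a single-measurement difference. This is a simple stand-in for the model-fitting algorithm the paper describes; all data are simulated.

    import numpy as np

    rng = np.random.default_rng(8)
    n_sub, n_rep = 30, 3
    truth = rng.normal(100, 15, size=(n_sub, 1))
    m1 = truth + rng.normal(0, 4, (n_sub, n_rep))      # method 1 replicates
    m2 = truth + 2 + rng.normal(0, 6, (n_sub, n_rep))  # method 2, bias of +2

    diff_means = m1.mean(axis=1) - m2.mean(axis=1)
    var_within = m1.var(axis=1, ddof=1).mean() + m2.var(axis=1, ddof=1).mean()
    # variance of a difference between single measurements, not subject means
    var_single = diff_means.var(ddof=1) + var_within * (1 - 1 / n_rep)
    bias = diff_means.mean()
    half = 1.96 * np.sqrt(var_single)
    print(f"bias {bias:.2f}, 95% limits of agreement "
          f"{bias - half:.2f} to {bias + half:.2f}")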

19.
Differential analysis of DNA microarray gene expression data
Here, we review briefly the sources of experimental and biological variance that affect the interpretation of high-dimensional DNA microarray experiments. We discuss methods using a regularized t-test based on a Bayesian statistical framework that allow the identification of differentially regulated genes with a higher level of confidence than a simple t-test when only a few experimental replicates are available. We also describe a computational method for calculating the global false-positive and false-negative levels inherent in a DNA microarray data set. This method provides a probability of differential expression for each gene based on experiment-wide false-positive and -negative levels driven by experimental error and biological variance.
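A sketch of a regularized t-test in this Bayesian spirit: each gene's variance is shrunk toward a background variance with a prior strength of v0 pseudo-replicates, stabilizing the statistic when replicates are few. The global background estimator (here just the mean, where a local estimate could be used), v0, and all data are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(9)
    n_genes, n = 2000, 3
    mu = rng.normal(8, 2, size=(n_genes, 1))
    a = mu + rng.normal(0, 0.4, (n_genes, n))   # condition A, 3 replicates
    b = mu + rng.normal(0, 0.4, (n_genes, n))   # condition B, 3 replicates

    s2 = 0.5 * (a.var(axis=1, ddof=1) + b.var(axis=1, ddof=1))
    s2_bg = np.full(n_genes, s2.mean())  # background variance (could be local)
    v0 = 5                               # prior strength in pseudo-replicates

    # posterior (regularized) variance shrinks noisy gene-wise estimates
    s2_reg = (v0 * s2_bg + (n - 1) * s2) / (v0 + n - 1)
    t_reg = (a.mean(axis=1) - b.mean(axis=1)) / np.sqrt(2 * s2_reg / n)
    print(f"largest |regularized t|: {np.abs(t_reg).max():.2f}")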

20.
Environmental DNA (eDNA) is DNA isolated from field samples, and it is increasingly used to infer the presence or absence of particular species in an ecosystem. However, the combination of sampling procedures and subsequent molecular amplification of eDNA can lead to spurious results, so it is imperative that eDNA studies include a statistical framework for interpreting eDNA presence/absence data. We reviewed the published literature for studies that utilized eDNA where the species density was known, and compared the probability of detecting the focal species with the sampling and analysis protocols. Although the biomass of the target species and the volume per sample did not affect detectability, the number of field replicates and the number of samples from each replicate were positively related to detection. Additionally, an increased number of PCR replicates and increased primer specificity significantly improved detectability. Accordingly, we advocate increased use of occupancy modelling as a method to incorporate the effects of sampling effort and PCR sensitivity into eDNA study design. Based on simulation results and the hierarchical nature of occupancy models, we suggest that field replicates, as opposed to molecular replicates, result in better detection probabilities for target species.
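A sketch of the hierarchical logic behind that recommendation: with J field replicates per site and K PCR replicates per field sample, cumulative detection saturates quickly in K but keeps growing in J. The per-replicate probabilities below are assumptions for illustration, not estimates from the review.

    p_field = 0.4   # P(target DNA captured in one field replicate | present)
    p_pcr   = 0.7   # P(one PCR replicate amplifies | DNA in the sample)

    def p_detect(J, K):
        per_sample = p_field * (1 - (1 - p_pcr) ** K)   # one field sample
        return 1 - (1 - per_sample) ** J                # across field samples

    for J, K in [(1, 8), (2, 4), (4, 2), (8, 1)]:   # same total PCR effort
        print(f"{J} field x {K} PCR replicates: P(detect) = {p_detect(J, K):.2f}")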
