Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Variance stabilization is a step in the preprocessing of microarray data that can greatly benefit the performance of subsequent statistical modeling and inference. Due to the often limited number of technical replicates for Affymetrix and cDNA arrays, achieving variance stabilization can be difficult. Although the Illumina microarray platform provides a larger number of technical replicates on each array (usually over 30 randomly distributed beads per probe), these replicates have not been leveraged in the current log2 data transformation process. We devised a variance-stabilizing transformation (VST) method that takes advantage of the technical replicates available on an Illumina microarray. We have compared VST with log2 and variance-stabilizing normalization (VSN) by using the Kruglyak bead-level data (2006) and Barnes titration data (2005). The results of the Kruglyak data suggest that VST stabilizes variances of bead-replicates within an array. The results of the Barnes data show that VST can improve the detection of differentially expressed genes and reduce false-positive identifications. We conclude that although both VST and VSN are built upon the same model of measurement noise, VST stabilizes the variance better and more efficiently for the Illumina platform by leveraging the availability of a larger number of within-array replicates. The algorithms and Supplementary Data are included in the lumi package of Bioconductor, available at: www.bioconductor.org.

2.
MOTIVATION: Standard statistical techniques often assume that data are normally distributed, with constant variance not depending on the mean of the data. Data that violate these assumptions can often be brought in line with the assumptions by application of a transformation. Gene-expression microarray data have a complicated error structure, with a variance that changes with the mean in a non-linear fashion. Log transformations, which are often applied to microarray data, can inflate the variance of observations near background. RESULTS: We introduce a transformation that stabilizes the variance of microarray data across the full range of expression. Simulation studies also suggest that this transformation approximately symmetrizes microarray data.

3.
A gene-expression microarray datum is modeled as an exponential expression signal (log-normal distribution) and additive noise. Variance-stabilizing transformation based on this model is useful for improving the uniformity of variance, which is often assumed for conventional statistical analysis methods. However, the existing method of estimating transformation parameters may not be perfect because of poor management of outliers. By employing an information normalization technique, we have developed an improved parameter estimation method, which enables statistically more straightforward outlier exclusion and works well even in the case of small sample size. Validation of this method with experimental data has suggested that it is superior to the conventional method.

4.
MOTIVATION: A variance stabilizing transformation for microarray data was recently introduced independently by several research groups. This transformation has sometimes been called the generalized logarithm or glog transformation. In this paper, we derive several alternative approximate variance stabilizing transformations that may be easier to use in some applications. RESULTS: We demonstrate that the started-log and the log-linear-hybrid transformation families can produce approximate variance stabilizing transformations for microarray data that are nearly as good as the generalized logarithm (glog) transformation. These transformations may be more convenient in some applications.
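
The three transformation families named in this abstract can be sketched as follows. The abstract itself gives no formulas, so the functional forms below are the ones commonly quoted for these families and should be read as assumptions; the parameter names `lam`, `c` and `k` are hypothetical, and in practice they would be estimated from the data rather than chosen by hand.

```python
import math


def glog(y, lam):
    """Generalized logarithm: ln(y + sqrt(y^2 + lam)).

    Defined for every real y when lam > 0, and close to ln(2y) for large y.
    """
    return math.log(y + math.sqrt(y * y + lam))


def started_log(y, c):
    """Started logarithm: ln(y + c). Requires y + c > 0."""
    return math.log(y + c)


def log_linear_hybrid(y, k):
    """Linear at or below the cut-point k, logarithmic above it.

    The linear piece y/k + ln(k) - 1 matches ln(y) in both value and slope
    at y = k, so the transformation is smooth at the cut-point.
    """
    if y <= k:
        return y / k + math.log(k) - 1.0
    return math.log(y)


if __name__ == "__main__":
    # Illustrative intensities only, including values at and below background.
    for y in (-20.0, 0.0, 50.0, 5000.0):
        print(y, glog(y, lam=100.0), log_linear_hybrid(y, k=50.0))
```

Unlike the plain logarithm, `glog` and `log_linear_hybrid` accept zero and negative background-corrected intensities, which is the practical reason such families are considered for low-expression data.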

5.
6.
The high spectral congestion typically observed in one-dimensional (1D) 1H nuclear magnetic resonance (NMR) spectra of tissue extracts and biofluids limits the metabolic information that can be extracted. This study evaluates the application of two-dimensional J-resolved (JRES) spectroscopy for metabolomics, which can provide proton-decoupled projected 1D spectra (p-JRES). This approach is illustrated by an investigation of embryogenesis in Japanese medaka (Oryzias latipes), an established fish model for developmental toxicology. When combined with optimized spectral pre-processing, including a 0.005-ppm bin width for data segmentation and a logarithmic transformation, the reduced congestion in the p-JRES spectra increases the likelihood that a specific metabolite can be accurately integrated and thus increases the extractable information content of the spectra. Principal components analysis of the p-JRES spectra reveals the concept of a developmental trajectory that summarizes the changes in the NMR-visible metabolome throughout medaka embryogenesis. Advantages and potential disadvantages of the p-JRES approach are discussed.
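
The pre-processing mentioned in this abstract (segmentation into 0.005-ppm bins followed by a logarithmic transformation) could look roughly like the sketch below. Only the bin width comes from the abstract; the function names, the summation within each bin, the base-10 logarithm and the `offset` added before taking the log are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(ppm, intensity, width=0.005):
    """Sum intensities into fixed-width chemical-shift bins.

    Returns a dict mapping bin index to summed intensity; bin i covers
    the interval [i * width, (i + 1) * width) in ppm.
    """
    bins = defaultdict(float)
    for shift, value in zip(ppm, intensity):
        bins[math.floor(shift / width)] += value
    return dict(sorted(bins.items()))


def log_transform(binned, offset=1.0):
    """Apply log10(total + offset) to each bin (offset is a hypothetical choice)."""
    return {idx: math.log10(total + offset) for idx, total in binned.items()}


if __name__ == "__main__":
    # Toy data points, not a real spectrum.
    ppm = [0.001, 0.004, 0.006, 0.012]
    intensity = [1.0, 2.0, 3.0, 4.0]
    print(log_transform(bin_spectrum(ppm, intensity)))
```

The offset keeps empty or near-zero bins finite after the transformation; a real pipeline would choose it, and the handling of points that fall exactly on a bin edge, with more care.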

7.
8.
Summary. Frailty models are widely used to model clustered survival data. Classical ways to fit frailty models are likelihood-based. We propose an alternative approach in which the original problem of "fitting a frailty model" is reformulated into the problem of "fitting a linear mixed model" using model transformation. We show that the transformation idea also works for multivariate proportional odds models and for multivariate additive risks models. It therefore bridges segregated methodologies as it provides a general way to fit conditional models for multivariate survival data by using mixed models methodology. To study the specific features of the proposed method we focus on frailty models. Based on a simulation study, we show that the proposed method provides a good and simple alternative for fitting frailty models for data sets with a sufficiently large number of clusters and moderate to large sample sizes within covariate-level subgroups in the clusters. The proposed method is applied to data from 27 randomized trials in advanced colorectal cancer, which are available through the Meta-Analysis Group in Cancer.

9.

Background  

Nuclear magnetic resonance spectroscopy is one of the primary tools in metabolomics analyses, where it is used to track and quantify changes in metabolite concentrations or profiles in response to perturbation through disease, toxicants or drugs. The spectra generated through such analyses are typically confounded by noise of various types, obscuring the signals and hindering downstream statistical analysis. Such issues are becoming increasingly significant as greater numbers of large-scale systems or longitudinal studies are being performed, in which many spectra from different conditions need to be compared simultaneously.

10.
11.
Predictive models in aerobiology: data transformation   Total citations: 1 (self-citations: 1; citations by others: 0)
This paper attempts to evaluate the effect of mathematical transformations of pollen and meteorological data used in aerobiological forecasting models. Stepwise multiple regression equations were developed in order to facilitate short-term forecasts during the pre-peak period. The daily mean pollen data (x_i), expressed as number of pollen grains per cubic metre of air, were used directly and transformed into different scales: log(x_i + 1), ln((x_i × 1000/Σp) + 1) and √x_i, where Σp is the sum of the daily mean values throughout the season. Thirteen meteorological parameters and the variable time were used as forecasting variables. The most reliable forecasts were obtained with data transformed by 'square root' and with untransformed data. Based on the results obtained, we recommend that the data be transformed by means of the square root if they do not show a normal distribution and that non-linear statistics be used in this kind of study.
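
The scales compared in this abstract can be written out as a small helper. This is a sketch under assumptions: the second scale is read as ln(x_i × 1000/Σp + 1), "log" is taken to mean base 10, and the function name and the `scale` labels are hypothetical.

```python
import math


def transform_pollen(daily, scale):
    """Transform daily mean pollen counts (grains per cubic metre of air).

    scale is one of:
      "raw"         untransformed values
      "log"         log10(x + 1)           (base 10 is an assumption)
      "ln_relative" ln(x * 1000 / total + 1), total = seasonal sum of daily means
      "sqrt"        square root of x
    """
    if scale == "raw":
        return list(daily)
    if scale == "log":
        return [math.log10(x + 1) for x in daily]
    if scale == "ln_relative":
        total = sum(daily)
        if total == 0:
            raise ValueError("seasonal total is zero; relative scale is undefined")
        return [math.log(x * 1000 / total + 1) for x in daily]
    if scale == "sqrt":
        return [math.sqrt(x) for x in daily]
    raise ValueError(f"unknown scale: {scale!r}")


if __name__ == "__main__":
    # Made-up counts for illustration, not data from the paper.
    counts = [0, 9, 16, 75]
    for name in ("raw", "log", "ln_relative", "sqrt"):
        print(name, transform_pollen(counts, name))
```

The "+ 1" inside both logarithmic scales keeps days with zero pollen defined, whereas the square root needs no such adjustment.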

12.
Semiparametric transformation models provide a very general framework for studying the effects of (possibly time-dependent) covariates on survival time and recurrent event times. Assessing the adequacy of these models is an important task because model misspecification affects the validity of inference and the accuracy of prediction. In this paper, we introduce appropriate time-dependent residuals for these models and consider the cumulative sums of the residuals. Under the assumed model, the cumulative sum processes converge weakly to zero-mean Gaussian processes whose distributions can be approximated through Monte Carlo simulation. These results enable one to assess, both graphically and numerically, how unusual the observed residual patterns are in reference to their null distributions. The residual patterns can also be used to determine the nature of model misspecification. Extensive simulation studies demonstrate that the proposed methods perform well in practical situations. Three medical studies are provided for illustrations.

13.
Semiparametric analysis of transformation models with censored data   Total citations: 1 (self-citations: 0; citations by others: 1)

14.
In order to make sense of the sheer volume of metabolomic data that can be generated using current technology, robust data analysis tools are essential. We propose the use of the growing self-organizing map (GSOM) algorithm and by doing so demonstrate that a deeper analysis of metabolomics data is possible in comparison to the widely used batch-learning self-organizing map, hierarchical cluster analysis and partitioning around medoids algorithms on simulated and real-world time-course metabolomic datasets. We then applied GSOM to a recently published dataset representing metabolome response patterns of three wheat cultivars subject to a field simulated cyclic drought stress. This novel and information rich analysis provided by the proposed GSOM framework can be easily extended to other high-throughput metabolomics studies.

15.
Functional mechanisms of biomolecules often manifest themselves precisely in transient conformational substates. Researchers have long sought to structurally characterize dynamic processes in non-coding RNA, combining experimental data with computer algorithms. However, adequate exploration of conformational space for these highly dynamic molecules, starting from static crystal structures, remains challenging. Here, we report a new conformational sampling procedure, KGSrna, which can efficiently probe the native ensemble of RNA molecules in solution. We found that KGSrna ensembles accurately represent the conformational landscapes of 3D RNA encoded by NMR proton chemical shifts. KGSrna resolves motionally averaged NMR data into structural contributions; when coupled with residual dipolar coupling data, a KGSrna ensemble revealed a previously uncharacterized transient excited state of the HIV-1 trans-activation response element stem–loop. Ensemble-based interpretations of averaged data can aid in formulating and testing dynamic, motion-based hypotheses of functional mechanisms in RNAs with broad implications for RNA engineering and therapeutic intervention.

16.
Over the past 10–15 years, nuclear magnetic resonance (NMR) spectroscopy has been employed to study metabolic events accompanying programmed cell death (apoptosis). The early studies were characterized by experiments focusing on specific metabolic parameters obtained by analyzing a limited number of biochemical compounds, e.g. selected metabolic species involved in the Krebs cycle, in energy metabolism, in phospholipid synthesis and degradation, or in mobile-lipid accumulation. However, during the past few years metabolic NMR spectroscopy has begun to refocus towards more comprehensive analyses of tissue metabolites detectable in NMR spectra. This review describes some requirements needed for the development of an integrated, metabolomic concept for NMR spectroscopy investigations of apoptotic cells, and presents recent studies approaching this goal. Metabolomic NMR spectroscopy allows one not only to distinguish between cells that are sensitive to apoptosis induction and resistant cells, but also, in conjunction with measurements of complementary biological parameters, to follow the temporal evolution of the apoptotic process and to analyze mechanisms of apoptosis resistance.

17.
18.
19.
We propose a general class of nonlinear transformation models for analyzing censored survival data, of which the nonlinear proportional hazards and proportional odds models are special cases. A cubic smoothing spline-based component-wise boosting algorithm is derived to estimate covariate effects nonparametrically using the gradient of the marginal likelihood, which is computed using importance sampling. The proposed method can be applied to survival data with high-dimensional covariates, including the case when the sample size is smaller than the number of predictors. Empirical performance of the proposed method is evaluated via simulations and analysis of a microarray survival data set.

20.
Nuclear magnetic resonance (NMR) and Mass Spectroscopy (MS) are the two most common spectroscopic analytical techniques employed in metabolomics. The large spectral datasets generated by NMR and MS are often analyzed using data reduction techniques like Principal Component Analysis (PCA). Although rapid, these methods are susceptible to solvent and matrix effects, high rates of false positives, lack of reproducibility and limited data transferability from one platform to the next. Given these limitations, a growing trend in both NMR and MS-based metabolomics is towards targeted profiling or "quantitative" metabolomics, wherein compounds are identified and quantified via spectral fitting prior to any statistical analysis. Despite the obvious advantages of this method, targeted profiling is hindered by the time required to perform manual or computer-assisted spectral fitting. In an effort to increase data analysis throughput for NMR-based metabolomics, we have developed an automatic method for identifying and quantifying metabolites in one-dimensional (1D) proton NMR spectra. This new algorithm is capable of using carefully constructed reference spectra and optimizing thousands of variables to reconstruct experimental NMR spectra of biofluids using rules and concepts derived from physical chemistry and NMR theory. The automated profiling program has been tested against spectra of synthetic mixtures as well as biological spectra of urine, serum and cerebral spinal fluid (CSF). Our results indicate that the algorithm can correctly identify compounds with high fidelity in each biofluid sample (except for urine). Furthermore, the metabolite concentrations exhibit a very high correlation with both simulated and manually-detected values.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  ICP license: 京ICP备09084417号