Similar Documents
20 similar documents found (search time: 31 ms)
1.
Joint moments are commonly used to characterize gait. Factors like height and weight influence these moments. This study determined which of two commonly used normalization methods, body mass or body weight times height, most reduced the effects of height and weight on peak hip, knee, and ankle external moments during walking. The effectiveness of each normalization method in reducing gender differences was then tested. Gait data from 158 normal subjects were analyzed using unnormalized values, body mass normalized values, and body weight times height normalized values. Without normalization, height or weight accounted for 7-82% of the variance in all 10 peak components of the moments. With normalization, height and weight accounted for at most 6% of the variance, with the exception of the hip adduction moment normalized by body weight times height and the ankle dorsiflexion moment normalized by body mass. For the hip adduction moment normalized by body weight times height, height still accounted for 13% of the variance (p<0.001), and for the ankle dorsiflexion moment normalized by body mass, 22% of the variance (p<0.001). After normalization, significant differences between males and females remained for only two out of 10 moments with the body weight times height method, compared to six out of 10 moments with the body mass method. When compared to the unnormalized data, both normalization methods were highly effective in reducing height and weight differences. Even for the two cases where one normalization method was less effective than the other (hip adduction-body weight times height; ankle dorsiflexion-body mass), the normalization process reduced the variance ascribed to height or weight by 48% and 63%, respectively, as compared to the unnormalized data.
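As a concrete illustration of the two conventions compared here, a minimal sketch in Python; the function and argument names and the g = 9.81 constant are ours, not the paper's:

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2

def normalize_moment(moment_nm, mass_kg, height_m, method="bw_height"):
    """Normalize a peak external joint moment (N*m).

    method="mass"      -> N*m per kg of body mass
    method="bw_height" -> dimensionless: moment / (body weight * height)
    """
    moment_nm = np.asarray(moment_nm, dtype=float)
    if method == "mass":
        return moment_nm / mass_kg
    return moment_nm / (mass_kg * G * height_m)
```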

2.
MOTIVATION: Authors of several recent papers have independently introduced a family of transformations (the generalized-log family), which stabilizes the variance of microarray data up to the first order. However, for data from two-color arrays, tests for differential expression may require that the variance of the difference of transformed observations be constant, rather than that of the transformed observations themselves. RESULTS: We introduce a transformation within the generalized-log family which stabilizes, to the first order, the variance of the difference of transformed observations. We also introduce transformations from the 'started-log' and log-linear-hybrid families which provide good approximate variance stabilization of differences. Examples using control-control data show that any of these transformations may provide sufficient variance stabilization for practical applications, and all perform well compared to log ratios.
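The generalized-log (glog) family is usually written as glog(x) = log((x + sqrt(x^2 + c^2)) / 2). A minimal sketch under that common parameterization; which member stabilizes differences, and how c is chosen, is what the paper derives:

```python
import numpy as np

def glog(x, c):
    """Generalized log: ~log(x) for x >> c and roughly linear near zero,
    so the variance of transformed intensities is stabilized to first order."""
    x = np.asarray(x, dtype=float)
    return np.log((x + np.sqrt(x**2 + c**2)) / 2.0)

# For two-color arrays, the tested quantity is a difference of transformed
# channels, e.g. glog(red, c) - glog(green, c), whose variance the paper's
# transformation is designed to stabilize.
```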

3.
MOTIVATION AND RESULTS: Durbin et al. (2002), Huber et al. (2002) and Munson (2001) independently introduced a family of transformations (the generalized-log family) which stabilizes the variance of microarray data up to the first order. We introduce a method for estimating the transformation parameter in tandem with a linear model based on the procedure outlined in Box and Cox (1964). We also discuss means of finding transformations within the generalized-log family which are optimal under other criteria, such as minimum residual skewness and minimum mean-variance dependency. AVAILABILITY: R and Matlab code and test data are available from the authors on request.
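The paper's estimator follows the Box-Cox profile-likelihood route; as a simpler illustration of the minimum mean-variance-dependency criterion it also discusses, a hypothetical grid search might look like this:

```python
import numpy as np

def pick_glog_c(X, candidates):
    """Choose the glog parameter c that minimizes the dependence between
    per-feature means and standard deviations of transformed replicates.

    X: 2-D array, features x replicates, raw intensities."""
    X = np.asarray(X, dtype=float)
    best_c, best_score = None, np.inf
    for c in candidates:
        T = np.log((X + np.sqrt(X**2 + c**2)) / 2.0)
        m = T.mean(axis=1)
        s = T.std(axis=1, ddof=1)
        score = abs(np.corrcoef(m, s)[0, 1])  # 0 = no mean-variance trend
        if score < best_score:
            best_c, best_score = c, score
    return best_c
```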

4.
Normalization of expression levels applied to microarray data can help in reducing measurement error. Different methods, including cyclic loess, quantile normalization and median or mean normalization, have been utilized to normalize microarray data. Although there is considerable literature regarding normalization techniques for mRNA microarray data, there are no publications comparing normalization techniques for microRNA (miRNA) microarray data, which are subject to similar sources of measurement error. In this paper, we compare the performance of cyclic loess, quantile normalization, median normalization and no normalization for a single-color microRNA microarray dataset. We show that the quantile normalization method works best in reducing differences in miRNA expression values for replicate tissue samples. By showing that the total mean squared error is lowest across almost all 36 investigated tissue samples, we are assured that the bias correction provided by quantile normalization is not outweighed by additional error variance that can arise from a more complex normalization method. Furthermore, we show that quantile normalization does not achieve these results by compression of scale.
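Quantile normalization itself is short to state: sort each sample, average the sorted values across samples, and assign each observation the average for its rank. A minimal sketch, ignoring ties:

```python
import numpy as np

def quantile_normalize(X):
    """Quantile-normalize columns of X (rows = miRNAs, columns = samples).
    Afterwards every sample shares the same empirical distribution."""
    X = np.asarray(X, dtype=float)
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)  # rank within column
    mean_quantiles = np.sort(X, axis=0).mean(axis=1)   # average per rank
    return mean_quantiles[ranks]
```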

5.
MOTIVATION: A variance stabilizing transformation for microarray data was recently introduced independently by several research groups. This transformation has sometimes been called the generalized logarithm or glog transformation. In this paper, we derive several alternative approximate variance stabilizing transformations that may be easier to use in some applications. RESULTS: We demonstrate that the started-log and the log-linear-hybrid transformation families can produce approximate variance stabilizing transformations for microarray data that are nearly as good as the generalized logarithm (glog) transformation. These transformations may be more convenient in some applications.
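Under one common parameterization (the offset c and cutoff k are data-dependent choices; the paper derives how to set them), the two simpler families might be sketched as:

```python
import numpy as np

def started_log(x, c):
    """Started log: log(x + c); the offset keeps low intensities finite."""
    return np.log(np.asarray(x, dtype=float) + c)

def log_linear_hybrid(x, k):
    """Linear below the cutoff k, logarithmic above, joined so the value
    and the first derivative are both continuous at x = k."""
    x = np.asarray(x, dtype=float)
    # np.maximum keeps the log branch well-defined where it is not used
    return np.where(x < k, x / k + np.log(k) - 1.0, np.log(np.maximum(x, k)))
```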

6.
SUMMARY: We present a web server for Diagnosis and Normalization of MicroArray Data (DNMAD). DNMAD includes several common data transformations such as spatial and global robust local regression or multiple slide normalization, and allows for detecting several kinds of errors that result from the manipulation and the image analysis of the arrays. This tool offers a user-friendly interface, and is completely integrated within the Gene Expression Pattern Analysis Suite (GEPAS). AVAILABILITY: The tool is accessible on-line at http://dnmad.bioinfo.cnio.es.

7.
Objectives: To quantify the variance introduced to trapezius electromyography (EMG) through normalization by sub-maximal reference voluntary exertions (RVE), and to investigate the effect of increased normalization efforts, as compared to other changes in data collection strategy, on the precision of occupational EMG estimates. Methods: Women performed four RVE contractions followed by 30 min of light, cyclic assembly work on each of two days. Work-cycle EMG was normalized to each of the RVE trials and seven exposure parameters were calculated. The proportions of exposure variance attributable to subject, day within subject, and cycle and normalization trial within day were determined. Using these data, the effect on the precision of the exposure mean of altering the number of subjects, days, cycles and RVEs during data collection was simulated. Results: For all exposure parameters a unique component of variance due to normalization was present, yet small: less than 4.4% of the total variance. The resource allocation simulations indicated that only marginal improvements in the precision of a group exposure mean would occur above three RVE repeats for EMG collected on one day, or beyond two RVEs for EMG collected on two or more days.
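The normalization step being studied is simply division by the reference contraction; a minimal sketch (names are ours), where averaging additional RVE repeats is the "increased normalization effort" whose payoff the simulations quantify:

```python
import numpy as np

def normalize_emg(work_cycle_emg, rve_trials):
    """Express work-cycle EMG amplitude in %RVE.  Averaging over more RVE
    repeats shrinks the (already small) normalization variance component."""
    return 100.0 * np.asarray(work_cycle_emg, dtype=float) / np.mean(rve_trials)
```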

8.
Quantification of LC-MS peak intensities assigned during peptide identification in a typical comparative proteomics experiment will deviate from run to run of the instrument due to both technical and biological variation. Thus, normalization of peak intensities across an LC-MS proteomics dataset is a fundamental pre-processing step. However, the downstream analysis of LC-MS proteomics data can be dramatically affected by the normalization method selected. Current normalization procedures for LC-MS proteomics data derive normalization values from subsets of the full collection of identified peptides, and the distribution of these normalization values is unknown a priori. If they are not independent of the biological factors associated with the experiment, the normalization process can introduce bias into the data, possibly affecting downstream statistical biomarker discovery. We present a novel approach to evaluating normalization strategies which includes the peptide selection component associated with the derivation of normalization values. Our approach evaluates the effect of normalization on the between-group variance structure in order to identify the most appropriate normalization methods that improve the structure of the data without introducing bias into the normalized peak intensities.
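As a hedged sketch of the kind of subset-based procedure being evaluated (not the paper's own algorithm): a global median normalization driven by a chosen peptide subset, which is exactly where group-correlated bias can enter if the subset is not chosen carefully:

```python
import numpy as np

def median_normalize(X, subset_idx=None):
    """Align LC-MS runs on the median log2 intensity of a peptide subset.

    X: 2-D array, peptides x runs, log2 peak intensities (NaN = missing).
    subset_idx: row indices of the peptides used to derive the offsets."""
    X = np.asarray(X, dtype=float)
    sub = X if subset_idx is None else X[subset_idx]
    offsets = np.nanmedian(sub, axis=0)    # one offset per run
    return X - (offsets - offsets.mean())  # remove run-to-run shifts
```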

9.
Simple total tag count normalization is inadequate for microRNA sequencing data generated by next-generation sequencing technology. However, a systematic evaluation of normalization methods on microRNA sequencing data has so far been lacking. We comprehensively evaluate seven commonly used normalization methods, including global normalization, Lowess normalization, the Trimmed Mean Method (TMM), quantile normalization, scaling normalization, variance stabilization, and the invariant method. We assess these methods on two individual experimental datasets with the empirical statistical metrics of mean square error (MSE) and the Kolmogorov-Smirnov (K-S) statistic. Additionally, we evaluate the methods against results from quantitative PCR validation. Our results consistently show that Lowess normalization and quantile normalization perform best, whereas TMM, a method developed for RNA-Seq normalization, performs worst. The poor performance of TMM normalization is further evidenced by abnormal results from tests of differential expression (DE) on microRNA-Seq data. Compared with the choice of model used for DE, the choice of normalization method is the primary factor affecting the DE results. In summary, Lowess normalization and quantile normalization are recommended for normalizing microRNA-Seq data, whereas the TMM method should be used with caution.
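The two empirical metrics are straightforward to compute on replicate profiles; a minimal sketch using scipy:

```python
import numpy as np
from scipy.stats import ks_2samp

def replicate_mse(a, b):
    """Mean squared error between two normalized replicate profiles;
    smaller is better if the replicates should agree."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

def replicate_ks(a, b):
    """Kolmogorov-Smirnov statistic between the two replicate intensity
    distributions; smaller means better distributional agreement."""
    return ks_2samp(a, b).statistic
```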

10.
The variance in intensities of MRI scans is a fundamental impediment to quantitative MRI analysis. Intensity values are not only highly dependent on acquisition parameters, but also on the subject and body region being scanned. This warrants the need for image normalization techniques to ensure that intensity values are consistent within tissues across different subjects and visits. Many intensity normalization methods have been developed and proven successful for the analysis of brain pathologies, but evaluation of these methods for images of the prostate region is lagging. In this paper, we compare four different normalization methods on 49 T2-w scans of prostate cancer patients: 1) the well-established histogram normalization, 2) the generalized scale normalization, 3) an extension of generalized scale normalization called generalized ball-scale normalization, and 4) a custom normalization based on healthy prostate tissue intensities. The methods are compared qualitatively and quantitatively in terms of the behavior of intensity distributions as well as the impact on radiomic features. Our findings suggest that normalization based on prior knowledge of healthy prostate tissue intensities may be the most effective way of acquiring the desired properties of normalized images. In addition, the histogram normalization method outperforms the generalized scale and generalized ball-scale methods, which have proven superior for other body regions.
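Histogram (landmark) normalization maps a scan's intensity percentiles piecewise-linearly onto a learned standard scale; a minimal sketch, with the percentile choice and the reference landmarks treated as assumptions:

```python
import numpy as np

PCTS = (1, 10, 25, 50, 75, 90, 99)  # landmark percentiles (an assumption)

def landmark_normalize(img, ref_landmarks, pcts=PCTS):
    """Map the image's own intensity percentiles onto reference landmarks
    by piecewise-linear interpolation; output shape matches the input."""
    img = np.asarray(img, dtype=float)
    src = np.percentile(img, pcts)  # must be increasing for np.interp
    return np.interp(img, src, ref_landmarks)
```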

11.
Computer simulations are used to examine the significance levels and powers of several tests which have been employed to compare the means of Poisson distributions. In particular, attention is focused on the behaviour of the tests when the means are small, as is often the case in ecological studies when populations of organisms are sampled using quadrats. Two approaches to testing are considered. The first assumes a log linear model for the Poisson data and leads to tests based on the deviance. The second employs standard analysis of variance tests following data transformations, including the often used logarithmic and square root transformations. For very small means it is found that a deviance-based test has the most favourable characteristics, generally outperforming analysis of variance tests on transformed data; none of the latter appears consistently better than any other. For larger means the standard analysis of variance on untransformed data performs well.
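The deviance-based comparison reduces to a likelihood-ratio test of equal Poisson means; a minimal sketch (for k groups the statistic is referred to a chi-square distribution with k - 1 degrees of freedom):

```python
import numpy as np
from scipy.stats import chi2

def poisson_lr_test(samples):
    """Likelihood-ratio (deviance) test that all groups share one Poisson
    mean.  samples: list of 1-D count arrays, one array per group."""
    totals = np.array([np.sum(s) for s in samples], dtype=float)
    sizes = np.array([len(s) for s in samples], dtype=float)
    mu_g = totals / sizes              # per-group ML estimates
    mu_0 = totals.sum() / sizes.sum()  # pooled estimate under H0
    # groups with zero total contribute 0 (the 0*log(0) = 0 convention)
    safe_mu = np.where(mu_g > 0, mu_g, 1.0)
    terms = np.where(totals > 0, totals * np.log(safe_mu / mu_0), 0.0)
    stat = 2.0 * terms.sum()
    return stat, chi2.sf(stat, df=len(samples) - 1)
```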

12.
13.
Lal Ahamed M  Singh SS  Sharma JB  Ram RB 《Hereditas》2004,141(3):323-327
Six varieties, Kundan (K), Galvez-87 (G), Trap (T), Chris (C), Mango (M) and PBW-348 (P), along with the fast-rusting check Agra Local (AL), were screened for seedling reaction and adult plant response to leaf rust. Seedlings of all six varieties were susceptible, while adult plants showed a lower susceptibility response than Agra Local. The F1s among the varieties, and also with Agra Local, showed values lower than the respective mid-parental values for AUDPC, suggesting a polygenic mode of inheritance. ANOVA for combining ability indicated variation due to both GCA and SCA effects, showing that both additive and non-additive types of genetic variance govern AUDPC. The higher values for the GCA variance over the SCA variance indicated the predominance of an additive component over the dominance component for AUDPC. Significant GCA effects indicated that Kundan, Galvez-87 and Trap can be used as good general combiners for AUDPC. The crosses KxAL, GxAL and TxAL showed significant SCA effects for AUDPC, indicating the predominance of non-additive gene effects in these crosses. The additive x additive and dominance x dominance components of the 5-parameter model were highly significant and contributed the most, compared with the additive and dominance components, in the cross KxG, while the dominance and dominance x dominance components contributed the most in the remaining crosses. Under such a situation, improvement in the character may be expected through a standard selection procedure, which may first exploit the additive gene effects; simultaneously, care should be taken that the dominance effects are not dissipated but rather concentrated.

14.

Introduction

Different normalization methods are available for urinary data. However, it is unclear which method performs best in minimizing error variance on a given dataset, as no generally applicable empirical criteria have been established so far.

Objectives

The main aim of this study was to develop an applicable and formally correct algorithm to decide on the normalization method without using phenotypic information.

Methods

We proved mathematically, for two classical measurement error models, that the optimal normalization method generates the highest correlation between the normalized urinary metabolite concentrations and the corresponding blood concentrations or, respectively, the raw urinary concentrations. We then applied the two criteria to the urinary 1H-NMR-measured metabolomic data from the Study of Health in Pomerania (SHIP-0; n = 4068) under different normalization approaches and compared the results with in silico experiments to explore the effects of inflated error variance in the dilution estimation.

Results

In SHIP-0, we demonstrated consistently that probabilistic quotient normalization based on aligned spectra outperforms all other tested normalization methods. Creatinine normalization performed worst, while for unaligned data integral normalization seemed to be the most reasonable choice. The simulated and the actual data were in line with the theoretical modeling, underlining the general validity of the proposed criteria.

Conclusions

The problem of choosing the best normalization procedure for a given dataset can be solved empirically. Thus, we recommend applying different normalization procedures to the data and comparing their performances via the statistical methodology explicated in this work. On the basis of classical measurement error models, the proposed algorithm will find the optimal normalization method.
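Probabilistic quotient normalization (PQN), the winner here, is compact to state; a minimal sketch of the standard recipe (integral normalization first, then division by the median quotient against a median reference spectrum):

```python
import numpy as np

def pqn(X):
    """Probabilistic quotient normalization.
    X: 2-D array, samples x spectral bins (aligned), non-negative."""
    X = np.asarray(X, dtype=float)
    X = X / X.sum(axis=1, keepdims=True)    # integral normalization
    ref = np.median(X, axis=0)              # median reference spectrum
    q = X / np.where(ref > 0, ref, np.nan)  # skip empty bins
    dilution = np.nanmedian(q, axis=1)      # per-sample dilution factor
    return X / dilution[:, None]
```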

15.
Extracting biomedical information from large metabolomic datasets by multivariate data analysis is of considerable complexity. Common challenges include, among others, screening for differentially produced metabolites, estimation of fold changes, and sample classification. Prior to these analysis steps, it is important to minimize contributions from unwanted biases and experimental variance; this is the goal of data preprocessing. In this work, different data normalization methods were compared systematically using two different datasets generated by means of nuclear magnetic resonance (NMR) spectroscopy. To this end, two different types of normalization methods were used: one aiming to remove unwanted sample-to-sample variation, while the other adjusts the variance of the different metabolites by variable scaling and variance stabilization methods. The impact of all methods tested on sample classification was evaluated on urinary NMR fingerprints obtained from healthy volunteers and patients suffering from autosomal dominant polycystic kidney disease (ADPKD). Performance in terms of screening for differentially produced metabolites was investigated on a dataset following a Latin-square design, where varied amounts of 8 different metabolites were spiked into a human urine matrix while keeping the total spike-in amount constant. In addition, specific tests were conducted to systematically investigate the influence of the different preprocessing methods on the structure of the analyzed data. In conclusion, preprocessing methods originally developed for DNA microarray analysis, in particular Quantile and Cubic-Spline Normalization, performed best in reducing bias, accurately detecting fold changes, and classifying samples.
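The second class of methods, variable scaling, acts per metabolite rather than per sample; hedged sketches of two standard members (autoscaling and Pareto scaling) for orientation:

```python
import numpy as np

def autoscale(X):
    """Unit-variance (auto) scaling: each metabolite -> mean 0, SD 1.
    X: 2-D array, samples x metabolites."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pareto_scale(X):
    """Pareto scaling: divide centered values by sqrt(SD), damping
    high-variance metabolites less aggressively than autoscaling."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / np.sqrt(X.std(axis=0, ddof=1))
```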

16.
Differences or similarities in the variance of fitness traits are crucial in several biological disciplines, e.g. ecological, toxicological, developmental and evolutionary studies. For example, the variance of traits can be utilized as a biomarker of differences in environmental conditions; in the absence of environmental variability, differences in the variance of a trait can be interpreted as differences in genetic background. Several tests and transformations are utilized when testing differences between variances. There is, however, a biological tendency for the variance to scale proportionally to the square of the mean (the scaling effect), which can considerably bias the results of such tests. We propose a novel method which allows for a more precise correction of the scaling effect and proper comparisons among treatment groups and between investigations. This is relevant for all datasets of distributions with different means and suggests the reanalysis of comparisons among treatment groups. This correction will provide a more reliable method when using bioindicators.
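The abstract does not spell the method out; as a stand-in, a common way to estimate and remove such mean-variance scaling is a power-law fit (Taylor's power law, log(var) = log(a) + b*log(mean)), sketched here under the assumption of a power-law relationship:

```python
import numpy as np

def fit_scaling_exponent(means, variances):
    """Fit log(var) = log(a) + b*log(mean) across groups; b near 2 means
    the variance scales with the square of the mean."""
    b, log_a = np.polyfit(np.log(means), np.log(variances), 1)
    return b, np.exp(log_a)

def descaled_variances(means, variances, b):
    """Divide out the fitted scaling so variances become comparable
    across groups whose means differ."""
    means = np.asarray(means, dtype=float)
    return np.asarray(variances, dtype=float) / means ** b
```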

17.
Yin BC  Li H  Ye BC 《Analytical biochemistry》2008,383(2):270-278
DNA microarray technology has become powerful and popular in mutation/single nucleotide polymorphism (SNP) discovery and genotyping. However, this method is often associated with considerable signal noise of nonbiological origin that may compromise data quality and interpretation. To achieve a high degree of reliability, accuracy, and sensitivity in data analysis, an effective normalization method that minimizes technical variability is highly desirable. In the current study, a simple and robust normalization method is described. The method is based on the introduction of a reference probe co-immobilized with SNP probes on the microarray for a dual-probe hybridization (DPH) reaction. The reference probe is used as an intra-spot control for the customized microarrays. Using this method, the interassay coefficient of variation (CV) was reduced significantly, by approximately 10%. After DPH normalization, the CVs and ranges of the ratios were reduced two- to five-fold. The relative magnitudes of variation from different sources were also analyzed by analysis of variance. Glass slides were shown to contribute the most to the variance, whereas sampling and residual errors made relatively modest contributions. The results show that this DPH-based, spot-dependent normalization method is an effective solution for reducing the experimental variation associated with microarray genotyping data.
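The core of the DPH idea is a spot-wise ratio to the co-immobilized reference probe; a minimal sketch, with the CV calculation used to judge the improvement:

```python
import numpy as np

def dph_normalize(snp_signal, ref_signal):
    """Spot-wise DPH normalization: SNP probe signal divided by the
    reference probe signal measured in the same spot."""
    snp = np.asarray(snp_signal, dtype=float)
    ref = np.asarray(ref_signal, dtype=float)
    return snp / ref

def cv_percent(x):
    """Coefficient of variation in percent, the figure of merit used to
    compare assays before and after normalization."""
    x = np.asarray(x, dtype=float)
    return 100.0 * x.std(ddof=1) / x.mean()
```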

18.



19.
Identifying changes in the relative abundance of proteins between different biological samples is often confounded by technical noise. In this work, we compared eight normalization methods commonly used in two-dimensional gel electrophoresis and difference gel electrophoresis (DIGE) experiments for their ability to reduce noise and for their influence on the list of proteins whose difference in abundance between two samples is determined to be statistically significant. With respect to reducing noise, we find that, while all methods improve upon unnormalized data, cyclic linear normalization is the least well suited to gel-based proteomics and the performances of the other methods are similar. We also find in DIGE data that the choice of normalization method has less of an impact on the noise than does the decision to use an internal reference in the experimental design, and that both normalization and standardization using the internal reference are required to maximally reduce variance. Despite the similar noise reduction achieved by most normalization methods, the list of proteins whose abundance was determined to differ significantly between biological groups differed depending on the choice of normalization method. This work provides a direct comparison of the impact of normalization methods in the context of common experimental designs.
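The standardization referred to here uses the pooled internal standard run on every gel; a hedged sketch of the usual two steps (log-ratio against the internal standard, then per-gel normalization):

```python
import numpy as np

def standardize_to_internal_ref(sample_vol, internal_std_vol):
    """DIGE-style standardization: log2 ratio of each spot's sample volume
    to the pooled internal-standard volume on the same gel."""
    sample = np.asarray(sample_vol, dtype=float)
    internal = np.asarray(internal_std_vol, dtype=float)
    return np.log2(sample / internal)

def median_center(log_ratios):
    """Per-gel median normalization of the standardized log ratios."""
    lr = np.asarray(log_ratios, dtype=float)
    return lr - np.median(lr)
```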

20.
Both ecological field studies and attempts to extrapolate from laboratory experiments to natural populations generally encounter the high degree of natural variability and chaotic behavior that typify natural ecosystems. Regardless of this variability and non-normal distribution, most statistical models of natural systems use normal error, which assumes independence between the variance and the mean. However, environmental data are often random or clustered and are better described by probability distributions with more realistic variance-to-mean relationships. Until recently, statistical software packages modeled only with normal error, and researchers had to assume approximate normality on the original or transformed scale of measurement and live with the consequences of often incorrectly assuming independence between the variance and mean. Recent developments in statistical software allow researchers to use generalized linear models (GLMs), and analysis can now proceed with probability distributions from the exponential family which more realistically describe natural conditions: binomial (even distribution with variance less than the mean), Poisson (random distribution with variance equal to the mean), and negative binomial (clustered distribution with variance greater than the mean). GLMs fit parameters on the original scale of measurement and eliminate the need for obfuscating transformations, reduce bias for proportions with unequal sample sizes, and provide realistic estimates of variance which can increase the power of tests. Because GLMs permit modeling according to the non-normal behavior of natural systems and obviate the need for normality assumptions, they will likely become a widely used tool for analyzing toxicity data. To demonstrate the broad-scale utility of GLMs, we present several examples where the use of GLMs improved the statistical power of field and laboratory studies to document the rapid ecological recovery of Prince William Sound following the Exxon Valdez oil spill.
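A minimal statsmodels sketch of the approach being advocated: fit quadrat counts on the original count scale with a Poisson GLM, switching to a negative binomial family when the data are clustered (the toy data and the alpha value are assumptions for illustration):

```python
import numpy as np
import statsmodels.api as sm

# Toy quadrat counts from two site groups, kept on the count scale rather
# than transformed toward normality.
rng = np.random.default_rng(0)
counts = np.concatenate([rng.poisson(2.0, 30), rng.poisson(5.0, 30)])
group = np.repeat([0, 1], 30)

X = sm.add_constant(group)
poisson_fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
print(poisson_fit.summary())

# If the residual deviance indicates overdispersion (variance > mean),
# refit with a negative binomial family instead:
nb_fit = sm.GLM(counts, X, family=sm.families.NegativeBinomial(alpha=1.0)).fit()
```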
