Similar literature: 20 matching records found.
1.
When it comes to fitting simple allometric slopes through measurement data, evolutionary biologists have been torn between regression methods. On the one hand, there is the ordinary least squares (OLS) regression, which is commonly used across many disciplines of biology to fit lines through data, but which has a reputation for underestimating slopes when measurement error is present. On the other hand, there is the reduced major axis (RMA) regression, which is often recommended as a substitute for OLS regression in studies of allometry, but which has several weaknesses of its own. Here, we review statistical theory as it applies to evolutionary biology and studies of allometry. We point out that the concerns that arise from measurement error for OLS regression are small and straightforward to deal with, whereas RMA has several key properties that make it unfit for use in the field of allometry. The recommended approach for researchers interested in allometry is to use OLS regression on measurements taken with low (but realistically achievable) measurement error. If measurement error is unavoidable and relatively large, it is preferable to correct for slope attenuation rather than to turn to RMA regression, or to take the expected amount of attenuation into account when interpreting the data.
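A minimal sketch of the attenuation correction this abstract recommends, assuming classical additive measurement error in X with a known error variance; the variable names and simulated values are illustrative rather than taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    x_true = rng.normal(10.0, 2.0, n)             # true trait values
    sigma_me = 1.0                                # assumed known measurement-error SD
    x_obs = x_true + rng.normal(0.0, sigma_me, n)
    y = 0.75 * x_true + rng.normal(0.0, 0.5, n)

    b_ols = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)    # attenuated OLS slope
    reliability = 1.0 - sigma_me**2 / np.var(x_obs, ddof=1)   # ~ var(true X) / var(observed X)
    b_corrected = b_ols / reliability                         # disattenuated slope
    print(b_ols, b_corrected)

Dividing the OLS slope by the reliability ratio undoes the expected attenuation without switching to RMA.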

2.
In allometry, researchers are commonly interested in estimating the slope of the major axis or standardized major axis (methods of bivariate line fitting related to principal components analysis). This study considers the robustness of two tests for a common slope amongst several axes. It is of particular interest to measure the robustness of these tests to slight violations of assumptions that may not be readily detected in sample datasets. Type I error is estimated in simulations of data generated with varying levels of nonnormality, heteroscedasticity and nonlinearity. The assumption failures introduced in simulations were difficult to detect in a moderately sized dataset, with an expert panel able to correctly detect assumption violations only 34-45% of the time. While the common slope tests were robust to nonnormal and heteroscedastic errors from the line, Type I error was inflated if the two variables were related in a slightly nonlinear fashion. Similar results were also observed for the linear regression case. The common slope tests were more liberal when the simulated data had greater nonlinearity, and this effect was more evident when the underlying distribution had longer tails than the normal. This result raises concerns for common slopes testing, as slight nonlinearities such as those in simulations are often undetectable in moderately sized datasets. Consequently, practitioners should take care in checking for nonlinearity and interpreting the results of a test for common slope. This work has implications for the robustness of inference in linear models in general.

3.
Most phylogenetically based statistical methods for the analysis of quantitative or continuously varying phenotypic traits assume that variation within species is absent or at least negligible, which is unrealistic for many traits. Within-species variation has several components. Differences among populations of the same species may represent either phylogenetic divergence or direct effects of environmental factors that differ among populations (phenotypic plasticity). Within-population variation also contributes to within-species variation and includes sampling variation, instrument-related error, low repeatability caused by fluctuations in behavioral or physiological state, variation related to age, sex, season, or time of day, and individual variation within such categories. Here we develop techniques for analyzing phylogenetically correlated data to include within-species variation, or "measurement error" as it is often termed in the statistical literature. We derive methods for (i) univariate analyses, including measurement of "phylogenetic signal," (ii) correlation and principal components analysis for multiple traits, (iii) multiple regression, and (iv) inference of "functional relations," such as reduced major axis (RMA) regression. The methods are capable of incorporating measurement error that differs for each data point (mean value for a species or population), but they can be modified for special cases in which less is known about measurement error (e.g., when one is willing to assume something about the ratio of measurement error in two traits). We show that failure to incorporate measurement error can lead to both biased and imprecise (unduly uncertain) parameter estimates. Even previous methods that are thought to account for measurement error, such as conventional RMA regression, can be improved by explicitly incorporating measurement error and phylogenetic correlation. We illustrate these methods with examples and simulations and provide Matlab programs.
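A minimal sketch of the general idea of folding per-species measurement error into a phylogenetic generalized least squares fit, assuming a known phylogenetic covariance matrix C, known squared standard errors of the species means, and a fixed rate parameter; this is an illustration of the approach, not the authors' Matlab code, and all values are made up.

    import numpy as np

    def pgls_with_measurement_error(y, X, C, se2, sigma2=1.0):
        # Residual covariance: phylogenetic part plus independent sampling error
        V = sigma2 * C + np.diag(se2)
        Vinv = np.linalg.inv(V)
        return np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)   # GLS estimate

    # Toy example with four species
    C = np.array([[1.0, 0.8, 0.3, 0.3],
                  [0.8, 1.0, 0.3, 0.3],
                  [0.3, 0.3, 1.0, 0.8],
                  [0.3, 0.3, 0.8, 1.0]])
    se2 = np.array([0.05, 0.10, 0.02, 0.08])                 # squared SEs of species means
    X = np.column_stack([np.ones(4), [1.0, 1.2, 2.5, 2.8]])  # intercept and one trait
    y = np.array([0.9, 1.1, 2.4, 2.9])
    print(pgls_with_measurement_error(y, X, C, se2))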

4.
Many investigators use the reduced major axis (RMA) instead of ordinary least squares (OLS) to define a line of best fit for a bivariate relationship when the variable represented on the X‐axis is measured with error. OLS frequently is described as requiring the assumption that X is measured without error while RMA incorporates an assumption that there is error in X. Although an RMA fit actually involves a very specific pattern of error variance, investigators have prioritized the presence versus the absence of error rather than the pattern of error in selecting between the two methods. Another difference between RMA and OLS is that RMA is symmetric, meaning that a single line defines the bivariate relationship, regardless of which variable is X and which is Y, while OLS is asymmetric, so that the slope and resulting interpretation of the data are changed when the variables assigned to X and Y are reversed. The concept of error is reviewed and expanded from previous discussions, and it is argued that the symmetry‐asymmetry issue should be the criterion by which investigators choose between RMA and OLS. This is a biological question about the relationship between variables. It is determined by the investigator, not dictated by the pattern of error in the data. If X is measured with error but OLS should be used because the biological question is asymmetric, there are several methods available for adjusting the OLS slope to reflect the bias due to error. RMA is being used in many analyses for which OLS would be more appropriate. Am J Phys Anthropol, 2009.
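A minimal sketch contrasting the asymmetry of OLS with the symmetry of RMA on a simulated bivariate sample; the expressions in terms of the correlation and standard deviations are standard, while the data are made up for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, 100)
    y = 0.8 * x + rng.normal(0.0, 0.6, 100)
    r = np.corrcoef(x, y)[0, 1]
    sx, sy = np.std(x, ddof=1), np.std(y, ddof=1)

    b_ols_yx = r * sy / sx          # OLS slope of Y on X
    b_ols_xy = r * sx / sy          # OLS slope of X on Y: a different line
    b_rma = np.sign(r) * sy / sx    # RMA slope: swapping axes simply inverts it
    print(b_ols_yx, 1.0 / b_ols_xy, b_rma)

Unless |r| = 1, the two OLS fits disagree (b_ols_yx differs from 1/b_ols_xy), whereas the RMA line is the same regardless of which variable is treated as X.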

5.
We consider the proportional hazards model in which the covariates include the discretized categories of a continuous time-dependent exposure variable measured with error. Naively ignoring the measurement error in the analysis may cause biased estimation and erroneous inference. Although various approaches have been proposed to deal with measurement error when the hazard depends linearly on the time-dependent variable, it has not yet been investigated how to correct when the hazard depends on the discretized categories of the time-dependent variable. To fill this gap in the literature, we propose a smoothed corrected score approach based on approximation of the discretized categories after smoothing the indicator function. The consistency and asymptotic normality of the proposed estimator are established. The observation times of the time-dependent variable are allowed to be informative. For comparison, we also extend to this setting two approximate approaches, the regression calibration and the risk-set regression calibration. The methods are assessed by simulation studies and by application to data from an HIV clinical trial.
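The abstract uses regression calibration as a comparator; below is a minimal sketch of the ordinary (non-time-dependent) regression calibration step with replicate measurements. It is a generic illustration under assumed classical error, not the smoothed corrected score method proposed in the paper, and the simulated inputs are made up.

    import numpy as np

    def regression_calibration(W):
        # W: n x k matrix of k replicate error-prone measurements per subject
        n, k = W.shape
        wbar = W.mean(axis=1)
        sigma_u2 = W.var(axis=1, ddof=1).mean()               # within-subject error variance
        sigma_x2 = max(np.var(wbar, ddof=1) - sigma_u2 / k, 0.0)
        lam = sigma_x2 / (sigma_x2 + sigma_u2 / k)            # attenuation factor
        return wbar.mean() + lam * (wbar - wbar.mean())       # best linear predictor of X

    rng = np.random.default_rng(6)
    x = rng.normal(2.0, 1.0, 500)
    W = x[:, None] + rng.normal(0.0, 0.7, (500, 2))           # two replicates per subject
    print(regression_calibration(W)[:3])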

6.
Inverse‐probability‐of‐treatment weighted (IPTW) estimation has been widely used to consistently estimate the causal parameters in marginal structural models, with time‐dependent confounding effects adjusted for. Just like other causal inference methods, the validity of IPTW estimation typically requires the crucial condition that all variables are precisely measured. However, this condition is often violated in practice due to various reasons. It has been well documented that ignoring measurement error often leads to biased inference results. In this paper, we consider the IPTW estimation of the causal parameters in marginal structural models in the presence of error‐contaminated and time‐dependent confounders. We explore several methods to correct for the effects of measurement error on the estimation of causal parameters. Numerical studies are reported to assess the finite sample performance of the proposed methods.
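A minimal sketch of stabilized inverse-probability-of-treatment weights at a single time point, assuming a binary treatment and fully observed confounders; the measurement-error corrections explored in the paper are not reproduced here, and scikit-learn is used purely as a convenient propensity model.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def stabilized_iptw(A, L):
        # A: 0/1 treatment indicator; L: n x p matrix of measured confounders
        ps = LogisticRegression().fit(L, A).predict_proba(L)[:, 1]   # P(A = 1 | L)
        p_marg = A.mean()                                            # P(A = 1)
        num = np.where(A == 1, p_marg, 1.0 - p_marg)
        den = np.where(A == 1, ps, 1.0 - ps)
        return num / den

    rng = np.random.default_rng(7)
    L = rng.normal(size=(500, 2))
    A = rng.binomial(1, 1.0 / (1.0 + np.exp(-L[:, 0])))
    print(stabilized_iptw(A, L)[:5])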

7.
Exposure measurement error can be seen as one of the most important sources of uncertainty in studies in epidemiology. When the aim is to assess the effects of measurement error on statistical inference or to compare the performance of several methods for measurement error correction, it is indispensable to be able to generate different types of measurement error. This paper compares two approaches for the generation of Berkson error, which have recently been applied in radiation epidemiology, in their ability to generate exposure data that satisfy the properties of the Berkson model. In particular, it is shown that the use of one of the methods produces results that are not in accordance with two important properties of Berkson error.
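A minimal sketch of generating exposures under the Berkson model and checking its defining property, namely that the error is drawn independently of the assigned exposure so that the true exposure scatters around the observed value with no systematic shift; the distributions chosen are illustrative only.

    import numpy as np

    rng = np.random.default_rng(2)
    x_obs = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)   # assigned (observed) exposures
    u = rng.normal(0.0, 0.3, size=10_000)                     # error drawn independently of x_obs
    x_true = x_obs + u                                        # Berkson: truth scatters around the assignment

    print(np.corrcoef(u, x_obs)[0, 1])    # ~0: error uncorrelated with the assigned value
    print(np.mean(x_true - x_obs))        # ~0: no systematic shift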

8.
9.
Regressions of biological variables across species are rarely perfect. Usually, there are residual deviations from the estimated model relationship, and such deviations commonly show a pattern of phylogenetic correlations indicating that they have biological causes. We discuss the origins and effects of phylogenetically correlated biological variation in regression studies. In particular, we discuss the interplay of biological deviations with deviations due to observational or measurement errors, which are also important in comparative studies based on estimated species means. We show how bias in estimated evolutionary regressions can arise from several sources, including phylogenetic inertia and either observational or biological error in the predictor variables. We show how all these biases can be estimated and corrected for in the presence of phylogenetic correlations. We present general formulas for incorporating measurement error in linear models with correlated data. We also show how alternative regression models, such as major axis and reduced major axis regression, which are often recommended when there is error in predictor variables, are strongly biased when there is biological variation in any part of the model. We argue that such methods should never be used to estimate evolutionary or allometric regression slopes.

10.
Bayesian methods are valuable, inter alia, whenever there is a need to extract information from data that are uncertain or subject to any kind of error or noise (including measurement error and experimental error, as well as noise or random variation intrinsic to the process of interest). Bayesian methods offer a number of advantages over more conventional statistical techniques that make them particularly appropriate for complex data. It is therefore no surprise that Bayesian methods are becoming more widely used in the fields of genetics, genomics, bioinformatics and computational systems biology, where making sense of complex noisy data is the norm. This review provides an introduction to the growing literature in this area, with particular emphasis on recent developments in Bayesian bioinformatics relevant to computational systems biology.

11.
In simple regression, two serious problems with the ordinary least squares (OLS) estimator are that its efficiency can be relatively poor when the error term is normal but heteroscedastic, and the usual confidence interval for the slope can have highly unsatisfactory probability coverage. When the error term is nonnormal, these problems become exacerbated. Two other concerns are that the OLS estimator has an unbounded influence function and a breakdown point of zero. Wilcox (1996) compared several estimators when there is heteroscedasticity and found two that have relatively good efficiency and simultaneously provide protection against outliers: an M-estimator with Schweppe weights and an estimator proposed by Cohen, Dalal and Tukey (1993). However, the M-estimator can handle only one outlier in the X-domain or among the Y values, and among the methods considered by Wilcox for computing confidence intervals for the slope, none performed well when working with the Cohen-Dalal-Tukey estimator. This note points out that the small-sample efficiency of the Theil-Sen estimator competes well with the estimators considered by Wilcox, and a method for computing a confidence interval was found that performs well in simulations. The Theil-Sen estimator has a reasonably high breakdown point, a bounded influence function, and in some cases its small-sample efficiency offers a substantial advantage over all of the estimators compared in Wilcox (1996).
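A minimal sketch of the Theil-Sen slope as the median of all pairwise slopes, checked against SciPy's implementation; the heavy-tailed simulated data are illustrative only.

    import numpy as np
    from itertools import combinations
    from scipy.stats import theilslopes

    def theil_sen_slope(x, y):
        # Median of slopes over all pairs of points with distinct x values
        slopes = [(y[j] - y[i]) / (x[j] - x[i])
                  for i, j in combinations(range(len(x)), 2)
                  if x[j] != x[i]]
        return np.median(slopes)

    rng = np.random.default_rng(3)
    x = rng.normal(0.0, 1.0, 60)
    y = 2.0 * x + rng.standard_t(df=3, size=60)   # heavy-tailed errors
    print(theil_sen_slope(x, y), theilslopes(y, x)[0])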

12.
This article presents some statistical methods for estimating the parameters of a population dynamics model for annual plants. The model takes account of reproduction, immigration, seed survival in a seed bank, and plant growth. The data consist of the number of plants in several developmental stages that were measured in a number of populations for a few consecutive years; they are incomplete since seeds could not be counted. It is assumed that there are no measurement errors or that measurement errors are binomial and not frequent. Some statistical methods are developed within the framework of estimating equations or Bayesian inference. These methods are applied to oilseed rape data.

13.
Qihuang Zhang and Grace Y. Yi. Biometrics, 2023, 79(2): 1089-1102.
Zero-inflated count data arise frequently from genomics studies. Analysis of such data is often based on a mixture model which accommodates excess zeros in combination with a Poisson distribution, and various inference methods have been proposed under such a model. Those analysis procedures, however, are challenged by the presence of measurement error in count responses. In this article, we propose a new measurement error model to describe error-contaminated count data. We show that ignoring the measurement error effects in the analysis may generally lead to invalid inference results, and meanwhile, we identify situations where ignoring measurement error can still yield consistent estimators. Furthermore, we propose a Bayesian method to address the effects of measurement error under the zero-inflated Poisson model and discuss the identifiability issues. We develop a data-augmentation algorithm that is easy to implement. Simulation studies are conducted to evaluate the performance of the proposed method. We apply our method to analyze the data arising from a prostate adenocarcinoma genomic study.
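A minimal sketch of the zero-inflated Poisson probability mass function underlying the error-free part of such a model; the measurement-error layer and the Bayesian data-augmentation algorithm proposed in the article are not reproduced.

    import numpy as np
    from scipy.stats import poisson

    def zip_pmf(k, pi, lam):
        # pi: probability of a structural zero; lam: Poisson mean for the count component
        base = poisson.pmf(k, lam)
        return np.where(k == 0, pi + (1.0 - pi) * base, (1.0 - pi) * base)

    print(zip_pmf(np.arange(5), pi=0.3, lam=2.0))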

14.
Inferring metabolic networks from metabolite concentration data is a central topic in systems biology. Mathematical techniques to extract information about the network from data have been proposed in the literature. This paper presents a critical assessment of the feasibility of reverse engineering of metabolic networks, illustrated with a selection of methods. Appropriate data are simulated to study the performance of four representative methods. An overview of sampling and measurement methods currently in use for generating time-resolved metabolomics data is given and contrasted with the needs of the discussed reverse engineering methods. The results of this assessment show that if full inference of a real-world metabolic network is the goal, there is a large discrepancy between the requirements of reverse engineering of metabolic networks and contemporary measurement practice. Recommendations for improved time-resolved experimental designs are given.

15.
Likelihood ratio tests are derived for bivariate normal structural relationships in the presence of group structure. These tests may also be applied to less restrictive models where only errors are assumed to be normally distributed. Tests for a common slope amongst those from several datasets are derived for three different cases – when the assumed ratio of error variances is the same across datasets and either known or unknown, and when the standardised major axis model is used. Estimation of the slope in the case where the ratio of error variances is unknown could be considered as a maximum likelihood grouping method. The derivations are accompanied by some small sample simulations, and the tests are applied to data arising from work on seed allometry.

16.
Reliability, the consistency of a test or measurement, is frequently quantified in the movement sciences literature. A common metric is the intraclass correlation coefficient (ICC). In addition, the SEM, which can be calculated from the ICC, is also frequently reported in reliability studies. However, there are several versions of the ICC, and confusion exists in the movement sciences regarding which ICC to use. Further, the utility of the SEM is not fully appreciated. In this review, the basics of classic reliability theory are addressed in the context of choosing and interpreting an ICC. The primary distinction between ICC equations is argued to be one concerning the inclusion (equations 2,1 and 2,k) or exclusion (equations 3,1 and 3,k) of systematic error in the denominator of the ICC equation. Inferential tests of mean differences, which are performed in the process of deriving the necessary variance components for the calculation of ICC values, are useful to determine if systematic error is present. If so, the measurement schedule should be modified (removing trials where learning and/or fatigue effects are present) to remove systematic error, and ICC equations that only consider random error may be safely used. The use of ICC values is discussed in the context of estimating the effects of measurement error on sample size, statistical power, and correlation attenuation. Finally, calculation and application of the SEM are discussed. It is shown how the SEM and its variants can be used to construct confidence intervals for individual scores and to determine the minimal difference needed to be exhibited for one to be confident that a true change in performance of an individual has occurred.
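A minimal sketch of the SEM computed from an ICC that reflects only random error, together with a minimal-difference value at 95% confidence, using the formulas commonly quoted for this purpose; the numeric inputs are illustrative only.

    import numpy as np

    def sem_and_minimal_difference(sd_between, icc, z=1.96):
        sem = sd_between * np.sqrt(1.0 - icc)   # standard error of measurement
        md = sem * z * np.sqrt(2.0)             # smallest change exceeding measurement noise
        return sem, md

    print(sem_and_minimal_difference(sd_between=5.0, icc=0.90))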

17.
Normalization of expression levels applied to microarray data can help in reducing measurement error. Different methods, including cyclic loess, quantile normalization and median or mean normalization, have been utilized to normalize microarray data. Although there is considerable literature regarding normalization techniques for mRNA microarray data, there are no publications comparing normalization techniques for microRNA (miRNA) microarray data, which are subject to similar sources of measurement error. In this paper, we compare the performance of cyclic loess, quantile normalization, median normalization and no normalization for a single-color microRNA microarray dataset. We show that the quantile normalization method works best in reducing differences in miRNA expression values for replicate tissue samples. By showing that the total mean squared error is lowest across almost all 36 investigated tissue samples, we are assured that the bias correction provided by quantile normalization is not outweighed by additional error variance that can arise from a more complex normalization method. Furthermore, we show that quantile normalization does not achieve these results by compression of scale.
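A minimal sketch of quantile normalization for a genes-by-samples matrix, mapping each sample's sorted values onto the row-wise mean of the column-sorted matrix; ties are handled crudely here, and the toy matrix is illustrative.

    import numpy as np

    def quantile_normalize(X):
        # X: genes (rows) x samples (columns)
        order = np.argsort(X, axis=0)
        ranks = np.argsort(order, axis=0)                 # rank of each value within its column
        reference = np.sort(X, axis=0).mean(axis=1)       # mean of the column-sorted values
        return reference[ranks]                           # map each rank back to the reference

    X = np.array([[5.0, 4.0, 3.0],
                  [2.0, 1.0, 4.0],
                  [3.0, 4.0, 6.0],
                  [4.0, 2.0, 8.0]])
    print(quantile_normalize(X))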

18.
Statistical inference for microarray experiments usually involves the estimation of error variance for each gene. Because the sample size available for each gene is often low, the usual unbiased estimator of the error variance can be unreliable. Shrinkage methods, including empirical Bayes approaches that borrow information across genes to produce more stable estimates, have been developed in recent years. Because the same microarray platform is often used for at least several experiments to study similar biological systems, there is an opportunity to improve variance estimation further by borrowing information not only across genes but also across experiments. We propose a lognormal model for error variances that involves random gene effects and random experiment effects. Based on the model, we develop an empirical Bayes estimator of the error variance for each combination of gene and experiment and call this estimator BAGE because information is Borrowed Across Genes and Experiments. A permutation strategy is used to make inference about the differential expression status of each gene. Simulation studies with data generated from different probability models and real microarray data show that our method outperforms existing approaches.
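A minimal sketch of a generic empirical-Bayes style shrinkage of per-gene log variances toward their grand mean, using a normal approximation to the sampling distribution of log(s^2); this is a simplified illustration of borrowing information across genes, not the BAGE estimator described in the abstract, and the simulated variances are made up.

    import numpy as np
    from scipy.special import polygamma

    def shrink_log_variances(s2, df):
        z = np.log(s2)
        v = polygamma(1, df / 2.0)                    # approx. sampling variance of log(s^2)
        tau2 = max(np.var(z, ddof=1) - v, 0.0)        # estimated between-gene variance
        w = tau2 / (tau2 + v)                         # shrinkage weight (0 = pool completely)
        return np.exp(w * z + (1.0 - w) * z.mean())   # bias of log(s^2) ignored for simplicity

    rng = np.random.default_rng(5)
    s2 = rng.chisquare(4, size=1000) / 4.0            # sample variances with 4 df, true variance 1
    print(shrink_log_variances(s2, df=4)[:5])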

19.
Recently, in order to accelerate drug development, trials that use adaptive seamless designs such as phase II/III clinical trials have been proposed. Phase II/III clinical trials combine traditional phases II and III into a single trial that is conducted in two stages. Using stage 1 data, an interim analysis is performed to answer phase II objectives and after collection of stage 2 data, a final confirmatory analysis is performed to answer phase III objectives. In this paper we consider phase II/III clinical trials in which, at stage 1, several experimental treatments are compared to a control and the apparently most effective experimental treatment is selected to continue to stage 2. Although these trials are attractive because the confirmatory analysis includes phase II data from stage 1, the inference methods used for trials that compare a single experimental treatment to a control and do not have an interim analysis are no longer appropriate. Several methods for analysing phase II/III clinical trials have been developed. These methods are recent and so there is little literature on extensive comparisons of their characteristics. In this paper we review and compare the various methods available for constructing confidence intervals after phase II/III clinical trials.

20.
1. Despite a substantial body of work, there remains much disagreement about the form of the relationship between organism abundance and body size. In an attempt at resolving these disagreements the shape and slope of samples from simulated and real abundance–mass distributions were assessed by ordinary least squares regression (OLS) and the reduced major axis method (RMA).
2. It is suggested that the data gathered by ecologists to assess these relationships are usually truncated with respect to density. Under these conditions RMA gives slope estimates which are consistently closer to the true slopes than OLS regression (see the sketch after this list).
3. The triangular relationships reported by some workers are found over smaller mass and abundance ranges than linear relations. Scatter in slope estimates is much greater, and positive slopes more common, at small sample sizes and sample ranges. These results support the notion that inadequate and truncated sampling is responsible for much of the disagreement reported in the literature.
4. The results strongly support the notion that density declines with increasing body mass in a broad, linear band with a slope around −1. However, there is some evidence to suggest that this overall relation results from a series of component relations with slopes which differ from the overall slope.
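A minimal sketch of the kind of simulation described in points 1 and 2: a log-log abundance-mass relation with a true slope of -1 is truncated at low densities, and OLS and RMA slopes are then computed from the truncated sample. The parameters and truncation level are illustrative, not taken from the study.

    import numpy as np

    rng = np.random.default_rng(4)
    log_mass = rng.uniform(0.0, 6.0, 2000)
    log_density = -1.0 * log_mass + rng.normal(0.0, 1.0, 2000)   # true slope of -1

    keep = log_density > np.quantile(log_density, 0.4)           # truncate low densities
    x, y = log_mass[keep], log_density[keep]
    r = np.corrcoef(x, y)[0, 1]
    b_ols = r * np.std(y, ddof=1) / np.std(x, ddof=1)            # OLS slope on truncated data
    b_rma = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)   # RMA slope on truncated data
    print(b_ols, b_rma)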
