Similar Literature
20 similar references found.
1.
The ANOVA-based F-test used for testing the significance of the random effect variance component is a valid test for an unbalanced one-way random model. However, it does not have a uniform optimality property. For example, this test is not uniformly most powerful invariant (UMPI); in fact, there is no UMPI test in the unbalanced case (see Khuri, Mathew, and Sinha, 1998). The power of the F-test depends not only on the design used, but also on the true values of the variance components. As Khuri (1996) noted, we can gain better insight into the effect of data imbalance on the power of the F-test by modelling the power in terms of the design parameters and the variance components. In this study, generalized linear modelling (GLM) techniques are used for this purpose. It is shown that GLM, in combination with a method of generating designs with a specified degree of imbalance, is an effective way of studying the behavior of the power of the F-test in a one-way random model.
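For illustration, here is a minimal Monte Carlo sketch (plain simulation, not the paper's GLM modelling of power) of how the power of the ANOVA F-test for H0: sigma_a^2 = 0 can be compared between a balanced and an unbalanced one-way random design; group sizes and variance values are illustrative assumptions.

```python
# Simulated power of the ANOVA F-test in a one-way random model
# y_ij = mu + a_i + e_ij, for user-chosen (possibly unbalanced) group sizes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def f_test_power(n_per_group, sigma_a2, sigma_e2=1.0, alpha=0.05, reps=5000):
    k = len(n_per_group)
    N = sum(n_per_group)
    fcrit = stats.f.ppf(1 - alpha, k - 1, N - k)
    rejections = 0
    for _ in range(reps):
        # each group mean a_i ~ N(0, sigma_a2); observations ~ N(a_i, sigma_e2)
        groups = [rng.normal(rng.normal(0, np.sqrt(sigma_a2)), np.sqrt(sigma_e2), n)
                  for n in n_per_group]
        y = np.concatenate(groups)
        means = np.array([g.mean() for g in groups])
        ssb = sum(n * (m - y.mean()) ** 2 for n, m in zip(n_per_group, means))
        ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)
        F = (ssb / (k - 1)) / (ssw / (N - k))
        rejections += F > fcrit
    return rejections / reps

# Balanced vs. unbalanced design with the same total sample size:
print(f_test_power([6, 6, 6, 6], sigma_a2=0.5))
print(f_test_power([2, 4, 8, 10], sigma_a2=0.5))
```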

2.
Repeatability (more precisely the common measure of repeatability, the intra-class correlation coefficient, ICC) is an important index for quantifying the accuracy of measurements and the constancy of phenotypes. It is the proportion of phenotypic variation that can be attributed to between-subject (or between-group) variation. As a consequence, the non-repeatable fraction of phenotypic variation is the sum of measurement error and phenotypic flexibility. There are several ways to estimate repeatability for Gaussian data, but there is no formal agreement on how repeatability should be calculated for non-Gaussian data (e.g. binary, proportion and count data). In addition to point estimates, appropriate uncertainty estimates (standard errors and confidence intervals) and statistical significance for repeatability estimates are required regardless of the type of data. We review the methods for calculating repeatability and the associated statistics for Gaussian and non-Gaussian data. For Gaussian data, we present three common approaches for estimating repeatability: correlation-based, analysis of variance (ANOVA)-based and linear mixed-effects model (LMM)-based methods, while for non-Gaussian data, we focus on generalised linear mixed-effects models (GLMM) that allow the estimation of repeatability on the original and on the underlying latent scale. We also address a number of methods for calculating standard errors, confidence intervals and statistical significance; the most accurate and recommended methods are parametric bootstrapping, randomisation tests and Bayesian approaches. We advocate the use of LMM- and GLMM-based approaches mainly because of the ease with which confounding variables can be controlled for. Furthermore, we compare two types of repeatability (ordinary repeatability and extrapolated repeatability) in relation to narrow-sense heritability. This review serves as a collection of guidelines and recommendations for biologists to calculate repeatability and heritability from both Gaussian and non-Gaussian data.
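As a companion to the review's ANOVA-based estimator and its recommended parametric bootstrap, here is a minimal sketch for Gaussian data; the simulated data, group structure, and bootstrap scheme are illustrative assumptions, not the review's code.

```python
# ANOVA-based repeatability (ICC) for unequal group sizes, with a
# parametric-bootstrap confidence interval.
import numpy as np

rng = np.random.default_rng(7)

def icc_anova(groups):
    k = len(groups)
    n = np.array([len(g) for g in groups])
    N = n.sum()
    grand = np.concatenate(groups).mean()
    msb = sum(ni * (g.mean() - grand) ** 2 for ni, g in zip(n, groups)) / (k - 1)
    msw = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - k)
    n0 = (N - (n ** 2).sum() / N) / (k - 1)    # effective group size
    s2_a = max((msb - msw) / n0, 0.0)          # between-individual variance
    return s2_a / (s2_a + msw)

# Simulated example: 30 individuals, 4 measurements each, true ICC = 0.5
groups = [rng.normal(rng.normal(0, 1.0), 1.0, 4) for _ in range(30)]
est = icc_anova(groups)

# Parametric bootstrap: re-estimate on data simulated from the fitted model
s2_total = np.concatenate(groups).var(ddof=1)
boot = []
for _ in range(1000):
    sim = [rng.normal(rng.normal(0, np.sqrt(est * s2_total)),
                      np.sqrt((1 - est) * s2_total), len(g)) for g in groups]
    boot.append(icc_anova(sim))
print(est, np.percentile(boot, [2.5, 97.5]))
```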

3.
In oncology studies with immunotherapies, populations of "super-responders" (patients in whom the treatment works particularly well) are often suspected to be related to biomarkers. In this paper, we explore various ways of confirmatory statistical hypothesis testing for joint inference on the subpopulation of putative "super-responders" and the full study population. A model-based testing framework is proposed, which allows one to define, up front, the strength of evidence required from both the full population and the subpopulation in terms of clinical efficacy. This framework is based on a two-way analysis of variance (ANOVA) model with an interaction, in combination with multiple comparison procedures. The ease of implementation of this model-based approach is emphasized, and details are provided for the practitioner who would like to adopt it. The discussion is exemplified by a hypothetical trial that uses an immune marker in oncology to define the subpopulation and tumor growth as the primary endpoint.
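A minimal sketch of the kind of model the framework builds on: a treatment-by-biomarker model with interaction, with the full-population and subpopulation treatment contrasts tested jointly. The simple Bonferroni adjustment and all data values are illustrative assumptions; the paper uses more refined multiple comparison procedures.

```python
# Two-way treatment-by-marker model with interaction; joint one-sided tests
# of the treatment contrast in the full population and in the subpopulation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Simulated tumour-growth outcome: treatment helps more in marker-positives
effects = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): -0.3, (1, 1): -1.0}
rows, y = [], []
for trt in (0, 1):
    for marker in (0, 1):
        for _ in range(40):
            rows.append((trt, marker, trt * marker))
            y.append(effects[(trt, marker)] + rng.normal(0, 1))
X = np.column_stack([np.ones(len(y)), np.array(rows)])  # 1, trt, marker, trt:marker
y = np.array(y)

beta, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
df = len(y) - X.shape[1]
cov = (rss[0] / df) * np.linalg.inv(X.T @ X)

# Treatment contrasts: full population (50% marker prevalence) and subpopulation
for name, c in [("full", np.array([0, 1, 0, 0.5])),
                ("marker-positive", np.array([0, 1, 0, 1.0]))]:
    t = (c @ beta) / np.sqrt(c @ cov @ c)
    p = stats.t.cdf(t, df)                     # one-sided: negative growth = benefit
    print(name, round(float(t), 2), min(2 * p, 1.0))  # Bonferroni over the two tests
```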

4.
Thomas Burger, Proteomics 2023, 23(18):2200406
In discovery proteomics, as well as many other "omic" approaches, the possibility to test for the differential abundance of hundreds (or thousands) of features simultaneously is appealing, despite requiring specific statistical safeguards, among which controlling the false discovery rate (FDR) has become standard. Moreover, when more than two biological conditions or group treatments are considered, it has become customary to rely on the one-way analysis of variance (ANOVA) framework, where a first global differential abundance landscape provided by an omnibus test can be subsequently refined using various post-hoc tests (PHTs). However, the interactions between the FDR control procedures and the PHTs are complex, because both correspond to different types of multiple test corrections (MTCs). This article surveys various ways to orchestrate them in a data processing workflow and discusses their pros and cons.
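A minimal sketch of the workflow surveyed: per-feature omnibus ANOVA, Benjamini-Hochberg FDR control on the omnibus p-values, and post-hoc pairwise tests restricted to the discoveries. Dimensions and effect sizes are illustrative assumptions, and the PHT step is shown without the additional corrections the article discusses.

```python
# Omnibus one-way ANOVA per feature across three conditions, then BH-FDR,
# then pairwise post-hoc t-tests on the surviving features.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_features, n_rep = 1000, 5
a = rng.normal(0, 1, (n_features, n_rep))
b = rng.normal(0, 1, (n_features, n_rep))
c = rng.normal(0, 1, (n_features, n_rep))
c[:100] += 1.5                      # 100 truly differential features

pvals = np.array([stats.f_oneway(a[i], b[i], c[i]).pvalue
                  for i in range(n_features)])

def benjamini_hochberg(p, q=0.05):
    order = np.argsort(p)
    thresh = q * np.arange(1, len(p) + 1) / len(p)
    below = p[order] <= thresh
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    return order[:k]                # indices of FDR-controlled discoveries

hits = benjamini_hochberg(pvals)
# Post-hoc pairwise tests only on the FDR-surviving features
for i in hits[:5]:
    print(i, stats.ttest_ind(a[i], c[i]).pvalue, stats.ttest_ind(b[i], c[i]).pvalue)
```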

5.
In the past, many multiple comparison procedures were difficult to perform. Usually, such procedures can be traced back to studentized multiple contrast tests. Numerical difficulties restricted the use of the exact procedures to simple, commonly balanced, designs. Conservative approximations or simulation-based approaches have been used in the general cases. However, new efforts and results in the past few years have led to fast and efficient computation of the underlying multidimensional integrals. Inferences for any finite set of linear functions of normal means are now numerically feasible. These include all-pairwise comparisons, comparisons with a control (including dose-response contrasts), multiple comparisons with the best, etc. The article applies this numerical progress to multiple comparison procedures for common balanced and unbalanced designs within the general linear model.
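The computational core is the quantile of the maximum of correlated t statistics. Below is a minimal Monte Carlo sketch for Dunnett-type many-to-one comparisons in a balanced one-way layout; the exact methods discussed above evaluate the corresponding multivariate t integral instead of simulating it, and the design values are illustrative assumptions.

```python
# Simulated critical value for simultaneous two-sided comparisons of three
# treatments against a shared control (Dunnett-type) in a balanced design.
import numpy as np

rng = np.random.default_rng(5)
k, n, alpha = 4, 10, 0.05           # 4 groups (control + 3 treatments), n each
df = k * (n - 1)

reps = 200_000
# Under H0, group means are iid N(0, 1/n); pooled variance ~ chi2_df / df
means = rng.normal(0, np.sqrt(1 / n), (reps, k))
s2 = rng.chisquare(df, reps) / df
# t statistics of the "treatment - control" contrasts, correlated through
# the shared control mean in column 0
t = (means[:, 1:] - means[:, :1]) / np.sqrt(s2[:, None] * (2 / n))
max_abs_t = np.abs(t).max(axis=1)
crit = np.quantile(max_abs_t, 1 - alpha)
print(crit)  # exact multivariate-t methods give this quantile without MC error
```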

6.
Many traits studied in ecology and evolutionary biology change their expression in response to a continuously varying environmental factor. One well-studied example is thermal performance curves (TPCs): continuous reaction norms that describe the relationship between organismal performance and temperature and are useful for understanding the trade-offs involved in thermal adaptation. We characterized curves describing the thermal sensitivity of voluntary locomotor activity in a set of 66 spontaneous mutation accumulation lines in the fly Drosophila serrata. Factor-analytic modeling of the mutational variance–covariance matrix, M, revealed support for three axes of mutational variation in males and two in females. These independent axes of mutational variance corresponded well to the major axes of TPC variation required for different types of thermal adaptation: "faster-slower", representing changes in performance largely independent of temperature, and the "hotter-colder" and "generalist-specialist" axes, representing trade-offs. In contrast to its near-absence from standing variance in this species, the "faster-slower" axis accounted for most mutational variance (75% in males and 66% in females), suggesting selection may easily fix or remove these types of mutations in outbred populations. Axes resembling the "hotter-colder" and "generalist-specialist" modes of variation contributed less mutational variance but nonetheless point to an appreciable input of new mutations that may contribute to thermal adaptation.
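A minimal sketch of the geometry behind the analysis: extracting major axes of mutational variance from a variance–covariance matrix M by eigendecomposition (the study fits formal factor-analytic mixed models). The matrix below is an illustrative assumption, not the paper's estimate.

```python
# Eigendecomposition of a hypothetical mutational covariance matrix for
# activity measured at 5 temperatures.
import numpy as np

M = np.array([[1.0, 0.8, 0.6, 0.4, 0.2],
              [0.8, 1.0, 0.8, 0.6, 0.4],
              [0.6, 0.8, 1.0, 0.8, 0.6],
              [0.4, 0.6, 0.8, 1.0, 0.8],
              [0.2, 0.4, 0.6, 0.8, 1.0]])

eigval, eigvec = np.linalg.eigh(M)
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]
share = eigval / eigval.sum()

# Loadings all of one sign ~ "faster-slower" (performance shifts up or down
# at every temperature); a sign change across temperatures ~ "hotter-colder"
# style trade-off.
for i in range(2):
    print(f"axis {i + 1}: {share[i]:.0%} of variance, loadings {eigvec[:, i].round(2)}")
```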

7.
Plant breeders and variety testing agencies routinely test candidate genotypes (crop varieties, lines, test hybrids) in multiple environments. Such multi-environment trials can be efficiently analysed by mixed models. A single-stage analysis models the entire observed data at the level of individual plots. This kind of analysis is usually considered the gold standard. In practice, however, it is more convenient to use a two-stage approach, in which experiments are first analysed per environment, yielding adjusted means per genotype, which are then summarised across environments in the second stage. Stage-wise approaches suggested so far are approximate in that they cannot fully reproduce a single-stage analysis, except in very simple cases, because the variance–covariance matrix of adjusted means from individual environments needs to be approximated by a diagonal matrix. This paper proposes a fully efficient stage-wise method, which carries forward the full variance–covariance matrix of adjusted means from the individual environments to the analysis across the series of trials. Provided the variance components are known, this method can fully reproduce the results of a single-stage analysis. Computations are made efficient by a diagonalisation of the residual variance–covariance matrix, which necessitates a corresponding linear transformation of both the first-stage estimates (e.g. adjusted means and regression slopes for plot covariates) and the corresponding design matrices for fixed and random effects. We also exemplify the extension of the general approach to a three-stage analysis. The method is illustrated using two datasets, one real and the other simulated. The proposed approach has close connections with meta-analysis, where environments correspond to centres and genotypes to medical treatments. We therefore compare our theoretical results with recently published results from a meta-analysis.
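A minimal sketch of the key computational device: whitening (diagonalising) the stage-one variance–covariance matrix V so that ordinary least squares on the transformed adjusted means and design reproduces the generalised least squares analysis that carries the full V forward. All numbers are illustrative assumptions.

```python
# Second-stage analysis of genotype adjusted means from several environments,
# carrying forward the full stage-one covariance V via Cholesky whitening.
import numpy as np

rng = np.random.default_rng(2)
g, e = 4, 3                              # genotypes, environments
mu = np.tile([5.0, 5.5, 6.0, 6.5], e)    # true genotype means, same in each env
# Within each environment the adjusted means are correlated (non-diagonal V)
block = 0.05 * np.eye(g) + 0.02
V = np.kron(np.eye(e), block)            # full stage-one vcov, carried forward
y = rng.multivariate_normal(mu, V)       # stacked adjusted means (env by genotype)

X = np.tile(np.eye(g), (e, 1))           # second-stage design: genotype means
# Whitening: T V T' = I, so transform both means and design, then use OLS
T = np.linalg.inv(np.linalg.cholesky(V))
beta_ols, *_ = np.linalg.lstsq(T @ X, T @ y, rcond=None)
beta_gls = np.linalg.solve(X.T @ np.linalg.solve(V, X),
                           X.T @ np.linalg.solve(V, y))
print(beta_ols, beta_gls)                # identical up to floating point
```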

8.
Current methodology uses a multistage dose-response formula to represent the dose-response curve of laboratory bioassays adequately at high doses, and to extrapolate to low doses. Standard likelihood methods are described to evaluate an uncertainty distribution for the linear term of the multistage formula, exactly analogous to the current method of obtaining a “95% confidence limit.” In the standard methodology, the number of terms in the dose-response formula and the assumed form for the distribution used to obtain the “95% confidence limit” are somewhat arbitrarily chosen. Modifications are described that allow consistent (although still arbitrary) treatment of all experiments, and potentially allow incorporation of mechanistic ideas about the correctness of the low-dose extrapolation. Also described are extensions that allow incorporation of results from multiple experiments into the uncertainty distribution. The result is a probability distribution for cancer potency factor in laboratory animals.

Empirical results describing the extrapolation between species of the linear term of multistage dose-response formulas are presented, and a method is given for analysis of any dose-response model. It is shown that the interspecies extrapolation as currently performed is well represented by a lognormal distribution with well-defined standard deviation but a median that depends on the species, and that is not representable by any simple allometric scaling law. The best available animal/human comparisons are analyzed in similar fashion to show consistency with the ideas presented and to obtain the best estimates for animal-to-human extrapolation.
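A minimal sketch, with made-up bioassay counts, of maximum-likelihood fitting of a second-order multistage model P(d) = 1 - exp(-(q0 + q1*d + q2*d^2)), whose linear term q1 drives the low-dose extrapolation discussed above.

```python
# Binomial maximum likelihood for the multistage dose-response model.
import numpy as np
from scipy.optimize import minimize

dose = np.array([0.0, 0.5, 1.0, 2.0])
n = np.array([50, 50, 50, 50])          # animals per dose group (hypothetical)
x = np.array([2, 8, 15, 33])            # tumour-bearing animals (hypothetical)

def neg_log_lik(q):
    p = 1 - np.exp(-(q[0] + q[1] * dose + q[2] * dose ** 2))
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(x * np.log(p) + (n - x) * np.log(1 - p)).sum()

fit = minimize(neg_log_lik, x0=[0.05, 0.1, 0.1],
               bounds=[(0, None)] * 3, method="L-BFGS-B")
q0, q1, q2 = fit.x
print("linear term q1 =", q1)           # slope governing low-dose risk
# A profile-likelihood upper bound on q1 plays the role of the "95%
# confidence limit" on cancer potency described in the abstract.
```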


9.
The proposed procedure "SECOLICO" is based on the sequential construction of linear contrasts. After an analysis of variance, the procedure classifies the treatments, represented by mean values of unequal-sized random samples, into distinguishable groups. A simplified example is given to illustrate the "SECOLICO" procedure.

10.
Mathew T, Nordström K, Biometrics 1999, 55(4):1221-1223
When data come from several independent studies for the purpose of estimating treatment–control differences, meta-analysis can be carried out either on the best linear unbiased estimators computed from each study or on the pooled individual patient data modelled as a two-way model without interaction, where the two factors represent the different studies and the different treatments. Assuming that observations within and between studies are independent with a common variance, Olkin and Sampson (1998) obtained the surprising result that the two meta-analytic procedures are equivalent, i.e., they both produce the same estimator. In this article, the same equivalence is established for the two-way fixed-effects model without interaction under the sole assumption that the observations across studies are independent. A consequence of the equivalence result is that, regardless of the covariance structure, it is possible to get an explicit representation for the best linear unbiased estimator of any vector of treatment contrasts in a two-way fixed-effects model without interaction as long as the studies are independent. Another interesting consequence is that, for the purpose of best linear unbiased estimation, an unbalanced two-way fixed-effects model without interaction can be treated as several independent unbalanced one-way models, regardless of the covariance structure, when the studies are independent.
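A minimal numerical check of the equivalence result for the two-treatment, common-variance case: inverse-variance meta-analysis of the per-study difference estimates equals the treatment effect from one pooled two-way (study + treatment, no interaction) fixed-effects fit. The simulated data are an illustrative assumption.

```python
# Meta-analysis of per-study BLUEs vs. pooled two-way fixed-effects model.
import numpy as np

rng = np.random.default_rng(11)
studies = 5
diffs, weights, Xs, ys = [], [], [], []
for s in range(studies):
    n = rng.integers(5, 15)              # unbalanced arm sizes
    m = rng.integers(5, 15)
    ctrl = rng.normal(s * 0.5, 1.0, n)   # study effects differ across studies
    trt = rng.normal(s * 0.5 + 1.0, 1.0, m)
    diffs.append(trt.mean() - ctrl.mean())
    weights.append(1 / (1 / n + 1 / m))  # inverse variance, common sigma^2
    for arm, t in [(ctrl, 0), (trt, 1)]:
        for v in arm:
            row = np.zeros(studies + 1)  # study dummies + treatment indicator
            row[s] = 1
            row[-1] = t
            Xs.append(row)
            ys.append(v)

meta = np.average(diffs, weights=weights)       # meta-analysis of study BLUEs
X, y = np.array(Xs), np.array(ys)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # pooled two-way model
print(meta, beta[-1])                           # identical estimates
```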

11.
Characterizing an appropriate dose-response relationship and identifying the right dose in a clinical trial are two main goals of early drug development. MCP-Mod is one of the pioneering approaches developed within the last 10 years that combines modeling techniques with multiple comparison procedures to address these goals in clinical drug development. The MCP-Mod approach begins with a set of potential dose-response models, tests for a significant dose-response effect (proof of concept, PoC) using multiple linear contrast tests, and selects the "best" model among those with a significant contrast test. A disadvantage of the method is that the parameter values of the candidate models need to be fixed a priori for the contrast tests. This may lead to a loss in power and unreliable model selection. For this reason, several variations of the MCP-Mod approach and a hierarchical model selection approach have been suggested in which the parameter values need not be fixed in the proof-of-concept testing step and can be estimated after the model selection step. This paper provides a numerical comparison of the different MCP-Mod variants and the hierarchical model selection approach with regard to their ability to detect the dose-response trend, their potential to select the correct model, and their accuracy in estimating the dose-response shape and minimum effective dose. Additionally, as one of the approaches is based on two-sided model comparisons only, we make it more consistent with the common goals of a PoC study by extending it to one-sided comparisons between the constant and alternative candidate models in the proof-of-concept step.
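A minimal sketch of the MCP step of MCP-Mod: contrast tests for a dose-response trend built from candidate model shapes with a-priori fixed parameters (exactly the fixing that the variants compared here try to avoid). Doses, guesstimates, and data are illustrative assumptions, the pooled SD is treated as known, and the multiplicity-adjusted critical value for the maximum statistic is omitted.

```python
# Model-based contrast tests for proof of concept in a balanced dose-finding trial.
import numpy as np
from scipy import stats

doses = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
n = 20                                   # patients per dose arm (balanced)

# Candidate model shapes with fixed "guesstimate" parameters
shapes = {
    "linear":   doses,
    "emax":     doses / (doses + 0.5),   # ED50 = 0.5 fixed a priori
    "exponent": np.expm1(doses / 2.0),   # rate parameter fixed a priori
}

rng = np.random.default_rng(4)
# Observed arm means under a true Emax-type dose-response
ybar = 1.0 * doses / (doses + 1.0) + rng.normal(0, 0.3 / np.sqrt(n), len(doses))
s = 0.3                                  # pooled SD (assumed known here)
df = len(doses) * (n - 1)

for name, mu in shapes.items():
    c = mu - mu.mean()                   # optimal contrast in the balanced case
    c /= np.linalg.norm(c)
    t = (c @ ybar) / (s * np.sqrt((c ** 2).sum() / n))
    print(name, t, 1 - stats.t.cdf(t, df))
# PoC is claimed if the maximum contrast statistic exceeds a multiplicity-
# adjusted critical value; the best-fitting candidate then guides dose selection.
```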

12.
Summary: Analysis of variance and principal components methods have been suggested for estimating repeatability. In this study, six estimation procedures are compared: ANOVA, principal components based on the sample covariance matrix and also on the sample correlation matrix, a related multivariate method (structural analysis) based on the sample covariance matrix and also on the sample correlation matrix, and maximum likelihood estimation. A simulation study indicates that when the standard linear model assumptions are met, the estimators are quite similar except when the repeatability is small. Overall, maximum likelihood appears the preferred method. If the assumption of equal variance is relaxed, the methods based on the sample correlation matrix perform better, although others are surprisingly robust. The structural analysis method (with sample correlation matrix) appears to be best. (Paper number 776 from the Department of Meat and Animal Science, University of Wisconsin-Madison.)

13.
Completeness of registration is one of the quality indicators usually reported by cancer registries. It allows researchers to assess how useful and representative the data are. Several methods have been suggested to estimate completeness. In this paper a multi-state model for the process of cancer diagnosis and treatment is presented. In principle, every contact with a doctor during diagnosis, treatment, and aftercare can give rise to a cancer registry notification with a certain probability. Therefore the states included in the model are "incident tumour" and "death" but also contacts with doctors, such as consultation of a general practitioner or specialised doctor, diagnostic procedures, therapeutic interventions, and aftercare. In this model, transitions between states and possible notifications to a cancer registry after entering a state are simulated. Transition intensities are derived and used in the simulation. Several capture-recapture methods have been applied to the simulated data. Simulated "true" numbers of new cases and simulated numbers of registrations are both available, which allows one to assess the validity of the completeness estimates and to compare the relative merits of the methods. In the scenarios investigated here, all capture-recapture estimators tended to underestimate completeness. While a modified DCN method and one type of log-linear model yielded quite reasonable estimates, other methods exhibited large variability or grossly underestimated completeness.
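For orientation, a minimal sketch of a classical two-source capture-recapture completeness estimate (Chapman's estimator), of the general kind evaluated in the simulation study; the notification counts are illustrative assumptions.

```python
# Two-source capture-recapture estimate of registry completeness.
def chapman_completeness(n_clinical, n_pathology, n_both, n_registered):
    # Chapman's nearly unbiased estimate of the true number of incident cases
    n_hat = (n_clinical + 1) * (n_pathology + 1) / (n_both + 1) - 1
    return n_registered / n_hat

# e.g. 800 clinical notifications, 700 pathology notifications, 600 in both,
# and 850 distinct cases registered overall:
print(chapman_completeness(800, 700, 600, 850))
```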

14.
Recent developments in the field of fluorescence lifetime imaging microscopy (FLIM) techniques allow the use of high repetition rate light sources in live cell experiments. For light sources with a repetition rate of 20–100 MHz, time-correlated single photon counting (TCSPC) FLIM systems suffer serious dead-time-related distortions, known as "inter-pulse pile-up". The objective of this paper is to present a new method to quantify the level of signal distortion in TCSPC FLIM experiments, in order to determine the most efficient laser repetition rate for different FLT ranges. Optimization of the F-value, the ratio of the relative standard deviation (RSD) of the measured FLT to the RSD of the measured fluorescence intensity (FI), allows quantification of the level of FI signal distortion, as well as determination of the correct FLT of the measurement. It is shown that by using a very high repetition rate (80 MHz) for samples characterized by long real FLTs (4–5 ns), virtual short FLT components are added to the FLT histogram and an F-value higher than 1 is obtained. For samples characterized by short real FLTs, virtual long FLT components are added to the FLT histogram at the lower repetition rates (20–50 MHz), while at a higher repetition rate (80 MHz) the "inter-pulse pile-up" is eliminated and the F-value is close to 1.
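A minimal sketch of the figure of merit described above: the F-value as the ratio of the RSD of repeated lifetime estimates to the RSD of the photon counts, which approaches 1 for an ideal shot-noise-limited TCSPC measurement. The mono-exponential simulation and parameter values are illustrative assumptions.

```python
# F-value of a simulated shot-noise-limited lifetime measurement.
import numpy as np

rng = np.random.default_rng(8)
true_tau, n_photons, n_runs = 4.0, 10_000, 200   # ns, photons per run, runs

flt, intensity = [], []
for _ in range(n_runs):
    n = rng.poisson(n_photons)                   # detected photons (intensity)
    arrival = rng.exponential(true_tau, n)       # mono-exponential decay times
    flt.append(arrival.mean())                   # ML lifetime estimate
    intensity.append(n)

def rsd(x):
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / x.mean()

F = rsd(flt) / rsd(intensity)
print(F)   # ~1 here; F > 1 signals inter-pulse pile-up or other distortion
```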

15.
Published X-ray crystallographic structures for glycoside hydrolases (GHs) from 39 different families are surveyed according to some rigorous selection criteria, and the distances separating 208 pairs of catalytic carboxyl groups (20 α-retaining, 87 β-retaining, 38 α-inverting, and 63 β-inverting) are analyzed. First, the average of all four inter-carboxyl O–O distances for each pair is determined; second, the mean of all the pair-averages within each GH family is determined; third, means are determined for groups of GH families. No significant differences are found for free structures compared with those complexed with a ligand in the active site of the enzyme, nor for α-GHs as compared with β-GHs. The mean and standard deviation (1σ) of the unimodal distribution of average O–O distances for all families of inverting GHs is 8 ± 2 Å, with a very wide range from 5 Å (GH82) to nearly 13 Å (GH46). The distribution of average O–O distances for all families of retaining GHs appears to be bimodal: the means and standard deviations of the two groups are 4.8 ± 0.3 Å and 6.4 ± 0.6 Å. These average values are more representative, and more likely to be meaningful, than the often-quoted literature values, which are based on a very small sample of structures. The newly updated average values proposed here may alter perceptions about what separations between catalytic residues are "normal" or "abnormal" for GHs. Proteins 2014; 82:1747–1755.
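A minimal sketch of the survey's first step: averaging the four O–O distances between the carboxylate oxygens of a catalytic residue pair. The coordinates are illustrative assumptions, not taken from any PDB entry.

```python
# Pair-average of the four inter-carboxyl O-O distances for one residue pair.
import numpy as np

# Carboxyl oxygens (OD1/OD2 or OE1/OE2) of the two catalytic residues, in angstroms
acid_1 = np.array([[10.0, 5.0, 3.0], [11.2, 5.5, 3.4]])
acid_2 = np.array([[14.5, 5.2, 3.1], [15.3, 6.0, 2.7]])

pair_average = np.mean([np.linalg.norm(o1 - o2)
                        for o1 in acid_1 for o2 in acid_2])
print(pair_average)   # one pair-average; family means then average these values
```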

16.
MIXED MODEL APPROACHES FOR ESTIMATING GENETIC VARIANCES AND COVARIANCES
The limitations of analysis-of-variance (ANOVA) methods for estimating genetic variances are discussed. Among the three methods for mixed linear models (maximum likelihood, ML; restricted maximum likelihood, REML; and minimum norm quadratic unbiased estimation, MINQUE), the MINQUE method is presented with formulae for estimating variance and covariance components and for predicting genetic effects. Several genetic models that cannot be appropriately analyzed by ANOVA methods are introduced in the form of mixed linear models. Genetic models with independent random effects can be analyzed by the MINQUE(1) method, a MINQUE method with all prior values set to 1. The MINQUE(1) method gives unbiased estimates of variance and covariance components, and linear unbiased prediction (LUP) of genetic effects. More complicated genetic models for plant seeds involve correlated random effects; the MINQUE(0/1) method, a MINQUE method with all prior covariances set to 0 and all prior variances set to 1, is suitable for estimating variance and covariance components in these models. Mixed model approaches have advantages over ANOVA methods in their capacity to analyze unbalanced data and complicated models. Some problems concerning estimation and hypothesis testing with the MINQUE method are discussed.
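A minimal sketch of MINQUE(1) for a one-random-factor mixed model: with all prior values set to 1, solve the MINQUE equations S θ = q for the variance components. The simulated data are an illustrative assumption, and the code covers only this simplest case, not the seed models with correlated random effects.

```python
# MINQUE(1) variance-component estimation for y = Xb + U a + e.
import numpy as np

rng = np.random.default_rng(6)
groups, per = 10, 5
N = groups * per
U = np.kron(np.eye(groups), np.ones((per, 1)))     # random-effect incidence
X = np.ones((N, 1))                                # overall mean only
y = X @ [2.0] + U @ rng.normal(0, np.sqrt(2.0), groups) + rng.normal(0, 1.0, N)

Vs = [U @ U.T, np.eye(N)]                          # V_i = U_i U_i' (residual = I)
V0 = sum(Vs)                                       # prior values all set to 1
V0inv = np.linalg.inv(V0)
P = V0inv - V0inv @ X @ np.linalg.solve(X.T @ V0inv @ X, X.T @ V0inv)

# MINQUE equations: S[i, j] = tr(P V_i P V_j), q[i] = y' P V_i P y
S = np.array([[np.trace(P @ Vi @ P @ Vj) for Vj in Vs] for Vi in Vs])
q = np.array([y @ P @ Vi @ P @ y for Vi in Vs])
theta = np.linalg.solve(S, q)
print(theta)    # estimates of (sigma_group^2, sigma_e^2), near (2, 1)
```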

17.
The mixed-model factorial analysis of variance has been used in many recent studies in evolutionary quantitative genetics. Two competing formulations of the mixed-model ANOVA are commonly used, the “Scheffe” model and the “SAS” model; these models differ in both their assumptions and in the way in which variance components due to the main effect of random factors are defined. The biological meanings of the two variance component definitions have often been unappreciated, however. A full understanding of these meanings leads to the conclusion that the mixed-model ANOVA could have been used to much greater effect by many recent authors. The variance component due to the random main effect under the two-way SAS model is the covariance in true means associated with a level of the random factor (e.g., families) across levels of the fixed factor (e.g., environments). Therefore the SAS model has a natural application for estimating the genetic correlation between a character expressed in different environments and testing whether it differs from zero. The variance component due to the random main effect under the two-way Scheffe model is the variance in marginal means (i.e., means over levels of the fixed factor) among levels of the random factor. Therefore the Scheffe model has a natural application for estimating genetic variances and heritabilities in populations using a defined mixture of environments. Procedures and assumptions necessary for these applications of the models are discussed. While exact significance tests under the SAS model require balanced data and the assumptions that family effects are normally distributed with equal variances in the different environments, the model can be useful even when these conditions are not met (e.g., for providing an unbiased estimate of the across-environment genetic covariance). Contrary to statements in a recent paper, exact significance tests regarding the variance in marginal means as well as unbiased estimates can be readily obtained from unbalanced designs with no restrictive assumptions about the distributions or variance-covariance structure of family effects.
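A minimal simulated illustration of the two definitions: the SAS-model component as the covariance of family means across two environments, and the Scheffe-model component as the variance of family means averaged over environments. All parameter values are illustrative assumptions.

```python
# Empirical analogues of the "SAS" and "Scheffe" random-main-effect components.
import numpy as np

rng = np.random.default_rng(9)
fams, reps = 200, 20
# True family effects in env 1 and env 2; cross-environment genetic cov = 0.6
G = np.array([[1.0, 0.6], [0.6, 1.0]])
fam_eff = rng.multivariate_normal([0, 0], G, fams)
data = fam_eff[:, :, None] + rng.normal(0, 1.0, (fams, 2, reps))

fam_means = data.mean(axis=2)                     # per family, per environment
sas_comp = np.cov(fam_means[:, 0], fam_means[:, 1])[0, 1]
scheffe_comp = fam_means.mean(axis=1).var(ddof=1)
print(sas_comp, scheffe_comp)
# sas_comp ~ 0.6 (the across-environment genetic covariance);
# scheffe_comp ~ (G11 + G22 + 2*G12)/4 = 0.8, plus a small error-variance term.
```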

18.
The coverage probabilities of several confidence limit estimators of genetic parameters, obtained from North Carolina I designs, were assessed by means of Monte Carlo simulations. The reliability of the estimators was compared under three different parental sample sizes. The coverage of confidence intervals set on the Normal distribution, and using standard errors either computed by the “delta” method or derived using an approximation for the variance of a variance component estimated by means of a linear combination of mean squares, was affected by the number of males and females included in the experiment. The “delta” method was found to provide reliable standard errors of the genetic parameters only when at least 48 males were each mated to six different females randomly selected from the reference population. Formulae are provided for obtaining “delta” method standard errors, and appropriate statistical software procedures are discussed. The error rates of confidence limits based on the Normal distribution and using standard errors obtained by an approximation for the variance of a variance component varied widely. The coverage of F-distribution confidence intervals for heritability estimates was not significantly affected by parental sample size and consistently provided a mean coverage near the stated coverage. For small parental sample sizes, confidence intervals for heritability estimates should be based on the F-distribution.
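A minimal sketch of a "delta"-method standard error for a heritability-type ratio of variance components, given an estimated sampling covariance matrix of the components; all values are illustrative assumptions rather than NC1-design formulas.

```python
# Delta-method SE for h2 = s_a / (s_a + s_e) from two variance-component estimates.
import numpy as np

s_a, s_e = 0.8, 2.4                        # estimated variance components
cov = np.array([[0.04, -0.01],             # their estimated sampling covariance
                [-0.01, 0.05]])

total = s_a + s_e
grad = np.array([s_e / total ** 2,         # d h2 / d s_a
                 -s_a / total ** 2])       # d h2 / d s_e
h2 = s_a / total
se = np.sqrt(grad @ cov @ grad)
print(h2, se)                              # Normal-based CI: h2 +/- 1.96 * se
```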

19.
Summary: Cluster randomized trials in health care may involve three instead of two levels, for instance, in trials where different interventions to improve quality of care are compared. In such trials, the intervention is implemented in health care units (“clusters”) and aims at changing the behavior of health care professionals working in this unit (“subjects”), while the effects are measured at the patient level (“evaluations”). Within the generalized estimating equations approach, we derive a sample size formula that accounts for two levels of clustering: that of subjects within clusters and that of evaluations within subjects. The formula reveals that sample size is inflated, relative to a design with completely independent evaluations, by a multiplicative term that can be expressed as a product of two variance inflation factors, one that quantifies the impact of within-subject correlation of evaluations on the variance of subject-level means and the other that quantifies the impact of the correlation between subject-level means on the variance of the cluster means. Power levels as predicted by the sample size formula agreed well with the simulated power for more than 10 clusters in total, when data were analyzed using bias-corrected estimating equations for the correlation parameters in combination with the model-based covariance estimator or the sandwich estimator with a finite sample correction.
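A minimal sketch of the multiplicative inflation described above, written as the product of the two variance inflation factors; the parameter names and this exact factorisation are an assumed reading of the abstract, not the paper's notation.

```python
# Inflated number of evaluations for a three-level cluster randomized trial,
# relative to a design with fully independent evaluations.
def inflated_sample_size(n_indep, m_eval, rho_within_subject,
                         n_subj, rho_between_subject_means):
    vif_subject = 1 + (m_eval - 1) * rho_within_subject          # evaluations -> subject means
    vif_cluster = 1 + (n_subj - 1) * rho_between_subject_means   # subject means -> cluster means
    return n_indep * vif_subject * vif_cluster

# e.g. 400 independent evaluations needed, 10 evaluations (patients) per
# professional, 8 professionals per cluster:
print(inflated_sample_size(400, 10, 0.05, 8, 0.10))
```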

20.
There are many situations where it is desired to make simultaneous tests or give simultaneous confidence intervals for linear combinations (contrasts) of population or treatment means. Somerville (1997, 1999) developed algorithms for calculating the critical values for a large class of simultaneous tests and simultaneous confidence intervals. Fortran 90 and SAS-IML batch programs and interactive programs were developed. These programs calculate the critical values for 15 different simultaneous confidence interval procedures (and the corresponding simultaneous tests) and for arbitrary procedures where the user specifies a combination of one and two sided contrasts. The programs can also be used to obtain the constants for “step-down” testing of multiple hypotheses. This paper gives examples of the use of the algorithms and programs and illustrates their versatility and generality. The designs need not be balanced, multiple covariates may be present and there may be many missing values. The use of multiple regression and dummy variables to obtain the required variance covariance matrix is illustrated. Under weak normality assumptions the methods are “exact” and make the use of approximate methods or “simulation” unnecessary.
