Similar literature
 Found 20 similar articles (search time: 20 ms)
1.
The exact generalization of Gehan's (1965) two-sample test for arbitrarily censored survival data has been overlooked by subsequent work on the multisample problem. We give this general covariance matrix and show how it may be used in test procedures. While this permutation test is less powerful than its competitors in cases where both apply, it may be used on types of data not previously discussed.

2.
Spatial extent inference (SEI) is widely used across neuroimaging modalities to adjust for multiple comparisons when studying brain‐phenotype associations that inform our understanding of disease. Recent studies have shown that Gaussian random field (GRF)‐based tools can have inflated family‐wise error rates (FWERs). This has led to substantial controversy as to which processing choices are necessary to control the FWER using GRF‐based SEI. The failure of GRF‐based methods is due to unrealistic assumptions about the spatial covariance function of the imaging data. A permutation procedure is the most robust SEI tool because it estimates the spatial covariance function from the imaging data. However, the permutation procedure can fail because its assumption of exchangeability is violated in many imaging modalities. Here, we propose the (semi‐) parametric bootstrap joint (PBJ; sPBJ) testing procedures that are designed for SEI of multilevel imaging data. The sPBJ procedure uses a robust estimate of the spatial covariance function, which yields consistent estimates of standard errors, even if the covariance model is misspecified. We use the methods to study the association between performance and executive functioning in a working memory functional magnetic resonance imaging study. The sPBJ has similar or greater power to the PBJ and permutation procedures while maintaining the nominal type 1 error rate in reasonable sample sizes. We provide an R package to perform inference using the PBJ and sPBJ procedures.

3.
Phenotypic and additive genetic covariance matrices were estimated for 15 morphometric characters in three species and subspecies of Peromyscus. Univariate and multivariate ANOVAs indicate these groups are highly diverged in all characters, P. leucopus having the largest body size, P. maniculatus bairdii the smallest, and P. maniculatus nebrascensis being intermediate. Comparing the structure of P and G within each taxon revealed significant similarities in all three cases. This proportionality was strong enough to justify using P in the place of G to analyze evolutionary processes using quantitative genetic models when G cannot be estimated, as in fossil material. However, the similarity between genetic and phenotypic covariance structures is sufficiently low that estimates of the genetic parameters should be used when possible. The additive genetic covariance matrices were compared to examine the assumption that they remain constant during evolution, an assumption which underlies many applications of quantitative-genetic models. While matrix permutation tests indicated statistically significant proportionality between the genetic covariance structures of the two P. maniculatus subspecies, there is no evidence of significant genetic structural similarity between species. This result suggests that the assumption of constant genetic covariance structure may be valid only within species. (It does not, however, necessarily imply a causal relationship between speciation and heterogeneity of genetic covariance structures.) The low matrix correlation for the two P. maniculatus subspecies' genetic covariance matrices indicates G may not be functionally constant, even within species. The lack of similarity observed here may be due partly to sampling variation.
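The abstract above does not say how its matrix permutation tests were set up, so the following is only a generic Mantel-style sketch of the idea. It assumes the statistic is the Pearson correlation of the off-diagonal elements, that rows and columns of one matrix are permuted together, and that the test is one-sided. The function name `matrix_permutation_test` and its parameters are hypothetical.

```python
import numpy as np

def matrix_permutation_test(A, B, n_perm=999, seed=0):
    """Mantel-style permutation test for similarity of two symmetric matrices.

    Assumption (not from the abstract): the statistic is the Pearson
    correlation between the off-diagonal lower-triangle elements.
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    idx = np.tril_indices_from(A, k=-1)           # off-diagonal elements only
    observed = np.corrcoef(A[idx], B[idx])[0, 1]

    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(A.shape[0])
        B_perm = B[np.ix_(perm, perm)]            # permute rows and columns together
        if np.corrcoef(A[idx], B_perm[idx])[0, 1] >= observed:
            exceed += 1

    # One-sided p-value; the +1 counts the observed arrangement itself.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value
```

Called with two covariance matrices of the same traits, the function returns the matrix correlation and a permutation p-value. A small p-value indicates more similarity than expected if the trait labels were exchangeable.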

4.
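Covariance-matrix principal component analysis of human population genetic structure   Cited by: 3 (self-citations: 0, citations by others: 3)
Objective: To examine the applicability and soundness of principal component analysis (PCA) based on the covariance matrix of the centered (or mean-scaled) gene frequency matrix in studies of human population genetic structure. Methods: Starting from the structural characteristics of the gene frequency matrix, we analyzed how PCA of the centered and mean-scaled covariance matrices differs from PCA of the standardized correlation matrix in terms of eigenvalues, eigenvectors, and dimension reduction, and used a worked example to compare how reasonably the different methods explain population genetic structure. Results: The principal components of the centered (or mean-scaled) covariance matrix reflect not only the "variance information weight", which captures the degree of gene variation, but also the "correlation information weight", which captures the degree of mutual influence among genes; the principal components of the standardized correlation matrix reflect only the "correlation information weight" and not the "variance information weight". A comparison of the two PCA results for the HLA-A locus in 26 Chinese Han populations confirmed that PCA of the centered covariance matrix is superior to PCA of the standardized correlation matrix with respect to eigenvalues and eigenvectors, the number of principal components retained, and the soundness of the population-genetic interpretation of the components. Conclusion: When applying PCA to population genetic structure, a centering (or mean-scaling) transformation should be used to remove the effect of magnitude in the gene frequency matrix, and the principal components should then be extracted from its covariance matrix.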
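The contrast drawn in this abstract between covariance-based and correlation-based PCA can be sketched as follows. The allele-frequency matrix here is simulated, not the HLA-A data from the study, and the helper `pca_eigen` and all variable names are hypothetical.

```python
import numpy as np

def pca_eigen(M):
    """Eigenvalues (descending) and matching eigenvectors of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Simulated stand-in for a gene frequency matrix:
# 26 populations (rows) x 5 alleles (columns), each row summing to 1.
rng = np.random.default_rng(1)
freq = rng.dirichlet(alpha=[8, 4, 2, 1, 0.5], size=26)

# Centered covariance matrix: each allele keeps its own variance.
cov = np.cov(freq - freq.mean(axis=0), rowvar=False)
# Standardized correlation matrix: every allele is rescaled to unit variance.
corr = np.corrcoef(freq, rowvar=False)

cov_vals, cov_vecs = pca_eigen(cov)
corr_vals, corr_vecs = pca_eigen(corr)

# Share of total variation carried by each principal component.
cov_share = cov_vals / cov_vals.sum()
corr_share = corr_vals / corr_vals.sum()
```

The eigenvalues of the correlation matrix always sum to the number of alleles, so differences in variance between alleles cannot affect the components. The eigenvalues of the covariance matrix sum to the total variance, which is the "variance information" the abstract refers to.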

5.
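Building on the study "Numerical taxonomy of the subfamily Plusiinae", this paper uses the results of principal component analysis to further analyze the characters used to classify the Plusiinae, showing the importance of each character to the classification and the direction of character variation. It also compares the results of principal component analyses based on the covariance matrix and on the correlation matrix, showing that for this problem (in which all classification indicators are qualitative) principal component analysis based on the covariance matrix is appropriate.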

6.
Fu J, Murphy RW. Systematic Biology 1999, 48(2): 380-395
The ability of permutation tail probability (PTP) analyses to discriminate between character covariance and noise is investigated with both hypothetical and published data sets. PTP is shown to be a powerful tool, not only for detecting character covariance, but also for locating that covariance on trees. PTP is especially useful for evaluating DNA sequence data that may have a high level of homoplasy. A three-step PTP procedure for locating covaried characters is presented.

7.
Rosenbaum PR. Biometrics 2007, 63(2): 456-464
Huber's m-estimates use an estimating equation in which observations are permitted a controlled level of influence. The family of m-estimates includes least squares and maximum likelihood, but typical applications give extreme observations limited weight. Maritz proposed methods of exact and approximate permutation inference for m-tests, confidence intervals, and estimators, which can be derived from random assignment of paired subjects to treatment or control. In contrast, in observational studies, where treatments are not randomly assigned, subjects matched for observed covariates may differ in terms of unobserved covariates, so differing outcomes may not be treatment effects. In observational studies, a method of sensitivity analysis is developed for m-tests, m-intervals, and m-estimates: it shows the extent to which inferences would be altered by biases of various magnitudes due to nonrandom treatment assignment. The method is developed for both matched pairs, with one treated subject matched to one control, and for matched sets, with one treated subject matched to one or more controls. The method is illustrated using two studies: (i) a paired study of damage to DNA from exposure to chromium and nickel and (ii) a study with one or two matched controls comparing side effects of two drug regimes to treat tuberculosis. The approach yields sensitivity analyses for: (i) m-tests with Huber's weight function and other robust weight functions, (ii) the permutational t-test which uses the observations directly, and (iii) various other procedures such as the sign test, Noether's test, and the permutation distribution of the efficient score test for a location family of distributions. Permutation inference with covariance adjustment is briefly discussed.
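The following sketch illustrates only the first two sentences of this abstract, namely how Huber's weight function limits the influence of extreme observations. It is not the sensitivity analysis the paper develops. The function name `huber_location`, the tuning constant `k=1.345`, and the fixed MAD-based scale are conventional choices assumed here, not details from the abstract.

```python
import numpy as np

def huber_location(x, k=1.345, tol=1e-8, max_iter=100):
    """Huber m-estimate of location by iteratively reweighted averaging."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    # Robust scale from the median absolute deviation, held fixed (an assumption).
    scale = np.median(np.abs(x - mu)) / 0.6745
    if scale == 0:
        return mu
    for _ in range(max_iter):
        r = np.abs(x - mu) / scale
        # Huber weights: 1 inside [-k, k], k/|r| outside, so large residuals
        # have bounded influence on the estimate.
        w = np.minimum(1.0, k / np.maximum(r, 1e-12))
        new_mu = np.sum(w * x) / np.sum(w)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu
```

With a very large `k` every weight equals 1 and the estimate reduces to the sample mean, which is the sense in which least squares belongs to the m-estimate family. With the default `k`, a single gross outlier moves the estimate only slightly away from the bulk of the data.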

8.
Statistical methods to map quantitative trait loci (QTL) in outbred populations are reviewed, extensions and applications to human and plant genetic data are indicated, and areas for further research are identified. Simple and computationally inexpensive methods include (multiple) linear regression of phenotype on marker genotypes and regression of squared phenotypic differences among relative pairs on estimated proportions of identity-by-descent at a locus. These methods are less suited for genetic parameter estimation in outbred populations but allow the determination of test statistic distributions via simulation or data permutation; however, further inferences including confidence intervals of QTL location require the use of Monte Carlo or bootstrap sampling techniques. A method which is intermediate in computational requirements is residual maximum likelihood (REML) with a covariance matrix of random QTL effects conditional on information from multiple linked markers. Testing for the number of QTLs on a chromosome is difficult in a classical framework. The computationally most demanding methods are maximum likelihood and Bayesian analysis, which take account of the distribution of multilocus marker-QTL genotypes on a pedigree and permit investigators to fit different models of variation at the QTL. The Bayesian analysis includes the number of QTLs on a chromosome as an unknown.

9.
In a multivariate growth-curve model, the estimator of the parameter matrix is a function of the matrix of the sums of squares and of the cross-products due to error. However, if the assumption of a patterned covariance matrix is valid, then the parameter estimator does not depend on the error matrix. A likelihood ratio test of this patterned covariance matrix is constructed and its distribution is discussed. A numerical example is provided in which the design consists of two treatment groups, with three repeated measures being taken of the three response variables.

10.
Summary: Analysis of variance and principal components methods have been suggested for estimating repeatability. In this study, six estimation procedures are compared: ANOVA, principal components based on the sample covariance matrix and also on the sample correlation matrix, a related multivariate method (structural analysis) based on the sample covariance matrix and also on the sample correlation matrix, and maximum likelihood estimation. A simulation study indicates that when the standard linear model assumptions are met, the estimators are quite similar except when the repeatability is small. Overall, maximum likelihood appears the preferred method. If the assumption of equal variance is relaxed, the methods based on the sample correlation matrix perform better although others are surprisingly robust. The structural analysis method (with sample correlation matrix) appears to be best. Paper number 776 from the Department of Meat and Animal Science, University of Wisconsin-Madison.
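Of the six procedures this abstract compares, the ANOVA estimator is the simplest to write down. The sketch below assumes a balanced one-way layout (n individuals, each measured k times) and uses the usual intraclass-correlation form; the abstract itself gives no formula, and the function name `anova_repeatability` is hypothetical.

```python
import numpy as np

def anova_repeatability(Y):
    """ANOVA (intraclass correlation) estimate of repeatability.

    Y is an n x k array: n individuals (rows), k repeated measurements each.
    """
    Y = np.asarray(Y, dtype=float)
    n, k = Y.shape
    grand_mean = Y.mean()
    row_means = Y.mean(axis=1)
    # Mean square between individuals and mean square within individuals.
    msb = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    msw = np.sum((Y - row_means[:, None]) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

The estimate equals 1 when repeated measurements on an individual are identical and can be negative when within-individual variation dominates.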

11.
The paper deals with a problem arising for tests in clinical trials. The outcomes of a standard and a new treatment to be compared are multivariate normally distributed with common but unknown covariance matrix. Under the null hypothesis the means of the outcomes are equal; under the alternative the new treatment is assumed to be superior, i.e. the means are larger without further quantification. For known covariance matrix there is a variety of tests for this problem. Some of these procedures can be extended to the case of unknown covariances if one is willing to accept a bias. There is, however, also an efficient unbiased test. The paper contains some numerical comparisons of these different procedures and takes a look at the minimax properties of the unbiased test.

12.
An efficient new method is presented for the characterization of motional correlations derived from a set of protein structures without requiring the separation of overall and internal motion. In this method, termed isotropically distributed ensemble (IDE) analysis, each structure is represented by an ensemble of isotropically distributed replicas corresponding to the situation found in an isotropic protein solution. This leads to a covariance matrix of the cartesian atomic positions with elements proportional to the ensemble average of scalar products of the position vectors with respect to the center of mass. Diagonalization of the covariance matrix yields eigenmodes and amplitudes that describe concerted motions of atoms, including overall rotational and intramolecular dynamics. It is demonstrated that this covariance matrix naturally distinguishes between "rigid" and "mobile" parts without necessitating a priori selection of a reference structure and an atom set for the orientational alignment process. The method was applied to the analysis of a 5-ns molecular dynamics trajectory of native ubiquitin and a 40-ns trajectory of a partially folded state of ubiquitin. The results were compared with essential dynamics analysis. By taking advantage of the spherical symmetry of the IDE covariance matrix, more than a 10-fold speed up is achieved for the computation of eigenmodes and mode amplitudes. IDE analysis is particularly suitable for studying the correlated dynamics of flexible and large molecules.

13.
MOTIVATION: Discriminant analysis for high-dimensional and low-sample-sized data has become a hot research topic in bioinformatics, mainly motivated by its importance and challenge in applications to tumor classifications for high-dimensional microarray data. Two of the popular methods are the nearest shrunken centroids, also called predictive analysis of microarray (PAM), and shrunken centroids regularized discriminant analysis (SCRDA). Both methods are modifications to the classic linear discriminant analysis (LDA) in two aspects tailored to high-dimensional and low-sample-sized data: one is the regularization of the covariance matrix, and the other is variable selection through shrinkage. In spite of their usefulness, there are potential limitations with each method. The main concern is that both PAM and SCRDA are possibly too extreme: the covariance matrix in the former is restricted to be diagonal while in the latter there is barely any restriction. Based on the biology of gene functions and given the feature of the data, it may be beneficial to estimate the covariance matrix as an intermediate between the two; furthermore, more effective shrinkage schemes may be possible. RESULTS: We propose modified LDA methods to integrate biological knowledge of gene functions (or variable groups) into classification of microarray data. Instead of simply treating all the genes independently or imposing no restriction on the correlations among the genes, we group the genes according to their biological functions extracted from existing biological knowledge or data, and propose regularized covariance estimators that encourage between-group gene independence and within-group gene correlations while maintaining the flexibility of any general covariance structure. Furthermore, we propose a shrinkage scheme on groups of genes that tends to retain or remove a whole group of the genes altogether, in contrast to the standard shrinkage on individual genes. We show that one of the proposed methods performed better than PAM and SCRDA in a simulation study and several real data examples.

14.
The asymptotic covariance matrix of the maximum likelihood estimator for the log-linear model is given for a general class of conditional Poisson distributions which include the unconditional Poisson, multinomial and product-multinomial, as special cases. The general conditions are given under which the maximum likelihood covariance matrix is equal to the covariance matrix of an equivalent closed-form weighted least squares estimator.

15.
H Gao, T Zhang, Y Wu, Y Wu, L Jiang, J Zhan, J Li, R Yang. Heredity 2014, 113(6): 526-532
Given the drawbacks of implementing multivariate analysis for mapping multiple traits in genome-wide association study (GWAS), principal component analysis (PCA) has been widely used to generate independent 'super traits' from the original multivariate phenotypic traits for the univariate analysis. However, parameter estimates in this framework may not be the same as those from the joint analysis of all traits, leading to spurious linkage results. In this paper, we propose to perform the PCA for residual covariance matrix instead of the phenotypical covariance matrix, based on which multiple traits are transformed to a group of pseudo principal components. The PCA for residual covariance matrix allows analyzing each pseudo principal component separately. In addition, all parameter estimates are equivalent to those obtained from the joint multivariate analysis under a linear transformation. However, a fast least absolute shrinkage and selection operator (LASSO) for estimating the sparse oversaturated genetic model greatly reduces the computational costs of this procedure. Extensive simulations show statistical and computational efficiencies of the proposed method. We illustrate this method in a GWAS for 20 slaughtering traits and meat quality traits in beef cattle.

16.
Plant breeders and variety testing agencies routinely test candidate genotypes (crop varieties, lines, test hybrids) in multiple environments. Such multi‐environment trials can be efficiently analysed by mixed models. A single‐stage analysis models the entire observed data at the level of individual plots. This kind of analysis is usually considered as the gold standard. In practice, however, it is more convenient to use a two‐stage approach, in which experiments are first analysed per environment, yielding adjusted means per genotype, which are then summarised across environments in the second stage. Stage‐wise approaches suggested so far are approximate in that they cannot fully reproduce a single‐stage analysis, except in very simple cases, because the variance–covariance matrix of adjusted means from individual environments needs to be approximated by a diagonal matrix. This paper proposes a fully efficient stage‐wise method, which carries forward the full variance–covariance matrix of adjusted means from the individual environments to the analysis across the series of trials. Provided the variance components are known, this method can fully reproduce the results of a single‐stage analysis. Computations are made efficient by a diagonalisation of the residual variance–covariance matrix, which necessitates a corresponding linear transformation of both the first‐stage estimates (e.g. adjusted means and regression slopes for plot covariates) and the corresponding design matrices for fixed and random effects. We also exemplify the extension of the general approach to a three‐stage analysis. The method is illustrated using two datasets, one real and the other simulated. The proposed approach has close connections with meta‐analysis, where environments correspond to centres and genotypes to medical treatments. We therefore compare our theoretical results with recently published results from a meta‐analysis.

17.
Analysis of motor performance variability in tasks with redundancy affords insight about synergies underlying central nervous system (CNS) control. Preferential distribution of variability in ways that minimally affect task performance suggests sophisticated neural control. Unfortunately, in the analysis of variability the choice of coordinates used to represent multi-dimensional data may profoundly affect analysis, introducing an arbitrariness which compromises its conclusions. This paper assesses the influence of coordinates. Methods based on analyzing a covariance matrix are fundamentally dependent on an investigator's choices. Two reasons are identified: using anisotropy of a covariance matrix as evidence of preferential distribution of variability; and using orthogonality to quantify relevance of variability to task performance. Both are exquisitely sensitive to coordinates. Unless coordinates are known a priori, these methods do not support unambiguous inferences about CNS control. An alternative method uses a two-level approach where variability in task execution (expressed in one coordinate frame) is mapped by a function to its result (expressed in another coordinate frame). An analysis of variability in execution using this function to quantify performance at the level of results offers substantially less sensitivity to coordinates than analysis of a covariance matrix of execution variables. This is an initial step towards developing coordinate-invariant analysis methods for movement neuroscience.

18.
Linear models are typically used to analyze multivariate longitudinal data. With these models, estimating the covariance matrix is not easy because the covariance matrix should account for complex correlated structures: the correlation between responses at each time point, the correlation within separate responses over time, and the cross-correlation between different responses at different times. In addition, the estimated covariance matrix should satisfy the positive definiteness condition, and it may be heteroscedastic. However, in practice, the structure of the covariance matrix is assumed to be homoscedastic and highly parsimonious, such as exchangeable or autoregressive with order one. These assumptions are too strong and result in inefficient estimates of the effects of covariates. Several studies have been conducted to solve these restrictions using modified Cholesky decomposition (MCD) and linear covariance models. However, modeling the correlation between responses at each time point is not easy because there is no natural ordering of the responses. In this paper, we use MCD and hypersphere decomposition to model the complex correlation structures for multivariate longitudinal data. We observe that the estimated covariance matrix using the decompositions is positive-definite and can be heteroscedastic and that it is also interpretable. The proposed methods are illustrated using data from a nonalcoholic fatty liver disease study.
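The modified Cholesky decomposition mentioned in this abstract can be sketched in a few lines. The sketch assumes the common form T Σ Tᵀ = D, with T unit lower-triangular and D diagonal; the abstract does not spell out its parameterization, the hypersphere decomposition is not shown, and the function name `modified_cholesky` is hypothetical.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T @ sigma @ T.T == D.

    T is unit lower-triangular; its below-diagonal entries are the negatives
    of the coefficients from regressing each response on its predecessors.
    D is diagonal and holds the innovation (prediction error) variances.
    """
    L = np.linalg.cholesky(sigma)   # sigma = L @ L.T, L lower-triangular
    d_sqrt = np.diag(L)
    C = L / d_sqrt                  # scale each column so the diagonal is 1
    T = np.linalg.inv(C)
    D = np.diag(d_sqrt ** 2)
    return T, D

# AR(1) covariance with unit variances and correlation 0.5 (illustrative values).
lags = np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
sigma_ar1 = 0.5 ** lags
T_ar1, D_ar1 = modified_cholesky(sigma_ar1)
```

Any unit lower-triangular T combined with a positive diagonal D maps back to a positive-definite covariance matrix, so the entries of T and the logarithms of D can be modeled without constraints. That is the property behind the positive-definiteness observation in the abstract.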

19.
Covariance matrix estimation is a fundamental statistical task in many applications, but the sample covariance matrix is suboptimal when the sample size is comparable to or less than the number of features. Such high-dimensional settings are common in modern genomics, where covariance matrix estimation is frequently employed as a method for inferring gene networks. To achieve estimation accuracy in these settings, existing methods typically either assume that the population covariance matrix has some particular structure, for example, sparsity, or apply shrinkage to better estimate the population eigenvalues. In this paper, we study a new approach to estimating high-dimensional covariance matrices. We first frame covariance matrix estimation as a compound decision problem. This motivates defining a class of decision rules and using a nonparametric empirical Bayes g-modeling approach to estimate the optimal rule in the class. Simulation results and gene network inference in an RNA-seq experiment in mouse show that our approach is comparable to or can outperform a number of state-of-the-art proposals.
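To make the shrinkage idea mentioned in this abstract concrete, here is a minimal sketch of linear shrinkage of the sample covariance matrix toward a scaled identity. This is one of the existing approaches the abstract alludes to, not the empirical Bayes method the paper proposes. The function name `shrink_covariance` is hypothetical, and the weight `alpha` is chosen by hand here rather than estimated from the data.

```python
import numpy as np

def shrink_covariance(X, alpha):
    """Linear shrinkage of the sample covariance toward a scaled identity.

    X is an n x p data matrix (rows = samples); alpha in [0, 1] is the
    shrinkage weight (0 = sample covariance, 1 = scaled identity target).
    """
    X = np.asarray(X, dtype=float)
    S = np.cov(X, rowvar=False)
    p = S.shape[0]
    target = np.eye(p) * np.trace(S) / p   # same total variance as S
    return (1.0 - alpha) * S + alpha * target

# Illustrative high-dimensional setting: fewer samples than features.
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(10, 30))
S_demo = np.cov(X_demo, rowvar=False)
S_shrunk = shrink_covariance(X_demo, alpha=0.2)
```

With 10 samples and 30 features the sample covariance is singular, whereas the shrunken estimate is positive-definite and keeps the same trace.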

20.
In this paper the analysis of covariance in the split block design with many concomitant variables is presented. The problems concerning the estimation of parametric functions and testing hypotheses are discussed. In the presentation of the model three kinds of regression coefficients for individual sources of variation are taken into consideration. It is shown that for every estimable function of fixed effects, the best linear unbiased estimator under the assumed model is the same as the best linear unbiased estimator under the model with covariance matrix equal to the identity matrix multiplied by a positive constant. The variance of this estimator can be calculated by the method presented here. Test functions for standard hypotheses concerning fixed effects are obtained.
