Similar literature
1.
Principal component analysis (PCA) is a one-group method. Its purpose is to transform correlated variables into uncorrelated ones and to find linear combinations accounting for a relatively large amount of the total variability, thus reducing the number of original variables to a few components only.
In the simultaneous analysis of different groups, similarities between the principal component structures can often be modelled by the methods of common principal components (CPCs) or partial CPCs. These methods assume that either all components or only some of them are common to all groups, the discrepancies being due mainly to sampling error.
Previous authors have dealt with the k-group situation either by pooling the data of all groups, or by pooling the within-group variance-covariance matrices before performing a PCA. The latter technique is known as multiple group principal component analysis or MGPCA (Thorpe, 1983a). We argue that CPC- or partial CPC-analysis is often more appropriate than these previous methods.
A morphometrical example using males and females of Microtus californicus and M. ochrogaster is presented, comparing PCA, CPC and partial CPC analyses. It is shown that the new methods yield estimated components having smaller standard errors than when groupwise analyses are performed. Formulas are given for estimating standard errors of the eigenvalues and eigenvectors, as well as for computing the likelihood ratio statistic used to test the appropriateness of the CPC- or partial CPC-model.

2.
Multiple group principal component analysis and population differentiation
This paper explores the requirements and advantages of multiple group principal component analysis (MGPCA) when it is used to investigate population differentiation. A distinction is drawn between equality of orientation of the within-group axes and equality of variance along these axes. Several examples of the use of MGPCA are discussed and it is shown that MGPCA per se does not require equality of variance along the axes although it may be a requirement of some of the techniques subsequently used to analyse the component scores. MGPCA is simple and direct, being based on the mathematically well defined eigenvector analysis of a symmetric positive definite (pooled within-group covariance) matrix and it can be thought of as a step in the computation of canonical variate analysis (CVA). It can be used with CVA (which is the most popular method of biometrically assessing population affinities) to assess the contribution of within-group components to among-group discrimination. It is also one of a range of appropriate techniques that can be used to define (and delete if required) within-group growth effects and is particularly suitable when CVA is being used to assess the population affinities. When used in this way it has the advantage of being more influenced by the groups with the greatest growth range.
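
The core computation described in this abstract, an eigenvector analysis of the pooled within-group covariance matrix, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `pooled_within_cov` and `mgpca`, the simulated groups, and the choice to centre scores on the grand mean are all assumptions made for the example, and NumPy is assumed to be available.

```python
import numpy as np

def pooled_within_cov(groups):
    """Pool the group covariance matrices, weighting each by its degrees of freedom."""
    dof = sum(len(g) - 1 for g in groups)
    scatter = sum((len(g) - 1) * np.cov(g, rowvar=False) for g in groups)
    return scatter / dof

def mgpca(groups):
    """Eigen-decompose the pooled within-group covariance matrix.

    Returns eigenvalues in descending order and the matching eigenvectors
    (one within-group axis per column).
    """
    w = pooled_within_cov(groups)
    eigvals, eigvecs = np.linalg.eigh(w)  # eigh: w is symmetric
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Hypothetical data: two groups measured on three variables,
# differing only in their means.
rng = np.random.default_rng(0)
group_a = rng.normal(size=(30, 3))
group_b = rng.normal(size=(40, 3)) + 5.0

eigvals, axes = mgpca([group_a, group_b])

# Component scores for one group, centred on the grand mean (an assumed convention).
grand_mean = np.vstack([group_a, group_b]).mean(axis=0)
scores_a = (group_a - grand_mean) @ axes
```

Because each group is centred on its own mean before pooling, shifting a whole group leaves the axes unchanged, which is why the among-group difference of 5.0 above does not dominate the first component.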

3.
When the explanatory variables of a linear model are split into two groups, two notions of collinearity are defined: a collinearity between the variables of each group, of which the mean is called residual collinearity, and a collinearity between the two groups called explained collinearity. Canonical correlation analysis provides information about the collinearity: large canonical correlation coefficients correspond to some small eigenvalues and eigenvectors of the correlation matrix and characterise the explained collinearity. Other small eigenvalues of this matrix correspond to the residual collinearity. A selection of predictors can be performed from the canonical correlation variables, according to their partial correlation coefficient with the explained variable. In the proposed application, the results obtained by the selection of canonical variables are better than those given by classical regression and by principal component regression.

4.
Summary: Do morphogenetic processes cause common patterns of phenotypic covariation, and do those patterns evolve over microevolutionary timescales? Evolution of molar shape variance–covariance (P) matrixes was studied in five populations of the common shrew, Sorex araneus. P matrix evolution was assessed using matrix correlation, matrix disparity, and common principal component analysis (CPCA). Significant changes in covariance structure were found among the populations, but the differences were small. A computer model was used to estimate the theoretical covariance introduced into the phenotype by developmental interactions. Molar developmental processes explained some of the covariance in the shrew samples, especially as measured by matrix correlation, but the proportion was relatively small. Developmental principal components (PCs) were only infrequently associable with common principal components. The results suggest that molar shape P matrixes can evolve quickly in a manner only loosely constrained by development, and that their shared covariance is probably dominated by factors more proximate than development. Rarefaction showed that sample size severely affected P comparisons when n < 15 for matrix correlation and disparity, and when n < 30 for CPCA. Among CPCA evaluation criteria, Akaike Information Criterion performed better than jump‐up at n < 30, but worse at n > 30.

5.
Principal component analysis is a widely used "dimension reduction" technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear, mixed model. This is applicable to any analysis fitting multiple, correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlation, a subset of the principal components generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood, is described. It is shown that reduced rank estimation can reduce computational requirements of multivariate analyses substantially. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given.
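
The parameter counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below evaluates k(k + 1)/2 and m(2k - m + 1)/2 for k = 8, matching the eight traits mentioned in the abstract; the function names and the values of m shown are illustrative assumptions, not figures taken from the paper.

```python
def full_rank_params(k):
    """Number of distinct elements in an unstructured k x k covariance matrix."""
    return k * (k + 1) // 2

def reduced_rank_params(k, m):
    """Number of parameters when only m principal components are fitted for k effects."""
    if not 0 <= m <= k:
        raise ValueError("m must lie between 0 and k")
    return m * (2 * k - m + 1) // 2

k = 8  # eight traits, as in the abstract's application
for m in range(1, k + 1):
    print(f"m={m}: {reduced_rank_params(k, m)} parameters "
          f"(full rank: {full_rank_params(k)})")
```

With k = 8, fitting m = 3 components needs 21 parameters instead of 36, and the two counts coincide when m = k.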

6.
We present a simple and effective method for combining distance matrices from multiple genes on identical taxon sets to obtain a single representative distance matrix from which to derive a combined-gene phylogenetic tree. The method applies singular value decomposition (SVD) to extract the greatest common signal present in the distances obtained from each gene. The first right eigenvector of the SVD, which corresponds to a weighted average of the distance matrices of all genes, can thus be used to derive a representative tree from multiple genes. We apply our method to three well known data sets and estimate the uncertainty using bootstrap methods. Our results show that this method works well for these three data sets and that the uncertainty in these estimates is small. A simulation study is conducted to compare the performance of our method with several other distance based approaches (namely SDM, SDM* and ACS97), and we find the performances of all these approaches are comparable in the consensus setting. The computational complexity of our method is similar to that of SDM. Besides constructing a representative tree from multiple genes, we also demonstrate how the subsequent eigenvalues and eigenvectors may be used to identify if there are conflicting signals in the data and which genes might be influential or outliers for the estimated combined-gene tree.
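
One way to picture the SVD step is sketched below. The abstract does not give the data layout or any rescaling, so several choices here are assumptions: each gene's distance matrix is flattened to its upper triangle, the flattened vectors are stacked as rows, and the first right singular vector is rescaled so that the implied gene weights sum to one. The function name `combine_distance_matrices` and the toy data are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def combine_distance_matrices(matrices):
    """Combine per-gene distance matrices via the first right singular vector.

    Returns the combined symmetric matrix and the per-gene weights.
    """
    n = matrices[0].shape[0]
    iu = np.triu_indices(n, k=1)                   # upper-triangle pair indices
    stacked = np.array([m[iu] for m in matrices])  # shape: genes x taxon pairs

    u, s, vt = np.linalg.svd(stacked, full_matrices=False)
    # vt[0] equals (u[:, 0] / s[0]) @ stacked, i.e. a weighted sum of the genes.
    weights = u[:, 0] / s[0]
    common = vt[0]

    # Singular vectors are defined up to sign; choose the positive orientation.
    if common.sum() < 0:
        common, weights = -common, -weights

    # Rescale so the weights sum to one (assumed convention: weighted average).
    scale = weights.sum()
    common, weights = common / scale, weights / scale

    combined = np.zeros((n, n))
    combined[iu] = common
    combined += combined.T
    return combined, weights

# Hypothetical example: three genes evolving at different rates on four taxa.
rng = np.random.default_rng(1)
base = np.array([[0, 2, 4, 6],
                 [2, 0, 4, 6],
                 [4, 4, 0, 6],
                 [6, 6, 6, 0]], dtype=float)
genes = []
for rate in (0.5, 1.0, 2.0):
    noise = rng.uniform(-0.05, 0.05, size=base.shape)
    noise = (noise + noise.T) / 2                  # keep the matrix symmetric
    np.fill_diagonal(noise, 0.0)
    genes.append(rate * base + noise)

combined, weights = combine_distance_matrices(genes)
```

The combined matrix could then be passed to any distance-based tree-building method; the later singular vectors, which the abstract uses to look for conflicting signal, are available as `vt[1:]` inside the function.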

7.
Ecological and evolutionary studies are often concerned with the properties of covariance matrices. The method of random skewers (RS method) has been used to compare a matrix to an a priori vector or to compare two matrices. The method involves multiplying a matrix by many random vectors drawn from a uniform distribution over all possible vector directions. The comparisons are usually made using the average angle (or cosine) of the response vectors to an a priori vector or to the response vectors from another matrix. Angles are usually constrained to the interval 0°–90° because the distribution of response vectors is bipolar bimodal. The size of the average angle or cosine depends strongly on the relative sizes of the eigenvalues (especially the first). The distribution of angles between pairs of response vectors from two covariance matrices is more complicated because it depends on the differences in orientation of the eigenvectors and the relative sizes of the eigenvalues of both matrices. The average absolute value of the angles between these pairs of response vectors depends on the relative sizes of the eigenvalues of the matrices, making it difficult to interpret its meaning without knowledge of the eigenvalues and eigenvectors of the two matrices. Thus, it is simpler to just directly compare matrices in terms of these quantities.
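
The two-matrix version of the random skewers comparison can be sketched in plain Python. This is an illustrative reading of the procedure described in the abstract, not code from the paper: random directions are drawn by normalising Gaussian vectors (a standard way to sample uniformly over directions), and the mean cosine between paired response vectors is reported. The function names, the example matrices and the number of skewers are assumptions, and the matrices are assumed positive definite so that no response vector is zero.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere by normalising a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0:
            return [x / norm for x in v]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def random_skewers(m1, m2, n_skewers=1000, seed=0):
    """Mean cosine between the responses of two matrices to the same random skewers."""
    rng = random.Random(seed)
    dim = len(m1)
    total = 0.0
    for _ in range(n_skewers):
        v = random_unit_vector(dim, rng)
        total += cosine(mat_vec(m1, v), mat_vec(m2, v))
    return total / n_skewers

# Hypothetical covariance matrices.
m1 = [[4.0, 1.0], [1.0, 2.0]]
m1_scaled = [[12.0, 3.0], [3.0, 6.0]]   # proportional to m1
m3 = [[2.0, -1.0], [-1.0, 4.0]]         # different orientation

same_shape = random_skewers(m1, m1_scaled)
different = random_skewers(m1, m3)
```

Proportional matrices give a mean cosine of one regardless of their overall scale, while the value for `m1` and `m3` mixes differences in eigenvector orientation with differences in relative eigenvalue size, which is the interpretive difficulty the abstract points out.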

8.
Zhang Z, Wriggers W. Proteins 2006, 64(2): 391-403.
Multivariate statistical methods are widely used to extract functional collective motions from macromolecular molecular dynamics (MD) simulations. In principal component analysis (PCA), a covariance matrix of positional fluctuations is diagonalized to obtain orthogonal eigenvectors and corresponding eigenvalues. The first few eigenvectors usually correspond to collective modes that approximate the functional motions in the protein. However, PCA representations are globally coherent by definition and, for a large biomolecular system, do not converge on the time scales accessible to MD. Also, the forced orthogonalization of modes leads to complex dependencies that are not necessarily consistent with the symmetry of biological macromolecules and assemblies. Here, we describe for the first time the application of local feature analysis (LFA) to construct a topographic representation of functional dynamics in terms of local features. The LFA representations are low dimensional, and like PCA provide a reduced basis set for collective motions, but they are sparsely distributed and spatially localized. This yields a more reliable assignment of essential dynamics modes across different MD time windows. Also, the intrinsic dynamics of local domains is more extensively sampled than that of globally coherent PCA modes.

9.
In clinical trials, the comparison of two different populations is a common problem. Nonlinear (parametric) regression models are commonly used to describe the relationship between covariates, such as concentration or dose, and a response variable in the two groups. In some situations, it is reasonable to assume some model parameters to be the same, for instance, the placebo effect or the maximum treatment effect. In this paper, we develop a (parametric) bootstrap test to establish the similarity of two regression curves sharing some common parameters. We show by theoretical arguments and by means of a simulation study that the new test controls its significance level and achieves a reasonable power. Moreover, it is demonstrated that under the assumption of common parameters, a considerably more powerful test can be constructed compared with the test that does not use this assumption. Finally, we illustrate the potential applications of the new methodology by a clinical trial example.

10.
In this paper, a new method for QRS complex analysis and estimation based on principal component analysis (PCA) and polynomial fitting techniques is presented. Multi-channel ECG signals were recorded and QRS complexes were obtained from every channel and aligned perfectly in matrices. For every channel, the covariance matrix was calculated from the QRS complex data matrix of many heartbeats. Then the corresponding eigenvectors and eigenvalues were calculated and reconstruction parameter vectors were computed by expansion of every beat in terms of the principal eigenvectors. These parameter vectors show short-term fluctuations that have to be discriminated from abrupt changes or long-term trends that might indicate diseases. For this purpose, first-order poly-fit methods were applied to the elements of the reconstruction parameter vectors. In healthy volunteers, subsequent QRS complexes were estimated by calculating the corresponding reconstruction parameter vectors derived from these functions. The similarity, absolute error and RMS error between the original and predicted QRS complexes were measured. Based on this work, thresholds can be defined for changes in the parameter vectors that indicate diseases.

11.
Among the statistical methods available to control for phylogenetic autocorrelation in ecological data, those based on eigenfunction analysis of the phylogenetic distance matrix among the species are becoming increasingly important tools. Here, we evaluate a range of criteria to select eigenvectors extracted from a phylogenetic distance matrix (using phylogenetic eigenvector regression, PVR) that can be used to measure the level of phylogenetic signal in ecological data and to study correlated evolution. We used a principal coordinate analysis to represent the phylogenetic relationships among 209 species of Carnivora by a series of eigenvectors, which were then used to model log‐transformed body size. We first conducted a series of PVRs in which we increased the number of eigenvectors from 1 to 70, following the sequence of their associated eigenvalues. Second, we also investigated three non‐sequential approaches based on the selection of 1) eigenvectors significantly correlated with body size, 2) eigenvectors selected by a standard stepwise algorithm, and 3) the combination of eigenvectors that minimizes the residual phylogenetic autocorrelation. We mapped the mean specific component of body size to evaluate how these selection criteria affect the interpretation of non‐phylogenetic signal in Bergmann's rule. For comparison, the same patterns were analyzed using an autoregressive model (ARM) and phylogenetic generalized least‐squares (PGLS). Despite the robustness of PVR to the specific approaches used to select eigenvectors, using a relatively small number of eigenvectors may be insufficient to control phylogenetic autocorrelation, leading to flawed conclusions about patterns and processes. The method that minimizes residual autocorrelation seems to be the best choice according to different criteria.
Thus, our analyses show that, when the best criterion is used to control phylogenetic structure, PVR can be a valuable tool for testing hypotheses related to heritability at the species level, phylogenetic niche conservatism and correlated evolution between ecological traits.

12.
Because bone tissue adapts to loading conditions, finite element simulations of remodelling bone require a precise prediction of dynamically changing anisotropic elastic parameters. We present a phenomenological theory that refers to the tissue in terms of the tendency of the structure to align with principal stress directions. We describe the material parameters of remodelling bone. This work follows findings by the same research group and independently by Danilov (1971) in the field of plasticity, where the dependencies of the components of the stiffness tensor in terms of time are based on Hill's anisotropy. We modify such an approach in this novel theory that addresses bone tissue that can regenerate. The computational assumption of the theory is that bone trabeculae have the tendency to orient along one of the principal stress directions but during remodelling the principal stresses change continuously and the resulting orientation of the trabeculae can differ from the principal stress direction at any given time. The novelty of this work consists in the limited number of parameters needed to compute the twenty-one anisotropic material parameters at any given location in the bone tissue. In addition to the theory, we present here two cases of simplified geometry, loading and boundary conditions to show the effect of (1) time on the material properties; and (2) change of loading conditions on the anisotropic parameters. The long term goal is to experimentally verify that the predictions generated by theory provide a reliable simulation of cancellous bone properties.

13.
Meyer K, Kirkpatrick M. Genetics 2008, 180(2): 1153-1166.
Eigenvalues and eigenvectors of covariance matrices are important statistics for multivariate problems in many applications, including quantitative genetics. Estimates of these quantities are subject to different types of bias. This article reviews and extends the existing theory on these biases, considering a balanced one-way classification and restricted maximum-likelihood estimation. Biases are due to the spread of sample roots and arise from ignoring selected principal components when imposing constraints on the parameter space, to ensure positive semidefinite estimates or to estimate covariance matrices of chosen, reduced rank. In addition, it is shown that reduced-rank estimators that consider only the leading eigenvalues and -vectors of the "between-group" covariance matrix may be biased due to selecting the wrong subset of principal components. In a genetic context, with groups representing families, this bias is inversely proportional to the degree of genetic relationship among family members, but is independent of sample size. Theoretical results are supplemented by a simulation study, demonstrating close agreement between predicted and observed bias for large samples. It is emphasized that the rank of the genetic covariance matrix should be chosen sufficiently large to accommodate all important genetic principal components, even though, paradoxically, this may require including a number of components with negligible eigenvalues. A strategy for rank selection in practical analyses is outlined.

14.
Shannon entropy is used to provide an estimate of the number of interpretable components in a principal component analysis. In addition, several ad hoc stopping rules for dimension determination are reviewed and a modification of the broken stick model is presented. The modification incorporates a test for the presence of an "effective degeneracy" among the subspaces spanned by the eigenvectors of the correlation matrix of the data set and then allocates the total variance among subspaces. A summary of the performance of the methods applied to both published microarray data sets and to simulated data is given.
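
For orientation, the classical broken stick rule that the abstract's modification builds on can be sketched as below. This shows only the standard, unmodified rule: the expected share of variance for the k-th of p components is (1/p) times the sum of 1/i for i from k to p, and components are retained while their observed share exceeds that expectation. The degeneracy test and the entropy-based estimate from the paper are not reproduced, and the function names and eigenvalues are hypothetical.

```python
def broken_stick(p):
    """Expected proportions of variance for p components under the broken stick model."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]

def n_components_broken_stick(eigenvalues):
    """Count leading components whose variance share exceeds the broken stick expectation."""
    total = sum(eigenvalues)
    observed = [ev / total for ev in sorted(eigenvalues, reverse=True)]
    expected = broken_stick(len(observed))
    count = 0
    for obs, exp in zip(observed, expected):
        if obs <= exp:
            break  # stop at the first component that fails the rule
        count += 1
    return count

# Hypothetical eigenvalues of a 5 x 5 correlation matrix.
eigenvalues = [3.0, 1.2, 0.5, 0.2, 0.1]
retained = n_components_broken_stick(eigenvalues)
```

Stopping at the first failing component is one common convention; other variants keep every component that individually exceeds its expectation.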

15.
The dynamics of directionally tuned linear multi-input single-output systems varies generally as a function of the spatial orientation of the inputs. A linear system receiving directionally specific inputs is represented by a linear combination of the respective input transfer functions. The input-output behaviour of such systems can be described by a vector transfer function which specifies the polarization directions of the system in real space. These directions, which can be either one (unidirectional vector transfer function) or two (bidirectional vector transfer function) but never three, are obtained by computing the eigenvectors and eigenvalues of the system matrix that is defined by the gain and phase values of the system's response to harmonic stimulation directed along three orthogonal directions in space. The spatial tuning behaviour is determined by the quadratic form associated with the system matrix. Neuronal systems with bidirectional vector transfer functions process input information in a plane-specific way and exhibit novel characteristics, very much different from those of systems with unidirectional vector transfer functions.

16.
Larvae of all three southern hemisphere anadromous parasitic lampreys were collected from rivers in Australia, New Zealand and South America. Body intervals were measured, trunk myomeres counted and the frequency of pigmentation in different body regions recorded. Morphometric data were subjected to multiple group principal components analysis (MGPCA) which took into account changes during growth. The components (together with myomere counts) and the pigmentation data were both subjected to discriminant analysis. Ordination and rank correlation tests revealed no evidence for either latitudinal clines or a continuum of circumpolar change amongst larval lamprey populations. Clustering of population centroids clearly distinguished between Mordacia lapicida (Gray) from Chile and M. mordax (Richardson) from south-eastern Australia. Populations of Geotria australis Gray divided into groups representing three geographical regions, namely Argentina, Chile and Australasia (Western Australia, Tasmania and New Zealand). Ammocoetes from Argentina were the most divergent, possessing a more posterior cloaca, taller dorsal fins, a greater gap between dorsal fins, and distinctive pigmentation on the head and caudal fin. Within the Australasian group, Western Australian and New Zealand populations clustered closer than either did with those from Tasmania. The cluster analyses for larval populations of G. australis suggest that, during their marine trophic phase, the adults of this species originating from Argentinian and Chilean rivers follow different migratory routes, whereas those from Western Australia, New Zealand and, to a lesser extent, Tasmania intermix.

17.
18.
We use principal component analysis (PCA) to detect functionally interesting collective motions in molecular-dynamics simulations of membrane-bound gramicidin A. We examine the statistical and structural properties of all PCA eigenvectors and eigenvalues for the backbone and side-chain atoms. All eigenvalue spectra show two distinct power-law scaling regimes, quantitatively separating large from small covariance motions. Time trajectories of the largest PCs converge to Gaussian distributions at long timescales, but groups of small-covariance PCs, which are usually ignored as noise, have subdiffusive distributions. These non-Gaussian distributions imply anharmonic motions on the free-energy surface. We characterize the anharmonic components of motion by analyzing the mean-square displacement for all PCs. The subdiffusive components reveal picosecond-scale oscillations in the mean-square displacement at frequencies consistent with infrared measurements. In this regime, the slowest backbone mode exhibits tilting of the peptide planes, which allows carbonyl oxygen atoms to provide surrogate solvation for water and cation transport in the channel lumen. Higher-frequency modes are also apparent, and we describe their vibrational spectra. Our findings expand the utility of PCA for quantifying the essential features of motion on the anharmonic free-energy surface made accessible by atomistic molecular-dynamics simulations.

19.
20.