Optimal Representation of Supplementary Variables in Biplots from Principal Component Analysis and Correspondence Analysis |
| |
Authors: | Jan Graffelman, Tomàs Aluja‐Banet |
| |
Institution: | Jan Graffelman, Tomàs Aluja‐Banet |
| |
Abstract: | This paper treats the topic of representing supplementary variables in biplots obtained by principal component analysis (PCA) and correspondence analysis (CA). We follow a geometrical approach where we minimize errors that are obtained when the scores of the PCA or CA solution are projected onto a vector that represents a supplementary variable. This paper shows that optimal directions for supplementary variables can be found by solving a regression problem, and justifies that earlier formulae from Gabriel are optimal in the least squares sense. We derive new results regarding the geometrical properties, goodness of fit statistics and the interpretation of supplementary variables. It is shown that supplementary variables can be represented by plotting their correlation coefficients with the axes of the biplot only when the proper type of scaling is used. We discuss supplementary variables in an ecological context and give illustrations with data from an environmental monitoring survey. |
| |
Keywords: | Additional variable; Conditional biplot; Environmental variable; Indirect gradient analysis; Passive sample; Passive variable; Supplementary point |
|
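
The abstract's central claim, that the direction of a supplementary variable can be found by regression on the biplot scores, can be illustrated with a minimal sketch. This is not the authors' code: the function name `supplementary_direction`, the simulated data, and the restriction to a two-dimensional PCA biplot are all assumptions made for illustration. The sketch regresses a standardized supplementary variable on the first two principal component scores and checks that the coefficients equal the correlations with the axes only when the scores are standardized.

```python
import numpy as np

def supplementary_direction(scores, y):
    """Hypothetical helper: least-squares direction of a supplementary
    variable y in the space spanned by the (centered) biplot scores."""
    y_std = (y - y.mean()) / y.std(ddof=1)       # standardize the variable
    coef, *_ = np.linalg.lstsq(scores, y_std, rcond=None)
    fitted = scores @ coef
    r2 = float(fitted @ fitted / (y_std @ y_std))  # goodness of fit
    return coef, r2

# Simulated data (an assumption; the paper uses environmental survey data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X[:, 0] - 0.5 * X[:, 2] + rng.normal(scale=0.3, size=50)

# PCA via the singular value decomposition of the centered data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
n = X.shape[0]
F_std = np.sqrt(n - 1) * U[:, :2]   # standardized scores (unit variance)
F_raw = U[:, :2] * s[:2]            # scores in principal coordinates

b_std, r2_std = supplementary_direction(F_std, y)
b_raw, r2_raw = supplementary_direction(F_raw, y)

# Correlations of the supplementary variable with the two biplot axes.
corr = np.array([np.corrcoef(F_std[:, k], y)[0, 1] for k in range(2)])

print("coefficients (standardized scores):", b_std)
print("coefficients (principal coordinates):", b_raw)
print("correlations with axes:", corr)
print("goodness of fit (R^2):", r2_std)
```

With standardized scores the regression coefficients coincide with the correlations, whereas with scores in principal coordinates they are rescaled by the singular values; the goodness of fit is the same in both cases because both sets of scores span the same plane.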