Practical aspects of correspondence factor analysis and related multidimensional methods

Bester, Dirk

Files

BesterD.pdf (4.14 MB)

Date

1977-06

Authors

Bester, Dirk

Publisher

University of the Free State

Abstract

After the first trial run, the Correspondence Analysis program prints all the factors as column and row loadings. Select the two factors with the highest characteristic values (ignoring the characteristic value 1). If these factors together contribute 60% or more of the inertia, the two-dimensional graph is obtained by plotting the subjects and objects on the same axes. If this is not the case, the method of Andrews and Multidimensional Scaling should be followed. The association between the subjects and objects is obtained by grouping the graphs plotted by the Andrews program with the aid of the difference tables. The representation of the objects and subjects on the same axes is obtained by the Multidimensional Scaling program.

Sometimes, when struggling with a particularly difficult interpretation, we ask ourselves whether our efforts are worthwhile. Granted that Correspondence analysis reveals structural relationships between elements, can we trust the individual axes that generate the planes in which we observe the projections of the clouds? Other statisticians do not use Correspondence analysis but other ways of constructing and analyzing clouds (e.g. classical factor analysis). There it is common practice to rotate the axes, i.e. in the subspace spanned by the first principal axes of inertia of the cloud, other axes (orthogonal or not) are chosen whose interpretation appears easier. For these analyses, the more a factor coincides with one of the variables of the table, or with a group of strongly correlated variables, the more interpretable it is. We are completely opposed to this practice for many reasons. Firstly, if the data are fairly homogeneous and the sampling not too sparse, the interpretation is often rather easy, thanks to the principle of distributional equivalence. Secondly, numerous applications have shown that a modification of the original data usually does not alter the nature of the computed factors, causing perhaps only a permutation of their order. This kind of stability leads us to believe that one has to respect the individuality of the factors as computed in Correspondence analysis.

Practitioners often ask: among the set of characteristics (subjects) used to describe the objects, which are the least useful for interpreting the first computed factors? Another question may be: by measuring only what determines the important factors of a first study (whose experimental basis was perhaps chosen to confirm some simple concepts), are we not running the risk of restricting ourselves? These are the sort of questions representative of an attitude which we are trying to avoid. However, having established the conclusions of a study performed on a rather exhaustive table k(I,J), it is conceivable that one might wish to locate a set, say S, of supplementary elements on the factorial axes by taking into account not the whole table but only a reduced table. One can compute the factors without repeating the analysis, with the usual formula: [see full text for formula]. Here we are starting to tackle difficulties that can really only be evaluated by someone who has already practised Correspondence analysis.
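The selection rule at the start of the abstract (ignore the characteristic value 1, then check whether the two leading factors carry 60% or more of the inertia) can be illustrated with a minimal sketch. This is not the thesis's program: it assumes the textbook singular value decomposition formulation of correspondence analysis, and the function names `correspondence_factors` and `leading_plane_share`, as well as the table `counts`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def correspondence_factors(table):
    """Correspondence analysis of a contingency table with strictly positive margins.

    Returns all characteristic values (the first is the trivial value 1)
    and the row and column coordinates on the non-trivial factors.
    """
    k = np.asarray(table, dtype=float)
    P = k / k.sum()                      # relative frequencies
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Uncentred standardised matrix: its largest singular value is the trivial 1.
    S = P / np.sqrt(np.outer(r, c))
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    char_values = sing ** 2
    # Principal coordinates on the non-trivial factors (trivial factor dropped).
    row_coords = (U[:, 1:] / np.sqrt(r)[:, None]) * sing[1:]
    col_coords = (Vt.T[:, 1:] / np.sqrt(c)[:, None]) * sing[1:]
    return char_values, row_coords, col_coords


def leading_plane_share(char_values):
    """Percentage of inertia carried by the two leading non-trivial factors."""
    nontrivial = char_values[1:]         # ignore the characteristic value 1
    return 100.0 * nontrivial[:2].sum() / nontrivial.sum()


# Hypothetical 4 x 4 table of counts (objects in rows, subjects in columns).
counts = [[20, 5, 3, 2],
          [4, 18, 6, 2],
          [3, 7, 15, 5],
          [2, 3, 6, 19]]

char_values, rows, cols = correspondence_factors(counts)
share = leading_plane_share(char_values)
# The 60% threshold is the one quoted in the abstract.
use_plane = share >= 60.0
print(f"first two factors: {share:.1f}% of inertia; plot the plane: {use_plane}")
```

When `use_plane` is true, the first two columns of `rows` and `cols` are the coordinates that would be plotted on the same axes. Otherwise the abstract's alternative route (Andrews plots and Multidimensional Scaling) applies, which this sketch does not cover.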

Keywords

Factor analysis, Correspondence factor analysis, Multidimensional scaling, High-dimensional data, Dissertation (M.Sc. (Mathematical Statistics))--University of the Free State, 1977

URI

http://hdl.handle.net/11660/9486

Collections

All Electronic Theses and Dissertations
