Which kind of data (normally distributed, standardized, transformed) for a realistic factor analysis?
Because no generally accepted solution yet exists to the problem of how best to pretreat data for factor analysis, an empirical procedure was used to assess the quality of the results of repeated factor analyses carried out with different pretreatments, conditions and statistical distributions of the variables. Sets of multivariate data for river waters were first constructed after fixing the number and nature of the latent factors, which correspond to the sources of pollution. A series of R-mode factor analyses was then performed on these data, using various pretreatments (autoscaling, logarithmic transformation and their combinations), various factor rotations (rigid and oblique) and various methods for computing the significant factors (40 data-processing runs in all). Factor analyses were also performed on real data from the Po river in the Piedmont region. Comparisons between the results obtained by factor analysis and the actual situation of the sys