Marcocapelle (talk | contribs) removed grandparent category of Category:Multivariate statistics |
→Introductory example: Minor corrections on format and phrasing |
Line 6:
== Introductory example ==
Why introduce several active groups of variables?
''Data''
Let us consider the case of quantitative variables, that is to say, the framework of principal component analysis (PCA). An example of data from ecological research provides a useful illustration.
There are, for 72 stations, two types of measurements:
# The abundance-dominance coefficient of 50 plant species (coefficient ranging from 0 = the plant is absent, to 9 = the species covers more than three-quarters of the surface). The whole set of the 50 coefficients defines the floristic profile of a station.
# Eleven pedological measurements ([[Pedology]] = soil science): particle size, physical properties, chemistry, etc. The set of these eleven measurements defines the pedological profile of a station.
A first analysis focuses on the variability of the floristic profiles. Two stations are close to one another if they have similar floristic profiles. In a second step, the main dimensions of this variability (i.e. the principal components) are related to the pedological variables, introduced as supplementary.
A second analysis focuses on the variability of the soil profiles. Two stations are close if they have the same soil profile. The main dimensions of this variability (i.e. the principal components) are then related to the abundance of plants.
Finally, one may want to study the variability of the stations from the point of view of both flora and soil. In this approach, two stations should be close if they have both similar flora '''and''' similar soils.
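The three approaches can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the analysis of the actual ecological data: the two tables are filled with synthetic random values of the right shape (72 stations, 50 plant species, 11 soil measurements), and the helper names <code>standardize</code>, <code>pca_scores</code> and <code>supplementary_correlations</code> are hypothetical. Supplementary variables are assumed to be related to the principal components through their correlation coefficients, and the third analysis simply concatenates the two tables without any weighting of the groups.

```python
import numpy as np

# Synthetic placeholder data (NOT the real ecological data set).
rng = np.random.default_rng(0)
n_stations = 72
flora = rng.integers(0, 10, size=(n_stations, 50)).astype(float)  # coefficients 0..9
soil = rng.normal(size=(n_stations, 11))                          # 11 soil measurements


def standardize(x):
    """Center each column and scale it to unit (population) variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def pca_scores(x, n_components=2):
    """Station coordinates on the first principal components of a normed PCA."""
    z = standardize(x)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def supplementary_correlations(scores, supp):
    """Correlation of each supplementary variable with each principal component."""
    return standardize(supp).T @ standardize(scores) / len(supp)


# 1. PCA of the flora; soil variables introduced as supplementary.
flora_scores = pca_scores(flora)
soil_on_flora = supplementary_correlations(flora_scores, soil)

# 2. PCA of the soil; plant abundances introduced as supplementary.
soil_scores = pca_scores(soil)
flora_on_soil = supplementary_correlations(soil_scores, flora)

# 3. Both groups active: plain PCA of the concatenated table (no group weighting).
joint_scores = pca_scores(np.hstack([flora, soil]))
```

In the first two analyses, the supplementary group has no influence on the distances between stations; it is only projected afterwards. In the third, both groups contribute to the distances, but nothing guarantees that their contributions are comparable.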
== Balance between groups of variables ==