'''Multiple Factor Analysis''' (MFA) is a [[Factorial experiment|factorial]] method<ref name="GreenacreBlasius2006">{{cite book|last1=Greenacre|first1=Michael|last2=Blasius|first2=Jorg|title=Multiple Correspondence Analysis and Related Methods|url=http://books.google.com/books?id=ZvYV1lfU5zIC&pg=PA352|accessdate=11 June 2014|date=2006-06-23|publisher=CRC Press|isbn=9781420011319|pages=352–}}</ref> devoted to the study of tables in which a group of individuals is described by a set of variables (quantitative and/or qualitative) structured in groups. It may be seen as an extension of:
* [[Principal component analysis]] (PCA) when variables are quantitative,
* [[Multiple correspondence analysis]] (MCA) when variables are qualitative,
* [[Factor analysis of mixed data]] (FAMD) when the active variables belong to the two types.
== Introductory example ==
Why introduce several active groups of variables?
There are, for 72 stations, two types of measurements:
# The abundance-dominance coefficient of 50 plant species (coefficient ranging from 0 = the plant is absent, to 9 = the species covers more than three-quarters of the surface). The whole set of the 50 coefficients defines the floristic profile of a station.
# Eleven pedological measurements ([[Pedology]] = soil science): particle size, physics, chemistry, etc. The set of these eleven measurements defines the pedological profile of a station.
Three analyses are possible:
# The first analysis focuses on the variability of the floristic profiles. Two stations are close to one another if they have similar floristic profiles. In a second step, the main dimensions of this variability (i.e. the principal components) are related to the pedological variables introduced as supplementary.
# The second analysis focuses on the variability of the soil profiles. Two stations are close if they have similar soil profiles. The main dimensions of this variability (i.e. the principal components) are then related to the abundance of plants.
# One may want to study the variability of stations from the point of view of both flora and soil. In this approach, two stations should be close if they have both similar flora '''and''' similar soils.
== Balance between groups of variables ==
{| class="wikitable centre" width="60%"
|+ Table 1. MFA. Test data. A
|-
! !! <math>A</math> !! <math>B</math> !! <math>C_1</math>!! <math>C_2</math>
Table 2 summarizes the inertia of the first two axes of the PCA and of the MFA applied to Table 1.
Group 2 variables contribute to 88.95
The first axis of the MFA (on Table 1 data) shows the balance between the two groups of variables: the contribution of each group to the inertia of this axis is strictly equal to 50%.
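The balancing effect can be checked numerically. The sketch below is an illustration only: it does not use the values of Table 1 but hypothetical data with the same structure (two uncorrelated variables in group 1, two identical variables in group 2), and it assumes the usual MFA weighting, in which each variable of a group is weighted by the inverse of the first eigenvalue of the separate PCA of that group. The function names, the sample size and the correlations <code>a</code> and <code>b</code> are assumptions introduced for the example, so the PCA share it prints will differ from the figures of Table 2.

```python
import numpy as np


def make_test_data(n=6, a=0.6, b=0.3, seed=0):
    """Hypothetical data: A and B uncorrelated (group 1), C1 == C2 (group 2)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 3))
    z -= z.mean(axis=0)                      # center the columns
    q, _ = np.linalg.qr(z)                   # orthonormal columns, still centered
    A, B, E = (q * np.sqrt(n)).T             # unit variance, mutually uncorrelated
    C = a * A + b * B + np.sqrt(1 - a**2 - b**2) * E   # corr(C, A) = a, corr(C, B) = b
    return np.column_stack([A, B, C, C])


def first_eigenvalue(Xg):
    """Largest eigenvalue of the separate PCA of one group (standardized variables)."""
    n = Xg.shape[0]
    return np.linalg.eigvalsh(Xg.T @ Xg / n)[-1]


def group_contributions(X, weights, groups, axis=0):
    """Share of each group in the inertia of one axis of a weighted PCA."""
    Xw = X * np.sqrt(weights)                # apply the column weights
    _, _, vt = np.linalg.svd(Xw, full_matrices=False)
    contrib = vt[axis] ** 2                  # squared loadings of a unit vector sum to 1
    return np.array([contrib[g].sum() for g in groups])


X = make_test_data()
groups = [[0, 1], [2, 3]]                    # column indices of group 1 and group 2

pca_weights = np.ones(X.shape[1])            # ordinary PCA: every variable has weight 1
mfa_weights = np.empty(X.shape[1])
for g in groups:
    mfa_weights[g] = 1.0 / first_eigenvalue(X[:, g])

pca_share = group_contributions(X, pca_weights, groups)
mfa_share = group_contributions(X, mfa_weights, groups)
print("PCA axis 1, share per group:", pca_share.round(4))
print("MFA axis 1, share per group:", mfa_share.round(4))
```

With this structure, group 2 dominates the first axis of the unweighted PCA, whereas the weighted analysis gives both groups the same share of the first axis.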
The core of MFA is a weighted factorial analysis: MFA first provides the classical results of factorial analyses.
1. ''Representations of individuals'' in which two individuals are
2. ''Representations of quantitative variables'' as in PCA (correlation circle).
=== Graphics specific to this kind of multiple table ===
5. ''Superimposed representations of individuals'' « seen » by each group. An individual considered from the point of view of a single group is called a ''partial individual'' (in parallel, an individual considered from the point of view of all variables is called the ''mean individual'' because it lies at the center of gravity of its partial points). The partial cloud <math>N_i^j</math> gathers the <math>I</math> individuals from the perspective of the single group <math>j</math>.
[[File:AFM fig3.jpg|center|thumb|Figure 3. MFA. Test data. Superimposed representation of mean and partial clouds.]]
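The relation between mean and partial individuals can be sketched in a few lines. The code below is a minimal illustration on hypothetical data (not the test data of the article), assuming the usual convention in which each partial cloud is projected on the MFA axes and dilated by the number of groups, so that every mean individual lies at the center of gravity of its partial points. All variable names and numerical settings are assumptions made for the example.

```python
import numpy as np

# Hypothetical data: A, B uncorrelated (group 1); C1 == C2 (group 2).
n, a, b = 6, 0.6, 0.3
rng = np.random.default_rng(0)
z = rng.standard_normal((n, 3))
z -= z.mean(axis=0)
q, _ = np.linalg.qr(z)
A, B, E = (q * np.sqrt(n)).T                 # centered, unit variance, uncorrelated
C = a * A + b * B + np.sqrt(1 - a**2 - b**2) * E
X = np.column_stack([A, B, C, C])
groups = [[0, 1], [2, 3]]
J = len(groups)                              # number of groups

# MFA weights: inverse of the first eigenvalue of each separate PCA.
weights = np.empty(X.shape[1])
for g in groups:
    lam1 = np.linalg.eigvalsh(X[:, g].T @ X[:, g] / n)[-1]
    weights[g] = 1.0 / lam1

# Weighted PCA of the whole table: rows of vt are the MFA axes.
Xw = X * np.sqrt(weights)
_, _, vt = np.linalg.svd(Xw, full_matrices=False)
axes = vt[:2].T                              # first factorial plane

mean_points = Xw @ axes                      # mean individuals, shape (n, 2)

# Partial individuals: keep only the columns of one group, project, dilate by J.
partial_points = []
for g in groups:
    Xg = np.zeros_like(Xw)
    Xg[:, g] = Xw[:, g]
    partial_points.append(J * (Xg @ axes))
partial_points = np.array(partial_points)    # shape (J, n, 2)

center_of_gravity = partial_points.mean(axis=0)
print(np.allclose(center_of_gravity, mean_points))
```

The dilation by the number of groups is what makes the superimposed plot readable: without it, the partial points would average to the mean point divided by the number of groups.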
7. ''Representations of factors of separate analyses'' of the different groups. These factors are represented as supplementary quantitative variables (correlation circle).
[[File:AFM fig5.jpg|center|thumb|Figure 5. MFA. Test data. Representation of the principal components of separate PCA of each group.]]
In the example (figure 5), the first axis of the MFA is relatively strongly correlated (r = .80) with the first component of group 2. This group, consisting of two identical variables, possesses only one principal component (which coincides with the variable). Group 1 consists of two orthogonal variables: any direction of the subspace generated by these two variables has the same inertia (equal to 1). The choice of principal components is therefore arbitrary, and there is no reason to be interested in one of them in particular. However, the two components provided by the program are well represented: the plane of the MFA is close to the plane spanned by the two variables of group 1.
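The correlations discussed above can be computed directly. The following sketch uses hypothetical data with the same structure as the example (it is not the article's test data, so it will not reproduce r = .80) and relies on the usual MFA weighting by the first eigenvalue of each separate PCA. The helper <code>components</code> and all numerical settings are assumptions made for the illustration.

```python
import numpy as np

# Hypothetical data: A, B uncorrelated (group 1); C1 == C2 (group 2).
n, a, b = 6, 0.6, 0.3
rng = np.random.default_rng(0)
z = rng.standard_normal((n, 3))
z -= z.mean(axis=0)
q, _ = np.linalg.qr(z)
A, B, E = (q * np.sqrt(n)).T
C = a * A + b * B + np.sqrt(1 - a**2 - b**2) * E
X = np.column_stack([A, B, C, C])
groups = {"group 1": [0, 1], "group 2": [2, 3]}


def components(M, tol=1e-10):
    """Principal components (scores) of a centered matrix, dropping null ones."""
    u, s, _ = np.linalg.svd(M, full_matrices=False)
    keep = s > tol * s[0]
    return u[:, keep] * s[keep]


# MFA as a PCA of the table in which each group is divided by the square
# root of the first eigenvalue of its separate PCA.
Xw = X.copy()
for cols in groups.values():
    lam1 = np.linalg.eigvalsh(X[:, cols].T @ X[:, cols] / n)[-1]
    Xw[:, cols] /= np.sqrt(lam1)

mfa_axis1 = components(Xw)[:, 0]

# Squared correlation of the first MFA axis with each separate component.
squared_corr = {}
for name, cols in groups.items():
    comps = components(X[:, cols])
    squared_corr[name] = [np.corrcoef(mfa_axis1, c)[0, 1] ** 2 for c in comps.T]
    print(name, np.round(squared_corr[name], 3))
```

Group 2 yields a single component, as expected for two identical variables. For group 1, the individual correlations depend on the arbitrary basis returned by the decomposition; only their sum of squares, which measures how well the first MFA axis is represented in the plane of group 1, is meaningful.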
The small size and simplicity of the example allow simple validation of the rules of interpretation. But the method will be more valuable when the data set is large and complex.
Other methods suitable for this type of data are available. [[Procrustes analysis]] is compared with MFA in Pagès (2014).<ref>Pagès, Jérôme (2014). ''Multiple Factor Analysis by Example Using R''. Chapman & Hall/CRC The R Series, London. 272 p.</ref>
== History ==
MFA was developed by Brigitte Escofier and Jérôme Pagès in the 1980s. It is at the heart of two books written by these authors.
== Software ==
MFA is available in two R packages ([http://factominer.free.fr FactoMineR] and [http://pbil.univ-lyon1.fr/ADE-4 ADE4]) and in many other software packages, including SPAD, Uniwin and [[XLSTAT]]. There is also a function for [http://www.ensai.fr/userfiles/AFMULT%20and%20PLOTAFM%20aout%202010.pdf SAS]{{dead link|date=February 2018 |bot=InternetArchiveBot |fix-attempted=yes }}. The graphs in this article come from the R package FactoMineR.
== References ==
{{Reflist}}
== External links ==
* [http://factominer.free.fr/ FactoMineR] An R package devoted to exploratory data analysis.