Multiple factor analysis: Difference between revisions

m clean up spacing around commas and other punctuation fixes, replaced: ; → ;
 
(21 intermediate revisions by 17 users not shown)
Line 1:
'''Multiple factor analysis (MFA)''' is a [[Factorial experiment|factorial]] method<ref name="GreenacreBlasius2006">{{cite book|last1=Greenacre|first1=Michael|last2=Blasius|first2=Jorg|author-link2=Jörg Blasius|title=Multiple Correspondence Analysis and Related Methods|url=https://books.google.com/books?id=ZvYV1lfU5zIC&pg=PA352|accessdate=11 June 2014|date=2006-06-23|publisher=CRC Press|isbn=9781420011319|pages=352–}}</ref> devoted to the study of tables in which a group of individuals is described by a set of variables (quantitative and/or qualitative) structured in groups. It is a [[Multivariate statistics|multivariate method]] from the field of [[Ordination (statistics)|ordination]] used to simplify [[Dimensionality reduction|multidimensional data]] structures. MFA treats all involved tables in the same way (symmetrical analysis). It may be seen as an extension of:
* [[Principal component analysis]] (PCA) when variables are quantitative,
* [[Multiple correspondence analysis]] (MCA) when variables are qualitative,
Line 6:
== Introductory example ==
 
Why introduce several active groups of variables in the same factorial analysis?
 
''Data''
 
Consider the case of quantitative variables, that is to say, within the framework of PCA. An example of data from ecological research provides a useful illustration.
For 72 stations, there are two types of measurements:
# The abundance-dominance coefficient of 50 plant species (coefficient ranging from 0 = the plant is absent, to 9 = the species covers more than three-quarters of the surface). The whole set of the 50 coefficients defines the floristic profile of a station.
# Eleven pedological measurements ([[Pedology]] = soil science): particle size, physical and chemical properties, etc. The set of these eleven measurements defines the pedological profile of a station.
 
''Three analyses'' are possible:
# PCA of flora (pedology as supplementary): this analysis focuses on the variability of the floristic profiles. Two stations are close to one another if they have similar floristic profiles. In a second step, the main dimensions of this variability (i.e. the principal components) are related to the pedological variables introduced as supplementary.
# PCA of pedology (flora as supplementary): this analysis focuses on the variability of soil profiles. Two stations are close if they have the same soil profile. The main dimensions of this variability (i.e. the principal components) are then related to the abundance of plants.
# PCA of the two groups of variables as active: one may want to study the variability of stations from the point of view of both flora and soil. In this approach, two stations should be close if they have both similar flora '''and''' similar soils.
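The three analyses above can be sketched in a few lines. This is only an illustration on synthetic data: the arrays <code>flora</code> and <code>soil</code> are hypothetical stand-ins for the real measurements, and the helper names <code>standardize</code> and <code>pca</code> are introduced here for the example. It shows how supplementary variables are related to the components of the active group, and why treating both groups as active in a plain standardized PCA lets the larger group dominate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stations, n_flora, n_soil = 72, 50, 11

# Synthetic stand-ins for the real measurements (hypothetical values).
flora = rng.integers(0, 10, size=(n_stations, n_flora)).astype(float)
soil = rng.normal(size=(n_stations, n_soil))


def standardize(X):
    """Center each column and scale it to unit (population) variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def pca(Z):
    """Eigenvalues and principal components (station scores) of a standardized table."""
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return s ** 2 / Z.shape[0], U * s


Zf, Zs = standardize(flora), standardize(soil)

# Analysis 1: flora active, pedology supplementary.
# The soil variables do not build the axes; they are only correlated
# with the first two floristic components afterwards.
eig_f, comp_f = pca(Zf)
soil_on_flora = np.corrcoef(Zs.T, comp_f[:, :2].T)[:n_soil, n_soil:]

# Analysis 2 is symmetric: soil active, flora supplementary.
eig_s, comp_s = pca(Zs)
flora_on_soil = np.corrcoef(Zf.T, comp_s[:, :2].T)[:n_flora, n_flora:]

# Analysis 3: both groups active in one unweighted standardized PCA.
eig_all, comp_all = pca(np.hstack([Zf, Zs]))

# Each standardized variable carries inertia 1, so the floristic group
# accounts for 50 of the 61 units of total inertia.
flora_share = eig_f.sum() / eig_all.sum()
print(f"share of total inertia due to flora: {flora_share:.2f}")
```

In this unweighted version the floristic group weighs more simply because it has more variables, which is the imbalance that the weighting of MFA is designed to correct.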
 
== Balance between groups of variables ==
Line 56 ⟶ 51:
 
{| class="wikitable centre" width="60%"
|+ Table 1. MFA. Test data. A and B (group 1) are uncorrelated. C1 and C2 (group 2) are identical.
|-
! !! <math>A</math> !! <math>B</math> !! <math>C_1</math>!! <math>C_2</math>
Line 148 ⟶ 143:
The core of MFA is a weighted factorial analysis: MFA first provides the classical results of factorial analyses.
 
1. ''Representations of individuals'' in which two individuals are close to each other if they exhibit similar values for many variables in the different variable groups; in practice the user particularly studies the first factorial plane.
 
2. ''Representations of quantitative variables'' as in PCA (correlation circle).
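The weighting behind these representations can be sketched as follows. The example assumes the usual MFA weighting, in which every variable of a group is divided by the square root of the first eigenvalue of that group's separate PCA. The data are hypothetical (they are not the values of Table 1) but follow the same structure: A and B are uncorrelated, and the two variables of group 2 are identical. The helper names are introduced here for illustration only.

```python
import numpy as np


def standardize(X):
    """Center each column and scale it to unit (population) variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)


def eigenvalues(Z):
    """PCA eigenvalues of a table whose columns are already centered."""
    s = np.linalg.svd(Z, compute_uv=False)
    return s ** 2 / Z.shape[0]


# Hypothetical data: A and B are exactly uncorrelated, C1 and C2 are identical.
A = np.array([1, 1, -1, -1, 1, 1, -1, -1])
B = np.array([1, -1, 1, -1, 1, -1, 1, -1])
C = np.array([3, 1, 0, -2, 2, 1, -1, -4])

group1 = standardize(np.column_stack([A, B]))
group2 = standardize(np.column_stack([C, C]))

# Separate PCA of each group: only the first eigenvalue is needed.
lam1 = eigenvalues(group1)[0]  # 1: two uncorrelated standardized variables
lam2 = eigenvalues(group2)[0]  # 2: two identical standardized variables

# Assumed MFA weighting: divide each group by the square root of its
# first eigenvalue, so that the maximum axial inertia of every group is 1.
weighted = np.hstack([group1 / np.sqrt(lam1), group2 / np.sqrt(lam2)])

# Global analysis: a PCA of the weighted, concatenated table.
eig_global = eigenvalues(weighted)
print("first eigenvalue of each group:", lam1, lam2)
print("eigenvalues of the weighted global analysis:", eig_global)
```

With this weighting, duplicating a variable inside a group no longer increases that group's influence on the first axis, and the first eigenvalue of the global analysis lies between 1 and the number of groups (here 2).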
Line 193 ⟶ 188:
 
The small size and simplicity of the example allow simple validation of the rules of interpretation, but the method is more valuable when the data set is large and complex.
Other methods suitable for this type of data are available. [[Procrustes analysis]] is compared with MFA in Pagès (2014).<ref>Pagès Jérôme (2014). Multiple Factor Analysis by Example Using R. Chapman & Hall/CRC The R Series, London. 272 p.</ref>
 
== History ==
 
MFA was developed by Brigitte Escofier and Jérôme Pagès in the 1980s. It is at the heart of two books written by these authors.<ref>''Ibidem''</ref><ref>Escofier Brigitte & Pagès Jérôme (2008). Analyses factorielles simples et multiples ; objectifs, méthodes et interprétation. Dunod, Paris. 318 p. {{ISBN|978-2-10-051932-3}}</ref> MFA and its extensions (hierarchical MFA, MFA on contingency tables, etc.) are a research topic of the applied mathematics laboratory of Agrocampus ([http://math.agrocampus-ouest.fr LMA²]), which published a book presenting basic methods of exploratory multivariate analysis.<ref>Husson F., Lê S. & Pagès J. (2009). Exploratory Multivariate Analysis by Example Using R. Chapman & Hall/CRC The R Series, London. {{ISBN|978-2-7535-0938-2}}</ref>
 
== Software ==
 
MFA is available in two R packages ([http://factominer.free.fr FactoMineR] and [http://pbil.univ-lyon1.fr/ADE-4 ADE4]) and in several other software packages, including SPAD, Uniwin and [[XLSTAT]]. There is also a function for [http://www.ensai.fr/userfiles/AFMULT%20and%20PLOTAFM%20aout%202010.pdf SAS]{{dead link|date=February 2018 |bot=InternetArchiveBot |fix-attempted=yes }}. The graphs in this article come from the R package FactoMineR.
 
== References ==
 
<!--- See [[Wikipedia:Footnotes]] on how to create references using<ref></ref> tags, these references will then appear here automatically -->
{{Reflist}}
 
Line 211 ⟶ 204:
* [http://factominer.free.fr/ FactoMineR] An R package devoted to exploratory data analysis.
 
[[Category:Factor analysis]]
 
[[Category:Multivariate statistics]]
[[Category:Data analysis]]
[[Category:Dimension reduction]]
[[Category:Articles created via the Article Wizard]]