m →Distance covariance: "squared" was incorrect here |
Drbabinski (talk | contribs) - updated intro to be more approachable for non-experts. larger focus on the ability to detect linear + nonlinear interactions, how dcorr can be used as a statistical test, and more scope for dcorr (i.e., kernel based methods, and its use in CCA and ICA) |
Line 1:
In [[statistics]] and in [[probability theory]], '''distance correlation''' or '''distance covariance''' is a measure of [[statistical dependence]] between two random vectors.
Distance correlation can be used to perform a [[Statistical hypothesis testing|statistical test]] of dependence with a [[permutation test]]. One first computes the distance correlation (which involves re-centering Euclidean distance matrices) between two random vectors, and then compares this value to the distance correlations obtained from many shuffles of the data. The proportion of shuffles whose distance correlation is at least as large as the observed value estimates the p-value.
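The permutation procedure described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes the standard biased sample estimator built from double-centered Euclidean distance matrices, and the function names <code>distance_correlation</code> and <code>permutation_test</code>, as well as the parameters <code>n_permutations</code> and <code>seed</code>, are hypothetical names chosen for this example.

```python
import numpy as np


def _centered_distances(x):
    """Pairwise Euclidean distance matrix of a sample, double-centered."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D sample as n observations of a scalar
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    # Subtract row means and column means, then add back the grand mean.
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation (assumed biased estimator) of paired samples."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()          # squared sample distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0.0:
        return 0.0                  # a constant sample carries no dependence
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def permutation_test(x, y, n_permutations=500, seed=0):
    """Return (observed distance correlation, permutation p-value)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = distance_correlation(x, y)
    exceed = 0
    for _ in range(n_permutations):
        # Shuffling y breaks the pairing with x, simulating independence.
        perm = rng.permutation(len(y))
        if distance_correlation(x, y[perm]) >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (exceed + 1) / (n_permutations + 1)
```

Calling <code>permutation_test(x, y)</code> on two paired samples of equal length returns the observed statistic and an estimated p-value; a small p-value suggests that the samples are dependent. The sketch recomputes the distance matrices for every shuffle for clarity, which is quadratic in the sample size per permutation.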
[[Image:Distance Correlation Examples.svg|thumb|400px|right|Several sets of (''x'', ''y'') points, with the distance correlation coefficient of ''x'' and ''y'' for each set. Compare to the graph on [[correlation]]]]
Line 10:
The classical measure of dependence, the [[Pearson product-moment correlation coefficient|Pearson correlation coefficient]],<ref>Pearson (1895)</ref> is mainly sensitive to a linear relationship between two variables. Distance correlation was introduced in 2005 by [[Gabor J Szekely]] in several lectures to address this deficiency of Pearson’s [[correlation]], namely that it can easily be zero for dependent variables. A correlation of zero (uncorrelatedness) does not imply independence, whereas a distance correlation of zero does imply independence. The first results on distance correlation were published in 2007 and 2009.<ref name=SR2007>{{citation|author1=G. J. Szekely |author2=M. L. Rizzo |author3=N. K. Bakirov | year=2007| title= Measuring and Testing Independence by Correlation of Distances| journal= Annals of Statistics| volume=35| issue=6| pages=2769–2794| url=http://dx.doi.org/10.1214/009053607000000505}}.</ref><ref name=SR2009>Székely & Rizzo (2009)</ref> It was proved that distance covariance is the same as the Brownian covariance.<ref name=SR2009/> These measures are examples of [[energy distance]]s.
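A small worked example, added here for illustration, shows how Pearson correlation can vanish for dependent variables. Assume ''X'' is uniformly distributed on {−1, 0, 1} and ''Y'' = ''X''<sup>2</sup>; this particular choice of distribution is an assumption made for the example and does not come from the cited sources.

```latex
\operatorname{cov}(X, Y)
  = \operatorname{E}[X^3] - \operatorname{E}[X]\,\operatorname{E}[X^2]
  = 0 - 0 \cdot \tfrac{2}{3}
  = 0,
\qquad
P(Y = 0 \mid X = 0) = 1 \neq \tfrac{1}{3} = P(Y = 0).
```

The covariance, and hence the Pearson correlation, is zero even though ''Y'' is completely determined by ''X''. Because the two variables are dependent, their distance correlation is strictly positive.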
The distance correlation is derived from a number of other quantities that are used in its specification, specifically: '''distance variance''', '''distance standard deviation''' and '''distance covariance'''. These quantities take the same roles as the ordinary [[Moment (mathematics)|moment]]s with corresponding names in the specification of the [[Pearson product-moment correlation coefficient]].
==Definitions==