Revision as of 13:55, 12 October 2017 by Mathstat (m →Distance covariance: "squared" was incorrect here)
Revision as of 22:01, 10 November 2017 by Drbabinski (updated intro to be more approachable for non-experts. larger focus on the ability to detect linear + nonlinear interactions, how dcorr can be used as a statistical test, and more scope for dcorr (i.e., kernel based methods, and its use in CCA and ICA))
In [[statistics]] and in [[probability theory]], '''distance correlation''' or '''distance covariance''' is a measure of [[statistical dependence]] between two paired [[random vector]]s of arbitrary, not necessarily equal, [[Euclidean vector\|dimension]]. In the limit of an infinite number of samples, the distance correlation is zero if and only if the random vectors are [[statistically independent]]. Thus, distance correlation can detect both linear and nonlinear interactions between two random vectors. This is in contrast to [[Pearson's correlation]], which is mainly sensitive to linear relationships between two [[random variable]]s and can be zero for dependent variables.

Distance correlation can be used to perform a [[Statistical hypothesis testing\|statistical test]] of dependence with a [[permutation test]]. One first computes the distance correlation (involving the re-centering of Euclidean distance matrices) between two random vectors, and then compares this value to the distance correlations obtained from many random shuffles of the data, each of which breaks the pairing between the two samples.

Distance correlation can be put into an indirect relationship to the ordinary [[Moment (mathematics)\|moment]]s by an [[#Alternative formulation: Brownian covariance\|alternative formulation]] using ideas related to [[Brownian motion]]. This has led to the use of names such as '''Brownian covariance''' and '''Brownian distance covariance'''.
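The permutation test described above can be illustrated with a minimal sketch. It assumes NumPy and uses the sample statistic built from double-centered Euclidean distance matrices; the function names (<code>centered_distance_matrix</code>, <code>dcor_from_centered</code>, <code>permutation_test</code>), the number of permutations, and the toy data ''y'' = ''x''<sup>2</sup> are illustrative choices, not part of any particular library.

```python
import numpy as np


def centered_distance_matrix(samples):
    """Pairwise Euclidean distances, re-centered by row, column and grand means."""
    pts = np.asarray(samples, dtype=float)
    if pts.ndim == 1:
        pts = pts[:, None]  # treat a 1-D input as n samples of dimension 1
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    row_mean = dist.mean(axis=1, keepdims=True)
    col_mean = dist.mean(axis=0, keepdims=True)
    return dist - row_mean - col_mean + dist.mean()


def dcor_from_centered(a, b):
    """Sample distance correlation from two centered distance matrices."""
    dcov2 = (a * b).mean()  # squared sample distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0:
        return 0.0  # at least one sample is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def distance_correlation(x, y):
    return dcor_from_centered(centered_distance_matrix(x),
                              centered_distance_matrix(y))


def permutation_test(x, y, n_permutations=199, seed=0):
    """Return (observed distance correlation, permutation p-value)."""
    rng = np.random.default_rng(seed)
    a = centered_distance_matrix(x)
    b = centered_distance_matrix(y)
    observed = dcor_from_centered(a, b)
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(b.shape[0])
        # Shuffling the y samples permutes rows and columns of b together.
        if dcor_from_centered(a, b[np.ix_(perm, perm)]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_permutations + 1)


# Toy data (an assumption for illustration): y depends on x, but not linearly.
rng = np.random.default_rng(42)
x = rng.uniform(-1.0, 1.0, size=200)
y = x ** 2

pearson = np.corrcoef(x, y)[0, 1]
dcor, p_value = permutation_test(x, y)
print(f"Pearson r = {pearson:.3f}, distance correlation = {dcor:.3f}, p = {p_value:.3f}")
```

In this sketch the distance matrices are centered once and only the rows and columns of one matrix are permuted, which is equivalent to shuffling one sample and recomputing the statistic. The "+1" in the p-value counts the observed arrangement as one of the permutations, a common convention that avoids reporting a p-value of exactly zero.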
Other correlational metrics, including kernel-based correlational metrics (such as the Hilbert-Schmidt Independence Criterion or HSIC), can also detect linear and nonlinear interactions. Both distance correlation and kernel-based metrics can be used in methods such as [[canonical correlation analysis]] and [[independent component analysis]] to yield stronger [[statistical power]].

[[Image:Distance Correlation Examples.svg\|thumb\|400px\|right\|Several sets of (''x'', ''y'') points, with the distance correlation coefficient of ''x'' and ''y'' for each set. Compare to the graph on [[correlation]]]]

The classical measure of dependence, the [[Pearson product-moment correlation coefficient\|Pearson correlation coefficient]],<ref>Pearson (1895)</ref> is mainly sensitive to a linear relationship between two variables. Distance correlation was introduced in 2005 by [[Gabor J Szekely]] in several lectures to address this deficiency of Pearson’s [[correlation]], namely that it can easily be zero for dependent variables. Correlation = 0 (uncorrelatedness) does not imply independence, while distance correlation = 0 does imply independence. The first results on distance correlation were published in 2007 and 2009.<ref name=SR2007>{{citation\|author1=G. J. Szekely \|author2=M. L. Rizzo \|author3=N. K. Bakirov \| year=2007\| title= Measuring and Testing Independence by Correlation of Distances\| journal= Annals of Statistics\| volume=35\| issue=6\| pages=2769–2794\| url=http://dx.doi.org/10.1214/009053607000000505}}.</ref><ref name=SR2009>Székely & Rizzo (2009)</ref> It was proved that distance covariance is the same as the Brownian covariance.<ref name=SR2009/> These measures are examples of [[energy distance]]s.

The distance correlation is derived from a number of other quantities that are used in its specification, specifically: '''distance variance''', '''distance standard deviation''' and '''distance covariance'''. These quantities take the same roles as the ordinary [[Moment (mathematics)\|moment]]s with corresponding names in the specification of the [[Pearson product-moment correlation coefficient]].

==Definitions==

Distance correlation: Difference between revisions