Kernel embedding of distributions: Difference between revisions

Point out the input variable of the feature map in "Definitions" section.
m Added the condition that P(X=s)>0 for the conditional operator to be well defined. Also I corrected a typo, replacing Q by P
Line 329:
:<math>\mathcal{C}_{XY} = \mathbb{E} [\mathbf{e}_X \otimes \mathbf{e}_Y] = ( P(X=s, Y=t))_{s,t \in \{1,\ldots,K\}} </math>
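As a minimal sketch of the identity above, the following NumPy snippet builds <math>\mathcal{C}_{XY}</math> as the expectation of outer products of one-hot feature vectors and checks that it reproduces the joint probability table. The joint distribution, the choice <math>K=3</math>, and the helper name <code>cross_covariance</code> are assumptions made only for illustration.

```python
import numpy as np

# Hypothetical joint table P(X=s, Y=t) with K = 3 states (rows s, columns t).
joint = np.array([[0.10, 0.05, 0.05],
                  [0.10, 0.20, 0.10],
                  [0.05, 0.05, 0.30]])

def cross_covariance(joint_xy):
    """Return E[e_X (outer) e_Y] for one-hot features e_X, e_Y."""
    K = joint_xy.shape[0]
    basis = np.eye(K)  # row s is the one-hot vector e_s
    C = np.zeros((K, K))
    for s in range(K):
        for t in range(K):
            # Each outcome (s, t) contributes its probability times e_s e_t^T.
            C += joint_xy[s, t] * np.outer(basis[s], basis[t])
    return C

C_XY = cross_covariance(joint)
```

With the Kronecker delta kernel the feature map is the one-hot encoding, so the loop simply places each probability in its own matrix entry.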
 
When <math> P(X=s)>0 </math> for all <math> s \in \{1,\ldots,K\} </math>, the conditional distribution embedding operator,
 
:<math>\mathcal{C}_{Y\mid X} = \mathcal{C}_{YX} \mathcal{C}_{XX}^{-1},</math>
Line 347:
In this discrete-valued setting with the Kronecker delta kernel, the [[#Rules of probability as operations in the RKHS|kernel sum rule]] becomes
 
:<math>\underbrace{\begin{pmatrix} P(X=1) \\ \vdots \\ P(X = N) \\ \end{pmatrix}}_{\mu_X^\pi} = \underbrace{\begin{pmatrix} \\ P(X=s \mid Y=t) \\ \\ \end{pmatrix}}_{\mathcal{C}_{X\mid Y}} \underbrace{\begin{pmatrix} \pi(Y=1) \\ \vdots \\ \pi(Y = N) \\ \end{pmatrix}}_{ \mu_Y^\pi}</math>
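The matrix form of the sum rule can be checked numerically. The sketch below is illustrative only: the joint table, the prior <math>\pi(Y)</math>, and the helper name <code>conditional_operator</code> are assumptions, and the operator is formed as <math>\mathcal{C}_{X\mid Y} = \mathcal{C}_{XY} \mathcal{C}_{YY}^{-1}</math>, which requires every marginal <math>P(Y=t)</math> to be strictly positive.

```python
import numpy as np

# Hypothetical joint table P(X=s, Y=t) (rows s, columns t).
joint = np.array([[0.10, 0.05, 0.05],
                  [0.10, 0.20, 0.10],
                  [0.05, 0.05, 0.30]])

def conditional_operator(joint_xy):
    """Return C_{X|Y} = C_XY C_YY^{-1}; column t holds P(X = . | Y = t)."""
    p_y = joint_xy.sum(axis=0)  # marginal P(Y=t)
    if np.any(p_y <= 0):
        raise ValueError("every P(Y=t) must be positive for the inverse to exist")
    C_YY = np.diag(p_y)  # E[e_Y (outer) e_Y] is diagonal for one-hot features
    return joint_xy @ np.linalg.inv(C_YY)

C_X_given_Y = conditional_operator(joint)

# Hypothetical prior pi(Y); its embedding mu_Y^pi is just this vector.
pi_Y = np.array([0.5, 0.3, 0.2])

# Kernel sum rule: mu_X^pi = C_{X|Y} mu_Y^pi.
mu_X_pi = C_X_given_Y @ pi_Y
```

Each column of <code>C_X_given_Y</code> is a conditional distribution, so <code>mu_X_pi</code> is again a probability vector. Choosing <math>\pi(Y)</math> equal to the marginal of <math>Y</math> under the joint table recovers the marginal of <math>X</math>.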
 
The [[#Rules of probability as operations in the RKHS|kernel chain rule]] in this case is given by