=== Domain adaptation under covariate, target, and conditional shift ===
The goal of [[___domain adaptation]] is the formulation of learning algorithms which generalize well when the training and test data have different distributions. Given training examples <math>\{(x_i^{tr}, y_i^{tr})\}_{i=1}^n</math> and a test set <math>\{(x_j^{te}, y_j^{te}) \}_{j=1}^m </math> where the <math>y_j^{te}</math> are unknown, three types of differences are commonly assumed between the distribution of the training examples <math>P^{tr}(X,Y)</math> and the test distribution <math> P^{te}(X,Y)</math>:<ref name = "DA">K. Zhang, B. Schölkopf, K. Muandet, Z. Wang. (2013). [http://jmlr.org/proceedings/papers/v28/zhang13d.pdf Domain adaptation under target and conditional shift]. ''Journal of Machine Learning Research'', '''28'''(3): 819–827.</ref><ref name = "CovS">A. Gretton, A. Smola, J. Huang, M. Schmittfull, K. Borgwardt, B. Schölkopf. (2008). Covariate shift and local learning by distribution matching. In J. Quiñonero-Candela, M. Sugiyama, A. Schwaighofer, N. Lawrence (eds.), ''Dataset Shift in Machine Learning'', MIT Press, Cambridge, MA: 131–160.</ref>
# '''Covariate shift''', in which the marginal distribution of the covariates changes across domains: <math> P^{tr}(X) \neq P^{te}(X)</math>
# '''Target shift''', in which the marginal distribution of the outputs changes across domains: <math> P^{tr}(Y) \neq P^{te}(Y)</math>
# '''Conditional shift''', in which <math>P(Y)</math> remains the same across domains, but the conditional distributions differ: <math>P^{tr}(X \mid Y) \neq P^{te}(X \mid Y)</math>. In general, the presence of conditional shift leads to an [[Well-posed problem|ill-posed]] problem, and the additional assumption that <math>P(X \mid Y)</math> changes only under [[Location parameter|___location]]-[[Scale parameter|scale]] (LS) transformations on <math> X </math> is commonly imposed to make the problem tractable (a sketch of this assumption follows the list).
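
Concretely, and with illustrative notation rather than the exact formulation of the cited reference, the LS assumption can be sketched as positing that each test-___domain class-conditional is a coordinatewise ___location-scale transform of the corresponding training-___domain one,
: <math> X^{te} \mid (Y = y) \;\overset{d}{=}\; \mathbf{w}(y) \odot X^{tr} + \mathbf{b}(y), \qquad X^{tr} \sim P^{tr}(X \mid Y = y), </math>
where <math>\odot</math> denotes elementwise multiplication and <math>\mathbf{w}(\cdot)</math>, <math>\mathbf{b}(\cdot)</math> are unknown scale and ___location functions of the output; in the cited work, such transformations are estimated by matching kernel embeddings across domains.<ref name = "DA"/>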
 
By utilizing the kernel embeddings of marginal and conditional distributions, practical approaches to all three types of difference between the training and test domains can be formulated. Covariate shift may be accounted for by reweighting training examples with estimates of the ratio <math>P^{te}(X)/P^{tr}(X)</math>, obtained directly from the kernel embeddings of the marginal distributions of <math>X</math> in each ___domain without any need to estimate the distributions explicitly (see the code sketch below).<ref name = "CovS"/> Target shift cannot be handled in the same way, since no samples from <math>Y</math> are available in the test ___domain; it is instead accounted for by weighting training examples using the vector <math>\boldsymbol{\beta}^*(\mathbf{y}^{tr})</math>, which solves the optimization problem given below (where, in practice, empirical approximations must be used)<ref name = "DA"/>
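
As a concrete illustration of the covariate-shift reweighting just described, the following is a minimal sketch of kernel mean matching in the spirit of Gretton et al.:<ref name = "CovS"/> the weights <math>\beta_i</math> are chosen so that the kernel embedding of the reweighted training covariates matches that of the test covariates. The function names, the choice of a Gaussian kernel, and the solver settings are illustrative assumptions, not specifications from the reference.

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(A, B, sigma=1.0):
    """Pairwise Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def kmm_weights(X_tr, X_te, sigma=1.0, B=1000.0, eps=None):
    """Kernel mean matching (illustrative sketch): find weights beta minimizing
    || (1/n) sum_i beta_i phi(x_i^tr) - (1/m) sum_j phi(x_j^te) ||^2 in the RKHS,
    which expands (up to a constant) to 0.5 * beta' K beta - kappa' beta."""
    n, m = len(X_tr), len(X_te)
    if eps is None:
        eps = (np.sqrt(n) - 1.0) / np.sqrt(n)  # heuristic bound on weight deviation
    K = gaussian_kernel(X_tr, X_tr, sigma) + 1e-8 * np.eye(n)  # small ridge for stability
    kappa = (n / m) * gaussian_kernel(X_tr, X_te, sigma).sum(axis=1)

    objective = lambda b: 0.5 * b @ K @ b - kappa @ b
    gradient = lambda b: K @ b - kappa
    constraints = [  # |sum(beta) - n| <= n * eps, written as two inequalities g(b) >= 0
        {"type": "ineq", "fun": lambda b: n * eps - (np.sum(b) - n)},
        {"type": "ineq", "fun": lambda b: n * eps + (np.sum(b) - n)},
    ]
    result = minimize(objective, x0=np.ones(n), jac=gradient,
                      bounds=[(0.0, B)] * n, constraints=constraints,
                      method="SLSQP")
    return result.x  # beta_i approximates P^te(x_i^tr) / P^tr(x_i^tr)
</syntaxhighlight>

The returned weights can then be supplied to any learner that accepts per-example weights (i.e., weighted [[empirical risk minimization]]); the box constraint <code>B</code> and the tolerance <code>eps</code> limit how far the weights may deviate from uniform.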