Sample complexity
 
==No Free Lunch Theorem==
One can ask whether there exists a learning algorithm so that the sample complexity is finite in the strong sense, that is, there is a bound on the number of samples needed so that the algorithm can learn any distribution over the input-output space with a specified target error. More formally, one asks whether there exists a learning algorithm <math>ALG</math> such that, for all ''ε'', ''δ'' > 0, there exists a positive integer ''N'' such that for all ''n'' ≥ ''N'', we have<math display="block">
\sup_\rho\left(\Pr_{\rho^n}[\mathcal E(h_n) - \mathcal E^*_\mathcal{H}\geq\varepsilon]\right)<\delta,
</math>where <math>h_n=ALG(S_n)</math>, with <math>S_n = ((x_1,y_1),\ldots,(x_n,y_n))\sim \rho^n</math> as above. The [[No free lunch in search and optimization|No Free Lunch Theorem]] says that without restrictions on the hypothesis space <math>\mathcal H</math>, this is not the case, i.e., there always exist "bad" distributions for which the sample complexity is arbitrarily large.<ref name=":0">{{citation |last = Vapnik | first = Vladimir | title = Statistical Learning Theory | place = New York | publisher = Wiley | year = 1998}}</ref>
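A small simulation can make the quantity inside the supremum concrete. The sketch below is illustrative only: the choice of distribution, the finite class of threshold classifiers, the 10% label noise, and the values of <math>\varepsilon</math> and <math>n</math> are assumptions, not part of the theorem. It fixes one distribution <math>\rho</math>, uses empirical risk minimization as <math>ALG</math>, and estimates <math>\Pr_{\rho^n}[\mathcal E(h_n) - \mathcal E^*_\mathcal{H}\geq\varepsilon]</math> by repeated sampling. The No Free Lunch Theorem concerns the supremum over ''all'' distributions <math>\rho</math>, which a single simulation cannot probe; here <math>\mathcal H</math> is finite, so the estimated probability does shrink as <math>n</math> grows.

<syntaxhighlight lang="python">
# Illustrative Monte-Carlo sketch (not from the article): estimate
# Pr_{rho^n}[ E(h_n) - E*_H >= eps ] for ONE fixed distribution rho,
# a finite hypothesis class H of threshold classifiers, and ERM as ALG.
# The threshold class, the 10% label noise, eps and n are all assumptions.
import numpy as np

rng = np.random.default_rng(0)

thresholds = np.linspace(0.0, 1.0, 21)          # finite hypothesis class H

def true_risk(t, noise=0.1):
    # Risk of h_t(x) = 1{x >= t} when x ~ Uniform[0,1], the clean label is
    # 1{x >= 0.5}, and labels are flipped independently with prob. `noise`.
    p_disagree = abs(t - 0.5)                    # P(h_t disagrees with clean label)
    return p_disagree * (1 - 2 * noise) + noise

def draw_sample(n, noise=0.1):
    x = rng.uniform(0.0, 1.0, size=n)
    y = (x >= 0.5).astype(int)
    flip = rng.uniform(size=n) < noise
    y[flip] = 1 - y[flip]
    return x, y

def erm(x, y):
    # ALG: return the threshold in H with the smallest empirical error.
    emp = [np.mean((x >= t).astype(int) != y) for t in thresholds]
    return thresholds[int(np.argmin(emp))]

eps, n, trials = 0.05, 200, 2000
best = min(true_risk(t) for t in thresholds)     # E*_H for this rho
bad = 0
for _ in range(trials):
    x, y = draw_sample(n)
    h_n = erm(x, y)
    if true_risk(h_n) - best >= eps:
        bad += 1
print(f"estimated Pr[excess risk >= {eps}] with n = {n}: {bad / trials:.3f}")
</syntaxhighlight>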
Thus in order to make statements about the rate of convergence of the quantity<math display="block">
\sup_\rho\left(\Pr_{\rho^n}[\mathcal E(h_n) - \mathcal E^*_\mathcal{H}\geq\varepsilon]\right),
</math>one must either