Sample complexity
 
==No Free Lunch Theorem==
One can ask whether there exists a learning algorithm so that the sample complexity is finite in the strong sense, that is, there is a bound on the number of samples needed so that the algorithm can learn any distribution over the input-output space with a specified target error. More formally, one asks whether there exists a learning algorithm <math>ALG</math> such that, for all ''ε'', ''δ'' > 0, there exists a positive integer ''N'' such that for all ''n'' ≥ ''N'', we have<math display="block">
\sup_\rho\left(\Pr_{\rho^n}[\mathcal E(h_n) - \mathcal E^*_\mathcal{H}\geq\varepsilon]\right)<\delta,
</math>where <math>h_n=ALG(S_n)</math>, with <math>S_n = ((x_1,y_1),\ldots,(x_n,y_n))\sim \rho^n</math> as above. The [[No free lunch in search and optimization|No Free Lunch Theorem]] says that without restrictions on the hypothesis space <math>\mathcal H</math>, this is not the case, i.e., there always exist "bad" distributions for which the sample complexity is arbitrarily large.<ref name=":0">{{citation |last = Vapnik | first = Vladimir | title = Statistical Learning Theory | place = New York | publisher = Wiley | year = 1998}}</ref>
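A small simulation can make the quantity inside the supremum concrete. The sketch below is illustrative only: the choice of distribution, the finite class of threshold classifiers, the 10% label noise, and the values of <math>\varepsilon</math> and <math>n</math> are assumptions, not part of the theorem. It fixes one distribution <math>\rho</math>, uses empirical risk minimization as <math>ALG</math>, and estimates <math>\Pr_{\rho^n}[\mathcal E(h_n) - \mathcal E^*_\mathcal{H}\geq\varepsilon]</math> by repeated sampling. The No Free Lunch Theorem concerns the supremum over ''all'' distributions <math>\rho</math>, which a single simulation cannot probe; here <math>\mathcal H</math> is finite, so the estimated probability does shrink as <math>n</math> grows.

<syntaxhighlight lang="python">
# Illustrative Monte-Carlo sketch (not from the article): estimate
# Pr_{rho^n}[ E(h_n) - E*_H >= eps ] for ONE fixed distribution rho,
# a finite hypothesis class H of threshold classifiers, and ERM as ALG.
# The threshold class, the 10% label noise, eps and n are all assumptions.
import numpy as np

rng = np.random.default_rng(0)

thresholds = np.linspace(0.0, 1.0, 21)          # finite hypothesis class H

def true_risk(t, noise=0.1):
    # Risk of h_t(x) = 1{x >= t} when x ~ Uniform[0,1], the clean label is
    # 1{x >= 0.5}, and labels are flipped independently with prob. `noise`.
    p_disagree = abs(t - 0.5)                    # P(h_t disagrees with clean label)
    return p_disagree * (1 - 2 * noise) + noise

def draw_sample(n, noise=0.1):
    x = rng.uniform(0.0, 1.0, size=n)
    y = (x >= 0.5).astype(int)
    flip = rng.uniform(size=n) < noise
    y[flip] = 1 - y[flip]
    return x, y

def erm(x, y):
    # ALG: return the threshold in H with the smallest empirical error.
    emp = [np.mean((x >= t).astype(int) != y) for t in thresholds]
    return thresholds[int(np.argmin(emp))]

eps, n, trials = 0.05, 200, 2000
best = min(true_risk(t) for t in thresholds)     # E*_H for this rho
bad = 0
for _ in range(trials):
    x, y = draw_sample(n)
    h_n = erm(x, y)
    if true_risk(h_n) - best >= eps:
        bad += 1
print(f"estimated Pr[excess risk >= {eps}] with n = {n}: {bad / trials:.3f}")
</syntaxhighlight>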
Thus in order to make statements about the rate of convergence of the quantity<math display="block">
\sup_\rho\left(\Pr_{\rho^n}[\mathcal E(h_n) - \mathcal E^*_\mathcal{H}\geq\varepsilon]\right),
</math>one must either