Suppose <math>\mathcal H</math> is a class of binary functions (functions to <math>\{0,1\}</math>). Then, <math>\mathcal H</math> is <math>(\epsilon,\delta)</math>-PAC-learnable with a sample of size:
<ref>{{Cite journal |author=Steve Hanneke |year=2016 |title=The optimal sample complexity of PAC learning |journal=J. Mach. Learn. Res. |volume=17 |issue=1 |pages=1319–1333 |arxiv=1507.00473 |url=https://www.jmlr.org/papers/v17/15-389.html}}</ref>
<math display="block">
N = O\left(\frac{VC(\mathcal H) + \ln\frac{1}{\delta}}{\epsilon}\right)
</math>
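For instance, with the illustrative values <math>VC(\mathcal H) = 10</math>, <math>\epsilon = 0.1</math>, and <math>\delta = 0.05</math> (chosen here only for concreteness), one has <math>\ln\frac{1}{\delta} = \ln 20 \approx 3</math>, so the bound evaluates to
<math display="block">
N = O\left(\frac{10 + 3}{0.1}\right) = O(130),
</math>
that is, on the order of a hundred labeled samples suffice for this class.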
==Efficiency in robotics==
A high sample complexity means that many rollouts are needed for running a [[Monte Carlo tree search]].<ref>{{cite conference |title=Monte-Carlo tree search by best arm identification |author=Kaufmann, Emilie and Koolen, Wouter M. |conference=Advances in Neural Information Processing Systems |pages=4897–4906 |year=2017}}</ref> It is equivalent to a [[Model-free (reinforcement learning)|model-free]] brute force search in the state space. In contrast, a sample-efficient algorithm has a low sample complexity.<ref>{{cite conference |title=The chin pinch: A case study in skill learning on a legged robot |author=Fidelman, Peggy and Stone, Peter |conference=Robot Soccer World Cup |pages=59–71 |year=2006 |publisher=Springer}}</ref> Possible techniques for reducing the sample complexity are [[metric learning]]<ref>{{cite conference |title=Sample complexity of learning Mahalanobis distance metrics |author=Verma, Nakul and Branson, Kristin |conference=Advances in Neural Information Processing Systems |pages=2584–2592 |year=2015}}</ref> and model-based reinforcement learning.
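As a minimal sketch of why sample complexity dominates the cost of such search, the following Python snippet (the function names and the toy environment are assumptions for illustration, not part of any cited system) estimates a state's value by averaging random rollouts; the number of environment samples it consumes grows as <code>n_rollouts × depth</code>:

<syntaxhighlight lang="python">
import random

def env_step(state, action):
    """Toy stochastic environment, assumed purely for illustration:
    the state drifts by the action plus noise; reward 1 at state 0."""
    next_state = state + action + random.choice([-1, 0])
    return next_state, (1.0 if next_state == 0 else 0.0)

def rollout_value(state, depth, n_rollouts):
    """Estimate the value of `state` by averaging random rollouts.

    The estimator consumes n_rollouts * depth environment samples,
    so high sample complexity forces a large n_rollouts before the
    estimate becomes reliable."""
    total = 0.0
    for _ in range(n_rollouts):
        s, ret = state, 0.0
        for _ in range(depth):
            s, r = env_step(s, random.choice([-1, 1]))
            ret += r
        total += ret
    return total / n_rollouts

# Halving n_rollouts halves the cost but increases the variance of
# the estimate -- the trade-off a sample-efficient method improves.
print(rollout_value(state=3, depth=10, n_rollouts=1000))
</syntaxhighlight>

In this sketch, a model-based method would replace <code>env_step</code> with a learned model, so that most samples are simulated rather than collected on the physical robot, which is the sense in which it reduces sample complexity.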
==References==
{{Reflist}}