==Common approaches==
There are three common approaches:<ref name="paper1">{{cite web |url=https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html |first=Lilian |last=Weng |title=Meta-Learning: Learning to Learn Fast |website=OpenAI Blog |date=30 November 2018 |access-date=27 October 2019 |language=en}}</ref>
# using recurrent networks with external or internal memory (model-based)
# learning effective distance metrics (metric-based)
# explicitly optimizing model parameters for fast learning (optimization-based)
===Model-Based===
====Memory-Augmented Neural Networks====
A memory-augmented [[neural network]], or MANN for short, is claimed to be able to encode new information quickly and thus to adapt to new tasks after only a few examples.<ref name="paper2">{{cite conference |url=http://proceedings.mlr.press/v48/santoro16.pdf |first1=Adam |last1=Santoro |first2=Sergey |last2=Bartunov |first3=Daan |last3=Wierstra |first4=Timothy |last4=Lillicrap |title=Meta-Learning with Memory-Augmented Neural Networks |publisher=Google DeepMind |access-date=29 October 2019 |language=en}}</ref>
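A minimal NumPy sketch of the external-memory mechanism such a network relies on: new information is written into a memory matrix (here into the least recently used slot) and later retrieved by content-based addressing. The slot count, dimensions and read/write rules below are illustrative assumptions, not the exact architecture of the cited paper.
<syntaxhighlight lang="python">
import numpy as np

def cosine_similarity(key, memory):
    # Similarity between a query key and every memory row.
    num = memory @ key
    den = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    return num / den

def read_memory(key, memory):
    # Content-based read: attention weights over memory rows,
    # the read vector is their weighted sum.
    scores = cosine_similarity(key, memory)
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ memory

def write_memory(memory, write_vector, usage):
    # Write new information into the least recently used slot,
    # so a single example can be stored and retrieved later.
    slot = int(np.argmin(usage))
    memory[slot] = write_vector
    usage[slot] = usage.max() + 1
    return memory, usage

# Toy usage: store the embedding of a new example, then retrieve it.
memory = np.zeros((4, 3))
usage = np.zeros(4)
new_example = np.array([0.2, -1.0, 0.5])
memory, usage = write_memory(memory, new_example, usage)
print(read_memory(new_example, memory))
</syntaxhighlight>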
====Meta Networks====
Meta Networks (MetaNet) is a model that learns meta-level knowledge across tasks and shifts its inductive biases via fast parameterization for rapid generalization.<ref name="paper3">{{cite arXiv |first1=Tsendsuren |last1=Munkhdalai |first2=Hong |last2=Yu |year=2017 |title=Meta Networks |eprint=1703.00837 |class=cs.LG}}</ref>
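A highly simplified NumPy sketch of the fast-parameterization idea, assuming a linear model and a linear "meta-learner" that maps gradient information from a support example to task-specific fast weights; the actual MetaNet uses neural networks for both and is considerably more elaborate.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Slow weights are learned across tasks; fast weights are produced per task.
slow_W = rng.normal(size=(3, 3)) * 0.1
meta_M = rng.normal(size=(3, 3)) * 0.1   # hypothetical "meta-learner" (a plain linear map here)

def predict(x, fast_W):
    # The prediction combines slow and fast weights.
    return (slow_W + fast_W) @ x

def fast_weights_from_gradient(grad):
    # The meta-learner maps gradient information to fast weights,
    # shifting the model toward the current task without retraining slow_W.
    return meta_M * grad

# Toy task: a single support example supplies the gradient signal.
x_support, y_support = rng.normal(size=3), rng.normal(size=3)
grad = np.outer(predict(x_support, 0.0) - y_support, x_support)  # gradient of squared error w.r.t. the weights
fast_W = fast_weights_from_gradient(grad)

x_query = rng.normal(size=3)
print(predict(x_query, fast_W))
</syntaxhighlight>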
===Metric-Based===
The core idea in metric-based meta-learning is similar to [[K-nearest neighbor algorithm|nearest neighbors]] algorithms, in which the weights are generated by a kernel function. It aims to learn a metric or distance function over objects. The notion of a good metric is problem-dependent: it should represent the relationship between inputs in the task space and facilitate problem solving.<ref name="paper1" />
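A minimal NumPy sketch of this idea, with a fixed Gaussian kernel and raw feature vectors standing in for a learned embedding (both hypothetical simplifications): the predicted class is a kernel-weighted vote over the support set.
<syntaxhighlight lang="python">
import numpy as np

def kernel(a, b, bandwidth=1.0):
    # Gaussian kernel on the (here: identity) embedding space.
    return np.exp(-np.sum((a - b) ** 2) / (2 * bandwidth ** 2))

def classify(query, support_x, support_y, n_classes):
    # The predicted label is a kernel-weighted vote over the support set,
    # i.e. soft nearest neighbours.
    scores = np.zeros(n_classes)
    for x, y in zip(support_x, support_y):
        scores[y] += kernel(query, x)
    return int(np.argmax(scores))

# Toy example: two classes, two support points each.
support_x = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
support_y = np.array([0, 0, 1, 1])
print(classify(np.array([0.05, 0.1]), support_x, support_y, 2))  # -> 0
</syntaxhighlight>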
====Convolutional Siamese Neural Network====
A [[Siamese neural network]] is composed of two twin networks whose outputs are jointly trained; a function on top of them learns the relationship between pairs of input data samples. The twin networks are identical, sharing the same weights and network parameters.<ref name="paper4">{{cite conference |url=http://www.cs.toronto.edu/~rsalakhu/papers/oneshot1.pdf |first1=Gregory |last1=Koch |first2=Richard |last2=Zemel |first3=Ruslan |last3=Salakhutdinov |year=2015 |title=Siamese Neural Networks for One-shot Image Recognition |publisher=Department of Computer Science, University of Toronto |___location=Toronto, Ontario, Canada |language=en}}</ref>
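A minimal NumPy sketch of the shared-weight idea, with a hypothetical single-layer embedding in place of the convolutional twins and an untrained distance-based scoring function; in the cited paper both the embedding and the final comparison layer are trained jointly on pairs.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)) * 0.1   # shared embedding weights (hypothetical sizes)

def embed(x):
    # Both "twins" use the same weights, so they compute the same embedding function.
    return np.tanh(W @ x)

def similarity(x1, x2):
    # Verification score from the component-wise distance between the twin embeddings.
    distance = np.abs(embed(x1) - embed(x2))
    return 1.0 / (1.0 + distance.sum())   # closer pairs score nearer to 1

a, b = rng.normal(size=8), rng.normal(size=8)
print(similarity(a, a), similarity(a, b))   # an identical pair scores higher
</syntaxhighlight>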
====Matching Networks====
Matching Networks learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.<ref name="paper5">{{cite conference |url=http://papers.nips.cc/paper/6385-matching-networks-for-one-shot-learning.pdf |last1=Vinyals |first1=O. |last2=Blundell |first2=C. |last3=Lillicrap |first3=T. |last4=Kavukcuoglu |first4=K. |last5=Wierstra |first5=D. |year=2016 |title=Matching Networks for One Shot Learning |publisher=Google DeepMind |access-date=3 November 2019 |language=en}}</ref>
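A minimal NumPy sketch of the attention mechanism at the core of this approach, using raw feature vectors instead of the learned embedding networks of the cited paper: the label distribution for a query is the attention-weighted sum of the one-hot support labels, so new classes only require a new support set.
<syntaxhighlight lang="python">
import numpy as np

def matching_predict(query, support_x, support_y, n_classes):
    # Attention over the support set: softmax of cosine similarities.
    sims = np.array([
        x @ query / (np.linalg.norm(x) * np.linalg.norm(query) + 1e-8)
        for x in support_x
    ])
    attention = np.exp(sims) / np.exp(sims).sum()
    # Predicted label distribution: attention-weighted sum of support labels.
    label_dist = np.zeros(n_classes)
    for a, y in zip(attention, support_y):
        label_dist[y] += a
    return label_dist

support_x = np.array([[1.0, 0.0], [0.0, 1.0]])
support_y = np.array([0, 1])
print(matching_predict(np.array([0.9, 0.1]), support_x, support_y, 2))
</syntaxhighlight>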
====Relation Network====
The Relation Network (RN) is trained end-to-end from scratch. During meta-learning, it learns to learn a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting.<ref name="paper6">{{cite conference |url=http://openaccess.thecvf.com/content_cvpr_2018/papers_backup/Sung_Learning_to_Compare_CVPR_2018_paper.pdf |last1=Sung |first1=F. |last2=Yang |first2=Y. |last3=Zhang |first3=L. |last4=Xiang |first4=T. |last5=Torr |first5=P. H. S. |last6=Hospedales |first6=T. M. |year=2018 |title=Learning to Compare: Relation Network for Few-Shot Learning |language=en}}</ref>
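A minimal NumPy sketch of the two-module structure, with hypothetical untrained weights standing in for the learned embedding and relation modules; it only illustrates the mechanics of scoring query–support pairs with a learned, rather than fixed, metric.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical weights standing in for the learned embedding and relation modules.
W_embed = rng.normal(size=(4, 6)) * 0.1
w_relation = rng.normal(size=8) * 0.1

def embed(x):
    # Embedding module.
    return np.maximum(0.0, W_embed @ x)

def relation_score(query, support):
    # Relation module: scores the concatenated pair of embeddings,
    # so the comparison metric itself is learned during training.
    pair = np.concatenate([embed(support), embed(query)])
    return 1.0 / (1.0 + np.exp(-(w_relation @ pair)))   # sigmoid score in [0, 1]

def classify(query, support_x, support_y):
    scores = [relation_score(query, s) for s in support_x]
    return support_y[int(np.argmax(scores))]

# Untrained weights, so the result is not yet meaningful; shown for the mechanics only.
support_x = rng.normal(size=(3, 6))
support_y = np.array([0, 1, 2])
print(classify(support_x[1] + 0.01 * rng.normal(size=6), support_x, support_y))
</syntaxhighlight>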
====Prototypical Networks====
Prototypical Networks learn a [[metric space]] in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve satisfactory results.<ref name="paper7">{{cite conference |url=http://papers.nips.cc/paper/6996-prototypical-networks-for-few-shot-learning.pdf |last1=Snell |first1=J. |last2=Swersky |first2=K. |last3=Zemel |first3=R. S. |year=2017 |title=Prototypical Networks for Few-shot Learning |language=en}}</ref>
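A minimal NumPy sketch, assuming an identity embedding (the cited paper learns the embedding with a neural network): each class prototype is the mean of that class's support points, and a query is classified by a softmax over negative squared distances to the prototypes.
<syntaxhighlight lang="python">
import numpy as np

def prototypical_predict(query, support_x, support_y, n_classes):
    # One prototype per class: the mean of that class's (embedded) support points.
    prototypes = np.array([
        support_x[support_y == c].mean(axis=0) for c in range(n_classes)
    ])
    # Classify by softmax over negative squared Euclidean distances to the prototypes.
    dists = np.sum((prototypes - query) ** 2, axis=1)
    logits = -dists
    return np.exp(logits) / np.exp(logits).sum()

support_x = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.1, 0.9]])
support_y = np.array([0, 0, 1, 1])
print(prototypical_predict(np.array([0.1, 0.0]), support_x, support_y, 2))
</syntaxhighlight>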
===Optimization-Based===
====LSTM Meta-Learner====
An [[LSTM]]-based meta-learner learns the exact [[optimization algorithm]] used to train another learner [[neural network]] [[classification rule|classifier]] in the few-shot regime. The parametrization allows it to learn appropriate parameter updates specifically for the scenario where a set amount of updates will be made, while also learning a general initialization of the learner (classifier) network that allows for quick convergence of training.<ref name="paper8">{{cite conference |url=https://openreview.net/pdf?id=rJY0-Kcll |first1=Sachin |last1=Ravi |first2=Hugo |last2=Larochelle |year=2017 |title=Optimization as a Model for Few-Shot Learning |conference=ICLR 2017 |access-date=3 November 2019 |language=en}}</ref>
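A minimal NumPy sketch of the update form this approach builds on: the learner's parameters play the role of the LSTM cell state and are updated as theta_t = f_t * theta_{t-1} - i_t * grad. In the cited method the gates f_t and i_t are produced by a trained LSTM; the fixed gate values and toy quadratic loss below are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np

def meta_update(theta, grad, forget_gate, input_gate):
    # The learner's parameters act like the LSTM cell state:
    # theta_t = f_t * theta_{t-1} + i_t * (-grad).
    # Here the gates are fixed constants purely to show the update form.
    return forget_gate * theta - input_gate * grad

# Toy quadratic loss L(theta) = 0.5 * theta^2, so grad = theta.
theta = np.array([2.0, -1.5])
for _ in range(5):
    grad = theta
    theta = meta_update(theta, grad, forget_gate=1.0, input_gate=0.3)
print(theta)   # the parameters move toward the minimum at 0
</syntaxhighlight>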
====Model-Agnostic Meta-Learning (MAML)====
MAML, short for Model-Agnostic Meta-Learning, is a fairly general [[optimization algorithm]], compatible with any model that learns through gradient descent.<ref name="paper9">{{cite arXiv |first1=Chelsea |last1=Finn |first2=Pieter |last2=Abbeel |first3=Sergey |last3=Levine |year=2017 |title=Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks |eprint=1703.03400 |class=cs.LG}}</ref>
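A minimal NumPy sketch of the two-level structure, using a toy quadratic loss per task and the first-order approximation that ignores second derivatives (the full method backpropagates through the inner gradient step): the inner loop adapts the shared initialization to each task, and the outer loop updates that initialization so the adapted models perform well.
<syntaxhighlight lang="python">
import numpy as np

def loss_and_grad(theta, task_target):
    # Toy task: move theta toward a task-specific target (quadratic loss).
    return 0.5 * np.sum((theta - task_target) ** 2), theta - task_target

def maml_step(theta, tasks, inner_lr=0.1, outer_lr=0.01):
    # Inner loop: adapt to each task with one gradient step from the shared init.
    # Outer loop: update the init so that post-adaptation loss is low.
    # (First-order approximation: second derivatives are ignored.)
    meta_grad = np.zeros_like(theta)
    for target in tasks:
        _, g = loss_and_grad(theta, target)
        adapted = theta - inner_lr * g
        _, g_adapted = loss_and_grad(adapted, target)
        meta_grad += g_adapted
    return theta - outer_lr * meta_grad / len(tasks)

theta = np.zeros(2)
tasks = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
for _ in range(100):
    theta = maml_step(theta, tasks)
print(theta)   # an initialization from which each task is reached quickly
</syntaxhighlight>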
====Reptile====
Reptile is a remarkably simple meta-learning optimization algorithm; like MAML, it relies on [[meta-optimization]] through gradient descent and is model-agnostic.<ref name="paper10">{{cite arXiv |first1=Alex |last1=Nichol |first2=Joshua |last2=Achiam |first3=John |last3=Schulman |year=2018 |title=On First-Order Meta-Learning Algorithms |eprint=1803.02999 |class=cs.LG}}</ref>
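A minimal NumPy sketch, using a toy quadratic loss per task: ordinary SGD is run for a few steps on a sampled task, and the initialization is then moved part of the way toward the adapted parameters. The step counts and learning rates are arbitrary illustrative values.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def task_grad(theta, target):
    # Toy task: quadratic loss 0.5 * ||theta - target||^2.
    return theta - target

def reptile_step(theta, target, inner_steps=5, inner_lr=0.1, meta_lr=0.5):
    # Run a few steps of ordinary SGD on the sampled task, then move the
    # initialization part of the way toward the adapted parameters.
    phi = theta.copy()
    for _ in range(inner_steps):
        phi -= inner_lr * task_grad(phi, target)
    return theta + meta_lr * (phi - theta)

theta = np.zeros(2)
targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
for _ in range(200):
    target = targets[rng.integers(len(targets))]
    theta = reptile_step(theta, target)
print(theta)   # an initialization roughly centred between the task optima
</syntaxhighlight>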
==Examples==