Meta-learning (computer science): Difference between revisions

Revision as of 01:45, 14 January 2022 by Jarble (minor edit; summary: linking)
Latest revision as of 16:53, 17 April 2025 by Citation bot (summary: Altered template type. Add: pmc, pages, volume, journal. Suggested by Schützenpanzer. Category:CS1 errors: unsupported parameter. #UCB_Category 214/244)
(29 intermediate revisions by 17 users not shown)
{{short description|Subfield of machine learning}}
{{About|meta-learning in machine learning|meta-learning in social psychology|Meta-learning|metalearning in neuroscience|Metalearning (neuroscience)}}
{{See also|Ensemble learning}}
{{machine learning|Paradigms}}

'''Meta-learning'''<ref name="sch1987">{{cite journal | last1 = Schmidhuber | first1 = Jürgen | year = 1987| title = Evolutionary principles in self-referential learning, or on learning how to learn: the meta-meta-... hook | url= http://people.idsia.ch/~juergen/diploma1987ocr.pdf | journal = Diploma Thesis, Tech. Univ. Munich | language = en}}</ref><ref name="scholarpedia">{{cite journal | last1 = Schaul | first1 = Tom | last2 = Schmidhuber | first2 = Jürgen | year = 2010| title = Metalearning | journal = Scholarpedia | volume = 5 | issue = 6| page = 4650 | doi=10.4249/scholarpedia.4650| bibcode = 2010SchpJ...5.4650S | doi-access = free | language = en }}</ref> is a subfield of [[machine learning]] where automatic learning algorithms are applied to [[meta-data|metadata]] about machine learning experiments. As of 2017, the term had not found a standard interpretation; however, the main goal is to use such metadata to understand how automatic learning can become flexible in solving learning problems, and hence to improve the performance of existing [[learning algorithms]] or to learn (induce) the learning algorithm itself, hence the alternative term '''learning to learn'''.<ref name="sch1987" />

Flexibility is important because each learning algorithm is based on a set of assumptions about the data, its [[inductive bias]].<ref name="utgoff1986">{{Cite book | author = P. E. Utgoff | chapter = Shift of bias for inductive concept learning | editor = R. Michalski |editor2=J. Carbonell |editor3=T.
Mitchell | title = Machine Learning: An Artificial Intelligence Approach | pages = 163–190 | year = 1986 | publisher = Morgan Kaufmann | isbn = 978-0-934613-00-2 | language = en | chapter-url = https://books.google.com/books?id=f9RylgKpHZsC&q=utgoff&pg=PA107 }}</ref> This means that it will only learn well if the bias matches the learning problem. A learning algorithm may perform very well in one domain, but not on the next. This poses strong restrictions on the use of [[machine learning]] or [[data mining]] techniques, since the relationship between the learning problem (often some kind of [[database]]) and the effectiveness of different learning algorithms is not yet understood.

By using different kinds of metadata, like properties of the learning problem, algorithm properties (like performance measures), or patterns previously derived from the data, it is possible to learn, select, alter or combine different learning algorithms to effectively solve a given learning problem. Critiques of meta-learning approaches bear a strong resemblance to the critique of [[metaheuristic]]s, a possibly related problem. A good analogy to meta-learning, and the inspiration for [[Jürgen Schmidhuber]]'s early work (1987)<ref name="sch1987" /> and [[Yoshua Bengio]] et al.'s work (1991),<ref>{{cite conference|last1=Bengio|first1=Yoshua|last2=Bengio|first2=Samy|last3=Cloutier|first3=Jocelyn|conference=IJCNN'91|url=http://bengio.abracadoudou.com/publications/pdf/bengio_1991_ijcnn.pdf|date=1991|title=Learning to learn a synaptic rule|language=en}}</ref> considers that genetic evolution learns the learning procedure encoded in genes and executed in each individual's brain.
In an open-ended hierarchical meta-learning system<ref name="sch1987" /> using [[genetic programming]], better evolutionary methods can be learned by meta evolution, which itself can be improved by meta meta evolution, etc.<ref name="sch1987" />

== Definition ==
A proposed definition<ref>{{Cite journal|last1=Lemke|first1=Christiane|last2=Budka|first2=Marcin|last3=Gabrys|first3=Bogdan|date=2013-07-20|title=Metalearning: a survey of trends and technologies|journal=Artificial Intelligence Review|language=en|volume=44|issue=1|pages=117–130|doi=10.1007/s10462-013-9406-y|issn=0269-2821|pmc=4459543|pmid=26069389}}</ref> for a meta-learning system combines three requirements:
* The system must include a learning subsystem.
* Experience is gained by exploiting meta knowledge extracted
** from different domains.
* Learning bias must be chosen dynamically.

''Bias'' refers to the assumptions that influence the choice of explanatory hypotheses<ref>{{Cite book|title=Metalearning - Springer|doi=10.1007/978-3-540-73263-1|series = Cognitive Technologies|year = 2009|isbn = 978-3-540-73262-4|last1 = Brazdil|first1 = Pavel|last2=Carrier|first2=Christophe Giraud|last3=Soares|first3=Carlos|last4=Vilalta|first4=Ricardo|language=en}}</ref> and not the notion of bias represented in the [[bias-variance dilemma]]. Meta-learning is concerned with two aspects of learning bias.
* Declarative bias specifies the representation of the space of hypotheses, and affects the size of the search space (e.g., represent hypotheses using linear functions only).
* Procedural bias imposes constraints on the ordering of the inductive hypotheses (e.g., preferring smaller hypotheses).<ref>{{cite journal |last1=Gordon |first1=Diana |last2=Desjardins |first2=Marie |title=Evaluation and Selection of Biases in Machine Learning |journal=Machine Learning |date=1995 |volume=20 |pages=5–22 |doi=10.1023/A:1022630017346 |url=https://link.springer.com/content/pdf/10.1023/A:1022630017346.pdf |access-date=27 March 2020|doi-access=free|language=en }}</ref>

==Common approaches==
There are three common approaches:<ref name="paper1">{{cite web|url=https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html |first=Lilian |last=Weng |title=Meta-Learning: Learning to Learn Fast |website=OpenAI Blog |date=30 November 2018 |access-date=27 October 2019|language=en}}</ref>
# using (cyclic) networks with external or internal memory (model-based)
# learning effective distance metrics (metric-based)
# explicitly optimizing model parameters for fast learning (optimization-based).

===Model-Based===
====Memory-Augmented Neural Networks====
A Memory-Augmented [[Neural Network]], or MANN for short, is claimed to be able to encode new information quickly and thus to adapt to new tasks after only a few examples.<ref name="paper2">{{cite web|url=http://proceedings.mlr.press/v48/santoro16.pdf |first1=Adam |last1=Santoro |first2=Sergey |last2=Bartunov |first3=Daan |last3=Wierstra |first4=Timothy |last4=Lillicrap |title=Meta-Learning with Memory-Augmented Neural Networks |publisher=Google DeepMind |access-date=29 October 2019|language=en}}</ref>

====Meta Networks====
Meta Networks (MetaNet) learns meta-level knowledge across tasks and shifts its inductive biases via fast parameterization for rapid generalization.<ref name="paper3">{{cite journal|first1=Tsendsuren |last1=Munkhdalai |first2=Hong |last2=Yu|year=2017 |title=Meta Networks|journal=Proceedings of Machine Learning Research |volume=70 |pages=2554–2563 |pmid=31106300 |pmc=6519722 |arxiv=1703.00837 |language=en}}</ref>

===Metric-Based===
The core idea in metric-based meta-learning is similar to [[K-nearest neighbor algorithm|nearest neighbors]] algorithms, in which the weight is generated by a kernel function. It aims to learn a metric or distance function over objects. The notion of a good metric is problem-dependent. It should represent the relationship between inputs in the task space and facilitate problem solving.<ref name="paper1" />

====Convolutional Siamese [[Neural Network]]====
A [[Siamese neural network]] is composed of two twin networks whose outputs are jointly trained. A function on top learns the relationship between pairs of input data samples. The two networks are the same, sharing the same weights and network parameters.<ref name="paper4">{{cite web|url=http://www.cs.toronto.edu/~rsalakhu/papers/oneshot1.pdf |first1=Gregory |last1=Koch |first2=Richard |last2=Zemel |first3=Ruslan |last3=Salakhutdinov|year=2015|title=Siamese Neural Networks for One-shot Image Recognition |publisher=Department of Computer Science, University of Toronto
|location=Toronto, Ontario, Canada|language=en}}</ref>

====Matching Networks====
Matching Networks learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.<ref name="paper5">{{cite web|url=http://papers.nips.cc/paper/6385-matching-networks-for-one-shot-learning.pdf |last1=Vinyals |first1=O. |last2=Blundell |first2=C. |last3=Lillicrap |first3=T. |last4=Kavukcuoglu |first4=K. |last5=Wierstra |first5=D. |year=2016 |title=Matching networks for one shot learning |publisher=Google DeepMind |access-date=3 November 2019|language=en}}</ref>

====Relation Network====
The Relation Network (RN) is trained end-to-end from scratch. During meta-learning, it learns to learn a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting.<ref name="paper6">{{cite web|url=http://openaccess.thecvf.com/content_cvpr_2018/papers_backup/Sung_Learning_to_Compare_CVPR_2018_paper.pdf |last1=Sung |first1=F. |last2=Yang |first2=Y. |last3=Zhang |first3=L. |last4=Xiang |first4=T. |last5=Torr |first5=P. H. S. |last6=Hospedales |first6=T. M. |year=2018 |title=Learning to compare: relation network for few-shot learning|language=en}}</ref>

====Prototypical Networks====
Prototypical Networks learn a [[metric space]] in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve satisfactory results.<ref name="paper7">{{cite web|url=http://papers.nips.cc/paper/6996-prototypical-networks-for-few-shot-learning.pdf |last1=Snell |first1=J. |last2=Swersky |first2=K. |last3=Zemel |first3=R. S. |year=2017 |title=Prototypical networks for few-shot learning|language=en}}</ref>

===Optimization-Based===
====LSTM Meta-Learner====
An [[LSTM]]-based meta-learner learns the exact [[optimization algorithm]] used to train another learner [[Artificial neural network|neural network]] [[classification rule|classifier]] in the few-shot regime. The parametrization allows it to learn appropriate parameter updates specifically for the [[scenario]] where a set number of updates will be made, while also learning a general initialization of the learner (classifier) network that allows for quick convergence of training.<ref name="paper8">{{cite conference|url=https://openreview.net/pdf?id=rJY0-Kcll |first1=Sachin |last1=Ravi|first2=Hugo |last2=Larochelle|year=2017 |title=Optimization as a model for few-shot learning|conference=ICLR 2017 |access-date=3 November 2019|language=en}}</ref>

====Temporal Discreteness====
Model-Agnostic Meta-Learning (MAML) is a fairly general [[optimization algorithm]], compatible with any model that learns through gradient descent.<ref name="maml" />

====Reptile====
Reptile is a remarkably simple meta-learning optimization algorithm. It resembles MAML in that both rely on [[meta-optimization]] through gradient descent and both are model-agnostic.<ref name="paper10">{{cite arXiv|first1=Alex |last1=Nichol |first2=Joshua |last2=Achiam |first3=John |last3=Schulman|year=2018 |title=On First-Order Meta-Learning Algorithms|eprint=1803.02999 |class=cs.LG|language=en}}</ref>

==Examples==
Some approaches which have been viewed as instances of meta-learning:
* [[Recurrent neural networks]] (RNNs) are universal computers. In 1993, [[Jürgen Schmidhuber]] showed how "self-referential" RNNs can in principle learn by [[backpropagation]] to run their own weight change algorithm, which may be quite different from backpropagation.<ref name="sch1993">{{cite journal | last1 = Schmidhuber | first1 = Jürgen | year = 1993| title = A self-referential weight matrix | journal = Proceedings of ICANN'93, Amsterdam | pages = 446–451 | language = en}}</ref> In 2001, [[Sepp Hochreiter]], A.S. Younger and P.R. Conwell built a successful supervised meta-learner based on [[Long short-term memory]] RNNs. It learned through backpropagation a learning algorithm for quadratic functions that is much faster than backpropagation.<ref name="hoch2001">{{cite journal | last1 = Hochreiter | first1 = Sepp | last2 = Younger | first2 = A. S. | last3 = Conwell | first3 = P. R. | year = 2001| title = Learning to Learn Using Gradient Descent | journal = Proceedings of ICANN'01| pages = 87–94| language = en}}</ref><ref name="scholarpedia" /> Researchers at [[Deepmind]] (Marcin Andrychowicz et al.)
extended this approach to optimization in 2017.<ref name="marcin2017">{{cite journal | last1 = Andrychowicz | first1 = Marcin | last2 = Denil | first2 = Misha | last3 = Gomez | first3 = Sergio | last4 = Hoffmann | first4 = Matthew | last5 = Pfau | first5 = David | last6 = Schaul | first6 = Tom | last7 = Shillingford | first7 = Brendan | last8 = de Freitas | first8 = Nando | year = 2017| title = Learning to learn by gradient descent by gradient descent | journal = Proceedings of ICML'17, Sydney, Australia| arxiv = 1606.04474 }}</ref>
* In the 1990s, Meta [[Reinforcement Learning]] or Meta RL was achieved in Schmidhuber's research group through self-modifying policies written in a universal programming language that contains special instructions for changing the policy itself. There is a single lifelong trial. The goal of the RL agent is to maximize reward. It learns to accelerate reward intake by continually improving its own learning algorithm which is part of the "self-referential" policy.<ref name="sch1994">{{cite journal | last1 = Schmidhuber | first1 = Jürgen | year = 1994| title = On learning how to learn learning strategies | journal = Technical Report FKI-198-94, Tech. Univ. Munich | language = en | url = http://people.idsia.ch/~juergen/FKI-198-94ocr.pdf}}</ref><ref name="sch1997">{{cite journal | last1 = Schmidhuber | first1 = Jürgen | last2 = Zhao | first2 = J. | last3 = Wiering | first3 = M. | year = 1997| title = Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement | journal = Machine Learning | volume = 28 | pages = 105–130 | doi=10.1023/a:1007383707642| doi-access = free| language = en }}</ref>
* An extreme type of Meta [[Reinforcement Learning]] is embodied by the [[Gödel machine]], a theoretical construct which can inspect and modify any part of its own software which also contains a general [[Automated theorem proving|theorem prover]].
It can achieve [[recursive self-improvement]] in a provably optimal way.<ref name="goedelmachine">{{cite journal | last1 = Schmidhuber | first1 = Jürgen | year = 2006| title = Gödel machines: Fully Self-Referential Optimal Universal Self-Improvers | url=https://archive.org/details/arxiv-cs0309048| journal = In B. Goertzel & C. Pennachin, Eds.: Artificial General Intelligence | pages = 199–226 | language=en}}</ref><ref name="scholarpedia" />
* ''Model-Agnostic Meta-Learning'' (MAML) was introduced in 2017 by [[Chelsea Finn]] et al.<ref name="maml" /> Given a sequence of tasks, the parameters of a given model are trained such that a few iterations of gradient descent with little training data from a new task will lead to good generalization performance on that task. MAML "trains the model to be easy to fine-tune."<ref name="maml" /> MAML was successfully applied to few-shot image classification benchmarks and to policy-gradient-based reinforcement learning.<ref name="maml">{{cite arXiv | last1 = Finn | first1 = Chelsea | last2 = Abbeel | first2 = Pieter | last3 = Levine | first3 = Sergey |year = 2017| title = Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks | eprint=1703.03400|class=cs.LG|language=en }}</ref>
* ''Variational Bayes-Adaptive Deep RL'' (VariBAD) was introduced in 2019.<ref>{{Cite journal |last1=Zintgraf |first1=Luisa |last2=Schulze |first2=Sebastian |last3=Lu |first3=Cong |last4=Feng |first4=Leo |last5=Igl |first5=Maximilian |last6=Shiarlis |first6=Kyriacos |last7=Gal |first7=Yarin |last8=Hofmann |first8=Katja |last9=Whiteson |first9=Shimon |date=2021 |title=VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning |url=http://jmlr.org/papers/v22/21-0657.html |journal=Journal of Machine Learning Research |volume=22 |issue=289 |pages=1–39 |issn=1533-7928}}</ref> While MAML is optimization-based, VariBAD is a model-based method for meta reinforcement learning, and leverages a
[[variational autoencoder]] to capture the task information in an internal memory, thus conditioning its decision making on the task.
* When addressing a set of tasks, most meta-learning approaches optimize the average score across all tasks. Hence, certain tasks may be sacrificed in favor of the average score, which is often unacceptable in real-world applications. By contrast, ''Robust Meta Reinforcement Learning'' (RoML) focuses on improving low-score tasks, increasing robustness to the selection of task.<ref>{{Cite journal |last1=Greenberg |first1=Ido |last2=Mannor |first2=Shie |last3=Chechik |first3=Gal |last4=Meirom |first4=Eli |date=2023-12-15 |title=Train Hard, Fight Easy: Robust Meta Reinforcement Learning |url=https://proceedings.neurips.cc/paper_files/paper/2023/hash/d74e6bfe9ce029526e69db14d2c281ec-Abstract-Conference.html |journal=Advances in Neural Information Processing Systems |language=en |volume=36 |pages=68276–68299}}</ref> RoML works as a meta-algorithm, as it can be applied on top of other meta-learning algorithms (such as MAML and VariBAD) to increase their robustness. It is applicable to both supervised meta-learning and meta [[reinforcement learning]].
* ''Discovering [[meta-knowledge]]'' works by inducing knowledge (e.g. rules) that expresses how each learning method will perform on different learning problems. The metadata is formed by characteristics of the data (general, statistical, information-theoretic, ...) in the learning problem, and characteristics of the learning algorithm (type, parameter settings, performance measures, ...). Another learning algorithm then learns how the data characteristics relate to the algorithm characteristics. Given a new learning problem, the data characteristics are measured, and the performance of different learning algorithms is predicted. Hence, one can predict the algorithms best suited for the new problem.
* ''Stacked generalisation'' works by combining multiple (different) learning algorithms. The metadata is formed by the predictions of those different algorithms. Another learning algorithm learns from this metadata to predict which combinations of algorithms give generally good results. Given a new learning problem, the predictions of the selected set of algorithms are combined (e.g. by (weighted) voting) to provide the final prediction. Since each algorithm is deemed to work on a subset of problems, a combination is hoped to be more flexible and able to make good predictions.
* ''[[Inductive transfer]]'' studies how the learning process can be improved over time. Metadata consists of knowledge about previous learning episodes and is used to efficiently develop an effective hypothesis for a new task. A related approach is called [[learning to learn]], in which the goal is to use acquired knowledge from one domain to help learning in other domains.
* Other approaches using metadata to improve automatic learning are [[learning classifier system]]s, [[case-based reasoning]] and [[constraint satisfaction]].
* Some initial theoretical work has begun to use ''[[Applied Behavioral Analysis]]'' as a foundation for agent-mediated meta-learning about the performance of human learners, and to adjust the instructional course of an artificial agent.<ref name="Begoli, PRS-ABA, ABA Ontology">{{cite journal|last1=Begoli|first1=Edmon|title=Procedural-Reasoning Architecture for Applied Behavior Analysis-based Instructions|journal=Doctoral Dissertations|date=May 2014|publisher=University of Tennessee, Knoxville|location=Knoxville, Tennessee, USA|pages=44–79|url=http://trace.tennessee.edu/utk_graddiss/2749|access-date=14 October 2017|language=en}}</ref>
* [[AutoML]] such as Google Brain's "AI building AI" project, which according to Google briefly exceeded existing [[ImageNet]] benchmarks in 2017.<ref>{{cite news|title=Robots Are Now 'Creating New Robots,' Tech Reporter Says|url=https://www.npr.org/2018/03/15/593863645/robots-are-now-creating-new-robots-tech-reporter-says|access-date=29 March 2018|work=NPR.org|date=2018|language=en}}</ref><ref>{{cite news|title=AutoML for large scale image classification and object detection|url=https://research.googleblog.com/2017/11/automl-for-large-scale-image.html|access-date=29 March 2018|work=Google Research Blog|date=November 2017|language=en}}</ref>

== External links ==
* [http://www.scholarpedia.org/article/Metalearning Metalearning] article in [[Scholarpedia]]
* {{cite journal|last1=Vilalta |first1=R. |last2=Drissi |first2=Y. |year=2002 |url=http://axon.cs.byu.edu/Dan/478/misc/Vilalta.pdf |title=A perspective view and survey of meta-learning |journal=Artificial Intelligence Review |volume=18|issue=2 |pages=77–95|doi=10.1023/A:1019956318069 |language=en}}
* {{cite book|last1=Giraud-Carrier |first1=C. |last2=Keller |first2=J. |year=2002 |title=Dealing with the data flood |editor-first=J.
|editor-last=Meij |chapter=Meta-Learning |publisher=STT/Beweton |location=The Hague|language=en|url=https://stt.nl/en/futures-studies/dealing-with-the-data-flood/stt65-dealing-with-the-data-flood-mining-data-text-and-multimedia}}
* {{cite book|last1=Brazdil|first1=P. |last2=Giraud-Carrier |first2=C. |last3=Soares |first3=C. |last4=Vilalta |first4=R. |year=2009 |url=https://books.google.com/books?id=-Gsi_cxZGpcC |title=Metalearning: applications to data mining |chapter=Metalearning: Concepts and Systems |publisher=Springer|isbn=978-3-540-73262-4 |language=en}}
* Video courses about Meta-Learning with step-by-step explanation of [https://www.youtube.com/watch?v=IkDw22a8BDE MAML], [https://www.youtube.com/watch?v=rHGPfl0pvLY Prototypical Networks], and [https://www.youtube.com/watch?v=j8qDaVfrO_c Relation Networks].

{{DEFAULTSORT:Meta-Learning (Computer Science)}}
[[Category:Machine learning]]
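
The classification rule described under Prototypical Networks (computing distances to one prototype per class) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen for illustration: the <code>embed</code> function is a fixed identity map standing in for the embedding network that a real prototypical network would learn, and the toy support set is invented. The sketch shows only the inference step, not episodic training.

```python
import math
from collections import defaultdict


def embed(x):
    """Placeholder embedding (identity); an assumption, not a learned network."""
    return [float(v) for v in x]


def class_prototypes(support):
    """Compute one prototype per class as the mean embedding of its support points."""
    sums = {}
    counts = defaultdict(int)
    for features, label in support:
        z = embed(features)
        if label not in sums:
            sums[label] = [0.0] * len(z)
        sums[label] = [s + v for s, v in zip(sums[label], z)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(query, prototypes):
    """Return the label of the prototype nearest to the query (Euclidean distance)."""
    z = embed(query)
    return min(prototypes, key=lambda label: math.dist(z, prototypes[label]))


# Toy 2-way, 2-shot support set (invented data).
support = [((0.0, 0.0), "a"), ((0.0, 2.0), "a"),
           ((4.0, 4.0), "b"), ((6.0, 4.0), "b")]
prototypes = class_prototypes(support)
print(classify((1.0, 1.0), prototypes))
print(classify((4.0, 3.0), prototypes))
```

In this sketch a new class can be added by supplying a few labelled support points, without changing any parameters; in the published method the quality of the result depends on the learned embedding, which the identity map here does not capture.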