{{Short description|Machine learning strategy}}
{{about|a machine learning method|active learning in the context of education|active learning}}
{{Machine learning bar}}
'''Active learning''' is a special case of [[machine learning]] in which a learning algorithm can interactively query a human user (or some other information source) to [[Labeled data|label]] new data points with the desired outputs. The human user must have knowledge or expertise in the problem domain, including the ability to consult authoritative sources when necessary.<ref name="settles">{{cite
| title = Active Learning Literature Survey
| url = http://pages.cs.wisc.edu/~bsettles/pub/settles.activelearning.pdf
|isbn=978-1-5090-5473-2
|s2cid=15285595
}}</ref> In statistics literature, it is sometimes also called [[optimal experimental design]].<ref name="olsson">{{cite
There are situations in which unlabeled data is abundant but manual labeling is expensive. In such a scenario, learning algorithms can actively query the user (the teacher) for labels. This type of iterative supervised learning is called active learning. Since the learner chooses the examples, the number of examples needed to learn a concept can often be much lower than the number required in normal supervised learning. With this approach, there is a risk that the algorithm is overwhelmed by uninformative examples. Recent developments are dedicated to multi-label active learning,<ref name="multi"/> hybrid active learning<ref name="hybrid"/> and active learning in a single-pass (on-line) context,<ref name="single-pass"/> combining concepts from the field of machine learning (e.g. conflict and ignorance) with adaptive, [[incremental learning]] policies in the field of [[online machine learning]]. Active learning can also allow faster development of a machine learning algorithm in cases where comparable updates would otherwise require a quantum computer or supercomputer.<ref>{{Cite journal |last=Novikov |first=Ivan |date=2021 |title=The MLIP package: moment tensor potentials with MPI and active learning |journal= Machine Learning: Science and Technology|volume=2 |issue=2 |pages=3,4 |doi=10.1088/2632-2153/abc9fe |doi-access=free |arxiv=2007.08555 }}</ref>
Large-scale active learning projects may benefit from [[crowdsourcing]] frameworks such as [[Amazon Mechanical Turk]] that include many [[human-in-the-loop|humans in the active learning loop]].
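The label-saving effect described above can be illustrated with a standard toy problem: learning a threshold on sorted one-dimensional data. The sketch below assumes a noiseless teacher; the function name <code>learn_threshold_actively</code>, the <code>oracle</code> callable and the example data are hypothetical and chosen only for illustration. A passive learner might need a label for every point, while a learner that picks its own queries can locate the boundary by binary search.

```python
from typing import Callable, List, Tuple


def learn_threshold_actively(
    xs: List[float], oracle: Callable[[float], int]
) -> Tuple[int, int]:
    """Locate the first positive point in sorted xs by querying labels.

    Returns (boundary_index, number_of_label_queries).
    Assumes labels are noiseless and switch from 0 to 1 exactly once.
    """
    # Invariant: every point before lo is negative,
    # every point from hi onward is positive.
    lo, hi = 0, len(xs)
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1  # one label request to the teacher
        if oracle(xs[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    return lo, queries


# Hypothetical pool of 100 unlabeled points and a hidden threshold.
xs = [i / 100 for i in range(100)]
true_threshold = 0.365


def oracle(x: float) -> int:
    """Stand-in for the human teacher (illustrative assumption)."""
    return 1 if x >= true_threshold else 0


boundary, n_queries = learn_threshold_actively(xs, oracle)
print(boundary, n_queries)
```

In this setting the number of label requests grows roughly with the logarithm of the pool size rather than linearly. Real problems with label noise or higher-dimensional data rarely give such a clean saving.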
== Scenarios ==
*'''Pool-based sampling''': In this approach, which is the best-known scenario,<ref>{{cite web |last1=DataRobot |title=Active learning machine learning: What it is and how it works |url=https://www.datarobot.com/blog/active-learning-machine-learning |website=DataRobot Blog |publisher=DataRobot Inc. |access-date=30 January 2024}}</ref> the learning algorithm attempts to evaluate ''the entire dataset'' before selecting data points (instances) for labeling. It is often initially trained on a fully labeled subset of the data using a machine-learning method such as logistic regression or SVM that yields class-membership probabilities for individual data instances. The candidate instances are those for which the prediction is most ambiguous. Instances are drawn from the entire data pool and assigned a confidence score, a measurement of how well the learner "understands" the data. The system then selects the instances for which it is the least confident and queries the teacher for the labels. <br />The theoretical drawback of pool-based sampling is that it is memory-intensive and is therefore limited in its capacity to handle enormous datasets. In practice, however, the rate-limiting factor is usually not computer memory but the teacher, who is typically a (fatiguable) human expert who must be paid for their effort.
*'''Stream-based selective sampling''': Here, each consecutive unlabeled instance is examined ''one at a time'', with the machine evaluating the informativeness of each item against its query parameters. The learner decides for itself whether to assign a label or query the teacher for each data point. In contrast to pool-based sampling, the obvious drawback of stream-based methods is that the learning algorithm does not have sufficient information, early in the process, to make a sound assign-label vs. ask-teacher decision, and it does not capitalize as efficiently on the presence of already labeled data. Therefore, the teacher is likely to spend more effort in supplying labels than with the pool-based approach.
*'''Membership
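The difference between the pool-based and stream-based scenarios described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes a model that already yields class-membership probabilities, uses "one minus the highest class probability" as the confidence-based score, and the function names, the threshold value and the example probabilities are all hypothetical.

```python
from typing import List, Sequence

Probs = Sequence[float]  # class-membership probabilities for one instance


def least_confidence(probs: Probs) -> float:
    """Uncertainty score: 1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def pool_based_query(pool_probs: Sequence[Probs], batch_size: int) -> List[int]:
    """Score every instance in the pool, then return the indices of the
    batch_size instances the model is least confident about."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: least_confidence(pool_probs[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:batch_size]


def stream_based_should_query(probs: Probs, threshold: float) -> bool:
    """Decide for one incoming instance, without seeing the rest of the
    data, whether to ask the teacher (True) or self-label (False)."""
    return least_confidence(probs) > threshold


# Hypothetical predicted probabilities for four unlabeled instances.
pool = [[0.95, 0.05], [0.55, 0.45], [0.80, 0.20], [0.51, 0.49]]

selected = pool_based_query(pool, batch_size=2)
stream_decisions = [stream_based_should_query(p, threshold=0.3) for p in pool]
print(selected, stream_decisions)
```

The pool-based function needs all scores before it can rank, which is where the memory cost comes from. The stream-based function makes each decision in isolation, so its quality depends heavily on how well the threshold is chosen early on.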
==Query strategies==
*'''Querying from diverse subspaces or partitions''':<ref name="shubhomoydas_github"/> When the underlying model is a forest of trees, the leaf nodes might represent (overlapping) partitions of the original [[feature (machine learning)|feature space]]. This offers the possibility of selecting instances from non-overlapping or minimally overlapping partitions for labeling.
*'''Variance reduction''': label those points that would minimize output variance, which is one of the components of error.
*'''[[Conformal
*'''Mismatch-first farthest-traversal''': The primary selection criterion is the prediction mismatch between the current model and nearest-neighbour prediction. It targets wrongly predicted data points. The second selection criterion is the distance to previously selected data, with the farthest points chosen first. It aims at optimizing the diversity of selected data.<ref name='zhaos' />
*'''User-centered
A wide variety of algorithms have been studied that fall into these categories.<ref name="settles" /><ref name="olsson" /> While the traditional AL strategies can achieve remarkable performance, it is often challenging to predict in advance which strategy is the most suitable in a particular situation. In recent years, meta-learning algorithms have been gaining in popularity. Some of them have been proposed to tackle the problem of learning AL strategies instead of relying on manually designed strategies. A benchmark that compares meta-learning approaches to active learning with traditional heuristic-based active learning may indicate whether "learning active learning" is at a crossroads.<ref>{{cite conference|last1=Desreumaux |first1=Louis |last2=Lemaire|first2=Vincent|title=Learning Active Learning at the Crossroads? Evaluation and Discussion |date=2020 |conference=Proceedings of the Workshop on Interactive Adaptive Learning co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases ({ECML} {PKDD} 2020), Ghent, Belgium, 2020 |s2cid=221794570 }}</ref>
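As an illustration of how two selection criteria can be combined, the mismatch-first farthest-traversal strategy listed above might be sketched as follows. This is one possible reading of the short description given here, not the cited authors' implementation: the function names, the use of Euclidean distance, the fallback to non-mismatched points once the mismatched ones are used up, and the example data are all assumptions.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def min_distance_to(point: Point, selected: Sequence[Point]) -> float:
    """Distance from point to its nearest previously selected point."""
    if not selected:
        return math.inf
    return min(math.dist(point, s) for s in selected)


def mismatch_first_farthest_traversal(
    points: Sequence[Point],
    model_pred: Sequence[str],
    nn_pred: Sequence[str],
    already_selected: Sequence[Point],
    budget: int,
) -> List[int]:
    """Pick up to budget indices into points.

    Primary criterion: prefer points where the model prediction and the
    nearest-neighbour prediction disagree.
    Secondary criterion: among those, take the point farthest from all
    previously selected data (farthest-first traversal).
    """
    chosen: List[int] = []
    reference = list(already_selected)
    remaining = set(range(len(points)))
    while remaining and len(chosen) < budget:
        mismatched = [i for i in remaining if model_pred[i] != nn_pred[i]]
        # Assumption: fall back to all remaining points if none mismatch.
        candidates = sorted(mismatched if mismatched else remaining)
        best = max(candidates,
                   key=lambda i: min_distance_to(points[i], reference))
        chosen.append(best)
        reference.append(points[best])
        remaining.remove(best)
    return chosen


# Hypothetical one-feature data, padded to 2-D coordinates.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (9.0, 0.0)]
model_pred = ["a", "a", "b", "b"]
nn_pred = ["a", "b", "b", "a"]  # disagrees with the model at indices 1 and 3

order = mismatch_first_farthest_traversal(
    points, model_pred, nn_pred, already_selected=[(0.0, 0.0)], budget=3
)
print(order)
```

In this example the two mismatched points are taken first, with the one farther from the existing selection chosen before the nearer one. Only then does the sketch move on to points where the two predictions agree.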
==Minimum marginal hyperplane==
==See also==
* [[List of datasets for machine learning research]]
* [[Sample complexity]]
* [[Bayesian Optimization]]
* [[Reinforcement learning]]
== Literature ==
* Improving Generalization with Active Learning, David Cohn, Les Atlas & Richard Ladner, Machine Learning 15, 201–221 (1994). https://doi.org/10.1007/BF00993277
* Balcan, Maria-Florina & Hanneke, Steve & Wortman, Jennifer. (2008). The True Sample Complexity of Active Learning. 45-56. https://link.springer.com/article/10.1007/s10994-010-5174-y
* Active Learning and [[Bayesian Optimization]]: a Unified Perspective to Learn with a Goal, Francesco Di Fiore, Michela Nardelli, Laura Mainini, https://arxiv.org/abs/2303.01560v2
* Learning how to Active Learn: A Deep Reinforcement Learning Approach, Meng Fang, Yuan Li, Trevor Cohn, https://arxiv.org/abs/1708.02383v1
==References==
{{reflist |refs=
<ref name="hybrid">{{cite journal |last1=Lughofer |first1=Edwin |title=Hybrid active learning for reducing the annotation effort of operators in classification systems |journal=Pattern Recognition |date=February 2012 |volume=45 |issue=2 |pages=884–896 |doi=10.1016/j.patcog.2011.08.009|bibcode=2012PatRe..45..884L }}</ref>
<ref name="Bouneffouf(2014)">{{cite book |first1=Djallel |last1=Bouneffouf |first2=Romain |last2=Laroche |first3=Tanguy |last3=Urvoy |first4=Raphael |last4=Féraud |first5=Robin |last5=Allesiardo |year=2014 |chapter-url=https://hal.archives-ouvertes.fr/hal-01069802 |chapter=Contextual Bandit for Active Learning: Active Thompson |doi=10.1007/978-3-319-12637-1_51 |isbn=978-3-319-12636-4 |id=HAL Id: hal-01069802 |editor=Loo, C. K. |editor2=Yap, K. S. |editor3=Wong, K. W. |editor4=Teoh, A. |editor5=Huang, K. |title=Neural Information Processing |volume=8834 |pages=405–412 |series=Lecture Notes in Computer Science |s2cid=1701357 |url=https://hal.archives-ouvertes.fr/hal-01069802/file/Contextual_Bandit_for_Active_Learning.pdf }}</ref>
<ref name="multi">{{cite
<ref name="single-pass">{{Cite journal | doi=10.1007/s12530-012-9060-7 |title = Single-pass active learning with conflict and ignorance| journal=Evolving Systems| volume=3| issue=4| pages=251–271|year = 2012|last1 = Lughofer|first1 = Edwin|s2cid = 43844282}}</ref>
<ref name="Bouneffouf(2016)">{{cite journal |last1=Bouneffouf |first1=Djallel |title=Exponentiated Gradient Exploration for Active Learning |journal=Computers |date=8 January 2016 |volume=5 |issue=1 |pages=1 |doi=10.3390/computers5010001|arxiv=1408.2196 |s2cid=14313852 |doi-access=free }}</ref>
<ref name="shubhomoydas_github">{{Cite web|url=https://github.com/shubhomoydas/ad_examples#query-diversity-with-compact-descriptions|title=shubhomoydas/ad_examples|website=GitHub|language=en|access-date=2018-12-04}}</ref>
<ref name="zhaos">{{Cite journal|arxiv=2002.05033|title=Active learning for sound event detection|language=en|journal=IEEE/ACM Transactions on Audio, Speech, and Language Processing|last1=Zhao|first1=Shuyang|last2=Heittola|first2=Toni|last3=Virtanen|first3=Tuomas|year=2020|doi=10.1109/TASLP.2020.3029652}}</ref>
}}