*'''Exponentiated Gradient Exploration for Active Learning''':<ref name="Bouneffouf(2016)" /> a sequential algorithm named exponentiated gradient (EG)-active that can improve any active learning algorithm by combining it with an optimal random exploration.
*'''Uncertainty sampling''': label those points for which the current model is least certain as to what the correct output should be.
*'''Entropy sampling''': the entropy of the model's predicted class distribution is computed for each unlabeled sample, and the sample with the highest entropy is selected as the next query (see the code sketch after this list).<ref name="Faria2022">{{cite journal |last1=Faria |first1=Bruno |last2=Perdigão |first2=Dylan |last3=Brás |first3=Joana |last4=Macedo |first4=Luis |title=The Joint Role of Batch Size and Query Strategy in Active Learning-Based Prediction – A Case Study in the Heart Attack Domain |journal=Progress in Artificial Intelligence |date=2022 |pages=464–475 |doi=10.1007/978-3-031-16474-3_38}}</ref>
*'''Random sampling''': the next query is selected uniformly at random from the unlabeled pool, often serving as a baseline for other strategies.<ref name="Faria2022" />
*'''Margin sampling''': the difference between the two highest predicted class probabilities (the margin) is computed for each sample, and the sample with the smallest margin is selected as the next query.<ref name="Faria2022" />
*'''Least confident sampling''': only the highest predicted class probability is considered for each sample, and the sample whose top prediction has the lowest probability is selected as the next query.<ref name="Faria2022" />
*'''Query by committee''': a variety of models are trained on the current labeled data and vote on the output for unlabeled data; label those points for which the "committee" disagrees the most (a vote-entropy sketch follows this list).
*'''Querying from diverse subspaces or partitions''':<ref name="shubhomoydas_github"/> When the underlying model is a forest of trees, the leaf nodes might represent (overlapping) partitions of the original [[feature (machine learning)|feature space]]. This offers the possibility of selecting instances from non-overlapping or minimally overlapping partitions for labeling.
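The following is a minimal sketch of how the entropy, margin, least-confident and random selection rules above can be implemented in a pool-based setting. It assumes the model outputs class probabilities for the unlabeled pool as a NumPy array <code>probs</code> of shape <code>(n_samples, n_classes)</code>; the function names are illustrative and not taken from the cited works.

<syntaxhighlight lang="python">
import numpy as np

def entropy_score(probs):
    """Entropy of each predicted class distribution (higher = more uncertain)."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def margin_score(probs):
    """Difference between the two highest class probabilities (lower = more uncertain)."""
    part = np.partition(probs, -2, axis=1)   # last two columns hold the two largest values
    return part[:, -1] - part[:, -2]

def least_confident_score(probs):
    """Highest class probability for each sample (lower = more uncertain)."""
    return probs.max(axis=1)

def select_query(probs, strategy="entropy"):
    """Return the index of the unlabeled sample to query next (illustrative helper)."""
    if strategy == "entropy":
        return int(np.argmax(entropy_score(probs)))          # highest entropy
    if strategy == "margin":
        return int(np.argmin(margin_score(probs)))           # smallest margin
    if strategy == "least_confident":
        return int(np.argmin(least_confident_score(probs)))  # lowest top probability
    if strategy == "random":
        return int(np.random.randint(len(probs)))            # uniform random baseline
    raise ValueError(f"unknown strategy: {strategy}")
</syntaxhighlight>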
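For query by committee, a common disagreement measure is vote entropy. The sketch below assumes each committee member has already produced a hard label prediction for every sample in the unlabeled pool; the array layout and function name are assumptions for illustration, not a specific published implementation.

<syntaxhighlight lang="python">
import numpy as np

def query_by_committee(committee_predictions):
    """Select the pool index with the highest vote entropy among committee members.

    committee_predictions: array of shape (n_members, n_samples) holding each
    member's predicted class label for every unlabeled sample.
    """
    n_members, n_samples = committee_predictions.shape
    vote_entropies = np.empty(n_samples)
    for i in range(n_samples):
        # Fraction of committee votes received by each class for sample i
        _, counts = np.unique(committee_predictions[:, i], return_counts=True)
        fractions = counts / n_members
        # Vote entropy is zero when the committee is unanimous and
        # largest when the votes are evenly split across classes.
        vote_entropies[i] = -np.sum(fractions * np.log(fractions))
    return int(np.argmax(vote_entropies))
</syntaxhighlight>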