Learning classifier system: Difference between revisions

 
==== Parameter updates/credit assignment/learning ====
In the sixth step, the rule parameters of any rule in [M] are updated to reflect the new experience gained from the current training instance. Depending on the LCS algorithm, a number of updates can take place at this step. For supervised learning, we can simply update the accuracy/error of a rule. Rule accuracy/error is different from model accuracy/error, since it is not calculated over the entire training data, but only over the instances that the rule matched. Rule accuracy is calculated by dividing the number of times the rule was in a correct set [C] by the number of times it was in a match set [M]. Rule accuracy can be thought of as a 'local accuracy'. Rule fitness is also updated here, and is commonly calculated as a function of rule accuracy. The concept of fitness is taken directly from classic [[genetic algorithm]]s. Be aware that there are many variations on how an LCS updates parameters in order to perform credit assignment and learning.
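The supervised-learning update described above can be sketched as follows. This is a minimal illustration, not any particular published LCS: the `Rule` class, the counter names, and the power-function fitness with exponent `nu` are all assumptions chosen for clarity (accuracy-based systems commonly use some monotone function of accuracy, but the exact function varies by algorithm).

```python
class Rule:
    """A classifier rule's bookkeeping parameters (illustrative only)."""
    def __init__(self):
        self.match_count = 0    # times this rule appeared in a match set [M]
        self.correct_count = 0  # times this rule appeared in a correct set [C]
        self.accuracy = 0.0     # 'local accuracy' over matched instances only
        self.fitness = 0.0      # commonly a function of rule accuracy

def update_rule(rule, in_correct_set, nu=5.0):
    """Update one rule in [M] after seeing a training instance.

    `nu` is a hypothetical fitness exponent; the fitness function
    differs between LCS algorithms.
    """
    rule.match_count += 1
    if in_correct_set:
        rule.correct_count += 1
    # Rule accuracy: correct-set appearances divided by match-set
    # appearances -- a local accuracy, not whole-model accuracy.
    rule.accuracy = rule.correct_count / rule.match_count
    # Fitness as a (here, power) function of accuracy.
    rule.fitness = rule.accuracy ** nu
```

For example, a rule that was in [C] on two of the three instances it matched has a local accuracy of 2/3, regardless of how the model performs on the rest of the training data.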
 
==== Subsumption ====
 
=== The revolution ===
Interest in learning classifier systems was reinvigorated in the mid-1990s largely due to two events: the development of the [[Q-learning|Q-Learning]] algorithm<ref>Watkins, Christopher John Cornish Hellaby. "Learning from delayed rewards." PhD diss., University of Cambridge, 1989.</ref> for [[reinforcement learning]], and the introduction of significantly simplified Michigan-style LCS architectures by Stewart Wilson.<ref name=":10">{{Cite journal|last=Wilson|first=Stewart W.|date=1995-06-01|title=Classifier Fitness Based on Accuracy|url=http://dx.doi.org/10.1162/evco.1995.3.2.149|journal=Evol. Comput.|volume=3|issue=2|pages=149–175|doi=10.1162/evco.1995.3.2.149|issn=1063-6560}}</ref><ref name=":6">{{Cite journal|last=Wilson|first=Stewart W.|date=1994-03-01|title=ZCS: A Zeroth Level Classifier System|url=http://dx.doi.org/10.1162/evco.1994.2.1.1|journal=Evolutionary Computation|volume=2|issue=1|pages=1–18|doi=10.1162/evco.1994.2.1.1|issn=1063-6560}}</ref> Wilson's '''Zeroth-level Classifier System (ZCS)'''<ref name=":6" /> focused on increasing algorithmic understandability based on Holland's standard LCS implementation.<ref name=":4" /> This was done, in part, by removing rule-bidding and the internal message list, both essential to the original BBA credit assignment, and replacing them with a hybrid BBA/[[Q-learning|Q-Learning]] strategy. ZCS demonstrated that a much simpler LCS architecture could perform as well as the original, more complex implementations. However, ZCS still suffered from performance drawbacks, including the proliferation of over-general classifiers.
 
In 1995, Wilson published his landmark paper, "Classifier fitness based on accuracy", in which he introduced the classifier system '''XCS'''.<ref name=":10" /> XCS took the simplified architecture of ZCS and added an accuracy-based fitness, a niche GA (acting in the action set [A]), an explicit generalization mechanism called ''subsumption'', and an adaptation of the [[Q-learning|Q-Learning]] credit assignment. XCS was popularized by its ability to reach optimal performance while evolving accurate and maximally general classifiers, as well as its impressive problem flexibility (able to perform both [[reinforcement learning]] and [[supervised learning]]). XCS later became the best known and most studied LCS algorithm and defined a new family of ''accuracy-based LCS''; ZCS, alternatively, became synonymous with ''strength-based LCS''. XCS is also important because it successfully bridged the gap between LCS and the field of [[reinforcement learning]]. Following the success of XCS, LCS were later described as reinforcement learning systems endowed with a generalization capability.<ref>{{Cite journal|last=Lanzi|first=P. L.|title=Learning classifier systems from a reinforcement learning perspective|url=http://link.springer.com/article/10.1007/s005000100113|journal=Soft Computing|language=en|volume=6|issue=3-4|pages=162–170|doi=10.1007/s005000100113|issn=1432-7643}}</ref> [[Reinforcement learning]] typically seeks to learn a value function that maps out a complete representation of the state/action space. Similarly, the design of XCS drives it to form an all-inclusive and accurate representation of the problem space (i.e. a ''complete map'') rather than focusing on high-payoff niches in the environment (as was the case with strength-based LCS). Conceptually, complete maps do not only capture what you should do, or what is correct, but also what you should not do, or what is incorrect.
In contrast, most strength-based LCSs, and exclusively supervised learning LCSs, seek a rule set of efficient generalizations in the form of a ''best action map'' (or a ''partial map''). Comparisons between strength-based and accuracy-based fitness, and between complete and best action maps, have since been examined in greater detail.<ref>Kovacs, Timothy Michael Douglas. ''A Comparison of Strength and Accuracy-based Fitness in Learning and Classifier Systems''. 2002.</ref><ref>[http://link.springer.com/chapter/10.1007/3-540-48104-4_6 Kovacs, Tim. "Two views of classifier systems." In ''International Workshop on Learning Classifier Systems'', pp. 74-87. Springer Berlin Heidelberg, 2001]</ref>
The name "Learning Classifier System (LCS)" is a bit misleading, since there are many [[machine learning]] algorithms that 'learn to classify' (e.g. [[decision tree]]s, [[artificial neural network]]s) but are not LCSs. The term 'rule-based machine learning (RBML)' is useful, as it more clearly captures the essential 'rule-based' component of these systems, but it also generalizes to methods that are not considered to be LCSs (e.g. [[association rule learning]], or [[artificial immune system]]s). More general terms such as 'genetics-based machine learning', and even 'genetic algorithm',<ref name=":8">Congdon, Clare Bates. "A comparison of genetic algorithms and other machine learning systems on a complex classification task from common disease research." PhD diss., The University of Michigan, 1995.</ref> have also been applied to refer to what would be more characteristically defined as a learning classifier system. Due to their similarity to [[genetic algorithm]]s, Pittsburgh-style learning classifier systems are sometimes generically referred to as 'genetic algorithms'. Beyond this, some LCS algorithms, or closely related methods, have been referred to as 'cognitive systems',<ref name=":2" /> 'adaptive agents', '[[production system (computer science)|production system]]s', or generically as a 'classifier system'.<ref>{{Cite journal|last=Booker|first=L. B.|last2=Goldberg|first2=D. E.|last3=Holland|first3=J. H.|date=1989-09-01|title=Classifier systems and genetic algorithms|url=http://www.sciencedirect.com/science/article/pii/0004370289900507|journal=Artificial Intelligence|volume=40|issue=1|pages=235–282|doi=10.1016/0004-3702(89)90050-7}}</ref><ref>Wilson, Stewart W., and David E. Goldberg. "A critical review of classifier systems." In ''Proceedings of the third international conference on Genetic algorithms'', pp. 244-255. Morgan Kaufmann Publishers Inc., 1989.</ref> This variation in terminology contributes to some confusion in the field.
 
Until the 2000s, nearly all learning classifier system methods were developed with reinforcement learning problems in mind. As a result, the term 'learning classifier system' was commonly defined as the combination of 'trial-and-error' reinforcement learning with the global search of a genetic algorithm. Interest in supervised learning applications, and even unsupervised learning, has since broadened the use and definition of this term.
 
== See also ==