Revision as of 13:06, 19 December 2024 by DevotedWikiEditor (112 edits): minor edit, no edit summary. Tags: Visual edit, Mobile edit, Mobile web edit.
Revision as of 03:19, 21 January 2025 by Alenoach (extended confirmed user, 5,805 edits): Added an image. Tag: Visual edit.
Line 52:
=== Interpretability ===
[[File:Grokking modular addition.jpg\|thumb\|upright=1.2\|[[Grokking (machine learning)\|Grokking]] is an example of a phenomenon studied in interpretability. It involves a model that initially memorizes all the answers ([[overfitting]]), but later adopts an algorithm that generalizes to unseen data.<ref>{{Cite web \|last=Ananthaswamy \|first=Anil \|date=2024-04-12 \|title=How Do Machines ‘Grok’ Data? \|url=https://www.quantamagazine.org/how-do-machines-grok-data-20240412/ \|access-date=2025-01-21 \|website=Quanta Magazine \|language=en}}</ref>]]
Scholars sometimes use the term "mechanistic interpretability" to refer to the process of [[Reverse engineering\|reverse-engineering]] [[artificial neural networks]] to understand their internal decision-making mechanisms and components, similar to how one might analyze a complex machine or computer program.<ref>{{Cite web \|last=Olah \|first=Chris \|date=June 27, 2022 \|title=Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases \|url=https://www.transformer-circuits.pub/2022/mech-interp-essay \|access-date=2024-07-10 \|website=www.transformer-circuits.pub}}</ref>

Explainable artificial intelligence: Difference between revisions