XAI hopes to help users of AI-powered systems perform more effectively by improving their understanding of how those systems reason.<ref>{{Cite journal|last=Alizadeh|first=Fatemeh|date=2021|title=I Don't Know, Is AI Also Used in Airbags?: An Empirical Study of Folk Concepts and People's Expectations of Current and Future Artificial Intelligence|url=https://www.researchgate.net/publication/352638184|journal=Icom|volume=20 |issue=1 |pages=3–17 |doi=10.1515/icom-2021-0009|s2cid=233328352}}</ref> XAI may be an implementation of the social [[right to explanation]].<ref name=":0">{{Cite journal|last1=Edwards|first1=Lilian|last2=Veale|first2=Michael|date=2017|title=Slave to the Algorithm? Why a 'Right to an Explanation' Is Probably Not the Remedy You Are Looking For|journal=Duke Law and Technology Review|volume=16|pages=18|ssrn=2972855}}</ref> Even if there is no such legal right or regulatory requirement, XAI can improve the [[user experience]] of a product or service by helping end users trust that the AI is making good decisions.<ref>{{Cite web |last=Do Couto |first=Mark |date=February 22, 2024 |title=Entering the Age of Explainable AI |url=https://tdwi.org/Articles/2024/02/22/ADV-ALL-Entering-the-Age-of-Explainable-AI.aspx |access-date=2024-09-11 |website=TDWI}}</ref> XAI aims to explain what has been done, what is being done, and what will be done next, and to unveil which information these actions are based on.<ref name=":3">{{Cite journal|last1=Gunning|first1=D.|last2=Stefik|first2=M.|last3=Choi|first3=J.|last4=Miller|first4=T.|last5=Stumpf|first5=S.|last6=Yang|first6=G.-Z.|date=2019-12-18|title=XAI-Explainable artificial intelligence|url=https://openaccess.city.ac.uk/id/eprint/23405/|journal=Science Robotics|language=en|volume=4|issue=37|pages=eaay7120|doi=10.1126/scirobotics.aay7120|pmid=33137719|issn=2470-9476|doi-access=free}}</ref> This makes it possible to confirm existing knowledge, challenge existing knowledge, and generate new assumptions.<ref>{{Cite journal|last1=Rieg|first1=Thilo|last2=Frick|first2=Janek|last3=Baumgartl|first3=Hermann|last4=Buettner|first4=Ricardo|date=2020-12-17|title=Demonstration of the potential of white-box machine learning approaches to gain insights from cardiovascular disease electrocardiograms|journal=PLOS ONE|language=en|volume=15|issue=12|pages=e0243615|doi=10.1371/journal.pone.0243615|issn=1932-6203|pmc=7746264|pmid=33332440|bibcode=2020PLoSO..1543615R|doi-access=free}}</ref>
[[Machine learning]] (ML) algorithms used in AI can be categorized as [[White-box testing|white-box]] or [[Black box|black-box]].<ref>{{Cite journal|last1=Vilone|first1=Giulia|last2=Longo|first2=Luca|title= Classification of Explainable Artificial Intelligence Methods through Their Output Formats |journal=Machine Learning and Knowledge Extraction|year=2021|volume=3|issue=3|pages=615–661|doi=10.3390/make3030032|doi-access=free }}</ref> White-box models provide results that are understandable to experts in the ___domain. Black-box models, on the other hand, are extremely hard to explain and may not be understood even by ___domain experts.<ref>{{Cite journal|last=Loyola-González|first=O.|date=2019|title=Black-Box vs. White-Box: Understanding Their Advantages and Weaknesses From a Practical Point of View|journal=IEEE Access|volume=7|pages=154096–154113|doi=10.1109/ACCESS.2019.2949286|bibcode=2019IEEEA...7o4096L |issn=2169-3536|doi-access=free}}</ref> XAI algorithms follow the three principles of transparency, interpretability, and explainability. A model is transparent "if the processes that extract model parameters from training data and generate labels from testing data can be described and motivated by the approach designer."<ref name=":4">{{Cite journal|last1=Roscher|first1=R.|last2=Bohn|first2=B.|last3=Duarte|first3=M. F.|last4=Garcke|first4=J.|date=2020|title=Explainable Machine Learning for Scientific Insights and Discoveries|journal=IEEE Access|volume=8|pages=42200–42216|doi=10.1109/ACCESS.2020.2976199|arxiv=1905.08883 |bibcode=2020IEEEA...842200R |issn=2169-3536|doi-access=free}}</ref> Interpretability describes the possibility of comprehending the ML model and presenting the underlying basis for decision-making in a way that is understandable to humans.<ref name="Interpretable machine learning: def">{{cite journal|last1=Murdoch|first1=W. 
James|last2=Singh|first2=Chandan|last3=Kumbier|first3=Karl|last4=Abbasi-Asl|first4=Reza|last5=Yu|first5=Bin|date=2019-01-14|title=Interpretable machine learning: definitions, methods, and applications|journal=Proceedings of the National Academy of Sciences of the United States of America|volume=116|issue=44|pages=22071–22080|arxiv=1901.04592|doi=10.1073/pnas.1900654116|pmid=31619572|pmc=6825274|bibcode= |doi-access=free}}</ref><ref name="Lipton 31–57">{{Cite journal|last=Lipton|first=Zachary C.|date=June 2018|title=The Mythos of Model Interpretability: In machine learning, the concept of interpretability is both important and slippery.|journal=Queue|language=en|volume=16|issue=3|pages=31–57|doi=10.1145/3236386.3241340|issn=1542-7730|doi-access=free}}</ref><ref>{{Cite web|date=2019-10-22|title=Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI|url=https://deepai.org/publication/explainable-artificial-intelligence-xai-concepts-taxonomies-opportunities-and-challenges-toward-responsible-ai|access-date=2021-01-13|website=DeepAI}}</ref> Explainability is a concept that is recognized as important, but a consensus definition is not yet available;<ref name=":4" /> one possibility is "the collection of features of the interpretable ___domain that have contributed, for a given example, to producing a decision (e.g., classification or regression)".<ref>{{Cite journal|date=2018-02-01|title=Methods for interpreting and understanding deep neural networks|journal=Digital Signal Processing|language=en|volume=73|pages=1–15|doi=10.1016/j.dsp.2017.10.011|issn=1051-2004|doi-access=free|last1=Montavon|first1=Grégoire|last2=Samek|first2=Wojciech|last3=Müller|first3=Klaus-Robert|arxiv=1706.07979 |bibcode=2018DSP....73....1M |author-link3=Klaus-Robert Müller}}</ref>
In summary, interpretability refers to the user's ability to understand model outputs, while model transparency includes simulatability (reproducibility of predictions), decomposability (intuitive explanations for parameters), and algorithmic transparency (explaining how algorithms work). Model functionality focuses on textual descriptions, visualization, and local explanations, which clarify specific outputs or instances rather than entire models. All of these concepts aim to enhance the comprehensibility and usability of AI systems.
If algorithms fulfill these principles, they provide a basis for justifying decisions, tracking them and thereby verifying them, improving the algorithms, and exploring new facts.<ref>{{Cite journal|last1=Adadi|first1=A.|last2=Berrada|first2=M.|date=2018|title=Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)|journal=IEEE Access|volume=6|pages=52138–52160|doi=10.1109/ACCESS.2018.2870052|bibcode=2018IEEEA...652138A |issn=2169-3536|doi-access=free}}</ref>
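To make the white-box/black-box distinction concrete, the following minimal sketch trains a shallow decision tree, a typical white-box model whose decision rules can be printed and read directly, and a small neural network, a typical black-box model whose raw weights offer no comparable explanation, on the same data. The dataset and hyperparameters are arbitrary illustrative choices rather than recommendations from the sources cited above.

<syntaxhighlight lang="python">
# Illustrative contrast between a white-box and a black-box model.
# Dataset and hyperparameters are arbitrary choices for this sketch.
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
X, y = data.data, data.target

# White-box: a shallow decision tree whose rules can be read directly.
white_box = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(white_box, feature_names=list(data.feature_names)))

# Black-box: a multi-layer perceptron; its weight matrices do not, by
# themselves, explain individual predictions.
black_box = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000,
                          random_state=0).fit(X, y)
print([w.shape for w in black_box.coefs_])
</syntaxhighlight>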
* ''Feature importance'' estimates how important a feature is for the model. It is usually estimated with ''permutation importance'', which measures the decrease in model performance when the feature's values are randomly shuffled across all samples (a minimal sketch follows this list).
* ''LIME'' locally approximates a model's outputs with a simpler, interpretable model (see the second sketch after this list).<ref>{{Cite web |last=Rothman |first=Denis |date=2020-10-07 |title=Exploring LIME Explanations and the Mathematics Behind It |url=https://www.codemotion.com/magazine/ai-ml/lime-explainable-ai/ |access-date=2024-07-10 |website=Codemotion Magazine |language=en-US}}</ref>
* ''[[Multitask learning]]'' provides a large number of outputs in addition to the target classification. These other outputs can help developers deduce what the network has learned.
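The following is a minimal sketch of permutation importance as described above; the dataset, model, and accuracy metric are illustrative assumptions rather than part of the technique itself.

<syntaxhighlight lang="python">
# Minimal sketch of permutation importance: shuffle one feature at a time and
# measure how much the model's validation performance drops.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

baseline = accuracy_score(y_val, model.predict(X_val))
rng = np.random.default_rng(0)
for j in range(X_val.shape[1]):
    X_perm = X_val.copy()
    rng.shuffle(X_perm[:, j])  # break the association between feature j and y
    drop = baseline - accuracy_score(y_val, model.predict(X_perm))
    print(f"feature {j}: performance drop {drop:.4f}")
</syntaxhighlight>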
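A LIME-style explanation can likewise be sketched by perturbing a single instance, querying the black-box model on the perturbations, and fitting a proximity-weighted linear surrogate. The dataset, model, kernel width, and sample count below are illustrative assumptions; practical work would typically use the reference LIME library rather than this hand-rolled version.

<syntaxhighlight lang="python">
# Sketch of a LIME-style local surrogate for one tabular instance.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

x = X[0]                     # instance to explain
rng = np.random.default_rng(0)
scale = X.std(axis=0)

# Sample perturbations around the instance and query the black-box model.
Z = x + rng.normal(scale=scale, size=(1000, X.shape[1]))
probs = model.predict_proba(Z)[:, 1]

# Weight perturbed samples by proximity to the original instance.
dist = np.linalg.norm((Z - x) / scale, axis=1)
weights = np.exp(-dist**2 / (2 * 0.75**2 * X.shape[1]))

# The weighted linear model's coefficients act as local feature effects.
surrogate = Ridge(alpha=1.0).fit(Z - x, probs, sample_weight=weights)
top = np.argsort(-np.abs(surrogate.coef_))[:5]
print("locally most influential features:", top)
</syntaxhighlight>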
For images, [[Saliency map|saliency maps]] highlight the parts of an image that most influenced the result.<ref>{{Cite web |last=Sharma |first=Abhishek |date=2018-07-11 |title=What Are Saliency Maps In Deep Learning? |url=https://analyticsindiamag.com/what-are-saliency-maps-in-deep-learning/ |access-date=2024-07-10 |website=Analytics India Magazine |language=en-US}}</ref>
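A gradient-based saliency map can be sketched as follows; the untrained network and random input merely stand in for a trained image classifier and a real preprocessed image.

<syntaxhighlight lang="python">
# Sketch of a gradient-based saliency map for an image classifier.
import torch
from torchvision.models import resnet18

model = resnet18(weights=None).eval()    # untrained stand-in for a trained model
image = torch.rand(1, 3, 224, 224, requires_grad=True)  # placeholder "image"

scores = model(image)
scores[0, scores.argmax()].backward()    # gradient of the top-scoring class

# Per-pixel gradient magnitude, maximised over colour channels.
saliency = image.grad.abs().max(dim=1).values.squeeze(0)  # shape (224, 224)
print(saliency.shape)
</syntaxhighlight>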
Scholars sometimes use the term "mechanistic interpretability" to refer to the process of [[Reverse engineering|reverse-engineering]] [[artificial neural networks]] to understand their internal decision-making mechanisms and components, similar to how one might analyze a complex machine or computer program.<ref>{{Cite web |last=Olah |first=Chris |date=June 27, 2022 |title=Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases |url=https://www.transformer-circuits.pub/2022/mech-interp-essay |access-date=2024-07-10 |website=www.transformer-circuits.pub}}</ref>
Interpretability research often focuses on generative pretrained transformers. It is particularly relevant for [[AI safety]] and [[AI alignment|alignment]].
Studying the interpretability of the most advanced [[Foundation model|foundation models]] often involves searching for an automated way to identify "features" in generative pretrained transformers. In a [[Neural network (machine learning)|neural network]], a feature is a pattern of neuron activations that corresponds to a concept. A compute-intensive technique called "[[dictionary learning]]" makes it possible to identify features to some degree. Enhancing the ability to identify and edit features is expected to significantly improve the [[AI safety|safety]] of [[Frontier model|frontier AI models]].<ref>{{Cite web |last=Ropek |first=Lucas |date=2024-05-21 |title=New Anthropic Research Sheds Light on AI's 'Black Box' |url=https://gizmodo.com/new-anthropic-research-sheds-light-on-ais-black-box-1851491333 |access-date=2024-05-23 |website=Gizmodo |language=en}}</ref><ref>{{Cite magazine |last=Perrigo |first=Billy |date=2024-05-21 |title=Artificial Intelligence Is a 'Black Box.' Maybe Not For Long |url=https://time.com/6980210/anthropic-interpretability-ai-safety-research/ |access-date=2024-05-24 |magazine=Time |language=en}}</ref>
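As a rough illustration of the idea, dictionary learning can be applied to a matrix of activation vectors so that each learned dictionary atom becomes a candidate feature; the activation matrix below is a random placeholder rather than activations recorded from an actual transformer.

<syntaxhighlight lang="python">
# Sketch of dictionary learning over neuron activations to surface candidate
# "features"; the activations are random placeholders, not real model data.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
activations = rng.normal(size=(1000, 128))   # (samples, hidden units)

# Learn an overcomplete dictionary; each atom is a direction in activation
# space, and each sample is encoded as a sparse combination of atoms.
dico = MiniBatchDictionaryLearning(n_components=256, alpha=1.0,
                                   batch_size=128, random_state=0)
codes = dico.fit_transform(activations)      # sparse coefficients per sample
atoms = dico.components_                     # candidate feature directions

print(codes.shape, atoms.shape)              # (1000, 256), (256, 128)
print("mean nonzero coefficients per sample:", (codes != 0).sum(axis=1).mean())
</syntaxhighlight>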