===ORES===
The Objective Revision Evaluation Service (ORES) project is an artificial intelligence service for grading the quality of Wikipedia edits.<ref>{{cite web |last1=Simonite |first1=Tom |title=Software That Can Spot Rookie Mistakes Could Make Wikipedia More Welcoming |url=https://www.technologyreview.com/s/544036/artificial-intelligence-aims-to-make-wikipedia-friendlier-and-better/ |website=MIT Technology Review |language=en |date=1 December 2015}}</ref><ref>{{Cite magazine |last1=Metz |first1=Cade |title=Wikipedia Deploys AI to Expand Its Ranks of Human Editors |url=https://www.wired.com/2015/12/wikipedia-is-using-ai-to-expand-the-ranks-of-human-editors/ |magazine=Wired |date=1 December 2015|archive-url=https://web.archive.org/web/20240402000516/https://www.wired.com/2015/12/wikipedia-is-using-ai-to-expand-the-ranks-of-human-editors/|archive-date=2 Apr 2024}}</ref> The Wikimedia Foundation presented the ORES project in November 2015.<ref>{{cite web |last1=Halfaker |first1=Aaron |last2=Taraborelli |first2=Dario |title=Artificial intelligence service "ORES" gives Wikipedians X-ray specs to see through bad edits |url=https://wikimediafoundation.org/2015/11/30/artificial-intelligence-x-ray-specs/ |website=Wikimedia Foundation |date=30 November 2015}}</ref>
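ORES exposed its edit-quality predictions through a public scoring API. The sketch below shows how a client might construct a request to an ORES-style endpoint for the "damaging" and "goodfaith" models; the v3 URL layout reflects the historical public ORES API, but the exact paths and parameters here are illustrative assumptions (ORES has since been succeeded by the Lift Wing service).

```python
# Illustrative sketch: building a request URL for an ORES-style scoring
# endpoint. No network call is made; this only shows the request shape.
from urllib.parse import urlencode

ORES_BASE = "https://ores.wikimedia.org/v3/scores"  # historical endpoint

def build_score_url(wiki: str, rev_ids: list[int], models: list[str]) -> str:
    """Build a URL asking ORES to score the given revisions with the given models."""
    query = urlencode({
        "models": "|".join(models),                       # e.g. damaging|goodfaith
        "revids": "|".join(str(r) for r in rev_ids),      # revisions to score
    })
    return f"{ORES_BASE}/{wiki}/?{query}"

url = build_score_url("enwiki", [123456], ["damaging", "goodfaith"])
print(url)
```

The response from such an endpoint was a JSON object mapping each revision ID to per-model probability scores, which tools could then use to flag likely-damaging edits for human review.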
===Wiki bots===
{{Excerpt|Vandalism on Wikipedia|ClueBot NG}}
===Detox===
In August 2018, a company called Primer reported attempting to use artificial intelligence to create Wikipedia articles about women as a way to address [[gender bias on Wikipedia]].<ref>{{Cite magazine |last1=Simonite |first1=Tom |title=Using Artificial Intelligence to Fix Wikipedia's Gender Problem |url=https://www.wired.com/story/using-artificial-intelligence-to-fix-wikipedias-gender-problem/ |magazine=Wired |date=3 August 2018}}</ref><ref>{{cite web |last1=Verger |first1=Rob |title=Artificial intelligence can now help write Wikipedia pages for overlooked scientists |url=https://www.popsci.com/artificial-intelligence-scientists-wikipedia |website=Popular Science |language=en |date=7 August 2018}}</ref>
===Generative language models===
[[File:DeepL machine translation of English Wikipedia example.png|thumb|Machine translation software such as [[DeepL]] is used by contributors.<ref>{{cite journal |last1=Costa-jussà |first1=Marta R. |last2=Cross |first2=James |last3=Çelebi |first3=Onur |last4=Elbayad |first4=Maha |last5=Heafield |first5=Kenneth |last6=Heffernan |first6=Kevin |last7=Kalbassi |first7=Elahe |last8=Lam |first8=Janice |last9=Licht |first9=Daniel |last10=Maillard |first10=Jean |last11=Sun |first11=Anna |last12=Wang |first12=Skyler |last13=Wenzek |first13=Guillaume |last14=Youngblood |first14=Al |last15=Akula |first15=Bapi |last16=Barrault |first16=Loic |last17=Gonzalez |first17=Gabriel Mejia |last18=Hansanti |first18=Prangthip |last19=Hoffman |first19=John |last20=Jarrett |first20=Semarley |last21=Sadagopan |first21=Kaushik Ram |last22=Rowe |first22=Dirk |last23=Spruit |first23=Shannon |last24=Tran |first24=Chau |last25=Andrews |first25=Pierre |last26=Ayan |first26=Necip Fazil |last27=Bhosale |first27=Shruti |last28=Edunov |first28=Sergey |last29=Fan |first29=Angela |last30=Gao |first30=Cynthia |last31=Goswami |first31=Vedanuj |last32=Guzmán |first32=Francisco |last33=Koehn |first33=Philipp |last34=Mourachko |first34=Alexandre |last35=Ropers |first35=Christophe |last36=Saleem |first36=Safiyyah |last37=Schwenk |first37=Holger |last38=Wang |first38=Jeff |title=Scaling neural machine translation to 200 languages |journal=Nature |date=June 2024 |volume=630 |issue=8018 |pages=841–846 |doi=10.1038/s41586-024-07335-x |language=en |issn=1476-4687}}</ref><ref name="nyt180724"/><ref name="considerations">{{cite web |title=Considerations for Multilingual Wikipedia Research |url=https://arxiv.org/abs/2204.02483}}</ref> More than 40% of Wikipedia's active editors are on the [[English Wikipedia]].<ref>{{cite web |title=InfoSync: Information Synchronization across Multilingual Semi-structured Tables |url=https://arxiv.org/abs/2307.03313}}</ref>]]
[[File:Wikipedia - Artificial intelligence in Wikimedia projects (spoken by AI voice).mp3|thumb|Wikipedia articles can be read using AI voice technology]]
====Text====
In 2022, the public release of [[ChatGPT]] inspired further experimentation with using AI to write Wikipedia articles. A debate was sparked about whether, and to what extent, such [[large language model]]s (LLMs) are suitable for the purpose, in light of their tendency to [[Hallucination (artificial intelligence)|generate plausible-sounding misinformation]], including fake references; to produce prose that is not encyclopedic in tone; and to [[Algorithmic bias|reproduce biases]].<ref>{{Cite web |last=Harrison |first=Stephen |date=2023-01-12 |title=Should ChatGPT Be Used to Write Wikipedia Articles? |url=https://slate.com/technology/2023/01/chatgpt-wikipedia-articles.html |access-date=2023-01-13 |website=Slate Magazine |language=en}}</ref><ref name="vice"/> {{As of|2023|05}}, a draft Wikipedia policy on ChatGPT and similar LLMs recommended that users unfamiliar with them avoid using them, owing to these risks as well as the potential for [[libel]] or [[copyright infringement]].<ref name="vice">{{cite news |last1=Woodcock |first1=Claire |title=AI Is Tearing Wikipedia Apart |url=https://www.vice.com/en/article/v7bdba/ai-is-tearing-wikipedia-apart |work=Vice |date=2 May 2023 |language=en}}</ref>
====Other media====
A [[WikiProject]] exists for finding and removing AI-generated text and images, called WikiProject AI Cleanup.<ref>{{Cite news |last=Maiberg |first=Emanuel |date=October 9, 2024 |title=The Editors Protecting Wikipedia from AI Hoaxes |url=https://www.404media.co/the-editors-protecting-wikipedia-from-ai-hoaxes/ |access-date=October 9, 2024 |work=[[404 Media]]}}</ref>
==Using Wikimedia projects for artificial intelligence==
[[File:Models of high-quality language data – (a) Composition of high-quality datasets - The Pile (left), PaLM (top-right), MassiveText (bottom-right).png|thumb|Datasets of Wikipedia are widely used for training AI models<ref>{{cite web |title=Will we run out of data? Limits of LLM scaling based on human-generated data |url=https://arxiv.org/abs/2211.04325 |access-date=29 November 2024}}</ref>]]
Content in Wikimedia projects is useful as a dataset in advancing artificial intelligence research and applications. For instance, in the development of Google's [[Perspective API]], which identifies toxic comments in online forums, a dataset containing hundreds of thousands of Wikipedia talk page comments with human-labelled toxicity levels was used.<ref>{{Cite news|url=https://www.engadget.com/2017/09/01/google-perspective-comment-ranking-system/|title=Google's comment-ranking system will be a hit with the alt-right|work=Engadget|date=2017-09-01}}</ref> Subsets of the Wikipedia corpus are considered the largest well-curated data sets available for AI training.<ref name="nyt180724"/><ref name="considerations"/>
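The idea behind such training setups can be sketched with a toy example: a classifier learned from human-labelled talk-page comments. The tiny inline dataset and the simple bag-of-words scoring below are illustrative assumptions, not the real (far larger) Wikipedia toxicity corpus or the Perspective API's actual model.

```python
# Toy sketch: learning to flag toxic comments from human-labelled examples.
from collections import Counter

# Hypothetical labelled talk-page comments: 1 = toxic, 0 = non-toxic.
labelled_comments = [
    ("thanks for fixing the citation", 0),
    ("please discuss changes on the talk page", 0),
    ("you are an idiot and your edits are garbage", 1),
    ("what a stupid, useless contribution", 1),
]

def train(data):
    """Count word frequencies separately for each class label."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in data:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Label a comment toxic if its words occur more often in toxic training text."""
    words = text.lower().split()
    toxic = sum(counts[1][w] for w in words)
    clean = sum(counts[0][w] for w in words)
    return 1 if toxic > clean else 0

model = train(labelled_comments)
print(predict(model, "your edits are garbage"))         # 1 (toxic)
print(predict(model, "thanks for the talk page note"))  # 0 (non-toxic)
```

Real systems replace the word counts with a neural text classifier, but the structure is the same: human annotations on Wikipedia discussion text supply the supervision signal.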
A 2012 paper reported that more than 1000 academic articles, including those using artificial intelligence, examine Wikipedia, reuse information from Wikipedia, use technical extensions linked to Wikipedia, or research communication about Wikipedia.<ref>{{cite journal |last1=Nielsen |first1=Finn Årup |title=Wikipedia Research and Tools: Review and Comments |journal=SSRN Working Paper Series |date=2012 |doi=10.2139/ssrn.2129874 |language=en |issn=1556-5068}}</ref> A 2017 paper described Wikipedia as the [[mother lode]] for human-generated text available for machine learning.<ref>{{cite journal |last1=Mehdi |first1=Mohamad |last2=Okoli |first2=Chitu |last3=Mesgari |first3=Mostafa |last4=Nielsen |first4=Finn Årup |last5=Lanamäki |first5=Arto |title=Excavating the mother lode of human-generated text: A systematic review of research that uses the wikipedia corpus |journal=Information Processing & Management |volume=53 |issue=2 |pages=505–529 |doi=10.1016/j.ipm.2016.07.003 |date=March 2017|s2cid=217265814 |url=http://urn.fi/urn:nbn:fi-fe202003057304 }}</ref>
A 2016 research project called "One Hundred Year Study on Artificial Intelligence" named Wikipedia as a key early project for understanding the interplay between artificial intelligence applications and human engagement.<ref>{{cite web |title=AI Research Trends - One Hundred Year Study on Artificial Intelligence (AI100) |url=https://ai100.stanford.edu/2016-report/section-i-what-artificial-intelligence/ai-research-trends |website=ai100.stanford.edu |language=en}}</ref>
Concerns have been raised about the lack of attribution to Wikipedia articles in large language models like ChatGPT.<ref name="nyt180724">{{cite news |title=Wikipedia’s Moment of Truth |url=https://www.nytimes.com/2023/07/18/magazine/wikipedia-ai-chatgpt.html |access-date=29 November 2024 |work=New York Times}}</ref> While Wikipedia's licensing lets anyone reuse its texts, including in modified form, it requires that credit be given; AI models that draw on its content in their answers without indicating the source may therefore violate its terms of use.<ref name="nyt180724"/>
==See also==
{{Commons category|Wikimedia projects and AI}}
* [[:mw:ORES|ORES MediaWiki page]]
* [[Wikipedia:Artificial intelligence]]
* [[Open-source artificial intelligence]]
{{clear}}
==References==
{{reflist}}
==External links==
* [[meta:Artificial intelligence]]
* [[wikitech:Machine Learning/LiftWing]]
{{Wikimedia Foundation|state=collapsed}}