Artificial intelligence is used in Wikipedia and other Wikimedia projects for the purpose of developing those projects.[1][2] Human and bot interaction in Wikimedia projects is routine and iterative.[3]
Various Wikipedia articles have been created entirely by, or with the help of, artificial intelligence. AI-generated content can harm Wikipedia when it is unreliable or contains fake citations.
To address the issue of low-quality AI-generated content, the Wikipedia community created a WikiProject named AI Cleanup in 2023. In August 2025, Wikipedia adopted a policy that allows editors to nominate suspected AI-generated articles for speedy deletion.
Using artificial intelligence for Wikipedia
Beginnings
The use of AI to generate Wikipedia articles began with the rise in popularity of chatbots such as ChatGPT in 2022. In 2023, the Wikipedia community recognized the problem and created a WikiProject named AI Cleanup to remove AI-generated content from Wikipedia. The project wrote its own guidelines to help editors spot AI content, and as of 2025 it had compiled a list of over 500 articles pending review for suspected AI writing. Wikipedia has also created a template for suspected AI-generated articles, which has been used on articles such as Danish nationalism and Natalie Portman. In October 2024, a Princeton University study found that about 5% of 3,000 newly created English Wikipedia articles (created in August 2024) had been written with AI. The study noted that some of the AI-assisted articles covered innocuous topics and that AI had likely only been used to help with the writing; in other articles, AI had been used to promote businesses or political interests.
On December 6, 2022, a Wikipedia contributor named Pharos created the article "Artwork title" in his sandbox, stating that he had used ChatGPT as an experiment and would extensively modify the text. He noted that the text needed to be toned down for neutrality. Another editor tagged the article as "original research", arguing that it had begun as unsourced AI-generated content and been sourced afterwards, rather than being based on reliable sources from the outset. A third editor who experimented with this early version of ChatGPT said that its overview of the topic was decent, but that its citations were fabricated. The Wiki Education Foundation reported that some experienced editors found AI useful for starting drafts or creating new articles: ChatGPT "knows" what Wikipedia articles look like and can easily generate text written in Wikipedia's style. It warned editors, however, that ChatGPT tends to use promotional language. Miguel García, a Wikipedia editor from Spain, said that the number of AI-generated articles on the site peaked when ChatGPT was first launched and has since stabilized thanks to the community's efforts to combat it. He said that the majority of articles with no sources are deleted instantly or nominated for deletion.
Signs of AI use and speedy deletion
In August 2025, the Wikipedia community adopted a policy allowing users to nominate suspected AI-generated articles for speedy deletion. Editors usually recognize AI-generated articles by citations that are unrelated to the subject of the article or outright fabricated. The wording of an article is also used to identify AI writing: if it contains language that reads like an LLM response to a user, such as "Here is your Wikipedia article on" or "Up to my last training update", the article is typically tagged for speedy deletion. Other signs of AI use include excessive em dashes, overuse of the word "moreover", promotional language such as describing something as "breathtaking", and formatting issues such as curly quotation marks in place of straight ones. During the discussion on implementing the speedy deletion policy, one user, an article reviewer, said that he is "flooded non-stop with horrendous drafts" created using AI. Other users said that AI articles contain a large amount of "lies and fake references" and that fixing the issues takes a significant amount of time.
Ilyas Lebleu, founder of WikiProject AI Cleanup, said that he and his fellow editors noticed a pattern of unnatural writing that they were able to connect to ChatGPT. He added that AI can mass-produce content that sounds real while being completely fake, which has led to hoax articles on Wikipedia that he was tasked with deleting. Wikipedia created a guide, called "Signs of AI writing", on how to spot AI-generated text. The guide states that AI inserts editorial commentary into its content, listing phrases like "it's important to note", "it is worth", and "no discussion would be complete without" as examples, and that AI tends to end articles with phrases like "In summary", "In conclusion", and "Overall". Fabricated sources are another major sign of AI use: AI is known to hallucinate sources with fake DOIs or ISBNs, or broken 404 links. The guide also notes that ChatGPT is known to insert broken code when adding external links to articles, leaving "turn0search0" in its links.
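The phrase-based signs described above lend themselves to a simple heuristic check. The sketch below is a hypothetical illustration of that idea, not the tooling Wikipedia editors actually use; it scans a piece of text for the tells mentioned in the guide, such as stock LLM phrases, leftover "turn0search0" link artifacts, and curly quotation marks:

```python
import re

# Phrase-level signs drawn from the "Signs of AI writing" guide.
# This is an illustrative heuristic only; real review is done by human editors.
AI_SIGN_PHRASES = [
    "it's important to note",
    "it is worth",
    "no discussion would be complete without",
    "in summary",
    "in conclusion",
    "here is your wikipedia article",
    "up to my last training update",
    "turn0search0",  # broken-link artifact ChatGPT is known to leave behind
]

def ai_writing_signs(text: str) -> list[str]:
    """Return a list of AI-writing signs detected in `text`."""
    lowered = text.lower()
    hits = [phrase for phrase in AI_SIGN_PHRASES if phrase in lowered]
    # Curly quotation marks instead of straight ones are another listed sign.
    if re.search(r"[\u201c\u201d\u2018\u2019]", text):
        hits.append("curly quotation marks")
    # Heavy use of "moreover" and of em dashes are also mentioned as tells.
    if lowered.count("moreover") >= 3:
        hits.append("overuse of 'moreover'")
    if text.count("\u2014") >= 3:
        hits.append("excessive em dashes")
    return hits
```

Any such phrase list produces false positives (human editors also write "in conclusion"), which is one reason these signs are treated as grounds for review rather than automatic deletion.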
Hoaxes and malicious AI use
In 2023, researchers discovered that ChatGPT can unintentionally fabricate information and invent fake articles for its users. At the time, a ban on AI was deemed "too harsh" by the community. AI has also been deliberately used to create hoax articles on Wikipedia. For example, Ilyas Lebleu and his team exposed a well-written 2,000-word article about an Ottoman fortress that never existed; the content, while completely wrong, was difficult to debunk without knowledge of 13th-century Ottoman architecture. In another example, a user added AI-generated misinformation to Estola albosignata, an article about a species of beetle. The paragraph and its citation looked normal, but the cited source was entirely unrelated to the subject: it was a French-language article about an unrelated species of crab.
AI has been used on Wikipedia to push particular political viewpoints in articles covered by contentious-topic guidelines. In one instance, a banned editor used AI to engage in edit wars and manipulate articles on Albanian history. In other instances, users generated articles nominally about political movements or weapons but devoted most of the content to a different subject, such as covering JD Vance or Volodymyr Zelensky in a non-neutral way. Ilyas Lebleu said there are many reasons why users add AI-generated content to Wikipedia, including deliberate vandalism intended to create a hoax, self-promotion, and the mistaken belief that AI-generated content is correct.
ORES
The Objective Revision Evaluation Service (ORES) project is an artificial intelligence service for grading the quality of Wikipedia edits.[4][5] The Wikimedia Foundation presented the ORES project in November 2015.[6]
Wiki bots
Bias reduction
In August 2018, a company called Primer reported attempting to use artificial intelligence to create Wikipedia articles about women as a way to address gender bias on Wikipedia.[10][11]
Generative models
Text
In 2022, the public release of ChatGPT inspired more experimentation with AI and writing Wikipedia articles. A debate was sparked about whether and to what extent such large language models are suitable for such purposes in light of their tendency to generate plausible-sounding misinformation, including fake references; to generate prose that is not encyclopedic in tone; and to reproduce biases.[12][13] Since 2023, work has been done to draft Wikipedia policy on ChatGPT and similar large language models (LLMs), e.g. at times recommending that users who are unfamiliar with LLMs should avoid using them due to the aforementioned risks, as well as noting the potential for libel or copyright infringement.[13] Some relevant policies are linked at WikiProject AI Cleanup/Policies.
Other media
A WikiProject exists for finding and removing AI-generated text and images, called WikiProject AI Cleanup.[14]
Simple Article Summaries
In 2025, the Wikimedia Foundation began testing a "Simple Article Summaries" feature that would provide AI-generated summaries of Wikipedia articles, similar to Google Search's AI Overviews. The decision met immediate and harsh criticism from Wikipedia editors, who called the feature a "ghastly idea" and a "PR hype stunt", warned of a loss of trust in the site given AI's tendency to hallucinate, and questioned whether the feature was necessary.[15] The backlash led Wikimedia to halt the rollout of Simple Article Summaries, though it signaled continued interest in how generative AI could be integrated into Wikipedia.[16]
Using artificial intelligence for other Wikimedia projects
Detox
Detox was a project by Google, in collaboration with the Wikimedia Foundation, to research methods that could be used to address users posting unkind comments in Wikimedia community discussions.[17] Among other parts of the Detox project, the Wikimedia Foundation and Jigsaw collaborated to use artificial intelligence for basic research and to develop technical solutions[example needed] to address the problem. In October 2016 those organizations published "Ex Machina: Personal Attacks Seen at Scale" describing their findings.[18][19] Various popular media outlets reported on the publication of this paper and described the social context of the research.[20][21][22]
Using Wikipedia for artificial intelligence
In developing Google's Perspective API, which identifies toxic comments in online forums, a dataset containing hundreds of thousands of Wikipedia talk page comments with human-labelled toxicity levels was used.[28] Subsets of the Wikipedia corpus are considered the largest well-curated datasets available for AI training.[24][25]
A 2012 paper reported that more than 1,000 academic articles, including those using artificial intelligence, examine Wikipedia, reuse information from Wikipedia, use technical extensions linked to Wikipedia, or research communication about Wikipedia.[29] A 2017 paper described Wikipedia as the mother lode for human-generated text available for machine learning.[30]
A 2016 research project called "One Hundred Year Study on Artificial Intelligence" named Wikipedia as a key early project for understanding the interplay between artificial intelligence applications and human engagement.[31]
There are concerns about the lack of attribution to Wikipedia articles in large language models like ChatGPT.[24][32] While Wikipedia's licensing policy lets anyone use its texts, including in modified form, it requires that credit be given, so AI models that use Wikipedia's content in their answers without clarifying the sourcing may violate its terms of use.[24]
Reactions
In November 2023, Wikipedia co-founder Jimmy Wales said that AI is not a reliable source and that he is not going to use ChatGPT to write Wikipedia articles. In July 2025, he proposed the use of LLMs to provide customized default feedback when drafts are rejected.
Wikimedia Foundation product director Marshall Miller credited WikiProject AI Cleanup with keeping the site's content neutral and reliable, noting that AI makes it easier to create low-quality content. Interviewed by 404 Media, Ilyas Lebleu described speedy deletion as a "band-aid" for the more serious instances of AI use and said that the larger problem of AI use will continue. He also noted that some AI articles are discussed for a week before being deleted.
See also
References
- ^ Marr, Bernard (17 August 2018). "The Amazing Ways How Wikipedia Uses Artificial Intelligence". Forbes.
- ^ Gertner, Jon (18 July 2023). "Wikipedia's Moment of Truth - Can the online encyclopedia help teach A.I. chatbots to get their facts right — without destroying itself in the process?". The New York Times. Archived from the original on 18 July 2023. Retrieved 19 July 2023.
- ^ Piscopo, Alessandro (1 October 2018). "Wikidata: A New Paradigm of Human-Bot Collaboration?". arXiv:1810.00931 [cs.HC].
- ^ Simonite, Tom (1 December 2015). "Software That Can Spot Rookie Mistakes Could Make Wikipedia More Welcoming". MIT Technology Review.
- ^ Metz, Cade (1 December 2015). "Wikipedia Deploys AI to Expand Its Ranks of Human Editors". Wired. Archived from the original on 2 Apr 2024.
- ^ Halfaker, Aaron; Taraborelli, Dario (30 November 2015). "Artificial intelligence service "ORES" gives Wikipedians X-ray specs to see through bad edits". Wikimedia Foundation.
- ^ Hicks, Jesse (18 February 2014). "This machine kills trolls". The Verge. Archived from the original on 27 August 2014. Retrieved 18 February 2014.
- ^ Nasaw, Daniel (25 July 2012). "Meet the 'bots' that edit Wikipedia". BBC News. Archived from the original on 16 September 2018. Retrieved 21 July 2018.
- ^ Raja, Sumit. "Little about the bot that runs Wikipedia, ClueBot NG". Archived from the original on 22 November 2013. Retrieved 11 April 2017.
- ^ Simonite, Tom (3 August 2018). "Using Artificial Intelligence to Fix Wikipedia's Gender Problem". Wired.
- ^ Verger, Rob (7 August 2018). "Artificial intelligence can now help write Wikipedia pages for overlooked scientists". Popular Science.
- ^ Harrison, Stephen (2023-01-12). "Should ChatGPT Be Used to Write Wikipedia Articles?". Slate Magazine. Retrieved 2023-01-13.
- ^ a b Woodcock, Claire (2 May 2023). "AI Is Tearing Wikipedia Apart". Vice.
- ^ Maiberg, Emanuel (October 9, 2024). "The Editors Protecting Wikipedia from AI Hoaxes". 404 Media. Retrieved October 9, 2024.
- ^ "Wikipedia pauses AI summaries after editor revolt". Ars Technica. June 2025. https://arstechnica.com/ai/2025/06/yuck-wikipedia-pauses-ai-summaries-after-editor-revolt/
- ^ "Wikipedia pauses AI-generated summaries pilot after editors protest". TechCrunch. 11 June 2025. https://techcrunch.com/2025/06/11/wikipedia-pauses-ai-generated-summaries-pilot-after-editors-protest/
- ^ Research:Detox - Meta.
- ^ Wulczyn, Ellery; Thain, Nithum; Dixon, Lucas (2017). "Ex Machina: Personal Attacks Seen at Scale". Proceedings of the 26th International Conference on World Wide Web. pp. 1391–1399. arXiv:1610.08914. doi:10.1145/3038912.3052591. ISBN 9781450349130. S2CID 6060248.
- ^ Jigsaw (7 February 2017). "Algorithms And Insults: Scaling Up Our Understanding Of Harassment On Wikipedia". Medium.
- ^ Wakabayashi, Daisuke (23 February 2017). "Google Cousin Develops Technology to Flag Toxic Online Comments". The New York Times.
- ^ Smellie, Sarah (17 February 2017). "Inside Wikipedia's Attempt to Use Artificial Intelligence to Combat Harassment". Motherboard. Vice Media.
- ^ Gershgorn, Dave (27 February 2017). "Alphabet's hate-fighting AI doesn't understand hate yet". Quartz.
- ^ Costa-jussà, Marta R.; Cross, James; Çelebi, Onur; Elbayad, Maha; Heafield, Kenneth; Heffernan, Kevin; Kalbassi, Elahe; Lam, Janice; Licht, Daniel; Maillard, Jean; Sun, Anna; Wang, Skyler; Wenzek, Guillaume; Youngblood, Al; Akula, Bapi; Barrault, Loic; Gonzalez, Gabriel Mejia; Hansanti, Prangthip; Hoffman, John; Jarrett, Semarley; Sadagopan, Kaushik Ram; Rowe, Dirk; Spruit, Shannon; Tran, Chau; Andrews, Pierre; Ayan, Necip Fazil; Bhosale, Shruti; Edunov, Sergey; Fan, Angela; Gao, Cynthia; Goswami, Vedanuj; Guzmán, Francisco; Koehn, Philipp; Mourachko, Alexandre; Ropers, Christophe; Saleem, Safiyyah; Schwenk, Holger; Wang, Jeff (June 2024). "Scaling neural machine translation to 200 languages". Nature. 630 (8018): 841–846. Bibcode:2024Natur.630..841N. doi:10.1038/s41586-024-07335-x. ISSN 1476-4687. PMC 11208141. PMID 38839963.
- ^ a b c d "Wikipedia's Moment of Truth". New York Times. 18 July 2023. Retrieved 29 November 2024.
- ^ a b Johnson, Isaac; Lescak, Emily (2022). "Considerations for Multilingual Wikipedia Research". arXiv:2204.02483 [cs.CY].
- ^ Mamadouh, Virginie (2020). "Wikipedia: Mirror, Microcosm, and Motor of Global Linguistic Diversity". Handbook of the Changing World Language Map. Springer International Publishing. pp. 3773–3799. doi:10.1007/978-3-030-02438-3_200. ISBN 978-3-030-02438-3.
Some versions have expanded dramatically using machine translation through the work of bots or web robots generating articles by translating them automatically from the other Wikipedias, often the English Wikipedia. […] In any event, the English Wikipedia is different from the others because it clearly serves a global audience, while other versions serve more localized audience, even if the Portuguese, Spanish, and French Wikipedias also serves a public spread across different continents
- ^ Khincha, Siddharth; Jain, Chelsi; Gupta, Vivek; Kataria, Tushar; Zhang, Shuo (2023). "InfoSync: Information Synchronization across Multilingual Semi-structured Tables". arXiv:2307.03313 [cs.CL].
- ^ "Google's comment-ranking system will be a hit with the alt-right". Engadget. 2017-09-01.
- ^ Nielsen, Finn Årup (2012). "Wikipedia Research and Tools: Review and Comments". SSRN Working Paper Series. doi:10.2139/ssrn.2129874. ISSN 1556-5068.
- ^ Mehdi, Mohamad; Okoli, Chitu; Mesgari, Mostafa; Nielsen, Finn Årup; Lanamäki, Arto (March 2017). "Excavating the mother lode of human-generated text: A systematic review of research that uses the wikipedia corpus". Information Processing & Management. 53 (2): 505–529. doi:10.1016/j.ipm.2016.07.003. S2CID 217265814.
- ^ "AI Research Trends - One Hundred Year Study on Artificial Intelligence (AI100)". ai100.stanford.edu.
- ^ "Wikipedia Built the Internet's Brain. Now Its Leaders Want Credit". Observer. 28 March 2025. Retrieved 2 April 2025.
Attributions, however, remain a sticking point. Citations not only give credit but also help Wikipedia attract new editors and donors. " If our content is getting sucked into an LLM without attribution or links, that's a real problem for us in the short term,"
- ^ Villalobos, Pablo; Ho, Anson; Sevilla, Jaime; Besiroglu, Tamay; Heim, Lennart; Hobbhahn, Marius (2022). "Will we run out of data? Limits of LLM scaling based on human-generated data". arXiv:2211.04325 [cs.LG].