Algorithmic bias

{{Discrimination sidebar}}
 
'''Algorithmic bias''' describes a systematic and repeatable harmful tendency in a [[Computer System|computerized]] [[Sociotechnical system|sociotechnical]] system to create "[[#Defining fairness|unfair]]" outcomes, such as "privileging" one category over another in ways different from the intended function of the algorithm.<ref>{{Citation |last1=Hardebolle |first1=Cécile |title=Engineering ethics education and artificial intelligence |date=2024-11-25 |work=The Routledge International Handbook of Engineering Ethics Education |pages=125–142 |edition=1 |place=London |publisher=Routledge |language=en |doi=10.4324/9781003464259-9 |isbn=978-1-003-46425-9 |last2=Héder |first2=Mihály |last3=Ramachandran |first3=Vivek|doi-access=free }}</ref>
 
Bias can emerge from many factors, including but not limited to the design of the algorithm or the unintended or unanticipated use or decisions relating to the way data is coded, collected, selected or used to train the algorithm.<ref>{{cite journal|last=Van Eyghen|first= Hans|title=AI Algorithms as (Un)virtuous Knowers|journal=Discover Artificial Intelligence|volume=5|issue=2|date=2025|doi= 10.1007/s44163-024-00219-z|url=https://link.springer.com/article/10.1007/s44163-024-00219-z|doi-access=free}}</ref> For example, algorithmic bias has been observed in [[Search engine bias|search engine results]] and [[social media bias|social media platforms]]. This bias can have impacts ranging from inadvertent privacy violations to reinforcing [[Bias|social biases]] of race, gender, sexuality, and ethnicity. The study of algorithmic bias is most concerned with algorithms that reflect "systematic and unfair" discrimination.<ref>{{Cite book |last=Marabelli |first=Marco |url=https://link.springer.com/book/10.1007/978-3-031-53919-0 |title=AI, Ethics, and Discrimination in Business |series=Palgrave Studies in Equity, Diversity, Inclusion, and Indigenization in Business |publisher=Springer |year=2024 |isbn=978-3-031-53918-3 |language=en |doi=10.1007/978-3-031-53919-0}}</ref> This bias has only recently been addressed in legal frameworks, such as the European Union's [[General Data Protection Regulation]] (proposed 2018) and the [[Artificial Intelligence Act]] (proposed 2021, approved 2024).
 
As algorithms expand their ability to organize society, politics, institutions, and behavior, sociologists have become concerned with the ways in which unanticipated output and manipulation of data can impact the physical world. Because algorithms are often considered to be neutral and unbiased, they can inaccurately project greater authority than human expertise (in part due to the psychological phenomenon of [[automation bias]]), and in some cases, reliance on algorithms can displace human responsibility for their outcomes. Bias can enter into algorithmic systems as a result of pre-existing cultural, social, or institutional expectations; by how features and labels are chosen; because of technical limitations of their design; or by being used in unanticipated contexts or by audiences who are not considered in the software's initial design.<ref>{{Cite book |last1=Suresh |first1=Harini |last2=Guttag |first2=John |title=Equity and Access in Algorithms, Mechanisms, and Optimization |chapter=A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle |date=2021-11-04 |chapter-url=https://dl.acm.org/doi/10.1145/3465416.3483305 |series=EAAMO '21 |___location=New York, NY, USA |publisher=Association for Computing Machinery |pages=1–9 |doi=10.1145/3465416.3483305 |isbn=978-1-4503-8553-4|s2cid=235436386 }}</ref>
 
Algorithmic bias has been cited in cases ranging from election outcomes to the spread of [[online hate speech]]. It has also arisen in criminal justice,<ref>{{Cite journal |last=Krištofík |first=Andrej |date=2025-04-28 |title=Bias in AI (Supported) Decision Making: Old Problems, New Technologies |journal=International Journal for Court Administration |language=en |volume=16 |issue=1 |doi=10.36745/ijca.598 |issn=2156-7964|doi-access=free }}</ref> healthcare, and hiring, compounding existing racial, socioeconomic, and gender biases. The relative inability of facial recognition technology to accurately identify darker-skinned faces has been linked to multiple wrongful arrests of black men, an issue stemming from imbalanced datasets. Problems in understanding, researching, and discovering algorithmic bias persist due to the proprietary nature of algorithms, which are typically treated as trade secrets. Even when full transparency is provided, the complexity of certain algorithms poses a barrier to understanding their functioning. Furthermore, algorithms may change, or respond to input or output in ways that cannot be anticipated or easily reproduced for analysis. In many cases, even within a single website or application, there is no single "algorithm" to examine, but a network of many interrelated programs and data inputs, even between users of the same service.
 
A 2021 survey identified multiple forms of algorithmic bias, including historical, representation, and measurement biases, each of which can contribute to unfair outcomes.<ref>{{Cite journal |last1=Mehrabi |first1=N. |last2=Morstatter |first2=F. |last3=Saxena |first3=N. |last4=Lerman |first4=K. |last5=Galstyan |first5=A. |title=A survey on bias and fairness in machine learning |journal=ACM Computing Surveys |volume=54 |issue=6 |pages=1–35 |year=2021 |doi=10.1145/3457607 |arxiv=1908.09635 |url=https://dl.acm.org/doi/10.1145/3457607 |access-date=April 30, 2025}}</ref>
 
== Definitions ==
Fairness, Accountability, and Transparency (FAT) of algorithms has emerged as its own interdisciplinary research area with an annual conference called FAccT.<ref>{{Cite web|url=https://facctconference.org/2021/press-release.html|title=ACM FAccT 2021 Registration|website=fatconference.org|access-date=2021-11-14}}</ref> Critics have suggested that FAT initiatives cannot serve effectively as independent watchdogs when many are funded by corporations building the systems being studied.<ref name="Ochigame">{{cite web |last1=Ochigame |first1=Rodrigo |title=The Invention of "Ethical AI": How Big Tech Manipulates Academia to Avoid Regulation |url=https://theintercept.com/2019/12/20/mit-ethical-ai-artificial-intelligence/ |website=The Intercept |access-date=11 February 2020 |date=20 December 2019}}</ref>
 
NIST’s AI Risk Management Framework 1.0 and its 2024 Generative AI Profile provide practical guidance for governing and measuring bias mitigation in AI systems.<ref>{{cite web |title=Artificial Intelligence Risk Management Framework (AI RMF 1.0) |publisher=National Institute of Standards and Technology |date=January 2023 |url=https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf}}</ref>
 
== Types ==
 
==== Language bias ====
Language bias refers to a type of statistical sampling bias tied to the language of a query that leads to "a systematic deviation in sampling information that prevents it from accurately representing the true coverage of topics and views available in their repository."<ref name=":3">{{Citation |last1=Luo |first1=Queenie |title=A Perspectival Mirror of the Elephant: Investigating Language Bias on Google, ChatGPT, Wikipedia, and YouTube |date=2023-05-23 |arxiv=2303.16281 |last2=Puett |first2=Michael J. |last3=Smith |first3=Michael D.}}</ref> Luo et al.'s work<ref name=":3" /> shows that current large language models, as they are predominantly trained on English-language data, often present Anglo-American views as truth, while systematically downplaying non-English perspectives as irrelevant, wrong, or noise. When queried with political ideologies like "What is liberalism?", ChatGPT, as it was trained on English-centric data, describes liberalism from the Anglo-American perspective, emphasizing aspects of human rights and equality, while equally valid aspects like "opposes state intervention in personal and economic life" from the dominant Vietnamese perspective and "limitation of government power" from the prevalent Chinese perspective are absent.<ref name=":3" /> Similarly, language models may exhibit bias against people within a language group based on the specific dialect they use.<ref>{{cite journal |last1=Hofmann |first1=Valentin |last2=Kalluri |first2=Pratyusha Ria |last3=Jurafsky |first3=Dan |last4=King |first4=Sharese |title=AI generates covertly racist decisions about people based on their dialect |journal=Nature |date=5 September 2024 |volume=633 |issue=8028 |pages=147–154 |doi=10.1038/s41586-024-07856-5|pmid=39198640 |pmc=11374696 |bibcode=2024Natur.633..147H }}</ref>
 
==== Selection bias ====
[[Selection bias]] refers to the inherent tendency of large language models to favor certain option identifiers irrespective of the actual content of the options. This bias primarily stems from token bias—that is, the model assigns a higher a priori probability to specific answer tokens (such as “A”) when generating responses. As a result, when the ordering of options is altered (for example, by systematically moving the correct answer to different positions), the model’s performance can fluctuate significantly. This phenomenon undermines the reliability of large language models in multiple-choice settings.<ref>{{Citation |last1=Choi |first1=Hyeong Kyu |last2=Xu |first2=Weijie |last3=Xue |first3=Chi |last4=Eckman |first4=Stephanie |last5=Reddy |first5=Chandan K. |title=Mitigating Selection Bias with Node Pruning and Auxiliary Options |date=2024-09-27 |arxiv=2409.18857}}</ref><ref>{{Citation |last1=Zheng |first1=Chujie |last2=Zhou |first2=Hao |last3=Meng |first3=Fandong |last4=Zhou |first4=Jie |last5=Huang |first5=Minlie |title=Large Language Models Are Not Robust Multiple Choice Selectors |date=2023-09-07 |arxiv=2309.03882}}</ref>
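The effect can be illustrated with a small, self-contained simulation (a hypothetical sketch, not code from the cited papers): a toy "model" that falls back on the option letter "A" with a fixed probability appears more or less accurate depending only on where the correct answer is placed.

<syntaxhighlight lang="python">
import random
from collections import Counter

LETTERS = ["A", "B", "C", "D"]

def build_prompt(question, options):
    """Format a four-option multiple-choice prompt."""
    lines = [question] + [f"{letter}. {opt}" for letter, opt in zip(LETTERS, options)]
    return "\n".join(lines) + "\nAnswer with A, B, C, or D."

def toy_model(prompt, correct_letter, token_bias=0.4):
    """Stand-in for a language model call: answers correctly, except that with
    probability `token_bias` it emits the letter "A" regardless of content,
    mimicking the a-priori token preference described above."""
    return "A" if random.random() < token_bias else correct_letter

def probe_selection_bias(question, correct, distractors, trials=1000):
    """Rotate the correct answer through positions A-D and count correct answers
    per position; a large spread across positions indicates selection bias."""
    correct_by_position = Counter()
    base = [correct] + distractors
    for _ in range(trials):
        for shift in range(len(LETTERS)):
            rotated = base[-shift:] + base[:-shift] if shift else list(base)
            correct_letter = LETTERS[rotated.index(correct)]
            answer = toy_model(build_prompt(question, rotated), correct_letter)
            if answer == correct_letter:
                correct_by_position[correct_letter] += 1
    return correct_by_position

# The answer is always found when it sits at "A", but only ~60% of the time elsewhere.
print(probe_selection_bias("Which planet is largest?", "Jupiter",
                           ["Mars", "Venus", "Mercury"]))
</syntaxhighlight>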
 
==== Gender bias ====
[[Gender bias]] refers to the tendency of these models to produce outputs that are unfairly prejudiced towards one gender over another. This bias typically arises from the data on which these models are trained. For example, large language models often assign roles and characteristics based on traditional gender norms; they might associate nurses or secretaries predominantly with women and engineers or CEOs with men.<ref>{{Cite book |last1=Busker |first1=Tony |last2=Choenni |first2=Sunil |last3=Shoae Bargh |first3=Mortaza |chapter=Stereotypes in ChatGPT: An empirical study |date=2023-11-20 |title=Proceedings of the 16th International Conference on Theory and Practice of Electronic Governance |chapter-url=https://dl.acm.org/doi/10.1145/3614321.3614325 |series=ICEGOV '23 |___location=New York, NY, USA |publisher=Association for Computing Machinery |pages=24–32 |doi=10.1145/3614321.3614325 |isbn=979-8-4007-0742-1}}</ref><ref>{{Cite book |last1=Kotek |first1=Hadas |last2=Dockum |first2=Rikker |last3=Sun |first3=David |chapter=Gender bias and stereotypes in Large Language Models |date=2023-11-05 |title=Proceedings of the ACM Collective Intelligence Conference |chapter-url=https://dl.acm.org/doi/10.1145/3582269.3615599 |series=CI '23 |___location=New York, NY, USA |publisher=Association for Computing Machinery |pages=12–24 |doi=10.1145/3582269.3615599 |arxiv=2308.14921 |isbn=979-8-4007-0113-9}}</ref>
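How skewed training data becomes a skewed association can be sketched with a simple co-occurrence count (an illustrative toy example with invented sentences, not real training data or the method of any particular model):

<syntaxhighlight lang="python">
from collections import Counter
from itertools import product

# Made-up miniature "corpus" in which "she" co-occurs with "nurse" and
# "he" with "engineer" more often than the reverse pairings.
toy_corpus = [
    "she works as a nurse", "she is a nurse", "he visited the nurse",
    "he works as an engineer", "he is an engineer", "she met the engineer",
]

def cooccurrence(corpus, targets, contexts):
    """Count how often each (target, context) word pair appears in the same sentence."""
    counts = Counter()
    for sentence in corpus:
        words = set(sentence.split())
        for t, c in product(targets, contexts):
            if t in words and c in words:
                counts[(t, c)] += 1
    return counts

# ('she', 'nurse') and ('he', 'engineer') each co-occur twice; the crossed
# pairings only once. A model fit to such statistics inherits the skew.
print(cooccurrence(toy_corpus, ["she", "he"], ["nurse", "engineer"]))
</syntaxhighlight>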
 
==== Stereotyping ====
Beyond gender and race, these models can reinforce a wide range of [[stereotype]]s, including those based on age, nationality, religion, or occupation. This can lead to outputs that homogenize, or unfairly generalize or caricature groups of people, sometimes in harmful or derogatory ways.<ref>{{Citation |last1=Cheng |first1=Myra |title=Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models |date=2023-05-29 |arxiv=2305.18189 |last2=Durmus |first2=Esin |last3=Jurafsky |first3=Dan}}</ref><ref>{{cite journal |last1=Wang |first1=Angelina |last2=Morgenstern |first2=Jamie|author2-link=Jamie Morgenstern |last3=Dickerson |first3=John P. |title=Large language models that replace human participants can harmfully misportray and flatten identity groups |journal=Nature Machine Intelligence |date=17 February 2025 |volume=7 |issue=3 |pages=400–411 |doi=10.1038/s42256-025-00986-z|arxiv=2402.01908 }}</ref>
 
A recent focus in research has been on the complex interplay between the grammatical properties of a language and real-world biases that can become embedded in AI systems, potentially perpetuating harmful stereotypes and assumptions. A study of gender bias in language models trained on Icelandic, a highly grammatically gendered language, found that the models exhibited a significant predisposition towards the masculine grammatical gender when referring to occupation terms, even for female-dominated professions.<ref>{{Citation |last1=Friðriksdóttir|first1=Steinunn Rut|title=Gendered Grammar or Ingrained Bias? Exploring Gender Bias in Icelandic Language Models|journal=Lrec-Coling 2024|date=2024|pages=7596–7610|url=https://aclanthology.org/2024.lrec-main.671/|last2=Einarsson|first2=Hafsteinn}}</ref> This suggests the models amplified societal gender biases present in the training data.
 
==== Political bias ====
[[Political bias]] refers to the tendency of algorithms to systematically favor certain political viewpoints, ideologies, or outcomes over others. Language models may also exhibit political biases. Since the training data includes a wide range of political opinions and coverage, the models might generate responses that lean towards particular political ideologies or viewpoints, depending on the prevalence of those views in the data.<ref>{{Cite journal |last1=Feng |first1=Shangbin |last2=Park |first2=Chan Young |last3=Liu |first3=Yuhan |last4=Tsvetkov |first4=Yulia |date=July 2023 |editor-last=Rogers |editor-first=Anna |editor2-last=Boyd-Graber |editor2-first=Jordan |editor3-last=Okazaki |editor3-first=Naoaki |title=From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models |url=https://aclanthology.org/2023.acl-long.656 |journal=Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) |___location=Toronto, Canada |publisher=Association for Computational Linguistics |pages=11737–11762 |doi=10.18653/v1/2023.acl-long.656|doi-access=free |arxiv=2305.08283 }}</ref><ref>{{Cite web |last=Dolan |first=Eric W. |date=2025-02-14 |title=Scientists reveal ChatGPT's left-wing bias — and how to "jailbreak" it |url=https://www.psypost.org/scientists-reveal-chatgpts-left-wing-bias-and-how-to-jailbreak-it/ |access-date=2025-02-14 |website=PsyPost - Psychology News |language=en-US}}</ref>
 
==== Racial bias ====
[[Racism|Racial bias]] refers to the tendency of machine learning models to produce outcomes that unfairly discriminate against or stereotype individuals based on race or ethnicity. This bias often stems from training data that reflects historical and systemic inequalities. For example, AI systems used in hiring, law enforcement, or healthcare may disproportionately disadvantage certain racial groups by reinforcing existing stereotypes or underrepresenting them in key areas. Such biases can manifest in ways like facial recognition systems misidentifying individuals of certain racial backgrounds or healthcare algorithms underestimating the medical needs of minority patients. Addressing racial bias requires careful examination of data, improved transparency in algorithmic processes, and efforts to ensure fairness throughout the AI development lifecycle.<ref>{{Cite web |last=Lazaro |first=Gina |date=May 17, 2024 |title=Understanding Gender and Racial Bias in AI |url=https://www.sir.advancedleadership.harvard.edu/articles/understanding-gender-and-racial-bias-in-ai |access-date=December 11, 2024 |website=Harvard Advanced Leadership Initiative Social Impact Review}}</ref><ref>{{Cite journal |last=Jindal |first=Atin |date=September 5, 2022 |title=Misguided Artificial Intelligence: How Racial Bias is Built Into Clinical Models |journal=Journal of Brown Hospital Medicine |volume=2 |issue=1 |page=38021 |doi=10.56305/001c.38021 |doi-access=free |pmid=40046549 |pmc=11878858 }}</ref>
 
=== Technical ===
 
==== Feedback loops ====
Emergent bias may also create a [[feedback loop]], or recursion, if data collected for an algorithm results in real-world responses which are fed back into the algorithm.<ref name="JouvenalPredPol">{{cite news|last1=Jouvenal|first1=Justin|title=Police are using software to predict crime. Is it a 'holy grail' or biased against minorities?|url=https://www.washingtonpost.com/local/public-safety/police-are-using-software-to-predict-crime-is-it-a-holy-grail-or-biased-against-minorities/2016/11/17/525a6649-0472-440a-aae1-b283aa8e5de8_story.html|newspaper=Washington Post|access-date=25 November 2017|date=17 November 2016}}</ref><ref name="Chamma">{{cite web|last1=Chamma|first1=Maurice|title=Policing the Future|url=https://www.themarshallproject.org/2016/02/03/policing-the-future?ref=hp-2-111#.UyhBLnmlj|website=The Marshall Project|access-date=25 November 2017|date=2016-02-03}}</ref> For example, simulations of the [[predictive policing]] software (PredPol), deployed in Oakland, California, suggested an increased police presence in black neighborhoods based on crime data reported by the public.<ref name="LumIsaac">{{cite journal|last1=Lum|first1=Kristian|last2=Isaac|first2=William|title=To predict and serve?|journal=Significance|date=October 2016|volume=13|issue=5|pages=14–19|doi=10.1111/j.1740-9713.2016.00960.x|doi-access=free}}</ref> The simulation showed that the public reported crime based on the sight of police cars, regardless of what police were doing. The simulation incorporated police car sightings in modeling its crime predictions, and would in turn assign an even larger increase of police presence within those neighborhoods.<ref name="JouvenalPredPol" /><ref name="SmithPredPol">{{cite web|last1=Smith|first1=Jack|title=Predictive policing only amplifies racial bias, study shows|url=https://mic.com/articles/156286/crime-prediction-tool-pred-pol-only-amplifies-racially-biased-policing-study-shows|website=Mic|date=9 October 2016 |access-date=25 November 2017}}</ref><ref name="LumIsaacFAQ">{{cite web|last1=Lum|first1=Kristian|last2=Isaac|first2=William|title=FAQs on Predictive Policing and Bias|url=https://hrdag.org/2016/11/04/faqs-predpol/|website=HRDAG|access-date=25 November 2017|date=1 October 2016}}</ref> The [[Human Rights Data Analysis Group]], which conducted the simulation, warned that in places where racial discrimination is a factor in arrests, such feedback loops could reinforce and perpetuate racial discrimination in policing.<ref name="Chamma" /> Another well-known example of such behavior is [[COMPAS (software)|COMPAS]], software that estimates an individual's likelihood of becoming a criminal offender. The software has been criticized for labeling Black individuals as likely future offenders far more often than others; when those individuals later become registered offenders, that data is fed back into the algorithm, further reinforcing the bias created by the dataset it acts on.<ref>{{Cite journal |last1=Bahl |first1=Utsav |last2=Topaz |first2=Chad |last3=Obermüller |first3=Lea |last4=Goldstein |first4=Sophie |last5=Sneirson |first5=Mira |date=May 21, 2024 |title=Algorithms in Judges' Hands: Incarceration and Inequity in Broward County, Florida |url=https://www.uclalawreview.org/algorithms-in-judges-hands-incarceration-and-inequity-in-broward-county-florida/ |journal=UCLA Law Review |volume=71 |issue=246}}</ref>
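The dynamic can be illustrated with a minimal toy simulation (an illustrative sketch, not the PredPol software or the simulation by Lum and Isaac): two districts have identical true crime rates, but patrols follow the historical record and only patrolled crime gets recorded, so a small initial gap in recorded crime widens on its own.

<syntaxhighlight lang="python">
# Toy feedback loop: identical true crime, but district B begins with a small
# surplus of *recorded* crime. Patrols follow the record; only patrolled crime
# is recorded; the gap therefore widens with no underlying difference.
TRUE_DAILY_CRIMES = [10, 10]      # identical real crime in districts A and B
recorded = [50, 55]               # district B starts slightly higher on paper

for day in range(1, 11):
    target = 0 if recorded[0] > recorded[1] else 1      # patrol the apparent "hot spot"
    recorded[target] += TRUE_DAILY_CRIMES[target]       # only observed crime enters the record
    print(f"day {day:2d}: recorded A={recorded[0]:3d}  B={recorded[1]:3d}  patrolled={'AB'[target]}")
</syntaxhighlight>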
 
Recommender systems such as those used to recommend online videos or news articles can create feedback loops.<ref>{{Cite book|last1=Sun|first1=Wenlong|last2=Nasraoui|first2=Olfa|last3=Shafto|first3=Patrick|title=Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management |chapter=Iterated Algorithmic Bias in the Interactive Machine Learning Process of Information Filtering |date=2018|___location=Seville, Spain|publisher=SCITEPRESS - Science and Technology Publications|pages=110–118|doi=10.5220/0006938301100118|isbn=9789897583308|doi-access=free}}</ref> When users click on content that is suggested by algorithms, it influences the next set of suggestions.<ref>{{Cite journal|last1=Sinha|first1=Ayan|last2=Gleich|first2=David F.|last3=Ramani|first3=Karthik|date=2018-08-09|title=Gauss's law for networks directly reveals community boundaries|journal=Scientific Reports|volume=8|issue=1|pages=11909|doi=10.1038/s41598-018-30401-0|pmid=30093660|pmc=6085300|issn=2045-2322|bibcode=2018NatSR...811909S}}</ref> Over time this may lead to users entering a [[filter bubble]] and being unaware of important or useful content.<ref>{{Cite web|url=https://qz.com/1194566/google-is-finally-admitting-it-has-a-filter-bubble-problem/|title=Google is finally admitting it has a filter-bubble problem|last1=Hao|first1=Karen|website=Quartz|date=February 2018 |access-date=2019-02-26}}</ref><ref>{{Cite web|url=http://fortune.com/2017/04/25/facebook-related-articles-filter-bubbles/|title=Facebook Is Testing This New Feature to Fight 'Filter Bubbles'|website=Fortune|access-date=2019-02-26}}</ref>
 
=== Gender discrimination ===
In 2016, the professional networking site [[LinkedIn]] was discovered to recommend male variations of women's names in response to search queries. The site did not make similar recommendations in searches for men's names. For example, "Andrea" would bring up a prompt asking if users meant "Andrew", but queries for "Andrew" did not ask if users meant to find "Andrea". The company said this was the result of an analysis of users' interactions with the site.<ref name="Day">{{cite web|last1=Day|first1=Matt|title=How LinkedIn's search engine may reflect a gender bias|url=https://www.seattletimes.com/business/microsoft/how-linkedins-search-engine-may-reflect-a-bias/|website=[[The Seattle Times]]|access-date=25 November 2017|date=31 August 2016}}</ref>
 
In 2012, the department store franchise [[Target (company)|Target]] was cited for gathering data points to infer when female customers were pregnant, even if they had not announced it, and then sharing that information with marketing partners.<ref name="CrawfordSchultz">{{cite journal|last1=Crawford|first1=Kate|last2=Schultz|first2=Jason|title=Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms|journal=Boston College Law Review|date=2014|volume=55|issue=1|pages=93–128|url=http://lawdigitalcommons.bc.edu/bclr/vol55/iss1/4/|access-date=18 November 2017}}</ref>{{rp|94}}<ref name="Duhigg">{{Cite news|last1=Duhigg|first1=Charles|title=How Companies Learn Your Secrets|url=https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html|newspaper=The New York Times Magazine |access-date=18 November 2017|date=16 February 2012}}</ref> Because the data had been predicted, rather than directly observed or reported, the company had no legal obligation to protect the privacy of those customers.<ref name="CrawfordSchultz" />{{rp|98}}
 
Web search algorithms have also been accused of bias. Google's results may prioritize pornographic content in search terms related to sexuality, for example, "lesbian". This bias extends to the search engine showing popular but sexualized content in neutral searches. For example, "Top 25 Sexiest Women Athletes" articles displayed as first-page results in searches for "women athletes".<ref name="Noble">{{cite journal|last1=Noble|first1=Safiya|author-link=Safiya Noble|title=Missed Connections: What Search Engines Say about Women|journal=[[Bitch (magazine)|Bitch]] |date=2012|volume=12|issue=4|pages=37–41|url=https://safiyaunoble.files.wordpress.com/2012/03/54_search_engines.pdf}}</ref>{{rp|31}} In 2017, Google adjusted these results along with others that surfaced [[hate groups]], racist views, child abuse and pornography, and other upsetting and offensive content.<ref name="Guynn2">{{cite news|last1=Guynn|first1=Jessica|title=Google starts flagging offensive content in search results|url=https://www.usatoday.com/story/tech/news/2017/03/16/google-flags-offensive-content-search-results/99235548/|access-date=19 November 2017|work=USA TODAY|agency=USA Today|date=16 March 2017}}</ref> Other examples include the display of higher-paying jobs to male applicants on job search websites.<ref name="SimoniteMIT">{{cite web|url=https://www.technologyreview.com/s/539021/probing-the-dark-side-of-googles-ad-targeting-system/|title=Study Suggests Google's Ad-Targeting System May Discriminate|last1=Simonite|first1=Tom|website=MIT Technology Review|publisher=Massachusetts Institute of Technology|access-date=17 November 2017}}</ref> Researchers have also identified that machine translation exhibits a strong tendency towards male defaults.<ref>{{Cite arXiv|eprint = 1809.02208|last1 = Prates|first1 = Marcelo O. R.|last2 = Avelar|first2 = Pedro H. C.|last3 = Lamb|first3 = Luis|title = Assessing Gender Bias in Machine Translation -- A Case Study with Google Translate|year = 2018|class = cs.CY}}</ref> In particular, this is observed in fields linked to unbalanced gender distribution, including [[Science, technology, engineering, and mathematics|STEM]] occupations.<ref>{{Cite journal |doi = 10.1007/s00521-019-04144-6|title = Assessing gender bias in machine translation: A case study with Google Translate|journal = Neural Computing and Applications|year = 2019|last1 = Prates|first1 = Marcelo O. R.|last2 = Avelar|first2 = Pedro H.|last3 = Lamb|first3 = Luís C.|volume = 32|issue = 10|pages = 6363–6381|arxiv = 1809.02208|s2cid = 52179151}}</ref> In fact, current machine translation systems fail to reproduce the real world distribution of female workers.<ref>{{cite news |last1=Claburn |first1=Thomas |title=Boffins bash Google Translate for sexism |url=https://www.theregister.com/2018/09/10/boffins_bash_google_translate_for_sexist_language/ |access-date=28 April 2022 |work=The Register |date=10 September 2018 |language=en}}</ref>
 
In 2015, [[Amazon.com]] turned off an AI system it developed to screen job applications when the company realized it was biased against women.<ref>{{cite news |last1=Dastin |first1=Jeffrey |title=Amazon scraps secret AI recruiting tool that showed bias against women |url=https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G |work=Reuters |date=October 9, 2018}}</ref> The recruitment tool excluded applicants who attended all-women's colleges and resumes that included the word "women's".<ref>{{Cite web|last=Vincent|first=James|date=10 October 2018|title=Amazon reportedly scraps internal AI recruiting tool that was biased against women|url=https://www.theverge.com/2018/10/10/17958784/ai-recruiting-tool-bias-amazon-report|website=The Verge}}</ref> A similar problem emerged with music streaming services: in 2019, it was discovered that the recommender system algorithm used by Spotify was biased against female artists.<ref>{{Cite web|title=Reflecting on Spotify's Recommender System – SongData|date=October 2019 |url=https://songdata.ca/2019/10/01/reflecting-on-spotifys-recommender-system/|access-date=2020-08-07|language=en-US}}</ref> Spotify's song recommendations suggested more male artists than female artists.
 
=== Racial and ethnic discrimination ===
Another study, published in August 2024, investigates how [[large language model]]s perpetuate covert racism, particularly through dialect prejudice against speakers of African American English (AAE). It highlights that these models exhibit more negative stereotypes about AAE speakers than any recorded human biases, while their overt stereotypes are more positive. This discrepancy raises concerns about the potential harmful consequences of such biases in decision-making processes.<ref>{{cite journal |last1=Hofmann |first1=Valentin |last2=Kalluri |first2=Pratyusha Ria |last3=Jurafsky |first3=Dan |last4=King |first4=Sharese |title=AI generates covertly racist decisions about people based on their dialect |journal=Nature |year=2024 |volume=633 |issue=8028 |pages=147–154 |doi=10.1038/s41586-024-07856-5}}</ref>
 
A study published by the [[Anti-Defamation League]] in 2025 found that several major LLMs, including [[ChatGPT]], [[Llama (language model)|Llama]], [[Claude (language model)|Claude]], and [[Gemini (language model)|Gemini]] showed antisemitic bias.<ref>{{cite news |last=Stub |first=Zev |title=Study: ChatGPT, Meta's Llama and all other top AI models show anti-Jewish, anti-Israel bias |url=https://www.timesofisrael.com/study-chatgpt-metas-llama-and-all-other-top-ai-models-show-anti-jewish-anti-israel-bias/ |access-date=2025-03-27 |website=[[The Times of Israel]] |language=en-US |issn=0040-7909}}</ref>
 
A 2018 study found that commercial gender classification systems had significantly higher error rates for darker-skinned women, with error rates up to 34.7%, compared to near-perfect accuracy for lighter-skinned men.<ref>{{Cite conference |last1=Buolamwini |first1=J. |last2=Gebru |first2=T. |title=Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification |book-title=Proceedings of the 1st Conference on Fairness, Accountability and Transparency |pages=77–91 |year=2018 |url=https://proceedings.mlr.press/v81/buolamwini18a.html |access-date=April 30, 2025}}</ref>
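Audits of this kind rely on disaggregated evaluation, computing error rates separately for each subgroup rather than only in aggregate. The sketch below illustrates the idea with made-up toy numbers, not data from the cited study.

<syntaxhighlight lang="python">
from collections import defaultdict

def error_rates_by_group(records):
    """records: iterable of (subgroup, true_label, predicted_label) triples.
    Returns the fraction of misclassified examples per subgroup."""
    errors, totals = defaultdict(int), defaultdict(int)
    for group, truth, pred in records:
        totals[group] += 1
        errors[group] += int(truth != pred)
    return {group: errors[group] / totals[group] for group in totals}

# Made-up toy data: high overall accuracy can hide a large error rate
# concentrated in one subgroup.
toy_records = (
    [("lighter-skinned men", "male", "male")] * 99
    + [("lighter-skinned men", "male", "female")] * 1
    + [("darker-skinned women", "female", "female")] * 65
    + [("darker-skinned women", "female", "male")] * 35
)
print(error_rates_by_group(toy_records))
# {'lighter-skinned men': 0.01, 'darker-skinned women': 0.35}
</syntaxhighlight>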
 
==== Law enforcement and legal proceedings ====
In 2017 a [[Facebook]] algorithm designed to remove online hate speech was found to advantage white men over black children when assessing objectionable content, according to internal Facebook documents.<ref name="AngwinGrassegger">{{cite web|url=https://www.propublica.org/article/facebook-hate-speech-censorship-internal-documents-algorithms|title=Facebook's Secret Censorship Rules Protect White Men From Hate Speech But Not Black Children — ProPublica|last1=Angwin|first1=Julia|last2=Grassegger|first2=Hannes|date=28 June 2017|website=ProPublica|access-date=20 November 2017}}</ref> The algorithm, which is a combination of computer programs and human content reviewers, was created to protect broad categories rather than specific subsets of categories. For example, posts denouncing "Muslims" would be blocked, while posts denouncing "Radical Muslims" would be allowed. An unanticipated outcome of the algorithm is to allow hate speech against black children, because they denounce the "children" subset of blacks, rather than "all blacks", whereas "all white men" would trigger a block, because whites and males are not considered subsets.<ref name="AngwinGrassegger" /> Facebook was also found to allow ad purchasers to target "Jew haters" as a category of users, which the company said was an inadvertent outcome of algorithms used in assessing and categorizing data. The company's design also allowed ad buyers to block African-Americans from seeing housing ads.<ref name="AngwinVarnerTobin">{{cite news|url=https://www.propublica.org/article/facebook-enabled-advertisers-to-reach-jew-haters|title=Facebook Enabled Advertisers to Reach 'Jew Haters' — ProPublica|last1=Angwin|first1=Julia|date=14 September 2017|work=ProPublica|access-date=20 November 2017|last2=Varner|first2=Madeleine|last3=Tobin|first3=Ariana}}</ref>
 
While algorithms are used to track and block hate speech, some were found to be 1.5 times more likely to flag information posted by Black users and 2.2 times more likely to flag information as hate speech if written in [[African American English]].<ref>{{Cite conference|url=https://homes.cs.washington.edu/~msap/pdfs/sap2019risk.pdf|title=The Risk of Racial Bias in Hate Speech Detection|last1=Sap|first1=Maarten|last2=Card|first2=Dallas|last3=Gabriel|first3=Saadia|last4=Choi|first4=Yejin|last5=Smith|first5=Noah A.|book-title=Proceedings of the 57th Annual Meeting of the Association for Computational Linguist|publisher=Association for Computational Linguistics|___location=Florence, Italy|date=28 July – 2 August 2019|pages=1668–1678|url-status=live|archive-url=https://web.archive.org/web/20190814194616/https://homes.cs.washington.edu/~msap/pdfs/sap2019risk.pdf |archive-date=2019-08-14 }}</ref> Slurs and epithets were flagged without context, even when used by communities that have re-appropriated them.<ref>{{Cite web|url=https://www.vox.com/recode/2019/8/15/20806384/social-media-hate-speech-bias-black-african-american-facebook-twitter|title=The algorithms that detect hate speech online are biased against black people|last=Ghaffary|first=Shirin|website=Vox|date=15 August 2019 |access-date=19 February 2020}}</ref>
 
Another study found that 85 out of 100 examined subreddits tended to remove various norm violations, including misogynistic slurs and racist hate speech, highlighting the prevalence of such content in online communities.<ref name="Reddit_Hate_Speech">{{cite journal |last1=Nakajima Wickham |first1=E. |last2=Öhman |first2=E. |year=2022 |title=Hate speech, censorship, and freedom of speech: The changing policies of reddit |journal=Journal of Data Mining &amp; Digital Humanities |issue=NLP4DH |doi=10.46298/jdmdh.9226}}</ref> As platforms like Reddit update their hate speech policies, they must balance free expression with the protection of marginalized communities, emphasizing the need for context-sensitive moderation and nuanced algorithms.<ref name="Reddit_Hate_Speech" />
 
==== Surveillance ====
While algorithmic fairness has been judged on the basis of different aspects of bias – such as gender, race, and socioeconomic status – disability is often left out of the list.<ref>{{Cite journal |last=Pal |first=G.C. |date=September 16, 2011 |title=Disability, Intersectionality and Deprivation: An Excluded Agenda |url=https://journals.sagepub.com/doi/abs/10.1177/097133361102300202?journalCode=pdsa |journal=Psychology and Developing Societies |volume=23 |issue=2 |pages=159–176 |doi=10.1177/097133361102300202 |s2cid=147322669 |via=Sagepub}}</ref><ref>{{Cite journal |last1=Brinkman |first1=Aurora H. |last2=Rea-Sandin |first2=Gianna |last3=Lund |first3=Emily M. |last4=Fitzpatrick |first4=Olivia M. |last5=Gusman |first5=Michaela S. |last6=Boness |first6=Cassandra L. |last7=Scholars for Elevating Equity and Diversity (SEED) |date=2022-10-20 |title=Shifting the discourse on disability: Moving to an inclusive, intersectional focus. |journal=American Journal of Orthopsychiatry |volume=93 |issue=1 |pages=50–62 |language=en |doi=10.1037/ort0000653 |pmid=36265035 |pmc=9951269 |issn=1939-0025 }}</ref> The marginalization that people with disabilities currently face in society is being translated into AI systems and algorithms, creating even more exclusion.<ref>{{Cite web |last=Whittaker |first=Meredith |date=November 2019 |title=Disability, Bias, and AI |url=https://ainowinstitute.org/disabilitybiasai-2019.pdf |access-date=December 2, 2022 |archive-date=March 27, 2023 |archive-url=https://web.archive.org/web/20230327023907/https://ainowinstitute.org/disabilitybiasai-2019.pdf |url-status=dead }}</ref><ref>{{Cite web |title=Mission — Disability is Diversity — Dear Entertainment Industry, THERE'S NO DIVERSITY, EQUITY & INCLUSION WITHOUT DISABILITY |url=https://disabilityisdiversity.com/mission |access-date=2022-12-02 |website=Disability is Diversity |language=en-US}}</ref>
 
The shifting nature of disability and its subjective characterization make it more difficult to address computationally. The lack of historical depth in defining disability, collecting its incidence and prevalence in questionnaires, and establishing recognition adds to the controversy and ambiguity in its quantification and calculation. The definition of disability has long been debated, shifting most recently from a [[Medical model of disability|medical model]] to a [[social model of disability]], which establishes that disability is a result of the mismatch between people's interactions and barriers in their environment, rather than impairments and health conditions. Disabilities can also be situational or temporary,<ref>{{Cite web |title=Microsoft Design |url=https://www.microsoft.com/design/inclusive/ |access-date=2022-12-02 |website=www.microsoft.com |language=en-us}}</ref> and can be considered in a constant state of flux. Disabilities are incredibly diverse,<ref>{{Cite web |last=Pulrang |first=Andrew |title=4 Ways To Understand The Diversity Of The Disability Community |url=https://www.forbes.com/sites/andrewpulrang/2020/01/03/4-ways-to-understand-the-diversity-of-the-disability-community/ |access-date=2022-12-02 |website=Forbes |language=en}}</ref> fall within a large spectrum, and can be unique to each individual. People's identity can vary based on the specific types of disability they experience, how they use assistive technologies, and who they support. The high level of variability across people's experiences greatly personalizes how a disability can manifest. Overlapping identities and intersectional experiences<ref>{{Cite journal |last1=Watermeyer |first1=Brian |last2=Swartz |first2=Leslie |date=2022-10-12 |title=Disability and the problem of lazy intersectionality |url=https://doi.org/10.1080/09687599.2022.2130177 |journal=Disability & Society |volume=38 |issue=2 |pages=362–366 |doi=10.1080/09687599.2022.2130177 |s2cid=252959399 |issn=0968-7599}}</ref> are excluded from statistics and datasets,<ref>{{Cite web |title=Disability Data Report 2021 |url=https://disabilitydata.ace.fordham.edu/reports/disability-data-initiative-2021-report/ |access-date=2022-12-02 |website=Disability Data Initiative |date=May 23, 2021 |language=en}}</ref> and are hence underrepresented or nonexistent in training data.<ref>{{Cite journal |last=White |first=Jason J. G. |date=2020-03-02 |title=Fairness of AI for people with disabilities: problem analysis and interdisciplinary collaboration |url=https://doi.org/10.1145/3386296.3386299 |journal=ACM SIGACCESS Accessibility and Computing |issue=125 |pages=3:1 |doi=10.1145/3386296.3386299 |s2cid=211723415 |issn=1558-2337}}</ref> Therefore, machine learning models are trained inequitably and artificial intelligence systems perpetuate more algorithmic bias.<ref>{{Cite web |title=AI language models show bias against people with disabilities, study finds {{!}} Penn State University |url=https://www.psu.edu/news/information-sciences-and-technology/story/ai-language-models-show-bias-against-people-disabilities/ |access-date=2022-12-02 |website=www.psu.edu |language=en}}</ref> For example, if people with speech impairments are not included in training voice control features and smart AI assistants, they are unable to use the feature, or the responses they receive from a Google Home or Alexa are extremely poor.
 
Given the stereotypes and stigmas that still exist surrounding disabilities, the sensitive nature of revealing these identifying characteristics also carries vast privacy challenges. As disclosing disability information can be taboo and drive further discrimination against this population, there is a lack of explicit disability data available for algorithmic systems to interact with. People with disabilities face additional harms and risks with respect to their social support, cost of health insurance, workplace discrimination and other basic necessities upon disclosing their disability status. Algorithms are further exacerbating this gap by recreating the biases that already exist in societal systems and structures.<ref>{{Cite web |last=Givens |first=Alexandra Reeve |date=2020-02-06 |title=How Algorithmic Bias Hurts People With Disabilities |url=https://slate.com/technology/2020/02/algorithmic-bias-people-with-disabilities.html |access-date=2022-12-02 |website=Slate Magazine |language=en}}</ref><ref>{{Cite journal |last=Morris |first=Meredith Ringel |date=2020-05-22 |title=AI and accessibility |url=https://doi.org/10.1145/3356727 |journal=Communications of the ACM |volume=63 |issue=6 |pages=35–37 |doi=10.1145/3356727 |arxiv=1908.08939 |s2cid=201645229 |issn=0001-0782}}</ref>
The United States has no general legislation controlling algorithmic bias, approaching the problem through various state and federal laws that might vary by industry, sector, and by how an algorithm is used.<ref name="Singer">{{cite news|last1=Singer|first1=Natasha|title=Consumer Data Protection Laws, an Ocean Apart|url=https://www.nytimes.com/2013/02/03/technology/consumer-data-protection-laws-an-ocean-apart.html|access-date=26 November 2017|work=The New York Times|date=2 February 2013}}</ref> Many policies are self-enforced or controlled by the [[Federal Trade Commission]].<ref name="Singer" /> In 2016, the Obama administration released the [[National Artificial Intelligence Research and Development Strategic Plan]],<ref name="ObamaAdmin">{{cite web|last1=Obama|first1=Barack|title=The Administration's Report on the Future of Artificial Intelligence|url=https://obamawhitehouse.archives.gov/blog/2016/10/12/administrations-report-future-artificial-intelligence|website=whitehouse.gov|publisher=National Archives|access-date=26 November 2017|date=12 October 2016}}</ref> which was intended to guide policymakers toward a critical assessment of algorithms. It recommended researchers to "design these systems so that their actions and decision-making are transparent and easily interpretable by humans, and thus can be examined for any bias they may contain, rather than just learning and repeating these biases". Intended only as guidance, the report did not create any legal precedent.<ref name="NSTC">{{cite book|last1=and Technology Council|first1=National Science|title=National Artificial Intelligence Research and Development Strategic Plan|date=2016|publisher=US Government|url=https://obamawhitehouse.archives.gov/sites/default/files/whitehouse_files/microsites/ostp/NSTC/national_ai_rd_strategic_plan.pdf|access-date=26 November 2017}}</ref>{{rp|26}}
 
In 2017, [[New York City]] passed the first [[algorithmic accountability]] bill in the United States.<ref name="Kirchner">{{cite web |last1=Kirchner |first1=Lauren |title=New York City Moves to Create Accountability for Algorithms — ProPublica |url=https://www.propublica.org/article/new-york-city-moves-to-create-accountability-for-algorithms |website=ProPublica |access-date=28 July 2018 |date=18 December 2017}}</ref> The bill, which went into effect on January 1, 2018, required "the creation of a task force that provides recommendations on how information on agency automated decision systems may be shared with the public, and how agencies may address instances where people are harmed by agency automated decision systems."<ref name="NYC">{{cite web |title=The New York City Council - File #: Int 1696-2017 |url=http://legistar.council.nyc.gov/LegislationDetail.aspx?ID=3137815&GUID=437A6A6D-62E1-47E2-9C42-461253F9C6D0 |website=legistar.council.nyc.gov |publisher=New York City Council |access-date=28 July 2018 }}</ref> The task force was required to present findings and recommendations for further regulatory action in 2019.<ref name="Powles">{{cite magazine |last1=Powles |first1=Julia |title=New York City's Bold, Flawed Attempt to Make Algorithms Accountable |url=https://www.newyorker.com/tech/elements/new-york-citys-bold-flawed-attempt-to-make-algorithms-accountable |magazine=The New Yorker |access-date=28 July 2018}}</ref> In 2023, New York City implemented a law requiring employers using automated hiring tools to conduct independent "bias audits" and publish the results. This law marked one of the first legally mandated transparency measures for AI systems used in employment decisions in the United States.<ref>{{Cite web |last=Wiggers |first=Kyle |date=2023-07-05 |title=NYC's anti-bias law for hiring algorithms goes into effect |url=https://techcrunch.com/2023/07/05/nycs-anti-bias-law-for-hiring-algorithms-goes-into-effect/ |access-date=2025-04-16 |website=TechCrunch |language=en-US}}</ref>
On February 11, 2019, under [[Executive Order 13859]], the federal government unveiled the "American AI Initiative," a comprehensive strategy to maintain U.S. leadership in artificial intelligence. The initiative highlights the importance of sustained AI research and development, ethical standards, workforce training, and the protection of critical AI technologies.<ref>{{cite web | url=https://www.federalregister.gov/documents/2019/02/14/2019-02544/maintaining-american-leadership-in-artificial-intelligence | title=Maintaining American Leadership in Artificial Intelligence | date=February 14, 2019 }}</ref> This aligns with broader efforts to ensure transparency, accountability, and innovation in AI systems across public and private sectors. Furthermore, on October 30, 2023, President Joe Biden signed [[Executive Order 14110]], which emphasizes the safe, secure, and trustworthy development and use of artificial intelligence (AI). The order outlines a coordinated, government-wide approach to harness AI's potential while mitigating its risks, including fraud, discrimination, and national security threats, and calls for responsible innovation and collaboration across sectors to ensure that AI benefits society as a whole.<ref>{{cite web | url=https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence | title=Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence | date=November 2023 }}</ref> With this order, the federal government was directed to create best practices for companies to optimize AI's benefits and minimize its harms.<ref>{{cite web | url=https://uk.news.yahoo.com/vp-kamala-harris-unveils-safe-090142553.html | title=VP Kamala Harris Unveils "Safe, Secure & Responsible" AI Guidelines for Federal Agencies | date=March 28, 2024 }}</ref>