Friendly artificial intelligence: Difference between revisions

Citation bot (talk | contribs)
Add: s2cid. | Use this bot. Report bugs. | Suggested by Neko-chan | Category:Philosophy of artificial intelligence | #UCB_Category 35/47
combining duplicate cites
Line 5:
== Etymology and usage ==
[[File:Eliezer Yudkowsky, Stanford 2006 (square crop).jpg|thumb|[[Eliezer Yudkowsky]], AI researcher and creator of the term Friendly artificial intelligence]]
The term was coined by [[Eliezer Yudkowsky]],<ref>{{cite book|last1=Tegmark|first1=Max|title=Our Mathematical Universe: My Quest for the Ultimate Nature of Reality|date=2014|isbn=9780307744258|edition=First|chapter=Life, Our Universe and Everything|quote=Its owner may cede control to what Eliezer Yudkowsky terms a "Friendly AI,"...|title-link=Our Mathematical Universe: My Quest for the Ultimate Nature of Reality}}</ref> who is best known for popularizing the idea,<ref name="aima">{{cite book |last1=Russell |first1=Stuart |author1-link=Stuart J. Russell |last2=Norvig |first2=Peter |author2-link=Peter Norvig |date=2009 |title=Artificial Intelligence: A Modern Approach |publisher=Prentice Hall |isbn=978-0-13-604259-4|title-link=Artificial Intelligence: A Modern Approach }}</ref><ref>{{cite book |last=Leighton |first=Jonathan |date=2011 |title=The Battle for Compassion: Ethics in an Apathetic Universe |publisher=Algora |isbn=978-0-87586-870-7}}</ref> to discuss [[superintelligence|superintelligent]] artificial agents that reliably implement human values. [[Stuart J. Russell]] and [[Peter Norvig]]'s leading [[artificial intelligence]] textbook, ''[[Artificial Intelligence: A Modern Approach]]'', describes the idea:<ref name="aima" />
 
<blockquote>Yudkowsky (2008) goes into more detail about how to design a '''Friendly AI'''. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design&mdash;to define a mechanism for evolving AI systems under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.</blockquote>
Line 19:
<blockquote>Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human friendly.'</blockquote>
 
In 2008 Eliezer Yudkowsky called for the creation of "friendly AI" to mitigate [[existential risk from advanced artificial intelligence]]. He explains: "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else."<ref>{{cite web |author=[[Eliezer Yudkowsky]] |year=2008 |url=http://intelligence.org/files/AIPosNegFactor.pdf |title=Artificial Intelligence as a Positive and Negative Factor in Global Risk}}</ref>
 
[[Steve Omohundro]] says that a sufficiently advanced AI system will, unless explicitly counteracted, exhibit a number of [[Instrumental convergence#Basic AI drives|basic "drives"]], such as resource acquisition, self-preservation, and continuous self-improvement, because of the intrinsic nature of goal-driven systems, and that these drives will, "without special precautions", cause the AI to exhibit undesired behavior.<ref>{{cite journal |last=Omohundro |first=S. M. |date=February 2008 |title=The basic AI drives |journal=AGI |volume=171 |pages=483-492}}</ref><ref>{{cite book|last1=Bostrom|first1=Nick|title=Superintelligence: Paths, Dangers, Strategies|date=2014|publisher=Oxford University Press|location=Oxford|isbn=9780199678112|title-link=Superintelligence: Paths, Dangers, Strategies|chapter=Chapter 7: The Superintelligent Will}}</ref>
 
[[Alexander Wissner-Gross]] says that AIs driven to maximize their future freedom of action (or causal path entropy) might be considered friendly if their planning horizon is longer than a certain threshold, and unfriendly if their planning horizon is shorter than that threshold.<ref>''[http://io9.com/how-skynet-might-emerge-from-simple-physics-482402911 How Skynet Might Emerge From Simple Physics]'', io9, Published 2013-04-26.</ref><ref>{{cite journal | last1 = Wissner-Gross | first1 = A. D. | author-link1 = Alexander Wissner-Gross | last2 = Freer | first2 = C. E. | author-link2 = Cameron Freer | year = 2013 | title = Causal entropic forces | url = http://www.alexwg.org/link?url=http%3A%2F%2Fwww.alexwg.org%2Fpublications%2FPhysRevLett_110-168702.pdf| journal = Physical Review Letters | volume = 110 | issue = 16| page = 168702 | doi = 10.1103/PhysRevLett.110.168702 | pmid = 23679649 | bibcode=2013PhRvL.110p8702W| doi-access = free }}</ref>
Line 27:
Luke Muehlhauser, writing for the [[Machine Intelligence Research Institute]], recommends that [[machine ethics]] researchers adopt what [[Bruce Schneier]] has called the "security mindset": Rather than thinking about how a system will work, imagine how it could fail. For instance, he suggests even an AI that only makes accurate predictions and communicates via a text interface might cause unintended harm.<ref name=MuehlhauserSecurity2013>{{cite web|last1=Muehlhauser|first1=Luke|title=AI Risk and the Security Mindset|url=http://intelligence.org/2013/07/31/ai-risk-and-the-security-mindset/|website=Machine Intelligence Research Institute|access-date=15 July 2014|date=31 Jul 2013}}</ref>
 
In 2014, Luke Muehlhauser and Nick Bostrom underlined the need for 'friendly AI';<ref name=think13>{{Cite journal|last1=Muehlhauser|first1=Luke|last2=Bostrom|first2=Nick|title=Why We Need Friendly AI|date=2013-12-17|journal=Think|volume=13|issue=36|pages=41–47|doi=10.1017/s1477175613000316|s2cid=143657841|issn=1477-1756}}</ref> nonetheless, the difficulties in designing a 'friendly' superintelligence, for instance via programming counterfactual moral thinking, are considerable.<ref name=boyles2019>{{Cite journal|last1=Boyles|first1=Robert James M.|last2=Joaquin|first2=Jeremiah Joven|date=2019-07-23|title=Why friendly AIs won't be that friendly: a friendly reply to Muehlhauser and Bostrom|journal=AI & Society|volume=35|issue=2|pages=505–507|doi=10.1007/s00146-019-00903-0|s2cid=198190745|issn=0951-5666}}</ref><ref>{{Cite journal|last=Chan|first=Berman|date=2020-03-04|title=The rise of artificial intelligence and the crisis of moral passivity|journal=AI & Society|volume=35|issue=4|pages=991–993|language=en|doi=10.1007/s00146-020-00953-9|s2cid=212407078|issn=1435-5655}}</ref>
 
==Coherent extrapolated volition==
Line 60:
{{see also|Technological singularity#Criticisms}}
 
Some critics believe that both human-level AI and superintelligence are unlikely, and that therefore friendly AI is unlikely. Writing in ''[[The Guardian]]'', Alan Winfield compares human-level artificial intelligence with faster-than-light travel in terms of difficulty, and states that while we need to be "cautious and prepared" given the stakes involved, we "don't need to be obsessing" about the risks of superintelligence.<ref>{{cite news|last1=Winfield|first1=Alan|title=Artificial intelligence will not turn into a Frankenstein's monster|url=https://www.theguardian.com/technology/2014/aug/10/artificial-intelligence-will-not-become-a-frankensteins-monster-ian-winfield|access-date=17 September 2014|work=[[The Guardian]]}}</ref> Boyles and Joaquin, on the other hand, argue that Luke Muehlhauser and [[Nick Bostrom]]'s proposal to create friendly AIs appears bleak. This is because Muehlhauser and Bostrom seem to hold that intelligent machines could be programmed to think counterfactually about the moral values that human beings would have had.<ref name=think13 /> In an article in ''[[AI & Society]]'', Boyles and Joaquin maintain that such AIs would not be that friendly, considering the following: the infinite number of antecedent counterfactual conditions that would have to be programmed into a machine, the difficulty of cashing out the set of moral values (that is, those that are more ideal than the ones human beings possess at present), and the apparent disconnect between counterfactual antecedents and ideal value consequent.<ref name=boyles2019 />
 
Some philosophers claim that any truly "rational" agent, whether artificial or human, will naturally be benevolent; in this view, deliberate safeguards designed to produce a friendly AI could be unnecessary or even harmful.<ref>Kornai, András. "[http://www.kornai.com/Papers/agi12.pdf Bounding the impact of AGI]". Journal of Experimental & Theoretical Artificial Intelligence ahead-of-print (2014): 1-22. "...the essence of AGIs is their reasoning facilities, and it is the very logic of their being that will compel them to behave in a moral fashion... The real nightmare scenario (is one where) humans find it advantageous to strongly couple themselves to AGIs, with no guarantees against self-deception."</ref> Other critics question whether it is possible for an artificial intelligence to be friendly. Adam Keiper and Ari N. Schulman, editors of the technology journal ''[[The New Atlantis (journal)|The New Atlantis]]'', say that it will be impossible to ever guarantee "friendly" behavior in AIs because problems of ethical complexity will not yield to software advances or increases in computing power. They write that the criteria upon which friendly AI theories are based work "only when one has not only great powers of prediction about the likelihood of myriad possible outcomes, but certainty and consensus on how one values the different outcomes."<ref>{{cite web|url=http://www.thenewatlantis.com/publications/the-problem-with-friendly-artificial-intelligence|author=Adam Keiper and Ari N. Schulman|title=The Problem with 'Friendly' Artificial Intelligence|publisher=The New Atlantis|access-date = 2012-01-16}}</ref>