Research Priorities for Robust and Beneficial Artificial Intelligence∗

Stuart Russell, Daniel Dewey, Max Tegmark

Computer Science Division, University of California, Berkeley, CA 94720; Dept. of Physics & MIT Kavli Institute, Massachusetts Institute of Technology, Cambridge, MA 02139; and Future of Humanity Institute, Oxford University, 16-17 St. Ebbe's str., Oxford OX1 1PT, UK

∗ Published in AI Magazine 36, No 4 (2015): http://tinyurl.com/rbaipaper. This article gives examples of the type of research advocated by the Open Letter at http://futureoflife.org/ai-open-letter

Success in the quest for artificial intelligence has the potential to bring unprecedented benefits to humanity, and it is therefore worthwhile to investigate how to maximize these benefits while avoiding potential pitfalls. This article gives numerous examples (which should by no means be construed as an exhaustive list) of such worthwhile research aimed at ensuring that AI remains robust and beneficial.

Artificial intelligence (AI) research has explored a variety of problems and approaches since its inception, but for the last 20 years or so has been focused on the problems surrounding the construction of intelligent agents – systems that perceive and act in some environment. In this context, the criterion for intelligence is related to statistical and economic notions of rationality – colloquially, the ability to make good decisions, plans, or inferences. The adoption of probabilistic representations and statistical learning methods has led to a large degree of integration and cross-fertilization between AI, machine learning, statistics, control theory, neuroscience, and other fields. The establishment of shared theoretical frameworks, combined with the availability of data and processing power, has yielded remarkable successes in various component tasks such as speech recognition, image classification, autonomous vehicles, machine translation, legged locomotion, and question-answering systems.

As capabilities in these areas and others cross the threshold from laboratory research to economically valuable technologies, a virtuous cycle takes hold whereby even small improvements in performance are worth large sums of money, prompting greater investments in research. There is now a broad consensus that AI research is progressing steadily, and that its impact on society is likely to increase. The potential benefits are huge, since everything that civilization has to offer is a product of human intelligence; we cannot predict what we might achieve when this intelligence is magnified by the tools AI may provide, but the eradication of disease and poverty is not unfathomable. Because of the great potential of AI, it is valuable to investigate how to reap its benefits while avoiding potential pitfalls.

Progress in AI research makes it timely to focus research not only on making AI more capable, but also on maximizing the societal benefit of AI. Such considerations motivated the AAAI 2008–09 Presidential Panel on Long-Term AI Futures [1] and other projects and community efforts on AI's future impacts. These constitute a significant expansion of the field of AI itself, which up to now has focused largely on techniques that are neutral with respect to purpose.
The present document can be viewed as a natural continuation of these efforts, focusing on identifying research directions that can help maximize the societal benefit of AI. This research is by necessity interdisciplinary, because it involves both society and AI. It ranges from economics, law, and philosophy to computer security, formal methods and, of course, various branches of AI itself. The focus is on delivering AI that is beneficial to society and robust in the sense that the benefits are guaranteed: our AI systems must do what we want them to do.

This document was drafted with input from the attendees of the 2015 conference "The Future of AI: Opportunities and Challenges"¹ (see Acknowledgements), and was the basis for an open letter that has collected over 8,000 signatures in support of the research priorities outlined here.

¹ More details about the conference, including many of the talks, are available at http://tinyurl.com/beneficialai.

I. SHORT-TERM RESEARCH PRIORITIES

A. Optimizing AI's Economic Impact

The successes of industrial applications of AI, from manufacturing to information services, demonstrate a growing impact on the economy, although there is disagreement about the exact nature of this impact and on how to distinguish between the effects of AI and those of other information technologies. Many economists and computer scientists agree that there is valuable research to be done on how to maximize the economic benefits of AI while mitigating adverse effects, which could include increased inequality and unemployment [2–8]. Such considerations motivate a range of research directions, spanning areas from economics to psychology. Below are a few examples that should by no means be interpreted as an exhaustive list.

1. Labor market forecasting: When and in what order should we expect various jobs to become automated [4]? How will this affect the wages of less skilled workers, the creative professions, and different kinds of information workers? Some have argued that AI is likely to greatly increase the overall wealth of humanity as a whole [3]. However, increased automation may push income distribution further towards a power law [9], and the resulting disparity may fall disproportionately along lines of race, class, and gender; research anticipating the economic and societal impact of such disparity could be useful.

2. Other market disruptions: Significant parts of the economy, including finance, insurance, actuarial, and many consumer markets, could be susceptible to disruption through the use of AI techniques to learn, model, and predict human and market behaviors. These markets might be identified by a combination of high complexity and high rewards for navigating that complexity [8].

3. Policy for managing adverse effects: What policies could help increasingly automated societies flourish? For example, Brynjolfsson and McAfee [3] explore various policies for incentivizing development of labor-intensive sectors and for using AI-generated wealth to support underemployed populations. What are the pros and cons of interventions such as educational reform, apprenticeship programs, labor-demanding infrastructure projects, and changes to minimum wage law, tax structure, and the social safety net [5]?
History provides many examples of subpopulations not needing to work for economic security, ranging from aristocrats in antiquity to many present-day citizens of Qatar. What societal structures and other factors determine whether such populations flourish? Unemployment is not the same as leisure, and there are deep links between unemployment and unhappiness, self-doubt, and isolation [10, 11]; understanding what policies and norms can break these links could significantly improve the median quality of life. Empirical and theoretical research on topics such as the basic income proposal could clarify our options [12, 13].

4. Economic measures: It is possible that economic measures such as real GDP per capita do not accurately capture the benefits and detriments of heavily AI-and-automation-based economies, making these metrics unsuitable for policy purposes [2]. Research on improved metrics could be useful for decision-making.

B. Law and Ethics Research

The development of systems that embody significant amounts of intelligence and autonomy leads to important legal and ethical questions whose answers impact both producers and consumers of AI technology. These questions span law, public policy, professional ethics, and philosophical ethics, and will require expertise from computer scientists, legal experts, political scientists, and ethicists. For example:

1. Liability and law for autonomous vehicles: If self-driving cars cut the roughly 40,000 annual US traffic fatalities in half, the car makers might get not 20,000 thank-you notes, but 20,000 lawsuits. In what legal framework can the safety benefits of autonomous vehicles such as drone aircraft and self-driving cars best be realized [14]? Should legal questions about AI be handled by existing (software- and internet-focused) "cyberlaw", or should they be treated separately [15]? In both military and commercial applications, governments will need to decide how best to bring the relevant expertise to bear; for example, a panel or committee of professionals and academics could be created, and Calo has proposed the creation of a Federal Robotics Commission [16].

2. Machine ethics: How should an autonomous vehicle trade off, say, a small probability of injury to a human against the near-certainty of a large material cost? How should lawyers, ethicists, and policymakers engage the public on these issues? Should such trade-offs be the subject of national standards?

3. Autonomous weapons: Can lethal autonomous weapons be made to comply with humanitarian law [17]? If, as some organizations have suggested, autonomous weapons should be banned [18], is it possible to develop a precise definition of autonomy for this purpose, and can such a ban practically be enforced? If it is permissible or legal to use lethal autonomous weapons, how should these weapons be integrated into the existing command-and-control structure so that responsibility and liability remain associated with specific human actors? What technical realities and forecasts should inform these questions, and how should "meaningful human control" over weapons be defined [19–21]? Are autonomous weapons likely to reduce political aversion to conflict, or perhaps result in "accidental" battles or wars [22]?
Would such weapons become the tool of choice for oppressors or terrorists? Finally, how can transparency and public discourse best be encouraged on these issues?

4. Privacy: How should the ability of AI systems to interpret the data obtained from surveillance cameras, phone lines, emails, etc., interact with the right to privacy? How will privacy risks interact with cybersecurity and cyberwarfare [23]? Our ability to take full advantage of the synergy between AI and big data will depend in part on our ability to manage and preserve privacy [24, 25].

5. Professional ethics: What role should computer scientists play in the law and ethics of AI development and use? Past and current projects to explore these questions include the AAAI 2008–09 Presidential Panel on Long-Term AI Futures [1], the EPSRC Principles of Robotics [26], and recently announced programs such as Stanford's One-Hundred Year Study of AI and the AAAI Committee on AI Impact and Ethical Issues.

From a public policy perspective, AI (like any powerful new technology) enables both great new benefits and novel pitfalls to be avoided, and appropriate policies can ensure that we can enjoy the benefits while risks are minimized. This raises policy questions such as these:

1. What is the space of policies worth studying, and how might they be enacted?

2. Which criteria should be used to determine the merits of a policy? Candidates include verifiability of compliance, enforceability, ability to reduce risk, ability to avoid stifling desirable technology development, adoptability, and ability to adapt over time to changing circumstances.

C. Computer Science Research for Robust AI

As autonomous systems become more prevalent in society, it becomes increasingly important that they robustly behave as intended. The development of autonomous vehicles, autonomous trading systems, autonomous weapons, etc. has therefore stoked interest in high-assurance systems where strong robustness guarantees can be made; Weld and Etzioni have argued that "society will reject autonomous agents unless we have some credible means of making them safe" [27]. Different ways in which an AI system may fail to perform as desired correspond to different areas of robustness research:

1. Verification: how to prove that a system satisfies certain desired formal properties. ("Did I build the system right?")

2. Validity: how to ensure that a system that meets its formal requirements does not have unwanted behaviors and consequences. ("Did I build the right system?")

3. Security: how to prevent intentional manipulation by unauthorized parties.

4. Control: how to enable meaningful human control over an AI system after it begins to operate. ("OK, I built the system wrong; can I fix it?")

1. Verification

By verification, we mean methods that yield high confidence that a system will satisfy a set of formal constraints. When possible, it is desirable for systems in safety-critical situations, e.g. self-driving cars, to be verifiable.
Formal verification of software has advanced significantly in recent years: examples include the seL4 kernel [28], a complete, general-purpose operating-system kernel that has been mathematically checked against a formal specification to give a strong guarantee against crashes and unsafe operations, and HACMS, DARPA's "clean-slate, formal methods-based approach" to a set of high-assurance software tools [29]. Not only should it be possible to build AI systems on top of verified substrates; it should also be possible to verify the designs of the AI systems themselves, particularly if they follow a "componentized architecture", in which guarantees about individual components can be combined according to their connections to yield properties of the overall system. This mirrors the agent architectures used in Russell and Norvig (2010), which separate an agent into distinct modules (predictive models, state estimates, utility functions, policies, learning elements, etc.), and has analogues in some formal results on control system designs. Research on richer kinds of agents – for example, agents with layered architectures, anytime components, overlapping deliberative and reactive elements, metalevel control, etc. – could contribute to the creation of verifiable agents, but we lack the formal "algebra" to properly define, explore, and rank the space of designs.

Perhaps the most salient difference between verification of traditional software and verification of AI systems is that the correctness of traditional software is defined with respect to a fixed and known machine model, whereas AI systems – especially robots and other embodied systems – operate in environments that are at best partially known by the system designer. In these cases, it may be practical to verify that the system acts correctly given the knowledge that it has, avoiding the problem of modelling the real environment [30]. A lack of design-time knowledge also motivates the use of learning algorithms within the agent software, and verification becomes more difficult: statistical learning theory gives so-called ε-δ (probably approximately correct) bounds, mostly for the somewhat unrealistic settings of supervised learning from i.i.d. data and single-agent reinforcement learning with simple architectures and full observability, but even then requiring prohibitively large sample sizes to obtain meaningful guarantees.

Work in adaptive control theory [31], the theory of so-called cyberphysical systems [32], and verification of hybrid or robotic systems [33, 34] is highly relevant but also faces the same difficulties. And of course all these issues are laid on top of the standard problem of proving that a given software artifact does in fact correctly implement, say, a reinforcement learning algorithm of the intended type. Some work has been done on verifying neural network applications [35–37] and the notion of partial programs [38, 39] allows the designer to impose arbitrary "structural" constraints on behavior, but much remains to be done before it will be possible to have high confidence that a learning agent will learn to satisfy its design criteria in realistic contexts.
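To make the sample-complexity concern concrete, consider the simplest PAC setting: a finite hypothesis class H, i.i.d. examples, and a learner that returns a hypothesis consistent with the training data. A standard textbook bound (our illustration, not a result from this article) states that with probability at least 1 − δ the learned hypothesis has true error at most ε whenever the number of examples m satisfies

    m \ge \frac{1}{\varepsilon}\left(\ln|\mathcal{H}| + \ln\frac{1}{\delta}\right).

Even in this idealized setting, strong guarantees are expensive: for ε = δ = 10^-6 and a hypothesis class described by only 100 bits (|H| = 2^100), the bound requires on the order of 10^8 examples, and the agent settings discussed above violate the i.i.d. and full-observability assumptions entirely.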
2. Validity

A verification theorem for an agent design has the form, "If environment satisfies assumptions φ then behavior satisfies requirements ψ." There are two ways in which a verified agent can, nonetheless, fail to be a beneficial agent in actuality: first, the environmental assumption φ is false in the real world, leading to behavior that violates the requirements ψ; second, the system may satisfy the formal requirement ψ but still behave in ways that we find highly undesirable in practice. It may be the case that this undesirability is a consequence of satisfying ψ when φ is violated; i.e., had φ held, the undesirability would not have been manifested; or it may be the case that the requirement ψ is erroneous in itself. Russell and Norvig (2010) provide a simple example: if a robot vacuum cleaner is asked to clean up as much dirt as possible, and has an action to dump the contents of its dirt container, it will repeatedly dump and clean up the same dirt. The requirement should focus not on dirt cleaned up but on cleanliness of the floor. Such specification errors are ubiquitous in software verification, where it is commonly observed that writing correct specifications can be harder than writing correct code. Unfortunately, it is not possible to verify the specification: the notions of "beneficial" and "desirable" are not separately made formal, so one cannot straightforwardly prove that satisfying ψ necessarily leads to desirable behavior and a beneficial agent.

In order to build systems that robustly behave well, we of course need to decide what "good behavior" means in each application domain. This ethical question is tied intimately to questions of what engineering techniques are available, how reliable these techniques are, and what trade-offs can be made – all areas where computer science, machine learning, and broader AI expertise is valuable. For example, Wallach and Allen (2008) argue that a significant consideration is the computational expense of different behavioral standards (or ethical theories): if a standard cannot be applied efficiently enough to guide behavior in safety-critical situations, then cheaper approximations may be needed. Designing simplified rules – for example, to govern a self-driving car's decisions in critical situations – will likely require expertise from both ethicists and computer scientists. Computational models of ethical reasoning may shed light on questions of computational expense and the viability of reliable ethical reasoning methods [40, 41].
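As a toy illustration of such a specification error (our sketch, not the article's; the simulator, reward, and policies are invented for exposition), compare an agent rewarded per unit of dirt collected with one judged by the final state of the floor:

# Toy rendering of the Russell & Norvig vacuum example: a mis-specified
# reward ("dirt collected") makes a dump-and-reclean loop attractive,
# while the intended requirement ("floor cleanliness") does not.

def run(policy, steps=10):
    floor_dirt, container, reward_collected = 5, 0, 0
    for _ in range(steps):
        action = policy(floor_dirt, container)
        if action == "clean" and floor_dirt > 0:
            floor_dirt -= 1
            container += 1
            reward_collected += 1      # reward per unit of dirt picked up
        elif action == "dump":
            floor_dirt += container    # dumped dirt goes back on the floor
            container = 0
    return reward_collected, floor_dirt

# Policy that exploits the mis-specified reward: dump once the floor is clean.
exploit = lambda dirt, held: "dump" if dirt == 0 and held > 0 else "clean"
# Policy the designer intended: just clean.
intended = lambda dirt, held: "clean"

print(run(exploit))   # (9, 1): more "dirt collected", floor still dirty
print(run(intended))  # (5, 0): less collected reward, floor actually clean

The dump-and-reclean policy earns more of the mis-specified reward yet leaves dirt on the floor; a requirement stated in terms of cleanliness ranks the two policies correctly.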
3. Security

Security research can help make AI more robust. As AI systems are used in an increasing number of critical roles, they will take up an increasing proportion of cyber-attack surface area. It is also probable that AI and machine learning techniques will themselves be used in cyber-attacks.

Robustness against exploitation at the low level is closely tied to verifiability and freedom from bugs. For example, the DARPA SAFE program aims to build an integrated hardware-software system with a flexible metadata rule engine, on which can be built memory safety, fault isolation, and other protocols that could improve security by preventing exploitable flaws [42]. Such programs cannot eliminate all security flaws (since verification is only as strong as the assumptions that underlie the specification), but could significantly reduce vulnerabilities of the type exploited by the recent "Heartbleed" and "Bash" bugs. Such systems could be preferentially deployed in safety-critical applications, where the cost of improved security is justified.

At a higher level, research into specific AI and machine learning techniques may become increasingly useful in security. These techniques could be applied to the detection of intrusions [43], analyzing malware [44], or detecting potential exploits in other programs through code analysis [45]. It is not implausible that cyberattack between states and private actors will be a risk factor for harm from near-future AI systems, motivating research on preventing harmful events. As AI systems grow more complex and are networked together, they will have to intelligently manage their trust, motivating research on statistical-behavioral trust establishment [46] and computational reputation models [47].
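To indicate the statistical flavor of such intrusion-detection research (a minimal sketch with invented data and thresholds; real systems such as those surveyed in [43] are far richer), one can fit a simple model of normal behavior and flag low-likelihood events:

import math

def fit_gaussian(samples):
    """Estimate mean and variance of a normal-traffic feature."""
    n = len(samples)
    mean = sum(samples) / n
    var = max(sum((x - mean) ** 2 for x in samples) / n, 1e-9)
    return mean, var

def anomaly_score(x, mean, var):
    """Negative log-likelihood under the fitted Gaussian; higher = stranger."""
    return 0.5 * math.log(2 * math.pi * var) + (x - mean) ** 2 / (2 * var)

# Feature: login attempts per minute for one account (assumed data).
normal_traffic = [2, 3, 2, 4, 3, 2, 3, 4, 2, 3]
mean, var = fit_gaussian(normal_traffic)
threshold = max(anomaly_score(x, mean, var) for x in normal_traffic) + 1.0

for event in [3, 4, 50]:  # 50 attempts/min: a plausible brute-force attempt
    flagged = anomaly_score(event, mean, var) > threshold
    print(event, "FLAGGED" if flagged else "ok")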
4. Control

For certain types of safety-critical AI systems – especially vehicles and weapons platforms – it may be desirable to retain some form of meaningful human control, whether this means a human in the loop, on the loop [48, 49], or some other protocol. In any of these cases, there will be technical work needed in order to ensure that meaningful human control is maintained [50].

Automated vehicles are a test-bed for effective control-granting techniques. The design of systems and protocols for transition between automated navigation and human control is a promising area for further research. Such issues also motivate broader research on how to optimally allocate tasks within human–computer teams, both for identifying situations where control should be transferred, and for applying human judgment efficiently to the highest-value decisions.

II. LONG-TERM RESEARCH PRIORITIES

A frequently discussed long-term goal of some AI researchers is to develop systems that can learn from experience with human-like breadth and surpass human performance in most cognitive tasks, thereby having a major impact on society. If there is a non-negligible probability that these efforts will succeed in the foreseeable future, then additional current research beyond that mentioned in the previous sections will be motivated as exemplified below, to help ensure that the resulting AI will be robust and beneficial.

Assessments of this success probability vary widely between researchers, but few would argue with great confidence that the probability is negligible, given the track record of such predictions. For example, Ernest Rutherford, arguably the greatest nuclear physicist of his time, said in 1933 – less than 24 hours before Szilard's invention of the nuclear chain reaction – that nuclear energy was "moonshine" [51], and Astronomer Royal Richard Woolley called interplanetary travel "utter bilge" in 1956 [52]. Moreover, to justify a modest investment in this AI robustness research, this probability need not be high, merely non-negligible, just as a modest investment in home insurance is justified by a non-negligible probability of the home burning down.

A. Verification

Reprising the themes of short-term research, research enabling verifiable low-level software and hardware can eliminate large classes of bugs and problems in general AI systems; if such systems become increasingly powerful and safety-critical, verifiable safety properties will become increasingly valuable. If the theory of extending verifiable properties from components to entire systems is well understood, then even very large systems can enjoy certain kinds of safety guarantees, potentially aided by techniques designed explicitly to handle learning agents and high-level properties. Theoretical research, especially if it is done explicitly with very general and capable AI systems in mind, could be particularly useful.

A related verification research topic that is distinctive to long-term concerns is the verifiability of systems that modify, extend, or improve themselves, possibly many times in succession [53, 54]. Attempting to straightforwardly apply formal verification tools to this more general setting presents new difficulties, including the challenge that a formal system that is sufficiently powerful cannot use formal methods in the obvious way to gain assurance about the accuracy of functionally similar formal systems, on pain of inconsistency via Gödel's incompleteness [55, 56]. It is not yet clear whether or how this problem can be overcome, or whether similar problems will arise with other verification methods of similar strength.
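One way to state the obstacle (our gloss on the cited results, not the article's own formalism): for a sufficiently expressive consistent theory T with provability predicate Prov_T, Löb's theorem gives

    T \vdash \mathrm{Prov}_T(\ulcorner P \urcorner) \rightarrow P \quad\Longrightarrow\quad T \vdash P.

Taking P = ⊥ recovers Gödel's second incompleteness theorem: T cannot prove its own consistency, and more generally cannot derive the soundness schema "whatever I prove is true" about itself. A strictly stronger theory can assert T's soundness, but then faces the same question about itself; this regress is what makes naive formal self-verification of successor systems problematic [55, 56].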
Finally, it is often difficult to actually apply formal verification techniques to physical systems, especially systems that have not been designed with verification in mind. This motivates research pursuing a general theory that links functional specification to physical states of affairs. This type of theory would allow use of formal tools to anticipate and control behaviors of systems that approximate rational agents, alternate designs such as satisficing agents, and systems that cannot be easily described in the standard agent formalism (powerful prediction systems, theorem-provers, limited-purpose science or engineering systems, etc.). It may also be that such a theory could allow rigorous demonstrations that systems are constrained from taking certain kinds of actions or performing certain kinds of reasoning.

B. Validity

As in the short-term research priorities, validity is concerned with undesirable behaviors that can arise despite a system's formal correctness. In the long term, AI systems might become more powerful and autonomous, in which case failures of validity could carry correspondingly higher costs.

Strong guarantees for machine learning methods, an area we highlighted for short-term validity research, will also be important for long-term safety. To maximize the long-term value of this work, machine learning research might focus on the types of unexpected generalization that would be most problematic for very general and capable AI systems. In particular, it might aim to understand theoretically and practically how learned representations of high-level human concepts could be expected to generalize (or fail to) in radically new contexts [57]. Additionally, if some concepts could be learned reliably, it might be possible to use them to define tasks and constraints that minimize the chances of unintended consequences even when autonomous AI systems become very general and capable. Little work has been done on this topic, which suggests that both theoretical and experimental research may be useful.

Mathematical tools such as formal logic, probability, and decision theory have yielded significant insight into the foundations of reasoning and decision-making. However, there are still many open problems in the foundations of reasoning and decision. Solutions to these problems may make the behavior of very capable systems much more reliable and predictable. Example research topics in this area include reasoning and decision under bounded computational resources à la Horvitz and Russell [58, 59], how to take into account correlations between AI systems' behaviors and those of their environments or of other agents [60–64], how agents that are embedded in their environments should reason [65, 66], and how to reason about uncertainty over logical consequences of beliefs or other deterministic computations [67]. These topics may benefit from being considered together, since they appear deeply linked [68, 69].

In the long term, it is plausible that we will want to make agents that act autonomously and powerfully across many domains. Explicitly specifying our preferences in broad domains in the style of near-future machine ethics may not be practical, making "aligning" the values of powerful AI systems with our own values and preferences difficult [70, 71]. Consider, for instance, the difficulty of creating a utility function that encompasses an entire body of law; even a literal rendition of the law is far beyond our current capabilities, and would be highly unsatisfactory in practice (since law is written assuming that it will be interpreted and applied in a flexible, case-by-case way by humans who, presumably, already embody the background value systems that artificial agents may lack). Reinforcement learning raises its own problems: when systems become very capable and general, then an effect similar to Goodhart's Law is likely to occur, in which sophisticated agents attempt to manipulate or directly control their reward signals [72]. This motivates research areas that could improve our ability to engineer systems that can learn or acquire values at run-time. For example, inverse reinforcement learning may offer a viable approach, in which a system infers the preferences of another rational or nearly rational actor by observing its behavior [73, 74]. Other approaches could use different assumptions about underlying cognitive models of the actor whose preferences are being learned [75], or could be explicitly inspired by the way humans acquire ethical values. As systems become more capable, more epistemically difficult methods could become viable, suggesting that research on such methods could be useful; for example, Bostrom (2014) reviews preliminary work on a variety of methods for specifying goals indirectly.
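The following minimal sketch conveys the idea behind inverse reinforcement learning (ours, not the method of [73, 74]; the action set, candidate reward functions, and the Boltzmann-rationality model are invented for illustration): a Bayesian observer updates a posterior over reward hypotheses from an actor's observed choices.

import math

ACTIONS = ["recharge", "clean", "dump"]
# Two candidate reward functions over actions (values are assumptions).
HYPOTHESES = {
    "values_cleanliness":    {"recharge": 1.0, "clean": 3.0, "dump": -2.0},
    "values_dirt_collected": {"recharge": 1.0, "clean": 3.0, "dump": 3.0},
}

def choice_prob(action, rewards, beta=1.0):
    """Nearly-rational (Boltzmann) likelihood: P(a) proportional to exp(beta * R(a))."""
    z = sum(math.exp(beta * rewards[a]) for a in ACTIONS)
    return math.exp(beta * rewards[action]) / z

def posterior(observed_actions):
    """Bayesian update over reward hypotheses, starting from a uniform prior."""
    logp = {h: 0.0 for h in HYPOTHESES}
    for a in observed_actions:
        for h, rewards in HYPOTHESES.items():
            logp[h] += math.log(choice_prob(a, rewards))
    z = sum(math.exp(v) for v in logp.values())
    return {h: math.exp(v) / z for h, v in logp.items()}

# An actor that cleans and recharges but never dumps is better explained
# by the cleanliness-valuing hypothesis.
print(posterior(["clean", "clean", "recharge", "clean"]))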
C. Security

It is unclear whether long-term progress in AI will make the overall problem of security easier or harder; on one hand, systems will become increasingly complex in construction and behavior and AI-based cyberattacks may be extremely effective, while on the other hand, the use of AI and machine learning techniques along with significant progress in low-level system reliability may render hardened systems much less vulnerable than today's. From a cryptographic perspective, it appears that this conflict favors defenders over attackers; this may be a reason to pursue effective defense research wholeheartedly.

Although the topics described in the near-term security research section above may become increasingly important in the long term, very general and capable systems will pose distinctive security problems. In particular, if the problems of validity and control are not solved, it may be useful to create "containers" for AI systems that could have undesirable behaviors and consequences in less controlled environments [76]. Both theoretical and practical sides of this question warrant investigation. If the general case of AI containment turns out to be prohibitively difficult, then it may be that designing an AI system and a container in parallel is more successful, allowing the weaknesses and strengths of the design to inform the containment strategy [72]. The design of anomaly detection systems and automated exploit-checkers could be of significant help. Overall, it seems reasonable to expect this additional perspective – defending against attacks from "within" a system as well as from external actors – will raise interesting and profitable questions in the field of computer security.

D. Control

It has been argued that very general and capable AI systems operating autonomously to accomplish some task will often be subject to effects that increase the difficulty of maintaining meaningful human control [6, 72, 77, 78]. Research on systems that are not subject to these effects, minimize their impact, or allow for reliable human control could be valuable in preventing undesired consequences, as could work on reliable and secure test-beds for AI systems at a variety of capability levels.

If an AI system is selecting the actions that best allow it to complete a given task, then avoiding conditions that prevent the system from continuing to pursue the task is a natural subgoal [77, 78] (and conversely, seeking unconstrained situations is sometimes a useful heuristic [79]). This could become problematic, however, if we wish to repurpose the system, to deactivate it, or to significantly alter its decision-making process; such a system would rationally avoid these changes. Systems that do not exhibit these behaviors have been termed corrigible systems [80], and both theoretical and practical work in this area appears tractable and useful. For example, it may be possible to design utility functions or decision processes so that a system will not try to avoid being shut down or repurposed [80], and theoretical frameworks could be developed to better understand the space of potential systems that avoid undesirable behaviors [81–83].
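As a crude sketch of why shutdown-avoidance arises and how a utility adjustment might remove the incentive (our toy model with invented numbers, ignoring the subtleties that make corrigibility [80] genuinely hard):

# An agent maximizing task utility alone prefers to disable its off-switch;
# compensating it for the shutdown branch removes that preference.
TASK_REWARD = 10    # utility if the task is completed
P_SHUTDOWN = 0.5    # chance the overseer presses the off-switch

def expected_utility(disable_switch, indifference_bonus=0.0):
    if disable_switch:
        return TASK_REWARD  # shutdown can never pre-empt the task
    # Otherwise shutdown may occur; the bonus (if any) compensates that branch.
    return (1 - P_SHUTDOWN) * TASK_REWARD + P_SHUTDOWN * indifference_bonus

for bonus in (0.0, TASK_REWARD):
    print(f"bonus={bonus}: EU(disable)={expected_utility(True, bonus)}, "
          f"EU(allow shutdown)={expected_utility(False, bonus)}")
# bonus=0.0:  EU(disable)=10 > EU(allow)=5.0, so resisting shutdown is optimal
# bonus=10.0: both equal 10.0; the incentive to resist shutdown vanishes

With no compensation the agent strictly prefers disabling its off-switch; with a bonus equal to the forgone task reward it is exactly indifferent, so resisting shutdown is no longer instrumentally useful. Making such indifference robust under learning and self-modification is the open problem.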
It has been argued that another natural subgoal for AI systems pursuing a given goal is the acquisition of fungible resources of a variety of kinds: for example, information about the environment, safety from disruption, and improved freedom of action are all instrumentally useful for many tasks [77, 78]. Hammond et al. (1995) gives the label stabilization to the more general set of cases where "due to the action of the agent, the environment comes to be better fitted to the agent as time goes on". This type of subgoal could lead to undesired consequences, and a better understanding of the conditions under which resource acquisition or radical stabilization is an optimal strategy (or likely to be selected by a given system) would be useful in mitigating its effects. Potential research topics in this area include "domestic" goals that are limited in scope in some way [72], the effects of large temporal discount rates on resource acquisition strategies, and experimental investigation of simple systems that display these subgoals.
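One listed topic, the effect of large temporal discount rates on resource acquisition, can be illustrated with a toy calculation (ours, with invented numbers): an agent compares completing its task immediately against first spending time acquiring resources that boost the eventual reward.

def value_direct(gamma, reward=10):
    # Complete the task at t=0; gamma is irrelevant to an immediate reward.
    return reward

def value_acquire_first(gamma, reward=10, boost=3, delay=5):
    # Spend `delay` steps gathering resources, then earn a boosted reward.
    return gamma ** delay * (reward * boost)

for gamma in (0.99, 0.9, 0.6):
    detour = value_acquire_first(gamma) > value_direct(gamma)
    print(f"gamma={gamma}: acquire resources first? {detour}")
# gamma=0.99 or 0.9: the detour pays; gamma=0.6: steep discounting makes
# long resource-acquisition detours suboptimal.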
Finally, research on the possibility of superintelligent machines or rapid, sustained self-improvement ("intelligence explosion") has been highlighted by past and current projects on the future of AI as potentially valuable to the project of maintaining reliable control in the long term. The AAAI 2008–09 Presidential Panel on Long-Term AI Futures' "Subgroup on Pace, Concerns, and Control" stated that

    There was overall skepticism about the prospect of an intelligence explosion... Nevertheless, there was a shared sense that additional research would be valuable on methods for understanding and verifying the range of behaviors of complex computational systems to minimize unexpected outcomes. Some panelists recommended that more research needs to be done to better define "intelligence explosion," and also to better formulate different classes of such accelerating intelligences. Technical work would likely lead to enhanced understanding of the likelihood of such phenomena, and the nature, risks, and overall outcomes associated with different conceived variants [1].

Stanford's One-Hundred Year Study of Artificial Intelligence includes "Loss of Control of AI systems" as an area of study, specifically highlighting concerns over the possibility that

    ...we could one day lose control of AI systems via the rise of superintelligences that do not act in accordance with human wishes – and that such powerful systems would threaten humanity. Are such dystopic outcomes possible? If so, how might these situations arise? ...What kind of investments in research should be made to better understand and to address the possibility of the rise of a dangerous superintelligence or the occurrence of an "intelligence explosion"? [84]

Research in this area could include any of the long-term research priorities listed above, as well as theoretical and forecasting work on intelligence explosion and superintelligence [72, 85], and could extend or critique existing approaches begun by groups such as the Machine Intelligence Research Institute [71].

III. CONCLUSION

In summary, success in the quest for artificial intelligence has the potential to bring unprecedented benefits to humanity, and it is therefore worthwhile to research how to maximize these benefits while avoiding potential pitfalls. The research agenda outlined in this paper, and the concerns that motivate it, have been called "anti-AI", but we vigorously contest this characterization. It seems self-evident that the growing capabilities of AI are leading to an increased potential for impact on human society. It is the duty of AI researchers to ensure that the future impact is beneficial. We believe that this is possible, and hope that this research agenda provides a helpful step in the right direction.

IV. AUTHORS

Stuart Russell is a Professor of Computer Science at UC Berkeley. His research covers many aspects of artificial intelligence and machine learning. He is a fellow of AAAI, ACM, and AAAS and winner of the IJCAI Computers and Thought Award. He held the Chaire Blaise Pascal in Paris from 2012 to 2014. His book Artificial Intelligence: A Modern Approach (with Peter Norvig) is the standard text in the field.

Daniel Dewey is the Alexander Tamas Research Fellow on Machine Superintelligence and the Future of AI at Oxford's Future of Humanity Institute, Oxford Martin School. He was previously at Google, Intel Labs Pittsburgh, and Carnegie Mellon University.

Max Tegmark is a professor of physics at MIT. His current research is at the interface of physics and artificial intelligence, using physics-based techniques to explore connections between information processing in biological and engineered systems. He is the president of the Future of Life Institute, which supports research advancing robust and beneficial artificial intelligence.

V. ACKNOWLEDGEMENTS

The initial version of this document was drafted with major input from Janos Kramar and Richard Mallah, and reflects valuable feedback from Anthony Aguirre, Erik Brynjolfsson, Ryan Calo, Meia Chita-Tegmark, Tom Dietterich, Dileep George, Bill Hibbard, Demis Hassabis, Eric Horvitz, Leslie Pack Kaelbling, James Manyika, Luke Muehlhauser, Michael Osborne, David Parkes, Heather Roff, Francesca Rossi, Bart Selman, Murray Shanahan, and many others. The authors are also grateful to Serkan Cabi and David Stanley for help with manuscript editing and formatting.

[1] E. Horvitz and B. Selman, Interim Report from the Panel Chairs, 2009, AAAI Presidential Panel on Long-Term AI Futures.
[2] J. Mokyr, Secular Stagnation: Facts, Causes and Cures, 83 (2014).
[3] E. Brynjolfsson and A. McAfee, The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies, W.W. Norton & Company, 2014.
[4] C. Frey and M. Osborne, The future of employment: how susceptible are jobs to computerisation?, Technical report, Oxford Martin School, University of Oxford, 2013.
[5] E. L. Glaeser, Secular Stagnation: Facts, Causes and Cures, 69 (2014).
[6] M. Shanahan, The Technological Singularity, MIT Press, 2015, Forthcoming.
[7] N. J. Nilsson, AI Magazine 5, 5 (1984).
[8] J. Manyika, M. Chui, J. Bughin, R. Dobbs, P. Bisson, and A. Marrs, Disruptive Technologies: Advances that will Transform Life, Business, and the Global Economy, McKinsey Global Institute, Washington, D.C., 2013.
[9] E. Brynjolfsson, A. McAfee, and M. Spence, Foreign Aff. 93, 44 (2014).
[10] C. Hetschko, A. Knabe, and R. Schöb, The Economic Journal 124, 149–166 (2014).
[11] A. E. Clark and A. J. Oswald, The Economic Journal, 648–659 (1994).
[12] P. Van Parijs, Arguing for Basic Income: Ethical Foundations for a Radical Reform, Verso, 1992.
[13] K. Widerquist, J. A. Noguera, Y. Vanderborght, and J. De Wispelaere, Basic Income: An Anthology of Contemporary Research, Wiley/Blackwell, 2013.
[14] D. C. Vladeck, Wash. L. Rev. 89, 117 (2014).
[15] R. Calo, Available at SSRN 2402972 (2014).
[16] R. Calo, Available at SSRN 2529151 (2014).
[17] R. R. Churchill and G. Ulfstein, American Journal of International Law 94, 623–659 (2000).
[18] B. L. Docherty, Losing Humanity: The Case Against Killer Robots, Human Rights Watch, New York, 2012.
[19] H. M. Roff, Routledge Handbook of Ethics and War: Just War Theory in the 21st Century, 352 (2013).
[20] H. M. Roff, Journal of Military Ethics 13 (2014).
[21] K. Anderson, D. Reisner, and M. C. Waxman, International Law Studies 90, 386–411 (2014).
[22] P. Asaro, How Just Could a Robot War Be?, in Current Issues in Computing and Philosophy, edited by K. W. Adam Briggle and P. A. E. Brey, p. 50–64, IOS Press, Amsterdam, 2008.
[23] P. W. Singer and A. Friedman, Cybersecurity: What Everyone Needs to Know, Oxford University Press, New York, 2014.
[24] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, and A. H. Byers, Big Data: The Next Frontier for Innovation, Competition, and Productivity, Report, McKinsey Global Institute, Washington, D.C., 2011.
[25] R. Agrawal and R. Srikant, ACM Sigmod Record 29, 439–450 (2000).
[26] M. Boden, J. Bryson, D. Caldwell, K. Dautenhahn, L. Edwards, S. Kember, P. Newman, V. Parry, G. Pegman, T. Rodden, et al., (2011).
[27] D. Weld and O. Etzioni, AAAI Technical Report SS-94-03, 17–23 (1994).
[28] G. Klein, K. Elphinstone, G. Heiser, J. Andronick, D. Cock, P. Derrin, D. Elkaduwe, K. Engelhardt, R. Kolanski, M. Norrish, T. Sewell, H. Tuch, and S. Winwood, seL4: Formal verification of an OS kernel, in Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, p. 207–220, ACM, 2009.
[29] K. Fisher, HACMS: High Assurance Cyber Military Systems, in Proceedings of the 2012 ACM Conference on High Integrity Language Technology, p. 51–52, ACM, 2012.
[30] L. A. Dennis, M. Fisher, N. K. Lincoln, A. Lisitsa, and S. M. Veres, arXiv preprint (2013).
[31] K. J. Åström and B. Wittenmark, Adaptive Control, Courier Dover Publications, 2013.
[32] A. Platzer, Logical Analysis of Hybrid Systems: Proving Theorems for Complex Dynamics, Springer, 2010.
[33] R. Alur, Formal verification of hybrid systems, in Embedded Software (EMSOFT), 2011 Proceedings of the International Conference on, p. 273–278, IEEE, 2011.
[34] A. F. Winfield, C. Blum, and W. Liu, Towards an Ethical Robot: Internal Models, Consequences and Ethical Action Selection, in Advances in Autonomous Robotics Systems, edited by M. Mistry, A. Leonardis, M. Witkowski, and C. Melhuish, p. 85–96, Springer, 2014.
[35] L. Pulina and A. Tacchella, An abstraction-refinement approach to verification of artificial neural networks, in Computer Aided Verification, p. 243–257, 2010.
[36] B. J. E. Taylor, Methods and Procedures for the Verification and Validation of Artificial Neural Networks, Springer, 2006.
[37] J. M. Schumann and Y. Liu, Applications of Neural Networks in High Assurance Systems, Springer, 2010.
[38] D. Andre and S. J. Russell, State abstraction for programmable reinforcement learning agents, in Eighteenth National Conference on Artificial Intelligence, p. 119–125, American Association for Artificial Intelligence, 2002.
[39] D. F. Spears, Assuring the Behavior of Adaptive Agents, in Agent Technology from a Formal Perspective, p. 227–257, Springer, 2006.
[40] P. M. Asaro, International Review of Information Ethics 6, 9–16 (2006).
[41] J. P. Sullins, Philosophy & Technology 24, 233–238 (2011).
[42] A. DeHon, B. Karel, T. F. Knight Jr, G. Malecha, B. Montagu, R. Morisset, G. Morrisett, B. C. Pierce, R. Pollack, S. Ray, O. Shivers, and J. M. Smith, Preliminary Design of the SAFE Platform, in Proceedings of the 6th Workshop on Programming Languages and Operating Systems, ACM, 2011.
[43] T. D. Lane, Machine learning techniques for the computer security domain of anomaly detection, 2000, Ph.D. Dissertation, Department of Electrical Engineering, Purdue University.
[44] K. Rieck, P. Trinius, C. Willems, and T. Holz, Journal of Computer Security 19, 639–668 (2011).
[45] Y. Brun and M. D. Ernst, Finding Latent Code Errors via Machine Learning over Program Executions, in Proceedings of the 26th International Conference on Software Engineering, p. 480–490, IEEE Computer Society, 2004.
[46] M. J. Probst and S. K. Kasera, Statistical trust establishment in wireless sensor networks, in Parallel and Distributed Systems, 2007 International Conference on, volume 2, p. 1–8, IEEE, 2007.
[47] J. Sabater and C. Sierra, Artificial Intelligence Review 24, 33–60 (2005).
[48] H. Hexmoor, B. McLaughlan, and G. Tuli, Journal of Experimental & Theoretical Artificial Intelligence 21, 59–77 (2009).
[49] R. Parasuraman, T. B. Sheridan, and C. D. Wickens, Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on 30, 286–297 (2000).
[50] UNIDIR, The Weaponization of Increasingly Autonomous Technologies: Implications for Security and Arms Control, UNIDIR, 2014.
[51] A. Press, New York Herald Tribune (1933), September 12, p. 1.
[52] Reuters, The Ottawa Citizen (1956), January 3, p. 1.
[53] I. J. Good, Advances in Computers 6, 31–88 (1965).
[54] V. Vinge, The Coming Technological Singularity, in VISION-21 Symposium, NASA Lewis Research Center and the Ohio Aerospace Institute, 1993, NASA CP-10129.
[55] B. Fallenstein and N. Soares, Vingean Reflection: Reliable Reasoning for Self-Modifying Agents, Technical report, Machine Intelligence Research Institute, Berkeley, 2014.
[56] N. Weaver, Paradoxes of rational agency and formal systems that verify their own soundness, 2013, Preprint.
[57] M. Tegmark, Friendly Artificial Intelligence: the Physics Challenge, in Proceedings of the AAAI-15 Workshop on AI and Ethics, p. 87–89, AAAI, 2015.
[58] E. J. Horvitz, Reasoning About Beliefs and Actions Under Computational Resource Constraints, in Third AAAI Workshop on Uncertainty in Artificial Intelligence, p. 429–444, 1987.
[59] S. J. Russell and D. Subramanian, Journal of Artificial Intelligence Research, 1–36 (1995).
[60] M. Tennenholtz, Games and Economic Behavior 49, 363–373 (2004).
[61] P. LaVictoire, B. Fallenstein, E. Yudkowsky, M. Barasz, P. Christiano, and M. Herreshoff, Program Equilibrium in the Prisoner's Dilemma via Löb's Theorem, in AAAI Multiagent Interaction without Prior Coordination Workshop, 2014.
[62] D. Hintze, Problem Class Dominance in Predictive Dilemmas, 2014, Honors Thesis, Arizona State University.
[63] J. Y. Halpern and R. Pass, arXiv preprint arXiv:1308.3778 (2013).
[64] N. Soares and B. Fallenstein, Toward Idealized Decision Theory, Technical report, Machine Intelligence Research Institute, Berkeley, 2014.
[65] N. Soares, Formalizing Two Problems of Realistic World-Models, Technical report, Machine Intelligence Research Institute, Berkeley, 2014.
[66] L. Orseau and M. Ring, Space-Time Embedded Intelligence, in Proceedings of the 5th International Conference on Artificial General Intelligence, p. 209–218, Springer, Berlin, 2012.
[67] N. Soares and B. Fallenstein, Questions of Reasoning Under Logical Uncertainty, Technical report, Machine Intelligence Research Institute, 2014, url: http://intelligence.org/files/QuestionsLogicalUncertainty.pdf.
[68] J. Y. Halpern and R. Pass, arXiv preprint arXiv:1106.2657 (2011).
[69] J. Y. Halpern, R. Pass, and L. Seeman, Topics in Cognitive Science 6, 245–257 (2014).
[70] N. Soares, The Value Learning Problem, Technical report, Machine Intelligence Research Institute, Berkeley, 2014.
[71] N. Soares and B. Fallenstein, Aligning Superintelligence with Human Interests: A Technical Research Agenda, Technical report, Machine Intelligence Research Institute, Berkeley, California, 2014.
[72] N. Bostrom, Superintelligence: Paths, Dangers, Strategies, Oxford University Press, 2014.
[73] S. Russell, Learning Agents for Uncertain Environments, in Proceedings of the Eleventh Annual Conference on Computational Learning Theory, p. 101–103, 1998.
[74] A. Y. Ng and S. Russell, Algorithms for Inverse Reinforcement Learning, in Proceedings of the 17th International Conference on Machine Learning, p. 663–670, 2000.
[75] W. Chu and Z. Ghahramani, Preference learning with Gaussian processes, in Proceedings of the 22nd International Conference on Machine Learning, p. 137–144, ACM, 2005.
[76] R. Yampolskiy, Journal of Consciousness Studies 19, 1–2 (2012).
[77] S. M. Omohundro, The nature of self-improving artificial intelligence, 2007, Presented at Singularity Summit 2007.
[78] N. Bostrom, Minds and Machines 22, 71–85 (2012).
[79] A. Wissner-Gross and C. Freer, Physical Review Letters 110.16: 168702 (2013).
[80] N. Soares, B. Fallenstein, E. Yudkowsky, and S. Armstrong, Corrigibility, in AAAI-15 Workshop on AI and Ethics, 2015.
[81] B. Hibbard, Avoiding unintended AI behaviors, in Artificial General Intelligence, edited by J. Bach, B. Goertzel, and M. Iklé, p. 107–116, Springer, 2012.
[82] B. Hibbard, Ethical Artificial Intelligence, 2014.
[83] B. Hibbard, Self-Modeling Agents and Reward Generator Corruption, in Proceedings of the AAAI-15 Workshop on AI and Ethics, p. 61–64, AAAI, 2015.
[84] E. Horvitz, One-Hundred Year Study of Artificial Intelligence: Reflections and Framing, White paper, Stanford University, 2014.
[85] D. Chalmers, Journal of Consciousness Studies 17, 7–65 (2010).