On the Safety of Machine Learning: Cyber-Physical Systems, Decision Sciences, and Data Products
Kush R. Varshney (Corresponding Author)
Data Science Theory and Algorithms, IBM Thomas J. Watson Research Center, Yorktown Heights, New York 10598
Email: krvarshn@us.ibm.com

Homa Alemzadeh
Electrical and Computer Engineering, University of Virginia, Charlottesville, Virginia 22904
Email: alemzadeh@virginia.edu

Abstract

Machine learning algorithms increasingly influence our decisions and interact with us in all parts of our daily lives. Therefore, just as we consider the safety of power plants, highways, and a variety of other engineered socio-technical systems, we must also take into account the safety of systems involving machine learning. Heretofore, the definition of safety has not been formalized in a machine learning context. In this paper, we do so by defining machine learning safety in terms of risk, epistemic uncertainty, and the harm incurred by unwanted outcomes. We then use this definition to examine safety in all sorts of applications in cyber-physical systems, decision sciences, and data products. We find that the foundational principle of modern statistical machine learning, empirical risk minimization, is not always a sufficient objective. Finally, we discuss how four different categories of strategies for achieving safety in engineering, including inherently safe design, safety reserves, safe fail, and procedural safeguards, can be mapped to a machine learning context. We then discuss example techniques that can be adopted in each category, such as considering interpretability and causality of predictive models, objective functions beyond expected prediction accuracy, human involvement for labeling difficult or rare examples, and user experience design of software and open data.

I. INTRODUCTION

In recent years, machine learning algorithms have started influencing every part of our lives, including health and wellness, law and order, commerce, entertainment, finance, human capital management, communication, transportation, and philanthropy. As the algorithms, the data on which they are trained, and the models they produce are getting more powerful and more ingrained in society, questions about safety must be examined. It may be argued that machine learning systems are simply tools, that they will soon have a general intelligence that surpasses human abilities, or something in-between. But from all perspectives, they are technological components of larger socio-technical systems that may have to be engineered with safety in mind [1].

Safety is a commonly used term across engineering disciplines connoting the absence of failures or conditions that render a system dangerous [2]. Safety is a notion that is domain-specific, cf. safe food and water, safe vehicles and highways, safe medical treatments, safe toys, safe neighborhoods, and safe industrial plants. Each of these domains has specific design principles and regulations that are applicable only to it. There are some loose notions of safety for machine learning, but they are primarily of the "I know it when I see it" variety or are very application-specific; to the best of our knowledge [3], there is no precise, non-application-specific, first-principles definition of safety for machine learning.
The main contribution of this paper is to provide exactly such a definition. To do so, we build upon a universal, domain-agnostic definition of safety in the engineering literature [4], [5]. In [4], [5] and numerous references therein, Möller et al. propose a decision-theoretic definition of safety that applies to a broad set of domains and systems. They define safety to be the reduction or minimization of risk and epistemic uncertainty associated with unwanted outcomes that are severe enough to be seen as harmful. The key points in this definition are: i) the cost of unwanted outcomes has to be sufficiently high in some human sense for events to be harmful, and ii) safety involves reducing both the probability of expected harms and the possibility of unexpected harms.

We define safety in machine learning in the same way, as the minimization of both risk and uncertainty of harms, and devote Section II to fleshing out the details of this definition. As such, formulations of machine learning for achieving safety that we describe in Section III must have both risk and uncertainty minimization in their objective functions, either explicitly, implicitly via constraints, or through socio-technical components beyond the core machine learning algorithm. The harmful cost regime is the part of the space that requires the dual objectives of risk and uncertainty minimization; the non-harmful cost regime does not require the uncertainty minimization objective.

As background before getting to those sections, we briefly describe harms, risk, and uncertainty without specialization to machine learning. A system yields an outcome based on its state and the inputs it receives. An outcome event may be desired or undesired. Single events and sets of events have associated costs that can be measured and quantified by society. For example, a numeric level of morbidity can be the cost of an outcome. An undesired outcome is only a harm if its cost exceeds some threshold. Unwanted events of small severity are not counted as safety issues. Risk is the expected value of the cost. Epistemic uncertainty results from the lack of knowledge that could be obtained in principle, but may be practically intractable to gather [6]. Harmful outcomes often occur in regimes and operating conditions that are unexpected or undetermined. With risk, we do not know what the outcome will be, but its distribution is known, and we can calculate the expectation of its cost. With uncertainty, we still do not know what the outcome will be, but in contrast to risk, its probability distribution is also unknown (or only partially known). Some decision theorists argue that all uncertainty can be captured probabilistically, but we maintain the distinction between risk and uncertainty [5].

The first contribution of this work is to critically examine the foundational statistical machine learning principles of empirical risk minimization and structural risk minimization [7] from the perspective of safety. We discuss how they do not deal with epistemic uncertainty. Further, these principles rely on arguments involving average losses and laws of large numbers, which may not necessarily be fully applicable when considering safety. Moreover, the loss functions involved in these principles are abstract measures of distance between true and predicted values rather than application-specific quantities measuring the possibility of outcomes, such as loss of life or loss of quality of life, that can be judged harmful or not [8].
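To make the distinction between risk and epistemic uncertainty concrete, consider a minimal numeric sketch (ours, not drawn from [5]). When the outcome distribution is known, risk is a single expected cost; under epistemic uncertainty, only a set of plausible distributions is known, and a cautious designer can examine the worst-case expected cost over that set:

```python
import numpy as np

# Two possible outcomes with societal costs (0 = benign, 10 = harmful).
costs = np.array([0.0, 10.0])

# Risk: the outcome distribution is known, so expected cost is well defined.
p_known = np.array([0.95, 0.05])
risk = costs @ p_known  # 0.5

# Epistemic uncertainty: only a set of plausible distributions is known.
# Here, the probability of the harmful outcome is known only to be <= 0.2.
candidate_ps = [np.array([1.0 - q, q]) for q in np.linspace(0.0, 0.2, 21)]
worst_case_risk = max(costs @ p for p in candidate_ps)  # 2.0

print(f"risk under the known distribution: {risk:.2f}")
print(f"worst-case risk under uncertainty: {worst_case_risk:.2f}")
```

Minimizing only the first quantity ignores the second; the strategies of Section III can be read as ways of also controlling the worst case.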
A discussion of safety would be incomplete without a discussion of strategies to increase the safety of socio-technical systems with machine learning components. Four categories of approaches have been identified for promoting safety in general [4]: inherently safe design, safety reserves, safe fail, and procedural safeguards. As a second contribution, we discuss these approaches specifically for machine learning algorithms, especially to mitigate epistemic uncertainty. Through this contribution, we can recommend strategies to engineer safer machine learning methods and set an agenda for further machine learning safety research.

The third contribution of this paper is examining the definition of and strategies for safety in specific machine learning applications. Today, machine learning technologies are used in a variety of settings, including cyber-physical systems, decision sciences, and data products. By cyber-physical systems, we mean engineered systems that integrate computational algorithms and physical components, e.g. surgical robots, self-driving cars, and the smart grid [9]. By decision sciences, we mean the use of algorithms to aid people in making important decisions and informing strategy, e.g. prison parole, medical treatment, and loan approval [10]. By data products, we mean the use of algorithms to automate informational products, e.g. web advertising placement, media recommendation, and spam filtering [10]. These settings vary widely in terms of their interaction with people, the scale of data, the time scale of operation and consequence, and the cost magnitude of consequences.

A further contribution is a discussion on how to even understand and quantify the desirability and undesirability of outcomes along with their costs. To complement simply eliciting such knowledge directly from people [11], we suggest a data-driven approach for characterizing harms that are particularly relevant for cyber-physical systems with large state spaces of outcomes.

Overall, the purpose of this paper is to introduce a common language and framework for understanding, evaluating, and designing machine learning systems that involve society and technology. Our goal is to set forth a fundamental organizing and unifying principle that carries through to abstract theoretical formulations of machine learning as well as to concrete real-world applications of machine learning. Thus it provides practitioners working at any level of abstraction a principled way to reason about the space of socio-technical solutions.

The remainder of the paper is organized in the following manner. In Section II, after introducing the standard notation and concepts of statistical machine learning, we discuss what harm, risk, and epistemic uncertainty mean for machine learning. In Section III, we discuss specific strategies for achieving safety in machine learning. Section IV dives into example applications in cyber-physical systems, decision sciences, and data products. Section V concludes the paper.
II. SAFETY IN MACHINE LEARNING

In this section, after briefly introducing statistical machine learning notation, we examine how machine learning applications fit with the conception of safety given above.

A. Notation

In what follows, we use standard notation to describe concepts from empirical risk minimization [7]. Given joint random variables $X \in \mathcal{X}$ (features) and $Y \in \mathcal{Y}$ (labels) with probability density function $f_{X,Y}(x, y)$, a function mapping $h \in \mathcal{H}: \mathcal{X} \rightarrow \mathcal{Y}$, and a loss function $L: \mathcal{Y} \times \mathcal{Y} \rightarrow \mathbb{R}$, the risk $R(h)$ is defined as the expected value of the loss:

$$R(h) = \mathbb{E}[L(h(X), Y)] = \int_{\mathcal{X}} \int_{\mathcal{Y}} L(h(x), y) f_{X,Y}(x, y) \, dy \, dx.$$

The loss function $L$ typically measures the discrepancy between the value predicted for $y$ using $h(x)$ and $y$ itself, for example $(h(x) - y)^2$ in regression problems. We would like to learn the function $h$ that minimizes the risk. In the machine learning context, we do not have access to the probability density $f_{X,Y}$, but rather to a training set of samples $\{(x_1, y_1), \ldots, (x_m, y_m)\}$ drawn i.i.d. from the joint distribution of $(X, Y)$, and the goal is to learn $h$ such that the empirical risk $R_m^{\mathrm{emp}}(h)$ is minimized. The empirical risk is given by:

$$R_m^{\mathrm{emp}}(h) = \frac{1}{m} \sum_{i=1}^{m} L(h(x_i), y_i).$$
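As a concrete illustration of these definitions (a minimal sketch of ours, not code accompanying the paper), the following computes the empirical risk under squared loss and selects the risk-minimizing member of a toy hypothesis class:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training samples assumed drawn i.i.d. from the joint distribution of (X, Y).
m = 200
x = rng.uniform(-1.0, 1.0, size=m)
y = 2.0 * x + rng.normal(scale=0.1, size=m)

def empirical_risk(h, x, y):
    # R_m^emp(h) = (1/m) * sum_i L(h(x_i), y_i) with squared loss L.
    return np.mean((h(x) - y) ** 2)

# A simple hypothesis class H = {h_a(x) = a * x} over a grid of slopes a.
candidates = [lambda x, a=a: a * x for a in np.linspace(-3.0, 3.0, 61)]
best = min(candidates, key=lambda h: empirical_risk(h, x, y))

print(f"minimized empirical risk: {empirical_risk(best, x, y):.4f}")
```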
B. Harmful Costs

Analyzing safety requires us first to examine whether immediate human costs of outcomes exceed some severity threshold to be harmful. Unlike other domains mentioned in the introduction, such as safe industrial plants and safe toys, we have a great advantage when working with machine learning systems because the optimization formulation explicitly includes the loss function $L$. The domain of $L$ is $\mathcal{Y} \times \mathcal{Y}$ and the output is an abstract quantity representing prediction error. In real-world applications, the value of the loss function may be endowed with some human cost, and that human cost may imply a loss function that also includes $\mathcal{X}$ in the domain. Moreover, the cost may be severe enough to be harmful, and thus a safety issue, in some parts of the domain and not in others.

In many decision science applications, undesired outcomes are truly harmful in a human sense and their effect is felt in near-real time. They are safety issues. Moreover, the space of outcomes is often binary or of small cardinality, and it is often self-evident which outcomes are undesired. However, loss functions are not always monotonic in the correctness of predictions and depend on whose perspective is in the objective.

The space of outcomes for the machine learning components of typical cyber-physical systems applications is so vast that it is near-impossible to enumerate all of the outcomes, let alone elicit costs for them. Nevertheless, it is clear that outcomes leading to accidents have high human cost in real time and require the consideration of safety. In order to get more nuanced characterizations of the cost severity of outcomes, a data-driven approach is prudent [12].

The quality of service implications of unwanted outcomes in data product applications are not typically safety hazards because they do not have an immediate severe human cost. Undesired outcomes may only hypothetically lead to human consequences.

In practice, the acceptable levels of safety and accident rates are often defined by society and the application domain. For example, the difference in acceptable accident rates and costs in motor vehicles (hundreds of thousands of fatalities per year) versus commercial aircraft (tens of fatalities per year) shows the subjectivity of the public's acceptance of safety [13].

C. Risk and Epistemic Uncertainty

The risk minimization approach to machine learning has many strengths, as is evident from its successful application in various domains. We benefit from this explicit optimization formulation in the machine learning domain by automatically reducing the probability of harms, which is not always the case in other domains. However, this standard formulation does not capture the issues related to uncertainty that are also relevant for safety.

First, although it is assumed that the training samples $\{(x_1, y_1), \ldots, (x_m, y_m)\}$ are drawn from the true underlying probability distribution of $(X, Y)$, that may not always be the case. Further, it may be that the distribution the samples actually come from cannot be known, precluding the use of covariate shift [14] and domain adaptation techniques [15]. This is one form of epistemic uncertainty that is quite relevant to safety because training on a dataset from a different distribution can cause much harm. Also, it may be that the training samples do come from the true, but unknown, underlying distribution, but are absent from large parts of the $\mathcal{X} \times \mathcal{Y}$ space due to small probability density there. Here the learned function $h$ will be completely dependent on an inductive bias encoded through $\mathcal{H}$ rather than the uncertain true distribution, which could introduce a safety hazard.

The statistical learning theory analysis utilizes laws of large numbers to study the effect of finite training data and the convergence of $R_m^{\mathrm{emp}}(h)$ to $R(h)$. However, when considering safety, we should also be cognizant that in practice, a machine learning system only encounters a finite number of test samples and the actual operational risk is an empirical quantity on the test set. Thus the operational risk may be much larger than the actual risk for small-cardinality test sets, even if $h$ is risk-optimal. This uncertainty caused by the instantiation of the test set can have large safety implications on individual test samples.

Applications performed at scale, with large training sets, large testing sets, and the ability to explore the feature space, have little epistemic uncertainty, whereas in other applications it is more often than not the case that there is uncertainty about the training samples being representative of the testing samples and that only a few predictions are made. Moreover, in applications such as cyber-physical systems, very large outcome spaces prevent even mild coverage of the space through training samples.
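The gap between the operational risk on a small test set and the true risk can be illustrated with a short simulation (our sketch): even for a fixed, risk-optimal predictor, the empirical risk on a handful of test samples fluctuates widely around the true risk of 0.25.

```python
import numpy as np

rng = np.random.default_rng(1)

def operational_risk(n):
    # Empirical squared-error risk on n test samples of the risk-optimal
    # predictor h(x) = 2x for the model y = 2x + noise (noise std 0.5).
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 2.0 * x + rng.normal(scale=0.5, size=n)
    return np.mean((2.0 * x - y) ** 2)

for n in (5, 50, 5000):
    risks = [operational_risk(n) for _ in range(1000)]
    print(f"n={n:5d}: mean={np.mean(risks):.3f}, std={np.std(risks):.3f}")
```

The mean stays near 0.25 regardless of n, but the spread for small test sets is exactly the uncertainty from the instantiation of the test set described above.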
III. STRATEGIES FOR ACHIEVING SAFETY

As discussed, safety and strategies for achieving it are often investigated on an application-by-application basis. For example, setting the minimum thickness of vessels and removing flammable materials from a chemical plant are ways of achieving safety. By analyzing such strategies across domains, [4] has identified four main categories of approaches to achieve safety.

First, inherently safe design is the exclusion of a potential hazard from the system (instead of controlling the hazard). For example, excluding hydrogen from the buoyant material of a dirigible airship makes it safe. (Another possible safety measure would be to introduce apparatus to prevent the hydrogen from igniting.)

A second strategy for achieving safety is through multiplicative or additive reserves, known as safety factors and safety margins, respectively. In mechanical systems, a safety factor is the ratio between the maximal load that does not lead to failure and the load for which the system was designed. Similarly, the safety margin is the difference between the two.

The third general category of safety measures is 'safe fail,' which implies that a system remains safe when it fails in its intended operation. Examples are electrical fuses, so-called dead man's switches on trains, and safety valves on boilers.

Finally, the fourth strategy for achieving safety is given the name procedural safeguards. This strategy includes measures beyond ones designed into the core functionality of the system, such as audits, training, posted warnings, and so on.

In this section, we discuss each of these strategies along with specific approaches that extend machine learning formulations beyond risk minimization for safety.

1) Inherently Safe Design: In the machine learning context, we would like robustness against the uncertainty of the training set not being sampled from the test distribution. The training set may have various biases that are unknown to the user and that will not be present during the test phase, or may contain patterns that are undesired and might lead to harmful outcomes. Modern techniques such as extreme gradient boosting and deep neural networks may exploit these biases and achieve high accuracy, but they may fail in making safe predictions due to unknown shifts in the data domain or inferring incorrect patterns or harmful rules [16]. These models are so complex that it is very difficult to understand how they will react to such shifts and whether they will produce harmful outcomes as a result.

Two related ways to introduce inherently safe design are by insisting on models that can be interpreted by people and by excluding features that are not causally related to the outcome [17]–[20]. By examining interpretable models, features or functions capturing quirks in the data can be noted and excluded, thereby avoiding related harm. Similarly, by carefully selecting variables that are causally related to the outcome, phenomena that are not a part of the true 'physics' of the system can be excluded, and associated harm avoided. We note that post hoc interpretation and repair of complex uninterpretable models, while appealing for other reasons, does not assure safety via inherently safe design because the interpretation is not the decision rule that is actually used in making predictions.

Neither interpretability nor causality of models is properly captured within the standard risk minimization formulation of machine learning. Extra regularization or constraints on $\mathcal{H}$, beyond those implied by structural risk minimization, are needed to learn inherently safe models. That might lead to a performance loss in accuracy when measured through standard metrics on the training and testing data probability distributions, but safety will be enhanced by the reduction in epistemic uncertainty and undesired bias. Both interpretability and causality may be incorporated into a single learned model, e.g. [21], and causality may be used to induce interpretability, e.g. [22]. In applications with very large outcome spaces, such as those employing reinforcement learning, it has been shown that appropriate aggregation of states in outcome policies can lead to interpretable models [23].
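As a sketch of how an interpretability constraint restricts $\mathcal{H}$ (our illustration; the methods cited in [21]–[23] are more sophisticated), one can cap the depth of a decision tree so that the entire learned rule remains small enough for a person to audit for quirks in the data; this assumes scikit-learn is available:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for a real dataset.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)

# Constrain H to shallow trees: the full decision rule can be read and
# inspected for splits that look like artifacts rather than true 'physics'.
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(model, feature_names=[f"x{i}" for i in range(6)]))
```

A deeper, unconstrained tree would typically score higher on held-out accuracy; accepting the shallower model is the accuracy-for-safety trade described above.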
2) Safety Reserves: In machine learning formulations, the uncertainty in the matching of the training and test data distributions, or in the instantiation of the test set, can be parameterized by a symbol $\theta$. Let $R^*(\theta)$ be the risk of the risk-optimal model if $\theta$ were known. Along the same lines as safety factors and safety margins, robust formulations find $h$ while constraining or minimizing $\max_\theta \frac{R(h, \theta)}{R^*(\theta)}$ or $\max_\theta \left( R(h, \theta) - R^*(\theta) \right)$. Such formulations can capture uncertainty in the class priors and uncertainty resulting from label noise in classification problems [24], [25]. They can also capture the uncertainty of which part of the $\mathcal{X}$ space the actual small set of test samples comes from.

A different sort of safety factor comes about when considering fairness and equitability. In certain prediction problems, the risk of harm for members of protected groups should not be much worse (up to a multiplicative factor) than the risk of harm for others [26]–[28]. We can partition the feature space $\mathcal{X}$ into the sets $\mathcal{X}_u, \mathcal{X}_p \subset \mathcal{X}$, respectively corresponding to the unprotected and protected groups, indicated by features such as race and gender. Then, using a rule such as the 80% (or four-fifths) rule advocated in the study of disparate impact [29], we can constrain the relative risk of harm for the protected versus unprotected group to a maximum value such as $5/4$:

$$\frac{\int_{\mathcal{X}_p} \int_{\mathcal{Y}} L(x, h(x), y) f_{X,Y}(x, y) \, dy \, dx}{\int_{\mathcal{X}_u} \int_{\mathcal{Y}} L(x, h(x), y) f_{X,Y}(x, y) \, dy \, dx} \leq \frac{5}{4}.$$

Under such a constraint, we ensure that the outcome of prediction for protected groups is not much more harmful than for unprotected groups.
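A minimal empirical analogue of this constraint (our sketch, with hypothetical per-sample losses and group labels) estimates the two integrals by group-wise average loss and checks the ratio against the 5/4 reserve:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical evaluation data: per-sample losses and protected-group flags.
losses = rng.exponential(scale=1.0, size=1000)
protected = rng.random(1000) < 0.3  # True for the protected group

ratio = losses[protected].mean() / losses[~protected].mean()
print(f"relative risk of harm: {ratio:.3f}")
if ratio > 5 / 4:
    print("safety reserve violated: impose the fairness constraint and retrain")
```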
3) Safe Fail: A technique used in machine learning when predictions cannot be given confidently is the reject option [30]: the model reports that it cannot reliably give a prediction and does not attempt to do so, thereby failing safely. When the model selects the reject option, typically a human operator intervenes, examines the test sample, and provides a manual prediction.

In classification problems, models are reported to be least confident near the decision boundary. However, there is an implicit assumption here that distance from the decision boundary is inversely related to confidence. This is reasonable in parts of $\mathcal{X}$ with high probability density and large numbers of training samples, because the decision boundary is located where there is a large overlap in likelihood functions. However, parts of $\mathcal{X}$ with low density may not contain any training samples at all, and the decision boundary there may be completely based on an inductive bias, thereby containing much epistemic uncertainty. In these parts of the space, distance from the decision boundary is fairly meaningless, and the typical trigger for the reject option should be avoided [31]. For a rare combination of features in a test sample [32], a safe fail mechanism is to always go for manual examination.

Both of these manual intervention options are suitable for applications with sufficiently long time scales. When working on the scale of milliseconds, only options similar to dead man's switches that stop operations in a reasonable manner are applicable.
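The following sketch (ours, with an assumed probabilistic classifier) combines the two triggers just described: reject when confidence is low near the decision boundary, and also when the test point lies in a low-density region where that confidence cannot be trusted:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

X_train, y_train = make_classification(n_samples=500, n_features=4,
                                        random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# Typical nearest-neighbor distance among training points: a crude proxy
# for identifying sparsely covered (high epistemic uncertainty) regions.
nn = NearestNeighbors(n_neighbors=2).fit(X_train)
d_train = nn.kneighbors(X_train)[0][:, 1]  # distance to nearest other point
d_cutoff = np.quantile(d_train, 0.95)

def predict_with_reject(x, conf_threshold=0.8):
    x = x.reshape(1, -1)
    if nn.kneighbors(x, n_neighbors=1)[0][0, 0] > d_cutoff:
        return "reject: low-density region, send for manual examination"
    if model.predict_proba(x).max() < conf_threshold:
        return "reject: low confidence near the decision boundary"
    return int(model.predict(x)[0])

print(predict_with_reject(X_train[0]))
print(predict_with_reject(np.full(4, 10.0)))  # far from all training data
```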
4) Procedural Safeguards: In addition to general procedural safeguards that carry over from other domains, two directions in machine learning that can be used to increase safety within this category are user experience design and openness.

In decision science applications especially, non-specialists are often the operators of machine learning systems. Defining the training data set and setting up evaluation procedures, among other things, have certain subtleties that can cause harm during operation if done incorrectly. User experience design can be used to guide and warn novice and experienced practitioners alike to set up machine learning systems properly and thereby increase safety.

These days, most modern machine learning algorithms are open source, which allows for the possibility of public audit. Safety hazards and potential harms can be discovered through examination of source code. However, open source software is not sufficient, because the behavior of machine learning systems is driven by data as much as it is driven by software implementations of algorithms. Open data refers to data that can be freely used, reused, and redistributed by anyone. Opening data is a procedural safeguard for increasing safety that is increasingly being adopted by the community [33]–[35].

IV. EXAMPLE APPLICATIONS

In this section, we further detail safety in machine learning systems by providing examples from cyber-physical systems, decision sciences, and data products.

A. Cyber-Physical Systems

With advances in computing, networking, and sensing technologies, cyber-physical systems have been deployed in various safety-critical settings such as aerospace, energy, transportation, and healthcare. The increasing complexity and connectivity of these systems, the tight coupling between their cyber and physical components, and the inevitable involvement of human operators in their supervision and control have introduced significant challenges in ensuring system reliability and safety while maintaining the expected performance. Cyber-physical systems continuously interact with the physical world and human operators in real time. In order to adapt to the constantly changing and uncertain environment, they need to take into account not only the current application but also the operator's preferences, intent, and past behavior [36]. Autonomous machine learning and artificial intelligence techniques have been applied to several decision-making and control problems in cyber-physical systems. Here we discuss two examples where unexpected harmful events with epistemic uncertainty might impact human lives in real time.

1) Surgical Robots: Robotically-assisted surgical systems are a typical example of human-in-the-loop cyber-physical systems. Surgical robots consist of a teleoperation console operated by a surgeon, an embedded system hosting the automated robot control, and the physical robotic actuators and sensors. The robot control system receives the surgeon's commands issued using the teleoperation console and translates the surgeon's hand, wrist, and finger movements into precisely engineered movements of miniaturized surgical instruments inside the patient's body.

Recent research shows an increasing interest in the use of machine learning algorithms for modeling surgical skills, workflow, and environment, and in the integration of this knowledge into the control and automation of surgical robots [37]. Machine learning techniques have been used for the detection and classification of surgical motions for automated surgical skill evaluation [38]–[40] and for automating portions of repetitive and time-consuming surgical tasks (e.g., knot-tying and suturing) [40], [41].

In autonomous robotic surgery, a machine learning enabled surgical robot continuously estimates the state of the environment (e.g., the length or thickness of soft tissues under surgery) based on the measurements from sensors (e.g., image data or force signals) and generates a plan for executing actions (e.g., moving the robotic instruments along a trajectory). The mapping function from the perception of the environment to the robotic actions is considered a surgical skill which the robot learns, through evaluation of its own actions or from observing the actions of expert surgeons. The quality of the learned surgical skills can be assessed using cost functions that are either automatically learned or manually defined by surgeons [37].

Given the uncertainty and large variability in operator actions and behavior, organ/tissue movements and dynamics, and the possibility of incidental failures in the robotic system and instruments, predicting all possible system states and outcomes and assessing their associated costs is very challenging. As mentioned in Section II-B, due to the very large outcome space, it is not straightforward to elicit the costs of all different outcomes and characterize which tasks or actions are costly enough to represent safety issues. For example, there have been ongoing reports of safety incidents during use of surgical robots that negatively impact patients by causing procedure interruptions or minor injuries. These incidents happen despite the safe fail mechanisms included in the system and often result from a combination of different causal factors and unexpected conditions, including malfunctions of surgical instruments, actions taken by the surgeon, and the patient's medical history [12].

There are also practical limitations in learning optimal and safe surgical trajectories and workflows due to epistemic uncertainty in such environments. The training data often consists of samples collected from a select set of surgical tasks (e.g., elementary suturing gestures) performed by well-trained surgeons, which might not represent the variety of actions and tasks performed during a real procedure. Previous work shows that the surgeon's expertise level, surgery type, and medical history have a significant impact on the possibility of complications and errors occurring during surgery. Further, automated algorithms should be able to cope with uncertainty and unpredictable events and guarantee patient safety just as expert surgeons do in such scenarios [37].
One solution for dealing with these uncertainties is to assess the robustness of the system in the presence of unwanted and rare hazardous events (e.g., failures in the control system, noisy sensor measurements, or incorrect commands sent by novice operators) by simulating such events in virtual environments [42] and quantifying the possibility of the learning algorithm making safe decisions. This approach is an example of procedural safeguards (Section III-4). Such a simulated assessment also serves to highlight the situations requiring safe fail strategies, such as converting the procedure to non-robotic techniques, rescheduling it to a later time, or restarting the system, that can refine the system.

The costs of unwanted outcomes and the safe fail strategies to cope with them can also be characterized based on past data. For example, we mined the FDA's Manufacturer and User Facility Device Experience (MAUDE) database, a large database containing 14 years worth of adverse events, to obtain such characterizations of the causes and severity of safety incidents and the recovery actions taken by the surgical team. Such analysis helps focus the development of machine learning algorithms containing safety strategies on regimes with harmful outcomes and avoid concern for safety strategies in regimes with non-harmful outcomes.

Another solution currently adopted in practice is supervisory control of automated surgical tasks instead of fully autonomous surgery. For example, if the robot generates a geometrically optimized suture plan based on sensor data or surgeon input, it should still be tracked and updated in real time because of possible tissue motion and deformation during surgery [41]. This is an example of examining interpretable models to avoid possible harm (as discussed in Section III-1). An example of adopting safety reserves (Section III-2) in robotic surgery is robust optimization of preoperative planning to minimize the uncertainty at the task level while maximizing dexterity [43].
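A simulation-based assessment of this kind might look like the following sketch, with an entirely hypothetical controller and fault model standing in for a real surgical system: inject sensor faults in a virtual environment and estimate how often the learned policy still reaches a safe decision.

```python
import numpy as np

rng = np.random.default_rng(3)

def controller(sensed_gap_mm):
    # Hypothetical learned policy: advance the instrument only when the
    # sensed tissue gap is comfortably large; otherwise hold position.
    return "advance" if sensed_gap_mm > 5.0 else "hold"

def trial(fault_rate=0.1):
    true_gap = rng.uniform(0.0, 20.0)
    sensed = true_gap
    if rng.random() < fault_rate:
        sensed += rng.normal(scale=8.0)  # injected sensor fault
    # Unsafe outcome: advancing when the true gap is actually too small.
    return not (controller(sensed) == "advance" and true_gap <= 5.0)

safe_fraction = np.mean([trial() for _ in range(100_000)])
print(f"estimated probability of a safe decision under faults: "
      f"{safe_fraction:.4f}")
```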
2) Self-Driving Cars: Self-driving cars are autonomous cyber-physical systems capable of making intelligent navigation decisions in real time without any human input. They combine a range of sensor data from laser range finders and radars with video and GPS data to generate a detailed 3D map of the environment and estimate their position. The control system of the car uses this information to determine the optimal path to the destination and sends the relevant commands to actuators that control the steering, braking, and throttle.

Machine learning algorithms are used in the control system of self-driving cars to model, identify, and track the dynamic environment, including road conditions and moving objects (e.g., other cars and pedestrians). Although automated driving systems are expected to eliminate human driver errors and reduce the possibility of crashes, there are several sources of uncertainty and failure that might lead to potential safety hazards in these systems. Unreliable or noisy sensor signals (e.g., GPS data or video signals in bad weather conditions), limitations of computer vision systems, and unexpected changes in the environment (e.g., unknown driving scenes or unexpected accidents on the road) can adversely affect the ability of the control system to learn and understand the environment and make safe decisions [44].

For example, a self-driving car (in auto-pilot mode) recently collided with a truck after failing to apply the brakes, leading to the death of the car's driver. This was the first known fatality in over 130 million miles of testing of the automated driving system. The accident occurred under extremely rare circumstances: the high height of the truck, its white color against a bright sky, and the positioning of the vehicles across the road [45].

The importance of epistemic uncertainty, or "uncertainty on uncertainty," in these AI-assisted systems has recently been recognized, and there are ongoing research efforts towards quantifying the robustness of self-driving cars to events that are rare (e.g., the distance to a bicycle running on an expected trajectory) or not present in the training data (e.g., unexpected trajectories of moving objects) [46]. Systems that recognize such rare events trigger safe fail mechanisms. To the best of our knowledge, there is no self-driving car system with an inherently safe design that utilizes, e.g., interpretable models [47]. Fail-safe mechanisms that, upon detection of failures or less confident predictions, stop the autonomous control software and switch to a backup system or a degraded level of autonomy (e.g., full control by the driver) have been considered for self-driving cars [48].

B. Decision Sciences

In decision sciences applications, people are in the loop in a different way than in cyber-physical systems, but in the loop nonetheless. Decisions are made about people, and they are made by people using machine learning-based tools for support. Many emerging application domains are now shifting to data-driven decision making due to a greater capture of information digitally and the desire to be more scientific rather than relying on (fallible) gut instinct [49]. These applications present many safety-related challenges.

1) Predicting Voluntary Resignation: We recently studied the problem of predicting which IBM employees will voluntarily resign from the company in the next six months based on human resources and compensation data, which required us to develop a classification algorithm to be placed within a larger decision-making system involving human decision makers [50].

There are several sources of epistemic uncertainty in this problem. First, the way to construct a training set in this problem is to look at the historical set of employees and treat employees who voluntarily resigned as positive samples and employees still in the workforce as negative samples. However, since the prediction problem is to predict resignation in the next six months, our set of negative samples will necessarily include employees who should be labeled positively because they will be resigning soon [51]. Another uncertainty is related to quirks or vagaries in the data that are predictive but will not generalize. In this problem, a few predictive features related to stipulations in employees' contracts to remain with IBM for a fixed duration after their company was acquired, but such a pattern would not remain true going forward. Another issue is unique feature vectors: if the data contains an employee in Australia who has gone 17 years without being promoted and no other similar employees, then there is huge uncertainty in that part of the feature space, and inductive bias must be completely relied upon.
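The first source of uncertainty, contaminated negative labels, can be guarded against mechanically. The sketch below (ours, with hypothetical records in place of the real HR data) checks that the labeling snapshot leaves the full six-month outcome window observable before labels are assigned:

```python
import pandas as pd

# Hypothetical HR records: NaT means no resignation recorded yet.
records = pd.DataFrame({
    "employee_id": [1, 2, 3],
    "resign_date": pd.to_datetime(["2011-10-01", None, "2012-03-01"]),
})

data_end = pd.Timestamp("2012-04-01")  # last date covered by the data
snapshot = pd.Timestamp("2012-01-01")  # candidate labeling date
window_end = snapshot + pd.DateOffset(months=6)

# If the outcome window extends past the end of the data, some "negatives"
# may in fact resign soon: the label contamination described above.
if window_end > data_end:
    print("warning: outcome window not fully observed; negatives contaminated")

records["label"] = (records["resign_date"].notna()
                    & (records["resign_date"] > snapshot)
                    & (records["resign_date"] <= window_end))
print(records)
```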
In the solution created for this problem, the inherently safe design principle of interpretability (Section III-1) was insisted upon and was what led to the discovery about the acquired company. Specifically, C5.0 decision trees were used with the rule set option, and the project directly motivated the study of an optimization approach for learning classification rules [52]. The reason for conducting the project was to take actions such as salary increases to retain employees at risk of resigning, and for this, the other inherently safe design principle, causality, is important. Rare samples such as the Australian employee led to the safe fail mechanism of manual inspection.

2) Loan Approval: As another example in the decision sciences that we have studied, let us consider the decision to approve loans for solar panels given to the rural poor in India based on data in application forms [53]. The epistemic uncertainties related to the training set not being representative of the true test distribution repeat here and can be addressed by safety strategies similar to those discussed in the previous examples.

Loan approval is an example illustrating that loss functions are not always monotonic in the correctness of predictions and depend on perspective. The applicant would like an approval decision regardless of their features indicating ability to repay, the lender would like approval only in cases in which applicant features indicate likely repayment, and society would like there to be fairness or equitability in the system so that protected groups, such as those defined by gender and religion, are not discriminated against. The lender perspective is consistent with the typical choice of the loss function, but the others are not.

An interesting additional issue in this case relates to the human cost function from society's perspective including $\mathcal{X}$. One of the attributes available in the problem was the surname of the applicant; in this part of India, the surname is a strong indicator of religion and caste. The use of this variable as a feature improved classification accuracy by a couple of percentage points, but resulted in worse fairness: the true cost in the problem from society's perspective. Simply dropping the attribute as a feature does not ensure fairness because other features may be correlated with it, but a safety margin on the accuracy of the groups makes the system fairer.
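A simple screen for such proxy features (our sketch, with synthetic data in place of the application forms) is to measure how strongly each remaining feature correlates with the dropped sensitive attribute:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 1000

# Hypothetical loan-application features; 'community' is the sensitive one.
community = rng.integers(0, 2, size=n).astype(float)
district = 0.8 * community + 0.2 * rng.random(n)  # strong proxy
income = rng.normal(size=n)                       # roughly independent

df = pd.DataFrame({"community": community,
                   "district": district,
                   "income": income})

# Dropping the sensitive column is not enough if other features encode it.
proxies = df.drop(columns="community").corrwith(df["community"]).abs()
print(proxies.sort_values(ascending=False))
```

Features that correlate strongly would need the same scrutiny, or a group-wise accuracy margin, as the surname attribute itself.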
C. Data Products

With data products applications, the first question to consider is whether immediate costs are large enough for them to be considered safety issues. One may argue that an algorithm showing biased or misguided advertisements or a spam filter not allowing an important email to pass could eventually lead to harm, e.g., by being shown an ad for a lower-paying job rather than a higher-paying one, a person may hypothetically end up with a lower quality of life at some point in the future. Here the cost function does depend on $\mathcal{X}$ because misclassifying certain emails is more costly than others. However, we do not view such a delayed and only hypothetical consequence as a safety issue.

Moreover, in typical data products applications, one can use billions of data points as training, perform large-scale A/B testing, and evaluate average performance on millions or billions of clicks. Therefore, uncertainty is not at the forefront, and neither are the safety strategies. For example, the procedural safeguard of opening data is more common in decision science applications, such as those sponsored or run by governments, than in data products applications where the data is often the key value proposition.

V. CONCLUSION

Machine learning systems are already embedded in many functions of society. The prognosis is for broad adoption to only increase across all areas of life. With this prevailing trend, researchers, engineers, and ethicists have started discussing the topic of safety in machine learning. In this paper, we contribute to this discussion starting from a very basic definition of safety in terms of harm, risk, and uncertainty and building upon it in the machine learning context. We identify that the minimization of epistemic uncertainty is missing from standard modes of machine learning developed around statistical risk minimization and that it needs to be included when considering safety.

We discuss a few strategies for increasing safety in machine learning; these are not a comprehensive list and are far from fully developed. This paper can be seen as laying the foundations for a research agenda motivated by safety within which further strategies can be developed and existing strategies can be fleshed out. In some respects, the research community has taken risk minimization close to the limits of what is achievable. Safety, especially epistemic uncertainty minimization, represents a direction that offers new and exciting problems to pursue, many of which are being pursued already.

As it is said in the Sanskrit literature, ahiṃsā paramo dharmaḥ (non-harm is the ultimate direction). Moreover, not only is non-harm the first ethical duty, many of the safety issues for machine learning we have discussed in this paper are starting to enter legal obligations as well. For example, the European Union has recently adopted a set of comprehensive regulations for data protection, which include prohibiting algorithms that make any "decision based solely on automated processing, including profiling" that significantly affects a data subject or produces legal effects concerning him or her. This regulation, which will take effect in 2018, is anticipated to restrict a wide range of machine learning algorithms currently used in, e.g., recommendation systems, credit and insurance risk assessments, and social networks [54].

We present example applications where machine learning algorithms are increasingly used and discuss the aspects of epistemic uncertainty, harmful outcomes, and potential strategies for achieving safety for each application. In some applications, such as cyber-physical systems and decision sciences, machine learning algorithms are used to support control and decision making in safety-critical settings with considerable costs and direct harmful impact on people's lives, such as injury or loss of life. In other applications, machine learning based predictions are only used in less critical settings for automated informational products. Applications with higher costs of unwanted outcomes tend to be also those with higher uncertainty, and the ones with less severe outcomes are the ones with smaller uncertainty.

VI. ACKNOWLEDGEMENTS

No competing financial interests exist.

REFERENCES
[1] A. Conn, "The AI wars: The battle of the human minds to keep artificial intelligence safe," http://futureoflife.org/2015/12/17/the-ai-wars-the-battle-of-the-human-minds-to-keep-artificial-intelligence-safe, Dec. 2015.
[2] T. Ferrell, "Engineering safety-critical systems in the 21st century," 2010.
[3] K. R. Varshney, "Engineering safety in machine learning," in Proc. Inf. Theory Appl. Workshop, La Jolla, CA, Feb. 2016.
[4] N. Möller and S. O. Hansson, "Principles of engineering safety: Risk and uncertainty reduction," Reliab. Eng. Syst. Safe., vol. 93, no. 6, pp. 798–805, Jun. 2008.
[5] N. Möller, "The concepts of risk and safety," in Handbook of Risk Theory, S. Roeser, R. Hillerbrand, P. Sandin, and M. Peterson, Eds. Dordrecht, Netherlands: Springer, 2012, pp. 55–85.
[6] R. Senge, S. Bösner, K. Dembczynski, J. Haasenritter, O. Hirsch, N. Donner-Banzhoff, and E. Hüllermeier, "Reliable classification: Learning classifiers that distinguish aleatoric and epistemic uncertainty," Inf. Sci., vol. 255, pp. 16–29, Jan. 2014.
[7] V. Vapnik, "Principles of risk minimization for learning theory," in Adv. Neur. Inf. Process. Syst. 4, 1992, pp. 831–838.
[8] K. L. Wagstaff, "Machine learning that matters," in Proc. Int. Conf. Mach. Learn., Edinburgh, United Kingdom, Jun.–Jul. 2012, pp. 529–536.
[9] H. Alemzadeh, "Data-driven resiliency assessment of medical cyber-physical systems," Ph.D. dissertation, Univ. Illinois, Urbana-Champaign, Urbana, IL, 2016.
[10] J. Stanley and D. Tunkelang, "Doing data science right — your most common questions answered," http://firstround.com/review/doing-data-science-right-your-most-common-questions-answered, 2016.
[11] A. Olteanu, K. Talamadupula, and K. R. Varshney, "The limits of abstract evaluation metrics: The case of hate speech detection," in Proc. ACM Web Sci. Conf., Troy, NY, Jun. 2017, pp. 405–406.
[12] H. Alemzadeh, J. Raman, N. Leveson, Z. Kalbarczyk, and R. K. Iyer, "Adverse events in robotic surgery: A retrospective study of 14 years of FDA data," PLoS ONE, vol. 11, no. 4, pp. 1–20, Apr. 2016.
[13] J. Knight, Fundamentals of Dependable Computing for Software Engineers. CRC Press, 2012.
[14] H. Shimodaira, "Improving predictive inference under covariate shift by weighting the log-likelihood function," Journal of Statistical Planning and Inference, vol. 90, no. 2, pp. 227–244, 2000.
[15] H. Daumé III and D. Marcu, "Domain adaptation for statistical classifiers," Journal of Artificial Intelligence Research, vol. 26, pp. 101–126, 2006.
[16] R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad, "Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission," in Proc. ACM SIGKDD Conf. Knowl. Discov. Data Min., Sydney, Australia, Aug. 2015, pp. 1721–1730.
[17] A. A. Freitas, "Comprehensible classification models – a position paper," SIGKDD Explorations, vol. 15, no. 1, pp. 1–10, Jun. 2013.
[18] C. Rudin, "Algorithms for interpretable machine learning," in Proc. ACM SIGKDD Conf. Knowl. Discov. Data Min., New York, NY, Aug. 2014, p. 1519.
[19] S. Athey and G. W. Imbens, "Machine learning methods for estimating heterogeneous causal effects," http://arxiv.org/pdf/1504.01132.pdf, Jul. 2015.
[20] M. Welling, "Are ML and statistics complementary?" in IMS-ISBA Meeting on 'Data Science in the Next 50 Years', Dec. 2015.
[21] F. Wang and C. Rudin, "Causal falling rule lists," http://arxiv.org/pdf/1510.05189.pdf, Oct. 2015.
[22] A. Chakarov, A. Nori, S. Rajamani, S. Sen, and D. Vijaykeerthy, "Debugging machine learning tasks," http://arxiv.org/pdf/1603.07292.pdf, Mar. 2016.
[23] M. Petrik and R. Luss, "Interpretable policies for dynamic product recommendations," in Proc. Conf. Uncertainty Artif. Intell., Jersey City, NJ, Jun. 2016, p. 74.
[24] F. Provost and T. Fawcett, "Robust classification for imprecise environments," Mach. Learn., vol. 42, no. 3, pp. 203–231, Mar. 2001.
[25] M. A. Davenport, R. G. Baraniuk, and C. D. Scott, "Tuning support vector machines for minimax and Neyman-Pearson classification," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 10, pp. 1888–1898, Oct. 2010.
[26] S. Hajian and J. Domingo-Ferrer, "A methodology for direct and indirect discrimination prevention in data mining," IEEE Trans. Knowl. Data Eng., vol. 25, no. 7, pp. 1445–1459, Jul. 2013.
[27] M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian, "Certifying and removing disparate impact," in Proc. ACM SIGKDD Conf. Knowl. Discov. Data Min., Sydney, Australia, Aug. 2015, pp. 259–268.
[28] S. Barocas and A. D. Selbst, "Big data's disparate impact," California Law Rev., vol. 104, 2016.
[29] The U.S. EEOC, "Uniform guidelines on employee selection procedures," 1979.
[30] K. R. Varshney, R. J. Prenger, T. L. Marlatt, B. Y. Chen, and W. G. Hanley, "Practical ensemble classification error bounds for different operating points," IEEE Trans. Knowl. Data Eng., vol. 25, no. 11, pp. 2590–2601, Nov. 2013.
[31] J. Attenberg, P. Ipeirotis, and F. Provost, "Beat the machine: Challenging humans to find a predictive model's 'unknown unknowns'," ACM J. Data Inf. Qual., vol. 6, no. 1, p. 1, Mar. 2015.
[32] G. M. Weiss, "Mining with rarity: A unifying framework," SIGKDD Explor. Newsletter, vol. 6, no. 1, pp. 7–19, Jun. 2004.
[33] A. Sahuguet, J. Krauss, L. Palacios, and D. Sangokoya, "Open civic data: Of the people, by the people, for the people," Bull. Tech. Comm. Data Eng., vol. 37, no. 4, pp. 15–26, Dec. 2014.
[34] E. Shaw, "Improving service and communication with open data: A history and how-to," Ash Center, Harvard Kennedy School, Tech. Rep., Jun. 2015.
[35] S. Kapoor, A. Mojsilović, J. N. Strattner, and K. R. Varshney, "From open data ecosystems to systems of innovation: A journey to realize the promise of open data," in Proc. Data for Good Exchange Conf., New York, NY, Sep. 2015.
[36] G. Schirner, D. Erdogmus, K. Chowdhury, and T. Padir, "The future of human-in-the-loop cyber-physical systems," Computer, no. 1, pp. 36–45, 2013.
[37] Y. Kassahun, B. Yu, A. T. Tibebu, D. Stoyanov, S. Giannarou, J. H. Metzen, and E. Vander Poorten, "Surgical robotics beyond enhanced dexterity instrumentation: A survey of machine learning techniques and their role in intelligent and autonomous surgical actions," International Journal of Computer Assisted Radiology and Surgery, vol. 11, no. 4, pp. 553–568, 2016.
[38] H. C. Lin, I. Shafran, T. E. Murphy, A. M. Okamura, D. D. Yuh, and G. D. Hager, Automatic Detection and Segmentation of Robot-Assisted Surgical Motions. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005, pp. 802–810.
[39] H. C. Lin, I. Shafran, D. Yuh, and G. D. Hager, "Towards automatic skill evaluation: Detection and segmentation of robot-assisted surgical motions," Computer Aided Surgery, vol. 11, no. 5, pp. 220–230, 2006.
[40] C. E. Reiley, E. Plaku, and G. D. Hager, "Motion generation of robotic surgical tasks: Learning from expert demonstrations," in Proc. Annu. Int. Conf. IEEE Engineering in Medicine and Biology, Aug. 2010, pp. 967–970.
[41] A. Shademan, R. S. Decker, J. D. Opfermann, S. Leonard, A. Krieger, and P. C. W. Kim, "Supervised autonomous robotic soft tissue surgery," Science Translational Medicine, vol. 8, no. 337, p. 337ra64, 2016.
[42] H. Alemzadeh, D. Chen, A. Lewis, Z. Kalbarczyk, J. Raman, N. Leveson, and R. K. Iyer, "Systems-theoretic safety assessment of robotic telesurgical systems," in Proc. Int. Conf. Comput. Safety Reliability Secur., 2015, pp. 213–227.
[43] H. Azimian, M. D. Naish, B. Kiaii, and R. V. Patel, "A chance-constrained programming approach to preoperative planning of robotic cardiac surgery under task-level uncertainty," IEEE Trans. Biomed. Health Inf., vol. 19, no. 2, Mar. 2015.
[44] S. Rayej, "How do self-driving cars work?" http://robohub.org/how-do-self-driving-cars-work/, 2014.
[45] J. Lowy, "Driver killed in self-driving car accident for first time," http://www.pbs.org/newshour/rundown/driver-killed-in-self-driving-car-accident-for-first-time, 2016.
[46] J. Duchi, P. Glynn, and R. Johari, "Uncertainty on uncertainty, robustness, and simulation," SAIL-Toyota Center for AI Research, Stanford University, Tech. Rep., Jan. 2016.
[47] Y. Zhu and V. Janapa Reddi, "Cognitive computing safety: The new horizon for reliability," IEEE Micro, forthcoming.
[48] P. Koopman and M. Wagner, "Challenges in autonomous vehicle testing and validation," SAE International Journal of Transportation Safety, vol. 4, no. 2016-01-0128, pp. 15–24, 2016.
[49] E. Brynjolfsson, L. Hitt, and H. Kim, "Strength in numbers: How does data-driven decision-making affect firm performance?" in Proc. Int. Conf. Inf. Syst., Shanghai, China, Dec. 2011, p. 13.
[50] M. Singh, K. R. Varshney, J. Wang, A. Mojsilović, A. R. Gill, P. I. Faur, and R. Ezry, "An analytics approach for proactively combating voluntary attrition of employees," in Proc. IEEE Int. Conf. Data Min. Workshops, Brussels, Belgium, Dec. 2012, pp. 317–323.
[51] D. Wei and K. R. Varshney, "Robust binary hypothesis testing under contaminated likelihoods," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Brisbane, Australia, Apr. 2015, pp. 3407–3411.
[52] D. M. Malioutov and K. R. Varshney, "Exact rule learning via Boolean compressed sensing," in Proc. Int. Conf. Mach. Learn., Atlanta, GA, Jun. 2013, pp. 765–773.
[53] H. Gerard, K. Rao, M. Simithraaratchy, K. R. Varshney, K. Kabra, and G. P. Needham, "Predictive modeling of customer repayment for sustainable pay-as-you-go solar power in rural India," in Proc. Data for Good Exchange Conf., New York, NY, Sep. 2015.
[54] B. Goodman and S. Flaxman, "European Union regulations on algorithmic decision-making and a 'right to explanation'," in Proc. ICML Workshop Human Interpretability, New York, NY, Jun. 2016, pp. 26–30.