Opportunistic Adaptation Knowledge Discovery
Adaptation has long been considered as the Achilles' heel of case-based reasoning since it requires some domain-specific knowledge that is difficult to acquire. In this paper, two strategies are combined in order to reduce the knowledge engineering c…
Authors: Fadi Badra (INRIA Lorraine - LORIA), Amelie Cordier (LIRIS), Jean Lieber (INRIA Lorraine - LORIA)
Fadi Badra¹, Amélie Cordier², and Jean Lieber¹

¹ LORIA (CNRS, INRIA, Nancy Universities), BP 239, 54506 Vandœuvre-lès-Nancy, France
{badra,lieber}@loria.fr
² LIRIS CNRS UMR 5202, Université Lyon 1, INSA Lyon, Université Lyon 2, ECL, 43, bd du 11 novembre 1918, Villeurbanne, France
Amelie.Cordier@liris.cnrs.fr

Abstract. Adaptation has long been considered as the Achilles' heel of case-based reasoning since it requires some domain-specific knowledge that is difficult to acquire. In this paper, two strategies are combined in order to reduce the knowledge engineering cost induced by the adaptation knowledge (AK) acquisition task: AK is learned from the case base by means of knowledge discovery techniques, and the AK acquisition sessions are opportunistically triggered, i.e., at problem-solving time.

1 Introduction

Case-based reasoning (CBR [6]) is a reasoning paradigm based on the reuse of previous problem-solving experiences, called cases. A CBR system often relies on a retrieval procedure, which selects from a case base a source case similar to the target problem, and an adaptation procedure, which adapts the retrieved source case to the specifics of the target problem. The adaptation procedure depends on domain-dependent adaptation knowledge (AK, in the following). Acquiring AK can be done from experts or by using machine learning techniques. An intermediate approach is knowledge discovery (KD), which combines efficient learning algorithms with human-machine interaction. Most previous AK acquisition strategies are off-line: they are disconnected from the use of the CBR system. By contrast, recent work aims at integrating AK acquisition from experts into specific reasoning sessions: this opportunistic AK acquisition takes advantage of the problem-solving context.
This paper presents an approach to AK discovery that is opportunistic: the KD is triggered at problem-solving time. The paper is organized as follows. Section 2 introduces some basic notions and notations about CBR. Section 3 presents the CBR system Taaable, which constitutes the application context of the study, and motivates the need for adaptation knowledge acquisition in this application context. Section 4 presents the proposed opportunistic and interactive AK discovery method. In Sect. 5, this method is applied to acquire adaptation knowledge in the context of the Taaable system. Section 6 discusses this approach and situates it among related work. Section 7 concludes and presents some future work.

2 Basic Notions About CBR

In the following, problems are assumed to be represented in a language L_pb and solutions in a language L_sol. A source case represents a problem-solving episode by a pair (srce, Sol(srce)), in which srce ∈ L_pb is the representation of a problem statement and Sol(srce) ∈ L_sol is the representation of its associated solution. CBR aims at solving a target problem tgt using a set of source cases CB called the case base. The CBR process is usually decomposed in two main steps: retrieval and adaptation. Retrieval selects a source case (srce, Sol(srce)) from the case base such that srce is judged to be similar to tgt according to a given similarity criterion. Adaptation consists in modifying Sol(srce) in order to propose a candidate solution S̃ol(tgt) for tgt to the user. If the user validates the candidate solution S̃ol(tgt), then S̃ol(tgt) is considered to be a solution Sol(tgt) for tgt.

3 Application Context: the Taaable System

The Taaable system [3] is a cooking CBR system. In the cooking domain, CBR aims at answering a query using a set of recipes.
In order to answer a query, the system retrieves a recipe in the recipe set and adapts it to produce a recipe satisfying the query. The Taaable system was proposed to participate in the Computer Cooking Contest (CCC) challenge in 2008 [4]. In the CCC challenge, queries are given in natural language and express a set of constraints that the desired recipe should satisfy. These constraints concern the ingredients to be included or avoided, the type of ingredients (e.g., meat or fruit), the dietary practice (e.g., nut-free diet), the type of meal (e.g., soup) or the type of cuisine (e.g., Chinese cuisine). An example of a query is: “Cook a Chinese soup with leek but no peanut oil.” Recipes are given in textual form, with a shallow XML structure, and include a set of ingredients together with a textual part describing the recipe preparation. The Taaable system is accessible online (http://taaable.fr).

3.1 Representation Issues

A Cooking Ontology. The system makes use of a cooking ontology O represented in propositional logic. Each concept of O corresponds to a propositional variable taken from a finite set V of propositional variables. O is mainly composed of a set of concepts organized in a hierarchy, which corresponds, in propositional logic, to a set of logical implications a ⇒ b. For example, the axiom leek ⇒ onions of O states that leeks are onions.

Problem and Solution Representation. In Taaable, a problem pb ∈ L_pb represents a query and a solution Sol(pb) of pb represents a recipe that matches this query. L_pb and L_sol are chosen fragments of propositional logic defined using the vocabulary V introduced in the cooking ontology O. One propositional variable is defined in L_pb and L_sol for each concept name of O and the only logical connective used in L_pb and L_sol is the conjunction ∧.
For example, the representation tgt ∈ L_pb of the query mentioned above is:

tgt = chinese ∧ soup ∧ leek ∧ ¬peanut_oil

The case base CB contains a set of recipes. Each recipe is indexed in the case base by a propositional formula R ∈ L_sol. For example, the index R of the recipe Wonton Soup is:

R = chinese ∧ soup ∧ green_onion ∧ ... ∧ peanut_oil ∧ Nothing else

Nothing else denotes a conjunction of negative literals ¬a for all a ∈ V such that chinese ∧ soup ∧ green_onion ∧ ... ∧ peanut_oil ⊭_O a. This kind of “closed world assumption” states explicitly that for every propositional variable a ∈ V, either R ⊨_O a (the recipe contains the ingredient represented by a) or R ⊨_O ¬a (the recipe does not contain the ingredient represented by a). Each recipe index R represents a set of source cases: R represents the set of source cases (srce, Sol(srce)) such that Sol(srce) = R and srce is solved by R, i.e., srce is such that R ⊨_O srce.

Adaptation Knowledge. In Taaable, adaptation knowledge is given by a set of reformulations (r, A_r) in which r is a binary relation between problems and A_r is an adaptation function associated with r [13]. A reformulation has the following semantics: if two problems pb_1 and pb_2 are related by r (denoted by pb_1 r pb_2), then for every recipe Sol(pb_1) matching the query pb_1, A_r(pb_1, Sol(pb_1), pb_2) = S̃ol(pb_2) matches the query pb_2. In this paper, binary relations r are given by substitutions of the form σ = α ⇝ β, where α and β are literals (either positive or negative). For example, the substitution σ = leek ⇝ onions generalizes leek into onions. Adaptation functions A_r are given by substitutions of the form Σ = A ⇝ B in which A and B are conjunctions of literals. For example, the substitution Σ = soup ∧ pepper ⇝ soup ∧ ginger states that pepper can be replaced by ginger in soup recipes.
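The recipe-index semantics introduced above (the closed-world reading of Nothing else and entailment under O) can be sketched as follows. This is a minimal illustration, not the Taaable implementation: the dictionary encoding of O, the names `closure` and `solves`, and the reduced ingredient list of Wonton Soup are assumptions made for the example.

```python
from typing import Dict, Iterable, Set, Tuple

Literal = Tuple[str, bool]  # (variable, polarity); polarity False encodes a negation

# Hypothetical toy fragment of the ontology O: an axiom a => b is stored as a -> {b}.
ONTOLOGY: Dict[str, Set[str]] = {
    "leek": {"onions"},
    "green_onion": {"onions"},
    "peanut_oil": {"oil"},
}


def closure(atoms: Iterable[str], ontology: Dict[str, Set[str]]) -> Set[str]:
    """Return every variable entailed by the given atoms under the axioms a => b."""
    seen: Set[str] = set()
    stack = list(atoms)
    while stack:
        atom = stack.pop()
        if atom not in seen:
            seen.add(atom)
            stack.extend(ontology.get(atom, ()))
    return seen


def solves(recipe_atoms: Iterable[str], query: Iterable[Literal],
           ontology: Dict[str, Set[str]]) -> bool:
    """Check R |=_O query for a recipe index R closed by "Nothing else".

    Under the closed-world reading, R entails a when a is in the closure of its
    positive atoms, and entails the negation of a otherwise.
    """
    entailed = closure(recipe_atoms, ontology)
    return all((var in entailed) == positive for var, positive in query)


# Positive part of the Wonton Soup index; the elided ingredients ("...") are left out.
WONTON_SOUP = {"chinese", "soup", "green_onion", "peanut_oil"}
SRCE = {("chinese", True), ("soup", True), ("onions", True)}
TGT = {("chinese", True), ("soup", True), ("leek", True), ("peanut_oil", False)}

print(solves(WONTON_SOUP, SRCE, ONTOLOGY))  # True: green_onion => onions
print(solves(WONTON_SOUP, TGT, ONTOLOGY))   # False: no leek, and peanut oil is present
```

Storing only the positive atoms and deriving the negative literals on demand keeps the closed-world assumption implicit, which avoids enumerating ¬a for every variable of V.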
A substitution Σ can be automatically generated from a substitution σ: Σ = b ⇝ a if σ is of the form a ⇝ b, and Σ = ∅ ⇝ ¬a if σ is of the form ¬a ⇝ ∅.

The main source of adaptation knowledge is the ontology O. A substitution σ = a ⇝ b is automatically generated from each axiom a ⇒ b of O and corresponds to a substitution by generalization. A substitution σ = a ⇝ b can be applied to a query pb if pb ⊨_O a. σ generates a new query σ(pb) in which the propositional variable a has been substituted by the propositional variable b. For example, the substitution σ = leek ⇝ onions is generated automatically from the axiom leek ⇒ onions of O. σ can be applied to the query tgt to produce the query σ(tgt) = chinese ∧ soup ∧ onions ∧ ¬peanut_oil, in which leek has been substituted by onions.

For each propositional variable a of V, an additional substitution of the form σ = ¬a ⇝ ∅ is generated. Such a substitution can be applied to a problem pb if pb ⊨_O ¬a and generates a new problem σ(pb) in which the negative literal ¬a is removed. This has the effect of loosening the constraints imposed on a query, e.g., by omitting an unwanted ingredient from the query. For example, the substitution ¬peanut_oil ⇝ ∅ applied to tgt generates the query σ(tgt) = chinese ∧ soup ∧ leek, in which the condition on the ingredient peanut oil is omitted.

However, when O is the only source of adaptation knowledge, the system is only able to perform simple adaptations, in which the modifications made to Sol(srce) correspond to a sequence of substitutions that can be used to transform srce into tgt. Therefore, an additional adaptation knowledge base AKB is introduced. AKB contains a set of reformulations (σ, Σ) that capture more complex adaptation strategies.

3.2 The CBR Process in Taaable

Retrieval. The retrieval algorithm is based on a smooth classification algorithm on an index hierarchy.
Such an algorithm aims at determining a set of modifications to apply to tgt in order to obtain a modified query srce that matches at least one recipe Sol(srce) of the case base. The algorithm computes a similarity path, which is a composition of substitutions SP = σ_q ◦ σ_{q−1} ◦ ··· ◦ σ_1 such that there exists at least one recipe Sol(srce) matching the modified query srce = σ_q(σ_{q−1}(...σ_1(tgt)...)), i.e., such that Sol(srce) ⊨_O srce holds. Thus, a similarity path SP can be written:

Sol(srce) ⊨_O srce ←(σ_q) ←(σ_{q−1}) ··· ←(σ_1) tgt

For example, to solve the above query tgt, the system generates a similarity path SP = σ_2 ◦ σ_1, with:

tgt = chinese ∧ soup ∧ leek ∧ ¬peanut_oil
σ_1 = ¬peanut_oil ⇝ ∅, σ_2 = leek ⇝ onions
srce = chinese ∧ soup ∧ onions
Sol(srce) = chinese ∧ soup ∧ green_onion ∧ ... ∧ peanut_oil ∧ Nothing else

In this similarity path, Sol(srce) is the propositional representation of the recipe Wonton Soup. Since the ontology O contains the axiom green_onion ⇒ onions, the modified query srce = σ_2 ◦ σ_1(tgt) verifies Sol(srce) ⊨_O srce.

Adaptation. An adaptation path AP is associated with each similarity path. It is a composition of substitutions AP = Σ_1 ◦ Σ_2 ◦ ··· ◦ Σ_q such that the modified recipe S̃ol(tgt) = Σ_1(Σ_2(...Σ_q(Sol(srce))...)) solves the initial query tgt, i.e., verifies S̃ol(tgt) ⊨_O tgt. Thus, an adaptation path AP can be written:

Sol(srce) →(Σ_q) →(Σ_{q−1}) ··· →(Σ_1) S̃ol(tgt) ⊨_O tgt

The adaptation path AP is constructed from the similarity path SP by associating a substitution Σ_i to each substitution σ_i. To determine which substitution Σ_i to associate to a given substitution σ_i, the external adaptation knowledge base AKB is searched first. For a substitution σ_i = α ⇝ β, the system looks for a substitution Σ = A ⇝ B such that A ⊨_O β and B ⊨_O α.
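The construction just described (apply the similarity path to tgt, then associate one Σ_i with each σ_i, searching AKB first and otherwise inverting σ_i) can be sketched as follows. This is a simplified illustration under stated assumptions, not the Taaable code: the tuple encodings, all function names, the membership-based applicability test and the verbatim matching of negative literals are choices made for the example, and the later specialization of Σ against the retrieved recipe is not modelled. The AKB entry is hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Literal = Tuple[str, bool]                 # (variable, polarity)
Sigma = Tuple[Literal, Optional[Literal]]  # alpha ~> beta; None encodes the empty side
Conj = FrozenSet[Literal]
BigSigma = Tuple[Conj, Conj]               # A ~> B

# Hypothetical toy fragment of O: an axiom a => b is stored as a -> {b}.
ONTOLOGY: Dict[str, Set[str]] = {"leek": {"onions"}, "green_onion": {"onions"}}


def implied(var: str) -> Set[str]:
    """var together with every variable it implies under O."""
    seen: Set[str] = set()
    stack = [var]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(ONTOLOGY.get(v, ()))
    return seen


def entails(conj: Conj, lit: Optional[Literal]) -> bool:
    """conj |=_O lit. Simplification: a negative literal must occur verbatim."""
    if lit is None:  # the empty conjunction is always entailed
        return True
    var, positive = lit
    if not positive:
        return lit in conj
    return any(pos and var in implied(v) for v, pos in conj)


def apply_sigma(query: Conj, sigma: Sigma) -> Conj:
    """Replace alpha by beta in the query (or drop alpha when beta is empty)."""
    alpha, beta = sigma
    if alpha not in query:  # simplified applicability test (membership only)
        raise ValueError("sigma is not applicable to this query")
    rest = query - {alpha}
    return rest if beta is None else rest | {beta}


def automatic(sigma: Sigma) -> BigSigma:
    """Invert sigma: a ~> b gives b ~> a, and not-a ~> empty gives empty ~> not-a."""
    alpha, beta = sigma
    return (frozenset() if beta is None else frozenset({beta}), frozenset({alpha}))


def adaptation_path(sp: List[Sigma], akb: List[BigSigma]) -> List[BigSigma]:
    """Associate one Sigma_i with each sigma_i of the path, searching AKB first."""
    path = []
    for alpha, beta in sp:
        found = next((s for s in akb
                      if entails(s[0], beta) and entails(s[1], alpha)), None)
        path.append(found if found is not None else automatic((alpha, beta)))
    return path


TGT = frozenset({("chinese", True), ("soup", True), ("leek", True), ("peanut_oil", False)})
SIGMA_1: Sigma = (("peanut_oil", False), None)
SIGMA_2: Sigma = (("leek", True), ("onions", True))
SRCE = apply_sigma(apply_sigma(TGT, SIGMA_1), SIGMA_2)  # chinese, soup, onions

# Hypothetical AKB entry: green_onion ~> leek AND ginger.
AKB = [(frozenset({("green_onion", True)}),
        frozenset({("leek", True), ("ginger", True)}))]
AP = adaptation_path([SIGMA_1, SIGMA_2], AKB)  # [Sigma_1, Sigma_2]
```

With an empty AKB, `adaptation_path` falls back to the automatically generated inverses for every step, which corresponds to the ontology-only behaviour.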
For example, if σ_2 = leek ⇝ onions is used in SP and AKB contains the reformulation (σ, Σ) with σ = σ_2 and Σ = green_onion ⇝ leek ∧ ginger, Σ will be selected to constitute the substitution Σ_2 in AP since green_onion ⊨_O onions and leek ∧ ginger ⊨_O leek. If no substitution Σ is found in AKB for a given substitution σ_i, then Σ_i is generated automatically from σ_i.

In the previous example, AKB is considered to be empty, so Σ_1 and Σ_2 are generated automatically from the substitutions σ_1 and σ_2: Σ_1 = ∅ ⇝ ¬peanut_oil since σ_1 = ¬peanut_oil ⇝ ∅, and Σ_2 = onions ⇝ leek since σ_2 = leek ⇝ onions. According to the axiom green_onion ⇒ onions of O, the system further specializes the substitution Σ_2 into the substitution green_onion ⇝ leek, and suggests that the user replace green onions by leek in the recipe Wonton Soup and remove peanut oil. The generated adaptation path is AP = Σ_1 ◦ Σ_2 (Fig. 1), with:

Sol(srce) = chinese ∧ soup ∧ green_onion ∧ ... ∧ peanut_oil ∧ Nothing else
Σ_2 = green_onion ⇝ leek, Σ_1 = ∅ ⇝ ¬peanut_oil
S̃ol(tgt) = chinese ∧ soup ∧ leek ∧ ... ∧ ¬peanut_oil ∧ Nothing else
tgt = chinese ∧ soup ∧ leek ∧ ¬peanut_oil

The inferred solution S̃ol(tgt) solves the initial query tgt: S̃ol(tgt) ⊨_O tgt.

[Fig. 1. A similarity path and the associated adaptation path: tgt →(σ_1) pb →(σ_2) srce, and Sol(srce) →(Σ_2) S̃ol(pb) →(Σ_1) S̃ol(tgt).]

3.3 Why Learning Adaptation Knowledge in Taaable?

In the version of the Taaable system that was proposed to participate in the CCC challenge, AKB = ∅, so adaptation knowledge is inferred from the ontology O. The main advantage of this approach lies in its simplicity: no external source of adaptation knowledge is needed and the system is able to propose a solution to any target problem.
However, the system's adaptation capabilities (simple substitutions) appear to be very limited and the user has no means of giving feedback on the quality of the proposed adaptation. For example, the substitution Σ_1 = ∅ ⇝ ¬peanut_oil suggests removing the ingredient peanut_oil from the retrieved recipe, but as the oil is used in this recipe to sauté the bok choy, the adapted recipe turns out to be practically unfeasible. A better adaptation would suggest replacing peanut oil by, e.g., sesame oil, which can be modeled by the substitution Σ_1 = peanut_oil ⇝ sesame_oil. To generate this substitution automatically, the system could for example exploit the fact that the concepts peanut_oil and sesame_oil are both sub-concepts of the concept oil in O. But still, some additional knowledge would be needed to express the fact that peanut oil should be replaced by sesame oil, and not by olive oil or hot chili oil, as olive_oil and hot_chili_oil are also sub-concepts of oil in O. Besides, the system should be aware that this substitution is recommended only in Asian cuisine, which can be modeled by the more precise substitution Σ_1 = asian ∧ peanut_oil ⇝ asian ∧ sesame_oil.

Furthermore, the second substitution Σ_2 = green_onion ⇝ leek suggests merely replacing sliced green onions by uncooked leek. But the green onion was used in the original Wonton Soup as a garnish, so the user might consider that raw leek added as a garnish alters the taste of a soup too much. A better adaptation would consist in frying leek with, e.g., tempeh and red bell pepper to prepare the garnish. Such an adaptation can be modeled by the substitution Σ_2 = green_onion ⇝ leek ∧ tempeh ∧ red_bell_pepper. This substitution, which reflects cooking know-how, can hardly be generated automatically from the ontology.
These examples show that in order to improve its adaptation capabilities, the system would greatly benefit from the availability of a set of adaptation rules that would capture more complex adaptation strategies. These adaptation rules cannot be generated automatically from the ontology and need to be acquired from other knowledge sources. These examples also show that the human expert plays a major role in adaptation knowledge acquisition and that in the cooking domain, adaptation rules are often highly contextual.

4 Opportunistic Adaptation Knowledge Discovery

The presented AK acquisition method combines two previous approaches of AK acquisition. The first one was implemented in the CabamakA system [5] and learns AK from differences between cases by means of knowledge discovery techniques (section 4.1). The second one was implemented in the IakA system [8] and acquires adaptation knowledge at problem-solving time through interactions with the user (section 4.2).

4.1 Adaptation Knowledge Discovery from the Case Base

Machine learning algorithms aim at extracting some regularities from a set of observations. Knowledge discovery techniques combine efficient machine learning algorithms with human-machine interaction. In [5], AK is learned from differences between cases by means of knowledge discovery techniques. A set of pairs of source cases is taken as input of a frequent itemset extraction algorithm, which outputs a set of itemsets. Each of these itemsets can be interpreted as an adaptation rule. This approach of AK learning was motivated by the original idea proposed by Kathleen Hanney and Mark T. Keane in [11], in which the authors suggest that AK may be learned from differences between cases. The main assumption is that the differences that occur between cases in the case base are often representative of differences that will occur between future problems and the case base.
To learn adaptation rules from differences between cases, representing variations between cases is essential. In [2], expressive representation formalisms are proposed and it is shown that defining a partial order on the variation language can help organize the learned rules by generality.

4.2 Opportunistic and Interactive Knowledge Acquisition

Experiential knowledge, or know-how, can often be acquired on-line, when users are using CBR tools. It is the aim of interactive and opportunistic knowledge acquisition strategies to support such an acquisition. In these strategies, the system exploits its interactions with its user to build new pieces of knowledge, to test them and, in case of success, to retain them. Moreover, the knowledge acquisition process is often opportunistic, i.e., triggered by a previous reasoning failure: reasoning failures highlight missing knowledge and thus constitute a guidance for the acquisition process. A major advantage of interactive knowledge acquisition strategies is that they ensure that the user is in a favorable context when he participates in the acquisition process. In [7], a review of interactive and opportunistic knowledge acquisition approaches is proposed, and two strategies are developed. This work illustrates the efficiency of interactive and opportunistic knowledge acquisition approaches to acquire specific knowledge. On the other hand, it shows that such approaches only allow the systems to acquire small pieces of knowledge at a time.

4.3 Combining the two Approaches

When properly used, knowledge discovery techniques may have the strong advantage of automating a part of the knowledge acquisition process. In these approaches, dedicated human-machine interfaces allow the expert, through predefined interactions, to provide feedback on a set of suggestions generated automatically by the system.
The role of the expert is thus reduced to the validation of a pre-selected set of knowledge pieces. The acquired knowledge is directly usable by the system, without the need for an additional formalization step. Automatic approaches also benefit from efficient machine learning algorithms that can be applied, as in [2], to learn adaptation rules at different levels of generality. However, these approaches still produce a large number of candidate knowledge units that have to be validated by a domain expert out of any context, which constitutes an important drawback.

Acquiring adaptation knowledge offline, i.e., independently of a particular problem-solving session, appears to be problematic. Offline AK acquisition forces the system's designer to anticipate the need for adaptation knowledge in problem-solving and to acquire it in advance, which can be very tedious, if not impossible. Offline acquisition of adaptation knowledge also makes it difficult to come up with fine-grained adaptation rules, since adaptation knowledge is often highly contextual. For example, in the cooking domain, an egg can sometimes be substituted by 100 grams of tofu, but this adaptation rule may be applied only to certain types of dishes, like cakes or mayonnaise, and has proved to be irrelevant for adapting a mousse recipe or an omelet recipe. Acquiring such a rule would require circumscribing its domain of validity in order to avoid over-generalization. Moreover, initial acquisition of adaptation knowledge prevents the system from learning from experience. A CBR system with fixed adaptation knowledge has no way to improve its problem-solving capabilities, except by retaining in the case base a new experience each time a problem has been solved, as it is usually done in traditional CBR systems [6].
On the other hand, interactive and opportunistic knowledge acquisition approaches heavily rely on the human expert but ensure that the expert is “in context” when validating knowledge units that are to be acquired. Combining knowledge discovery techniques and interactive approaches, as it is proposed here, could overcome one of the limitations of KD by dramatically reducing the number of candidate adaptation rules presented to the expert. By triggering the process in an opportunistic manner, the expert is able to parametrize the KD in order to focus on specific knowledge to acquire in context.

The resulting AK discovery process:
– is performed on-line, i.e., in the context of a problem-solving session,
– is interactive as adaptation knowledge is learned by the system through interactions with its user who acts as an expert,
– is opportunistic as it is triggered by reasoning failure, and, consequently, often helps repairing a failed adaptation,
– makes use of knowledge discovery techniques to provide assistance to the user in the formulation of new knowledge: the user is presented with a set of suggestions that are generated automatically from the case base.

5 Applying Opportunistic AK Discovery to Taaable

In this section, an opportunistic AK discovery process is applied to the context of the Taaable system.

5.1 AK Discovery

In Taaable, the AK discovery process consists in learning a set of substitutions from the case base by comparing two sets of recipes.

The Training Set. The training set TS is formed by selecting from the case base a set of pairs of recipes (R_k, R_ℓ) ∈ CB × CB and by representing for each selected pair of recipes (R_k, R_ℓ) the variation ∆_kℓ from R_k to R_ℓ. The choice of the training set TS results from a set of interactions with the user during which he/she is asked to formulate the cause of the adaptation failure and to pick up a repair strategy.

Representing Variations.
The variation ∆_kℓ from a recipe R_k to a recipe R_ℓ is represented in a language L_∆ by a set of properties. Three properties a⁻, a⁺ and a⁼ are defined in L_∆ for each propositional variable a of V, and ∆_kℓ ∈ L_∆ contains:
– the property a⁻ if R_k ⊨_O a and R_ℓ ⊭_O a,
– the property a⁺ if R_k ⊭_O a and R_ℓ ⊨_O a,
– the property a⁼ if R_k ⊨_O a and R_ℓ ⊨_O a.

For example, if:

R_k = chinese ∧ soup ∧ ... ∧ peanut_oil ∧ Nothing else
R_ℓ = chinese ∧ soup ∧ ... ∧ olive_oil ∧ Nothing else

then ∆_kℓ = {chinese⁼, soup⁼, oil⁼, peanut_oil⁻, olive_oil⁺, ...}, provided that peanut_oil ⊨_O oil, olive_oil ⊨_O oil, R_ℓ ⊭_O peanut_oil and R_k ⊭_O olive_oil.

The inclusion relation ⊆ constitutes a partial order on L_∆ that can be used to organize variations by generality: a variation ∆ is more general than a variation ∆′ if ∆ ⊆ ∆′.

Mining. The learning process consists in highlighting some variations ∆ ∈ L_∆ that are more general than a “large” number of elements ∆_kℓ of TS. More formally, let

support(∆) = card{∆_kℓ ∈ TS | ∆ ⊆ ∆_kℓ} / card TS

Learning adaptation rules aims at finding the ∆ ∈ L_∆ such that support(∆) ≥ σ_s, where σ_s ∈ [0, 1] is a learning parameter called the support threshold. It can be noticed that if ∆_1 ⊆ ∆_2 then support(∆_1) ≥ support(∆_2). The support threshold also has an influence on the number of generated variations. The number of generated variations increases when σ_s decreases. Thus, specifying a high threshold restricts the generation of variations to the most general ones, which can limit the number of generated variations and save computation time but has the effect of discarding the most specific ones from the result set.

Each learned variation ∆ = {p_1, p_2, ..., p_n} ∈ L_∆ is interpreted as a substitution of the form A ⇝ B such that:
– A ⊨_O a and B ⊭_O a if a⁻ ∈ ∆,
– A ⊭_O a and B ⊨_O a if a⁺ ∈ ∆,
– A ⊨_O a and B ⊨_O a if a⁼ ∈ ∆.
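The variation representation and the support measure defined above can be sketched as follows. This is a minimal illustration rather than the CabamakA implementation: the dictionary encoding of O, the function names and the three recipe pairs of the toy training set are assumptions made for the example.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Property = Tuple[str, str]  # (variable, sign) with sign in {"-", "+", "="}
Variation = FrozenSet[Property]

# Hypothetical toy fragment of O: an axiom a => b is stored as a -> {b}.
ONTOLOGY: Dict[str, Set[str]] = {
    "peanut_oil": {"oil"},
    "olive_oil": {"oil"},
    "sesame_oil": {"oil"},
}


def closure(atoms: Iterable[str]) -> Set[str]:
    """Every variable a such that the recipe index entails a under O."""
    seen: Set[str] = set()
    stack = list(atoms)
    while stack:
        atom = stack.pop()
        if atom not in seen:
            seen.add(atom)
            stack.extend(ONTOLOGY.get(atom, ()))
    return seen


def variation(r_k: Iterable[str], r_l: Iterable[str]) -> Variation:
    """Variation from R_k to R_l: a- (lost), a+ (gained), a= (kept)."""
    in_k, in_l = closure(r_k), closure(r_l)
    props = set()
    for a in in_k | in_l:
        if a in in_k and a in in_l:
            props.add((a, "="))
        elif a in in_k:
            props.add((a, "-"))
        else:
            props.add((a, "+"))
    return frozenset(props)


def support(delta: Variation, training_set: List[Variation]) -> float:
    """Fraction of the training set that delta is more general than."""
    return sum(1 for d in training_set if delta <= d) / len(training_set)


# Hypothetical training set built from three recipe pairs.
TS = [
    variation({"chinese", "soup", "peanut_oil"}, {"chinese", "soup", "olive_oil"}),
    variation({"cake", "peanut_oil"}, {"cake", "olive_oil"}),
    variation({"soup", "peanut_oil"}, {"soup", "sesame_oil"}),
]

GENERAL = frozenset({("oil", "="), ("peanut_oil", "-")})
SPECIFIC = GENERAL | {("olive_oil", "+")}
print(support(GENERAL, TS))   # 1.0
print(support(SPECIFIC, TS))  # 2 of the 3 pairs, i.e. about 0.67
```

Because `GENERAL` is included in `SPECIFIC`, its support is at least as high, which is the monotonicity property that frequent itemset algorithms rely on to prune the search.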
For example, the variation ∆ = {oil⁼, peanut_oil⁻, olive_oil⁺} is interpreted as the substitution Σ = peanut_oil ⇝ olive_oil. The conjunct oil is present neither in A nor in B since it is useless: peanut_oil ⊨_O oil and olive_oil ⊨_O oil.

Filtering. For a retrieved recipe Sol(srce), the result set can be filtered in order to retain only the substitutions Σ = A ⇝ B that can be applied to modify Sol(srce), i.e., such that Sol(srce) ⊨_O A.

Validation. Knowledge discovery aims at building a model of reality from a set of observations. But as a model of a part of reality is only valid with respect to a particular observer, any learned substitution has to be validated by a human expert in order to acquire the status of piece of knowledge.

5.2 Opportunistic Adaptation Knowledge Discovery

The AK discovery process turns the case base into an additional source of adaptation knowledge. This new source of knowledge is used during a problem-solving session to provide the CBR system with adaptation knowledge “on demand”. A set of variations ∆ is learned from the case base by comparing two sets of recipes and each learned variation ∆ is interpreted as a substitution Σ that can be used to repair the adaptation path AP. Each learned substitution Σ is presented to the user for validation together with the corrected solution S̃ol(tgt) resulting from its application. When the user validates the corrected solution, a new reformulation (σ, Σ) is added to the adaptation knowledge base AKB so that the learned substitution Σ can be later reused to adapt new recipes. The AK discovery process is triggered either during the adaptation phase, to come up with suggestions of gradual solution refinements (see section 5.4 for an example), or during the solution test phase to repair a failed adaptation in response to the user's feedback (see section 5.5 for an example).
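The interpretation and filtering steps of Sect. 5.1 can be sketched as follows. This is one possible reading, stated as an assumption: A collects the variables marked ⁻ or ⁼, B those marked ⁺ or ⁼, and any conjunct already implied by another conjunct of the same side is dropped, which reproduces the removal of oil in the example above. The function names, the ontology fragment and the reduced Wonton Soup index are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Property = Tuple[str, str]  # (variable, sign) with sign in {"-", "+", "="}
Substitution = Tuple[FrozenSet[str], FrozenSet[str]]  # A ~> B, positive conjuncts only

# Hypothetical toy fragment of O: an axiom a => b is stored as a -> {b}.
ONTOLOGY: Dict[str, Set[str]] = {
    "peanut_oil": {"oil"},
    "olive_oil": {"oil"},
    "cocoa": {"chocolate"},
    "baking_chocolate": {"chocolate"},
}


def closure(atoms: Iterable[str]) -> Set[str]:
    """Every variable entailed by the given atoms under O."""
    seen: Set[str] = set()
    stack = list(atoms)
    while stack:
        atom = stack.pop()
        if atom not in seen:
            seen.add(atom)
            stack.extend(ONTOLOGY.get(atom, ()))
    return seen


def prune(atoms: Set[str]) -> FrozenSet[str]:
    """Drop every conjunct that another conjunct of the same side already implies."""
    return frozenset(a for a in atoms
                     if not any(b != a and a in closure({b}) for b in atoms))


def interpret(delta: Iterable[Property]) -> Substitution:
    """Read a learned variation as a substitution A ~> B."""
    props = list(delta)
    left = {a for a, sign in props if sign in ("-", "=")}
    right = {a for a, sign in props if sign in ("+", "=")}
    return prune(left), prune(right)


def applicable(recipe_atoms: Iterable[str], substitution: Substitution) -> bool:
    """Filtering test: the retrieved recipe must entail the left-hand side A."""
    return substitution[0] <= closure(recipe_atoms)


OIL_SWAP = interpret({("oil", "="), ("peanut_oil", "-"), ("olive_oil", "+")})
CHOCOLATE_SWAP = interpret({("cocoa", "-"), ("baking_chocolate", "+"), ("oil", "-")})

# Positive part of the Wonton Soup index; the elided ingredients are left out.
WONTON_SOUP = {"chinese", "soup", "green_onion", "peanut_oil"}
print(applicable(WONTON_SOUP, OIL_SWAP))        # True
print(applicable(WONTON_SOUP, CHOCOLATE_SWAP))  # False: the recipe has no cocoa
```

Only substitutions passing `applicable` would then be shown to the user for validation, which keeps the list of suggestions tied to the recipe being adapted.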
5.3 Implementation

To test the proposed adaptation knowledge acquisition method, a prototype was implemented that integrates the Taaable system [3] and the CabamakA system [5]. The case base contains 862 recipes taken from the CCC 2008 recipe set. The Taaable system is used to perform retrieval and adaptation. The CabamakA system is used to learn a set of substitutions Σ from the case base from the comparison of two sets of recipes. As in [5], the mining step is performed thanks to a frequent closed itemset extraction algorithm.

5.4 A First Example: Cooking a Chocolate Cake

An example is presented to illustrate how the case base is used as an additional source of adaptation knowledge. The AK discovery process is parametrized automatically and is used to provide assistance to the user by suggesting some gradual refinements for the proposed solution.

1. Representing the Target Problem. In this example, the user wants to cook a chocolate cake with baking chocolate and oranges. The target problem is:

tgt = cake ∧ baking_chocolate ∧ orange

In the Taaable interface, the field “Ingredients I Want” is filled in with the tokens baking_chocolate and orange and the field “Types I Want” is filled in with the token cake.

2. Retrieval. The retrieval procedure generates the similarity path SP = σ_1 in which the substitution σ_1 = baking_chocolate ⇝ chocolate is generated automatically from the ontology O from the axiom baking_chocolate ⇒ chocolate. SP is applied to tgt in order to produce the modified query srce = cake ∧ chocolate ∧ orange. The system retrieves the recipe Ultralight Chocolate Cake, whose representation Sol(srce) is:

Sol(srce) = cake ∧ cocoa ∧ orange ∧ ... ∧ Nothing else

Since the ontology O contains the axiom cocoa ⇒ chocolate, Sol(srce) solves the query srce: Sol(srce) is such that Sol(srce) ⊨_O srce.

3. Adaptation.
AKB is assumed to be empty, so to construct the adaptation path AP, the substitution chocolate ⇝ baking_chocolate is generated automatically from σ_1. This substitution is further specialized into the substitution Σ_1 = cocoa ⇝ baking_chocolate, according to the axiom cocoa ⇒ chocolate of O. A first solution S̃ol(tgt) is computed by applying to Sol(srce) the adaptation path AP = Σ_1. The user suggests that an ingredient is missing in S̃ol(tgt) but cannot identify a repair strategy. An AK discovery is triggered in order to suggest gradual refinements of S̃ol(tgt).

4. Choosing the Training Set. The training set TS is chosen from Σ_1: AK is learned by comparing the recipes containing cocoa with the recipes containing baking chocolate. TS is composed of the set of variations ∆_kℓ ∈ L_∆ between pairs of recipes (R_k, R_ℓ) ∈ CB × CB such that {cocoa⁻, baking_chocolate⁺} ⊆ ∆_kℓ.

5. Mining and Filtering. A value is given to the support threshold σ_s and the mining step outputs a set of variations. A filter retains only the variations that correspond to substitutions applicable to modify Sol(srce).

6. Solution Test and Validation. The user selects the learned variation ∆ = {cocoa⁻, baking_chocolate⁺, oil⁻} from the result set. ∆ is interpreted as the substitution Σ = cocoa ∧ oil ⇝ baking_chocolate, which suggests replacing cocoa by baking chocolate in the retrieved recipe and removing oil. The user explains this rule by the fact that baking chocolate contains more fat than cocoa, and therefore substituting cocoa by baking chocolate implies reducing the quantity of fat in the recipe. Further solution refinements are proposed to the user. The set of learned variations is filtered in order to retain only the variations ∆′ that are more specific than ∆, i.e., such that ∆ ⊆ ∆′.
Among the retained variations is the variation ∆′ = {cocoa⁻, baking_chocolate⁺, oil⁻, vanilla⁻}, which is interpreted as the substitution Σ′ = cocoa ∧ oil ∧ vanilla ⇝ baking_chocolate. Σ′ suggests to also remove vanilla in the recipe Ultralight Chocolate Cake. The user is satisfied with the refined solution S̃ol(tgt) resulting from the application of the adaptation path AP = Σ′ to Sol(srce), so the reformulation (baking_chocolate ⇝ chocolate, cocoa ∧ oil ∧ vanilla ⇝ baking_chocolate) is added to the adaptation knowledge base AKB.

5.5 A Second Example: Cooking a Chinese Soup

A second example is presented in which the AK discovery process is triggered in response to the user feedback in order to repair the adaptation presented in Sect. 3. In this example, the user is encouraged to formulate the cause of the adaptation failure. A repair strategy is chosen that is used to parametrize the AK discovery process.

1. Representing the Target Problem. In this example, the target problem tgt is:

tgt = chinese ∧ soup ∧ leek ∧ ¬peanut_oil

In the Taaable interface, the field “Ingredients I Want” is filled in with the token leek, the field “Ingredients I Don't Want” is filled in with the token peanut_oil and the field “Types I Want” is filled in with the tokens chinese and soup.

2. Retrieval. As in Sect. 3, two substitutions σ_1 = ¬peanut_oil ⇝ ∅ and σ_2 = leek ⇝ onions are generated automatically from the ontology O. The similarity path SP = σ_2 ◦ σ_1 is applied to tgt in order to produce the modified query srce = chinese ∧ soup ∧ onions. The system retrieves the recipe Wonton Soup, whose representation Sol(srce) solves the query srce: Sol(srce) is such that Sol(srce) ⊨_O srce.

3. Adaptation. Initially, AKB = ∅, so to construct the adaptation path AP, two substitutions Σ_1 = ∅ ⇝ ¬peanut_oil and Σ_2 = green_onion ⇝ leek are automatically generated from σ_1 and σ_2.

4.
Solution T est and V alidation. The solution g Sol ( tgt ) is presented to the user for validation, together with the adaptation path AP = Σ 1 ◦ Σ 2 that was used to generate it. 5. The U ser is Unsatisfied! The user complains that the ada pted recipe is prac- tically unfea sible beca use the proposed solution g Sol ( tgt ) does not contain oil anymore, and oil is neede d to saute the bok choy . 6. What has Caused the Adapt ation Failure? The cause of the adaptation failure is identified through intera ctions with the user . The user validates the inter- mediate solution g Sol ( pb ) that results from the application of the substitution Σ 2 = gree n onion l eek to Sol ( srce ). But the user invalidates the solu- tion g Sol ( tgt ) that results from the application of Σ 1 = ∅ ¬ peanut oil to g Sol ( pb ). The substitution Σ 1 is identified as responsible for the adaptation failure since its application results in the removal of oil in the recipe. 7. Choosing a Repair Strat egy . A repair strategy is chosen according to the user ’s feedback. The user expresses the need for oil in the adapte d recipe, so the repair strategy consists in replacing p e anut oil by a nother oil. An AK discovery pro cess is triggered to d ecide which oil to replace pe a nut oil with. 8. Choosing the T raining Set. A set of recipes that contain pe anut oil is c ompared with a set of recipes containing other types of oil. The tra ining set TS is composed of the set of variations ∆ k ℓ ∈ L ∆ between pairs of recipes ( R k , R ℓ ) ∈ CB × CB such that { o il = , peanut oil - } ⊆ ∆ k ℓ . 9. Mining and Filtering. A va lue is given to the support threshold σ s and the mining step outputs a set of va r iations. A filter retains only the variations that correspond to substitution s applica ble to modify Sol ( pb ). 10. Solution T est and V alid a tion. The user selects the learned varia tion ∆ = { o il = , peanut oil - , olive o il + } from the result set. 
∆ is interpreted as the substitution Σ = peanut oil ⇝ olive oil, which suggests replacing peanut oil by olive oil in the retrieved recipe. The adaptation path AP = Σ ◦ Σ_2 is computed and the repaired solution S̃ol(tgt) is presented to the user for validation. The user is satisfied with the corrected solution S̃ol(tgt), so the reformulation (∅ ⇝ ¬peanut oil, peanut oil ⇝ olive oil) is added to the adaptation knowledge base AKB.

6 Discussion and Related Work

AK acquisition is a difficult task that is recognized to be a major bottleneck for CBR system designers due to the high knowledge-engineering costs it generates. To overcome these costs, a few approaches (e.g., [5,9,11]) have applied machine learning techniques to learn AK offline from differences between cases of the case base. In [11], a set of pairs of source cases is selected from the case base and each selected pair of source cases is considered as a specific adaptation rule. The featural differences between problems constitute the antecedent part of the rule and the featural differences between solutions constitute the consequent part. Michalski's closing interval rule algorithm is then applied to generalize adaptation rule antecedents. In [9], adaptation knowledge takes the form of a set of adaptation cases. Each adaptation case associates an adaptation action to a representation of the differences between the two source problems. Machine learning algorithms like C4.5 or RISE are applied to learn generalized adaptation knowledge from these adaptation cases in order to improve the system's case-based adaptation procedure.

When applying machine learning techniques to learn adaptation knowledge from differences between cases, one main challenge concerns the choice of the training set: which cases are worth comparing?
Arguing that (1) the size of the training set should be reduced to minimize the cost of the adaptation rule generation process and that (2) the source cases that are worth comparing should be the ones that are more similar, only the pairs of source cases that were judged to be similar according to a given similarity measure are selected in [9] and [11]. However, committing to a particular similarity measure might be somewhat arbitrary. Therefore, in [5], the authors decided to include in the training set all the pairs of distinct source cases of the case base. This paper introduces a third approach: the choice of the training set is determined interactively and according to the problem-solving context, taking advantage of the fact that the AK discovery process is triggered on-line. This approach appears to be very promising since the learning algorithm can be parametrized in order to learn only the knowledge that is needed to solve the target problem.

The examples presented above also show that knowledge discovery techniques make it possible to derive more complex adaptation strategies than the simple one-to-one ingredient substitutions generated from the ontology O. In particular, these techniques can help identify interactions between the different ingredients that appear in the recipes (e.g., that cocoa contains less fat than baking chocolate, so oil should be removed when baking chocolate replaces cocoa) as well as co-occurrences of ingredients (e.g., that cinnamon goes well with apples). Besides, adaptation knowledge is learned at different levels of generality, so the user can be guided through gradual solution refinements.

Several CBR systems make use of interactive and/or opportunistic knowledge acquisition approaches to improve their learning capabilities. For example, in Creek, an approach that combines case-based and model-based methods, general knowledge is acquired through interactions with the user [1].
This knowledge acquisition process is provided in addition to the traditional case acquisition and allows the system to acquire knowledge that cannot be captured through cases only. In the Dial system, adaptation knowledge is acquired in the form of adaptation cases: when a case has to be adapted, the adaptation process is memorized in the form of a case and can be reused to adapt another case. Hence, adaptation knowledge is acquired through a CBR process inside the main CBR cycle. Note that adaptation cases can either be built automatically by adaptation of previous adaptation cases or manually by a user who interactively builds the adaptation case in response to a problem by selecting the appropriate operations to perform [12]. Hence, knowledge acquisition in Dial appears to be both interactive and opportunistic.

Chef is obviously related to the work described here [10]. Chef is a case-based planner in the cooking domain whose task is to build recipes on the basis of a user's request. The input of the system is a set of goals (tastes, textures, ingredients, types of dishes) and the output is a plan for a single recipe that satisfies all the goals. To solve this task, Chef is able to build new plans from old ones stored in memory. The system is provided with the ability to choose plans on the basis of the problems that they solve as well as the goals they satisfy, but it is also able to predict problems and to modify plans to avoid failures (plans are indexed in memory by the problems they avoid). Hence, Chef learns by providing causal explanations of failures, thus marking elements as "predictive" of failures. In other words, the acquired knowledge allows the system to prevent identical failures from occurring again. In our approach, we propose to go one step further by using failure to acquire knowledge that can be more widely used.
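As an illustration of the variation-based steps used in the two examples of Sect. 5 (choosing the training set, mining with a support threshold, retaining the more specific variations, and reading a variation as a substitution), a minimal Python sketch is given below. It rests on several assumptions that are not part of the system described here: recipes are reduced to plain sets of ingredients, the toy case base and all function names are hypothetical, and the mining step is replaced by a naive closure under intersection rather than the actual mining algorithm.

```python
from itertools import permutations

# Hypothetical toy case base: each recipe is reduced to its set of ingredients.
CASE_BASE = [
    frozenset({"cocoa", "oil", "flour", "sugar"}),
    frozenset({"baking chocolate", "flour", "sugar"}),
    frozenset({"cocoa", "oil", "vanilla", "flour", "eggs"}),
    frozenset({"baking chocolate", "flour", "eggs"}),
]

# Properties that every variation of the training set must contain.
REQUIRED = frozenset({("cocoa", "-"), ("baking chocolate", "+")})


def variation(r_k, r_l):
    """Variation from r_k to r_l: '-' removed, '+' added, '=' kept."""
    props = set()
    for ingredient in r_k | r_l:
        if ingredient in r_k and ingredient in r_l:
            sign = "="
        elif ingredient in r_k:
            sign = "-"
        else:
            sign = "+"
        props.add((ingredient, sign))
    return frozenset(props)


def training_set(case_base, required):
    """Variations between ordered pairs of distinct recipes containing `required`."""
    selected = []
    for r_k, r_l in permutations(case_base, 2):
        v = variation(r_k, r_l)
        if required <= v:
            selected.append(v)
    return selected


def mine(ts, min_support):
    """Naive stand-in for the mining step (an assumption of this sketch).

    Closes the training set under intersection and keeps the itemsets whose
    support (fraction of variations containing them) reaches min_support.
    """
    if not ts:
        return {}
    closed = set(ts)
    frontier = set(ts)
    while frontier:
        frontier = {a & b for a in frontier for b in closed} - closed
        closed |= frontier
    result = {}
    for itemset in closed:
        support = sum(1 for v in ts if itemset <= v) / len(ts)
        if support >= min_support:
            result[itemset] = support
    return result


def more_specific(variations, delta):
    """Retain the variations delta' such that delta is included in delta'."""
    return [v for v in variations if delta <= v]


def to_substitution(v):
    """Read a variation as (removed ingredients, added ingredients)."""
    removed = frozenset(i for i, sign in v if sign == "-")
    added = frozenset(i for i, sign in v if sign == "+")
    return removed, added


ts = training_set(CASE_BASE, REQUIRED)
mined = mine(ts, min_support=0.5)
for v, support in sorted(mined.items(), key=lambda kv: -kv[1]):
    removed, added = to_substitution(v)
    print(sorted(removed), "->", sorted(added), "support:", support)
```

On this toy case base, the two retained variations read as cocoa ∧ oil ⇝ baking chocolate and its specialization cocoa ∧ oil ∧ vanilla ⇝ baking chocolate, which mirrors the gradual refinement of the first example. Properties marked "=" only describe the context of a variation and are dropped when it is read as a substitution.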
7 Conclusion and Future Work

In this paper, a novel approach for adaptation knowledge acquisition is presented in which the knowledge learned at problem-solving time by knowledge discovery techniques is directly reused for problem-solving. An application is proposed in the context of the cooking CBR system Taaable and the feasibility of the approach is demonstrated on some use cases. Future work will include developing a graphical user interface and doing more extensive testing. Opportunistic and interactive knowledge discovery in Taaable implies that the user plays the role of the domain expert, which raises several issues. For example, how to be sure that the knowledge expressed by a particular user is valuable? How to ensure that the adaptation knowledge base will remain consistent with time? Besides, Taaable is meant to be multi-user, so if the system's knowledge evolves with experience, some synchronization problems might occur. Therefore, the envisioned multi-user, ever-learning Taaable system needs to be thought of as a collaborative tool in which knowledge acquired by some users can be revised by others.

References

1. Aamodt, A.: Knowledge-Intensive Case-Based Reasoning in Creek, in Proceedings of the 7th European Conference on Case-Based Reasoning (ECCBR'04), 1–15, 2004.
2. Badra, F., Lieber, J.: Representing Case Variations for Learning General and Specific Adaptation Rules, in Proceedings of the Fourth Starting AI Researcher's Symposium (STAIRS 2008), eds A. Cesta and N. Fakotakis, 1–11, 2008.
3. Badra, F., Bendaoud, R., Bentebibel, R., Champin, P.-A., Cojan, J., Cordier, A., Després, S., Jean-Daubias, S., Lieber, J., Meilender, T., Mille, A., Nauer, E., Napoli, A., Toussaint, Y.: Taaable: Text Mining, Ontology Engineering, and Hierarchical Classification for Textual Case-Based Cooking, in Computer Cooking Contest - Workshop at European Conference on Case-Based Reasoning (ECCBR'08), eds Schaaf, M., 219–228, 2008.
4. ECCBR Workshops, ECCBR 2008, The 9th European Conference on Case-Based Reasoning, Workshop Proceedings, eds Schaaf, M., 2008.
5. d'Aquin, M., Badra, F., Lafrogne, S., Lieber, J., Napoli, A., Szathmary, L.: Case Base Mining for Adaptation Knowledge Acquisition, in Proceedings of the International Conference on Artificial Intelligence, IJCAI'07, 750–756, 2007.
6. de Mántaras, R. L., Plaza, E.: Case-Based Reasoning: An Overview, in AI Communications, 10(1):21–29, 1997.
7. Cordier, A.: Interactive and Opportunistic Knowledge Acquisition in Case-Based Reasoning, PhD Thesis, Université Lyon 1, 2008.
8. Cordier, A., Fuchs, B., Lana de Carvalho, L., Lieber, J., Mille, A.: Opportunistic Acquisition of Adaptation Knowledge and Cases - The IakA Approach, in Proceedings of the 9th European Conference on Case-Based Reasoning (ECCBR'08), eds Althoff, K.-D. and Bergmann, R. and Minor, M. and Hanft, A., 150–164, 2008.
9. Craw, S., Wiratunga, N., Rowe, R.: Learning Adaptation Knowledge to Improve Case-Based Reasoning, in Artificial Intelligence, 170(16-17):1175–1192, 2006.
10. Hammond, K.: CHEF: A model of case-based planning, in Proceedings of the 5th National Conference on Artificial Intelligence, eds AAAI Press, 267–271, 1986.
11. Hanney, K., Keane, M. T.: The Adaptation Knowledge Bottleneck: How to Unblock it By Learning From Cases, in Proceedings of the 2nd International Conference on CBR, 359–370, 1997.
12. Leake, D., Kinley, A., Wilson, D.: Acquiring Case Adaptation Knowledge: A Hybrid Approach, in Proc. of the 13th National Conference on Artificial Intelligence, 684–689, 1996.
13. Melis, E., Lieber, J., Napoli, A.: Reformulation in Case-Based Reasoning, in Fourth European Workshop on Case-Based Reasoning, EWCBR-98, eds B. Smyth and P. Cunningham, Lecture Notes in Artificial Intelligence, 1488:172–183, 1998.