Structural abstract interpretation, A formal study using Coq
Abstract interpreters are tools to compute approximations for behaviors of a program. These approximations can then be used for optimisation or for error detection. In this paper, we show how to describe an abstract interpreter using the type-theory based the…
Authors: Yves Bertot (INRIA Sophia Antipolis)
Structural abstract interpretation
A formal study using Coq

Yves Bertot ⋆
INRIA Sophia-Méditerranée

⋆ This work was partially supported by ANR contract Compcert, ANR-05-SSIA-0019.

Abstract. Abstract interpreters are tools to compute approximations for behaviors of a program. These approximations can then be used for optimisation or for error detection. In this paper, we show how to describe an abstract interpreter using the type-theory based theorem prover Coq, using inductive types for syntax and structural recursive programming for the abstract interpreter's kernel. The abstract interpreter can then be proved correct with respect to a Hoare logic for the programming language.

1 Introduction

Higher-order logic theorem provers provide a description language that is powerful enough to describe programming languages. Inductive types can be used to describe the language's main data structure (the syntax) and recursive functions can be used to describe the behavior of instructions (the semantics). Recursive functions can also be used to describe tools to analyse or modify programs. In this paper, we will describe such a collection of recursive functions to analyse programs, based on abstract interpretation [7].

1.1 An example of abstract interpretation

We consider a small programming language with loop statements and assignments. Loops are written with the keywords while, do and done, assignments are written with :=, and several instructions can be grouped together, separating them with a semicolon. The instructions grouped using a semicolon are supposed to be executed in the same order as they are written. Comments are written after two slashes //.

We consider the following simple program:

    x := 0;                // line 1
    while x < 1000 do      // line 2
      x := x + 1           // line 3
    done                   // line 4

We want to design a tool that is able to gather information about the value of the variable x at each position in the program.
For instance here, we know that after executing the first line, x is always in the interval [0,0]; we know that before executing the assignment on the third line, x is always smaller than 1000 (because the test x < 1000 was just satisfied). With a little thinking, we can also guess that x increases as the loop executes, so that we can infer that before the third line, x is always in the interval [0,999]. On the other hand, after the third line, x is always in the interval [1,1000]. Now, if execution exits the loop, we can also infer that the test x < 1000 failed, so that we know that x is larger than or equal to 1000, but since it was at best in [0,1000] before the test, we can guess that x is exactly 1000 after executing the program. So we can write the following new program, where the only difference is the information added in the comments:

    // Nothing is known about x on this line
    x := 0;
    // 0 <= x <= 0
    while x < 1000 do
      // 0 <= x <= 999
      x := x + 1
      // 1 <= x <= 1000
    done
    // 1000 <= x <= 1000

We want to produce a tool that performs this analysis and produces the same kind of information for each line in the program. Our tool will do slightly more: first, it will also be able to take as input extra information about variables before entering the program; second, it will produce information about variables after executing the program; third, it will associate an invariant property to all while loops in the program. Such an invariant is a property that is true before and after all executions of the loop body (in our example the loop body is x := x+1).

A fourth feature of our tool is that it will be able to detect occasions when we can be sure that some code is never executed. In this case, it will mark the program points that are never reached with a false statement, meaning "when this point of the program is reached, the false statement can be proved" (in other words, this cannot happen).
Our tool will also be designed in such a way that it is guaranteed to terminate in reasonable time. Such a tool is called a static analysis tool, because the extra information can be obtained without running the program: in this example, executing the program requires at least a thousand operations, but our reasoning effort takes less than ten steps.

Tools of this kind are useful, for example to avoid bugs in programs or as part of efficient compilation techniques. For instance, the first mail-spread virus exploited a programming error known as a buffer overflow (an array update was operating outside the memory allocated for that array), but buffer overflows can be detected if we know over which interval each variable is likely to range.

1.2 Formal description and proofs

Users should be able to trust the information added in programs by the analysers. Program analysers are themselves programs and we can reason about their correctness. The program analysers we study in this paper are based on abstract interpretation [7] and we use the Coq system [13,3] to reason on their correctness.

The development described in this paper is available on the net at the following address (there are two versions, compatible with the latest stable release of Coq V8.1pl3 and with the upcoming version V8.2).

    http://hal.inria.fr/inria-00329572

This paper has 7 more sections. Section 2 gives a rough introduction to the notion of abstract interpretation. Section 3 describes the programming language that is used as our playground. The semantics of this programming language is described using a weakest pre-condition calculus. This weakest pre-condition calculus is later used to argue on the correctness of abstract interpreters.
In particular, abstract interpretation returns an annotated instruction and an abstract state, where the abstract state is used as a post-condition and the annotations in the instruction describe the abstract state at the corresponding point in the program. Section 4 describes a first simple abstract interpreter, where the main ideas around abstractly interpreting assignments and sequences are covered, but while loops are not treated. In Section 4, we also show that the abstract interpreter can be formally proved correct. In Section 5, we address while loops in more detail and in particular we show how tests can be handled in abstract interpretation, with applications to dead-code elimination. In Section 6, we observe that abstract interpretation is a general method that can be applied to a variety of abstract domains and we recapitulate the types, functions, and properties that are expected from each abstract domain. In Section 7, we show how the main abstract interpreter can be instantiated for a domain of intervals, thus making the analysis presented in the introduction possible. In Section 8, we give a few concluding remarks.

2 An intuitive view of abstract interpretation

Abstract interpretation is a technique for the static analysis of programs. The objective is to obtain a tool that will take programs as data, perform some symbolic computation, and return information about all executions of the input programs. One important aspect is that this tool should always terminate (hence the adjective static). The tool can then be used either directly to provide information about properties of variables in the program (as in the Astrée tool [8]), or as part of a compiler, where it can be used to guide optimization. For instance, the kind of interval-based analysis that we describe in this paper can be used to avoid run-time array-bound checking in languages that impose this kind of discipline, like Java.
The central idea of abstract interpretation is to replace the values normally manipulated in a program by sets of values, in such a way that all operations still make sense. For instance, if a program manipulates integer values and performs additions, we can decide to take an abstract point of view and only consider whether values are odd or even. With respect to addition, we can still obtain meaningful results, because we know, for instance, that adding an even and an odd value returns an odd value. Thus, we can decide to run programs with values taken in a new type that contains values even and odd, with an addition that respects the following table:

    odd + even = odd       even + odd = odd
    odd + odd = even       even + even = even

When defining abstract interpretation for a given abstract domain, all operations must be updated accordingly. The behavior of control instructions is also modified, because abstract values may not be precise enough to decide how a given decision should be taken. For instance, if we know that the abstract value for a variable x is odd, then we cannot tell which branch of a conditional statement of the following form will be taken:

    if x < 10 then x := 0 else x := 1.

After the execution of this conditional statement, the abstract value for x cannot be odd or even. This example also shows that the domain of abstract values must contain an abstract value that represents the whole set of values, or said differently, an abstract value that represents the absence of knowledge. This value will be called top later in the paper.

There must exist a connection between abstract values and concrete values for abstract interpretation to work well. This connection has been studied since [7] and is known as a Galois connection.
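The parity table from this section can be sketched in Coq as follows. This is only an illustration: the names parity, padd, and parity_of are hypothetical and do not come from the development described in the paper, and the choice of returning top whenever an argument is top is an assumption that the text suggests but does not spell out.

```coq
Require Import ZArith.
Open Scope Z_scope.

(* Hypothetical abstract domain: the two parities and "no information". *)
Inductive parity : Type := even | odd | top.

(* Abstract addition: follows the table in the text; any argument equal
   to top yields top (an assumption made for this sketch). *)
Definition padd (a b : parity) : parity :=
  match a, b with
  | even, even => even
  | odd, odd => even
  | even, odd => odd
  | odd, even => odd
  | _, _ => top
  end.

(* Abstraction of a single concrete integer. *)
Definition parity_of (n : Z) : parity :=
  if Z.even n then even else odd.

(* On this sample, abstracting the concrete sum agrees with adding
   the abstractions. *)
Example padd_sample :
  parity_of (3 + 4) = padd (parity_of 3) (parity_of 4).
Proof. reflexivity. Qed.
```

The example only checks one pair of values; a general statement would need a lemma about the parity of a sum, which is left out here.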
For instance, if the abstract values are even, odd, and top, and if we can infer that a value is in {2,4}, then correct choices for the abstract value are top or even, but obviously the abstract interpreter will work better if the more precise even is chosen.

Formal proofs of correctness for abstract interpretation were already studied before, in particular in [11]. The approach taken in this paper is different, in that it follows directly the syntax of a simple structured programming language, while traditional descriptions are tuned to studying a control-flow graph language. The main advantage of our approach is that it supports a very concise description of the abstract interpreter, with very simple verifications that it is terminating.

3 The programming language

In this case study, we work with a very small language containing only assignments, sequences, and while loops. The right-hand sides for assignments are expressions made of numerals, variables, and addition. The syntax of the programming language is as follows:

- variable names are noted x, y, x1, x′, etc.
- integers are noted n, n1, n′, etc.
- arithmetic expressions are noted e, e1, e′, etc. For our case study, these expressions can only take three forms:

      e ::= n | x | e1 + e2

- boolean expressions are noted b, b1, b′, etc. For our case study, these expressions can only take one form:

      b ::= e1 < e2

- instructions are noted i, i1, i′, etc. For our case study, these instructions can only take three forms:

      i ::= x := e | i1 ; i2 | while b do i done

For the Coq encoding, we use pre-defined strings for variable names and integers for the numeric values. Thus, we use unbounded integers, which is contrary to usual programming languages, but the question of using bounded integers or not is irrelevant for the purpose of this example.
3.1 Encoding the language

In our Coq encoding, the description of the various kinds of syntactic components is given by inductive declarations.

    Require Import String ZArith List.
    Open Scope Z_scope.

    Inductive aexpr : Type :=
      anum (x:Z) | avar (s:string) | aplus (e1 e2:aexpr).

    Inductive bexpr : Type := blt (e1 e2 : aexpr).

    Inductive instr : Type :=
      assign (x:string)(e:aexpr) | seq (i1 i2:instr) | while (b:bexpr)(i:instr).

The first two lines instruct Coq to load pre-defined libraries and to tune the parsing mechanism so that arithmetic formulas will be understood as formulas concerning integers by default. The definition for aexpr states that expressions can only have the three forms anum, avar, and aplus; it also expresses that the names anum, avar, and aplus can be used as functions of type Z -> aexpr, string -> aexpr, and aexpr -> aexpr -> aexpr, respectively. The definition of aexpr as an inductive type also implies that we can write recursive functions on this type. For instance, we will use the following function to evaluate an arithmetic expression, given a valuation function g, which maps every variable name to an integer value.

    Fixpoint af (g:string->Z)(e:aexpr) : Z :=
      match e with
        anum n => n
      | avar x => g x
      | aplus e1 e2 => af g e1 + af g e2
      end.

This function is defined by pattern-matching. There is one pattern for each possible form of arithmetic expression. The third line indicates that when the input e has the form anum n, then the value n is the result. The fourth line indicates that when the input has the form avar x, then the value is obtained by applying the function g to x. The fifth line describes the computation that is done when the expression is an addition. There are two recursive calls to the function af in the expression returned for the addition pattern.
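As a small worked example of the evaluation function af, the following sketch evaluates the expression x + 1 under a sample valuation. The valuation g0 and the example names are hypothetical and only serve the illustration; aexpr and af are restated from the text so that the block can be loaded on its own.

```coq
Require Import String ZArith List.
Open Scope Z_scope.

Inductive aexpr : Type :=
  anum (x:Z) | avar (s:string) | aplus (e1 e2:aexpr).

Fixpoint af (g:string->Z)(e:aexpr) : Z :=
  match e with
    anum n => n
  | avar x => g x
  | aplus e1 e2 => af g e1 + af g e2
  end.

(* Hypothetical valuation: x is mapped to 3, every other variable to 0. *)
Definition g0 (s:string) : Z :=
  if string_dec s "x"%string then 3 else 0.

(* The expression x + 1 evaluates to 3 + 1 under g0. *)
Example af_x_plus_1 :
  af g0 (aplus (avar "x"%string) (anum 1)) = 4.
Proof. reflexivity. Qed.

(* A variable other than x gets the default value of g0. *)
Example af_y :
  af g0 (avar "y"%string) = 0.
Proof. reflexivity. Qed.
```

Both examples are closed computations, so reflexivity is enough: Coq unfolds af along the structure of the expression and then evaluates the valuation.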
The recursive calls in af are made on direct subterms of the initial expression; this is known as structural recursion and guarantees that the recursive function will terminate on all inputs. A similar function bf is defined to describe the boolean value of a boolean expression.

3.2 The semantics of the programming language

To describe the semantics of the programming language, we simply give a weakest pre-condition calculus [9]. We describe the conditions that are necessary to ensure that a given logical property is satisfied at the end of the execution of an instruction, when this execution terminates. This weakest pre-condition calculus is defined as a pair of functions whose input is an instruction annotated with logical information at various points in the instruction. The output of the first function, called pc, is a condition that should be satisfied by the variables at the beginning of the execution (this is the pre-condition and it should be as easy to satisfy as possible, hence the adjective weakest); the output of the second function, called vc, is a collection of logical statements. When these statements are valid, we know that every execution starting from a state that satisfies the pre-condition will make the logical annotation satisfied at every point in the program and make the post-condition satisfied if the execution terminates.

Annotating programs. We need to define a new data-type for instructions annotated with assertions at various locations. Each assertion is a quantifier-free logical formula where the variables of the program can occur. The intended meaning is that the formula is guaranteed to hold for every execution of the program that is consistent with the initial assertion.

The syntax for assertions is described as follows:

    Inductive assert : Type :=
      pred (p:string)(l:list aexpr)
    | a_b (b:bexpr)
    | a_conj (a1 a2:assert)
    | a_not (a: assert)
    | a_true
    | a_false.
This definition states that assertions can have six forms: the first form represents the application of a predicate to an arbitrary list of arithmetic expressions; the second represents a boolean test (this assertion holds when the boolean test evaluates to true); the third form is the conjunction of two assertions; the fourth form is the negation of an assertion; the fifth and sixth forms give two constant assertions, which are always and never satisfied, respectively. In a minimal description of a weakest pre-condition calculus, as in [2], the last two constants are not necessary, but they will be useful in our description of the abstract interpreter.

Logical annotations play a central role in our case study, because the result of abstract interpretation will be to add information about each point in the program: this new information will be described by assertions.

To consider whether an assertion holds, we need to know what meaning is attached to each predicate name and what value is attached to each variable name. We suppose the meaning of predicates is given by a function m that maps predicate names and lists of integers to propositional values, and the value of variables is given by a valuation as in the function af given above. Given such a meaning for predicates and such a valuation function for variables, we describe the computation of the property associated to an assertion as follows:

    Fixpoint ia (m:string->list Z->Prop)(g:string->Z)(a:assert) : Prop :=
      match a with
        pred s l => m s (map (af g) l)
      | a_b b => bf g b = true
      | a_conj a1 a2 => (ia m g a1) /\ (ia m g a2)
      | a_not a => not (ia m g a)
      | a_true => True
      | a_false => False
      end.

The type of this function exhibits a specificity of type theory-based theorem proving: propositions are described by types. The Coq system also provides a type of types, named Prop, whose elements are the types that are intended to be used as propositions.
Each of these types contains the proofs of the proposition it represents. This is known as the Curry-Howard isomorphism. For instance, the propositions that are unprovable are represented by empty types. Here, assertions are data and their interpretations as propositions are types, which belong to the Prop type. More details about this description of propositions as types are given in another article on type theory in the same volume.

Annotated instructions are in a new data-type, named a_instr, which is very close to the instr data-type. The two modifications are as follows: first, an extra operator pre is added to make it possible to attach assertions to any instruction; second, while loops are mandatorily annotated with an invariant assertion. In concrete syntax, we will write { a } i for the instruction i carrying the assertion a (noted pre a i in the Coq encoding).

    Inductive a_instr : Type :=
      pre (a:assert)(i:a_instr)
    | a_assign (x:string)(e:aexpr)
    | a_seq (i1 i2:a_instr)
    | a_while (b:bexpr)(a:assert)(i:a_instr).

Reasoning on assertions. We can reason on annotated programs, because there are logical reasons for programs to be consistent with assertions. The idea is to compute a collection of logical formulas associated to an annotated program and a final logical formula, the post-condition. When this collection of formulas holds, there exists another logical formula, the pre-condition, whose satisfaction before executing the program is enough to guarantee that the post-condition holds after executing the program.

Annotations added to an instruction (with the help of the pre construct) must be understood as formulas that hold just before executing the annotated instruction. Assertions added to while loops must be understood as invariants: they are meant to hold at the beginning and at the end every time the inner part of the while loop is executed.
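To make the a_instr data-type more concrete, here is a sketch of how a small annotated loop, similar to the introductory example, could be written as a Coq value. The names vx, inv, and annotated_loop are hypothetical, as is the bound 10; the data-types are restated from the text so that the block is self-contained.

```coq
Require Import String ZArith List.
Open Scope Z_scope.

Inductive aexpr : Type :=
  anum (x:Z) | avar (s:string) | aplus (e1 e2:aexpr).
Inductive bexpr : Type := blt (e1 e2 : aexpr).
Inductive assert : Type :=
  pred (p:string)(l:list aexpr) | a_b (b:bexpr)
| a_conj (a1 a2:assert) | a_not (a: assert) | a_true | a_false.
Inductive a_instr : Type :=
  pre (a:assert)(i:a_instr) | a_assign (x:string)(e:aexpr)
| a_seq (i1 i2:a_instr) | a_while (b:bexpr)(a:assert)(i:a_instr).

(* Hypothetical shorthand for the variable x. *)
Definition vx : aexpr := avar "x"%string.

(* Invariant 0 <= x <= 10, expressed with the only available
   comparison (<): not (x < 0) and not (10 < x). *)
Definition inv : assert :=
  a_conj (a_not (a_b (blt vx (anum 0))))
         (a_not (a_b (blt (anum 10) vx))).

(* x := 0; while x < 10 do { inv } x := x + 1 done *)
Definition annotated_loop : a_instr :=
  a_seq (a_assign "x"%string (anum 0))
        (a_while (blt vx (anum 10)) inv
                 (a_assign "x"%string (aplus vx (anum 1)))).
```

Because the assertion language only offers the test <, bounds such as 0 <= x have to be phrased as negated tests, which is one reason the a_not constructor is needed.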
When assertions are present in the annotated instruction, they are taken for granted. For instance, when the instruction is {x = 3} x := x + 1, the computed pre-condition is x = 3, whatever the post-condition is.

When the instruction is a plain assignment, one can find the pre-condition by substituting the assigned variable with the assigned expression in the post-condition. For instance, when the post-condition is x = 4 and the instruction is the assignment x := x + 1, it suffices that the pre-condition x + 1 = 4 is satisfied before executing the assignment to ensure that the post-condition is satisfied after executing it.

When the annotated instruction is a while loop, the pre-condition simply is the invariant for this while loop. When the annotated instruction is a sequence of two instructions, the pre-condition is the pre-condition computed for the first of the two instructions, but using the pre-condition of the second instruction as the post-condition for the first instruction.

Coq encoding for pre-condition computation. To encode this pre-condition function in Coq, we need to describe functions that perform the substitution of a variable with an arithmetic expression in arithmetic expressions, boolean expressions, and assertions. These substitution functions are given as follows:

    Fixpoint asubst (x:string) (s:aexpr) (e:aexpr) : aexpr :=
      match e with
        anum n => anum n
      | avar x1 => if string_dec x x1 then s else e
      | aplus e1 e2 => aplus (asubst x s e1) (asubst x s e2)
      end.

    Definition bsubst (x:string) (s:aexpr) (b:bexpr) : bexpr :=
      match b with
        blt e1 e2 => blt (asubst x s e1) (asubst x s e2)
      end.

    Fixpoint subst (x:string) (s:aexpr) (a:assert) : assert :=
      match a with
        pred p l => pred p (map (asubst x s) l)
      | a_b b => a_b (bsubst x s b)
      | a_conj a1 a2 => a_conj (subst x s a1) (subst x s a2)
      | a_not a => a_not (subst x s a)
      | any => any
      end.

In the definition of asubst, the function string_dec compares two strings for equality.
The value returned by string_dec can be used in an if-then-else construct, but it is not a boolean value (more detail can be found in [3]). The rest of the code is just a plain traversal of the structure of expressions and assertions. Note also that the last pattern-matching rule in subst is used for both a_true and a_false.

Once we know how to substitute a variable with an expression, we can easily describe the computation of the pre-condition for an annotated instruction and a post-condition. This is given by the following simple recursive procedure:

    Fixpoint pc (i:a_instr) (post : assert) : assert :=
      match i with
        pre a i => a
      | a_assign x e => subst x e post
      | a_seq i1 i2 => pc i1 (pc i2 post)
      | a_while b a i => a
      end.

A verification condition generator. When it receives an instruction carrying an annotation, the function pc simply returns the annotation. In this sense, the pre-condition function takes the annotation for granted. To make sure that an instruction is consistent with its pre-condition, we need to check that the assertion really is strong enough to ensure the post-condition. For instance, when the post-condition is x < 10 and the instruction is the annotated assignment { x = 2 } x := x + 1, satisfying x = 2 before the assignment is enough to ensure that the post-condition is satisfied. On the other hand, if the annotated instruction was { x < 10 } x := x + 1, there would be a problem because there are cases where x < 10 holds before executing the assignment and x < 10 does not hold after.

In fact, for assignments that are not annotated with assertions, the function pc computes the best formula, the weakest pre-condition. Thus, in presence of an annotation, it suffices to verify that the annotation does imply the weakest pre-condition. We are now going to describe a function that collects all the verifications that need to be done.
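The behaviour of pc on the two kinds of assignments discussed above can be checked on small terms. The following sketch restates the definitions from the text and adds hypothetical example values (vx, incr, lt10) that are not part of the paper's development.

```coq
Require Import String ZArith List.
Open Scope Z_scope.

Inductive aexpr : Type :=
  anum (x:Z) | avar (s:string) | aplus (e1 e2:aexpr).
Inductive bexpr : Type := blt (e1 e2 : aexpr).
Inductive assert : Type :=
  pred (p:string)(l:list aexpr) | a_b (b:bexpr)
| a_conj (a1 a2:assert) | a_not (a: assert) | a_true | a_false.
Inductive a_instr : Type :=
  pre (a:assert)(i:a_instr) | a_assign (x:string)(e:aexpr)
| a_seq (i1 i2:a_instr) | a_while (b:bexpr)(a:assert)(i:a_instr).

Fixpoint asubst (x:string) (s:aexpr) (e:aexpr) : aexpr :=
  match e with
    anum n => anum n
  | avar x1 => if string_dec x x1 then s else e
  | aplus e1 e2 => aplus (asubst x s e1) (asubst x s e2)
  end.

Definition bsubst (x:string) (s:aexpr) (b:bexpr) : bexpr :=
  match b with blt e1 e2 => blt (asubst x s e1) (asubst x s e2) end.

Fixpoint subst (x:string) (s:aexpr) (a:assert) : assert :=
  match a with
    pred p l => pred p (map (asubst x s) l)
  | a_b b => a_b (bsubst x s b)
  | a_conj a1 a2 => a_conj (subst x s a1) (subst x s a2)
  | a_not a => a_not (subst x s a)
  | any => any
  end.

Fixpoint pc (i:a_instr) (post : assert) : assert :=
  match i with
    pre a i => a
  | a_assign x e => subst x e post
  | a_seq i1 i2 => pc i1 (pc i2 post)
  | a_while b a i => a
  end.

(* Hypothetical example values. *)
Definition vx : aexpr := avar "x"%string.
Definition incr : a_instr := a_assign "x"%string (aplus vx (anum 1)).
Definition lt10 : assert := a_b (blt vx (anum 10)).   (* x < 10 *)

(* Plain assignment: x + 1 is substituted for x in the post-condition,
   giving x + 1 < 10. *)
Example pc_assign :
  pc incr lt10 = a_b (blt (aplus vx (anum 1)) (anum 10)).
Proof. reflexivity. Qed.

(* Annotated assignment: the annotation is returned unchanged,
   whatever the post-condition is. *)
Example pc_annotated : pc (pre a_true incr) lt10 = a_true.
Proof. reflexivity. Qed.
```

The second example shows why a separate check is needed: pc accepts the annotation a_true even though it does not guarantee x < 10 after the assignment.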
More precisely, the verification condition generator will compute conditions that are sufficient to ensure that the pre-condition from the previous section is strong enough to guarantee that the post-condition holds after executing the program, when the program terminates.

The verification that an annotated instruction is consistent with a post-condition thus returns a sequence of implications between assertions. When all these implications are logically valid, there is a guarantee that satisfying the pre-condition before executing the instruction is enough to ensure that the post-condition will also be satisfied after executing the instruction. This guarantee is proved formally in [2].

When the instruction is a plain assignment without annotation, there is no need to verify any implication because the computed pre-condition is already good enough.

When the instruction is an annotated instruction { A } i and the post-condition is P, we can first compute the pre-condition P′ and a list of implications l for the instruction i and the post-condition P. We then only need to add A ⇒ P′ to l to get the list of conditions for the whole instruction.

For instance, when the post-condition is x = 3 and the instruction is the assignment x := x+1, the pre-condition computed by pc is x + 1 = 3 and this is obviously good enough for the post-condition to be satisfied. On the other hand, when the instruction is an annotated instruction, { P } x := x+1, we need to verify that P ⇒ x + 1 = 3 holds.

If we look again at the first example in this section, concerning an instruction {x < 10} x := x+1 and a post-condition x < 10, there is a problem, because a value of 9 satisfies the pre-condition, but execution leads to a value of 10, which does not satisfy the post-condition. The condition generator constructs a condition of the form x < 10 ⇒ x + 1 < 10.
The fact that this logical formula is actually unprovable relates to the fact that the triplet composed of the pre-condition, the assignment, and the post-condition is actually inconsistent.

When the instruction is a sequence of two instructions i1; i2 and the post-condition is P, we need to compute lists of conditions for both sub-components i1 and i2. The list of conditions for i2 is computed for the post-condition of the whole construct, P, while the list of conditions for i1 is computed taking as post-condition the pre-condition of i2 for P. This is consistent with the intuitive explanation that it suffices that the pre-condition for an instruction holds to ensure that the post-condition will hold after executing that instruction. If we want P to hold after executing i2, we need the pre-condition of i2 for P to be satisfied and it is the responsibility of the instruction i1 to guarantee this. Thus, the conditions for i1 can be computed with this assertion as a post-condition.

When the instruction is a while loop, of the form while b do { A } i done, we must remember that the assertion A should be an invariant during the loop execution. This is expressed by requiring that A being satisfied before executing i is enough to guarantee that A is also satisfied after executing i. However, this is needed only in the cases where the loop test b is also satisfied, because when b is not satisfied the inner instruction of the while loop is not executed. At the end of the execution, we can use the information that the invariant A is satisfied and the information that the loop has been exited because the test eventually failed. The program is consistent when these two logical properties are enough to imply the initial post-condition P.
Thus, we must first compute the pre-condition A′ for the inner instruction i and the post-condition A, compute the list of conditions for i with A as post-condition, add the condition A ∧ b ⇒ A′, and add the condition A ∧ ¬b ⇒ P.

Coq encoding of the verification condition generator. The verification conditions always are implications. We provide a new data-type for these implications:

    Inductive cond : Type := imp (a1 a2:assert).

The computation of verification conditions is then simply described as a plain recursive function, which follows the structure of annotated instructions.

    Fixpoint vc (i:a_instr)(post : assert) : list cond :=
      match i with
        pre a i => (imp a (pc i post))::vc i post
      | a_assign _ _ => nil
      | a_seq i1 i2 => vc i1 (pc i2 post)++vc i2 post
      | a_while b a i =>
          (imp (a_conj a (a_b b)) (pc i a))::
          (imp (a_conj a (a_not (a_b b))) post)::
          vc i a
      end.

Describing the semantics of a programming language using a verification condition generator is not the only approach that can be used to describe the language. In fact, this approach is partial, because it describes properties of inputs and outputs when instruction execution terminates, but it gives no information about termination. More precise descriptions can be given using operational or denotational semantics and the consistency of this verification condition generator with such a complete semantics can also be verified formally. This is done in [2], but it is not the purpose of this article.

When reasoning about the correctness of a given annotated instruction, we can use the function vc to obtain a list of conditions. It is then necessary to reason on the validity of this list of conditions. What we want to verify is that the implications hold for every possible instantiation of the program variables. This is described by the following function.

    Fixpoint valid (m:string->list Z ->Prop) (l:list cond) : Prop :=
      match l with
        nil => True
      | c::tl =>
          (let (a1, a2) := c in forall g, ia m g a1 -> ia m g a2) /\ valid m tl
      end.
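The verification condition generator can be exercised on the two situations discussed in the text: the annotated assignment {x < 10} x := x + 1, and a while loop. The sketch below restates the definitions from the text; the example values (vx, incr, lt10, inv) and the example names are hypothetical. It only shows which conditions vc produces, and does not attempt to prove or refute their validity.

```coq
Require Import String ZArith List.
Open Scope Z_scope.

Inductive aexpr : Type :=
  anum (x:Z) | avar (s:string) | aplus (e1 e2:aexpr).
Inductive bexpr : Type := blt (e1 e2 : aexpr).
Inductive assert : Type :=
  pred (p:string)(l:list aexpr) | a_b (b:bexpr)
| a_conj (a1 a2:assert) | a_not (a: assert) | a_true | a_false.
Inductive a_instr : Type :=
  pre (a:assert)(i:a_instr) | a_assign (x:string)(e:aexpr)
| a_seq (i1 i2:a_instr) | a_while (b:bexpr)(a:assert)(i:a_instr).

Fixpoint asubst (x:string) (s:aexpr) (e:aexpr) : aexpr :=
  match e with
    anum n => anum n
  | avar x1 => if string_dec x x1 then s else e
  | aplus e1 e2 => aplus (asubst x s e1) (asubst x s e2)
  end.
Definition bsubst (x:string) (s:aexpr) (b:bexpr) : bexpr :=
  match b with blt e1 e2 => blt (asubst x s e1) (asubst x s e2) end.
Fixpoint subst (x:string) (s:aexpr) (a:assert) : assert :=
  match a with
    pred p l => pred p (map (asubst x s) l)
  | a_b b => a_b (bsubst x s b)
  | a_conj a1 a2 => a_conj (subst x s a1) (subst x s a2)
  | a_not a => a_not (subst x s a)
  | any => any
  end.
Fixpoint pc (i:a_instr) (post : assert) : assert :=
  match i with
    pre a i => a
  | a_assign x e => subst x e post
  | a_seq i1 i2 => pc i1 (pc i2 post)
  | a_while b a i => a
  end.

Inductive cond : Type := imp (a1 a2:assert).
Fixpoint vc (i:a_instr)(post : assert) : list cond :=
  match i with
    pre a i => (imp a (pc i post))::vc i post
  | a_assign _ _ => nil
  | a_seq i1 i2 => vc i1 (pc i2 post)++vc i2 post
  | a_while b a i =>
      (imp (a_conj a (a_b b)) (pc i a))::
      (imp (a_conj a (a_not (a_b b))) post)::
      vc i a
  end.

(* Hypothetical example values. *)
Definition vx : aexpr := avar "x"%string.
Definition incr : a_instr := a_assign "x"%string (aplus vx (anum 1)).
Definition lt10 : assert := a_b (blt vx (anum 10)).           (* x < 10 *)
Definition inv : assert := a_not (a_b (blt (anum 10) vx)).    (* not (10 < x) *)

(* {x < 10} x := x + 1 with post-condition x < 10 yields the single
   condition x < 10 => x + 1 < 10. *)
Example vc_annotated :
  vc (pre lt10 incr) lt10
  = imp lt10 (a_b (blt (aplus vx (anum 1)) (anum 10))) :: nil.
Proof. reflexivity. Qed.

(* while x < 10 do {inv} x := x + 1 done with post-condition
   not (x < 10) yields the two loop conditions and nothing else. *)
Example vc_loop :
  vc (a_while (blt vx (anum 10)) inv incr) (a_not lt10)
  = imp (a_conj inv lt10)
        (a_not (a_b (blt (anum 10) (aplus vx (anum 1)))))
    :: imp (a_conj inv (a_not lt10)) (a_not lt10)
    :: nil.
Proof. reflexivity. Qed.
```

In vc_loop, the first condition states that the invariant is preserved by the body when the test holds, and the second that the invariant together with the failed test implies the post-condition.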
An annotated program i is consistent with a given post-condition p when the property valid m (vc i p) holds. This means that the post-condition is guaranteed to hold after executing the instruction if the computed pre-condition was satisfied before the execution and the execution of the instruction terminates.

3.3 A monotonicity property

In our study of an abstract interpreter, we will use a property of the condition generator.

Theorem 1. For every annotated instruction i, if p1 and p2 are two post-conditions such that p1 is stronger than p2, if the pre-condition for i and p1 is satisfied and all the verification conditions for i and the post-condition p1 are valid, then the pre-condition for i and p2 is also satisfied and the verification conditions for i and p2 are also valid.

Proof. This proof is done in the context of a given mapping from predicate names to actual predicates, m. The property is proved by induction on the structure of the instruction i. The statement p1 is stronger than p2 when the implication p1 ⇒ p2 is valid. In other words, for every assignment of variables g, the logical value of p1 implies the logical value of p2.

If the instruction is an assignment, we can rely on a lemma: the value of any assertion subst x e p in any valuation g is equal to the value of the assertion p in the valuation g′ that is equal to g on every variable but x, for which it returns the value of e in the valuation g. Thus, the pre-condition for the assignment x := e and each post-condition pk (k = 1, 2) is subst x e pk, and the validity of subst x e p1 ⇒ subst x e p2 simply is an instance of the validity of p1 ⇒ p2, which is given by hypothesis. Also, when the instruction is an assignment, there is no generated verification condition and the second part of the statement holds.
If the instruction is a sequence i1; i2, then we know by induction hypothesis that the pre-condition p1′ for i2 and p1 is stronger than the pre-condition p2′ for i2 and p2 and all the verification conditions for that part are valid; we can use an induction hypothesis again to obtain that the pre-condition for i1 and p1′ is stronger than the pre-condition for i1 and p2′, and the corresponding verification conditions are all valid. The last two pre-conditions are the ones we need to compare, and the whole set of verification conditions is the union of the sets which we know are valid.

If the instruction is an annotated instruction { a } i, the two pre-conditions for p2 and p1 are always a, so the first part of the statement trivially holds. Moreover, we know by induction hypothesis that the pre-condition p1′ for i and p1 is stronger than the pre-condition p2′ for i and p2. The verification conditions for the whole instruction and p1 (resp. p2) are the same as for the sub-instruction, with the condition a ⇒ p1′ (resp. a ⇒ p2′) added. By hypothesis, a ⇒ p1′ holds, and by induction hypothesis p1′ ⇒ p2′; we can thus deduce that a ⇒ p2′ holds.

If the instruction is a loop while b do { a } i done, most verification conditions and generated pre-conditions only depend on the loop invariant. The only thing that we need to check is the verification condition containing the invariant, the negation of the test and the post-condition. By hypothesis, a ∧ ¬b ⇒ p1 and p1 ⇒ p2 are valid. By transitivity of implication we obtain a ∧ ¬b ⇒ p2 easily.

In Coq, we first prove a lemma that expresses that the satisfiability of an assertion a where a variable x is substituted with an arithmetic expression e' for a valuation g is the same as the satisfiability of the assertion a without substitution, but for a valuation that maps x to the value of e' in g and coincides with g for all other variables.
    Lemma subst_sound : forall m g a x e',
      ia m g (subst x e' a) =
      ia m (fun y => if string_dec x y then af g e' else g y) a.

This lemma requires similar lemmas for arithmetic expressions, boolean expressions, and lists of expressions. All are proved by induction on the structure of expressions.

An example proof for substitution

For instance, the statement for the substitution in arithmetic expressions is as follows:

    Lemma subst_sound_a : forall g e x e',
      af g (asubst x e' e) =
      af (fun y => if string_dec x y then af g e' else g y) e.

The proof can be done in Coq by an induction on the expression e. This leads the system to generate three cases, corresponding to the three constructors of the aexpr type. The combined tactic we use is as follows:

    intros g e x e'; induction e; simpl; auto.

The tactic induction e generates three goals and the tactics simpl and auto are applied to all of them. One of the cases is the case for the anum constructor, where both instances of the af function compute to the value carried by the constructor; thus simpl forces the computation and leads to an equality where both sides are equal. In this case, auto solves the goal. Only the other two goals remain. The first of them is concerned with the avar constructor. In this case the expression has the form avar s and the expression asubst x e' (avar s) is transformed into the following expression by the simpl tactic.

    if string_dec x s then e' else (avar s)

For this case, the system displays a goal that has the following shape:

    g : string -> Z
    s : string
    x : string
    e' : aexpr
    ============================
    af g (if string_dec x s then e' else avar s) =
    (if string_dec x s then af g e' else g s)

In Coq goals, the information that appears above the horizontal bar is data that is known to exist, and the information below the horizontal bar is the expression that we need to prove. Here the information that is known only corresponds to typing information.
We need to reason by cases on the values of the expression string_dec x s. The tactic case ... is used for this purpose. It generates two goals, one corresponding to the case where string_dec x s has an affirmative value and one corresponding to the case where string_dec x s has a negative value. In each goal, the if-then-else constructs are reduced accordingly. In the goal where string_dec x s is affirmative, both sides of the equality reduce to af g e'; in the other goal, both sides of the equality reduce to g s. Thus in both cases, the proof becomes easy. This reasoning step is easily expressed with the following combined tactic:

    case (string_dec x s); auto.

There only remains a goal for the last possible form of arithmetic expression, aplus e1 e2. The induction tactic provides induction hypotheses stating that the property we want to prove already holds for e1 and e2. After symbolic computation of the functions af and asubst, as performed by the simpl tactic, the goal has the following shape:

    ...
    IHe1 : af g (asubst x e' e1) =
           af (fun y : string => if string_dec x y then af g e' else g y) e1
    IHe2 : af g (asubst x e' e2) =
           af (fun y : string => if string_dec x y then af g e' else g y) e2
    ============================
    af g (asubst x e' e1) + af g (asubst x e' e2) =
    af (fun y : string => if string_dec x y then af g e' else g y) e1 +
    af (fun y : string => if string_dec x y then af g e' else g y) e2

This proof can be finished by rewriting with the two equalities named IHe1 and IHe2 and then recognizing that both sides of the equality are the same, as done by the following tactics.

    rewrite IHe1, IHe2; auto.
    Qed.

We can now turn our attention to the main result, which is expressed as the following statement:

    Lemma vc_monotonic : forall m i p1 p2,
      (forall g, ia m g p1 -> ia m g p2) ->
      valid m (vc i p1) ->
      valid m (vc i p2) /\
      (forall g, ia m g (pc i p1) -> ia m g (pc i p2)).
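Before looking at the proof of the main result, the proof of the substitution lemma for arithmetic expressions described above can be replayed as a self-contained script. This is only a sketch: the definitions of aexpr, af and asubst are reconstructed here from the way they are used in the text, and may differ in detail from the ones used in the actual development. The explicit introduction patterns and the bullets are a presentation choice, not the original proof script.

```coq
Require Import String ZArith.
Open Scope Z_scope.

(* Assumed syntax of arithmetic expressions: variables, constants, sums. *)
Inductive aexpr : Type :=
  avar (s : string) | anum (n : Z) | aplus (e1 e2 : aexpr).

(* Concrete evaluation of an expression in a valuation g. *)
Fixpoint af (g : string -> Z) (e : aexpr) : Z :=
  match e with
  | avar s => g s
  | anum n => n
  | aplus e1 e2 => af g e1 + af g e2
  end.

(* Substitution of e' for the variable x inside e. *)
Fixpoint asubst (x : string) (e' e : aexpr) : aexpr :=
  match e with
  | avar s => if string_dec x s then e' else avar s
  | anum n => anum n
  | aplus e1 e2 => aplus (asubst x e' e1) (asubst x e' e2)
  end.

Lemma subst_sound_a : forall g e x e',
  af g (asubst x e' e) =
  af (fun y => if string_dec x y then af g e' else g y) e.
Proof.
  intros g e x e'.
  induction e as [s | n | e1 IHe1 e2 IHe2]; simpl.
  - (* variable: reason by cases on whether x and s are equal *)
    case (string_dec x s); intros; reflexivity.
  - (* constant: both sides compute to n *)
    reflexivity.
  - (* addition: rewrite with the two induction hypotheses *)
    rewrite IHe1, IHe2; reflexivity.
Qed.
```

The three bullets correspond to the three constructors of aexpr, in the order in which they are declared.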
To express that the proof of vc_monotonic is done by induction on the structure of instructions, the first tactic sent to the proof system has the form:

    intros m; induction i; intros p1 p2 p1p2 v1.

The proof then has four cases, which are solved in about 10 lines of proof script.

4 A first simple abstract interpreter

We shall now define two abstract interpreters, which run instructions symbolically, updating an abstract state at each step. The abstract state is then transformed into a logical expression which is added to the instructions, thus producing an annotated instruction. The abstract state is also returned at the end of execution, in one of two forms. In the first simple abstract interpreter, the final abstract state is simply returned. In the second abstract interpreter, only an optional abstract state will be returned, a None value being used when the abstract interpreter can detect that the program can never terminate: the second abstract interpreter will also perform dead code detection. For example, if we give our abstract interpreter an input state stating that x is even and y is odd and the instruction x := x+y; y := y+1, the resulting value will be:

    ({even x /\ odd y} x := x+y; {odd x /\ odd y} y := y+1,
     (x, odd)::(y, even)::nil)

We suppose there exists a data-type A whose elements will represent abstract values on which instructions are supposed to compute. For instance, the data-type A could be the type containing three values even, odd, and top. Another traditional example of abstract data-type is the type of intervals, which are either of the form [m, n], with m ≤ n, [−∞, n], [m, +∞], or [−∞, +∞].

The data-type of abstract values should come with a few elements and functions, which we will describe progressively.

4.1 Using Galois connections

Abstract values represent specific sets of concrete values. There is a natural order on sets: set inclusion.
Similarly, we can consider an order on abstract values, which mimics the order between the sets they represent. The traditional approach to describe this correspondence between the order on sets of values and the order on abstract values is to consider that the type of abstract values is given with a pair of functions α and γ, where α : P(Z) → A and γ : A → P(Z). The function γ maps any abstract value to the set of concrete values it represents. The function α maps any set of concrete values to the smallest abstract value whose interpretation as a set contains the input. Written in a mathematical formula where ⊑ denotes the order on abstract values, the two functions and the orders on sets of concrete values and on abstract values are related by the following statement:

\[ \forall a \in A,\ \forall b \in \mathcal{P}(\mathbb{Z}).\quad b \subset \gamma(a) \Leftrightarrow \alpha(b) \sqsubseteq a. \]

When the functions α and γ are given with this property, one says that there is a Galois connection.

In our study of abstract interpretation, the functions α and γ do not appear explicitly. In a sense, γ will be represented by a function to_pred mapping abstract values to assertions depending on arithmetic expressions. However, it is useful to keep these functions in mind when trying to figure out what properties are expected for the various components of our abstract interpreters, as we will see in the next section.

4.2 Abstract evaluation of arithmetic expressions

Arithmetic expressions contain integer constants and additions, neither of which is concerned with the data-type of abstract values. To be able to associate an abstract value to an arithmetic expression, we need to find ways to establish a correspondence between concrete values and abstract values.
This is done by supposing the existence of two functions and a constant, which are the first three values axiomatized for the data-type of abstract values (but there will be more later):

- from_Z : Z -> A, this is used to associate a relevant abstract value to any concrete value,
- a_add : A -> A -> A, this is used to add two abstract values,
- top : A, this is used to represent the abstract value that carries no information.

In terms of Galois connections, the function from_Z corresponds to the function α, when applied to singletons. The function a_add must be designed in such a way that the following property is satisfied:

\[ \forall v_1\, v_2,\quad \{\, x + y \mid x \in \gamma(v_1),\ y \in \gamma(v_2) \,\} \subset \gamma(\mathtt{a\_add}\ v_1\ v_2). \]

With this constraint, a function that maps any pair of abstract values to top would be acceptable, however it would be useless. It is better if a_add v1 v2 is the least abstract value such that the above property is satisfied.

The value top is the maximal element of A, the image of the whole Z by the function α.

4.3 Handling abstract states

When computing the value of a variable, we suppose that this value is given by looking up in a state, which actually is a list of pairs of variables and abstract values.

    Definition state := list (string*A).

    Fixpoint lookup (s:state) (x:string) : A :=
      match s with
        nil => top
      | (y,v)::tl => if string_dec x y then v else lookup tl x
      end.

As we see in the definition of lookup, when a value is not defined in a state, the function behaves as if it was defined with top as abstract value. The computation of abstract values for arithmetic expressions is then described by the following function.

    Fixpoint a_af (s:state)(e:aexpr) : A :=
      match e with
        avar x => lookup s x
      | anum n => from_Z n
      | aplus e1 e2 => a_add (a_af s e1) (a_af s e2)
      end.

When executing assignments abstractly, we are also supposed to modify the state.
If the state contained no previous information about the assigned variable, a new pair is created. Otherwise, the first existing pair must be updated. This is done with the following function.

    Fixpoint a_upd (x:string)(v:A)(l:state) : state :=
      match l with
        nil => (x,v)::nil
      | (y,v')::tl =>
        if string_dec x y then (y, v)::tl else (y,v')::a_upd x v tl
      end.

Later in this paper, we define a function that generates assertions from states. For this purpose, it is better to update by modifying existing pairs of a variable and a value rather than just inserting the new pair in front.

4.4 The interpreter's main function

When computing abstract interpretation on instructions we want to produce a final abstract state and an annotated instruction. We will need a way to transform an abstract value into an assertion. This is given by a function with the following type:

- to_pred : A -> aexpr -> assert, this is used to express that the value of the arithmetic expression in a given valuation will belong to the set of concrete values represented by the given abstract value.

So to_pred is axiomatized in the same sense as from_Z, a_add, and top. Relying on the existence of to_pred, we can define a function that maps states to assertions:

    Fixpoint s_to_a (s:state) : assert :=
      match s with
        nil => a_true
      | (x,a)::tl => a_conj (to_pred a (avar x)) (s_to_a tl)
      end.

This function is implemented in such a manner that all pairs present in the state are transformed into assertions. For this reason, it is important that a_upd works by modifying existing pairs rather than hiding them.

Our first simple abstract interpreter only implements a trivial behavior for while loops. Basically, this says that no information can be gathered for while loops (the resulting state is nil, and the while loop's invariant is the assertion generated from nil).
    Fixpoint ab1 (i:instr)(s:state) : a_instr*state :=
      match i with
        assign x e =>
        (pre (s_to_a s) (a_assign x e), a_upd x (a_af s e) s)
      | seq i1 i2 =>
        let (a_i1, s1) := ab1 i1 s in
        let (a_i2, s2) := ab1 i2 s1 in
          (a_seq a_i1 a_i2, s2)
      | while b i =>
        let (a_i, _) := ab1 i nil in
          (a_while b (s_to_a nil) a_i, nil)
      end.

In this function, we see that the abstract interpretation of sequences is simply described as composing the effect on states and recombining the instructions obtained from each component of the sequence.

4.5 Expected properties for abstract values

To prove the correctness of the abstract interpreter, we need to know that the various functions and values provided around the type A satisfy a collection of properties. These are gathered as a set of hypotheses.

One value that we have not talked about yet is the mapping from predicate names to actual predicates on integers, which is necessary to interpret the assertions generated by to_pred. This is given axiomatically, like top and the others:

- m : string -> list Z -> Prop, maps all predicate names used in to_pred to actual predicates on integers.

The first hypothesis expresses that top brings no information.

    Hypothesis top_sem : forall e, (to_pred top e) = a_true.

The next two hypotheses express that the predicates associated to each abstract value are parametric with respect to the arithmetic expression they receive. Their truth does not depend on the exact shape of the expressions, but only on the concrete value such an arithmetic expression may take in the current valuation. Similarly, substitution basically affects the arithmetic expression part of the predicate, not the part that depends on the abstract value.

    Hypothesis to_pred_sem : forall g v e,
      ia m g (to_pred v e) = ia m g (to_pred v (anum (af g e))).

    Hypothesis subst_to_pred : forall v x e e',
      subst x e' (to_pred v e) = to_pred v (asubst x e' e).
For instance, if the abstract values are intervals, it is natural that the to_pred function will map an abstract value [3,10] and an arithmetic expression e to an assertion between(3, e, 10). When evaluating this assertion with respect to a given valuation g, the integers 3 and 10 will not be affected by g. Similarly, substitution will not affect these integers.

The last two hypotheses express that the interpretation of the associated predicates for abstract values obtained through from_Z and a_add are consistent with the concrete values computed for immediate integers and additions. The hypothesis from_Z_sem actually establishes the correspondence between from_Z and the abstraction function α of a Galois connection. The hypothesis a_add_sem expresses the condition which we described informally when introducing the function a_add.

    Hypothesis from_Z_sem : forall g x,
      ia m g (to_pred (from_Z x) (anum x)).

    Hypothesis a_add_sem : forall g v1 v2 x1 x2,
      ia m g (to_pred v1 (anum x1)) ->
      ia m g (to_pred v2 (anum x2)) ->
      ia m g (to_pred (a_add v1 v2) (anum (x1+x2))).

4.6 Avoiding duplicates in states

The way s_to_a and a_upd are defined is not consistent: s_to_a maps every pair occurring in a state to an assertion fragment, while a_upd only modifies the first pair occurring in the state. For instance, when the abstract interpretation computes with intervals, s is ("x",[1,1])::("x",[1,1])::nil, and the instruction is x := x + 1, the resulting state is ("x",[2,2])::("x",[1,1])::nil and the resulting annotated instruction is { 1 ≤ x ≤ 1 ∧ 1 ≤ x ≤ 1 } x := x+1. The post-condition corresponding to the resulting state is 2 ≤ x ≤ 2 ∧ 1 ≤ x ≤ 1. It is contradictory and cannot be satisfied when executing from valuations satisfying the pre-condition, which is not contradictory.

To cope with this difficulty, we need to express that the abstract interpreter works correctly only with states that contain no duplicates.
We formalize this with a predicate consistent, which is defined as follows:

    Fixpoint mem (s:string)(l:list string): bool :=
      match l with
        nil => false
      | x::l => if string_dec x s then true else mem s l
      end.

    Fixpoint no_dups (s:state)(l:list string) :bool :=
      match s with
        nil => true
      | (s,_)::tl => if mem s l then false else no_dups tl (s::l)
      end.

    Definition consistent (s:state) := no_dups s nil = true.

The function no_dups actually returns true when the state s contains no duplicates and no element from the exclusion list l. We prove, by induction on the structure of s, that updating a state that satisfies no_dups for an exclusion list l, using a_upd for a variable x outside the exclusion list, returns a new state that still satisfies no_dups for l. The statement is as follows:

    Lemma no_dups_update : forall s l x v,
      mem x l = false -> no_dups s l = true ->
      no_dups (a_upd x v s) l = true.

The proof of this lemma is done by induction on s, making sure that the property that is established for every s is universally quantified over l: the induction hypothesis is actually used for a different value of the exclusion list.

The corollary from this lemma corresponding to the case where l is instantiated with the empty list expresses that a_upd preserves the consistent property.

    Lemma consistent_update : forall s x v,
      consistent s -> consistent (a_upd x v s).

4.7 Proving the correctness of this interpreter

When the interpreter runs on an instruction i and a state s and returns an annotated instruction i′ and a new state s′, the correctness of the run is expressed with three properties:

- The assertion s_to_a s is stronger than the pre-condition pc i′ (s_to_a s′),
- All the verification conditions in vc i′ (s_to_a s′) are valid,
- The annotated instruction i′ is an annotated version of the input i.

In the next few sections, we will prove that all runs of the abstract interpreter are correct.
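To make the role of the no_dups check from section 4.6 more concrete, here is a small self-contained sketch that can be evaluated in Coq. It is an illustration under assumptions: the functions are restated with an implicit type parameter A instead of the fixed type of abstract values, the bound variable names are slightly changed to avoid shadowing, and natural numbers stand in for abstract values.

```coq
Require Import String List.
Open Scope string_scope.

Fixpoint mem (s : string) (l : list string) : bool :=
  match l with
  | nil => false
  | x :: l' => if string_dec x s then true else mem s l'
  end.

(* l is the exclusion list of variables already seen. *)
Fixpoint no_dups {A : Type} (s : list (string * A)) (l : list string)
  : bool :=
  match s with
  | nil => true
  | (x, _) :: tl => if mem x l then false else no_dups tl (x :: l)
  end.

(* Update the first binding of x, or add one at the end. *)
Fixpoint a_upd {A : Type} (x : string) (v : A) (l : list (string * A))
  : list (string * A) :=
  match l with
  | nil => (x, v) :: nil
  | (y, v') :: tl =>
      if string_dec x y then (y, v) :: tl else (y, v') :: a_upd x v tl
  end.

(* A state with two bindings for "x" is rejected. *)
Example dup_rejected :
  no_dups (("x", 1) :: ("x", 1) :: nil) nil = false.
Proof. reflexivity. Qed.

(* Updating such a state leaves a stale second binding behind. *)
Example stale_binding :
  a_upd "x" 2 (("x", 1) :: ("x", 1) :: nil) = ("x", 2) :: ("x", 1) :: nil.
Proof. reflexivity. Qed.

(* On a duplicate-free state, the update preserves the property. *)
Example update_ok :
  no_dups (a_upd "y" 5 (("x", 1) :: ("y", 0) :: nil)) nil = true.
Proof. reflexivity. Qed.
```

The second example mirrors the interval scenario of section 4.6: only the first binding of x is updated, which is harmless exactly when there is no second binding.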
4.8 Soundness of abstract evaluation for expressions

When an expression e evaluates abstractly to an abstract value a and concretely to an integer z, z should satisfy the predicate associated to the value a. Of course, the evaluation of e can only be done using a valuation that takes care of providing values for all variables occurring in e. This valuation must be consistent with the abstract state that is used for the abstract evaluation leading to a. The fact that a valuation is consistent with an abstract state is simply expressed by saying that the interpretation of the corresponding assertion for this valuation has to hold. Thus, the soundness of abstract evaluation is expressed with a lemma that has the following shape:

    Lemma a_af_sound : forall s g e,
      ia m g (s_to_a s) ->
      ia m g (to_pred (a_af s e) (anum (af g e))).

This lemma is proved by induction on the expression e. The case where e is a number is a direct application of the hypothesis from_Z_sem; the case where e is an addition is a consequence of a_add_sem, combined with induction hypotheses. The case where e is a variable relies on another lemma:

    Lemma lookup_sem : forall s g,
      ia m g (s_to_a s) ->
      forall x, ia m g (to_pred (lookup s x) (anum (g x))).

This other lemma is proved by induction on s. In the base case, s is empty, lookup s x is top, and the hypothesis top_sem makes it possible to conclude; in the step case, if s is (y,v)::s' then the hypothesis ia m g (s_to_a s) reduces to

    ia m g (to_pred v (avar y)) /\ ia m g (s_to_a s')

We reason by cases on whether x is y or not. If x is equal to y, then the interpretation of to_pred v (avar y) is the same as that of to_pred v (anum (g x)) according to to_pred_sem, and lookup s x is the same as v by definition of lookup; this is enough to conclude this case. If x and y are different, we use the induction hypothesis on s'.
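The relation between abstract and concrete evaluation stated by a_af_sound can be observed on a small instance. The following sketch is self-contained and makes several assumptions: the abstract values are fixed to a three-valued parity type, the definitions of aexpr and af are reconstructed from their use in the text, oe_from_Z is written with Z.even rather than with a decidable equality on the remainder, and the names e0, s0 and g0 are hypothetical.

```coq
Require Import String ZArith List.
Open Scope Z_scope.

Inductive aexpr : Type :=
  avar (s : string) | anum (n : Z) | aplus (e1 e2 : aexpr).

Fixpoint af (g : string -> Z) (e : aexpr) : Z :=
  match e with
  | avar s => g s
  | anum n => n
  | aplus e1 e2 => af g e1 + af g e2
  end.

(* Parity abstract values; oe_top carries no information. *)
Inductive oe : Type := even | odd | oe_top.

Definition oe_from_Z (n : Z) : oe := if Z.even n then even else odd.

Definition oe_add (v1 v2 : oe) : oe :=
  match v1, v2 with
  | odd, odd => even
  | even, even => even
  | odd, even => odd
  | even, odd => odd
  | _, _ => oe_top
  end.

Definition state := list (string * oe).

Fixpoint lookup (s : state) (x : string) : oe :=
  match s with
  | nil => oe_top
  | (y, v) :: tl => if string_dec x y then v else lookup tl x
  end.

Fixpoint a_af (s : state) (e : aexpr) : oe :=
  match e with
  | avar x => lookup s x
  | anum n => oe_from_Z n
  | aplus e1 e2 => oe_add (a_af s e1) (a_af s e2)
  end.

(* The expression x + 3, a state where x is even, a valuation with x = 4. *)
Definition e0 := aplus (avar "x"%string) (anum 3).
Definition s0 : state := ("x"%string, even) :: nil.
Definition g0 (y : string) : Z := if string_dec y "x"%string then 4 else 0.

Example abstract_value : a_af s0 e0 = odd.
Proof. reflexivity. Qed.

Example concrete_value : af g0 e0 = 7.
Proof. reflexivity. Qed.

(* A variable without binding is treated as top. *)
Example unbound_is_top : lookup s0 "y"%string = oe_top.
Proof. reflexivity. Qed.
```

The valuation g0 is consistent with s0 because it maps x to an even number, and the concrete result 7 indeed belongs to the set represented by the abstract result odd.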
4.9 Soundness of update

In the weakest pre-condition calculus, assignments of the form x := e are taken care of by substituting all occurrences of the assigned variable x with the arithmetic expression e in the post-condition to obtain the weakest pre-condition. In the abstract interpreter, assignment is taken care of by updating the first instance of the variable in the state. There is a discrepancy between the two approaches, where the first approach acts on all instances of the variable and the second approach acts only on the first one. This discrepancy is resolved in the conditions of our experiment, where we work with abstract states that contain only one binding for each variable: in this case, updating the first binding of a variable is the same as updating all of its bindings. We express this with the following lemmas:

    Lemma subst_no_occur : forall s x l e,
      no_dups s (x::l) = true -> subst x e (s_to_a s) = (s_to_a s).

    Lemma subst_consistent : forall s g v x e,
      consistent s -> ia m g (s_to_a s) ->
      ia m g (to_pred v (anum (af g e))) ->
      ia m g (subst x e (s_to_a (a_upd x v s))).

Both lemmas are proved by induction on s and the second one uses the first in the case where the substituted variable x is the first variable occurring in s. This proof also relies on the hypothesis subst_to_pred.

4.10 Relating input abstract states and pre-conditions

For the correctness proof we consider runs starting from an instruction i and an initial abstract state s and obtaining an annotated instruction i' and a final abstract state s'. We are then concerned with the verification conditions and the pre-condition generated for the post-condition corresponding to s' and the annotated instruction i'. The pre-condition we obtain is either the assertion corresponding to s or the assertion a_true, when the first sub-instruction in i is a while loop. In all cases, the assertion corresponding to s is stronger than the pre-condition.
This is expressed with the following lemma, which is easily proved by induction on i.

    Lemma ab1_pc : forall i i' s s', ab1 i s = (i', s') ->
      forall g a, ia m g (s_to_a s) -> ia m g (pc i' a).

This lemma is actually stronger than needed: the post-condition used for computing the pre-condition does not matter, since the resulting annotated instruction is heavily annotated with assertions and the pre-condition always comes from one of the annotations.

4.11 Validity of generated conditions

The main correctness statement only concerns states that satisfy the consistent predicate, that is, states that contain at most one entry for each variable. The statement is proved by induction on instructions. As is often the case, what we prove by induction is a stronger statement; such a stronger statement also means stronger induction hypotheses. Here we add the information that the resulting state is also consistent.

Theorem 2. If s is a consistent state and running the abstract interpreter ab1 on i from s returns a new annotated instruction i′ and a final state s′, then all the verification conditions generated for i′ and the post-condition associated to s′ are valid. Moreover, the state s′ is consistent.

The Coq encoding of this theorem is as follows:

    Theorem ab1_correct : forall i i' s s',
      consistent s -> ab1 i s = (i', s') ->
      valid m (vc i' (s_to_a s')) /\ consistent s'.

This statement is proved by induction on i. Three cases arise, corresponding to the three instructions in the language.

1. When i is an assignment x := e, this is the base case. ab1 i s computes to

       (pre (s_to_a s) (a_assign x e), a_upd x (a_af s e) s)

   From the lemma a_af_sound we obtain that the concrete value of e in any valuation g that satisfies ia m g (s_to_a s) satisfies the following property:

       ia m g (to_pred (a_af s e) (anum (af g e)))

   The lemma subst_consistent can then be used to obtain the validity of the following condition.
       imp (s_to_a s) (subst x e (s_to_a (a_upd x (a_af s e) s)))

   This is the single verification condition generated for this instruction. The second part is taken care of by consistent_update.

2. When the instruction i is a sequence seq i1 i2, the abstract interpreter first processes i1 with the state s as input to obtain an annotated instruction a_i1 and an output state s1; it then processes i2 with s1 as input to obtain an annotated instruction a_i2 and a state s2. The state s2 is used as the output state for the whole instruction. We then need to verify that the conditions generated for a_seq a_i1 a_i2 using s_to_a s2 as post-condition are valid and that s2 satisfies the consistent property. The conditions can be split in two parts. The second part is vc a_i2 (s_to_a s2). The validity of these conditions is a direct consequence of the induction hypotheses. The first part is vc a_i1 (pc a_i2 (s_to_a s2)). This is not a direct consequence of the induction hypothesis, which only states the validity of vc a_i1 (s_to_a s1). However, the lemma ab1_pc applied on a_i2 states that s_to_a s1 is stronger than pc a_i2 (s_to_a s2) and the lemma vc_monotonic makes it possible to conclude. With respect to the consistent property, it is recursively transmitted from s to s1 and from s1 to s2.

3. When the instruction is a while loop, the body of the loop is recursively processed with the nil state, whose associated assertion is always satisfied. Thus, the verification conditions all have a_true as conclusion, which is trivially true. Also, the nil state trivially satisfies the consistent property.

4.12 The annotated instruction

We also need to prove that the produced annotated instruction really is an annotated version of the initial instruction.
To state this new lemma, we first define a simple function that forgets the annotations in an annotated instruction:

    Fixpoint cleanup (i: a_instr) : instr :=
      match i with
        pre a i => cleanup i
      | a_assign x e => assign x e
      | a_seq i1 i2 => seq (cleanup i1) (cleanup i2)
      | a_while b a i => while b (cleanup i)
      end.

We then prove a simple lemma about the abstract interpreter and this function.

    Theorem ab1_clean : forall i i' s s',
      ab1 i s = (i', s') -> cleanup i' = i.

The proof of this lemma is done by induction on the structure of i.

4.13 Instantiating the simple abstract interpreter

We can instantiate this simple abstract interpreter on a data-type of odd-even values, using the following inductive type and functions:

    Inductive oe : Type := even | odd | oe_top.

    Definition oe_from_Z (n:Z) : oe :=
      if Z_eq_dec (Zmod n 2) 0 then even else odd.

    Definition oe_add (v1 v2:oe) : oe :=
      match v1,v2 with
        odd, odd => even
      | even, even => even
      | odd, even => odd
      | even, odd => odd
      | _, _ => oe_top
      end.

The abstract values can then be mapped into assertions in the obvious way using a function oe_to_pred which we do not describe here for the sake of conciseness.

Running this simple interpreter on a small example, representing the program x := x + y; y := y + 1 for the state ("x", even)::("y", odd)::nil, is represented by the following dialog:

    Definition ab1oe := ab1 oe oe_from_Z oe_top oe_add oe_to_pred.

    Eval vm_compute in
      ab1oe (seq (assign "x" (aplus (avar "x") (avar "y")))
                 (assign "y" (aplus (avar "y") (anum 1))))
            (("x",even)::("y",odd)::nil).
         = (a_seq
              (pre
                 (a_conj (pred "even" (avar "x" :: nil))
                    (a_conj (pred "odd" (avar "y" :: nil)) a_true))
                 (a_assign "x" (aplus (avar "x") (avar "y"))))
              (pre
                 (a_conj (pred "odd" (avar "x" :: nil))
                    (a_conj (pred "odd" (avar "y" :: nil)) a_true))
                 (a_assign "y" (aplus (avar "y") (anum 1)))),
           ("x", odd) :: ("y", even) :: nil)
         : a_instr * state oe

5 A stronger interpreter

More precise results can be obtained for while loops.
For each loop we need to find a state whose interpretation as an assertion will be an acceptable invariant for the loop. We want this invariant to take into account any information that can be extracted from the boolean test in the loop: when entering inside the loop, we know that the test succeeded; when exiting the loop we know that the test failed. It turns out that this information can help us detect cases where the body of a loop is never executed and cases where a loop can never terminate. To describe non-termination, we change the type of values returned by the abstract interpreter: instead of returning an annotated instruction and a state, our new abstract interpreter returns an annotated instruction and an optional state: the optional value is None when we have detected that execution cannot terminate. This detection of guaranteed non-termination is conservative: when the analyser cannot guarantee that an instruction loops, it returns a state as usual. The presence of optional states will slightly complicate the structure of our static analysis.

We assume the existence of two new functions for this purpose.

- learn_from_success : state -> bexpr -> option state, this is used to encode the information learned when the test succeeded. For instance, if the environment initially contains an interval [0,10] for the variable x and the test is x < 6, then we can return the environment in which the value for x becomes [0,5]. Sometimes, the initial environment is such that the test can never be satisfied; in this case a value None is returned instead of an environment.
- learn_from_failure : state -> bexpr -> option state, this is used to compute information about a state knowing that a test failed.

The body of a while loop is often meant to be run several times. In abstract interpretation, this is also true.
At every run, the information about each variable at each location of the instruction needs to be updated to take into account more and more concrete values that may be reached at this location. In traditional approaches to abstract interpretation, a binary operation is applied at each location, to combine the information previously known at this location and the new values discovered in the current run. This is modeled by a binary operation.

- join : A -> A -> A, this function takes two abstract values and returns a new abstract value whose interpretation as a set is larger than the two inputs.

The theoretical description of abstract interpretation insists that the set A, together with the values join and top, should constitute an upper semi-lattice. In fact, we will use only part of the properties of such a structure in our proofs about the abstract interpreter.

When the functions learn_from_success and learn_from_failure return a None value, we actually detect that some code will never be executed. For instance, if learn_from_success returns None, we know that the test at the entry of a loop will never be satisfied and we can conclude that the body of the loop is not executed. In this condition, we can mark this loop body with a false assertion. We provide a function for this purpose:

    Fixpoint mark (i:instr) : a_instr :=
      match i with
        assign x e => pre a_false (a_assign x e)
      | seq i1 i2 => a_seq (mark i1) (mark i2)
      | while b i => a_while b a_false (mark i)
      end.

Because it marks almost every instruction, this function makes it easy to recognize at first glance the fragments of code that are dead code. A more lightweight approach could be to mark only the sub-instructions for which an annotation is mandatory: while loops.

5.1 Main structure of invariant search

In general, finding the most precise invariant for a while loop is an undecidable problem. Here we are describing a static analysis tool.
We will trade preciseness for guaranteed termination. The approach we will describe will be as follows:

1. Run the body of the loop abstractly a few times, progressively widening the sets of values for each variable at each run. If this process stabilizes, we have reached an invariant,
2. If no invariant was reached, try taking over-approximations of the values for some variables and run the loop again a few times. This process may also reach an invariant,
3. If no invariant was reached by progressive widening, pick an abstract state that is guaranteed to be an invariant (as we did for the first simple interpreter: take the top state that gives no information about any variable),
4. Invariants that were obtained by over-approximation can then be improved by a narrowing process: when run through the loop again, even if no information about the state is given at the beginning of the loop, we may still be able to gather some information at the end of executing the loop. The state that gathers the information at the end of the loop and the information before entering the loop is most likely to be an invariant, which is more precise (narrower) than the top state. Again this process may be run several times.

We shall now review the operations involved in each of these steps.

5.2 Joining states together

Abstract states are finite lists of pairs of variable names and abstract values. When a variable does not occur in a state, the associated abstract value is top. When joining two states together, every variable that does not occur in one of the two states should receive the top value, and every variable that occurs in both states should receive the join of the two values found in each state.
We describe this by writing a function that studies all the variables that occur in one of the lists: it is guaranteed to perform the right behavior for all the variables in both lists, it naturally associates the top value to the variables that do not occur in the first list (because no pair is added for these variables), and it naturally associates the top value to the variables that do not occur in the second list, because top is the value found in the second list and join preserves top.

  Fixpoint join_state (s1 s2:state) : state :=
    match s1 with
      nil => nil
    | (x,v)::tl => a_upd x (join v (lookup s2 x)) (join_state tl s2)
    end.

Because we sometimes detect that some instruction will not be executed, we occasionally have to consider situations where we are not given a state after executing a while loop. In this case, we have to combine together a state and the absence of a state. But because the absence of state corresponds to a false assertion, the other state is enough to describe the required invariant. We encode this in an auxiliary function.

  Definition join_state' (s: state)(s':option state) : state :=
    match s' with
      Some s' => join_state s s'
    | None => s
    end.

5.3 Running the body a few times

In our general description of the abstract interpretation of loops, we need to execute the body of loops in two different modes: one mode is a widening mode, the other is a narrowing mode. In the widening mode, the state obtained after executing the body of the loop needs to be joined with the initial state before executing the body of the loop, so that the result state is less precise than both the state before executing the body of the loop and the state after executing the body of the loop. In the narrowing mode, we start the execution with an environment that is guaranteed to be large enough, hoping to narrow this environment to a more precise value.
In this case, the join operation must not be done with the state that is used to start the execution, but with another state which describes the information known about variables before considering the loop. To accommodate these two modes of abstract execution, we use a function that takes two states as input: the first state is the one with which the result must be joined, the second state is the one with which execution must start. In this function, the argument ab is the function that describes the abstract interpretation of the instruction inside the loop, and the argument b is the test of the loop. The function ab returns an annotated instruction and an optional state. The optional state is None when the abstract interpreter can detect that the execution of the program from the input state will never terminate. When putting all elements together, the argument ab will be instantiated with the recursive call of the abstract interpreter on the loop body.

  Definition step1 (ab: state -> a_instr * option state)
    (b:bexpr) (init s:state) : state :=
    match learn_from_success s b with
      Some s1 => let (_, s2) := ab s1 in join_state' init s2
    | None => s
    end.

We then construct a function that repeats step1 a certain number of times. This number is denoted by a natural number n. In this function, the constant 0 is a natural number and we need to make this precise for Coq's parser, by expressing that the value must be interpreted in a parsing scope for natural numbers instead of integers, using the specifier %nat.

  Fixpoint step2 (ab: state -> a_instr * option state)
    (b:bexpr) (init s:state) (n:nat) : state :=
    match n with
      0%nat => s
    | S p => step2 ab b init (step1 ab b init s) p
    end.

The complexity of these functions can be improved: there is no need to compute all iterations if we can detect early that a fixed point was reached.
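As a hedged illustration of this remark, the following sketch (which is not part of the development described here) stops iterating as soon as a stability test succeeds. The names iterate_until_stable, step, and stable are hypothetical: step stands for a partial application such as step1 ab b init, and stable is assumed to return true when one more abstract run brings no change.

```coq
Section EarlyStop.
  (* All three section variables are assumptions made for this sketch. *)
  Variable state : Type.
  Variable step : state -> state.            (* one abstract run of test and body *)
  Variable stable : state -> state -> bool.  (* true when the run changed nothing *)

  (* Run at most n abstract steps, returning early once a step is stable. *)
  Fixpoint iterate_until_stable (n : nat) (s : state) : state :=
    match n with
    | O => s
    | S p =>
        let s' := step s in
        if stable s s' then s' else iterate_until_stable p s'
    end.
End EarlyStop.
```

Once the section is closed, iterate_until_stable takes state, step, and stable as explicit arguments. The recursion is still structural on n, so termination remains guaranteed.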
In this paper, we prefer to keep the code of the abstract interpreter simple but potentially inefficient to make our formal verification work easier.

5.4 Verifying that a state is more precise than another

To verify that we have reached an invariant, we need to check, for a state s, that running this state through step1 ab b s s returns a new state that is not less precise than s. For this, we assume that there exists a function that makes it possible to compare two abstract values: thinner : A -> A -> bool. This function returns true when the first abstract value gives more precise information than the second one. Using this basic function on abstract values, we define a new function on states:

  Fixpoint s_stable (s1 s2 : state) : bool :=
    match s1 with
      nil => true
    | (x,v)::tl => thinner (lookup s2 x) v && s_stable tl s2
    end.

This function traverses the first state to check that the abstract value associated to each variable is less precise than the information found in the second state. This function is then easily used to verify that a given state is an invariant through the abstract interpretation of a loop's test and body.

  Definition is_inv (ab:state-> a_instr * option state)
    (s:state)(b:bexpr): bool :=
    s_stable s (step1 ab b s s).

5.5 Narrowing a state

The step2 function receives two arguments of type state. The first argument is solely used for join operations, while the second argument is used to start a sequence of abstract states that correspond to iterated interpretations of the loop test and body. When the start state is not stable through interpretation, the resulting state is larger than both the first argument and the start argument. When the start state is stable through interpretation, there are cases where the resulting state is smaller than the start state.
For instance, consider the case where the abstract values are even and odd, the first state argument maps the variable y to even and the variable z to odd, the start state maps y and z to the top abstract value (the abstract value that gives no information), and the while loop is the following:

  while (x < 10) do x := x + 1; z := y + 1; y := 2 done

Then, after abstractly executing the loop test and body once, we obtain a state where y has the value even and z has the top abstract value. This state is more precise than the start state. After abstractly executing the loop test and body a second time, we obtain a state where z has the value odd and y has the value even. This state is more precise than the one obtained after only the first abstract run of the loop test and body.

The example above shows that over-approximations are improved by running the abstract interpreter again on them. This phenomenon is known as narrowing. It is worth forcing a narrowing phase after each phase that is likely to produce an over-approximation of the smallest fixed point of the abstract interpreter. This is used in the abstract interpreter that we describe below.

5.6 Allowing for over-approximations

In general, the finite amount of abstract computation performed in the step2 function is not enough to reach the smallest stable abstract state. This is related to the undecidability of the halting problem: it is often possible to write a program where a variable will receive a precise value exactly when some other program terminates. If we were able to compute the abstract value for this variable in a finite amount of time, we would be able to design a program that solves the halting problem.

Even if we are facing a program where finding the smallest state can be done in a finite amount of time, we may want to accelerate the process by taking over-approximations.
For instance, consider the following loop:

  while x < 10 do x := x + 1 done

If the abstract values we are working with are intervals and we start with the interval [0,0], after abstractly interpreting the loop test and body once, we obtain that the value for x should contain at least [0,1]; after abstractly interpreting 9 times, we obtain that the value for x should contain at least [0,9]. Until these 9 executions, we have not seen a stable state. At the 10th execution, we obtain that the value for x should contain at least [0,10] and the 11th execution shows that this value actually is stable.

At any time before a stable state is reached, we may choose to replace the current unstable state with a state that is larger. For instance, we may choose to replace [0,3] with [0,100]. When this happens, the abstract interpreter can discover that the resulting state after starting with the one that maps x to [0,100] actually maps x to [0,10]; thus [0,100] is stable and is a good candidate to enter a narrowing phase. This narrowing phase actually converges to a state that maps x to [0,10].

The choice of over-approximations is arbitrary and information may actually be lost in the process, because over-approximated states are less precise, but this is compensated by the fact that the abstract interpreter gives quicker answers. The termination of the abstract interpreter can even be guaranteed if we impose that a guaranteed over-approximation is taken after a finite number of steps. An example of a guaranteed over-approximation is a state that maps every variable to the top abstract value. In our Coq encoding, such a state is represented by the nil value.

The choice of over-approximation strategies varies from one abstract domain to the other.
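The interval trace discussed in this section can be replayed with a small stand-alone sketch. It does not use the generic interpreter: the type itv and the functions join_itv, run_once, and run are hypothetical names introduced only for this illustration, the domain is restricted to intervals with two finite bounds, and the loop while x < 10 do x := x + 1 done is hard-wired into run_once. The sketch assumes a Coq version whose ZArith library provides Z.min, Z.max, and the <=? notation.

```coq
Require Import ZArith.
Open Scope Z_scope.

(* Hypothetical simplified domain: intervals with two finite bounds. *)
Inductive itv : Type := Itv : Z -> Z -> itv.

(* Smallest interval containing both arguments. *)
Definition join_itv (a b : itv) : itv :=
  match a, b with
  | Itv l1 h1, Itv l2 h2 => Itv (Z.min l1 l2) (Z.max h1 h2)
  end.

(* One abstract run of test and body for "while x < 10 do x := x + 1":
   restrict x to values at most 9, add 1, then join with init. *)
Definition run_once (init cur : itv) : itv :=
  match cur with
  | Itv l h =>
      if 10 <=? l then cur   (* the test can never succeed *)
      else join_itv init (Itv (l + 1) (Z.min h 9 + 1))
  end.

(* Repeat run_once n times, in the style of step2. *)
Fixpoint run (n : nat) (init cur : itv) : itv :=
  match n with
  | O => cur
  | S p => run p init (run_once init cur)
  end.

Eval compute in run 9%nat (Itv 0 0) (Itv 0 0).   (* expected: Itv 0 9 *)
Eval compute in run 10%nat (Itv 0 0) (Itv 0 0).  (* expected: Itv 0 10 *)
Eval compute in run 11%nat (Itv 0 0) (Itv 0 0).  (* expected: Itv 0 10, stable *)
Eval compute in run_once (Itv 0 0) (Itv 0 100).  (* expected: Itv 0 10 *)
```

The last evaluation corresponds to the over-approximated start state [0,100]: a single abstract run already narrows it to [0,10].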
In our Coq encoding, we chose to let this over-approximation be represented by a function with the following signature:

  over_approx : nat -> state -> state -> state

When applied to n, s, and s', this function computes an over-approximation of s'. The value s is supposed to be a value that comes before s' in the abstract interpretation and can be used to choose the over-approximation cleverly, as it gives a sense of direction to the current evolution of successive abstract values. The number n should be used to fine-tune the coarseness of the over-approximation: the lower the value of n, the coarser the approximation. For instance, when considering the example above, knowing that s = [0,1] and s' = [0,2] are two successive unstable values reached by the abstract interpreter for the variable x can suggest choosing an over-approximation where the upper bound changes but the lower bound remains unchanged. In this case, we expect the function over_approx to return [0,+∞], for example.

5.7 The main invariant searching function

We can now describe the function that performs the process described in section 5.1. The code of this function is as follows:

  Fixpoint find_inv ab b init s i n : state :=
    let s' := step2 ab b init s (choose_1 s i) in
    if is_inv ab s' b then s' else
    match n with
      0%nat => nil
    | S p => find_inv ab b init (over_approx p s s') i p
    end.

The function choose_1 is provided at the same time as all other functions that are specific to the abstract domain A, such as join, a_add, etc. The argument function ab is supposed to be the function that performs the abstract interpretation of the loop inner instruction i (also called the loop body), and the boolean expression b is supposed to be the loop test.
The state init is supposed to be the initial input state at the first invocation of find_inv on this loop, s is supposed to be the current over-approximation of init, and n is the number of over-approximations that are still allowed before the function should switch to the nil state, which is a guaranteed over-approximation. This function systematically runs the abstract interpreter on the inner instruction an arbitrary number of times (given by the function choose_1) and then tests whether the resulting state is an invariant. Narrowing steps actually take place if the number of iterations given by choose_1 is large enough. If the result of the iterations is an invariant, then it is returned. When the result state is not an invariant, the function find_inv is called recursively with a larger approximation computed by over_approx. When the allowed number of recursive calls is exhausted, the nil value is returned.

5.8 Annotating the loop body with abstract information

The find_inv function only produces a state, while the abstract interpreter is also supposed to produce an annotated version of the instruction. Once we know the invariant, we can annotate the while loop with this invariant and obtain an annotated version of the loop body by re-running the abstract interpreter on this instruction. This is done with the following function:

  Definition do_annot (ab:state-> a_instr * option state)
    (b:bexpr) (s:state) (i:instr) : a_instr :=
    match learn_from_success s b with
      Some s' => let (ai, _) := ab s' in ai
    | None => mark i
    end.

In this function, ab is supposed to compute the abstract interpretation of the loop body. When the function learn_from_success returns a None value, this means that the loop body is never executed and it is marked as dead code by the function mark.

5.9 The abstract interpreter's main function

With the function find_inv, we can now design a new abstract interpreter.
Its main structure is about the same as for the naive one, but there are two important differences. First, the abstract interpreter now uses the find_inv function to compute an invariant state for the while loop. Second, this abstract interpreter can detect cases where instructions are guaranteed not to terminate. This is a second part of dead code detection: when a good invariant is detected for the while loop, a comparison between this invariant and the loop test may give the information that the loop test can never be falsified. If this is the case, no state is returned and the instructions following this while loop in sequences must be marked as dead code. This is handled by the fact that the abstract interpreter now returns an annotated instruction and an optional state. The case for the sequence is modified to make sure instructions are marked as dead code when receiving no input state.

  Fixpoint ab2 (i:instr)(s:state) : a_instr*option state :=
    match i with
      assign x e =>
        (pre (s_to_a s) (a_assign x e), Some (a_upd x (a_af s e) s))
    | seq i1 i2 =>
        let (a_i1, s1) := ab2 i1 s in
        match s1 with
          Some s1' =>
            let (a_i2, s2) := ab2 i2 s1' in (a_seq a_i1 a_i2, s2)
        | None => (a_seq a_i1 (mark i2), None)
        end
    | while b i =>
        let inv := find_inv (ab2 i) b s s i (choose_2 s i) in
        (a_while b (s_to_a inv) (do_annot (ab2 i) b inv i),
         learn_from_failure inv b)
    end.

This function relies on an extra numeric function choose_2 to decide the number of times find_inv will attempt progressive over-approximations before giving up and falling back on the nil state. Like choose_1 and over_approx, this function must be provided at the same time as the type for abstract values.

6 Proving the correctness of the abstract interpreter

To prove the correctness of our abstract interpreter, we adapt the correctness statements that we already used for the naive interpreter.
The main change is that the resulting state is optional, with a None value corresponding to non-termination. This means that when a None value is obtained we can take the post-condition as the false assertion. This is expressed with the following function, mapping an optional state to an assertion.

  Definition s_to_a' (s':option state) : assert :=
    match s' with Some s => s_to_a s | None => a_false end.

The main correctness statement thus becomes the following one:

  Theorem ab2_correct : forall i i' s s',
    consistent s -> ab2 i s = (i', s') ->
    valid m (vc i' (s_to_a' s')).

By comparison with the similar theorem for ab1, we removed the part about the final state satisfying the consistent property. This part is actually proved in a lemma beforehand. The reason why we chose to establish the two results at the same time for ab1 and in two stages for ab2 is anecdotal.

As for the naive interpreter, this theorem is paired with a lemma asserting that cleaning up the resulting annotated instruction i' yields back the initial instruction i. We actually need to prove two lemmas, one for the mark function (used to mark code as dead code) and one for ab2 itself.

  Lemma mark_clean : forall i, cleanup (mark i) = i.

  Theorem ab2_clean : forall i i' s s',
    ab2 i s = (i', s') -> cleanup i' = i.

These two lemmas are proved by induction on the structure of the instruction i.

6.1 Hypotheses about the auxiliary functions

The abstract interpreter relies on a collection of functions that are specific to the abstract domain being handled. In our Coq development, this is handled by defining the function inside a section, where the various components that are specific to the abstract domain of interpretation are given as section variables and hypotheses. When the section is closed, the various functions defined in the section are abstracted over the variables that they use. Thus, the function ab2 becomes a 16-argument function. The extra twelve arguments are as follows:

1. A : Type, the type containing the abstract values,
2. from_Z : Z -> A, a function mapping integer values to abstract values,
3. top : A, an abstract value representing lack of information,
4. a_add : A -> A -> A, an addition operation for abstract values,
5. to_pred : A -> aexpr -> assert, a function mapping abstract values to their interpretations as assertions on arithmetic expressions,
6. learn_from_success : state A -> bexpr -> option (state A), a function that is able to improve a state, knowing that a boolean expression's evaluation returns true, and that returns None when this evaluation cannot return true,
7. learn_from_failure : state A -> bexpr -> option (state A), similar to the previous one, but using the knowledge that the boolean expression's evaluation returns false,
8. join : A -> A -> A, a binary function on abstract values that returns an abstract value that is coarser than the two inputs,
9. thinner : A -> A -> bool, a comparison function that succeeds when the first argument is more precise than the second,
10. over_approx : nat -> state A -> state A -> state A, a function that implements heuristics to find over-approximations of its arguments,
11. choose_1 : state A -> instr -> nat, a function that returns the number of times a loop body should be executed with a given start state before testing for stabilisation,
12. choose_2 : state A -> instr -> nat, a function that returns the number of times over-approximations should be attempted before giving up and using the coarsest state.

Most of these functions must satisfy a collection of properties to ensure that the correctness statement will be provable. There are fourteen such properties, which can be sorted in the following way:

1. Three properties are concerned with the assertions created by to_pred, with respect to their logical interpretation and to substitution.
2. Two properties are concerned with the consistency of interpretation of abstract values obtained through from_Z and a_add as predicates over integers.
3. Two properties are concerned with the logical properties of abstract states computed with the help of learn_from_success and learn_from_failure.
4. Four properties are concerned with ensuring that over_approx, join, and thinner do return or detect over-approximations correctly.
5. Three properties are concerned with ensuring that the consistent property is preserved through learn_from... and over_approx.

6.2 Maintaining the consistent property

For this abstract interpreter, we need again to prove that it maintains the property that all states are duplication-free. It is first established for the join_state operation. Actually, since the join_state operation performs repetitive updates from the nil state, the result is duplication-free, regardless of the duplications in the inputs. This is easily obtained with a proof by induction on the first argument. For once, we show the full proof script.

  Lemma join_state_consistent :
    forall s1 s2, consistent (join_state s1 s2).
  intros s1 s2; induction s1 as [ | [x v] s1 IHs1]; simpl; auto.
  apply consistent_update; auto.
  Qed.

The first two lines of this Coq excerpt give the theorem statement. The line intros ... explains that a proof by induction should be done. This proof raises two cases, and the as ... fragment states that in the step case (the second case), one should consider a list whose tail is named s1 and whose first pair contains a variable x and an abstract value v, and we have an induction hypothesis, which should be named IHs1: this induction hypothesis states that s1 already satisfies the consistent property. The simpl directive expresses that the recursive function should be simplified if possible, and auto attempts to solve the goals that are generated. Actually, the computation of recursive functions leads to proving true = true in the base case and auto takes care of this. For the step case, we simply need to rely on the theorem consistent_update (see section 4.6).
The premise of this theorem actually is IHs1 and auto finds it.

6.3 Relating input abstract states and pre-conditions

Similarly to what was done for the naive abstract interpreter, we want to ensure that the interpretation of the input abstract state as a logical formula implies the pre-condition for the generated annotated instruction and the generated post-condition. For the while loop, this relies on the fact that the selected invariant is obtained after repetitive joins with the input state. We first establish two monotonicity properties for the join_state function; we show only the first one:

  Lemma join_state_safe_1 : forall g s1 s2,
    ia m g (s_to_a s1) -> ia m g (s_to_a (join_state s1 s2)).

Then, we only need to propagate the property up from the step1 function. Again, we show only the first lemma, but there are similar lemmas for step2 and find_inv, and we conclude with the property for ab2:

  Lemma step1_pc : forall g ab b s s',
    ia m g (s_to_a s) -> ia m g (s_to_a s') ->
    ia m g (s_to_a (step1 ab b s s')).

  Lemma ab2_pc : forall i i' s s', ab2 i s = (i', s') ->
    forall g a, ia m g (s_to_a s) -> ia m g (pc i' a).

The proof for step1_pc is a direct consequence of the definition and the properties of join_state. The proofs for step2 and find_inv are done by induction on n. The proof for ab2 is an easy induction on the instruction i. In particular, the two state arguments to the function find_inv are both equal to the input state in the case of while loops.

6.4 Validity of the generated conditions

The main theorem is about ensuring that all verification conditions are provable. A good half of this problem is already taken care of when we prove the theorem ab2_pc, which expresses that at each step the state is strong enough to ensure the validity of the pre-condition for the instruction that follows. The main added difficulty is to verify that the invariant computed for each while loop actually is invariant.
This difficulty is taken care of by the structure of the function find_inv, which actually invokes the function is_inv on its expected output before returning it. Thus, we only need to prove that is_inv correctly detects states that are invariants:

  Lemma is_inv_correct : forall ab b g s s' s2 ai,
    is_inv ab s b = true -> learn_from_success s b = Some s' ->
    ab s' = (ai, s2) -> ia m g (s_to_a' s2) -> ia m g (s_to_a s).

We can then deduce that find_inv is correct: the proof proceeds by showing that the value this function returns is either verified using is_inv or the nil state. The correctness statement for find_inv has the following form:

  Lemma find_inv_correct : forall ab b g i n init s s' s2 ai,
    learn_from_success (find_inv ab b init s i n) b = Some s' ->
    ab s' = (ai, s2) -> ia m g (s_to_a' s2) ->
    ia m g (s_to_a (find_inv ab b init s i n)).

This can then be combined with the assumptions that learn_from_success and learn_from_failure correctly improve the information given in abstract states to show that the value returned for while loops in ab2 is correct. These assumptions have the following form (the hypothesis for learn_from_failure has a negated third assumption).

  Hypothesis learn_from_success_sem : forall s b g,
    consistent s -> ia m g (s_to_a s) -> ia m g (a_b b) ->
    ia m g (s_to_a' (learn_from_success s b)).

7 An interval-based instantiation

The abstract interpreters we have described so far are generic and are ready to be instantiated on specific abstract domains. In this section we describe an instantiation on an abstract domain to represent intervals. This domain of intervals contains intervals with finite bounds and intervals with infinite bounds. The interval with two infinite bounds represents the whole type of integers. We describe these intervals with an inductive type that has four variants:

  Inductive interval : Type :=
    above : Z -> interval
  | below : Z -> interval
  | between : Z -> Z -> interval
  | all_Z : interval.
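To make the intended reading of the four variants concrete, here is a minimal stand-alone sketch that restates the type and interprets each interval as a predicate on integers. The function in_interval and the example ten_in_above are hypothetical helpers introduced for this illustration; in the development itself, this meaning is carried by the assertions built by i_to_pred.

```coq
Require Import ZArith.
Open Scope Z_scope.

Inductive interval : Type :=
  above : Z -> interval
| below : Z -> interval
| between : Z -> Z -> interval
| all_Z : interval.

(* Hypothetical helper: the set of integers denoted by an interval. *)
Definition in_interval (i : interval) (z : Z) : Prop :=
  match i with
  | above a => a <= z
  | below b => z <= b
  | between a b => a <= z /\ z <= b
  | all_Z => True
  end.

(* The bound itself belongs to the interval: both bounds are inclusive. *)
Example ten_in_above : in_interval (above 10) 10.
Proof. simpl. apply Z.le_refl. Qed.
```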
For instance, the interval containing all values larger than or equal to 10 is represented by above 10 and the whole type of integers is represented by all_Z.

The interval associated to an integer is simply described as the interval with two finite bounds equal to this integer.

  Definition i_from_Z (x:Z) := between x x.

When adding two intervals, it suffices to add the two bounds, because addition preserves the order on integers. Coping with all the variants of each possible input yields a function with many cases.

  Definition i_add (x y:interval) :=
    match x, y with
      above x, above y => above (x+y)
    | above x, between y z => above (x+y)
    | below x, below y => below (x+y)
    | below x, between y z => below (x+z)
    | between x y, above z => above (x+z)
    | between x y, below z => below (y+z)
    | between x y, between z t => between (x+z) (y+t)
    | _, _ => all_Z
    end.

The assertions associated to each abstract value can rely on a single predicate name, as we can re-use the same comparison predicate for almost all variants. This is described in the to_pred function.

  Definition i_to_pred (x:interval) (e:aexpr) : assert :=
    match x with
      above a => pred "leq" (anum a::e::nil)
    | below a => pred "leq" (e::anum a::nil)
    | between a b =>
        a_conj (pred "leq" (anum a::e::nil))
               (pred "leq" (e::anum b::nil))
    | all_Z => a_true
    end.

Of course, the meaning attached to the string "leq" must be correctly fixed in the corresponding instantiation for the m parameter:

  Definition i_m (s : string) (l: list Z) : Prop :=
    if string_dec s "leq" then
      match l with x::y::nil => x <= y | _ => False end
    else False.

7.1 Learning from comparisons

The functions i_learn_from_success and i_learn_from_failure used when processing while loops can be made arbitrarily complex. For the sake of conciseness, we have only designed a pair of functions that detect the case where the boolean test has the form x < e, where e is an arbitrary arithmetic expression.
In this case, the function i_learn_from_success updates only the value associated to x: the initial interval associated with x is intersected with the interval of all values that are less than the upper bound of the interval computed for e. An impossibility is detected when the lowest possible value for x is larger than or equal to the upper bound for e. Even this simple strategy yields a function with many cases, of which we show only the cases where both x and e have interval values with finite bounds:

  Definition i_learn_from_success s b :=
    match b with
      blt (avar x) e =>
        match a_af _ i_from_Z all_Z i_add s e,
              lookup _ all_Z s x with
          ...
        | between _ n, between m p =>
            if Z_le_dec n m then None
            else if Z_le_dec n p
                 then Some (a_upd _ x (between m (n-1)) s)
                 else Some s
          ...
        end
    | _ => Some s
    end.

In the code of this function, the functions a_af, lookup, and a_upd are parameterized by the functions from the datatype of intervals that they use: i_from_Z, all_Z and i_add for a_af, all_Z for lookup, etc.

The function i_learn_from_failure is designed similarly, looking at upper bounds for x and lower bounds for e instead.

7.2 Comparing and joining intervals

The treatment of loops also requires a function to find upper bounds of pairs of intervals and a function to compare two intervals. These functions are simply defined by pattern-matching on the kind of intervals that are encountered and then comparing the upper and lower bounds.

  Definition i_join (i1 i2:interval) : interval :=
    match i1, i2 with
      above x, above y => if Z_le_dec x y then above x else above y
      ...
    | between x y, between z t =>
        let lower := if Z_le_dec x z then x else z in
        let upper := if Z_le_dec y t then t else y in
        between lower upper
    | _, _ => all_Z
    end.

  Definition i_thinner (i1 i2:interval) : bool :=
    match i1, i2 with
      above x, above y => if Z_le_dec y x then true else false
    | above _, all_Z => true
      ...
    | between x _, above y => if Z_le_dec y x then true else false
    | between _ x, below y => if Z_le_dec x y then true else false
    | _, all_Z => true
      ...
    end.

7.3 Finding over-approximations

When the interval associated to a variable does not stabilize, an over-approximation must be found for this interval. We implement an approach where several steps of over-approximation can be taken one after the other. For intervals, finding over-approximations can be done by pushing one of the bounds of each interval to infinity. We use the fact that the generic abstract interpreter calls the over-approximation with two values to choose the bound that should be pushed to infinity: in a first round of over-approximation, only the bound that does not appear to be stable is modified. This strategy is particularly well adapted for loops where one variable is increased or decreased by a fixed amount at each execution of the loop's body.

The strategy is implemented in two functions: the first function over-approximates an interval, the second function applies the first to all the intervals found in a state.

  Definition open_interval (i1 i2:interval) : interval :=
    match i1, i2 with
      below x, below y => if Z_le_dec y x then i1 else all_Z
    | above x, above y => if Z_le_dec x y then i1 else all_Z
    | between x y, between z t =>
        if Z_le_dec x z then
          if Z_le_dec t y then i1 else above x
        else if Z_le_dec t y then below y else all_Z
    | _, _ => all_Z
    end.

  Definition open_intervals (s s':state interval) : state interval :=
    map (fun p:string*interval =>
           let (x, v) := p in
           (x, open_interval v (lookup _ all_Z s' x))) s.

The result of open_interval i1 i2 is expected to be an over-approximation of i1. The second argument i2 is only used to choose which of the bounds of i1 should be modified. The function i_over_approx receives a numeric parameter to indicate the strength of over-approximation that should be applied.
Here, there are only two strengths: at the first try (when the level is larger than 0), the function applies open_intervals; at the second try, it simply returns the nil state, which corresponds to the top value in the domain of abstract states.

  Definition i_over_approx n s s' :=
    match n with
      S _ => open_intervals s s'
    | _ => nil
    end.

The abstract interpreter also requires two functions that compute the number of attempts at each level of repetitive operation. We define these two functions as constant functions:

  Definition i_choose_1 (s:state interval) (i:instr) := 2%nat.
  Definition i_choose_2 (s:state interval) (i:instr) := 3%nat.

Once the type interval and the various functions are provided, we obtain an abstract interpreter for computing with intervals.

  Definition abi := ab2 interval i_from_Z all_Z i_add i_to_pred
    i_learn_from_success i_learn_from_failure
    i_join i_thinner i_over_approx i_choose_1 i_choose_2.

We can already run this instantiated interpreter inside the Coq system. For instance, we can run the interpreter on the instruction:

  while x < 10 do x := x + 1 done

This gives the following dialog (the answer of the Coq system starts at the = sign):

  Eval vm_compute in
    abi (while (blt (avar "x") (anum 10))
               (assign "x" (aplus (avar "x") (anum 1))))
        (("x", between 0 0)::nil).

  = (a_while (blt (avar "x") (anum 10))
       (a_conj
          (a_conj (pred "leq" (anum 0 :: avar "x" :: nil))
                  (pred "leq" (avar "x" :: anum 10 :: nil)))
          a_true)
       (pre
          (a_conj
             (a_conj (pred "leq" (anum 0 :: avar "x" :: nil))
                     (pred "leq" (avar "x" :: anum 9 :: nil)))
             a_true)
          (a_assign "x" (aplus (avar "x") (anum 1)))),
     Some (("x", between 10 10) :: nil))
  : a_instr * option (state interval)

8 Conclusion

This paper describes how the functional language present in a higher-order theorem prover can be used to encode a tool to perform a static analysis on an arbitrary programming language.
The example programming language is chosen to be extremely simple, so that the example can be described precisely in this tutorial paper. The static analysis tool that we described is inspired by the approach of abstract interpretation. However, this work is not a comprehensive introduction to abstract interpretation, nor does it cover all the aspects of encoding abstract interpretation inside a theorem prover. Better descriptions of abstract interpretation and its formal study are given in [11,5,12].

The experiment is performed with the Coq system. More extensive studies of programming languages using this system have been developed over the last few years. In particular, experiments by the Compcert team show that not only static analysis but also efficient compilation can be described and proved correct [4,10,6]. Coq is also used extensively for the study of functional programming languages, in particular to study the properties of type systems, and there are a few Coq-based solutions to the general landmark objective known as POPLMark [1].

The abstract interpreter we describe here is inefficient in many respects: when analysing the body of a loop, this loop needs to be executed abstractly several times, the annotations computed each time are forgotten, and then, when an invariant is discovered, the whole process needs to be done again to produce the annotated instruction. A more efficient interpreter could be designed where computed annotations are kept in memory long enough to avoid recomputation when the invariant is found. We did not design the abstract interpreter with this optimisation, thinking that the sources of inefficiency could be calculated away through systematic transformation of programs, as studied in another paper in this volume. The abstract interpreter provided with the paper [2] contains some of these optimisations.

An important remark is that program analyses can be much more efficient when they consider the relations between several variables at a time, as opposed to the experiment described here, where the variables are considered independently of each other. More precise work where relations between variables can be tracked is possible, on the condition that abstract values are used to describe complete states instead of single variables, as in [4], where the result of the analysis is used as a basis for a compiler optimisation known as common subexpression elimination.

We have concentrated on a very simple while language in this paper, for didactical purposes. However, abstract interpreters have been applied to much more complete programming languages. For instance, the Astrée [8] analyser covers most of the C programming language. On the other hand, the foundational papers describe abstract interpretation in terms of analyses on control flow graphs. The idea of abstract interpretation is general enough that it should be possible to apply it to any form of programming language.

References

1. B. Aydemir, A. Bohannon, M. Fairbairn, J. Foster, B. Pierce, P. Sewell, D. Vytiniotis, G. Washburn, S. Weirich, and S. Zdancewic. Mechanized metatheory for the masses: The POPLmark challenge. In Proceedings of the Eighteenth International Conference on Theorem Proving in Higher Order Logics (TPHOLs 2005), 2005.
2. Yves Bertot. Theorem proving support in programming language semantics. Technical Report 6242, INRIA, 2007. To appear in a book in memory of Gilles Kahn.
3. Yves Bertot and Pierre Castéran. Interactive Theorem Proving and Program Development, Coq'Art: the Calculus of Inductive Constructions. Springer-Verlag, 2004.
4. Yves Bertot, Benjamin Grégoire, and Xavier Leroy. A structured approach to proving compiler optimizations based on dataflow analysis.
In Types for Proofs and Programs, Workshop TYPES 2004, volume 3839 of Lecture Notes in Computer Science, pages 66-81. Springer, 2006.
5. Frédéric Besson, Thomas Jensen, and David Pichardie. Proof-carrying code from certified abstract interpretation to fixpoint compression. Theoretical Computer Science, 364(3):273-291, 2006.
6. Sandrine Blazy, Zaynah Dargaye, and Xavier Leroy. Formal verification of a C compiler front-end. In FM 2006: Int. Symp. on Formal Methods, volume 4085 of Lecture Notes in Computer Science, pages 460-475. Springer, 2006.
7. Patrick Cousot and Radhia Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, POPL'77, pages 238-252. ACM Press, 1977.
8. Patrick Cousot, Radhia Cousot, Jérôme Feret, Antoine Miné, Laurent Mauborgne, David Monniaux, and Xavier Rival. The Astrée analyzer. In European Symposium on Programming, ESOP'XIV, volume 3444 of LNCS, pages 21-30. Springer, 2005.
9. Edsger W. Dijkstra. A Discipline of Programming. Prentice Hall, 1976.
10. Xavier Leroy. Formal certification of a compiler back-end, or: programming a compiler with a proof assistant. In 33rd symposium Principles of Programming Languages, pages 42-54. ACM Press, 2006.
11. David Pichardie. Interprétation abstraite en logique intuitionniste : extraction d'analyseurs Java certifiés. PhD thesis, Université Rennes 1, 2005. In French.
12. David Pichardie. Building certified static analysers by modular construction of well-founded lattices. In Proc. of the 1st International Conference on Foundations of Informatics, Computing and Software (FICS'08), Electronic Notes in Theoretical Computer Science, 2008.
13. The Coq development team. The Coq proof assistant, 2008. http://coq.inria.fr.