CPBPV: A Constraint-Programming Framework for Bounded Program Verification

Hélène Collavizza¹, Michel Rueher¹, Pascal Van Hentenryck²

¹ Université de Nice–Sophia Antipolis, France ({helen,rueher}@polytech.unice.fr)
² Brown University, Box 1910, Providence, RI 02912 (pvh@cs.brown.edu)

Abstract. This paper studies how to verify the conformity of a program with its specification and proposes a novel constraint-programming framework for bounded program verification (CPBPV). The CPBPV framework uses constraint stores to represent the specification and the program and explores execution paths nondeterministically. The input program is partially correct if each constraint store so produced implies the post-condition. CPBPV does not explore spurious execution paths, as it incrementally prunes execution paths early by detecting that the constraint store is not consistent. CPBPV uses the rich language of constraint programming to express the constraint store. Finally, CPBPV is parametrized with a list of solvers which are tried in sequence, starting with the least expensive and least general. Experimental results often show orders of magnitude improvements over earlier approaches, with running times that are often independent of the variable domains. Moreover, CPBPV was able to detect subtle errors in some programs where other frameworks based on model checking failed.

1 Introduction

This paper is concerned with software correctness, a critical issue in software engineering. It proposes a novel constraint-programming framework for bounded program verification (CPBPV), i.e., when the program inputs (e.g., the array lengths and the variable values) are bounded. The goal is to verify the conformity of a program with its specification, that is, to demonstrate that the specification is a consequence of the program.
The key idea of CPBPV is to use constraint stores to represent the specification and the program, and to non-deterministically explore execution paths over these constraint stores. This non-deterministic constraint-based symbolic execution incrementally refines the constraint store, which initially consists of the precondition. Non-determinism occurs when executing conditional or iterative instructions, and the non-deterministic execution refines the constraint store by adding constraints coming from conditions and from assignments. The input program is partially correct if each constraint store produced by the symbolic execution implies the post-condition. It is important to emphasize that CPBPV considers programs with complete specifications and that verifying the conformity between a program and its specification requires checking (explicitly or implicitly) all executable paths. This is not the case in model-checking tools designed to detect violations of some specific property, e.g., safety or liveness properties.

The CPBPV framework has a number of fundamental benefits. First, contrary to earlier work using constraint programming or SMT [2,11,12], CPBPV does not use predicate abstraction or explore spurious execution paths, i.e., paths that do not correspond to actual executions over inputs satisfying the pre-condition. CPBPV incrementally prunes execution paths early by detecting that the constraint store is not consistent. Second, CPBPV uses the rich language of constraint programming to express the constraint store, including arbitrary logical and threshold combinations of constraints, the element constraint, and global/combinatorial constraints that express complex relationships on a set of variables.
Finally, CPBPV is parametrized with a list of solvers which are tried in sequence, starting with the least expensive and least general.

The CPBPV framework was evaluated experimentally on a series of benchmarks from program verification. Experimental results of our (slow) prototype often show orders of magnitude improvements over earlier approaches, and indicate that the running times are often independent of the variable domains. Moreover, CPBPV was able to find subtle errors in some programs that some other verification frameworks based on model checking could not detect.

The rest of the paper is organized as follows. Section 2 illustrates how CPBPV handles constraint stores on a motivating example. Section 3 formalizes the CPBPV framework for a small programming language and Section 4 discusses the implementation issues. Section 5 presents experimental results on a number of verification problems, comparing our approach with state-of-the-art model-checking based verification frameworks. Section 6 discusses related work in test generation, bounded program verification and software model checking. Section 7 summarizes the contributions and presents future research directions.

2 The Constraint-Programming Framework at Work

This section illustrates the CPBPV verifier on a motivating example, the binary search program. CPBPV uses Java programs and JML specifications for the pre- and post-conditions, appropriately enhanced to support the expressivity of constraint programming. Figure 1 depicts a binary search program to determine if a value v is present in a sorted array t. (Note that \result in JML corresponds to the value returned by the program.) To verify this program, our prototype implementation requires a bound on the length of array t, on its elements, and on v.
We will verify its correctness for specific lengths and simply assume that the values are signed integers on a number of bits. The initial constraint store of the CPBPV verifier, assuming an input array of length 8, is the precondition³

$$c_{pre} \equiv \forall\, 0 \le i < 7 : t_0[i] \le t_0[i+1]$$

where $t_0$ is an array of constraint variables capturing the input. The constraint variables are annotated with a version number, as CPBPV performs an SSA-like renaming on the fly, since each assignment generates constraints possibly linking the old and the new values of the assigned variable.

³ We omit the domain constraints on the variables for simplicity.

    /*@ requires (\forall int i; i >= 0 && i < t.length-1; t[i] <= t[i+1])
      @ ensures
      @   (\result != -1 ==> t[\result] == v) &&
      @   (\result == -1 ==> \forall int k; 0 <= k < t.length; t[k] != v)
      @*/
     1  static int binary_search(int[] t, int v) {
     2    int l = 0;
     3    int u = t.length-1;
     4    while (l <= u) {
     5      int m = (l + u) / 2;
     6      if (t[m]==v)
     7        return m;
     8      if (t[m] > v)
     9        u = m - 1;
    10      else
    11        l = m + 1; }   // ERROR else u = m - 1;
    12    return -1; }

    Fig. 1. The Binary Search Program [10]

The assignments in lines 2–3 add the constraints $l_0 = 0 \wedge u_0 = 7$. CPBPV then considers the loop instruction. Since $l_0 \le u_0$, it enters the loop body, adds the constraint $m_0 = (l_0 + u_0)/2$, which simplifies to $m_0 = 3$, and considers the conditional statement on line 6. The execution of the statement is nondeterministic: both $t_0[3] = v_0$ and $t_0[3] \neq v_0$ are consistent with the constraint store, so the two alternatives, which give rise to two execution paths, must be explored. Note that these two alternatives correspond to actual execution paths in which t[3] in the input is equal to, or different from, input v. The first alternative adds the constraint $t_0[3] = v_0$ to the store and executes line 7, which adds the constraint $result = m_0$.
CPBPV has thus obtained an execution path p whose final constraint store $c_p$ is:

$$c_p \equiv c_{pre} \wedge l_0 = 0 \wedge u_0 = 7 \wedge m_0 = (l_0 + u_0)/2 \wedge t_0[m_0] = v_0 \wedge result = m_0$$

CPBPV then checks whether this store $c_p$ implies the post-condition $c_{post}$ by searching for a solution to $c_p \wedge \neg c_{post}$. This search finds no solution, indicating that the computation path p, which captures the set of actual executions in which t[3] = v, satisfies the specification. CPBPV then explores the other alternative to the conditional statement in line 6. It adds the constraint $t_0[m_0] \neq v_0$ and executes the conditional statement in line 8. Once again, this statement is nondeterministic. Its first alternative assumes that the test holds, generating the constraint $t_0[m_0] > v_0$ and executing the instruction in line 9. Since u is (re-)assigned, CPBPV creates a new variable $u_1$ and posts the constraint $u_1 = m_0 - 1 = 2$. The execution returns to line 4, where the test now reads $l_0 \le u_1$, since CPBPV always uses the most recent version of each variable. Since the constraint store entails $l_0 \le u_1$, the only extension to the current path consists of executing line 5, adding the constraint $m_1 = (l_0 + u_1)/2$, which simplifies to $m_1 = 1$. Another complete execution path is then obtained by executing lines 6 and 7.

Consider now a version of the program in which line 11 is replaced by u = m-1. To illustrate the CPBPV verifier, we specify partial execution paths by indicating which alternative is selected for each nondeterministic instruction.
For instance, $\langle T_4, F_6, T_8, T_4, T_6 \rangle$ denotes the last execution path discussed above, in which the true alternative is selected for the first execution of the instruction in line 4, the false alternative for the first execution of instruction 6, the true alternative for the first execution of instruction 8, the true alternative for the second execution of instruction 4, and the true alternative for the second execution of instruction 6. Consider the partial path $\langle T_4, F_6, F_8 \rangle$ and let us study how it can be extended. The partial path $\langle T_4, F_6, F_8, T_4, T_6 \rangle$ is not explored, since it produces a constraint store containing

$$c_{pre} \wedge t_0[3] \neq v_0 \wedge t_0[3] \le v_0 \wedge t_0[1] = v_0$$

which is clearly inconsistent: the array is sorted, so $t_0[1] \le t_0[3]$, whereas the other constraints force $t_0[3] < v_0 = t_0[1]$. Similarly, the path $\langle T_4, F_6, F_8, T_4, F_6, T_8 \rangle$ cannot be extended. Running CPBPV on this incorrect program with an array of length 8 (with integers coded on 8 bits to make it readable) produces, in 0.025 seconds, the counterexample:

$$v_0 = -126 \wedge t_0 = [-128, -127, -126, -125, -124, -123, -122, -121] \wedge result = -1.$$

This example highlights a few interesting benefits of CPBPV.

1. The verifier only considers paths that correspond to collections of actual inputs (abstracted by constraint stores). The resulting execution paths must all be explored since our goal is to prove the partial correctness of the program.
2. The performance of the verifier is independent of the integer representation on this application: it only requires a bound on the length of the array.
3. The verifier returns a counterexample for debugging the program.

Note that CBMC and ESC/Java2, two state-of-the-art model checkers, fail to verify this example, as discussed in Section 5.

3 Formalization of the Framework

This section formalizes the CPBPV verifier on a small abstract language using a small-step SOS semantics.
The semantics primarily specifies the execution paths over constraint stores explored by the verifier. It features assert and enforce constructs which are necessary for modular composition.

Syntax. Figure 2 depicts the syntax of the programs and the constraints generated by the verifier. In the following, we use s, possibly subscripted, to denote elements of a syntactic entity S.

    L : list of instructions;  I : instructions;  B : Boolean expressions
    E : integer expressions;   A : arrays;        V : variables

    L ::= I ; L | ε
    I ::= A[E] ← E | V ← E | if B I | while B I | assert(B) | enforce(B) | return E | {L}
    B ::= true | false | E > E | E ≥ E | E = E | E ≠ E | E ≤ E | E < E
    B ::= ¬B | B ∧ B | B ∨ B | B ⇒ B
    E ::= V | A[E] | E + E | E − E | E × E | E / E |

    C  : constraints
    E+ : solver expressions
    V+ = { v_i | v ∈ V & i ∈ N } : solver variables
    A+ = { a_i | a ∈ A & i ∈ N } : solver arrays

    C  ::= true | false | E+ > E+ | E+ ≥ E+ | E+ = E+ | E+ ≠ E+ | E+ ≤ E+ | E+ < E+
    C  ::= ¬C | C ∧ C | C ∨ C | C ⇒ C
    E+ ::= V | A[E+] | E+ + E+ | E+ − E+ | E+ × E+ | E+ / E+ |

    Fig. 2. The Syntax of Programs and Constraints

Renamings. CPBPV creates variables and arrays of variables "on-the-fly" when they are needed. This process resembles an SSA normalization but does not introduce the join nodes, since the results of different execution paths are not merged. Similar renamings are used in model checking. The renaming uses mappings of type $V \cup A \to \mathbb{N}$ which map variables and arrays to natural numbers denoting their current "version numbers". In the semantics, the version number is incremented each time a variable or an array element is assigned.
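The version numbering can be pictured with a small Java sketch. It is only an illustration under stated assumptions, not the CPBPV implementation: the class `VersionMap` and its methods `rename` and `assign` are hypothetical names, expressions are passed as pre-split tokens to avoid a parser, and a variable's first assignment receives version 0, following the walk-through of Section 2 (where the first constraints are l0 = 0 and u0 = 7).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy SSA-like version numbering (hypothetical helper, not the CPBPV code). */
class VersionMap {
    // Current version number of each program variable seen so far.
    private final Map<String, Integer> version = new HashMap<>();
    // Constraints generated so far, kept as plain strings.
    final List<String> store = new ArrayList<>();

    /** Replaces every known variable token by its current versioned name. */
    String rename(String... tokens) {
        StringBuilder sb = new StringBuilder();
        for (String tok : tokens) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(version.containsKey(tok) ? tok + version.get(tok) : tok);
        }
        return sb.toString();
    }

    /** Handles "var = rhs": renames rhs first, then creates a fresh version of var. */
    String assign(String var, String... rhsTokens) {
        String rhs = rename(rhsTokens);               // uses versions before the assignment
        Integer old = version.get(var);
        version.put(var, old == null ? 0 : old + 1);  // assumption: first assignment is version 0
        String constraint = var + version.get(var) + " = " + rhs;
        store.add(constraint);
        return constraint;
    }

    public static void main(String[] args) {
        VersionMap vm = new VersionMap();
        vm.assign("l", "0");                                // line 2 of Figure 1
        vm.assign("u", "7");                                // line 3, array length 8
        vm.assign("m", "(", "l", "+", "u", ")", "/", "2");  // line 5
        vm.assign("u", "m", "-", "1");                      // line 9
        vm.assign("m", "(", "l", "+", "u", ")", "/", "2");  // line 5 again
        System.out.println(String.join(" AND ", vm.store));
    }
}
```

Replaying lines 2, 3, 5, 9 and 5 of Figure 1 in this way yields the constraints l0 = 0, u0 = 7, m0 = ( l0 + u0 ) / 2, u1 = m0 - 1 and m1 = ( l0 + u1 ) / 2, which are the ones discussed in Section 2.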
We use $\sigma_\perp$ to denote the uniform mapping to zero (i.e., $\forall x \in V \cup A : \sigma_\perp(x) = 0$) and $\sigma[x/i]$ the mapping $\sigma$ where $x$ now maps to $i$, i.e.,

$$\sigma[x/i](y) = \text{if } x = y \text{ then } i \text{ else } \sigma(y).$$

These mappings are used by a polymorphic renaming function $\rho$ to transform program expressions into constraints. For example, $\rho\,\sigma\,(b_1 \oplus b_2) = (\rho\,\sigma\,b_1) \oplus (\rho\,\sigma\,b_2)$ (where $\oplus \in \{\wedge, \vee, \Rightarrow\}$) is the rule used to transform a logical expression.

Configurations. The CPBPV semantics mostly uses configurations of the type $\langle l, \sigma, c \rangle$, where $l$ is the list of instructions to execute, $\sigma$ is a version mapping, and $c$ is the set of constraints generated so far. It also uses configurations of the form $\langle \top, \sigma, c \rangle$ to denote final states and configurations of the form $\langle \bot, \sigma, c \rangle$ to denote the violation of an assertion. The semantics is specified by rules of the form

$$\frac{\text{conditions}}{\gamma_1 \longmapsto \gamma_2}$$

stating that configuration $\gamma_1$ can be rewritten into $\gamma_2$ when the conditions hold.

Conditional Instructions. The conditional instruction if b i considers two cases. If the constraint $c_b$ associated with $b$ is consistent with the constraint store, then the store is augmented with $c_b$ and the body is executed. If the negation $\neg c_b$ is consistent with the store, then the constraint store is augmented with $\neg c_b$. Both rules may apply, since the store may represent some memory states satisfying the condition and some violating it.

$$\frac{c \wedge (\rho\,\sigma\,b) \text{ is satisfiable}}{\langle \mathbf{if}\ b\ i \,;\, l,\ \sigma,\ c \rangle \longmapsto \langle i \,;\, l,\ \sigma,\ c \wedge (\rho\,\sigma\,b) \rangle}$$

$$\frac{c \wedge \neg(\rho\,\sigma\,b) \text{ is satisfiable}}{\langle \mathbf{if}\ b\ i \,;\, l,\ \sigma,\ c \rangle \longmapsto \langle l,\ \sigma,\ c \wedge \neg(\rho\,\sigma\,b) \rangle}$$

Iterative Instructions. The while instruction while b i also considers two cases. If the constraint $c_b$ associated with $b$ is consistent with the constraint store, then the constraint store is augmented with $c_b$, the body is executed, and the while instruction is reconsidered.
If the negation $\neg c_b$ is consistent with the constraint store, then the constraint store is augmented with $\neg c_b$.

$$\frac{c \wedge (\rho\,\sigma\,b) \text{ is satisfiable}}{\langle \mathbf{while}\ b\ i \,;\, l,\ \sigma,\ c \rangle \longmapsto \langle i \,;\, \mathbf{while}\ b\ i \,;\, l,\ \sigma,\ c \wedge (\rho\,\sigma\,b) \rangle}$$

$$\frac{c \wedge \neg(\rho\,\sigma\,b) \text{ is satisfiable}}{\langle \mathbf{while}\ b\ i \,;\, l,\ \sigma,\ c \rangle \longmapsto \langle l,\ \sigma,\ c \wedge \neg(\rho\,\sigma\,b) \rangle}$$

Scalar Assignments. Scalar assignments create a new constraint variable for the program variable to be assigned and add a constraint specifying that the variable is equal to the right-hand side. A new renaming mapping is produced.

$$\frac{\sigma_2 = \sigma_1[v/\sigma_1(v)+1] \quad \& \quad c_2 \equiv (\rho\,\sigma_2\,v) = (\rho\,\sigma_1\,e)}{\langle v \leftarrow e \,;\, l,\ \sigma_1,\ c_1 \rangle \longmapsto \langle l,\ \sigma_2,\ c_1 \wedge c_2 \rangle}$$

Assignments of Array Elements. The assignment of an array element creates a new constraint array, adds a constraint for the element being indexed, and posts constraints specifying that all the new constraint variables in the array are equal to their earlier version, except for the element being indexed. Note that the index is an expression which may contain variables as well, giving rise to the well-known element constraint in constraint programming [25].

$$\frac{\begin{array}{c} \sigma_2 = \sigma_1[a/\sigma_1(a)+1] \\ c_2 \equiv (\rho\,\sigma_2\,a)[\rho\,\sigma_1\,e_1] = (\rho\,\sigma_1\,e_2) \\ c_3 \equiv \forall i \in 0..a.length : (\rho\,\sigma_1\,e_1) \neq i \Rightarrow (\rho\,\sigma_2\,a)[i] = (\rho\,\sigma_1\,a)[i] \end{array}}{\langle a[e_1] \leftarrow e_2 \,;\, l,\ \sigma_1,\ c_1 \rangle \longmapsto \langle l,\ \sigma_2,\ c_1 \wedge c_2 \wedge c_3 \rangle}$$

Assert Statements. An assert statement checks whether the assertion is implied by the constraint store, in which case it proceeds normally. Otherwise, it terminates the execution with an error.

$$\frac{c \Rightarrow (\rho\,\sigma\,b)}{\langle \mathbf{assert}\ b \,;\, l,\ \sigma,\ c \rangle \longmapsto \langle l,\ \sigma,\ c \rangle}$$

$$\frac{c \wedge \neg(\rho\,\sigma\,b) \text{ is satisfiable}}{\langle \mathbf{assert}\ b \,;\, l,\ \sigma,\ c \rangle \longmapsto \langle \bot,\ \sigma,\ c \rangle}$$

Enforce Statements. An enforce statement adds a constraint to the constraint store if it is satisfiable.

$$\frac{c \wedge (\rho\,\sigma\,b) \text{ is satisfiable}}{\langle \mathbf{enforce}\ b \,;\, l,\ \sigma,\ c \rangle \longmapsto \langle l,\ \sigma,\ c \wedge (\rho\,\sigma\,b) \rangle}$$

Block Statements. Block statements simply remove the braces.
$$\langle \{l_1\} \,;\, l_2,\ \sigma,\ c \rangle \longmapsto \langle l_1 : l_2,\ \sigma,\ c \rangle$$

Return Statements. A return statement simply constrains the result variable.

$$\frac{c_2 \equiv (\rho\,\sigma_1\,result) = (\rho\,\sigma_1\,e)}{\langle \mathbf{return}\ e \,;\, l,\ \sigma_1,\ c_1 \rangle \longmapsto \langle \top,\ \sigma_1,\ c_1 \wedge c_2 \rangle}$$

Termination. Termination also occurs when no instruction remains.

$$\langle \epsilon,\ \sigma,\ c \rangle \longmapsto \langle \top,\ \sigma,\ c \rangle$$

The CPBPV Semantics. Let P be a program $b_{pre}\ l\ b_{post}$ in which $b_{pre}$ denotes the precondition, $l$ is a list of instructions, and $b_{post}$ the post-condition. Let $\overset{*}{\longmapsto}$ be the transitive closure of $\longmapsto$. The final states are specified by the set

$$SFN(b_{pre}, P) = \{\, \langle f, \sigma, c \rangle \mid \langle l, \sigma_\perp, \rho\,\sigma_\perp\,b_{pre} \rangle \overset{*}{\longmapsto} \langle f, \sigma, c \rangle \wedge f \in \{\bot, \top\} \,\}$$

The program violates an assertion if the set

$$SFE(b_{pre}, P, b_{post}) = \{\, \langle \bot, \sigma, c \rangle \in SFN(b_{pre}, P) \,\}$$

is not empty. It violates its specification if the set

$$SFE(b_{pre}, P, b_{post}) = \{\, \langle \top, \sigma, c \rangle \in SFN(b_{pre}, P) \mid c \wedge (\rho\,\sigma\,\neg b_{post}) \text{ satisfiable} \,\}$$

is not empty. It is partially correct otherwise.

4 Implementation issues

The CPBPV framework is parametrized by a list of solvers $(S_1, \ldots, S_k)$ which are tried in sequence, starting with the least expensive and least general. When checking satisfiability, the verifier never tries solvers $S_{i+1}, \ldots, S_k$ if solver $S_i$ is a decision procedure for the constraint store. If solver $S_i$ is not a decision procedure, it uses an abstraction $\alpha$ of the constraint store $c$ satisfying $c \Rightarrow \alpha$ and can still detect failed execution paths quickly. The last solver in the sequence is a constraint-programming solver (CP solver) over finite domains which iterates pruning and searching to find solutions or prove infeasibility. When the CP solver makes a choice, the earlier solvers in the sequence are called once again to prune the search space or to find solutions if they have become decision procedures.
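The solver sequence described above can be sketched in Java as follows. This is a toy under stated assumptions rather than the prototype's code: the `Solver` interface, the `Store` class (an interval for one variable, optionally with a parity requirement) and the two example solvers are hypothetical stand-ins, and none of this corresponds to the CPLEX or JSolver APIs. The sketch only captures the dispatch logic: an infeasible abstraction proves the store infeasible, and a solver that is a decision procedure for the store ends the sequence.

```java
import java.util.Arrays;
import java.util.List;

class SolverCascade {
    enum Verdict { SAT, UNSAT }

    /** Toy constraint store: lo <= x <= hi, optionally with "x is even". */
    static final class Store {
        final int lo, hi;
        final boolean even;
        Store(int lo, int hi, boolean even) { this.lo = lo; this.hi = hi; this.even = even; }
    }

    /** Hypothetical solver interface (assumption, not a real library API). */
    interface Solver {
        boolean decides(Store c);  // is this solver a decision procedure for c?
        Verdict check(Store c);    // verdict on c, or on an abstraction alpha with c => alpha
    }

    /** Cheap solver: ignores the parity constraint, i.e. reasons on an abstraction. */
    static final Solver RELAXATION = new Solver() {
        public boolean decides(Store c) { return !c.even; }
        public Verdict check(Store c) { return c.lo <= c.hi ? Verdict.SAT : Verdict.UNSAT; }
    };

    /** Complete finite-domain solver: enumerates the interval. */
    static final Solver ENUMERATION = new Solver() {
        public boolean decides(Store c) { return true; }
        public Verdict check(Store c) {
            for (long x = c.lo; x <= c.hi; x++) {
                if (!c.even || x % 2 == 0) return Verdict.SAT;
            }
            return Verdict.UNSAT;
        }
    };

    /** Tries the solvers in order, from least to most expensive. */
    static Verdict isSatisfiable(List<Solver> solvers, Store c) {
        for (Solver s : solvers) {
            Verdict v = s.check(c);
            if (v == Verdict.UNSAT) return v;  // alpha infeasible implies c infeasible
            if (s.decides(c)) return v;        // exact answer: later solvers are not tried
        }
        throw new IllegalStateException("the last solver must be a decision procedure");
    }

    public static void main(String[] args) {
        List<Solver> cascade = Arrays.asList(RELAXATION, ENUMERATION);
        System.out.println(isSatisfiable(cascade, new Store(3, 3, true)));
        System.out.println(isSatisfiable(cascade, new Store(5, 4, true)));
        System.out.println(isSatisfiable(cascade, new Store(2, 3, false)));
    }
}
```

In this toy, the store "3 ≤ x ≤ 3 and x even" passes the relaxation and is rejected only by the enumeration solver, whereas the empty interval "5 ≤ x ≤ 4" is already rejected by the cheap solver, which is the early pruning effect the paragraph describes.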
Our prototype implementation uses a sequence (MIP, CP), where MIP is the mixed integer-programming tool ILOG CPLEX⁴ and CP is the constraint-programming tool Ilog JSOLVER. Our Java implementation also performs some trivial simplifications such as constant propagation, but is otherwise not optimized in its use of the solvers or in its renaming process, whose speed and memory usage could be improved substantially. In practice, simplifications are done on the fly and the MIP solver is called at each node of the executable paths. The CP solver is only called at the end of the executable paths, when the complete postcondition is considered. Currently, the implementation uses a depth-first strategy for the CP solver, but modern CP languages now offer high-level abstractions to implement other exploration strategies. In practice, when CPBPV is used for model checking as discussed below, it is probably advisable to use a depth-first iterative deepening implementation.

⁴ See http://www.ilog.com/products.

5 Experimental results

In this section, we report experimental results for a set of traditional benchmarks for program verification. We compare CPBPV with the following frameworks:

– ESC/Java is an Extended Static Checker for Java to find common run-time errors in JML-annotated Java programs by static analysis of the code and its annotations. See http://kind.ucd.ie/products/opensource/ESCJava2/.
– CBMC is a Bounded Model Checker for ANSI-C and C++ programs. It allows for the verification of array bounds (buffer overflows), pointer safety, exceptions, and user-specified assertions. See http://www.cprover.org/cbmc/.
– BLAST, the Berkeley Lazy Abstraction Software Verification Tool, is a software model checker for C programs. See http://mtc.epfl.ch/software-tools/blast/.
– EUREKA is a C bounded model checker which uses an SMT solver instead of a SAT solver. See http://www.ai-lab.it/eureka/.
– Why is a software verification platform which integrates many existing provers (proof assistants such as Coq, PVS, HOL 4, ... and decision procedures such as Simplify, Yices, ...). See http://why.lri.fr/.

Of course, neither the expressiveness nor the objectives of all these systems are the same as those of CPBPV. For instance, some of them can handle CTL/LTL constraints, whereas CPBPV does not yet support this kind of constraint. Nevertheless, this comparison is useful to illustrate the capabilities of CPBPV. All experiments were performed on the same machine, an Intel(R) Pentium(R) M processor 1.86GHz with 1.5G of memory, using the versions of the verifiers that can be downloaded from their web sites (except for EUREKA, for which the execution times given in [2,3] are reported). For each benchmark program, we describe the data entries and the verification parameters. In the tables, "UNABLE" means that the corresponding framework is unable to validate the program, either because of a lack of expressiveness or because of time or memory limitations, "NOT FOUND" that it does not detect an error, and "FALSE ERROR" that it reports an error in a correct program. Complete details of the experiments, including input files and error traces, can be found in [13].

Binary search. We start with the binary search program presented in Figure 1. ESC/Java is applied on the program described in Figure 1. ESC/Java requires a limit on the number of loop unfoldings, which we set to log(n) + 1, the worst-case complexity of the binary search algorithm for an array of length n. Similarly, CBMC requires an overestimate of the number of loop unfoldings.
Since CBMC does not support first-order expressions such as the JML \forall statement, we generated a C program for each instance of the problem (i.e., each array length). For example, the postcondition for an array of length 8 is given by

    (result!=-1 && a[result]==x) ||
    (result==-1 && (a[0]!=x&&a[1]!=x&&a[2]!=x&&a[3]!=x&&a[4]!=x&&a[5]!=x&&a[6]!=x&&a[7]!=x))

For the Why framework, we used the binary search version given in their distribution. This program uses an assert statement to give a loop invariant. Note that CPBPV does not require any additional information: no invariant and no limits on loop unfoldings. During execution, it selects a path by nondeterministically applying the semantic rules for conditional and loop expressions.

    CPBPV     array length        8       16      32      64       128      256
              time                1.081s  1.69s   4.043s  17.009s  136.80s  1731.696s
    CBMC      array length        8       16      32      64       128      256
              time                1.37s   1.43s   UNABLE  UNABLE   UNABLE   UNABLE
    Why       with invariant      11.18s
              without invariant   UNABLE
    ESC/Java                      FALSE ERROR
    BLAST                         UNABLE

    Table 1. Comparison table for binary search

Table 1 reports the experimental results. Execution times for CPBPV are reported as a function of the array length for integers coded on 31 bits.⁵ Our implementation is optimized for neither time nor space at this stage, and times are only given to demonstrate the feasibility of the CPBPV verifier. The "Why" framework [16] was unable to verify the correctness without the loop invariant; 60% of the proof obligations remained unknown. The CBMC framework was not able to do the verification for an instance of length 32 (it was interrupted after 6691.87s). ESC/Java was unable to verify the correctness of this program unless complete loop invariants are provided.⁶
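The unrolled postcondition above can be written as a loop for any array length and evaluated on concrete inputs. The following minimal Java sketch does so for the program of Figure 1 and for the faulty variant named in the comment on its line 11, using the 8-bit counterexample reported in Section 2. The class name `BinarySearchCheck`, the `faulty` flag and the `postcondition` helper are illustrative assumptions. Running concrete inputs only tests individual executions; it does not replace the symbolic exploration of all paths performed by CPBPV.

```java
class BinarySearchCheck {
    /** Figure 1; when faulty is true, line 11 is replaced by u = m - 1. */
    static int binarySearch(int[] t, int v, boolean faulty) {
        int l = 0;
        int u = t.length - 1;
        while (l <= u) {
            int m = (l + u) / 2;
            if (t[m] == v) return m;
            if (t[m] > v) {
                u = m - 1;
            } else if (faulty) {
                u = m - 1;      // the injected error
            } else {
                l = m + 1;      // the correct line 11
            }
        }
        return -1;
    }

    /** The JML postcondition of Figure 1 evaluated on concrete values. */
    static boolean postcondition(int[] t, int v, int result) {
        if (result != -1) return t[result] == v;
        for (int x : t) {
            if (x == v) return false;   // v is present, so -1 is a wrong answer
        }
        return true;
    }

    public static void main(String[] args) {
        // Counterexample from Section 2 (integers coded on 8 bits).
        int[] t = {-128, -127, -126, -125, -124, -123, -122, -121};
        int v = -126;
        int ok = binarySearch(t, v, false);
        int bad = binarySearch(t, v, true);
        System.out.println("correct program: result=" + ok
                + " postcondition=" + postcondition(t, v, ok));
        System.out.println("faulty program:  result=" + bad
                + " postcondition=" + postcondition(t, v, bad));
    }
}
```

On this input the correct program finds v at index 2, while the faulty variant keeps shrinking the upper bound and returns -1 although v is in the array, so the postcondition is violated.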
An Incorrect Binary search. Table 2 reports experimental results for an incorrect binary search program (see Figure 1, line 11) for CPBPV, ESC/Java, CBMC, and Why using an invariant. The error trace found with CPBPV has been described in Section 2. The error traces provided by CBMC and ESC/Java, which only show the decisions taken along the faulty path, can be found in [13]. In contrast to CPBPV, they do not provide any value for the array or for the searched data. Observe that CPBPV provides orders of magnitude improvements in efficiency over CBMC and also outperforms ESC/Java by almost a factor of 8 on the largest instance.

⁵ The commercial MIP solver fails with 32-bit domains because of scaling issues.
⁶ A version with loop invariants that makes it possible to show the correctness of this program was written by David Cok, a developer of ESC/Java, after we contacted him.

                 CPBPV    ESC/Java   CBMC      Why with invariant   BLAST
    length 8     0.027s   1.21s      1.38s     NOT FOUND            UNABLE
    length 16    0.037s   1.347s     1.69s     NOT FOUND            UNABLE
    length 32    0.064s   1.792s     7.62s     NOT FOUND            UNABLE
    length 64    0.115s   1.886s     27.05s    NOT FOUND            UNABLE
    length 128   0.241s   1.964s     189.20s   NOT FOUND            UNABLE

    Table 2. Experimental Results for an Incorrect Binary Search

            CPBPV    ESC/Java   CBMC    Why     BLAST
    time    0.287s   1.828s     0.82s   8.85s   UNABLE

    Table 3. Experimental Results on the Tritype Program

The Tritype Program. The tritype program is a standard benchmark in test case generation and program verification since it contains numerous non-feasible paths: only 10 paths correspond to actual inputs because of complex conditional statements in the program.
The program takes three positive integers as inputs (the triangle sides) and returns 2 if the inputs correspond to an isosceles triangle, 3 if they correspond to an equilateral triangle, 1 if they correspond to some other triangle, and 4 otherwise. The tritype program in Java with its specification in JML can be found in [13]. Table 3 depicts the experimental results for CPBPV, ESC/Java, CBMC, BLAST and Why. BLAST was unable to validate this example because the current version does not handle linear arithmetic. Observe the excellent performance of CPBPV and note that our previous approach, which used constraint programming and Boolean abstraction to abstract the conditions, validated this benchmark in 8.52 seconds when integers were coded on 16 bits [12]. It also explored 92 spurious paths.

An Incorrect Tritype Program. Consider now an incorrect version of the Tritype program in which the test "if ((trityp==2)&&(i+k>j))" in line 22 (see [13]) is replaced by "if ((trityp==1)&&(i+k>j))". Since the local variable trityp is equal to 2 when i==k, the condition (i+k)>j implies that (i,j,k) are the sides of an isosceles triangle (the two other triangular inequalities are trivial because j>0). But when trityp=1, i==j holds, and this incorrect version may answer that the triangle is isosceles while it may not be a triangle at all. For example, it will return 2 when (i,j,k)=(1,1,2). Table 4 depicts the experimental results. Execution times correspond to the time required to find the first error. The error found with CPBPV corresponds to the input values (i,j,k) = (1,1,2) mentioned earlier. Once again, observe the excellent behavior of CPBPV compared to the remaining tools.⁷

⁷ For CBMC, we contacted D. Kroening, who recommended using the option CPROVER assert.
If we do so, CBMC is able to find the error, but we must add some assumptions stating that there is no overflow in the sums in order to prove the correct version of tritype with this same option.

            CPBPV    ESC/Java   CBMC        Why
    time    0.056s   1.853s     NOT FOUND   NOT FOUND

    Table 4. Experimental Results for the Incorrect Tritype Program

                 CPBPV    ESC/Java   CBMC     EUREKA
    length 8     1.45s    3.778s     1.11s    91s
    length 16    2.97s    UNABLE     2.01s    UNABLE
    length 32    UNABLE   UNABLE     6.10s    UNABLE
    length 64    UNABLE   UNABLE     37.65s   UNABLE

    Table 5. Experimental Results for Bubble Sort

Bubble Sort with initial condition. This benchmark (see [13]) is taken from [2] and performs a bubble sort of an array t which contains integers from 0 to t.length given in decreasing order. Table 5 shows the comparative results for this benchmark. CPBPV was limited on this benchmark because its recursive implementation uses up all the Java stack space. This problem should be remedied by removing recursion in CPBPV.

Selection Sort. We now present a benchmark to highlight both modular verification and the element constraint of constraint programming, which indexes arrays with arbitrary expressions. The benchmark is described in [13]. Assume that function findMin has been verified for arbitrary integers. When encountering a call to findMin, CPBPV first checks if its precondition is entailed by the constraint store, which requires a consistency check of the constraint store with respect to the negation of the precondition. Then CPBPV replaces the call by the post-condition, where the formal parameters are replaced by the actual variables. In particular, for the first iteration of the loop and an array length of 40, CPBPV generates the conjunction

$$0 \le k_0 < 40 \wedge t_0[k_0] \le t_0[0] \wedge \ldots \wedge t_0[k_0] \le t_0[39]$$

which features element constraints [25]. Indeed, $k_0$ is a variable and a constraint like $t_0[k_0] \le t_0[0]$ indexes the array $t_0$ of variables using $k_0$. The modular verification of the selection sort explores only a single path, is independent of the integer representation, and takes less than 0.01s for arrays of size 40. The bottleneck in verifying selection sort is the validation of function findMin, which requires the exploration of many paths. However, the complete validation of selection sort takes less than 4 seconds for an array of length 6. Once again, this should be contrasted with the model-checking approach of Eureka [2]. On a version of selection sort where all variables are assigned specific values (contrary to our verification, which makes no assumptions on the inputs), Eureka takes 104 seconds on a faster machine. Reference [2] also reports that CBMC takes 432.6 seconds, that BLAST cannot solve this problem, and that SATABS [9] only verifies the program for an array with 2 elements.

Sum of Squares. Our last benchmark is described in [13] and computes the sum of the squares of the n first integers stored in an array. The precondition states that n is the size of the array and that t must contain any possible permutation of the n first integers. The postcondition states that the result is $n \times (n+1) \times (2 \times n + 1)/6$. The benchmark illustrates two functionalities of constraint programming: the ability to specify combinatorial constraints and to solve nonlinear problems. The alldifferent constraint [23] in the pre-condition specifies that all the elements of the array are different, while the program constraints and postcondition involve quadratic and cubic constraints. The maximum instance that we were able to solve with CPBPV was an array of size 10 in 66.179s.
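For readers who want to see the sum-of-squares specification on concrete data, here is a minimal Java sketch. It is not the benchmark code of [13]: the class `SumOfSquares` and its three methods are hypothetical, and the "n first integers" are assumed to be 1..n, which is the reading under which the closed form n(n+1)(2n+1)/6 holds. The precondition is checked by a plain loop here, whereas CPBPV states it with an alldifferent constraint over symbolic array elements.

```java
class SumOfSquares {
    /** Precondition (assumed reading): t is a permutation of 1..n with n = t.length. */
    static boolean isPermutation(int[] t) {
        boolean[] seen = new boolean[t.length + 1];
        for (int x : t) {
            if (x < 1 || x > t.length || seen[x]) return false;
            seen[x] = true;
        }
        return true;
    }

    /** The program under verification: sum of the squares of the array elements. */
    static long sumOfSquares(int[] t) {
        long s = 0;
        for (int x : t) {
            s += (long) x * x;
        }
        return s;
    }

    /** Right-hand side of the postcondition: n(n+1)(2n+1)/6. */
    static long closedForm(int n) {
        return (long) n * (n + 1) * (2L * n + 1) / 6;
    }

    public static void main(String[] args) {
        int[] t = {3, 1, 2};
        System.out.println("precondition holds: " + isPermutation(t));
        System.out.println("sum = " + sumOfSquares(t)
                + ", closed form = " + closedForm(t.length));
    }
}
```

For t = {3, 1, 2}, both the loop and the closed form give 14. A concrete run of this kind checks one permutation, whereas the verifier has to establish the postcondition for every array allowed by the precondition, which is why the nonlinear constraints make the benchmark hard.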
CPLEX, the MIP solver, plays a key role in all these benchmarks. For instance, the CP solver is never called in the Tritype benchmark. For the Binary search benchmark, there are length calls to the CP solver, but almost 75% of the CPU time is spent in the CP solver. Since there is only one path in the Bubble sort benchmark, the CP solver is called only once. In the Sum of squares example, 80% of the CPU time is spent in the CP solver.

6 Discussion and Related Work

We briefly review recent work in constraint programming and model checking for software testing, validation, and verification. We outline the main differences between our CPBPV framework and existing approaches.

Constraint Logic Programming. Constraint logic programming (CLP) was used for test generation of programs (e.g., [17,20,24,19]) and provides a nice implementation tool extending symbolic execution techniques [4]. Gotlieb et al. showed how to represent imperative programs as constraint logic programs and used predicate abstraction (from model checking) and conditional constraints within a CLP framework. Flanagan [15] formalized the translation of imperative programs into CLP and argued that it could be used for bounded model checking, but did not provide an implementation. The test-generation methodology was generalized and applied to bounded program verification in [11,12]. The implementation used dedicated predicate abstractions to reduce the exploration of spurious execution paths. However, as shown in the paper, the CPBPV verifier is significantly more efficient and often avoids the generation of spurious execution paths completely.

Model Checking. It is also useful to contrast the CPBPV verifier with model checking of software systems.
SAT-based bounded model checking for software [6] consists in building a propositional formula whose models correspond to execution paths of bounded length violating some properties, and in using SAT solvers to check whether the resulting formula is satisfiable. SAT-based model-checking platforms [6] have been widely popular thanks to significant progress in SAT solvers. A fundamental issue faced by model checkers is the state space explosion of the resulting model. Various techniques have been proposed to address this challenge, including generalized symbolic execution (e.g., [21]), SMT-based model checking, and abstraction/refinement techniques. SMT-based model checking is the idea of representing and checking quantifier-free formulas in a more general decidable theory (e.g., [18,14,22]). These SMT solvers integrate dedicated solvers and share some of the motivations of constraint programming. Predicate abstraction is another popular technique to address the state space explosion. The idea consists in abstracting the program to obtain an abstract program on which model checking is performed. The model checker may then generate an abstract counterexample, which must be checked to determine whether it corresponds to a concrete execution path. If the counterexample is spurious, the abstract program is refined and the process is iterated. A successful form of predicate abstraction consists of abstracting the concrete program into a Boolean program (e.g., [5,7,8]). In recent work [3,2], Armando et al. proposed to abstract concrete programs into linear programs and used an abstraction of sets of variables and array indices. They showed that their tool compares favourably with, and on some of the programs considered in this paper outperforms, model checkers based on predicate abstraction.
Our CPBPV verifier contrasts with SAT-based model checkers, SMT-based model checkers, and approaches based on predicate abstraction: it does not abstract the program and does not generate spurious execution paths. Instead, it uses a constraint solver and nondeterministic exploration to incrementally construct abstractions of execution paths. The abstraction uses constraint stores to represent sets of concrete stores. On many bounded verification benchmarks, our preliminary experimental results show significant improvements over the state-of-the-art results in [2]. Model checking is well adapted to checking low-level C programs and hardware applications with numerous Boolean constraints and bitwise operations: it was successfully used to compare an ANSI C program with a circuit given as a design in Verilog [7]. However, it is important to observe that in model checking, one is typically interested in checking some specific properties such as buffer overflows, pointer safety, or user-specified assertions. These properties are typically much less detailed than our post-conditions, and abstracting the program may speed up the process significantly. In our CPBPV verifier, it is critical to explore all execution paths, and the main issue is how to effectively abstract memory stores by constraints and how to check satisfiability incrementally. It is an intriguing issue to determine whether a hybridization of the two approaches would be beneficial for model checking, an issue briefly discussed in the next section. Observe also that this research provides convincing evidence of the benefits of Nieuwenhuis' challenge [22] aiming at extending SMT⁸ with CP techniques.
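The exploration strategy described above, with a constraint store per path, early pruning of inconsistent stores, and a final check against the negated post-condition, can be sketched on a toy example. This is an illustrative Python sketch under strong assumptions, not the CPBPV implementation: the program is a hand-written tree of conditionals over one bounded integer input, and a brute-force enumeration over `DOMAIN` stands in for the CP and MIP solvers. All names (`DOMAIN`, `feasible`, `PROGRAM`, `POST`, `verify`) are hypothetical.

```python
DOMAIN = range(-4, 5)  # bounded input domain, as in bounded verification


def feasible(store):
    """Return a witness (x,) satisfying every constraint, or None.

    Brute-force stand-in for a real constraint solver.
    """
    for x in DOMAIN:
        if all(constraint(x) for constraint in store):
            return (x,)
    return None


# Toy program: nodes are ("if", cond, then_node, else_node) or ("ret", expr).
PROGRAM = (
    "if", lambda x: x > 0,
    ("if", lambda x: x < 0,
        ("ret", lambda x: -1),   # unreachable: x > 0 and x < 0
        ("ret", lambda x: x)),
    ("ret", lambda x: -x),
)


def POST(x, r):
    """Post-condition on input x and result r: the result is non-negative."""
    return r >= 0


def verify(node, store, stats):
    """Explore paths depth-first; return a counterexample or None."""
    if feasible(store) is None:
        stats["pruned"] += 1     # inconsistent store: drop this path early
        return None
    if node[0] == "ret":
        stats["paths"] += 1
        expr = node[1]
        # The path is correct iff store AND NOT(post-condition) is infeasible.
        return feasible(store + [lambda x: not POST(x, expr(x))])
    _, cond, then_node, else_node = node
    branches = ((then_node, cond), (else_node, lambda x: not cond(x)))
    for branch, guard in branches:
        counterexample = verify(branch, store + [guard], stats)
        if counterexample is not None:
            return counterexample
    return None


stats = {"paths": 0, "pruned": 0}
counterexample = verify(PROGRAM, [], stats)
```

On this toy program the sketch finds no counterexample, completes two paths, and prunes the branch guarded by both x > 0 and x < 0 as soon as its store becomes inconsistent, which mirrors the claim that no spurious path is explored to its end.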
⁸ See also [1] for a study of the relations between constraint programming and Satisfiability Modulo Theories (SMT).

7 Perspectives and Future Work

This paper introduced the CPBPV framework for bounded program verification. Its novelty is to use constraints to represent sets of memory stores and to explore execution paths over these constraint stores nondeterministically and incrementally. The CPBPV verifier exploits the fact that, when variables and arrays are bounded, the constraint store can always be checked for feasibility. As a result, it never explores spurious execution paths, contrary to earlier approaches combining constraint programming and predicate abstraction [11,12] or integrating SMT solvers and the abstraction/refinement approach from model checking [2]. We demonstrated the CPBPV verifier on a number of standard benchmarks from model checking and program checking, as well as on nonlinear programs and functions using complex array indexings, and showed how to perform modular verification. The experimental results demonstrate the potential of the approach: the CPBPV verifier provides significant gains in performance and functionality compared to other tools.

Our current work aims at improving and generalizing the framework and implementation. In particular, we would like to include tailored, lightweight solvers for a variety of constraint classes, the optimization of the array implementation, and the integration of Java objects and references. There are also many research avenues opened by this research, two of which are reviewed now.

Currently, the CPBPV verifier does not check for variable overflows: the constraint store enforces that variables take values inside their domains, and execution paths violating these constraints are thus not considered.
It is possible to generalize the CPBPV verifier to check overflows as the verification proceeds. The key idea is to check before each assignment whether the constraint store entails that the value produced fits in the selected integer representation, and to generate an error otherwise. (Similar assertions must in fact be checked for each subexpression of the right-hand side, in the language evaluation order.) Interval techniques on floats [4] may be used to obtain conservative checking of such assertions.

An intriguing direction is to use the CPBPV approach for property checking. Given an assertion to be verified, one may perform a backward execution from the assertion to the function entry point. The negation of the assertion is now the pre-condition and the pre-condition becomes the post-condition. This requires specifying inverse renaming and inverse executions of conditional and iterative statements, but these have already been studied in the context of test generation.

Acknowledgements

Many thanks to Jean-François Couchot for his help with the use of the Why framework.

References

1. Aït-Kaci H., Berstel B., Junker U., Leconte M., Podelski A.: Satisfiability Modulo Structures as Constraint Satisfaction: An Introduction. Procs of JFLA 2007.
2. Armando A., Benerecetti M., and Mantovani J.: Abstraction Refinement of Linear Programs with Arrays. Proceedings of TACAS 2007, LNCS 4424: 373–388.
3. Armando A., Mantovani J., and Platania L.: Bounded Model Checking of C Programs using a SMT solver instead of a SAT solver. Proc. SPIN'06, LNCS 3925, pages 146–162.
4. Botella B., Gotlieb A., Michel C.: Symbolic execution of floating-point computations. Software Testing, Verification and Reliability, 16(2):97–121, 2006.
5. Thomas Ball, Andreas Podelski, Sriram K. Rajamani: Boolean and Cartesian Abstraction for Model Checking C Programs. Proc. of TACAS 2001.
6. E. Clarke, A. Biere, R. Raimi, and Y. Zhu: Bounded Model Checking using Satisfiability Solving. FMSD, 19(1):7–34, 2001.
7. Clarke E., Kroening D., Lerda F.: A Tool for Checking ANSI-C programs. TACAS 2004, LNCS 2988, pp 168–176, 2004.
8. Clarke E., Kroening D., Sharygina N., Yorav K.: Predicate abstraction of ANSI-C Programs using SAT. FMSD, 25:105–127, 2004.
9. Clarke E., Kroening D., Sharygina N., Yorav K.: SATABS: SAT-Based Predicate Abstraction for ANSI-C. TACAS'05, 570–574, 2005.
10. Cytron R., Ferrante J., Rosen B., Wegman M., and Zadeck K.: Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. Transactions on Programming Languages and Systems, 13(4):451–490, October 1991.
11. Collavizza H. and Rueher M.: Software Verification using Constraint Programming Techniques. Procs of TACAS 2006, LNCS 3920: 182–196.
12. Collavizza H. and Rueher M.: Exploring different constraint-based modelings for program verification. Procs of CP 2007, LNCS 3920: 182–196.
13. Collavizza H., Rueher M., Van Hentenryck P.: Comparison between CPBPV with ESC/Java, CBMC, Blast, EUREKA and Why. http://www.i3s.unice.fr/~rueher/verificationBench.pdf
14. Bruno Dutertre and Leonardo Mendonca de Moura: A fast linear-arithmetic solver for DPLL(T). CAV 2006, pages 81–94, LNCS 4144.
15. Cormac Flanagan: Automatic software model checking via constraint logic. Science of Computer Programming, 50(1-3), pp. 253–270, 2004.
16. Filliâtre J.-C., Marché C.: The Why/Krakatoa/Caduceus Platform for Deductive Program Verification. Proc. CAV'2007, LNCS 4590, pp 173–177.
17. Gotlieb A., Botella B. and Rueher M.: Automatic Test Data Generation using Constraint Solving Techniques. Proc. ISSTA 98, ACM SIGSOFT (2), 1998.
18. Ganzinger H., Hagen G., Nieuwenhuis R., Oliveras A., and Tinelli C.: DPLL(T): Fast Decision Procedures. Proc. of CAV 2004, 175–188, 2004.
19. P. Godefroid, M. Y. Levin, D. Molnar: Automated Whitebox Fuzz Testing. NDSS 2008, Network and Distributed System Security Symposium.
20. Daniel Jackson and Mandana Vaziri: Finding Bugs with a Constraint Solver. ACM SIGSOFT Symposium on Software Testing and Analysis, 14–15, 2000.
21. Khurshid S., Pasareanu C.S., and Visser W.: Generalized Symbolic Execution for Model Checking and Testing. TACAS 2003, Warsaw, Poland.
22. R. Nieuwenhuis, A. Oliveras, E. Rodríguez-Carbonell and A. Rubio: Challenges in Satisfiability Modulo Theories. Invited Talk. RTA 2007, LNCS 4533, pp 2–18.
23. J-C. Régin: A filtering algorithm for constraints of difference in CSPs. AAAI-94, Seattle, WA, USA, pp 362–367, 1994.
24. Sy N.T. and Deville Y.: Automatic Test Data Generation for Programs with Integer and Float Variables. Proc. of 16th IEEE ASE01, 2001.
25. Van Hentenryck P.: Constraint Satisfaction in Logic Programming. MIT Press, 1989.
26. Pascal Van Hentenryck, Laurent Michel, Yves Deville: Numerica: A Modeling Language for Global Optimization. MIT Press, 1997.
