Algebraic Pattern Matching in Join Calculus
We propose an extension of the join calculus with pattern matching on algebraic data types. Our initial motivation is twofold: to provide an intuitive semantics of the interaction between concurrency and pattern matching; to define a practical compil…
Authors: Qin Ma, Luc Maranget
Logical Methods in Computer Science, Vol. 4 (1:7) 2008, pp. 1–41, www.lmcs-online.org
Submitted Jan. 25, 2007. Published Mar. 21, 2008.

ALGEBRAIC PATTERN MATCHING IN JOIN CALCULUS ∗

QIN MA (a) AND LUC MARANGET (b)

(a) OFFIS, Escherweg 2, 26121 Oldenburg, Germany. E-mail address: Qin.Ma@offis.de
(b) INRIA-Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France. E-mail address: Luc.Maranget@inria.fr

Abstract. We propose an extension of the join calculus with pattern matching on algebraic data types. Our initial motivation is twofold: to provide an intuitive semantics of the interaction between concurrency and pattern matching; to define a practical compilation scheme from extended join definitions into ordinary ones plus ML pattern matching. To assess the correctness of our compilation scheme, we develop a theory of the applied join calculus, a calculus with value passing and value matching. We implement this calculus as an extension of the current JoCaml system.

1. Introduction

The join calculus [15, 16] is a process calculus in the tradition of the π-calculus of Milner et al. [33]. One distinctive feature of join calculus is the simultaneous definition of all receptors on several channels through join definitions. A join definition is structured as a list of reaction rules, with each reaction rule being a pair of one join pattern and one guarded process. A join pattern is in turn a list of channel names (with formal arguments), specifying the synchronization among those channels: namely, a join pattern is matched only if there are messages present on all its channels. Finally, the reaction rules of one join definition define competing behaviors with a non-deterministic choice of which guarded process to trigger when several join patterns are satisfied.

In this paper, we extend the matching mechanism of join patterns, such that message contents are also taken into account.
As an example, let us consider the following list-based implementation of a concurrent stack:¹

    def pop(r) & State(x :: xs) ⊲ r(x) & State(xs)
     or push(v) & State(ls) ⊲ State(v :: ls)
     in State([]) & ...

1998 ACM Subject Classification: D.1.3, D.3.3, F.3.2.
Key words and phrases: join-calculus, pattern-matching, process calculus, concurrency.
∗ Extended version of [28].
¹ We use the OCaml syntax for lists, with Nil being [] and Cons being the infix ::.
Logical Methods in Computer Science, DOI:10.2168/LMCS-4(1:7)2008. © Q. Ma and L. Maranget, Creative Commons.

The second join pattern push(v) & State(ls) is an ordinary one: it is matched whenever there are messages on both State and push. By contrast, the first join pattern is an extended one, where the formal argument of channel State is an algebraic pattern, matched only by messages that are cons cells. Thus, when the stack is empty (i.e., when message [] is pending on channel State), pop requests are delayed. Note that we follow the convention that capitalized channels are private: only push and pop will be visible outside.

A similar stack can be implemented without using extended join patterns, but instead, using an extra private channel and ML pattern matching in guarded processes:

    def pop(r) & Some(ls) ⊲ match ls with
                            | [x] → r(x) & Empty()
                            | y :: x :: xs → r(y) & Some(x :: xs)
     or push(v) & Empty() ⊲ Some([v])
     or push(v) & Some(ls) ⊲ Some(v :: ls)
     in Empty() & ...

This second definition encodes the empty/non-empty status of the stack as a message on channels Empty and Some respectively. Pop requests on an empty stack are still delayed, since there is no rule for the join pattern pop(r) & Empty(). The second definition obviously requires more programming effort.
Moreover, it is not immediately apparent that messages on Some are non-empty lists, and that the partial ML pattern matching thus never fails.

Join definitions with (constant) pattern arguments appear informally in functional nets [36]. Here we generalize this idea to full algebraic patterns. A similar attempt has also been scheduled by Benton et al. as an interesting future work for Cω [7].

The new semantics is a smooth extension, since both join pattern matching and pattern matching rest upon classical substitution (or semi-unification). However, an efficient implementation is more involved. Our idea is to address this issue by transforming programs whose definitions contain extended join patterns into equivalent programs whose definitions use ordinary join patterns and whose guarded processes use ML pattern matching. Doing so, we leave most of the burden of pattern matching compilation to an ordinary ML pattern matching compiler.

However, such a transformation is far from obvious. More specifically, there is a gap between (extended) join pattern matching, which is non-deterministic, and ML pattern matching, which is deterministic (following the "first match policy"). For example, in our definition of a concurrent stack with extended join patterns, State(ls) is still matched by any message on State, regardless of the presence of the more precise State(x :: xs) in the competing reaction rule that precedes it. Our solution to this problem relies on partitioning matching values into non-intersecting sets. In the case of our concurrent stack, those sets simply are the singleton {[]} and the set of non-empty lists. Then, pattern State(ls) is matched by values from both sets, while pattern State(x :: xs) is matched only by values of the second set.
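The partition described above can be illustrated with a small plain-OCaml sketch. It only models the two disjoint classes of values for the stack example and the State patterns each class satisfies; it is not the paper's compilation scheme, and the names `part`, `classify` and `enabled_rules` are hypothetical, introduced here for illustration.

```ocaml
(* The two non-intersecting sets of values that can be pending on State:
   the singleton {[]} and the set of non-empty lists. *)
type part = Empty_list | Nonempty_list

(* Deterministic ML matching assigns each value to exactly one class. *)
let classify (l : 'a list) : part =
  match l with
  | [] -> Empty_list
  | _ :: _ -> Nonempty_list

(* Rules of the stack definition whose State pattern accepts a value of the
   given class: "pop" stands for State(x :: xs), "push" for State(ls). *)
let enabled_rules (p : part) : string list =
  match p with
  | Empty_list -> ["push"]
  | Nonempty_list -> ["pop"; "push"]
```

For a non-empty list both rules remain candidates, so the non-deterministic choice between pop and push is preserved even though the classification itself is deterministic.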
The rest of the paper is organized as follows: Section 2 first gives a brief review of algebraic patterns and ML pattern matching. Section 3 presents the applied join calculus, an extension of join with algebraic pattern matching. We introduce the semantics and the appropriate equivalence relations. Section 4 informally explains the key ideas to transform the extension to the ordinary join calculus, and especially how we deal with the nondeterminism problem. Section 5 formalizes the transformation as a compilation scheme and presents the algorithm which essentially works by building a meet semi-lattice of patterns. We go through a complete example in Section 6, and finally, we deal with the correctness of the compilation scheme in Section 7. Implementation has been carried out as an extension of the JoCaml system. We discuss the issues that have arisen during the implementation work in Section 8. An earlier version of this paper (lacking the detailed proofs and the discussion of the implementation) appeared as [28].

2. Algebraic data types and ML pattern matching

This section serves as a brief introduction to algebraic data types and ML pattern matching. Interested readers are referred to [30, 26] for further details.

2.1. Algebraic data types. In functional languages, new types can be introduced by using data type definitions and such types are algebraic data types. For example, using OCaml syntax, binary trees can be defined as follows:

    type tree = Empty | Leaf of int | Node of tree * tree

The complete signature of type tree has three constructors: Empty, Leaf, and Node, which are used to build the values of this type. Every constructor has an arity, i.e., the number of arguments it requires, and also specifies the type of each argument.
In this definition, Empty is of arity zero, Leaf is of arity one (and accepts integer arguments), and Node is of arity two (both its arguments being themselves of type tree). A constructor of zero arity is sometimes called a constant constructor.

Most native ML data types can be seen as particular instances of algebraic data types. For example, lists are defined by two constructors: constant Nil (written []) for empty lists and Cons (written as the infix ::) for nonempty ones; pairs are defined by one constructor with arity two (written as the infix ","); and integers are defined by infinitely many (or 2^31) constant constructors.

Formally, the algebraic values (for short values) of type t are well-typed terms built from the constructors of t. "Well-typed" here means correct with respect to constructor arity and argument types. Assuming a countable set of identifiers for constructors, ranged over by κ, we give the formal definition of values as follows:

    v ::=                         Algebraic values
          κ(v1, v2, ..., vn)      κ of arity n ≥ 0

Type correctness is left implicit: we shall consider well-typed terms only.

Algebraic patterns (for short patterns) of type t are also well-typed terms built from the constructors of t, but with variables.² The formal definition of patterns is given as follows.

    π ::=                         Algebraic patterns
          x                       variable
          κ(π1, π2, ..., πn)      κ of arity n ≥ 0

We further require all variables in a pattern to be pairwise distinct, that is, we only consider linear patterns.

Again, we assume a typed context. More precisely, we rely on the ML type system to guarantee that values and patterns are well-typed. Moreover, we rely on an ML type inferer to enrich syntax with explicit types (which we leave implicit), and consider that the type of any syntactic structure is available whenever needed.
Doing so, we focus on our main issue and avoid complications that would be of little explanatory value.

² We freely replace variables whose names are of no importance by wildcards "_".

Patterns are used to discriminate values according to their structures. More specifically, a pattern denotes a set of values that have a common prefix specified by the pattern. We say a value v (of type t) is an instance of pattern π (of type t), or that v matches π, when π describes the prefix of v, in other words, when there exists a substitution σ such that πσ = v. For linear patterns, the instance relation can be defined inductively as follows:

Definition 2.1 (Instance). Let π be a pattern and v be a value, such that π and v have the same type. The instance relation π ⪯ v is defined as:

    _ ⪯ v
    κ(π1, ..., πn) ⪯ κ(v1, ..., vn)   iff   πi ⪯ vi for all 1 ≤ i ≤ n

We write Ins(π) for the set of the instances of pattern π.

The instance relation induces the following relations among patterns. These relations apply to patterns π1 and π2 that have the same type.

Definition 2.2 (Pattern relations).
• Patterns π1 and π2 are compatible when they share at least one instance. Otherwise π1 and π2 are incompatible, written π1 # π2. Two compatible patterns admit a least upper bound written π1 ↑ π2, whose instance set is Ins(π1) ∩ Ins(π2).
• Pattern π1 is less precise than pattern π2, written π1 ⪯ π2, when Ins(π2) ⊆ Ins(π1).
• Patterns π1 and π2 are equivalent, written π1 ≡ π2, when Ins(π1) = Ins(π2). If so, their least upper bound is their representative, written π1 ↕ π2.

Note that we use the same notation for both relations: "being an instance of" (which is between a pattern and a value) and "being less precise" (which is between two patterns).
Indeed, values are in fact a special case of patterns (with no variables), and in that case, both relations collapse.

The least upper bound of two patterns can be computed at the same time as compatibility is checked, by the following rules:

    _ ↑ π = π
    π ↑ _ = π
    κ(π1, ..., πn) ↑ κ(ω1, ..., ωn) = κ(π1 ↑ ω1, ..., πn ↑ ωn)

Deciding the relation "being less precise" is more involved. Because of typing, there exist nontrivial instances of this relation, for example (_, _) ⪯ _. The JoCaml compiler relies on an efficient algorithm for this task, called the U algorithm, with U standing for "Usefulness" [30]. Algorithm U takes two parameters: a list of patterns Π and a pattern π, and returns a boolean. Roughly speaking, it checks the usefulness of π with respect to Π. More specifically, algorithm U tests the existence of at least one value v such that π admits v as an instance, and none of the patterns in Π does. From the point of view of algorithm U, deciding the relation π1 ⪯ π2 amounts to computing the negation of U([π1], π2). Namely, π1 is less precise than π2 if and only if all the instances of π2 are instances of π1:

    π1 ⪯ π2  ⟺  U([π1], π2) = false

We now give a simplified definition of algorithm U. The simplified definition suffices for our needs and also conveys the basic idea behind the algorithm. Consider U([π1], π2), where π1 and π2 are patterns of a common type t. The following two cases are distinguished.

Case π2 = κ(ω1, ..., ωn)
• If π1 = κ(γ1, ..., γn), then check if ∃i, 1 ≤ i ≤ n, s.t. U([γi], ωi).
• If π1 = κ′(γ1, ..., γn) and κ ≠ κ′, then useful (i.e. false for π1 ⪯ π2).
• If π1 = _, then useless (i.e. true for π1 ⪯ π2).

Case π2 = _
• If π1 = _, then useless (i.e. true for π1 ⪯ π2).
• If π1 = κ(γ1, ..., γn),
  − if κ is the unique constructor of type t, then check if ∃i, 1 ≤ i ≤ n, s.t. U([γi], _).
  − otherwise useful (i.e. false for π1 ⪯ π2).

Once we can decide relation "⪯", we can easily decide pattern equivalence, since, by definition, π1 ≡ π2 means π1 ⪯ π2 and π2 ⪯ π1.

2.2. ML pattern matching. In ML, operating on algebraic data types is performed by the use of the following match construct, which we extend to processes (Q1, Q2, etc. below are processes of the join calculus).

    match v with π1 → Q1 | π2 → Q2 | ... | πn → Qn

Above, we attempt a matching of value v against a sequence of patterns π1, ..., πn of the same type. ML pattern matching is deterministic. It follows the "first match policy". That is, when value v is an instance of more than one of the patterns πi, the match construct chooses the one with the smallest index i. This can be seen as checking patterns π1, π2, ..., πn sequentially for admitting v as an instance, stopping as soon as a match is found. As a consequence, pattern πi is matched only by the values in the set Ins(πi) \ (⋃_{1 ≤ j < i} Ins(πj)).
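The definitions of this section can be sketched in plain OCaml. The code below is only an illustration under stated assumptions, not the JoCaml compiler's implementation: the type `pat`, the functions `instance`, `lub`, `useful`, `less_precise` and `first_match`, and the predicate `unique` (which stands in for the typing information "κ is the unique constructor of its type") are all hypothetical names. Constructors are identified by strings, values are represented as patterns without wildcards, and terms are assumed to be well-typed.

```ocaml
type pat =
  | Any                        (* variable or wildcard "_" *)
  | Con of string * pat list   (* constructor applied to sub-patterns *)

(* Instance relation (Definition 2.1): does value v match pattern p? *)
let rec instance p v =
  match p, v with
  | Any, _ -> true
  | Con (k, ps), Con (k', vs) ->
      k = k' && List.length ps = List.length vs
      && List.for_all2 instance ps vs
  | Con _, Any -> false

(* Least upper bound; None when the two patterns are incompatible. *)
let rec lub p q =
  match p, q with
  | Any, r | r, Any -> Some r
  | Con (k, ps), Con (k', qs) ->
      if k <> k' || List.length ps <> List.length qs then None
      else
        let args = List.map2 lub ps qs in
        if List.mem None args then None
        else Some (Con (k, List.filter_map Fun.id args))

(* Simplified U([p1], p2): is there an instance of p2 that p1 rejects? *)
let rec useful unique p1 p2 =
  match p2, p1 with
  | Con (k, ws), Con (k', gs) ->
      k <> k' || List.exists2 (useful unique) gs ws
  | Con _, Any -> false
  | Any, Any -> false
  | Any, Con (k, gs) ->
      not (unique k) || List.exists (fun g -> useful unique g Any) gs

(* p1 is less precise than p2 exactly when p2 is useless w.r.t. [p1]. *)
let less_precise unique p1 p2 = not (useful unique p1 p2)

(* First-match policy: 1-based index of the first pattern admitting v. *)
let first_match pats v =
  let rec go i = function
    | [] -> None
    | p :: rest -> if instance p v then Some i else go (i + 1) rest
  in
  go 1 pats

(* Small helpers for lists and pairs, used for experimentation. *)
let nil = Con ("[]", [])
let cons h t = Con ("::", [h; t])
let one = Con ("1", [])
let unique k = (k = ",")   (* assumption: only pairs have one constructor *)
```

With these helpers, `first_match [cons Any Any; Any] nil` selects the second pattern, which reflects that under the first-match policy a pattern is reached only by values that no earlier pattern admits.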