Declarative Combinatorics: Isomorphisms, Hylomorphisms and Hereditarily Finite Data Types in Haskell

This paper is an exploration in a functional programming framework of {\em isomorphisms} between elementary data types (natural numbers, sets, multisets, finite functions, permutations, binary decision diagrams, graphs, hypergraphs, parenthesis langua…

Authors: ** Paul Tarau (Department of Computer Science and Engineering, University of North Texas) **

Isomorphic Data Encodings in Haskell and their Generalization to Hylomorphisms on Hereditarily Finite Data Types

Paul Tarau
Department of Computer Science and Engineering
University of North Texas
E-mail: tarau@cs.unt.edu

Abstract. This paper is an exploration in a functional programming framework of isomorphisms between elementary data types (natural numbers, sets, multisets, finite functions, permutations, binary decision diagrams, graphs, hypergraphs, parenthesis languages, dyadic rationals, primes, DNA sequences etc.) and their extension to hereditarily finite universes through hylomorphisms derived from ranking/unranking and pairing/unpairing operations.

An embedded higher order combinator language provides any-to-any encodings automatically. Besides applications to experimental mathematics, a few examples of “free algorithms” obtained by transferring operations between data types are shown. Other applications range from stream iterators on combinatorial objects to self-delimiting codes, succinct data representations and generation of random instances.

The paper covers 59 data types and, through the use of the embedded combinator language, provides 3540 distinct bijective transformations between them.

The self-contained source code of the paper, as generated from a literate Haskell program, is available at http://logic.csci.unt.edu/tarau/research/2008/fISO.zip.

A short, 5 page version of the paper, published as [1], describes the idea of organizing various data transformations as encodings to sequences of natural numbers and gives a few examples of hylomorphisms that lift the encodings to related hereditarily finite universes.
Keywords: Haskell data representations, data type isomorphisms, declarative combinatorics, computational mathematics, Ackermann encoding, Gödel numberings, arithmetization, ranking/unranking, hereditarily finite sets, functions and permutations, encodings of binary decision diagrams, dyadic rationals, DNA encodings

1 Introduction

Analogical/metaphorical thinking routinely shifts entities and operations from one field to another, hoping to uncover similarities in representation or use [2].

Compilers convert programs from human centered to machine centered representations, sometimes reversibly.

Complexity classes are defined through compilation with limited resources (time or space) to similar problems [3, 4].

Mathematical theories often borrow proof patterns and reasoning techniques across close and sometimes not so close fields.

A relatively small number of universal data types are used as basic building blocks in programming languages and their runtime interpreters, corresponding to a few well tested mathematical abstractions like sets, functions, graphs, groups, categories etc.

A less obvious leap is that if heterogeneous objects can be seen in some way as isomorphic, then we can share them and compress the underlying informational universe by collapsing isomorphic encodings of data or programs whenever possible.

Sharing heterogeneous data objects faces two problems:

- some form of equivalence needs to be proven between two objects A and B before A can replace B in a data structure, a possibly tedious and error prone task
- the fast growing diversity of data types makes it harder and harder to recognize sharing opportunities.

Besides, this raises the question: what guarantees do we have that sharing across heterogeneous data types is useful and safe?
The techniques introduced in this paper provide a generic solution to these problems, through isomorphic mappings between heterogeneous data types, such that unified internal representations make equivalence checking and sharing possible. The added benefit of these “shapeshifting” data types is that the functors transporting their data content will also transport their operations, resulting in shortcuts that provide, for free, implementations of interesting algorithms. The simplest instance is the case of isomorphisms: reversible mappings that also transport operations. In their simplest form such isomorphisms show up as encodings to some simpler and easier to manipulate representation, for instance natural numbers.

Such encodings can be traced back to Gödel numberings [5, 6] associated to formulae, but a wide diversity of common computer operations, ranging from data compression and serialization to wireless data transmissions and cryptographic codes, qualify.

Encodings between data types provide a variety of services ranging from free iterators and random objects to data compression and succinct representations. Tasks like serialization and persistence are facilitated by simplification of reading or writing operations without the need of special purpose parsers. Sensitivity to internal data representation format or size limitations can be circumvented without extra programming effort.

2 An Embedded Data Transformation Language

We will start by designing an embedded transformation language as a set of operations on a groupoid of isomorphisms. We will then extend it with a set of higher order combinators mediating the composition of the encodings and the transfer of operations between data types.

2.1 The Groupoid of Isomorphisms

We implement an isomorphism between two objects X and Y as a Haskell data type encapsulating a bijection f and its inverse g.
We will call the from function the first component (a section in category theory parlance) and the to function the second component (a retraction) defining the isomorphism. We can organize isomorphisms as a groupoid as follows:

    [Diagram: the objects X and Y connected by a pair of mutually inverse arrows labeled f = g^{-1} and g = f^{-1}]

    data Iso a b = Iso (a -> b) (b -> a)

    from (Iso f _) = f
    to (Iso _ g) = g

    compose :: Iso a b -> Iso b c -> Iso a c
    compose (Iso f g) (Iso f' g') = Iso (f' . f) (g . g')

    itself = Iso id id

    invert (Iso f g) = Iso g f

Assuming that for any pair of type Iso a b, $f \circ g = id_b$ and $g \circ f = id_a$, we can now formulate laws about isomorphisms that can be used to test correctness of implementations with tools like QuickCheck [7].

Proposition 1 The data type Iso has a groupoid structure, i.e.
the compose operation, when defined, is associative, itself acts as an identity element and invert computes the inverse of an isomorphism.

We can transport operations from one object to another with borrow and lend combinators defined as follows:

    borrow :: Iso t s -> (t -> t) -> s -> s
    borrow (Iso f g) h x = f (h (g x))
    borrow2 (Iso f g) h x y = f (h (g x) (g y))
    borrowN (Iso f g) h xs = f (h (map g xs))

    lend :: Iso s t -> (t -> t) -> s -> s
    lend = borrow . invert
    lend2 = borrow2 . invert
    lendN = borrowN . invert

The combinators fit and retrofit just transport an object x through an isomorphism and apply to it an operation op available on the other side:

    fit :: (b -> c) -> Iso a b -> a -> c
    fit op iso x = op ((from iso) x)

    retrofit :: (a -> c) -> Iso a b -> b -> c
    retrofit op iso x = op ((to iso) x)

We can see the combinators from, to, compose, itself, invert, borrow, lend, fit etc. as part of an embedded data transformation language. Note that in this design we borrow from our strongly typed host programming language its abstraction layers and safety mechanisms that continue to check the semantic validity of the embedded language constructs.

2.2 Choosing a Root

To avoid defining $n(n-1)/2$ isomorphisms between $n$ objects, we choose a Root object to/from which we will actually implement isomorphisms. We will extend our embedded combinator language using the groupoid structure of the isomorphisms to connect any two objects through isomorphisms to/from the Root.

Choosing a Root object is somewhat arbitrary, but it makes sense to pick a representation that is relatively easily convertible to various others, efficiently implementable and, last but not least, scalable to accommodate large objects up to the runtime system’s actual memory limits.

We will choose as our Root object finite sequences of natural numbers.
They can be seen as finite functions from an initial segment of Nat, say $[0..n]$, to Nat. This implies that a finite function can be seen as an array or a list of natural numbers except that we do not limit the size of the representation of its values. We will represent them as lists, i.e. their Haskell type is [Nat].

    type Nat = Integer
    type Root = [Nat]

We can now define an Encoder as an isomorphism connecting an object to Root

    type Encoder a = Iso a Root

together with the combinators with and as providing an embedded transformation language for routing isomorphisms through two Encoders.

    with :: Encoder a -> Encoder b -> Iso a b
    with this that = compose this (invert that)

    as :: Encoder a -> Encoder b -> b -> a
    as that this thing = to (with that this) thing

The combinator with turns two Encoders into an arbitrary isomorphism, i.e. acts as a connection hub between their domains. The combinator as adds a more convenient syntax, taking the target Encoder first and the source Encoder second, such that converters between A and B can be designed as:

    a2b x = as B A x
    b2a x = as A B x

    [Diagram: Root connected to A by the arrows a and a^{-1} and to B by the arrows b and b^{-1}; A and B connected by a2b = as B A and b2a = as A B]

A particularly useful combinator that transports binary operations from one Encoder to another, borrow_from, can be defined as follows:

    borrow_from :: Encoder a -> (a -> a -> a) -> Encoder b -> b -> b -> b
    borrow_from other op this x y = borrow2 (with other this) op x y

Note that one can also use the more intuitive equivalent definition

    borrow_from' other op this x y = z where
      x' = as other this x
      y' = as other this y
      z' = op x' y'
      z = as this other z'

given that the following equivalence always holds:

    borrow_from ≡ borrow_from'    (1)

We will provide extensive use cases for these combinators as we populate our groupoid of isomorphisms. Given that [Nat] has been chosen as the root, we will define our finite function data type fun simply as the identity isomorphism on sequences in [Nat].

    fun :: Encoder [Nat]
    fun = itself

3 Extending the Groupoid of Isomorphisms

We will now populate our groupoid of isomorphisms with combinators based on a few primitive converters.

3.1 An Isomorphism between Finite Multisets and Finite Functions

Multisets [8] are unordered collections with repeated elements. Non-decreasing sequences provide a canonical representation for multisets of natural numbers. The isomorphism between finite multisets and finite functions is specified with two bijections mset2fun and fun2mset.

    mset :: Encoder [Nat]
    mset = Iso mset2fun fun2mset

While finite multisets and sequences representing finite functions share a common representation [Nat], multisets are subject to the implicit constraint that their order is immaterial (see footnote 1). This suggests that a multiset like [4,4,1,3,3,3] could be represented by first ordering it as [1,3,3,3,4,4] and then computing the differences between consecutive elements, i.e. $[x_0 \ldots x_i, x_{i+1} \ldots] \rightarrow [x_0 \ldots x_{i+1} - x_i \ldots]$.
This gives [1,2,0,0,1,0], with the first element 1 followed by the increments [2,0,0,1,0], as implemented by mset2fun:

    mset2fun = to_diffs . sort . (map must_be_nat)

    to_diffs xs = zipWith (-) (xs) (0:xs)

    must_be_nat n | n >= 0 = n

It can now be verified easily that incremental sums of the numbers in such a sequence return the original multiset in sorted form, as implemented by fun2mset:

    fun2mset ns = tail (scanl (+) 0 (map must_be_nat ns))

The resulting isomorphism mset can be applied directly using its two components mset2fun and fun2mset. Equivalently, it can be expressed more “generically” by using the as combinator, as follows:

    *ISO> mset2fun [1,3,3,3,4,4]
    [1,2,0,0,1,0]
    *ISO> fun2mset [1,2,0,0,1,0]
    [1,3,3,3,4,4]
    *ISO> as fun mset [1,3,3,3,4,4]
    [1,2,0,0,1,0]
    *ISO> as mset fun [1,2,0,0,1,0]
    [1,3,3,3,4,4]

Footnote 1: Such constraints can be regarded as laws that we assume about a given data type, when needed, restricting it to the appropriate domain of the underlying mathematical concept.

3.2 An Isomorphism to Finite Sets of Natural Numbers

While finite sets and sequences share a common representation [Nat], sets are subject to the implicit constraints that all their elements are distinct and order is immaterial. Like in the case of multisets, this suggests that a set like {7,1,4,3} could be represented by first ordering it as {1,3,4,7} and then computing the differences between consecutive elements. This gives [1,2,1,3], with the first element 1 followed by the increments [2,1,3]. To turn it into a bijection, including 0 as a possible member of a sequence, another adjustment is needed: elements in the sequence of increments should be replaced by their predecessors. This gives [1,1,0,2] as implemented by set2fun:

    set2fun xs | is_set xs = shift_tail pred (mset2fun xs)

    shift_tail _ [] = []
    shift_tail f (x:xs) = x:(map f xs)

    is_set ns = ns == nub ns

It can now be verified easily that predecessors of the incremental sums of the successors of numbers in such a sequence return the original set in sorted form, as implemented by fun2set:

    fun2set = (map pred) . fun2mset . (map succ)

The Encoder (an isomorphism with fun) can be specified with the two bijections set2fun and fun2set.

    set :: Encoder [Nat]
    set = Iso set2fun fun2set

The Encoder (set) is now ready to interoperate with another Encoder:

    *ISO> as fun set [0,2,3,4,9]
    [0,1,0,0,4]
    *ISO> as set fun [0,1,0,0,4]
    [0,2,3,4,9]
    *ISO> as mset set [0,2,3,4,9]
    [0,1,1,1,5]
    *ISO> as set mset [0,1,1,1,5]
    [0,2,3,4,9]

As the example shows, the Encoder set connects arbitrary lists of natural numbers representing finite functions to strictly increasing sequences of (distinct) natural numbers representing sets. Then, through the use of the combinator as, sets represented by set are connected to multisets represented by mset. This connection is (implicitly) routed through a connection to fun, as if

    *ISO> as mset fun [0,1,0,0,4]
    [0,1,1,1,5]

were executed.

3.3 Folding Sets into Natural Numbers

We can fold a set, represented as a list of distinct natural numbers, into a single natural number, reversibly, by observing that it can be seen as the list of exponents of 2 in the number’s base 2 representation.
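As a worked instance of this correspondence, the exponents of the powers of 2 that sum to a number form the set it encodes. The two values below were picked for illustration and expanded by hand:

```latex
\[
42 = 2^{1} + 2^{3} + 2^{5}
\quad\longleftrightarrow\quad \{1, 3, 5\},
\qquad
2008 = 2^{3} + 2^{4} + 2^{6} + 2^{7} + 2^{8} + 2^{9} + 2^{10}
\quad\longleftrightarrow\quad \{3, 4, 6, 7, 8, 9, 10\}.
\]
```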
    nat_set = Iso nat2set set2nat

    nat2set n | n >= 0 = nat2exps n 0 where
      nat2exps 0 _ = []
      nat2exps n x = if (even n) then xs else (x:xs) where
        xs = nat2exps (n `div` 2) (succ x)

    set2nat ns | is_set ns = sum (map (2^) ns)

We will standardize this pair of operations as an Encoder for a natural number using our Root as a mediator:

    nat :: Encoder Nat
    nat = compose nat_set set

Given that nat is an isomorphism with the Root fun, one can use directly its from and to components:

    *ISO> from nat 2008
    [3,0,1,0,0,0,0]
    *ISO> to nat it
    2008

Moreover, the resulting Encoder (nat) is now ready to interoperate with any Encoder, in a generic way:

    *ISO> as fun nat 2008
    [3,0,1,0,0,0,0]
    *ISO> as set nat 2008
    [3,4,6,7,8,9,10]
    *ISO> as nat set [3,4,6,7,8,9,10]
    2008
    *ISO> lend nat reverse 2008
    1135
    *ISO> lend nat_set reverse 2008
    2008
    *ISO> borrow nat_set succ [1,2,3]
    [0,1,2,3]
    *ISO> as set nat 42
    [1,3,5]
    *ISO> fit length nat 42
    3
    *ISO> retrofit succ nat_set [1,3,5]
    43

The reader might notice at this point that we have already made full circle, as finite sets can be seen as instances of finite sequences. Injective functions that are not surjections with wider and wider gaps can be generated using the fact that one of the representations is information theoretically “denser” than the other, for a given range:

    *ISO> as set fun [0,1,2,3]
    [0,2,5,9]
    *ISO> as set fun $ as set fun [0,1,2,3]
    [0,3,9,19]
    *ISO> as set fun $ as set fun $ as set fun [0,1,2,3]
    [0,4,14,34]

One can now define, for instance, a mapping from natural numbers to multisets simply as:

    nat2mset = as mset nat
    mset2nat = as nat mset

but we will not explicitly need such definitions as the equivalent function is clearly provided by the combinator as. One can now borrow operations between set and nat as follows:

    *ISO> borrow_from set union nat 42 2008
    2042
    *ISO> 42 .|. 2008 :: Nat
    2042
    *ISO> borrow_from set intersect nat 42 2008
    8
    *ISO> 42 .&. 2008 :: Nat
    8
    *ISO> borrow_from nat (*) set [1,2,3] [4,5]
    [5,7,9]
    *ISO> borrow_from nat (+) set [1,2,3] [3,4,5]
    [1,2,6]

and notice that operations like union and intersection of sets map to boolean operations on numbers as expected, while other operations are not necessarily meaningful at first sight. We will show next a few cases where such “shapeshiftings” of operations reveal more interesting analogies.

3.4 Encoding Finite Multisets with Primes

A factorization of a natural number is uniquely described as a multiset of primes. We will use the fact that each prime number is uniquely associated to its position in the infinite stream of primes to obtain a bijection from multisets of natural numbers to natural numbers. We assume defined a prime generator primes and a factoring function to_factors (see Appendix). The function nat2pmset maps a natural number to the multiset of prime positions in its factoring. Note that we treat 0 as [] and shift n to n+1 to accommodate 0 and 1, to which prime factoring operations do not apply.

    nat2pmset 0 = []
    nat2pmset n = map (to_pos_in (h:ts)) (to_factors (n + 1) h ts) where
      (h:ts) = genericTake (n + 1) primes

    to_pos_in xs x = fromIntegral i where Just i = elemIndex x xs

The function pmset2nat maps back a multiset of positions of primes to the result of the product of the corresponding primes. Again, we map [] to 0 and shift back by 1 the result.

    pmset2nat [] = 0
    pmset2nat ns = (product ks) - 1 where
      ks = map (from_pos_in ps) ns
      ps = primes

    from_pos_in xs n = xs !! (fromIntegral n)

We obtain the Encoder:

    pmset :: Encoder [Nat]
    pmset = compose (Iso pmset2nat nat2pmset) nat

working as follows:

    *ISO> as pmset nat 2008
    [3,3,12]
    *ISO> as nat pmset it
    2008
    *ISO> map (as pmset nat) [0..7]
    [[],[0],[1],[0,0],[2],[0,1],[3],[0,0,0]]

Note that the mappings from a set or sequence to a number work in time and space linear in the bitsize of the number.
On the other hand, as prime number enumeration and factoring are involved in the mapping from numbers to multisets, this encoding is intractable for all but small values.

We are now ready to “shapeshift” between data types while watching for interesting landscapes to show up.

3.5 Exploring the analogy between multiset decompositions and factoring

As natural numbers can be uniquely represented as a multiset of prime factors and, independently, they can also be represented as a multiset with the Encoder mset described in subsection 3.1, the following question arises naturally:

Can in any way the “easy to reverse” encoding mset emulate or predict properties of the difficult to reverse factoring operation?

The first step is to define an analog of the multiplication operation in terms of the computationally easy multiset encoding mset. Clearly, it makes sense to take inspiration from the fact that factoring of an ordinary product of two numbers can be computed by concatenating the multisets of prime factors of its operands.

    mprod = borrow_from mset (++) nat

Proposition 2 $\langle N, mprod, 0 \rangle$ is a commutative monoid, i.e. mprod is defined for all pairs of natural numbers and it is associative, commutative and has 0 as an identity element.

After rewriting the definition of mprod as the equivalent:

    mprod_alt n m = as nat mset ((as mset nat n) ++ (as mset nat m))

the proposition follows immediately from the associativity of the concatenation operation and the order independence of the multiset encoding provided by mset.
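The monoid laws of Proposition 2 can be spot-checked on small operands. The following is a minimal, self-contained sketch rather than the paper's code: it inlines the conversions that `as mset nat` and `as nat mset` perform (under the names nat2mset and mset2nat), omits the must_be_nat and is_set guards, and computes nat2set with a list comprehension instead of the recursive nat2exps.

```haskell
import Data.List (sort)

type Nat = Integer

-- Multiset -> sequence: sort, then take consecutive differences.
mset2fun :: [Nat] -> [Nat]
mset2fun xs = zipWith (-) ys (0 : ys) where ys = sort xs

-- Sequence -> multiset: running sums.
fun2mset :: [Nat] -> [Nat]
fun2mset = tail . scanl (+) 0

-- Set -> sequence: differences, with all increments after the first decremented.
set2fun :: [Nat] -> [Nat]
set2fun xs = case mset2fun xs of
  []       -> []
  (y : ys) -> y : map pred ys

-- Sequence -> set: inverse of set2fun.
fun2set :: [Nat] -> [Nat]
fun2set = map pred . fun2mset . map succ

-- Natural number -> positions of the 1 bits in its binary representation.
nat2set :: Nat -> [Nat]
nat2set n = [i | (i, b) <- zip [0 ..] (bitsOf n), b == 1]
  where
    bitsOf 0 = []
    bitsOf k = k `mod` 2 : bitsOf (k `div` 2)

-- Set of exponents -> natural number.
set2nat :: [Nat] -> Nat
set2nat = sum . map (2 ^)

-- The conversions routed through the root, written out explicitly.
nat2mset :: Nat -> [Nat]
nat2mset = fun2mset . set2fun . nat2set

mset2nat :: [Nat] -> Nat
mset2nat = set2nat . fun2set . mset2fun

-- Multiset "product": concatenate the multisets, then map back to a number.
mprod :: Nat -> Nat -> Nat
mprod n m = mset2nat (nat2mset n ++ nat2mset m)
```

Loaded in GHCi, `mprod 33 46` evaluates to 605, and comprehensions such as `and [mprod a b == mprod b a | a <- [0..40], b <- [0..40]]` exercise commutativity on a finite range; this is evidence for the laws, not a proof.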
We can derive an exponentiation operation as a repeated application of mprod:

    mexp n 0 = 0
    mexp n k = mprod n (mexp n (k-1))

Here are a few examples comparing mprod to ordinary multiplication and exponentiation:

    *ISO> mprod 41 (mprod 33 88)
    3539
    *ISO> mprod (mprod 41 33) 88
    3539
    *ISO> mprod 33 46
    605
    *ISO> mprod 46 33
    605
    *ISO> mprod 0 712
    712
    *ISO> mprod 5513 0
    5513
    *ISO> (41 * 33) * 88
    119064
    *ISO> 41 * (33 * 88)
    119064
    *ISO> 33 * 46
    1518
    *ISO> 46 * 33
    1518
    *ISO> 1 * 712
    712
    *ISO> 5513 * 1
    5513
    *ISO> map (\x -> mexp x 2) [0..15]
    [0,3,6,15,12,27,30,63,24,51,54,111,60,123,126,255]
    *ISO> map (\x -> x^2) [0..15]
    [0,1,4,9,16,25,36,49,64,81,100,121,144,169,196,225]

Note also that any multiset encoding of natural numbers can be used to define a similar commutative monoid structure. In the case of pmset we obtain:

    pmprod n m = as nat pmset ((as pmset nat n) ++ (as pmset nat m))

If one defines:

    pmprod' n m = (n + 1) * (m + 1) - 1

it follows immediately from the definition of pmprod that:

    pmprod ≡ pmprod'    (2)

This is useful as computing pmprod' is easy while computing pmprod, which involves factoring, is intractable for large values. This brings us back to observe that:

Proposition 3 $\langle N, pmprod, 0 \rangle$ is a commutative monoid, i.e. pmprod is defined for all pairs of natural numbers and it is associative, commutative and has 0 as an identity element.

Fig. 1 compares the shapes of pmprod' (virtually the same as ordinary multiplication) and mprod for operands in $[0..2^7-1]$. One can see the contrast between the regular shape of ordinary multiplication and the recursively “self-similar” landscape induced by mprod.
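Equivalence (2) can also be checked empirically on small operands. The sketch below is self-contained and rests on several assumptions: the trial-division `primes` and `factors` are simple stand-ins for the prime generator and the to_factors function of the Appendix (not reproduced here), `primeIndex` is a hypothetical helper introduced for this illustration, and the conversions that `as pmset nat` and `as nat pmset` perform are inlined as nat2pmset and pmset2nat.

```haskell
type Nat = Integer

-- Infinite list of primes by trial division against earlier primes
-- (an assumed stand-in for the Appendix's generator).
primes :: [Nat]
primes = 2 : filter isPrime [3, 5 ..]
  where
    isPrime n = all (\p -> n `mod` p /= 0) (takeWhile (\p -> p * p <= n) primes)

-- Prime factors of n >= 1 with multiplicity, in non-decreasing order.
factors :: Nat -> [Nat]
factors n = go n primes
  where
    go 1 _ = []
    go _ [] = []
    go k (p : ps)
      | p * p > k      = [k]
      | k `mod` p == 0 = p : go (k `div` p) (p : ps)
      | otherwise      = go k ps

-- Zero-based position of a prime in the stream of primes (hypothetical helper).
primeIndex :: Nat -> Nat
primeIndex p = fromIntegral (length (takeWhile (< p) primes))

-- n is shifted to n+1 so that 0 maps to the empty multiset.
nat2pmset :: Nat -> [Nat]
nat2pmset n = map primeIndex (factors (n + 1))

-- Product of the primes at the given positions, shifted back by 1.
pmset2nat :: [Nat] -> Nat
pmset2nat ns = product [primes !! fromIntegral i | i <- ns] - 1

-- Multiset concatenation transported to natural numbers.
pmprod :: Nat -> Nat -> Nat
pmprod n m = pmset2nat (nat2pmset n ++ nat2pmset m)

-- Closed form claimed by equivalence (2).
pmprod' :: Nat -> Nat -> Nat
pmprod' n m = (n + 1) * (m + 1) - 1
```

In GHCi, `nat2pmset 2008` gives [3,3,12] (since 2009 = 7 * 7 * 41), and `and [pmprod a b == pmprod' a b | a <- [0..50], b <- [0..50]]` compares the two definitions on a small grid. The comparison only works for small inputs because pmprod factors its operands.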
One can also bring mprod closer to ordinary multiplication by defining

    mprod' 0 _ = 0
    mprod' _ 0 = 0
    mprod' m n = (mprod (n-1) (m-1)) + 1

    mexp' n 0 = 1
    mexp' n k = mprod' n (mexp' n (k-1))

and by observing that they correlate as follows:

    *ISO> map (\x -> mexp' x 2) [0..16]
    [0,1,4,7,16,13,28,31,64,25,52,55,112,61,124,127,256]
    *ISO> map (\x -> x^2) [0..16]
    [0,1,4,9,16,25,36,49,64,81,100,121,144,169,196,225,256]
    *ISO> map (\x -> mexp' x 3) [0..16]
    [0,1,8,15,64,29,120,127,512,57,232,239,960,253,1016,1023,4096]
    *ISO> map (\x -> x^3) [0..16]
    [0,1,8,27,64,125,216,343,512,729,1000,1331,1728,2197,2744,3375,4096]

Fig. 2 shows that values for mexp' follow from below those of the $x^2$ function and that equality only holds when x is a power of 2.

Note that the structure induced by mprod' is similar to ordinary multiplication:

Proposition 4 $\langle N, mprod', 1 \rangle$ is a commutative monoid, i.e. mprod' is defined for all pairs of natural numbers and it is associative, commutative and has 1 as an identity element.

Interestingly, mprod' coincides with ordinary multiplication if one of the operands is a power of 2. More precisely, the following holds:

Proposition 5 mprod' x y = x * y if and only if $\exists n \geq 0$ such that $x = 2^n$ or $y = 2^n$. Otherwise, mprod' x y < x * y.

Fig. 3 shows the (scaled up by 1000) self-similar landscape generated by the [0..1]-valued function (mprod' x y) / (x*y).

Besides the connection with products, natural mappings worth investigating are the analogies between multiset intersection and gcd of the corresponding numbers or between multiset union and the lcm of the corresponding numbers. Assuming the definitions of multiset operations provided in the Appendix, one can define:

    mgcd :: Nat -> Nat -> Nat
    mgcd = borrow_from mset msetInter nat

    mlcm :: Nat -> Nat -> Nat
    mlcm = borrow_from mset msetUnion nat

    mdiv :: Nat -> Nat -> Nat
    mdiv = borrow_from mset msetDif nat

and note that properties similar to usual arithmetic operations hold:

    mprod (mgcd x y) (mlcm x y) ≡ mprod x y    (3)
    mdiv (mprod x y) y ≡ x    (4)
    mdiv (mprod x y) x ≡ y    (5)

While mprod, mprod', pmprod' and pmprod are not distributive with ordinary addition, it looks like an interesting problem to find for each of them compatible additive operations.

[Fig. 1: multiplication vs mprod: pmprod' and mprod]
[Fig. 2: Square vs. mexp' n 2]
[Fig. 3: Ratio between mprod' and product]

3.6 Unfolding Natural Numbers into Bitstrings

The isomorphism between natural numbers and bitstrings is well known, except that it is usually ignored that conventional bit representations of integers need a twist to be mapped one-to-one to arbitrary sequences of 0s and 1s. As the usual binary representation always has 1 as its highest digit, nat2bits will drop this bit, given that the length of the list of digits is (implicitly) known. This transformation (a variant of the so called bijective base n representation) brings us an isomorphism between Nat and the regular language $\{0,1\}^*$.

    bits :: Encoder [Nat]
    bits = compose (Iso bits2nat nat2bits) nat

    nat2bits = drop_last . (to_base 2) . succ

    drop_last bs = genericTake ((genericLength bs) - 1) bs

    to_base base n = d : (if q == 0 then [] else (to_base base q)) where
      (q,d) = quotRem n base

    bits2nat bs = pred (from_base 2 (bs ++ [1]))

    from_base base [] = 0
    from_base base (x:xs) | x >= 0 && x < base = x + base * (from_base base xs)

Note also that, strictly speaking, this is only an isomorphism when the digits in the bitlist are in {0,1}, therefore we shall assume this constraint as a law governing this Encoder.

The following examples show two conversion operations and bits borrowing a multiplication operation from nat.
    *ISO> as bits nat 42
    [1,1,0,1,0]
    *ISO> as nat bits [1,1,0,1,0]
    42
    *ISO> borrow2 (with nat bits) (*) [1,1,0] [1,0,1,1]
    [1,0,0,1,1,0,0,0]

The reader might notice at this point that we have made full circle again, as bitstrings can be seen as instances of finite sequences. Injective functions that are not surjections with wider and wider gaps can be generated by composing the as combinators:

    *ISO> as bits fun [1,1]
    [1,1,0]
    *ISO> as bits fun (as bits fun [1,1])
    [1,1,0,1]
    *ISO> as bits fun $ as bits fun $ as bits fun [1,1]
    [1,1,0,1,1,0]

3.7 Encoding Signed Integers

To encode signed integers one can map non-negative numbers to even numbers and strictly negative numbers to odd numbers. This gives the Encoder:

    type Z = Integer

    z :: Encoder Z
    z = compose (Iso z2nat nat2z) nat

    nat2z n = if even n then n `div` 2 else (-n-1) `div` 2
    z2nat n = if n < 0 then -2*n-1 else 2*n

working as follows:

    *ISO> as set z (-42)
    [0,1,4,6]
    *ISO> as z set [0,1,4,6]
    -42

3.8 Functional Binary Numbers

Church numerals are well known as a functional representation for Peano arithmetic. While benefiting from lazy evaluation, they implement a form of unary arithmetic that uses $O(n)$ space to represent n. This suggests devising a functional representation that mimics binary numbers. We will do this following the model described in subsection 3.6 to provide an isomorphism between Nat and the functional equivalent of the regular language $\{0,1\}^*$.
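The model of subsection 3.6 that this construction follows can be recalled with a small standalone sketch. It is an illustration under assumptions, not the paper's code: the digit range guard of from_base is dropped, `init` replaces drop_last, and the camel-case names toBase and fromBase are chosen here. Bit lists are least significant bit first.

```haskell
type Nat = Integer

-- Digits of n in the given base, least significant first.
toBase :: Nat -> Nat -> [Nat]
toBase base n = d : (if q == 0 then [] else toBase base q)
  where (q, d) = quotRem n base

-- Inverse of toBase (no check that digits lie below the base).
fromBase :: Nat -> [Nat] -> Nat
fromBase base = foldr (\x acc -> x + base * acc) 0

-- Increment, convert to binary, then drop the leading 1 (last list element).
nat2bits :: Nat -> [Nat]
nat2bits = init . toBase 2 . succ

-- Restore the leading 1, convert back, then decrement.
bits2nat :: [Nat] -> Nat
bits2nat bs = pred (fromBase 2 (bs ++ [1]))
```

With these definitions `map nat2bits [0..6]` yields [[],[0],[1],[0,0],[1,0],[0,1],[1,1]], so every bit list, including those with trailing zeros, corresponds to exactly one natural number. This is the property the functional bitstrings of this section rely on.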
We will view each bit as a Nat -> Nat transformer:

    b x = pred x     -- begin
    o x = 2*x + 0    -- bit 0
    i x = 2*x + 1    -- bit 1
    e = 1            -- end

As the following example shows, composition of functions o and i closely parallels the corresponding bitlists:

    *ISO> b$i$o$o$i$i$o$i$i$i$i$e
    2008
    *ISO> as bits nat 2008
    [1,0,0,1,1,0,1,1,1,1]

We can follow the same model with an abstract data type:

    data D = E | O D | I D deriving (Eq,Ord,Show,Read)
    data B = B D deriving (Eq,Ord,Show,Read)

from which we can generate functional bitstrings as an instance of a fold operation:

    funbits2nat :: B -> Nat
    funbits2nat = bfold b o i e

    bfold fb fo fi fe (B d) = fb (dfold d) where
      dfold E = fe
      dfold (O x) = fo (dfold x)
      dfold (I x) = fi (dfold x)

Dually, we can reverse the effect of the functions b, o, i, e as:

    b' x = succ x
    o' x | even x = x `div` 2
    i' x | odd x = (x-1) `div` 2
    e' = 1

and define a generator for our data type as an unfold operation:

    nat2funbits :: Nat -> B
    nat2funbits = bunfold b' o' i' e'

    bunfold fb fo fi fe x = B (dunfold (fb x)) where
      dunfold n | n == fe = E
      dunfold n | even n = O (dunfold (fo n))
      dunfold n | odd n = I (dunfold (fi n))

The two operations form an isomorphism:

    *ISO> funbits2nat (B $ I $ O $ O $ I $ I $ O $ I $ I $ I $ I $ E)
    2008
    *ISO> nat2funbits it
    B (I (O (O (I (I (O (I (I (I (I E))))))))))

We can define our Encoder as follows:

    funbits :: Encoder B
    funbits = compose (Iso funbits2nat nat2funbits) nat

Arithmetic operations can now be performed directly on this representation.
For instance, one can define a successor function as:

    bsucc (B d) = B (dsucc d) where
      dsucc E = O E
      dsucc (O x) = I x
      dsucc (I x) = O (dsucc x)

Equivalently, arithmetic can be borrowed from Nat:

    *ISO> bsucc (B $ I $ O $ O $ I $ I $ O $ I $ I $ I $ I $ E)
    B (O (I (O (I (I (O (I (I (I (I E))))))))))
    *ISO> as nat funbits it
    2009
    *ISO> borrow (with nat funbits) succ (B $ I $ O $ O $ I $ I $ O $ I $ I $ I $ I $ E)
    B (O (I (O (I (I (O (I (I (I (I E))))))))))
    *ISO> as nat funbits it
    2009

While Haskell's C-based arbitrary length integers are likely to be more efficient for most operations, this representation, like Church numerals, has the benefit of supporting partial or delayed computations through lazy evaluation.

4 Generic Unranking and Ranking Hylomorphisms

The ranking problem for a family of combinatorial objects is finding a unique natural number associated to each object, called its rank. The inverse unranking problem consists of generating a unique combinatorial object associated to each natural number.

4.1 Pure Hereditarily Finite Data Types

The unranking operation is seen here as an instance of a generic anamorphism mechanism (an unfold operation), while the ranking operation is seen as an instance of the corresponding catamorphism (a fold operation) [9, 10]. Together they form a mixed transformation called hylomorphism. We will use such hylomorphisms to lift isomorphisms between lists and natural numbers to isomorphisms between a derived “self-similar” tree data type and natural numbers. In particular we will derive Ackermann's encoding from Hereditarily Finite Sets to Natural Numbers. The data type representing hereditarily finite structures will be a generic multi-way tree with a single leaf type [].
    data T = H [T] deriving (Eq,Ord,Read,Show)

The two sides of our hylomorphism are parameterized by two transformations f and g forming an isomorphism Iso f g:

    unrank f n = H (unranks f (f n))
    unranks f ns = map (unrank f) ns

    rank g (H ts) = g (ranks g ts)
    ranks g ts = map (rank g) ts

Both combinators can be seen as a form of “structured recursion” that propagates a simpler operation guided by the structure of the data type. For instance, the size of a tree of type T is obtained as:

    tsize = rank (\xs -> 1 + (sum xs))

Note also that unrank and rank work on T in cooperation with unranks and ranks working on [T].

We can now combine an anamorphism+catamorphism pair into an isomorphism hylo defined with rank and unrank on the corresponding hereditarily finite data types:

    hylo :: Iso b [b] -> Iso T b
    hylo (Iso f g) = Iso (rank g) (unrank f)

    hylos :: Iso b [b] -> Iso [T] [b]
    hylos (Iso f g) = Iso (ranks g) (unranks f)

Hereditarily Finite Sets. Hereditarily Finite Sets will be represented as an Encoder for the tree type T:

    hfs :: Encoder T
    hfs = compose (hylo nat_set) nat

The hfs Encoder can now borrow operations from sets or natural numbers as follows:

    hfs_union = borrow2 (with set hfs) union
    hfs_succ = borrow (with nat hfs) succ
    hfs_pred = borrow (with nat hfs) pred

    *ISO> hfs_succ (H [])
    H [H []]
    *ISO> hfs_union (H [H []]) (H [])
    H [H []]

Otherwise, hylomorphism-induced isomorphisms work as usual with our embedded transformation language:

    *ISO> as hfs nat 42
    H [H [H []],H [H [],H [H []]],H [H [],H [H [H []]]]]
    *ISO> as hfs nat 2008
    H [H [H [],H [H []]],H [H [H [H []]]],H [H [H []],
    H [H [H []]]],H [H [],H [H []],H [H [H []]]],
    H [H [H [],H [H []]]],H [H [],H [H [],H [H []]]],
    H [H [H []],H [H [],H [H []]]]]

One can notice that we have just derived as a “free algorithm” Ackermann's encoding [11, 12] from Hereditarily Finite Sets to Natural Numbers:

$f(x) = \text{if } x = \{\} \text{ then } 0 \text{ else } \sum_{a \in x} 2^{f(a)}$

together with its inverse:

    ackermann = as nat hfs
    inverse_ackermann = as hfs nat

One can represent the action of a hylomorphism unfolding a natural number into a hereditarily finite set as a directed graph with outgoing edges induced by applying the inverse_ackermann function, as shown in Fig. 4.

Fig. 4: 2008 as a HFS

Hereditarily Finite Functions. The same tree data type can host a hylomorphism derived from finite functions instead of finite sets:

    hff :: Encoder T
    hff = compose (hylo nat) nat

The hff Encoder can be seen as another “free algorithm”, providing data compression/succinct representation for Hereditarily Finite Sets. Note, for instance, the significantly smaller tree size in:

    *ISO> as hff nat 42
    H [H [H []],H [H []],H [H []]]
    *ISO> as hff nat 2008
    H [H [H [],H []],H [],H [H []],H [],H [],H [],H []]

As the cognoscenti might observe, this is explained by the fact that hff provides higher information density than hfs, by incorporating order information that matters in the case of a sequence and is ignored in the case of a set. One can represent the action of a hylomorphism unfolding a natural number into a hereditarily finite function as a directed ordered multi-graph, as shown in Fig. 5. Note that, as the mapping as fun nat generates a sequence where the order of the edges matters, this order is indicated with integers starting from 0 labeling the edges.

Fig. 5: 2008 as a HFF

It is also interesting to connect sequences and HFF directly, in case one wants to represent giant “sparse numbers” that correspond to sequences that would overflow memory if represented as natural numbers but have a relatively simple structure as formulae used to compute them.
We obtain the Encoder:

    hffs :: Encoder T
    hffs = Iso hff2fun fun2hff

    fun2hff ns = H (map (as hff nat) ns)
    hff2fun (H hs) = map (as nat hff) hs

which can be used to generate HFFs associated to very large numbers:

    *ISO> as hffs fun [2^65,2^131]
    H [H [H [H [],H [H [],H [H []]]]],H [H [H [],H [],H [H [],H [H []]]]]]

4.2 Hereditarily Finite Multisets

In a similar way, one can derive an Encoder for Hereditarily Finite Multisets based on either the mset or the pmset isomorphisms:

    nat_mset = Iso nat2mset mset2nat

    hfm :: Encoder T
    hfm = compose (hylo nat_mset) nat

    nat_pmset = Iso nat2pmset pmset2nat

    hfpm :: Encoder T
    hfpm = compose (hylo nat_pmset) nat

working as follows:

    *ISO> as hfm nat 2008
    H [H [H [],H []],H [H [],H []],H [H [H [H []]]],H [H [H [H []]]],
    H [H [H [H []]]],H [H [H [H []]]],H [H [H [H []]]]]
    *ISO> as nat hfm it
    2008
    *ISO> as hfpm nat 2008
    H [H [H [],H []],H [H [],H []],H [H [H [],H [H []]]]]
    *ISO> as nat hfpm it
    2008

After implementing this encoding, a web search revealed that it is essentially the same as the one in [13], where it appears as an encoding of rooted trees.

4.3 A Hylomorphism with Atoms/Urelements

A similar construction can be carried out for the more practical case when Atoms (Urelements in Set Theory parlance) are present. Hereditarily Finite Sets with Urelements are represented as generic multi-way trees with a leaf type holding urelements/atoms:

    data UT a = A a | F [UT a] deriving (Eq,Ord,Read,Show)

Atoms will be mapped to natural numbers in [0..ulimit-1]. Assuming for simplicity that ulimit is fixed, we denote this set A and denote by UT the set of trees of type UT with atoms in A.
Unranking. As an adaptation of the unfold operation, natural numbers will be mapped to elements of UT with a generic higher order function unrankU f, defined from Nat to UT, parameterized by the natural number ulimit and the transformer function f:

    ulimit = 4

    unrankU = unrankUL ulimit
    unranksU = unranksUL ulimit

    unrankUL l _ n | n >= 0 && n < l = A n
    unrankUL l f n = F (unranksUL l f (f (n-l)))

    unranksUL l f ns = map (unrankUL l f) ns

Ranking. Similarly, as an adaptation of fold, a generic inverse mapping rankU is defined as:

    rankU = rankUL ulimit
    ranksU = ranksUL ulimit

    rankUL l _ (A n) | n >= 0 && n < l = n
    rankUL l g (F ts) = l + (g (ranksUL l g ts))

    ranksUL l g ts = map (rankUL l g) ts

where rankU g maps trees to numbers and ranksU g maps lists of trees to lists of numbers.

The following proposition describes conditions under which rankU and unrankU can be used to lift isomorphisms between [Nat] and Nat to isomorphisms involving hereditarily finite structures:

Proposition 6. If the transformer function $f : Nat \to [Nat]$ is a bijection with inverse $g$, such that $n \geq ulimit \wedge f(n) = [n_0, \ldots, n_i, \ldots, n_k] \Rightarrow n_i < n$, then $(unrankU\ f) : Nat \to UT$ is a bijection with inverse $(rankU\ g) : UT \to Nat$ and the recursive computations defining both functions terminate in a finite number of steps.

Proof. Note that unrankU terminates as its arguments strictly decrease at each step and rankU terminates as leaf nodes are eventually reached. That both are bijections follows by induction on the structure of Nat and UT, given that map preserves bijections and that adding/subtracting ulimit ensures that encodings of atoms and sets never overlap.
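The ranking/unranking pair above can be tried out in isolation. The following is a minimal sketch, not the paper's module: it assumes that the set transformer maps a number to the positions of its 1 bits (the names nat2set and set2nat are reused here with locally defined, simplified bodies), and it passes the atom limit explicitly instead of fixing ulimit.

```haskell
-- Trees with atoms at the leaves, as in the text.
data UT a = A a | F [UT a] deriving (Eq, Show)

-- Assumed set transformer: positions of the 1 bits of n, lowest first.
nat2set :: Integer -> [Integer]
nat2set n = [i | (i, b) <- zip [0 ..] (bits n), b == 1]
  where
    bits 0 = []
    bits m = m `mod` 2 : bits (m `div` 2)

-- Its inverse on duplicate-free lists.
set2nat :: [Integer] -> Integer
set2nat = sum . map (2 ^)

-- Unfold: codes below l are atoms, all other codes are shifted down by l
-- and expanded with the transformer f.
unrankUL :: Integer -> (Integer -> [Integer]) -> Integer -> UT Integer
unrankUL l f n
  | n >= 0 && n < l = A n
  | otherwise       = F (map (unrankUL l f) (f (n - l)))

-- Fold: atoms keep their code, inner nodes are shifted up by l.
rankUL :: Integer -> ([Integer] -> Integer) -> UT Integer -> Integer
rankUL l _ (A n) | n >= 0 && n < l = n
rankUL l g (F ts) = l + g (map (rankUL l g) ts)
```

With l = 4, the number 42 unfolds to F [A 1,A 2,F [A 0]], since 42 - 4 = 38 has its 1 bits at positions 1, 2 and 5, and 5 - 4 = 1 has its only 1 bit at position 0. Setting l = 0 leaves no room for atoms and gives back the pure hereditarily finite sets of Section 4.1.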
The resulting hylomorphisms are defined as previously:

    hyloU (Iso f g) = Iso (rankU g) (unrankU f)
    hylosU (Iso f g) = Iso (ranksU g) (unranksU f)

An Encoder for Hereditarily Finite Sets with Urelements is defined as:

    uhfs :: Encoder (UT Nat)
    uhfs = compose (hyloU nat_set) nat

Note that this encoder provides a generalization of Ackermann's mapping to Hereditarily Finite Sets with Urelements in [0..u-1], defined as:

$f_u(x) = \text{if } x < u \text{ then } x \text{ else } u + \sum_{a \in x} 2^{f_u(a)}$

A similar Encoder for Hereditarily Finite Functions with Urelements is defined as:

    uhff :: Encoder (UT Nat)
    uhff = compose (hyloU nat) nat

4.4 Extending the encoding for the case of an infinite set of Atoms/Urelements

An adaptation of the previous construction for the case when an infinite supply of atoms/urelements is needed (i.e. when their number is not known in advance) follows.

Unranking. As an adaptation of the unfold operation, natural numbers will be mapped to elements of UT with a generic higher order function unrankIU f, defined from Nat to UT, parameterized by the transformer function f:

    unrankIU _ n | even n = A (n `div` 2)
    unrankIU f n = F (unranksIU f (f ((n-1) `div` 2)))

    unranksIU f ns = map (unrankIU f) ns

Note that (an infinite supply of) even numbers provides codes for atoms, while odd numbers are used to encode the non-leaf structure of the trees in UT.

Ranking. Similarly, as an adaptation of fold, a generic inverse mapping rankIU g is defined as:

    rankIU _ (A n) = 2*n
    rankIU g (F ts) = 1 + 2*(g (ranksIU g ts))

    ranksIU g ts = map (rankIU g) ts

where rankIU g maps trees to numbers and ranksIU g maps lists of trees to lists of numbers.
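The even/odd split can be checked with a small standalone sketch. As an assumption made only for this illustration, the transformer is again the bit-position view of a number as a set (nat2set and set2nat are defined locally); any other bijection between Nat and [Nat] whose outputs are smaller than its input should work the same way.

```haskell
data UT a = A a | F [UT a] deriving (Eq, Show)

-- Assumed set transformer: positions of the 1 bits of n, lowest first.
nat2set :: Integer -> [Integer]
nat2set n = [i | (i, b) <- zip [0 ..] (bits n), b == 1]
  where
    bits 0 = []
    bits m = m `mod` 2 : bits (m `div` 2)

set2nat :: [Integer] -> Integer
set2nat = sum . map (2 ^)

-- Even codes are atoms, odd codes are inner nodes.
unrankIU :: (Integer -> [Integer]) -> Integer -> UT Integer
unrankIU f n
  | even n    = A (n `div` 2)
  | otherwise = F (map (unrankIU f) (f ((n - 1) `div` 2)))

rankIU :: ([Integer] -> Integer) -> UT Integer -> Integer
rankIU _ (A n)  = 2 * n
rankIU g (F ts) = 1 + 2 * g (map (rankIU g) ts)
```

Under these assumptions the codes 0, 1, 2, 3 unfold to A 0, F [], A 1 and F [A 0], and 7 unfolds to F [A 0,F []], because (7-1) `div` 2 = 3 has its 1 bits at positions 0 and 1.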
The resulting hylomorphisms are defined as previously:

    hyloIU (Iso f g) = Iso (rankIU g) (unrankIU f)
    hylosIU (Iso f g) = Iso (ranksIU g) (unranksIU f)

An Encoder for Hereditarily Finite Sets with an infinite supply of Urelements is defined as:

    iuhfs :: Encoder (UT Nat)
    iuhfs = compose (hyloIU nat_set) nat

A similar Encoder for Hereditarily Finite Functions with an infinite supply of Urelements is defined as:

    iuhff :: Encoder (UT Nat)
    iuhff = compose (hyloIU nat) nat

5 Permutations and Hereditarily Finite Permutations

We have seen that finite sets and their derivatives represent information in an order independent way, focusing exclusively on information content. We will now look at data representations that focus exclusively on order in a content independent way: finite permutations and their hereditarily finite derivatives.

To obtain an encoding for finite permutations we will first review a ranking/unranking mechanism for permutations that involves an unconventional numeric representation, factoradics.

5.1 The Factoradic Numeral System

The factoradic numeral system [14] replaces digits multiplied by a power of a base n with digits that multiply successive values of the factorial of n. In the increasing order variant fr the first digit $d_0$ is 0, the second is $d_1 \in \{0,1\}$ and the n-th is $d_n \in [0..n]$. For instance, $42 = 0 \cdot 0! + 0 \cdot 1! + 0 \cdot 2! + 3 \cdot 3! + 1 \cdot 4!$. The left-to-right, decreasing order variant fl is obtained by reversing the digits of fr.
    fr 42
    [0,0,0,3,1]
    rf [0,0,0,3,1]
    42
    fl 42
    [1,3,0,0,0]
    lf [1,3,0,0,0]
    42

The function fr, generating the factoradics of n right to left, handles the special case of 0 and calls a local function f which recurses and divides with increasing values of n while collecting digits with mod:

    fr 0 = [0]
    fr n = f 1 n where
      f _ 0 = []
      f j k = (k `mod` j) : (f (j+1) (k `div` j))

The function fl, with digits left to right, is obtained as follows:

    fl = reverse . fr

The function rf (inverse of fr) converts back to decimals by summing up results while computing the factorial progressively:

    rf ns = sum (zipWith (*) ns factorials) where
      factorials = scanl (*) 1 [1..]

Finally, lf, the inverse of fl, is obtained as:

    lf = rf . reverse

5.2 Ranking and unranking permutations of given size with Lehmer codes and factoradics

The Lehmer code of a permutation f of size n is defined as the sequence $l(f) = (l_1(f) \ldots l_i(f) \ldots l_n(f))$ where $l_i(f)$ is the number of elements of the set $\{ j > i \mid f(j) < f(i) \}$ [15].

Proposition 7. The Lehmer code of a permutation determines the permutation uniquely.

The function perm2nth computes a rank for a permutation ps of size>0. It starts by first computing its Lehmer code ls with perm2lehmer. Then it associates a unique natural number n to ls, by converting it with the function lf from factoradics to decimals. Note that the Lehmer code ls is used as the list of digits in the factoradic representation.

    perm2nth ps = (l,lf ls) where
      ls = perm2lehmer ps
      l = genericLength ls

    perm2lehmer [] = []
    perm2lehmer (i:is) = l:(perm2lehmer is) where
      l = genericLength [j | j <- is, j < i]

The function nth2perm provides the matching unranking operation, associating a permutation ps to a given size>0 and a natural number n. It generates the n-th permutation of a given size.
    nth2perm (size,n) = apply_lehmer2perm (zs ++ xs) [0..size-1] where
      xs = fl n
      l = genericLength xs
      k = size-l
      zs = genericReplicate k 0

The following function extracts a permutation from a “digit” list in factoradic representation.

    apply_lehmer2perm [] [] = []
    apply_lehmer2perm (n:ns) ps@(x:xs) = y : (apply_lehmer2perm ns ys) where
      (y,ys) = pick n ps

    pick i xs = (x,ys ++ zs) where
      (ys,(x:zs)) = genericSplitAt i xs

Note also that apply_lehmer2perm is used this time to reconstruct the permutation ps from its Lehmer code, which in turn is computed from the permutation's factoradic representation.

One can try out this bijective mapping as follows:

    nth2perm (5,42)
    [1,4,0,2,3]
    perm2nth [1,4,0,2,3]
    (5,42)
    nth2perm (8,2008)
    [0,3,6,5,4,7,1,2]
    perm2nth [0,3,6,5,4,7,1,2]
    (8,2008)

5.3 A bijective mapping from permutations to natural numbers

Like in the case of BDDs, one more step is needed to extend the mapping between permutations of a given length to a bijective mapping from/to Nat: we will have to “shift towards infinity” the starting point of each new block of permutations in Nat as permutations of larger and larger sizes are enumerated. First, we need to know by how much, so we compute the number of permutations of all smaller sizes, i.e. the sum 0! + 1! + ... + (n-1)! of the first n factorials.

    sf n = rf (genericReplicate n 1)

This works because the digit list [1,1,..,1], read as a factoradic numeral, evaluates to exactly that sum. What we are really interested in is decomposing n into n-m, its distance to the largest such sum of factorials m not exceeding n, and the index k of that sum.

    to_sf n = (k,n-m) where
      k = pred (head [x | x <- [0..], sf x > n])
      m = sf k

Unranking of an arbitrary permutation is now easy: the index k determines the size of the permutation and n-m determines the rank. Together they select the right permutation with nth2perm.
    nat2perm 0 = []
    nat2perm n = nth2perm (to_sf n)

Ranking of a permutation is even easier: we first compute its size and its rank, then we shift the rank by the sum of the factorials below its size, which counts the ranks previously assigned.

    perm2nat ps = (sf l) + k where
      (l,k) = perm2nth ps

It works as follows:

    nat2perm 42
    [0,2,3,1,4]
    perm2nat [0,2,3,1,4]
    42
    nat2perm 2008
    [1,4,3,2,0,5,6]
    perm2nat [1,4,3,2,0,5,6]
    2008

We can now define the Encoder as:

    perm :: Encoder [Nat]
    perm = compose (Iso perm2nat nat2perm) nat

The Encoder works as follows:

    *ISO> as perm nat 2008
    [1,4,3,2,0,5,6]
    *ISO> as nat perm it
    2008
    *ISO> as perm nat 1234567890
    [1,6,11,2,0,3,10,7,8,5,9,4,12]
    *ISO> as nat perm it
    1234567890

5.4 Hereditarily Finite Permutations

By using the generic unrank and rank functions defined in section 4 we can extend the isomorphism defined by nat2perm and perm2nat to encodings of Hereditarily Finite Permutations (HFP).

    nat2hfp = unrank nat2perm
    hfp2nat = rank perm2nat

The encoding works as follows:

    *ISO> nat2hfp 42
    H [H [],H [H [],H [H []]],H [H [H []],H []],
    H [H []],H [H [],H [H []],H [H [],H [H []]]]]
    *ISO> hfp2nat it
    42

We can now define the Encoder as:

    hfp :: Encoder T
    hfp = compose (Iso hfp2nat nat2hfp) nat

The Encoder works as follows:

    *ISO> as hfp nat 42
    H [H [],H [H [],H [H []]],H [H [H []],H []],
    H [H []],H [H [],H [H []],H [H [],H [H []]]]]
    *ISO> as nat hfp it
    42
    *ISO> as hfp nat 2008
    H [H [H []],H [H [],H [H []],H [H [],H [H []]]],H [H [H []],H []],
    H [H [],H [H []]],H [],H [H [],H [H [],H [H []]],H [H []]],
    H [H [H []],H [],H [H [],H [H []]]]]
    *ISO> as nat hfp it
    2008

Fig. 6: 2008 as a HFP

As shown in Fig. 6, an ordered digraph (with labels starting from 0 representing the order of outgoing edges) can be used to represent the unfolding of a natural number to the associated hereditarily finite permutation.
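The definitions of Sections 5.1 to 5.4 can be collected into one self-contained module for experimentation. The sketch below is a compact restatement rather than the paper's code: helper names such as lehmer2perm are illustrative, the intermediate pair returned by perm2nth is inlined, and the tree type carries only the instances needed here.

```haskell
import Data.List (genericLength, genericReplicate, genericSplitAt)

data T = H [T] deriving (Eq, Show)

-- Factoradic digits, least significant first, and the inverse conversion.
fr :: Integer -> [Integer]
fr 0 = [0]
fr n = go 1 n
  where
    go _ 0 = []
    go j k = k `mod` j : go (j + 1) (k `div` j)

rf :: [Integer] -> Integer
rf ns = sum (zipWith (*) ns (scanl (*) 1 [1 ..]))

-- Lehmer code: for each element, how many later elements are smaller.
perm2lehmer :: [Integer] -> [Integer]
perm2lehmer []       = []
perm2lehmer (i : is) = genericLength [j | j <- is, j < i] : perm2lehmer is

-- Rebuild a permutation by picking the n-th element still in the pool.
lehmer2perm :: [Integer] -> [Integer] -> [Integer]
lehmer2perm (n : ns) pool = x : lehmer2perm ns (ys ++ zs)
  where (ys, x : zs) = genericSplitAt n pool
lehmer2perm _ _ = []

-- Number of permutations of size below n: 0! + 1! + ... + (n-1)!.
sf :: Integer -> Integer
sf n = rf (genericReplicate n 1)

nat2perm :: Integer -> [Integer]
nat2perm 0 = []
nat2perm n = lehmer2perm (pad ++ ds) [0 .. size - 1]
  where
    size = pred (head [x | x <- [0 ..], sf x > n])
    ds   = reverse (fr (n - sf size))        -- digits, most significant first
    pad  = genericReplicate (size - genericLength ds) 0

perm2nat :: [Integer] -> Integer
perm2nat ps = sf (genericLength ps) + rf (reverse (perm2lehmer ps))

-- Hereditarily finite permutations: apply the encoding recursively.
nat2hfp :: Integer -> T
nat2hfp n = H (map nat2hfp (nat2perm n))

hfp2nat :: T -> Integer
hfp2nat (H ts) = perm2nat (map hfp2nat ts)
```

With these definitions nat2perm 42 evaluates to [0,2,3,1,4] and nat2perm 2008 to [1,4,3,2,0,5,6], matching the sessions above. The recursion in nat2hfp terminates because every element of nat2perm n is smaller than n.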
An interesting property of graphs associated to hereditarily finite permutations is that moving from a number n to its successor typically only induces a reordering of the labeled edges, as shown in Fig. 7.

Fig. 7: 2009 as a HFP

6 Hereditary base-k representations and Goodstein sequences

Definition 1. The hereditary base-k representation of a number x is obtained by representing x as a sum of powers of k, followed by expressing each of the exponents with nonzero coefficients as a sum of powers of k, recursively.

First we express a single step of this transformation to/from a polynomial in base k as a pair of bijections:

    nat2kpoly k n = filter (\p -> 0 /= fst p) ps where
      ns = to_base k n
      l = genericLength ns
      is = [0..l-1]
      ps = zip ns is

    kpoly2nat k ps = sum (map (\(d,e) -> d*k^e) ps)

The transformation works as follows:

    *ISO> nat2kpoly 3 2009
    [(2,0),(1,2),(2,3),(2,5),(2,6)]
    *ISO> kpoly2nat 3 it
    2009

The recursive process generates a tree, with coefficients of each expansion labeling nodes. We can host this expansion in the data type HB:

    data HB a = HB a [HB a] deriving (Eq,Ord,Show,Read)

We will define, for each base k, two isomorphisms nat2hb k and hb2nat k between natural numbers and polynomials:

    nat2hb :: Nat -> Nat -> [HB Nat]
    nat2hb _k 0 = []
    nat2hb k n | n < k = [HB n []]
    nat2hb k n = gs where
      ps' = nat2kpoly k n
      gs = map (nat2hb1 k) ps'

    nat2hb1 k (d,e) = HB d (nat2hb k e)

    hb2nat :: Nat -> [HB Nat] -> Nat
    hb2nat k [] = 0
    hb2nat k ts = kpoly2nat k ps where
      ps = map (hb2nat1 k) ts

    hb2nat1 k (HB d ts) = (d,hb2nat k ts)

We can now define a family of Encoders, one for each base k, as follows:

    hb :: Nat -> Encoder [HB Nat]
    hb k = compose (Iso (hb2nat k) (nat2hb k)) nat

The new concept here is working with a parametric family of Encoders.
With a small adaptation, the syntax of the as combinator scales up naturally:

    *ISO> as (hb 3) nat 42
    [HB 2 [HB 1 []],HB 1 [HB 2 []],HB 1 [HB 1 [HB 1 []]]]
    *ISO> as nat (hb 3) it
    42

Note that the base does not occur as such in the hereditary base-k expression obtained with the Encoder hb. This property can be used to obtain Goodstein sequences by bumping the base from k to k+1, i.e. interpreting a (hb k) expression as a (hb (k+1)) expression and then subtracting 1 from the result:

    goodsteinStep k n = (hb2nat (k+1) (nat2hb k n)) - 1

    goodsteinSeq _ 0 = []
    goodsteinSeq k n = n:(goodsteinSeq (k+1) m) where
      m = goodsteinStep k n

    goodstein m = goodsteinSeq 2 m

    *ISO> goodstein 3
    [3,3,3,2,1]
    *ISO> take 12 (goodstein 4)
    [4,26,41,60,83,109,139,173,211,253,299,348]

Goodstein's Theorem (provable in second order arithmetic) states that this sequence always terminates at 0. The remarkable thing about it is that it is an undecidable statement in first order Peano arithmetic that, in contrast to Gödel's theorem, involves only “conventional” numerical relations.

7 Pairing/Unpairing

A pairing function is an isomorphism $f : Nat \times Nat \to Nat$. Its inverse is called unpairing.

7.1 The Pepis-Kalmar-Robinson Pairing Function

A classic pairing function is pepis_J, together with its left and right unpairing companions pepis_K and pepis_L, which have been used by Pepis, Kalmar and Robinson, together with Cantor's functions, in some fundamental work on recursion theory, decidability and Hilbert's Tenth Problem [16–24].
The function pepis_J combines two numbers reversibly by multiplying a power of 2 derived from the first and an odd number derived from the second:

$f(x, y) = 2^x (2y + 1) - 1$    (6)

Its Haskell implementation, together with its inverse, is:

    pepis_J x y = pred ((2^x)*(succ (2*y)))

    pepis_K n = two_s (succ n)

    pepis_L n = (pred (no_two_s (succ n))) `div` 2

    two_s n | even n = succ (two_s (n `div` 2))
    two_s _ = 0

    no_two_s n = n `div` (2^(two_s n))

This pairing function (growing more slowly in its second argument) works as follows:

    pepis_J 1 10
    41
    pepis_J 10 1
    3071
    [pepis_J i j | i <- [0..3], j <- [0..3]]
    [0,2,4,6,1,5,9,13,3,11,19,27,7,23,39,55]

As Haskell provides a built-in ordered pair, it is convenient to regroup the functions J, K, L (given in Julia Robinson's original notation) as mappings to/from built-in ordered pairs:

    pepis_pair (x,y) = pepis_J x y
    pepis_unpair n = (pepis_K n,pepis_L n)

Observing that pepis_K n equals the number of 0s in front of the representation of succ n as a bit sequence, i.e. the first element of its representation as a sequence, an alternative implementation could be:

    pepis_pair' (x,y) = (fun2nat (x:(nat2fun y)))-1

    pepis_unpair' n = (x,fun2nat ns) where
      (x:ns) = nat2fun (n+1)

    fun2nat = set2nat . fun2set
    nat2fun = set2fun . nat2set

Note also that pepis_unpair is “asymmetrical” in the sense that its first component grows much slower than the second, when applied to [0..]. Sometimes it is more useful to have the opposite behavior:

    rpepis_pair (x,y) = pepis_pair (y,x)
    rpepis_unpair n = (y,x) where
      (x,y) = pepis_unpair n

After defining

    type Nat2 = (Nat,Nat)

we obtain the encoders

    pnat2 :: Encoder Nat2
    pnat2 = compose (Iso pepis_pair pepis_unpair) nat

    rpnat2 :: Encoder Nat2
    rpnat2 = compose (Iso rpepis_pair rpepis_unpair) nat

7.2 A Bitwise Pairing/Unpairing Function

We will now introduce an unusually simple pairing function (also mentioned in [25], p.142).
The function bitunpair works by splitting a number's big endian bitstring representation into odd and even bits, while its inverse bitpair blends the odd and even bits back together.

    bitpair :: Nat2 -> Nat
    bitpair (i,j) = set2nat ((evens i) ++ (odds j)) where
      evens x = map (2*) (nat2set x)
      odds y = map succ (evens y)

    bitunpair :: Nat -> Nat2
    bitunpair n = (f xs,f ys) where
      (xs,ys) = partition even (nat2set n)
      f = set2nat . (map (`div` 2))

The transformation of the bitlists is shown in the following example with bitstrings aligned:

    *ISO> bitunpair 2008
    (60,26)
    -- 2008:[0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1]
    --   60:[0,    0,    1,    1,    1,    1]
    --   26:[   0,    1,    0,    1,    1   ]

We can derive the following Encoder:

    nat2 :: Encoder Nat2
    nat2 = compose (Iso bitpair bitunpair) nat

working as follows:

    *ISO> as nat2 nat 2008
    (60,26)
    *ISO> as nat nat2 (60,26)
    2008

In a way similar to hereditarily finite trees generated by unfoldings, one can apply strictly decreasing (except, typically, for 0 and 1) unpairing functions recursively. Figures 8 and 9 show the directed graphs describing recursive application of pepis_unpair and bitunpair, respectively.

Fig. 8: Graph obtained by recursive application of pepis_unpair for 2008

Fig. 9: Graph obtained by recursive application of bitunpair for 2008

Given that unpairing functions are bijections from Nat to Nat × Nat, they will progressively cover all points having natural number coordinates in their range in the plane. Figures 10 and 11 show the curves generated by bitunpair and pepis_unpair. Fig. 12 shows the action of the pairing function bitpair on its two arguments in [0..63].

Fig. 10: 2D curve connecting values of bitunpair n for $n \in [0..2^{10}-1]$

Fig. 11: 2D curve connecting values of pepis_unpair n for $n \in [0..2^{10}-1]$

Fig. 12: Values of bitpair x y with x,y in [0..63]

7.3 Encoding Unordered Pairs

To derive an encoding of unordered pairs, i.e. 2 element sets, one can combine pairing/unpairing with conversion between sequences and sets:

    pair2unord_pair (x,y) = fun2set [x,y]
    unord_pair2pair [a,b] = (x,y) where
      [x,y] = set2fun [a,b]

    unord_unpair = pair2unord_pair . bitunpair
    unord_pair = bitpair . unord_pair2pair

We can derive the following equivalent Encoders:

    set2 :: Encoder [Nat]
    set2 = compose (Iso unord_pair2pair pair2unord_pair) nat2

that goes through nat2, working as follows:

    *ISO> as set2 nat 2008
    [60,87]
    *ISO> as nat set2 it
    2008

and

    set2' :: Encoder [Nat]
    set2' = compose (Iso unord_pair unord_unpair) nat

that goes through nat, working as follows:

    *ISO> as set2' nat 2008
    [60,87]
    *ISO> as nat set2' [60,87]
    2008
    *ISO> as nat set2' [87,60]
    2008

7.4 Encoding Multiset Pairs

To derive an encoding of 2 element multisets, one can combine pairing/unpairing with conversion between sequences and multisets:

    pair2mset_pair (x,y) = (a,b) where
      [a,b] = fun2mset [x,y]
    mset_unpair2pair (a,b) = (x,y) where
      [x,y] = mset2fun [a,b]

    mset_unpair = pair2mset_pair . bitunpair
    mset_pair = bitpair . mset_unpair2pair

We can derive the following Encoder:

    mset2 :: Encoder Nat2
    mset2 = compose (Iso mset_unpair2pair pair2mset_pair) nat2

working as follows:

    *ISO> as mset2 nat 2008
    (60,86)
    *ISO> as nat mset2 it
    2008

Figure 13 shows the curve generated by mset_unpair covering the lattice of points in its range.

Fig. 13: 2D curve connecting values of mset_unpair n for $n \in [0..2^{10}-1]$

7.5 Extending Pairing/Unpairing to Signed Integers

Given the bijection from nat to z one can easily extend pairing/unpairing operations to signed integers. We obtain the Encoder:

    type Z2 = (Z,Z)

    z2 :: Encoder Z2
    z2 = compose (Iso zpair zunpair) nat

    zpair (x,y) = (nat2z . bitpair) (z2nat x,z2nat y)
    zunpair z = (nat2z n,nat2z m) where
      (n,m) = (bitunpair . z2nat) z

working as follows:

    *ISO> map zunpair [-5..5]
    [(-1,1),(-2,-1),(-2,0),(-1,-1),(-1,0),(0,0),(0,-1),(1,0),(1,-1),(0,1),(0,-2)]
    *ISO> map zpair it
    [-5,-4,-3,-2,-1,0,1,2,3,4,5]
    *ISO> as z2 z (-2008)
    (63,-26)
    *ISO> as z z2 it
    -2008

Figure 14 shows the curve covering the lattice of integer coordinates generated by the function zunpair.

Fig. 14: Curve generated by unpairing function on signed integers

The same construction can be extended to multiset pairing functions:

    mz2 :: Encoder Z2
    mz2 = compose (Iso mzpair mzunpair) nat

    mzpair (x,y) = (nat2z . mset_pair) (z2nat x,z2nat y)
    mzunpair z = (nat2z n,nat2z m) where
      (n,m) = (mset_unpair . z2nat) z

working as follows:

    *ISO> as mz2 z (-42)
    (1,-8)
    *ISO> as z mz2 it
    -42

7.6 Gauss Integers and Pairing Functions

Visualizing complex variable functions requires 4 dimensions even for 1-variable functions. This is usually handled by associating a color/hue value to the phase while representing the modulus along the z-axis. However, for 2-argument complex functions as simple as the sum, the difference and the product, 6 dimensions would be needed. Let us start shapeshifting operations on Gauss Integers (pairs of integers with a real and imaginary part) in combination with a mapping to ordinary integers using the (commutative!) multiset pairing/unpairing isomorphism provided by the Encoder mz2. Writing (a,b) for a + bi and (c,d) for c + di, the sum is (a+c,b+d), the difference is (a-c,b-d) and the product is (a*c-b*d,b*c+a*d):

    gauss_sum (ab,cd) = mzpair (a+c,b+d) where
      (a,b) = mzunpair ab
      (c,d) = mzunpair cd

    gauss_dif (ab,cd) = mzpair (a-c,b-d) where
      (a,b) = mzunpair ab
      (c,d) = mzunpair cd

    gauss_prod (ab,cd) = mzpair (a*c-b*d,b*c+a*d) where
      (a,b) = mzunpair ab
      (c,d) = mzunpair cd

Clearly one can now fit these operations in 3 dimensions, as shown in Figures 15, 16 and 17, visualizing sums, differences and products of Gauss Integers obtained by unpairing integers in $[-2^4..2^4-1]$.
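The bitwise pairing of Section 7.2 and its signed extension of Section 7.5 can also be written without going through sets. The sketch below is an alternative formulation, not the paper's definition: it de-interleaves bits by direct recursion on the number, which should agree with the nat2set-based version on all non-negative inputs, and it copies z2nat and nat2z from Section 3.7.

```haskell
-- Split n by bit position: even positions feed the first component,
-- odd positions the second (position 0 is the least significant bit).
bitunpair :: Integer -> (Integer, Integer)
bitunpair 0 = (0, 0)
bitunpair n = (n `mod` 2 + 2 * y, x)
  where (x, y) = bitunpair (n `div` 2)

-- Interleave the bits of x and y again; the two arguments swap roles
-- each time one bit is consumed.
bitpair :: (Integer, Integer) -> Integer
bitpair (0, 0) = 0
bitpair (x, y) = x `mod` 2 + 2 * bitpair (y, x `div` 2)

-- Signed integers <-> naturals, as in Section 3.7.
z2nat :: Integer -> Integer
z2nat n = if n < 0 then -2 * n - 1 else 2 * n

nat2z :: Integer -> Integer
nat2z n = if even n then n `div` 2 else (-n - 1) `div` 2

-- Pairing and unpairing on signed integers.
zpair :: (Integer, Integer) -> Integer
zpair (x, y) = nat2z (bitpair (z2nat x, z2nat y))

zunpair :: Integer -> (Integer, Integer)
zunpair z = (nat2z n, nat2z m)
  where (n, m) = bitunpair (z2nat z)
```

Under this formulation bitunpair 2008 gives (60,26) and map zunpair [-5 .. 5] reproduces the list shown in Section 7.5. The recursive form makes it easy to see why both directions terminate: each step halves at least one of the numbers involved.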
7.7 Some algebraic properties of pairing functions

The following propositions state some simple algebraic identities between pairing operations acting on ordered, unordered and multiset pairs.

Fig. 15: Sums of Gauss Integers visualized with Pairing functions

Fig. 16: Differences of Gauss Integers visualized with Pairing functions

Fig. 17: Products of Gauss Integers visualized with Pairing functions

Proposition 8. Given the function definitions:

    bitlift x = bitpair (x,0)
    bitlift' = (from_base 4) . (to_base 2)

    bitclip = fst . bitunpair
    bitclip' = (from_base 2) . (map (`div` 2)) . (to_base 4) . (*2)

    bitpair' (x,y) = (bitpair (x,0)) + (bitpair (0,y))
    xbitpair (x,y) = (bitpair (x,0)) `xor` (bitpair (0,y))
    obitpair (x,y) = (bitpair (x,0)) .|. (bitpair (0,y))

    pair_product (x,y) = a+b where
      x' = bitpair (x,0)
      y' = bitpair (0,y)
      ab = x'*y'
      (a,b) = bitunpair ab

the following identities hold:

$\mathit{bitlift} \equiv \mathit{bitlift}'$    (7)
$\mathit{bitclip} \equiv \mathit{bitclip}'$    (8)
$\mathit{bitclip} \circ \mathit{bitlift} \equiv \mathit{id}$    (9)
$\mathit{bitpair}(0, n) \equiv 2 * \mathit{bitpair}(n, 0)$    (10)
$\mathit{bitpair}(0, n) \equiv 2 * (\mathit{bitlift}\ n)$    (11)
$\mathit{bitpair}(n, n) \equiv 3 * (\mathit{bitlift}\ n)$    (12)
$\mathit{bitpair}(2^n, 0) \equiv (2^n)^2$    (13)
$\mathit{bitpair}(2^{2^n} + 1, 0) \equiv 2^{2^{n+1}} + 1$    (14)
$\mathit{bitpair}' \equiv \mathit{bitpair} \equiv \mathit{xbitpair} \equiv \mathit{obitpair}$    (15)
$\mathit{bitpair}(x, y) \equiv (\mathit{bitlift}\ x) + 2 * (\mathit{bitlift}\ y)$    (16)
$\mathit{pair\_product} \equiv *$    (17)

Proposition 9. Given the function definitions

    bitpair'' (x,y) = mset_pair (min x y,x+y)
    bitpair''' (x,y) = unord_pair [min x y,x+y+1]

    mset_pair' (a,b) = bitpair (min a b, (max a b) - (min a b))
    mset_pair'' (a,b) = unord_pair [min a b, (max a b)+1]

    unord_pair' [a,b] = bitpair (min a b, (max a b) - (min a b) - 1)
    unord_pair'' [a,b] = mset_pair (min a b, (max a b)-1)

the following identities hold:

$\mathit{bitpair} \equiv \mathit{bitpair}'' \equiv \mathit{bitpair}'''$    (18)
$\mathit{mset\_pair} \equiv \mathit{mset\_pair}' \equiv \mathit{mset\_pair}''$    (19)
$\mathit{unord\_pair} \equiv \mathit{unord\_pair}' \equiv \mathit{unord\_pair}''$    (20)

8 Cons-Lists with Pairing/Unpairing

The simplest application of pairing/unpairing operations is encoding of cons-lists of natural numbers, defined as the data type:

    data CList = Atom Nat | Cons CList CList
      deriving (Eq,Ord,Show,Read)

First, to provide an infinite supply of atoms, we encode them as even numbers:

    to_atom n = 2*n
    from_atom a | is_atom a = a `div` 2
    is_atom n = even n && n >= 0

Next, as we want atoms and cons cells to be disjoint, we will encode the latter as odd numbers:

    is_cons n = odd n && n > 0
    decons z | is_cons z = pepis_unpair ((z-1) `div` 2)
    cons x y = 2*(pepis_pair (x,y)) + 1

We can deconstruct a natural number by recursing over applications of the unpairing-based decons combinator:

    nat2cons n | is_atom n = Atom (from_atom n)
    nat2cons n | is_cons n = Cons (nat2cons hd) (nat2cons tl) where
      (hd,tl) = decons n

We can reverse this process by recursing with the cons combinator on the CList data type:

    cons2nat (Atom a) = to_atom a
    cons2nat (Cons h t) = cons (cons2nat h) (cons2nat t)

The following example shows both transformations as inverses.

    *ISO> cons2nat (Cons (Atom 0) (Cons (Atom 1) (Cons (Atom 2) (Atom 3))))
    26589
    *ISO> nat2cons 26589
    Cons (Atom 0) (Cons (Atom 1) (Cons (Atom 2) (Atom 3)))

We obtain the Encoder:

    clist :: Encoder CList
    clist = compose (Iso cons2nat nat2cons) nat

The Encoder works as follows:

    *ISO> as clist nat 101
    Cons (Atom 0) (Cons (Atom 0) (Atom 3))

and can be used to generate random LISP-like data and code skeletons from natural numbers.

9 Revisiting Multiset Encodings

We will now use pairing/unpairing functions, in combination with mappings to sequences and sets, to design an efficient encoding of multisets. The function fmset2nat starts by grouping the elements of a multiset. The lengths of the groups (decremented by 1), as well as an element of each, are then collected in 2 lists. Then the second list is morphed from a set to a sequence, as this provides a more compact representation without changing the length of the list. The first list, seen as a sequence, is then paired element by element with the second list.
Finally, the resulting numbers, seen as a sequence, are then fused together.

    fmset2nat pairingf ms = m where
      mss = group (sort ms)
      xs = map (pred . genericLength) mss
      zs = map head mss
      ys = set2fun zs
      ps = zip xs ys
      ns = map pairingf ps
      m = fun2nat ns

The function fnat2mset reverses the process step by step:

    fnat2mset unpairingf m = rs where
      ns = nat2fun m
      ps = map unpairingf ns
      (xs,ys) = unzip ps
      xs' = map succ xs
      zs = fun2set ys
      f k x = genericTake k (repeat x)
      rs = concat (zipWith f xs' zs)

After instantiating these generic functions to interesting pairing/unpairing functions

    bmset2nat = fmset2nat bitpair
    nat2bmset = fnat2mset bitunpair

    bmset2nat' = fmset2nat pepis_pair
    nat2bmset' = fnat2mset pepis_unpair

we obtain the Encoders:

    bmset :: Encoder [Nat]
    bmset = compose (Iso bmset2nat nat2bmset) nat

    bmset' :: Encoder [Nat]
    bmset' = compose (Iso bmset2nat' nat2bmset') nat

working as follows:

    *ISO> as bmset nat 2008
    [1,1,2,3,3,4,5,6,7]
    *ISO> as nat bmset it
    2008
    *ISO> map (as bmset nat) [0..7]
    [[],[0],[0,0],[0,1],[1],[0,1,1],[0,0,1],[0,1,2]]
    *ISO> as bmset' nat 2008
    [0,0,0,1,2,2,3,4,5,6]

Note that, in contrast to the intractable prime number based multiset encoding pmset, this time we obtain an encoding, linear in the bitsize of the natural numbers involved, as in the case of mset. Note also that the construction is generic in the sense that it works with any pairing/unpairing function.
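The grouping and pairing steps above can be sketched in a self-contained form. The helpers nat2set, set2nat, set2fun, fun2set, bitpair and bitunpair are defined in earlier sections that are not reproduced here, so the versions below are assumptions, reconstructed to agree with the sessions shown above: Nat is modelled as Integer, a sorted set is turned into a sequence by taking the gaps between consecutive elements, and bitpair interleaves the bits of its two arguments.

```haskell
import Data.List (group, sort, partition)

type Nat = Integer  -- assumption: Nat is modelled as Integer

-- Positions of the 1 bits of n, in increasing order (assumed set encoding).
nat2set :: Nat -> [Nat]
nat2set n = [i | (i, b) <- zip [0 ..] (bits n), b == 1]
  where
    bits :: Nat -> [Nat]
    bits 0 = []
    bits k = k `mod` 2 : bits (k `div` 2)

-- Inverse of nat2set; expects a list without duplicates.
set2nat :: [Nat] -> Nat
set2nat = sum . map (2 ^)

-- Sorted set <-> sequence, via gaps between consecutive elements (assumed).
set2fun :: [Nat] -> [Nat]
set2fun xs = zipWith (\a b -> a - b - 1) xs ((-1) : xs)

fun2set :: [Nat] -> [Nat]
fun2set = map pred . tail . scanl (+) 0 . map succ

nat2fun :: Nat -> [Nat]
nat2fun = set2fun . nat2set

fun2nat :: [Nat] -> Nat
fun2nat = set2nat . fun2set

-- Bit interleaving: bits of x go to even positions, bits of y to odd ones.
bitpair :: (Nat, Nat) -> Nat
bitpair (x, y) =
  set2nat (map (2 *) (nat2set x) ++ map (succ . (2 *)) (nat2set y))

bitunpair :: Nat -> (Nat, Nat)
bitunpair n = (half evens, half odds)
  where
    (evens, odds) = partition even (nat2set n)
    half = set2nat . map (`div` 2)

-- Multiset -> Nat: pair (multiplicity - 1) with the gap-encoded element.
fmset2nat :: ((Nat, Nat) -> Nat) -> [Nat] -> Nat
fmset2nat pairingf ms = fun2nat (map pairingf (zip xs ys))
  where
    mss = group (sort ms)
    xs = map (fromIntegral . pred . length) mss
    ys = set2fun (map head mss)

-- Nat -> multiset: undo the steps of fmset2nat in reverse order.
fnat2mset :: (Nat -> (Nat, Nat)) -> Nat -> [Nat]
fnat2mset unpairingf m = concat (zipWith rep xs zs)
  where
    (xs, ys) = unzip (map unpairingf (nat2fun m))
    zs = fun2set ys
    rep k = replicate (fromIntegral k + 1)

bmset2nat :: [Nat] -> Nat
bmset2nat = fmset2nat bitpair

nat2bmset :: Nat -> [Nat]
nat2bmset = fnat2mset bitunpair
```

Loaded in GHCi, nat2bmset 2008 evaluates to [1,1,2,3,3,4,5,6,7] and bmset2nat maps that list back to 2008, in agreement with the session above.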
As in the case of the mset and pmset multiset encodings, we can extend these encodings to a hylomorphism hfbm:

    nat_bmset = Iso nat2bmset bmset2nat

    hfbm :: Encoder T
    hfbm = compose (hylo nat_bmset) nat

    nat_bmset' = Iso nat2bmset' bmset2nat'

    hfbm' :: Encoder T
    hfbm' = compose (hylo nat_bmset') nat

working as follows:

    *ISO> as hfbm nat 42
    H [H [],H [],H [H []],H [H []],H [H [],H []],H [H [],H []]]
    *ISO> as nat hfbm it
    42
    *ISO> as hfbm' nat 2008
    H [H [],H [],H [],H [H []],H [H [],H []],
       H [H [],H []],H [H [],H [H []]],H [H [H []]],
       H [H [],H [H []],H [H []]],H [H [],H [],H [H []]]]
    *ISO> as nat hfbm' it
    2008

10 Pairing Functions and Encodings of Binary Decision Diagrams

As a variation on the theme of pairing/unpairing functions, we will show in this section that a Binary Decision Diagram (BDD) representing the same logic function as an $n$-variable $2^n$-bit truth table can be obtained by applying bitunpair recursively to tt. More precisely, we will show that applying this unfolding operation results in a complete binary tree of depth $n$ representing a BDD that returns tt when evaluated applying its boolean operations.

The binary tree type BT has the constants B0 and B1 as leaves representing the boolean values 0 and 1. Internal nodes (that will represent if-then-else decision points), will be marked with the constructor D. We will also add integers to represent logic variables, ordered identically in each branch, as first arguments of D. The two other arguments will be subtrees that represent THEN and ELSE branches:

    data BT a = B0 | B1 | D a (BT a) (BT a) deriving (Eq,Ord,Read,Show)

The constructor BDD wraps together a tree of type BT and the number of logic variables occurring in it.
    data BDD a = BDD a (BT a) deriving (Eq,Ord,Read,Show)

10.1 Unfolding natural numbers to binary trees

The following functions apply bitunpair recursively, on a Natural Number tt, seen as an $n$-variable $2^n$-bit truth table, to build a complete binary tree of depth $n$, that we will represent using the BDD data type.

    unfold_bdd :: Nat2 -> BDD Nat
    unfold_bdd (n,tt) = BDD n bt where
      bt = if tt<max then split_with bitunpair n tt
           else error ("unfold_bdd: last arg " ++ (show tt) ++
                       " should be < " ++ (show max))
        where max = 2^2^n

    split_with _ n 0 | n<1 = B0
    split_with _ n 1 | n<1 = B1
    split_with f n tt = D k (split_with f k tt1) (split_with f k tt2) where
      k = pred n
      (tt1,tt2) = f tt

The following examples show results returned by unfold_bdd for the $2^{2^n}$ truth tables associated to $n$ variables, for $n = 2$:

    BDD 2 (D 1 (D 0 B0 B0) (D 0 B0 B0))
    BDD 2 (D 1 (D 0 B1 B0) (D 0 B0 B0))
    BDD 2 (D 1 (D 0 B0 B0) (D 0 B1 B0))
    ...
    BDD 2 (D 1 (D 0 B1 B1) (D 0 B1 B1))

Note that no boolean operations have been performed so far and that we still have to prove that such trees actually represent BDDs associated to truth tables.

10.2 Folding binary trees to natural numbers

One can "evaluate back" the binary tree of data type BDD, by using the pairing function bitpair. The inverse of unfold_bdd is implemented as follows:

    fold_bdd :: BDD Nat -> Nat2
    fold_bdd (BDD n bt) = (n,fuse_with bitpair bt) where
      fuse_with rf B0 = 0
      fuse_with rf B1 = 1
      fuse_with rf (D _ l r) = rf (fuse_with rf l,fuse_with rf r)

Note that this is a purely structural operation and that integers in first argument position of the constructor D are actually ignored. The two bijections work as follows:

    *ISO> unfold_bdd (3,42)
    BDD 3 (D 2 (D 1 (D 0 B0 B0) (D 0 B0 B0)) (D 1 (D 0 B1 B1) (D 0 B1 B0)))
    *ISO> fold_bdd it
    (3,42)

10.3 Boolean Evaluation of BDDs

Practical uses of BDDs involve reducing them by sharing nodes and eliminating identical branches [26].
Note that in this case bdd2nat might give a different result as it computes different pairing operations. Fortunately, we can try to fold the binary tree back to a natural number by evaluating it as a boolean function. The function eval_bdd describes the BDD evaluator:

    eval_bdd (BDD n bt) = eval_with_mask (bigone n) n bt

    eval_with_mask m _ B0 = 0
    eval_with_mask m _ B1 = m
    eval_with_mask m n (D x l r) =
      ite_ (var_mn m n x) (eval_with_mask m n l) (eval_with_mask m n r)

    var_mn mask n k = mask `div` (2^(2^(n-k-1))+1)
    bigone nvars = 2^2^nvars - 1

The projection functions var_mn can be combined with the usual bitwise integer operators, to obtain new bitstring truth tables, encoding all possible value combinations of their arguments, as shown in [27]. Note that the constant 0 evaluates to 0 while the constant 1 is evaluated as $2^{2^n}-1$ by the function bigone. The function ite_ used in eval_with_mask implements the boolean function if x then t else e using arbitrary length bitvector operations:

    ite_ x t e = ((t `xor` e).&.x) `xor` e

As the following example shows, it turns out that boolean evaluation eval_bdd faithfully emulates fold_bdd!

    *ISO> unfold_bdd (3,42)
    BDD 3 (D 2 (D 1 (D 0 B0 B0) (D 0 B0 B0)) (D 1 (D 0 B1 B1) (D 0 B1 B0)))
    *ISO> eval_bdd it
    42

10.4 The Equivalence

We will now state the surprising (and new!) result that boolean evaluation and structural transformation with repeated application of pairing produce the same result:

Proposition 10 The complete binary tree of depth $n$, obtained by recursive applications of bitunpair on a truth table tt computes an (unreduced) BDD, that, when evaluated, returns the truth table, i.e.

\begin{align}
\mathit{fold\_bdd} \circ \mathit{unfold\_bdd} &\equiv id \tag{21}\\
\mathit{eval\_bdd} \circ \mathit{unfold\_bdd} &\equiv id \tag{22}
\end{align}

Proof. The function unfold_bdd builds a binary tree by splitting the bitstring $tt \in [0..2^n-1]$ up to depth $n$.
Observe that this corresponds to the Shannon expansion [28] of the formula associated to the truth table, using variable order $[n-1,...,0]$. Observe that the effect of bitunpair is the same as

- the effect of var_mn m n (n-1) acting as a mask selecting the left branch, and
- the effect of its complement, acting as a mask selecting the right branch.

Given that $2^n$ is the double of $2^{n-1}$, the same invariant holds at each step, as the bitstring length of the truth table reduces to half.

We can thus assume from now on, that the BDD data type defined in section 10 actually represents BDDs mapped one-to-one to truth tables given as natural numbers. An interesting application of this result would be to investigate practical uses of bitpair/bitunpair operations in actual circuit design.

11 Ranking and Unranking of BDDs

One more step is needed to extend the mapping between BDDs with $n$ variables to a bijective mapping from/to Nat: we will have to "shift towards infinity" the starting point of each new block (footnote 3) of BDDs in Nat as BDDs of larger and larger sizes are enumerated. First, we need to know by how much, so we will count the number of boolean functions with up to $n$ variables.

    bsum 0 = 0
    bsum n | n>0 = bsum1 (n-1)

    bsum1 0 = 2
    bsum1 n | n>0 = bsum1 (n-1) + 2^2^n

The stream of all such sums can now be generated as usual (footnote 4):

    bsums = map bsum [0..]

    *ISO> genericTake 7 bsums
    [0,2,6,22,278,65814,4295033110]

What we are really interested in is decomposing n into the distance n-m to the last value m = bsum k that does not exceed n, and the index k that generates that sum.
    to_bsum n = (k,n-m) where
      k = pred (head [x | x <- [0..], bsum x > n])
      m = bsum k

Footnote 3: defined by the same number of variables.

Footnote 4: bsums is sequence A060803 in The On-Line Encyclopedia of Integer Sequences, http://www.research.att.com/~njas/sequences

Unranking of an arbitrary BDD is now easy - the index k determines the number of variables and n-m determines the rank. Together they select the right BDD with unfold_bdd and bdd.

    nat2bdd n = unfold_bdd (k,n_m) where (k,n_m) = to_bsum n

Ranking of a BDD is even easier: we shift its rank within the set of BDDs with nv variables, by the value (bsum nv) that counts the ranks previously assigned.

    bdd2nat bdd@(BDD nv _) = (bsum nv)+tt where
      (_,tt) = fold_bdd bdd

As the following example shows, bdd2nat implements the inverse of nat2bdd.

    *ISO> nat2bdd 42
    BDD 3 (D 2 (D 1 (D 0 B0 B1) (D 0 B1 B0)) (D 1 (D 0 B0 B0) (D 0 B0 B0)))
    *ISO> bdd2nat it
    42

This provides the Encoder:

    pbdd :: Encoder (BDD Nat)
    pbdd = compose (Iso bdd2nat nat2bdd) nat

working as follows:

    *ISO> as pbdd nat 2008
    BDD 4 (D 3 (D 2 B0 (D 1 (D 0 B0 B1) B1)) (D 2 (D 1 (D 0 B1 B1) B0) (D 1 B0 B1)))
    *ISO> as nat pbdd it
    2008

We can now repeat the ranking function construction for eval_bdd:

    ev_bdd2nat bdd@(BDD nv _) = (bsum nv)+(eval_bdd bdd)

We can confirm that ev_bdd2nat also acts as an inverse to nat2bdd:

    *ISO> ev_bdd2nat (nat2bdd 2008)
    2008

We obtain the Encoder:

    bdd :: Encoder (BDD Nat)
    bdd = compose (Iso ev_bdd2nat nat2bdd) nat

working as follows:

    *ISO> as bdd nat 2008
    BDD 4 (D 3 (D 2 (D 1 (D 0 B0 B0) (D 0 B0 B0)) (D 1 (D 0 B0 B1) (D 0 B1 B0)))
               (D 2 (D 1 (D 0 B1 B1) (D 0 B0 B0)) (D 1 (D 0 B0 B0) (D 0 B1 B0))))
    *ISO> as nat bdd it
    2008

This result can be seen as an intriguing isomorphism between boolean, arithmetic and symbolic computations.

11.1 Reducing the BDDs

We will sketch here a simplified reduction mechanism for BDDs eliminating identical branches.
As nodes of a BDD are mapped bijectively to unique natural numbers we will omit the (trivial) implementation of node sharing, with the implicit assumption that subtrees having the same encoding are shared. The function bdd_reduce reduces a BDD by collapsing identical left and right subtrees, and the function bdd associates this reduced form to $n \in Nat$.

    bdd_reduce (BDD n bt) = BDD n (reduce bt) where
      reduce B0 = B0
      reduce B1 = B1
      reduce (D _ l r) | l == r = reduce l
      reduce (D v l r) = D v (reduce l) (reduce r)

    unfold_rbdd = bdd_reduce . unfold_bdd

The results returned by unfold_rbdd for n=2 are:

    BDD 2 (C 0)
    BDD 2 (D 1 (D 0 (C 1) (C 0)) (C 0))
    BDD 2 (D 1 (C 0) (D 0 (C 1) (C 0)))
    BDD 2 (D 0 (C 1) (C 0))
    ...
    BDD 2 (D 1 (D 0 (C 0) (C 1)) (C 1))
    BDD 2 (C 1)

We can now define the unranking operation on reduced BDDs

    nat2rbdd = bdd_reduce . nat2bdd

and obtain the Encoder

    rbdd :: Encoder (BDD Nat)
    rbdd = compose (Iso ev_bdd2nat nat2rbdd) nat

working as follows

    *ISO> as rbdd nat 2008
    BDD 4 (D 3 (D 2 B0 (D 1 (D 0 B0 B1) (D 0 B1 B0))) (D 2 (D 1 B1 B0) (D 1 B0 (D 0 B1 B0))))
    *ISO> as nat rbdd it
    2008

To be able to compare its space complexity with other representations we will define a size operation on a BDD as follows:

    bdd_size (BDD _ t) = 1+(size t) where
      size B0 = 1
      size B1 = 1
      size (D _ l r) = 1+(size l)+(size r)

This measures the size of the BDD or reduced BDD as an expression tree. To take into account sharing (as present in a standard ROBDD implementation) one can simply eliminate duplicated subtrees:

    robdd_size (BDD _ t) = 1+(rsize t) where
      rsize = genericLength . nub . rbdd_nodes
      rbdd_nodes B0 = [B0]
      rbdd_nodes B1 = [B1]
      rbdd_nodes (D v l r) = [(D v l r)] ++ (rbdd_nodes l) ++ (rbdd_nodes r)

12 Generalizing BDD ranking/unranking functions

12.1 Encoding BDDs with Arbitrary Variable Order

While the encoding built around the equivalence described in Prop.
10 between bitwise pairing/unpairing operations and boolean decomposition is arguably as simple and elegant as possible, it is useful to parametrize BDD generation with respect to an arbitrary variable order. This is of particular importance when using BDDs for circuit minimization, as different variable orders can make circuit sizes flip from linear to exponential in the number of variables [26].

Given a permutation of $n$ variables represented as natural numbers in $[0..n-1]$ and a truth table $tt \in [0..2^{2^n}-1]$ we can define:

    to_bdd vs tt | 0<=tt && tt<=m = BDD n (to_bdd_mn vs tt m n) where
      n = genericLength vs
      m = bigone n
    to_bdd _ tt = error ("bad arg in to_bdd=>" ++ (show tt))

where the function to_bdd_mn recurses over the list of variables vs and applies Shannon expansion [28], expressed as bitvector operations. This computes branches f1 and f0, to be used as then and else parts, when evaluating back the BDD to a truth table with if-then-else functions.

    to_bdd_mn [] 0 _ _ = B0
    to_bdd_mn [] _ _ _ = B1
    to_bdd_mn (v:vs) tt m n = D v l r where
      cond = var_mn m n v
      f0 = (m `xor` cond) .&. tt
      f1 = cond .&. tt
      l = to_bdd_mn vs f1 m n
      r = to_bdd_mn vs f0 m n

Proposition 11 The function to_bdd builds an (unreduced) BDD corresponding to a truth table tt for variable order vs that returns tt when evaluated as a boolean function.
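The bitvector form of the Shannon expansion and its inverse, boolean evaluation, can be tried out in isolation with the following sketch. It restates the definitions given above in a self-contained module; the camel-case names (toBdd, evalBdd, varMn) and the choice of Integer for Nat are assumptions made for this illustration, not the names used in the paper's source.

```haskell
import Data.Bits (xor, (.&.))

type Nat = Integer  -- assumption: Nat is modelled as Integer

-- Binary trees with boolean leaves and variable-labelled decision nodes.
data BT a = B0 | B1 | D a (BT a) (BT a) deriving (Eq, Show)
data BDD a = BDD a (BT a) deriving (Eq, Show)

-- Truth table of the constant 1 over n variables: 2^(2^n) - 1.
bigone :: Nat -> Nat
bigone n = 2 ^ (2 ^ n) - 1

-- Truth table (as a bitvector) of the projection on variable k of n.
varMn :: Nat -> Nat -> Nat -> Nat
varMn mask n k = mask `div` (2 ^ (2 ^ (n - k - 1)) + 1)

-- Shannon expansion of the truth table tt along the variable order vs.
toBdd :: [Nat] -> Nat -> BDD Nat
toBdd vs tt
  | 0 <= tt && tt <= m = BDD n (go vs tt)
  | otherwise = error ("bad truth table: " ++ show tt)
  where
    n = fromIntegral (length vs)
    m = bigone n
    go [] 0 = B0
    go [] _ = B1
    go (v : rest) t = D v (go rest f1) (go rest f0)
      where
        cond = varMn m n v
        f1 = cond .&. t              -- part of t where variable v is 1
        f0 = (m `xor` cond) .&. t    -- part of t where variable v is 0

-- Boolean evaluation with bitvector if-then-else.
evalBdd :: BDD Nat -> Nat
evalBdd (BDD n bt) = go bt
  where
    m = bigone n
    go B0 = 0
    go B1 = m
    go (D x l r) = ite (varMn m n x) (go l) (go r)
    ite x t e = ((t `xor` e) .&. x) `xor` e
```

With the variable order [0,2,1], toBdd [0,2,1] 42 yields BDD 3 (D 0 (D 2 (D 1 B0 B0) (D 1 B1 B1)) (D 2 (D 1 B0 B0) (D 1 B1 B0))), and evalBdd maps this tree back to 42, as Proposition 11 states.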
We can reduce the resulting BDDs, and convert back from BDDs and reduced BDDs to truth tables with boolean evaluation:

    to_rbdd vs tt = bdd_reduce (to_bdd vs tt)

    from_bdd bdd = eval_bdd bdd

We can obtain BDDs and reduced BDDs of various sizes as follows:

    *ISO> as perm nat 5
    [0,2,1]
    *ISO> to_bdd (as perm nat 5) 42
    BDD 3 (D 0 (D 2 (D 1 B0 B0) (D 1 B1 B1)) (D 2 (D 1 B0 B0) (D 1 B1 B0)))
    *ISO> to_rbdd (as perm nat 5) 42
    BDD 3 (D 0 (D 2 B0 B1) (D 2 B0 (D 1 B1 B0)))
    *ISO> to_rbdd (as perm nat 8) 42
    BDD 3 (D 2 B0 (D 0 B1 (D 1 B1 B0)))
    *ISO> from_bdd it
    42

Finally, we can obtain a minimal BDD expressing a logic function of $n$ variables given as a truth table as follows:

    to_min_bdd n t = search_bdd min n t

    search_bdd f n tt = snd $ foldl1 f
      (map (sized_rbdd tt) (all_permutations n)) where
        sized_rbdd tt vs = (robdd_size b,b) where
          b = to_rbdd vs tt

    all_permutations n = if n==0 then [[]] else
      [nth2perm (n,i) | i <- [0..(factorial n)-1]] where
        factorial n = foldl1 (*) [1..n]

As the following examples show, this can provide an effective multilevel boolean formula minimization up to functions with 6-7 arguments.

    *ISO> to_min_bdd 3 42
    BDD 3 (D 0 (D 2 B0 B1) (D 1 (D 2 B0 B1) B0))
    *ISO> to_min_bdd 4 2008
    BDD 4 (D 3 (D 1 (D 0 B0 B1) (D 0 B1 B0)) (D 2 (D 1 (D 0 B0 B1) B0) (D 0 B1 B0)))
    *ISO> to_min_bdd 7 2008
    BDD 7 (D 0 (D 1 (D 2 (D 6 (D 4 (D 3 B0 B1) (D 3 B1 B0))
          (D 5 (D 4 (D 3 B0 B1) B0) (D 3 B1 B0))) B0) B0) B0)
    *ISO> robdd_size it
    12

12.2 Multi-Terminal Binary Decision Diagrams (MTBDD)

MTBDDs [29, 30] are a natural generalization of BDDs allowing non-binary values as leaves. Such values are typically bitstrings representing the outputs of a multi-terminal boolean function, encoded as unsigned integers. We shall now describe an encoding of MTBDDs that can be extended to ranking/unranking functions, in a way similar to BDDs as shown in section 11.
Our MTBDD data type is a binary tree like the one used for BDDs, parameterized by two integers m and n, indicating that an MTBDD represents a function from $[0..n-1]$ to $[0..m-1]$, or equivalently, an $n$-input/$m$-output boolean function.

    data MT a = L a | M a (MT a) (MT a) deriving (Eq,Ord,Read,Show)
    data MTBDD a = MTBDD a a (MT a) deriving (Show,Eq)

The function to_mtbdd creates, from a natural number tt representing a truth table, an MTBDD representing functions of type $N \to M$ with $M = [0..2^m-1]$, $N = [0..2^n-1]$. Similarly to a BDD, it is represented as a binary tree of $n$ levels, except that its leaves are in $[0..2^m-1]$.

    to_mtbdd m n tt = MTBDD m n r where
      mlimit = 2^m
      nlimit = 2^n
      ttlimit = mlimit^nlimit
      r = if tt<ttlimit
          then (to_mtbdd_ mlimit n tt)
          else error ("bt: last arg " ++ (show tt) ++
                      " should be < " ++ (show ttlimit))

Given that correctness of the range of tt has been checked, the function to_mtbdd_ applies bitunpair recursively up to depth $n$, where leaves in range $[0..mlimit-1]$ are created.

    to_mtbdd_ mlimit n tt | (n<1)&&(tt<mlimit) = L tt
    to_mtbdd_ mlimit n tt = (M k l r) where
      (x,y) = bitunpair tt
      k = pred n
      l = to_mtbdd_ mlimit k x
      r = to_mtbdd_ mlimit k y

Converting back from MTBDDs to natural numbers is basically the same thing as for BDDs, except that assertions about the range of leaf data are enforced.

    from_mtbdd (MTBDD m n b) = from_mtbdd_ (2^m) n b

    from_mtbdd_ mlimit n (L tt) | (n<1)&&(tt<mlimit) = tt
    from_mtbdd_ mlimit n (M _ l r) = tt where
      k = pred n
      x = from_mtbdd_ mlimit k l
      y = from_mtbdd_ mlimit k r
      tt = bitpair (x,y)

The following examples show that to_mtbdd and from_mtbdd are indeed inverses on values in $[0..2^n-1] \times [0..2^m-1]$.
    > to_mtbdd 3 3 2008
    MTBDD 3 3 (M 2 (M 1 (M 0 (L 2) (L 1)) (M 0 (L 2) (L 1)))
                   (M 1 (M 0 (L 2) (L 0)) (M 0 (L 1) (L 1))))
    > from_mtbdd it
    2008
    > mprint (to_mtbdd 2 2) [0..3]
    MTBDD 2 2 (M 1 (M 0 (L 0) (L 0)) (M 0 (L 0) (L 0)))
    MTBDD 2 2 (M 1 (M 0 (L 1) (L 0)) (M 0 (L 0) (L 0)))
    MTBDD 2 2 (M 1 (M 0 (L 0) (L 0)) (M 0 (L 1) (L 0)))
    MTBDD 2 2 (M 1 (M 0 (L 1) (L 0)) (M 0 (L 1) (L 0)))

13 Revisiting Encodings of Finite Functions

We will now generalize the bitpair pairing function to $k$-tuples and then we will derive an alternative encoding for finite functions.

13.1 Tuple Encodings as Generalized Bitpair

The function to_tuple: $Nat \to Nat^k$ converts a natural number to a $k$-tuple by splitting its bit representation into $k$ groups, from which the $k$ members in the tuple are finally rebuilt. This operation can be seen as a transposition of a bit matrix obtained by expanding the number in base $2^k$:

    to_tuple k n = map (from_base 2) (
        transpose (
          map (to_maxbits k) (
            to_base (2^k) n
          )
        )
      )

To convert a $k$-tuple back to a natural number we will merge their bits, $k$ at a time. This operation uses the transposition of a bit matrix obtained from the tuple, seen as a number in base $2^k$, with help from bit crunching functions given in APPENDIX:

    from_tuple ns = from_base (2^k) (
        map (from_base 2) (
          transpose (
            map (to_maxbits l) ns
          )
        )
      ) where
          k = genericLength ns
          l = max_bitcount ns

The following example shows the decoding of 42, its decomposition in bits (right to left), the formation of a 3-tuple and the encoding of the tuple back to 42.

    *ISO> to_base 2 42
    [0,1,0,1,0,1]
    *ISO> to_tuple 3 42
    [2,1,2]
    *ISO> to_base 2 2
    [0,1]
    *ISO> to_base 2 1
    [1]
    *ISO> from_tuple [2,1,2]
    42

Fig. 18 shows multiple steps of the same decomposition, with shared nodes collected in a DAG.

Fig.
18: Repeated 3-tuple expansions: 42 and 2008

13.2 Encoding Finite Functions as Tuples

As finite sets can be put in a bijection with an initial segment of Nat, a finite function can be seen as a function defined from an initial segment of Nat to Nat. We can encode and decode a finite function from $[0..k-1]$ to Nat (seen as the list of its values), as a natural number:

    ftuple2nat [] = 0
    ftuple2nat ns = succ (pepis_pair (pred k,t)) where
      k = genericLength ns
      t = from_tuple ns

    nat2ftuple 0 = []
    nat2ftuple kf = to_tuple (succ k) f where
      (k,f) = pepis_unpair (pred kf)

As the length of the tuple, $k$, is usually smaller than the number obtained by merging the bits of the $k$-tuple, we have picked the Pepis pairing function, exponential in its first argument and linear in its second, to embed the length of the tuple needed for the decoding. This suggests the following alternative Encoder for finite functions:

    fun' :: Encoder [Nat]
    fun' = compose (Iso ftuple2nat nat2ftuple) nat

as well as the related alternative hylomorphism:

    nat_fun' = Iso nat2ftuple ftuple2nat

    hff' :: Encoder T
    hff' = compose (hylo nat_fun') nat

The encoding/decoding and the hylomorphism work as follows:

    *ISO> as fun' nat 2008
    [3,2,3,1]
    *ISO> as nat fun' it
    2008
    *ISO> as hff' nat 2008
    H [H [H [H []]],H [H [],H []],H [H [H []]],H [H []]]
    *ISO> as nat hff' it
    2008

14 Directed Graphs, Undirected graphs, Multigraphs and Hypergraphs

We will now show that more complex data types like digraphs and hypergraphs have extremely simple encoders. This shows once more the importance of compositionality in the design of our embedded transformation language.
14.1 Encoding Directed Graphs

We can find a bijection from directed graphs (with no isolated vertices, corresponding to their view as binary relations), to finite sets by fusing their list of ordered pair representation into finite sets with a pairing function:

    digraph2set ps = map bitpair ps
    set2digraph ns = map bitunpair ns

The resulting Encoder is:

    digraph :: Encoder [Nat2]
    digraph = compose (Iso digraph2set set2digraph) set

working as follows:

    *ISO> as digraph nat 2008
    [(1,1),(2,0),(2,1),(3,1),(0,2),(1,2),(0,3)]
    *ISO> as nat digraph it
    2008
    *ISO> as rbdd digraph [(1,1),(2,0),(2,1),(3,1),(0,2),(1,2),(0,3)]
    BDD 4 (D 3 (D 2 B0 (D 1 (D 0 B0 B1) (D 0 B1 B0))) (D 2 (D 1 B1 B0) (D 1 B0 (D 0 B1 B0))))

Fig. 19 shows the digraph associated to 2008.

Fig. 19: 2008 as a digraph

14.2 Encoding Undirected Graphs

We can find a bijection from undirected graphs to finite sets by fusing their list of unordered pair representation into finite sets with a pairing function on unordered pairs:

    graph2set ps = map unord_pair ps
    set2graph ns = map unord_unpair ns

The resulting Encoder is:

    graph :: Encoder [[Nat]]
    graph = compose (Iso graph2set set2graph) set

working as follows:

    *ISO> as graph nat 2008
    [[1,3],[2,3],[2,4],[3,5],[0,3],[1,4],[0,4]]
    *ISO> as nat graph it
    2008
    *ISO> as nat graph [[1,3],[3,2],[2,4],[5,3],[0,3],[4,1],[0,4]]
    2008

Note that, as expected, the result is invariant to changing the order of elements in pairs like [1,4] and [3,5] to [4,1] and [5,3].
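Both graph encodings can be checked with a small self-contained sketch. The definitions of nat2set, set2nat, bitpair and bitunpair come from earlier sections that are not reproduced here, so the versions below are assumptions (bit positions as set elements, bit interleaving as pairing) chosen to agree with the sessions above; the unordered pairing follows identity (20) of Proposition 9, and the names digraph2nat, nat2digraph, graph2nat, nat2graph, unordPair and unordUnpair are hypothetical.

```haskell
import Data.List (partition)

type Nat = Integer  -- assumption: Nat is modelled as Integer

-- Positions of the 1 bits of n, in increasing order (assumed set encoding).
nat2set :: Nat -> [Nat]
nat2set n = [i | (i, b) <- zip [0 ..] (bits n), b == 1]
  where
    bits :: Nat -> [Nat]
    bits 0 = []
    bits k = k `mod` 2 : bits (k `div` 2)

-- Inverse of nat2set; expects a list without duplicates.
set2nat :: [Nat] -> Nat
set2nat = sum . map (2 ^)

-- Bit interleaving: bits of x go to even positions, bits of y to odd ones.
bitpair :: (Nat, Nat) -> Nat
bitpair (x, y) =
  set2nat (map (2 *) (nat2set x) ++ map (succ . (2 *)) (nat2set y))

bitunpair :: Nat -> (Nat, Nat)
bitunpair n = (half evens, half odds)
  where
    (evens, odds) = partition even (nat2set n)
    half = set2nat . map (`div` 2)

-- Digraphs: a set of ordered pairs becomes a set of naturals.
digraph2nat :: [(Nat, Nat)] -> Nat
digraph2nat = set2nat . map bitpair

nat2digraph :: Nat -> [(Nat, Nat)]
nat2digraph = map bitunpair . nat2set

-- Unordered pairs of distinct elements, following identity (20).
unordPair :: [Nat] -> Nat
unordPair [a, b]
  | a /= b = bitpair (min a b, max a b - min a b - 1)
unordPair _ = error "unordPair: expected two distinct elements"

unordUnpair :: Nat -> [Nat]
unordUnpair n = [x, x + y + 1]
  where (x, y) = bitunpair n

-- Undirected graphs: a set of unordered pairs becomes a set of naturals.
graph2nat :: [[Nat]] -> Nat
graph2nat = set2nat . map unordPair

nat2graph :: Nat -> [[Nat]]
nat2graph = map unordUnpair . nat2set
```

Under these assumptions nat2digraph 2008 and nat2graph 2008 reproduce the edge lists shown above, and graph2nat returns 2008 regardless of the order of the two endpoints within each edge.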
14.3 Encoding Directed Multigraphs

We can find a bijection from directed multigraphs (directed graphs with multiple edges between pairs of vertices) to finite sequences by fusing their list of ordered pair representation into finite sequences with a pairing function. The resulting Encoder is:

    mdigraph :: Encoder [Nat2]
    mdigraph = compose (Iso digraph2set set2digraph) fun

working as follows:

    *ISO> as mdigraph nat 2008
    [(1,1),(0,0),(1,0),(0,0),(0,0),(0,0),(0,0)]
    *ISO> as nat mdigraph it
    2008

Note that the only change to the digraph Encoder is replacing the composition with set by a composition with fun.

14.4 Encoding Undirected Multigraphs

We can find a bijection from undirected multigraphs (undirected graphs with multiple edges between unordered pairs of vertices) to finite sequences by fusing their list of pair representation into finite sequences with a pairing function on unordered pairs. The resulting Encoder is:

    mgraph :: Encoder [[Nat]]
    mgraph = compose (Iso graph2set set2graph) fun

working as follows:

    *ISO> as mgraph nat 2008
    [[1,3],[0,1],[1,2],[0,1],[0,1],[0,1],[0,1]]
    *ISO> as nat mgraph it
    2008

Note that the only change to the graph Encoder is replacing the composition with set by a composition with fun.

14.5 Encoding Hypergraphs

Definition 2 A hypergraph (also called set system) is a pair $H = (X, E)$ where $X$ is a set and $E$ is a set of non-empty subsets of $X$.
We can easily derive a bijective encoding of hypergraphs, represented as sets of sets:

    set2hypergraph = map nat2set
    hypergraph2set = map set2nat

The resulting Encoder is:

    hypergraph :: Encoder [[Nat]]
    hypergraph = compose (Iso hypergraph2set set2hypergraph) set

working as follows

    *ISO> as hypergraph nat 2008
    [[0,1],[2],[1,2],[0,1,2],[3],[0,3],[1,3]]
    *ISO> as nat hypergraph it
    2008

15 Encoding SAT problems

Boolean Satisfiability (SAT) problems are encoded as lists of lists representing conjunctions of disjunctions of positive or negative propositional symbols. After defining:

    set2sat = map (set2disj . nat2set) where
      shift0 z = if (z<0) then z else z+1
      set2disj = map (shift0 . nat2z)

    sat2set = map (set2nat . disj2set) where
      shiftback0 z = if (z<0) then z else z-1
      disj2set = map (z2nat . shiftback0)

we obtain the Encoder

    sat :: Encoder [[Z]]
    sat = compose (Iso sat2set set2sat) set

working as follows:

    *ISO> as sat nat 2008
    [[1,-1],[2],[-1,2],[1,-1,2],[-2],[1,-2],[-1,-2]]
    *ISO> as nat sat it
    2008

Clearly this encoding can be used to generate random SAT problems out of easier to generate random natural numbers.

16 An Encoder for Graph Models

Graph models [31, 32] provide a semantics of $\lambda$-calculus (Y-combinator included) in terms of sets of finite sets of natural numbers. Following [31] a graph model is a pair $(D, p)$ where $D$ is an infinite set and $p : D^* \times D \to D$ is an injective total function. We will strengthen this to be a bijection, for the case $D = Nat$ as follows.
    gmodel2nat (set,m) = pred (fun2nat (m : (set2fun set)))

    nat2gmodel n = (fun2set xs,m) where
      (m:xs) = nat2fun (succ n)

This provides the Encoder:

    type Gdomain = ([Nat],Nat)

    gmodel :: Encoder Gdomain
    gmodel = compose (Iso gmodel2nat nat2gmodel) nat

working as follows:

    *ISO> as gmodel nat 42
    ([0,2,4],0)
    *ISO> as nat gmodel it
    42

The interest of such models is that they provide an accurate set theoretic semantics for untyped lambda calculus describing key computational mechanisms like $\beta$-conversion and fixpoint combinators.

17 A mapping to a dense set: Dyadic Rationals in [0, 1)

So far our isomorphisms have focused on natural numbers, finite sets and other discrete data types. Dyadic rationals are fractions with denominators restricted to be powers of 2. They are a dense set in $R$, i.e. they provide arbitrarily close approximations for any real number. An interesting isomorphism to such a set would allow borrowing things like distance or average functions that could have interesting interpretations in symbolic or boolean domains. It also makes sense to pick a bounded subdomain of the dyadic rationals that can be meaningful as the range of probabilistic boolean functions or fuzzy sets. We will build an Encoder for Dyadic Rationals in [0, 1) by providing a bijection from finite sets of natural numbers seen this time as negative exponents of 2.

    dyadic :: Encoder (Ratio Nat)
    dyadic = compose (Iso dyadic2set set2dyadic) set

The function set2dyadic mimics set2nat defined in subsection 3.3, except for the use of negative exponents and computation on rationals.

    set2dyadic :: [Nat] -> Ratio Nat
    set2dyadic ns = rsum (map nexp2 ns) where
      nexp2 0 = 1%2
      nexp2 n = (nexp2 (n-1))*(1%2)

      rsum [] = 0%1
      rsum (x:xs) = x+(rsum xs)

The function dyadic2set extracts negative exponents of two from a dyadic rational and it is modeled after nat2set defined in subsection 3.3.
    dyadic2set :: Ratio Nat -> [Nat]
    dyadic2set n | good_dyadic n = dyadic2exps n 0 where
      dyadic2exps 0 _ = []
      dyadic2exps n x = if (d<1) then xs else (x:xs) where
        d = 2*n
        m = if d<1 then d else (pred d)
        xs = dyadic2exps m (succ x)
    dyadic2set _ = error "dyadic2set: argument not a dyadic rational"

As not all rational numbers are dyadics in [0, 1), the predicate good_dyadic is needed to validate the input of dyadic2set. This also ensures that dyadic2set always terminates returning a finite set.

    good_dyadic kn = (k==0 && n==1)
      || ((kn>0%1) && (kn<1%1) && (is_exp2 n)) where
        k = numerator kn
        n = denominator kn
        is_exp2 1 = True
        is_exp2 n | even n = is_exp2 (n `div` 2)
        is_exp2 n = False

Some examples of borrow/lend operations are:

    dyadic_dist x y = abs (x-y)

    dist_for t x y = as dyadic t
      (borrow2 (with dyadic t) dyadic_dist x y)

    dsucc = borrow (with nat dyadic) succ
    dplus = borrow2 (with nat dyadic) (+)

    dconcat = lend2 dyadic (++)

    *ISO> dist_for nat 6 7
    1%2
    *ISO> dist_for set [1,2,3] [3,4,5]
    21%64
    *ISO> dsucc (3%8)
    7%8

Fig. 20 shows the dyadic rationals associated to natural numbers in [0..255].

Fig. 20: Dyadic rationals associated to n in [0..255]

18 Strings and Parenthesis Languages

18.1 Encoding Strings

As strings can be seen just as a notational equivalent of lists of natural numbers we obtain an Encoder immediately as:

    string :: Encoder String
    string = Iso string2fun fun2string

    string2fun cs = map (fromIntegral . ord) cs

    fun2string ns = map (chr . fromIntegral) ns

Note however that this is only an isomorphism within the chr/ord conversion range, therefore we shall assume this constraint as a law governing this Encoder.

    *ISO> as set string "hello"
    [104,206,315,424,536]
    *ISO> as string set it
    "hello"

18.2 Encoding a Parenthesis Language

An encoder for a parenthesis language is obtained by combining a parser and writer.
As Hereditarily Finite Functions naturally map one-to-one to a parenthesis expression we will choose them as target of the transformers.

    pars :: Encoder [Char]
    pars = compose (Iso pars2hff hff2pars) hff

The parser recurses over a string and builds a HFF as follows:

    pars2hff cs = parse_pars '(' ')' cs

    parse_pars l r cs | newcs == [] = t where
      (t,newcs) = pars_expr l r cs

    pars_expr l r (c:cs) | c == l = ((H ts),newcs) where
      (ts,newcs) = pars_list l r cs

    pars_list l r (c:cs) | c == r = ([],cs)
    pars_list l r (c:cs) = ((t:ts),cs2) where
      (t,cs1) = pars_expr l r (c:cs)
      (ts,cs2) = pars_list l r cs1

The writer recurses over a HFF and collects matching parenthesis pairs:

    hff2pars = collect_pars '(' ')'

    collect_pars l r (H ns) =
      [l] ++ (concatMap (collect_pars l r) ns) ++ [r]

The transformations of 42 look as follows:

    *ISO> as pars nat 42
    "((())(())(()))"
    *ISO> as hff pars it
    H [H [H []],H [H []],H [H []]]
    *ISO> as nat hff it
    42

Alternatively, by using 0 and 1 as left and right parentheses we can define:

    bitpars2hff cs = parse_pars 0 1 cs
    hff2bitpars = collect_pars 0 1

    hff_pars :: Encoder [Nat]
    hff_pars = compose (Iso bitpars2hff hff2bitpars) hff

working as follows:

    *ISO> as hff_pars nat 2008
    [0,0,0,1,0,1,1,0,1,0,0,1,1,0,1,0,1,0,1,0,1,1]
    *ISO> as nat hff_pars it
    2008
    *ISO> as nat bits (as hff_pars nat 2008)
    7690599

As the last example shows, the information density of a parenthesis representation is lower. This is expected, given that order is constrained by balancing and content is constrained by having the same number of 0s and 1s. The following example

    *ISO> map ((as nat bits) . (as hff_pars nat)) [0..7]
    [5,27,119,115,495,483,471,467]

shows that this application is injective only.
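The parser/writer pair for the character-based parenthesis language can be sketched in a self-contained way, together with a mapping between natural numbers and hereditarily finite functions. The encoder hff is defined in an earlier section that is not reproduced here, so nat2hff and hff2nat below are assumptions: they use a gap-based set/sequence conversion chosen to agree with the transformations of 42 shown above. Unlike pars2hff, this hypothetical parser returns Nothing on malformed input instead of failing on a pattern match.

```haskell
type Nat = Integer  -- assumption: Nat is modelled as Integer

-- Hereditarily finite functions as nested lists.
data T = H [T] deriving (Eq, Show)

-- Positions of the 1 bits of n, in increasing order (assumed set encoding).
nat2set :: Nat -> [Nat]
nat2set n = [i | (i, b) <- zip [0 ..] (bits n), b == 1]
  where
    bits :: Nat -> [Nat]
    bits 0 = []
    bits k = k `mod` 2 : bits (k `div` 2)

set2nat :: [Nat] -> Nat
set2nat = sum . map (2 ^)

-- Sorted set <-> sequence, via gaps between consecutive elements (assumed).
set2fun :: [Nat] -> [Nat]
set2fun xs = zipWith (\a b -> a - b - 1) xs ((-1) : xs)

fun2set :: [Nat] -> [Nat]
fun2set = map pred . tail . scanl (+) 0 . map succ

-- Unfold a natural number into a tree, and fold a tree back.
nat2hff :: Nat -> T
nat2hff n = H (map nat2hff (set2fun (nat2set n)))

hff2nat :: T -> Nat
hff2nat (H ts) = set2nat (fun2set (map hff2nat ts))

-- Writer: one matching pair of parentheses per node.
hff2pars :: T -> String
hff2pars (H ts) = "(" ++ concatMap hff2pars ts ++ ")"

-- Parser: recursive descent; Nothing on unbalanced or trailing input.
pars2hff :: String -> Maybe T
pars2hff cs = case expr cs of
  Just (t, "") -> Just t
  _ -> Nothing
  where
    expr ('(' : rest) = do
      (ts, rest') <- list rest
      Just (H ts, rest')
    expr _ = Nothing
    list (')' : rest) = Just ([], rest)
    list s = do
      (t, s1) <- expr s
      (ts, s2) <- list s1
      Just (t : ts, s2)
```

Under these assumptions hff2pars (nat2hff 42) evaluates to "((())(())(()))", matching the session above, and pars2hff applied to that string returns the tree wrapped in Just.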
Therefore a succinct representation of an abstract tree structure can be obtained by encoding it as a natural number as in:

    *ISO> as nat pars "((()())()(())()()()())"
    2008

Note however, that

    *ISO> as nat bits (as hff_pars nat (2^2^16))
    32639

while the conventional representation of the same number would have almost twenty thousand decimal digits. This suggests defining:

    nat2parnat n = as nat bits (as hff_pars nat n)

    parnat2nat n = as nat hff_pars (as bits nat n)

and finding out that

    *ISO> [x | x <- [0..2^16], nat2parnat x < x]
    [8192,16384,32768,32769,49152,65536]

One can see that more compact representations only happen for a few numbers that are powers of two or "sparse" sums of powers of two. A good way to evaluate "information density" for an arbitrary data type that is isomorphic to Nat through one of our encoders is to compute the total bitsize of its actual encoding over an interval like [0..2^n-1]. For instance,

    hff_bitsize n = sum (map hff_bsize [0..2^n-1])

    hff_bsize k = genericLength (as bits nat (nat2parnat k))

Knowing that the optimal bit representation of all numbers in [0 ..
2 n − 1 ] totals n ∗ 2 n (2 n of them, n bits eac h), we can define a measure of information densit y for a bit-enco ded parenthesis language seen as a representation for HFF as: info_density_hff n = (n ∗ 2^n)%(hff_bitsize n) One can see that information density progressively increases to conv erge to a v alue ab o v e half of the “p erfect” v alue of 1 : ∗ ISO > map info_density_hff [0..12] [0%1,1%3,4%9,12%25,1%2,80%157,16%31,112%215, 32%61,48%91,1024%1933,2816%5297,2048%3841] ∗ ISO > map fromRational it [0.0,0.3333333333333333,0.4444444444444444,0.48,0.5, 0.5095541401273885,0.5161290322580645,0.5209302325581395, 0.5245901639344263,0.5274725274725275,0.5297465080186239, 0.5316216726448934,0.5331944806040094] T o compare this with the information density of hereditarily finite sets, mul- tisets and permutations, we can also map their structure to a bit-represen ted paren thesis language by defining the enco der: pars_hf = Iso bitpars2hff hff2bitpars hff_pars’ :: Encoder [Nat] hff_pars’ = compose pars_hf hff’ hfs_pars :: Encoder [Nat] hfs_pars = compose pars_hf hfs hfpm_pars :: Encoder [Nat] hfpm_pars = compose pars_hf hfpm hfm_pars :: Encoder [Nat] hfm_pars = compose pars_hf hfm bhfm_pars :: Encoder [Nat] bhfm_pars = compose pars_hf hfbm bhfm_pars’ :: Encoder [Nat] bhfm_pars’ = compose pars_hf hfbm’ hfp_pars :: Encoder [Nat] hfp_pars = compose pars_hf hfp and then defining: parsize_as t n = genericLength (hff2bitpars (as t nat n)) parsizes_to m t = map (parsize_as t) [0..2^m-1] nat2hfsnat n = as nat bits (as hfs_pars nat n) hfs_bitsize n = sum (map hfs_bsize [0..2^n-1]) hfs_bsize k = genericLength (as bits nat (nat2hfsnat k)) info_density_hfs n = (n ∗ 2^n)%(hfs_bitsize n) The intuition that hereditarily finite functions hav e higher information density than hereditarily finite sets can now b e conjectured: ∗ ISO > map info_density_hfs [0..12] [0%1,1%3,2%5,3%8,1%3,5%16,2%7,7%27,4%17,3%13,2%9,11%52,1%5] ∗ ISO > map fromRational it 
  [0.0,0.3333333333333333,0.4,0.375,0.3333333333333333,
   0.3125,0.2857142857142857,0.25925925925925924,0.23529411764705882,
   0.23076923076923078,0.2222222222222222,0.21153846153846154,0.2]

Contrary to the case of bit-encoded HFFs, in this case information density decreases for larger values, an observation that can help with finding a simple proof for the conjecture. More generally, such techniques suggest applications to experimental mathematics.

19 Self-delimiting codes

A more precise estimate of the actual size of various bitstring representations also requires counting the overhead for "delimiting" their components. An asymptotically optimal mechanism for this is the use of a universal self-delimiting code, for instance the Elias omega code. To implement it, the encoder proceeds by recursively encoding the length of the string, the length of the length of the string, etc.

  to_elias :: Nat -> [Nat]
  to_elias n = (to_eliasx (succ n)) ++ [0]

  to_eliasx 1 = []
  to_eliasx n = xs where
    bs = to_lbits n
    l = (genericLength bs)-1
    xs = if l<2 then bs else (to_eliasx l) ++ bs

The decoder first rebuilds recursively the sequence of lengths and then the actual bitstring. It makes sense to design the decoder to extract the number represented by the self-delimiting code from a sequence/stream of bits and also return what is left after the extraction.

  from_elias :: [Nat] -> (Nat, [Nat])
  from_elias bs = (pred n,cs) where (n,cs) = from_eliasx 1 bs

  from_eliasx n (0:bs) = (n,bs)
  from_eliasx n (1:bs) = r where
    hs = genericTake n bs
    ts = genericDrop n bs
    n' = from_lbits (1:hs)
    r = from_eliasx n' ts

  to_lbits = reverse . (to_base 2)
  from_lbits = (from_base 2) . reverse

We obtain the Encoder:

  elias :: Encoder [Nat]
  elias = compose (Iso (fst . from_elias) to_elias) nat

working as follows:

  *ISO> as elias nat 42
  [1,0,1,0,1,1,0,1,0,1,1,0]
  *ISO> as nat elias it
  42
  *ISO> as elias nat 2008
  [1,1,1,0,1,0,1,1,1,1,1,0,1,1,0,0,1,0]
  *ISO> as nat elias it
  2008

Note that self-delimiting codes are not onto the regular language $\{0,1\}^*$; therefore this Encoder cannot be used to map arbitrary bitstrings to numbers.

20 Encoding DNA

We have covered so far encodings for "artificial entities" used in various fields. We will now add an encoding of "natural origin": DNA bases and strands. While it is an (utterly) simplified model of the real thing, it captures some essential algebraic properties of DNA bases and strands. We start with a DNA data type, following [33, 34]:

  data Base = Adenine | Cytosine | Guanine | Thymine
    deriving(Eq,Ord,Show,Read)

  type DNA = [Base]

We will encode/decode the DNA base alphabet as follows:

  alphabet2code Adenine = 0
  alphabet2code Cytosine = 1
  alphabet2code Guanine = 2
  alphabet2code Thymine = 3

  code2alphabet 0 = Adenine
  code2alphabet 1 = Cytosine
  code2alphabet 2 = Guanine
  code2alphabet 3 = Thymine

The mapping is simply a symbolic variant of conversion to/from base 4:

  dna2nat = (from_base 4) . (map alphabet2code)
  nat2dna = (map code2alphabet) . (to_base 4)

We can now define an Encoder for base sequences as follows:

  dna :: Encoder DNA
  dna = compose (Iso dna2nat nat2dna) nat

A first set of DNA operations acts on base sequences. The transformation between complements looks as follows:

  dna_complement :: DNA -> DNA
  dna_complement = map to_compl where
    to_compl Adenine = Thymine
    to_compl Cytosine = Guanine
    to_compl Guanine = Cytosine
    to_compl Thymine = Adenine

Reversing is just list reversal.

  dna_reverse :: DNA -> DNA
  dna_reverse = reverse

As reversal and complement are independent operations, their composition is commutative; we can pick reversing first and then complementing:

  dna_comprev :: DNA -> DNA
  dna_comprev = dna_complement . dna_reverse

The following examples show the interaction of DNA codes with other data types and their operations:

  *ISO> as dna nat 2008
  [Adenine,Guanine,Cytosine,Thymine,Thymine,Cytosine]
  *ISO> borrow (with dna nat) dna_reverse 42
  42
  *ISO> borrow (with dna nat) dna_reverse 2008
  637
  *ISO> borrow (with dna nat) dna_complement 2008
  2087
  *ISO> borrow (with dna nat) dna_comprev 2008
  3458
  *ISO> borrow (with dna bits) dna_comprev [1,0,1,0,1,1,0,1,0,1]
  [1,1,1,0,1,0,0,0,0,1,1]

Note that each of these DNA operations induces a bijection $Nat \to Nat$.

Like signed integers, DNA strands have "polarity": their direction matters.

  data Polarity = P3x5 | P5x3
    deriving(Eq,Ord,Show,Read)

  data DNAstrand = DNAstrand Polarity DNA
    deriving(Eq,Ord,Show,Read)

Polarity can be easily encoded as parity (even/odd):

  strand2nat (DNAstrand polarity strand) =
    add_polarity polarity (dna2nat strand) where
      add_polarity P3x5 x = 2*x
      add_polarity P5x3 x = 2*x-1

  nat2strand n =
    if even n
      then DNAstrand P3x5 (nat2dna (n `div` 2))
      else DNAstrand P5x3 (nat2dna ((n+1) `div` 2))

We can now define an Encoder for DNA strands:

  dnaStrand :: Encoder DNAstrand
  dnaStrand = compose (Iso strand2nat nat2strand) nat

Two additional operations lift DNA sequences to strands with polarities:

  dna_down :: DNA -> DNAstrand
  dna_down = (DNAstrand P3x5) . dna_complement

  dna_up :: DNA -> DNAstrand
  dna_up = DNAstrand P5x3

We can now lend or borrow operations as follows:

  *ISO> as dnaStrand nat 1234
  DNAstrand P3x5 [Cytosine,Guanine,Guanine,Cytosine,Guanine]
  *ISO> lend (with dnaStrand nat) succ
          (DNAstrand P5x3 [Adenine,Cytosine,Guanine,Thymine])
  DNAstrand P5x3 [Cytosine,Cytosine,Guanine,Thymine]

The DoubleHelix is a stable combination of two complementary strands. This built-in redundancy protects against unwanted mutations.
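The transport of sequence operations to natural numbers can be replayed with a small standalone sketch. It does not use the paper's `Encoder` machinery: `toBase4`, `fromBase4`, `complement` and `viaDna` are hypothetical names, and the digit conversion is assumed to be plain little-endian base 4 with 0 written as the single digit `[0]`, an assumption that reproduces the values 637, 2087 and 3458 from the transcript above.

```haskell
data Base = Adenine | Cytosine | Guanine | Thymine
  deriving (Eq, Ord, Show, Read, Enum)

type DNA = [Base]

-- Little-endian base-4 digits (assumed behaviour of the to_base helper).
toBase4 :: Integer -> [Integer]
toBase4 n
  | q == 0    = [d]
  | otherwise = d : toBase4 q
  where (q, d) = n `quotRem` 4

fromBase4 :: [Integer] -> Integer
fromBase4 = foldr (\d acc -> d + 4 * acc) 0

-- The derived Enum instance gives Adenine=0 .. Thymine=3.
nat2dna :: Integer -> DNA
nat2dna = map (toEnum . fromIntegral) . toBase4

dna2nat :: DNA -> Integer
dna2nat = fromBase4 . map (fromIntegral . fromEnum)

complement :: DNA -> DNA
complement = map compl
  where
    compl Adenine  = Thymine
    compl Cytosine = Guanine
    compl Guanine  = Cytosine
    compl Thymine  = Adenine

-- Transport an operation on base sequences to an operation on naturals.
viaDna :: (DNA -> DNA) -> Integer -> Integer
viaDna f = dna2nat . f . nat2dna
```

For example, `viaDna reverse 2008` gives 637 and `viaDna (complement . reverse) 2008` gives 3458. Because `complement` is a `map`, it commutes with `reverse`, so the order of the two steps in the reverse complement does not matter.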
  data DoubleHelix = DoubleHelix DNAstrand DNAstrand
    deriving(Eq,Ord,Show,Read)

  dna_double_helix :: DNA -> DoubleHelix
  dna_double_helix s = DoubleHelix (dna_up s) (dna_down s)

We can now generate a double helix from a natural number:

  *ISO> dna_double_helix (nat2dna 33)
  DoubleHelix (DNAstrand P5x3 [Cytosine,Adenine,Guanine])
              (DNAstrand P3x5 [Guanine,Thymine,Cytosine])

This can be used for generating random instances of double helixes by reusing a random generator for natural numbers.

21 Testing It All

We will now describe a random testing mechanism to validate our Encoders. While QuickCheck [7] provides an elegant general purpose random tester, it would require writing a specific adaptor for each isomorphism. We will describe here a shortcut through a few higher order combinators. First, we build a simple random generator for nat:

  rannat = rand (2^50)

  rand :: Nat -> Nat -> Nat
  rand max seed = n where
    (n,g) = randomR (0,max) (mkStdGen (fromIntegral seed))

We can now design a generic random test for any Encoder as follows:

  rantest :: Encoder t -> Bool
  rantest t = and (map (rantest1 t) [0..255])

  rantest1 t n = x == (visit_as t x) where x = rannat n

  visit_as t = (to nat) . (from t) . (to t) . (from nat)

Note that in rantest1, visit_as starts with a random natural number from which it generates its test data of a given type. After testing the encoder, the result is brought back as a natural number that should be the same as the original random number.
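The round-trip idea behind rantest can be sketched without the `random` package. The following is an illustration under stated assumptions, not the paper's tester: `Iso`, `lcg`, `randomNats`, `roundTrip`, `rantestSketch` and `bitsIso` are hypothetical names, the pseudo-random source is a simple 64-bit linear congruential generator standing in for `mkStdGen`/`randomR`, and the encoder under test is assumed to be the bijective base-2 encoding that yields `[1,1,0,1,0]` for 42.

```haskell
-- A pair of functions that are expected to be mutually inverse.
data Iso a b = Iso (a -> b) (b -> a)

toBase2 :: Integer -> [Integer]
toBase2 n
  | q == 0    = [d]
  | otherwise = d : toBase2 q
  where (q, d) = n `quotRem` 2

fromBase2 :: [Integer] -> Integer
fromBase2 = foldr (\d acc -> d + 2 * acc) 0

-- Assumed bits encoder: drop the leading 1 of the successor.
nat2bits :: Integer -> [Integer]
nat2bits = init . toBase2 . succ

bits2nat :: [Integer] -> Integer
bits2nat bs = pred (fromBase2 (bs ++ [1]))

bitsIso :: Iso Integer [Integer]
bitsIso = Iso nat2bits bits2nat

-- Stand-in pseudo-random source: a 64-bit LCG, keeping the top 50 bits.
lcg :: Integer -> Integer
lcg s = (6364136223846793005 * s + 1442695040888963407) `mod` (2 ^ (64 :: Int))

randomNats :: Integer -> [Integer]
randomNats seed = map (`div` (2 ^ (14 :: Int))) (tail (iterate lcg seed))

-- Start from a natural number, visit the other type, and come back.
roundTrip :: Iso Integer t -> Integer -> Bool
roundTrip (Iso f g) n = g (f n) == n

rantestSketch :: Iso Integer t -> Bool
rantestSketch iso = all (roundTrip iso) (take 256 (randomNats 2008))
```

With these definitions `rantestSketch bitsIso` holds, while a deliberately broken pair such as `Iso nat2bits fromBase2` already fails `roundTrip` on 42. As in rantest1, only the natural number side needs an equality test, so the same driver works for any target type.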
We can now implement our tester isotest, which in a few seconds goes over thousands of test cases and aggregates the results with a final and:

  isotest = and (map rt [0..25])

  rt 0 = rantest nat
  rt 1 = rantest fun
  rt 2 = rantest set
  rt 3 = rantest bits
  rt 4 = rantest funbits
  rt 5 = rantest hfs
  rt 6 = rantest hff
  rt 7 = rantest uhfs
  rt 8 = rantest uhff
  rt 9 = rantest perm
  rt 10 = rantest hfp
  rt 11 = rantest nat2
  rt 12 = rantest set2
  rt 13 = rantest clist
  rt 14 = rantest pbdd
  rt 15 = rantest bdd
  rt 16 = rantest rbdd
  rt 17 = rantest digraph
  rt 18 = rantest graph
  rt 19 = rantest mdigraph
  rt 20 = rantest mgraph
  rt 21 = rantest hypergraph
  rt 22 = rantest dyadic
  rt 23 = rantest string
  rt 24 = rantest pars
  rt 25 = rantest dna

The empirical correctness test of the "whole enchilada" follows:

  *ISO> isotest
  True

suggesting that round-trip errors in the encoders described so far are unlikely.

22 Applications

Besides their utility as a uniform basis for a general purpose data conversion library, let us point out some specific applications of our isomorphisms.

22.1 Combinatorial Generation

A free combinatorial generation algorithm (providing a constructive proof of recursive enumerability) for a given structure is obtained simply through an isomorphism from nat:

  nth thing = as thing nat
  nths thing = map (nth thing)
  stream_of thing = nths thing [0..]
  *ISO> nth set 42
  [1,3,5]
  *ISO> nth bits 42
  [1,1,0,1,0]
  *ISO> take 3 (stream_of hfs)
  [H [],H [H []],H [H [H []]]]
  *ISO> take 3 (stream_of bdd)
  [BDD 0 B0,BDD 0 B1,BDD 1 (D 0 B0 B0)]

22.2 Random Generation

Combining nth with a random generator for nat provides free algorithms for random generation of complex objects of customizable size:

  ran thing seed largest = head (random_gen thing seed largest 1)

  random_gen thing seed largest n = genericTake n
    (nths thing (rans seed largest))

  rans seed largest = randomRs (0,largest) (mkStdGen seed)

For instance

  *ISO> random_gen set 11 999 3
  [[0,2,5],[0,5,9],[0,1,5,6]]

generates a list of 3 random sets. Similarly,

  *ISO> ran digraph 5 (2^31)
  [(1,0),(0,1),(2,1),(1,3),(2,2),(3,2),(4,0),(4,1),
   (5,1),(6,0),(6,1),(7,1),(5,3),(6,2),(6,3)]
  *ISO> ran hfs 7 30
  H [H [],H [H [],H [H []]],H [H [H [H []]]]]
  *ISO> ran dnaStrand 1 123456789
  DNAstrand P5x3 [Guanine,Thymine,Guanine,Cytosine,
    Cytosine,Thymine,Thymine,Thymine,Thymine,
    Adenine,Thymine,Cytosine,Cytosine]

generate a random digraph, a hereditarily finite set and a DNA strand. Random generators for various data types are useful for further automating test generators in tools like QuickCheck [7] by generating customized random tests. Another interesting application is generating random problems or programs of a given type and size. For instance

  *ISO> ran sat 8 (2^31)
  [[-1],[1,-1],[-1,2],[1,-1,2],[-2],[1,-2],[-1,-2],[1,-1,-2],
   [2,-2],[1,2,-2],[-1,2,-2],[3],[1,-1,3],[1,-1,2,3],[1,-2,3],
   [-1,-2,3],[2,-2,3],[1,2,-2,3],[-1,2,-2,3]]
  *ISO> ran clist 8 12345
  Cons (Atom 0) (Cons (Cons (Atom 0) (Atom 0)) (Atom 100))

generate, respectively, a random SAT-problem and a random Cons-list.
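The "generation by unranking" pattern of the last two subsections can be reproduced in a few lines. This is a sketch under assumptions rather than the paper's combinators: `nat2set`, `nat2bits`, `nthOf` and `streamOf` are hypothetical names, sets are assumed to be the positions of the 1 bits of a number, and bit lists are assumed to use the bijective base-2 encoding, which agrees with the outputs `[1,3,5]` and `[1,1,0,1,0]` shown above for 42.

```haskell
-- Little-endian binary digits of a natural number (0 has no digits).
digits2 :: Integer -> [Integer]
digits2 0 = []
digits2 k = k `mod` 2 : digits2 (k `div` 2)

-- Unrank a natural number as the set of positions of its 1 bits.
nat2set :: Integer -> [Integer]
nat2set n = [i | (i, b) <- zip [0 ..] (digits2 n), b == 1]

-- Unrank a natural number as a bit list (bijective base 2).
nat2bits :: Integer -> [Integer]
nat2bits = init . digits2 . succ

-- The n-th object of a type is whatever its unranking function returns.
nthOf :: (Integer -> a) -> Integer -> a
nthOf unrank = unrank

-- Enumerate all objects of a type, lazily, in rank order.
streamOf :: (Integer -> a) -> [a]
streamOf unrank = map unrank [0 ..]
```

Here `take 4 (streamOf nat2set)` lists `[[],[0],[1],[0,1]]`. Feeding `nthOf` with numbers drawn from any pseudo-random source, instead of `[0 ..]`, gives the random instance generators described above.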
22.3 Succinct Representations

Depending on the information theoretical density of various data representations, as well as on the constant factors involved in various data structures, significant data compression can be achieved by choosing an alternate isomorphic representation, as shown in the following examples:

  *ISO> as hff hfs (H [H [H []],H [H [],
          H [H []]],H [H [],H [H [H []]]]])
  H [H [H []],H [H []],H [H []]]
  *ISO> as nat hff (H [H [H []],H [H []],H [H []]])
  42
  *ISO> as fun bits [0,1,0,0,0,0,0,0,0,0,0]
  [0,10]
  *ISO> as rbdd hfs (H [H [],H [H [],H [H []]],
          H [H [H []],H [H [H []]]]])
  BDD 3 (D 1 B1 B0)
  *ISO> as hff bdd (BDD 3 (D 2 (D 1 (D 0 B1 B0) (D 0 B0 B1))
          (D 1 (D 0 B1 B1) (D 0 B1 B1))))
  H [H [],H [H [],H [],H []]]

In particular, mapping to efficient arbitrary length integer implementations (usually C-based libraries) can provide more compact representations or improved performance for isomorphic higher level data representations. Alternatively, lazy representations, as provided by functional binary numbers or BDDs, might turn out to be more effective space-wise or time-wise for very large integers encapsulating results of some computations. We can compare representations sharing a common datatype to conjecture about their asymptotic information density.
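The first two transcript lines above can be reproduced with a compact sketch that transports a tree between the HFS and HFF views through natural numbers. All names here (`T`, `nat2set`, `set2fun`, `fun2set`, `nat2hfs`, `nat2hff`, `hfs2nat`, `hff2nat`, `size`) are hypothetical, and the set and function encodings are assumptions chosen to agree with the examples in this paper: a set is the list of exponents in the binary expansion, and a function is the list of gaps between consecutive set elements.

```haskell
newtype T = H [T] deriving (Eq, Show)

-- Positions of the 1 bits of n, in increasing order.
nat2set :: Integer -> [Integer]
nat2set n = go 0 n
  where
    go _ 0 = []
    go i k
      | odd k     = i : go (i + 1) (k `div` 2)
      | otherwise = go (i + 1) (k `div` 2)

set2nat :: [Integer] -> Integer
set2nat = sum . map (2 ^)

-- Strictly increasing list <-> arbitrary list of naturals (gaps).
set2fun :: [Integer] -> [Integer]
set2fun []       = []
set2fun (x : xs) = x : zipWith (\a b -> b - a - 1) (x : xs) xs

fun2set :: [Integer] -> [Integer]
fun2set = scanl1 (\s a -> s + a + 1)

-- Hereditary versions: apply the same unranking to every element.
nat2hfs, nat2hff :: Integer -> T
nat2hfs = H . map nat2hfs . nat2set
nat2hff = H . map nat2hff . set2fun . nat2set

hfs2nat, hff2nat :: T -> Integer
hfs2nat (H ts) = set2nat (map hfs2nat ts)
hff2nat (H ts) = set2nat (fun2set (map hff2nat ts))

-- Number of nodes, used here as a rough size measure.
size :: T -> Int
size (H ts) = 1 + sum (map size ts)
```

Under these assumptions `nat2hff (hfs2nat t)` plays the role of `as hff hfs t`. For 42 the HFS tree has 12 nodes and the HFF tree has 7, a small instance of the size differences discussed in this subsection.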
22.4 Experimental Mathematics

Comparing compactness of representations

For instance, after defining:

  length_as t = fit genericLength (with nat t)
  sum_as t = fit sum (with nat t)
  size_as t = fit tsize (with nat t)

one can conjecture that finite functions are more compact than permutations, which are more compact than sets asymptotically:

  *ISO> length_as set 123456789012345678901234567890
  54
  *ISO> length_as perm 123456789012345678901234567890
  28
  *ISO> length_as fun 123456789012345678901234567890
  54
  *ISO> sum_as set 123456789012345678901234567890
  2690
  *ISO> sum_as perm 123456789012345678901234567890
  378
  *ISO> sum_as fun 123456789012345678901234567890
  43

One might observe that the same trend also applies to their hereditarily finite derivatives:

  *ISO> size_as hfs 123456789012345678901234567890
  627
  *ISO> size_as hfp 123456789012345678901234567890
  276
  *ISO> size_as hff 123456789012345678901234567890
  91

While confirming or refuting this conjecture is beyond the scope of this paper, the affirmative case would imply, interestingly, that "order" (permutations) has asymptotically higher information density than "content" (sets), and explain why finite functions (that involve both) dominate data representations in various computing fields.

Based on the same experiment, reduced BDDs (especially if one implements sharing, as computed by robdd_size) also provide relatively compact representations:

  *ISO> bdd_size $ as bdd nat 123456789012345678901234567890
  256
  *ISO> bdd_size $ as rbdd nat 123456789012345678901234567890
  144
  *ISO> robdd_size $ as rbdd nat 123456789012345678901234567890
  39

Figures 21, 22 and 23 compare the sizes of bitstring, BDD, HFF, HFS and HFP representations, first with the most succinct ones (bitstring, BDDs, HFF) grouped together in Fig. 21, then the less succinct ones (HFS and HFP) in Fig. 22, and finally all representations together, for n in the larger interval $[0..2^{16}-1]$, in Fig. 23.

Fig. 21: Comparison of curve1=Bit, curve2=BDD and curve3=HFF sizes

Fig. 22: Comparison of curve1=HFS and curve2=HFP sizes

Fig. 23: Comparison of all representation sizes at a larger scale

It is also interesting to observe the ability of some representations to express huge numbers that normally overflow computer memory but which are genuinely "low complexity", as a result of the small number of simple computational steps that generate them. For instance,

  *ISO> map (as nat pars) [
          "()","(())","((()))","(((())))",
          "((((()))))","(((((())))))"]
  [0,1,2,4,16,65536]
  *ISO> as hff pars "((()))"
  H [H [H []]]

shows that parenthesis sequences (structurally isomorphic to hereditarily finite functions) can succinctly represent the fast growing but low complexity series $a_0 = 0$, $a_{n+1} = 2^{a_n}$. Clearly, terms of the series would exhaust computer memory quite quickly using a conventional bitvector-based arbitrary size integer representation! This suggests the usefulness of a universal, possibly lazy, "shapeshifting" algorithm that can decide on the most efficient data representation automatically, using size estimates, at the time when data is actually constructed.

Sparseness criteria

As a first step, one can introduce a "sparseness criterion" by comparing the size of a representation f with the size of the self-delimiting Elias omega code. One can obtain an encoding of such sequences by encoding the length of the sequence and then encoding each term, parametrized by a function $f : Nat \to [Nat]$:

  nat2self f n = (to_elias l) ++ concatMap to_elias ns where
    ns = f n
    l = genericLength ns

  nat2sfun n = nat2self (as fun nat) n

This function is injective (but not onto!) and its action can be reversed by first decoding the length l and then extracting self-delimited sequences l times.
  self2nat g ts = (g xs,ts') where
    (l,ns) = from_elias ts
    (xs,ts') = take_from_elias l ns

    take_from_elias 0 ns = ([],ns)
    take_from_elias k ns = ((x:xs),ns'') where
      (x,ns') = from_elias ns
      (xs,ns'') = take_from_elias (k-1) ns'

  sfun2nat ns = xs where
    (xs,[]) = self2nat (as nat fun) ns

We obtain the Encoder:

  sfun :: Encoder [Nat]
  sfun = compose (Iso sfun2nat nat2sfun) nat

working as follows:

  *ISO> as sfun nat 42
  [1,0,1,0,0,0,1,0,0,1,0,0,1,0,0]
  *ISO> as nat sfun it
  42

A simple concept of sparseness is derived by comparing the size of a self-delimiting code for a number n with the size of its self-delimiting representation as a finite sequence, finite set or finite permutation, as shown in Fig. 24, computed as follows:

  linear_sparseness_pair t n =
    (genericLength (to_elias n),genericLength (nat2self (as t nat) n))

  linear_sparseness f n = x/y where
    (x,y) = linear_sparseness_pair f n

Fig. 24: Sparseness measures with curve1=fun, curve2=set, curve3=perm up to $2^7$

We can also extend this comparison to the hereditarily finite representations, which, as a pleasant surprise, turn out to provide self-delimiting codes.

  sparseness_pair f n =
    (genericLength (to_elias n),genericLength (as f nat n))

  sparseness f n = x/y where (x,y) = sparseness_pair f n

One can then compare (self-delimiting) parenthesis language representations for the hereditarily finite encoders provided by HFF, HFS and HFP and discover the "peaks" of sparseness, as shown in Figs. 25 and 26.

Fig. 25: Sparseness measures with curve1=HFF, curve2=HFS, curve3=HFP up to $2^8$

Fig. 26: Sparseness measures with curve1=HFF, curve2=HFS, curve3=HFP up to $2^{14}$

A new self-delimiting code

While the HFF representation is generally less compact than the Elias omega code, its simplicity suggests it as a possibly useful self-delimiting code, especially interesting for streams of "sparse" values, as shown in Fig. 27.

Fig. 27: Comparison of codes: curve1=Undelimited, curve2=Elias, curve3=HFF up to $2^7$

One can collect values that have smaller HFF codes than Elias omega codes, i.e. "sparse numbers", with:

  sparses_to m = [n | n <- [0..m-1],
    (genericLength (as hff_pars nat n))
    <
    (genericLength (as elias nat n))]

working as follows:

  *ISO> sparses_to (2^11)
  [15,16,17,24,32,64,65,96,128,129,192,256,257,258,259,320,384,
   385,448,512,513,514,515,516,517,518,519,520,544,576,640,641,704,768,
   769,770,771,832,896,897,960,1024,1025,1026,1027,1028,1029,1030,1031,
   1032,1088,1152,1280,1281,1408,1536,1537,1538,1539,1664,1792,1793,1920]

and notice that the list collects an unusually large number of popular memory chip and computer screen sizes. Figure 28 shows the distribution of "sparse numbers" in $[0..2^{18}]$.

Fig. 28: Sparse numbers in $[0..2^{18}]$, x=nth sparse number, y=its value

Primes and Pairing Functions

Products of two prime numbers have the interesting property that they are a special case where no information is lost by multiplication in the sense of [35]. Indeed, in this case multiplication is reversible, i.e. the two factors can be recovered given the product. As the product is comparatively easy to compute, while in the case of large primes factoring is believed intractable, this property has well-known uses in cryptography.
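The reversibility of multiplication on pairs of primes can be illustrated on small numbers. The sketch below is only a demonstration of the claim, using naive trial division; `isPrime` and `unmultiply` are hypothetical names that do not appear in the paper, and the approach is practical only for small products, in line with the remark that factoring large products is believed intractable.

```haskell
-- Naive primality test by trial division.
isPrime :: Integer -> Bool
isPrime n =
  n > 1 && all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) [2 ..])

-- Recover (p, q) with p <= q from a product of two primes.
-- Returns Nothing when n is not a product of exactly two primes.
unmultiply :: Integer -> Maybe (Integer, Integer)
unmultiply n =
  case [d | d <- takeWhile (\d -> d * d <= n) [2 ..], n `mod` d == 0] of
    -- The smallest nontrivial divisor is always prime.
    (p : _) | isPrime (n `div` p) -> Just (p, n `div` p)
    _                             -> Nothing
```

For instance `unmultiply (11 * 17)` returns `Just (11,17)`, whereas `unmultiply 12` returns `Nothing`, since 12 = 2 * 2 * 3 does not determine a unique pair of prime factors.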
Given the isomorphism between natural numbers and primes, mapping a prime to its position in the sequence of primes, one can transport pairing/unpairing operations to prime numbers:

  ppair pairingf (p1,p2) | is_prime p1 && is_prime p2 =
    from_pos_in ps (pairingf (to_pos_in ps p1,to_pos_in ps p2)) where
      ps = primes

  punpair unpairingf p | is_prime p =
    (from_pos_in ps n1,from_pos_in ps n2) where
      ps = primes
      (n1,n2) = unpairingf (to_pos_in ps p)

working as follows:

  *ISO> ppair bitpair (11,17)
  269
  *ISO> punpair bitunpair it
  (11,17)

Clearly, this defines a bijection $f : Primes \times Primes \to Primes$ that is tempting to compare with the product of two primes. Figs. 29 and 30 show the surfaces generated by products and multiset pairings of primes. While both commutative operations are reversible and likely to be asymptotically equivalent in terms of information density, one can notice the much smoother transition in the case of lossless multiplication.

Fig. 29: Lossless multiplication of primes

Fig. 30: Lossless multiset pairing of primes

We have seen that recursive application of the unpairing function bitunpair provided an isomorphism between natural numbers and BDDs. Given an unpairing function $u : Nat \to Nat \times Nat$ and a predicate p(n) over the set of natural numbers, it makes sense to investigate subsets of Nat such that if p holds for n then it also holds after applying the unpairing function u to n. More interestingly, one can look at subsets for which this property holds recursively. Assuming a prime recognizer is_prime and a generator primes for the stream of prime numbers (see Appendix), we can define:

  hyper_primes u = [n | n <- primes, all_are_primes (uparts u n)] where
    all_are_primes ns = and (map is_prime ns)

  uparts u = sort . nub . tail . (split_with u) where
    split_with _ 0 = []
    split_with _ 1 = []
    split_with u n = n:(split_with u n0) ++ (split_with u n1) where
      (n0,n1) = u n

working as follows:

  *ISO> take 20 (hyper_primes bitunpair)
  [2,3,5,7,11,13,17,19,23,29,31,43,47,59,71,79,83,89,103,139]
  *ISO> take 20 (hyper_primes pepis_unpair)
  [2,3,5,7,11,13,19,23,29,31,43,53,59,107,127,173,223,251,311,347]

This leads to the following conjectures, in increasing order of generality:

Conjecture 1 The sets generated by (hyper_primes bitunpair) and (hyper_primes pepis_unpair) are infinite.

Conjecture 2 If
  1. u is a bijection $u : Nat \to Nat \times Nat$ such that if $n > 1$ and $u\ n = (n_0, n_1)$ then $n_0 < n$ and $n_1 < n$, and
  2. p is a predicate on Nat such that $P = \{n : p(n)\}$ is infinite,
then the set $P \cap \{n : \mathit{uparts}\ u\ n \subseteq P\}$ is also infinite.

Figure 31 shows the complete unpairing graph for two hyper-primes obtained with bitunpair.

Fig. 31: mset unpair hyper-primes: 1783 and 2109167

It is interesting to compare the action of pairing functions on natural numbers, and on primes and hyper-primes in particular, with products. Clearly products are not reversible, except when numbers are primes, while pairing functions are always reversible. To factor in the fact that products commute while pairing functions do not, we have considered $2xy$ instead of $xy$. Figures 32 and 33 show this comparison.

Fig. 32: Pairing of primes vs. 2xy

Fig. 33: Pairing of hyper-primes vs. 2xy

Hyper-primes and Fermat primes

One could expect to model more closely the behavior of primes and products by focusing on commutative functions like the multiset pairing function mset_pair:

  *ISO> take 16 (hyper_primes mset_unpair)
  [2,3,5,13,17,113,173,257,10753,17489,34897,34961,43633,43777,65537,142781101]

We remind that:

Definition 3 A Fermat prime is a prime of the form $2^{2^n}+1$ with $n \geq 0$.

Fig. 34 shows a hyper-prime that is also a Fermat prime and a hyper-prime that is not a Fermat prime.

Fig. 34: mset unpair hyper-primes: Fermat prime and Non-Fermat prime

This time a more interesting conjecture emerges. We can now state that:

Conjecture 3 All Fermat primes are mset_unpair induced hyper-primes.

We will just observe that this would follow from the widely believed conjecture that the only Fermat primes are [3,5,17,257,65537], as these 5 primes are indeed on our list of mset_unpair hyper-primes. In the event of the alternative, we will now state:

Proposition 12 If there are Fermat primes other than [3,5,17,257,65537] then there are Fermat primes that are not mset_unpair hyper-primes.

To prove Prop. 12 we need a few additional results. First, the following known fact, implying that we only need to prove that there are primes of the form $2^{2^n}+1$ that are not hyper-primes.

Lemma 1 If $n > 0$ and $2^n+1$ is prime then n is a power of 2.

It is easy to prove, from the definition of mset_pair, that:

Lemma 2
  $\mathit{mset\_pair}(2^{2^n}+1,\ 2^{2^n}+1) \equiv 2^{2^{n+1}}+1$   (23)

Indeed, from the identity (19) we obtain

  $\mathit{mset\_pair}(a, a) \equiv \mathit{bitpair}(a, 0)$   (24)

and then observe that from (14) it follows that

  $\mathit{bitpair}(2^{2^n}+1,\ 0) \equiv 2^{2^{n+1}}+1$   (25)

We can now prove Prop. 12. If $2^{2^{n+1}}+1$ is a Fermat prime that is also a hyper-prime, then $2^{2^n}+1$ would also be a Fermat prime that is a hyper-prime. This would form a descending sequence of consecutive Fermat primes, a contradiction, given that it has been proven (by Leonhard Euler in 1732) that, for instance, $2^{32}+1 = 641 \cdot 6{,}700{,}417$ is not a prime.
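Identity (25) and Euler's factorization can both be checked numerically. The sketch below relies on an assumption: `bitpair` is taken to interleave bits with the first argument on even positions and the second on odd positions, which is consistent with (25) and with the `ppair bitpair (11,17)` example but is a reconstruction rather than the paper's definition. The names `fermat`, `identity25Holds` and `eulerFactorization` are hypothetical.

```haskell
-- Assumed bitpair: bits of x go to even positions, bits of y to odd ones.
bitpair :: (Integer, Integer) -> Integer
bitpair (0, 0) = 0
bitpair (x, y) =
  (x `mod` 2) + 2 * (y `mod` 2) + 4 * bitpair (x `div` 2, y `div` 2)

-- The n-th Fermat number 2^(2^n) + 1 (not necessarily prime).
fermat :: Integer -> Integer
fermat n = 2 ^ (2 ^ n :: Integer) + 1

-- Identity (25): pairing a Fermat number with 0 gives the next one.
identity25Holds :: Integer -> Bool
identity25Holds n = bitpair (fermat n, 0) == fermat (n + 1)

-- Euler's 1732 factorization of the Fermat number for n = 5.
eulerFactorization :: Bool
eulerFactorization = fermat 5 == 641 * 6700417
```

Evaluating `all identity25Holds [0 .. 5]` and `eulerFactorization` both yield `True`. The first holds because spreading the two 1 bits of $2^{2^n}+1$ over even positions doubles the exponent of the leading bit.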
22.5 A surprising "free algorithm": strange sort

A simple isomorphism like nat_set can exhibit interesting properties as a building block of more intricate mappings like Ackermann's encoding, but let's also note a (surprising to us) "free algorithm": sorting a list of distinct elements without explicit use of comparison operations.

  strange_sort = (from nat_set) . (to nat_set)

  *ISO> strange_sort [2,9,3,1,5,0,7,4,8,6]
  [0,1,2,3,4,5,6,7,8,9]

This algorithm emerges as a consequence of the commutativity of addition and the uniqueness of the decomposition of a natural number as a sum of powers of 2. The cognoscenti might notice that such surprises are not totally unexpected in the world of functional programming. In a different context, they go back as early as Wadler's Free Theorems [36]. In a similar way, to sort sequences with repeated elements one can write:

  strange_sort' = (to mset) . (from mset)

  strange_sort'' = (as mset nat) . (as nat mset)

  *ISO> strange_sort' [2,4,1,1,0,3,17,1,4]
  [0,1,1,1,2,3,4,4,17]
  *ISO> strange_sort'' [2,4,1,1,0,3,17,1,4]
  [0,1,1,1,2,3,4,4,17]

22.6 Circuit Minimization

Let us consider the classic problem of synthesizing a half adder, composed of an XOR (^) and an AND (*) function. We can combine the two functions with an if-then-else with selector variable A to obtain ITE(A,B^C,B*C), with the following truth table:

  [0,0,0]:0
  [0,0,1]:0
  [0,1,0]:0
  [0,1,1]:1
  [1,0,0]:0
  [1,0,1]:1
  [1,1,0]:1
  [1,1,1]:0

Note that this 3 argument single output function (encoded as the natural number 22 by reading its value column in binary) fuses the two operations, with the upper half of the truth table representing the AND and the lower half representing the XOR.
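The encoding of this truth table as the number 22 can be recomputed directly. The following sketch uses hypothetical names (`ite`, `valueColumn`, `ttNumber`) and assumes that the value column is read top to bottom as a binary numeral with the first row as the most significant bit.

```haskell
import Data.Bits (xor, (.&.))

-- ITE(A, B^C, B*C): the selector A picks XOR when 1 and AND when 0.
ite :: Int -> Int -> Int -> Int
ite a b c = if a == 1 then b `xor` c else b .&. c

-- Value column of the truth table, rows ordered [0,0,0] .. [1,1,1].
valueColumn :: [Int]
valueColumn = [ite a b c | a <- [0, 1], b <- [0, 1], c <- [0, 1]]

-- Read a column of bits as a binary numeral, first row most significant.
ttNumber :: [Int] -> Integer
ttNumber = foldl (\acc v -> 2 * acc + fromIntegral v) 0
```

Here `valueColumn` is `[0,0,0,1,0,1,1,0]` and `ttNumber valueColumn` is 22, the number associated with this function above.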
When running to_min_bdd on this function we obtain:

  *ISO> from_base 2 [0,1,1,0, 1,0,0,0]
  22
  *ISO> to_min_bdd 3 22
  BDD 3 (D 0 (D 1 (D 2 B0 B1) (D 2 B1 B0)) (D 1 (D 2 B1 B0) B0))

22.7 Other Applications

A fairly large number of useful algorithms in fields ranging from data compression, coding theory and cryptography to compilers, circuit design and computational complexity involve bijective functions between heterogeneous data types. Their systematic encapsulation in a generic API that coexists well with strong typing can bring significant simplifications to various software modules, with the added benefits of reliability and easier maintenance. In a Genetic Programming context [37], the use of isomorphisms between bitvectors/natural numbers on one side, and trees/graphs representing HFSs and HFFs on the other side, looks like a promising phenotype-genotype connection. Mutations and crossovers in a data type close to the problem domain are transparently mapped to numerical domains where evaluation functions can be computed easily. In particular, "biologically proven" encodings like DNA strands are likely to provide interesting genotype implementations. In the context of Software Transactional Memory implementations (like Haskell's STM [38]), encodings through isomorphisms are subject to efficient shortcuts, as undo operations in case of transaction failure can be performed by applying inverse transformations without the need to save the intermediate chain of data structures involved.

23 Related work

The closest references on encapsulating bijections as a Haskell data type are [39] and Conal Elliott's composable bijections module [40], where, in a more complex setting, Arrows [41] are used as the underlying abstractions.
While our Iso data type is similar to the Bij data type in [40] and the BiArrow concept of [39], the techniques for using such isomorphisms as building blocks of an embedded composition language centered around encodings as Natural Numbers are new. As the domains between which we define our isomorphisms can be organized as categories, it is likely that some of our constructs would benefit from natural transformation [42] and n-category formulations [43].

Ranking functions can be traced back to Gödel numberings [5, 6] associated to formulae. Together with their inverse unranking functions they are also used in combinatorial generation algorithms [44, 27, 45, 46]. However, the generic view of such transformations as hylomorphisms obtained compositionally from simpler isomorphisms, as described in this paper, is new.

Natural Number encodings of Hereditarily Finite Sets have triggered the interest of researchers in fields ranging from Axiomatic Set Theory and Foundations of Logic to Complexity Theory and Combinatorics [47–52]. Computational and Data Representation aspects of Finite Set Theory have been described in logic programming and theorem proving contexts in [12, 53].

Pairing functions have been used in work on decision problems as early as [20, 23]. A typical use in the foundations of mathematics is [54]. An extensive study of various pairing functions and their computational properties is presented in [55].

Various mappings from natural numbers to rational numbers are described in [56], also in a functional programming framework.

We have learned from Knuth's recent work on combinatorial algorithms [27] the techniques related to bitvector encodings of projection functions and boolean operations, and about BDDs and reduced ordered BDDs from Bryant's seminal paper on the topic [26]. However, the connection with pairing/unpairing functions and the equivalence results of subsection 10.4 are new.
The concepts of hereditarily finite functions and permutations, as well as their encodings, are likely to be new, given that our sustained search efforts have not so far led to anything similar. Some other techniques, ranging from factoradics to cons-lists and functional binary numbers to DNA encodings and dyadic rationals, are for sure part of the scientific commons. For those, our focus was to express them as elegantly as possible in a uniform framework. In these cases as well, most of the time it was faster to "just do it", by implementing them from scratch in a functional programming framework, rather than adapting procedural algorithms found elsewhere.

24 Conclusion

We have shown the expressiveness of Haskell as a metalanguage for executable mathematics, by describing encodings for functions and finite sets in a uniform framework as data type isomorphisms with a groupoid structure. Haskell's higher order functions and recursion patterns have helped the design of an embedded data transformation language. Using higher order combinators, a simplified QuickCheck style random testing mechanism has been implemented as an empirical correctness test. The framework has been extended with hylomorphisms providing generic mechanisms for encoding Hereditarily Finite Sets and Hereditarily Finite Functions. In the process, a few surprising "free algorithms" have emerged, as well as a generalization of Ackermann's encoding to Hereditarily Finite Sets with Urelements. We plan to explore in depth, in the near future, some of the results that are likely to be of interest in fields ranging from combinatorics and boolean logic to data compression and arbitrary precision numerical computations.

References

1. Tarau, P.: Isomorphisms, Hylomorphisms and Hereditarily Finite Data Types in Haskell. In: Proceedings of ACM SAC'09, Honolulu, Hawaii, ACM (March 2009)
2. Lakoff, G., Johnson, M.: Metaphors We Live By. University of Chicago Press, Chicago, IL, USA (1980)
3. Cook, S.: Theories for complexity classes and their propositional translations. In: Complexity of computations and proofs. (2004) 1–36
4. Cook, S., Urquhart, A.: Functional interpretations of feasibly constructive arithmetic. Annals of Pure and Applied Logic 63 (1993) 103–200
5. Gödel, K.: Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik 38 (1931) 173–198
6. Hartmanis, J., Baker, T.P.: On simple Goedel numberings and translations. In Loeckx, J., ed.: ICALP. Volume 14 of Lecture Notes in Computer Science, Springer (1974) 301–316
7. Claessen, K., Hughes, J.: Testing monadic code with QuickCheck. SIGPLAN Notices 37 (12) (2002) 47–59
8. Singh, D., Ibrahim, A.M., Yohanna, T., Singh, J.N.: An overview of the applications of multisets. Novi Sad J. Math 52 (2) (2007) 73–92
9. Hutton, G.: A Tutorial on the Universality and Expressiveness of Fold. J. Funct. Program. 9 (4) (1999) 355–372
10. Meijer, E., Hutton, G.: Bananas in Space: Extending Fold and Unfold to Exponential Types. In: FPCA. (1995) 324–333
11. Ackermann, W.F.: Die Widerspruchsfreiheit der allgemeinen Mengenlehre. Mathematische Annalen (114) (1937) 305–315
12. Piazza, C., Policriti, A.: Ackermann Encoding, Bisimulations, and OBDDs. TPLP 4 (5-6) (2004) 695–718
13. Göbel, F.: On a 1-1-correspondence between rooted trees and natural numbers. J. Comb. Theory, Ser. B 29 (1) (1980) 141–143
14. Knuth, D.E.: The art of computer programming, volume 2 (3rd ed.): seminumerical algorithms. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA (1997)
15. Mantaci, R., Rakotondrajao, F.: A permutations representation that knows what "eulerian" means. Discrete Mathematics & Theoretical Computer Science 4 (2) (2001) 101–108
16. Pepis, J.: Ein Verfahren der mathematischen Logik. The Journal of Symbolic Logic 3 (2) (jun 1938) 61–76
17. Kalmar, L.: On the reduction of the decision problem. First paper. Ackermann prefix, a single binary predicate. The Journal of Symbolic Logic 4 (1) (mar 1939) 1–9
18. Kalmar, L., Suranyi, J.: On the reduction of the decision problem. The Journal of Symbolic Logic 12 (3) (sep 1947) 65–73
19. Kalmar, L., Suranyi, J.: On the reduction of the decision problem: Third paper. Pepis prefix, a single binary predicate. The Journal of Symbolic Logic 15 (3) (sep 1950) 161–173
20. Robinson, J.: General recursive functions. Proceedings of the American Mathematical Society 1 (6) (dec 1950) 703–718
21. Robinson, J.: A note on primitive recursive functions. Proceedings of the American Mathematical Society 6 (4) (aug 1955) 667–670
22. Robinson, J.: Recursive functions of one variable. Proceedings of the American Mathematical Society 19 (4) (aug 1968) 815–820
23. Robinson, J.: Finite generation of recursively enumerable sets. Proceedings of the American Mathematical Society 19 (6) (dec 1968) 1480–1486
24. Robinson, J.: An introduction to hyperarithmetical functions. The Journal of Symbolic Logic 32 (3) (sep 1967) 325–342
25. Pigeon, S.: Contributions à la compression de données. Ph.D. thesis, Université de Montréal, Montréal (2001)
26. Bryant, R.E.: Graph-based algorithms for boolean function manipulation. IEEE Transactions on Computers 35 (8) (1986) 677–691
27. Knuth, D.: The Art of Computer Programming, Volume 4, draft (2006) http://www-cs-faculty.stanford.edu/~knuth/taocp.html
28. Shannon, C.E.: Claude Elwood Shannon: collected papers. IEEE Press, Piscataway, NJ, USA (1993)
29. Fujita, M., McGeer, P.C., Yang, J.C.Y.: Multi-terminal binary decision diagrams: An efficient data structure for matrix representation. Formal Methods in System Design 10 (2/3) (1997) 149–169
30. Ciesinski, F., Baier, C., Groesser, M., Parker, D.: Generating compact MTBDD-representations from Probmela specifications. In: Proc. 15th International SPIN Workshop on Model Checking of Software (SPIN'08). (2008)
31. Bucciarelli, A., Salibra, A.: The sensible graph theories of lambda calculus. In: LICS '04: Proceedings of the 19th Annual IEEE Symposium on Logic in Computer Science, Washington, DC, USA, IEEE Computer Society (2004) 276–285
32. Berline, C.: Graph models of λ-calculus at work, and variations. Mathematical Structures in Comp. Sci. 16 (2) (2006) 185–221
33. Li, Z.: Algebraic properties of DNA operations. Biosystems 52 (October 1999) 55–61(7)
34. Hinze, T., Sturm, M.: A universal functional approach to DNA computing and its experimental practicability. In: Proceedings 6th DIMACS Workshop on DNA Based Computers, held at the University of Leiden, Leiden, The Netherlands, 13 - 17. (2000) 257–266
35. Pippenger, N.: The average amount of information lost in multiplication. IEEE Transactions on Information Theory 51 (2) (2005) 684–687
36. Wadler, P.: Theorems for free! In: FPCA '89: Proceedings of the fourth international conference on Functional programming languages and computer architecture, New York, NY, USA, ACM (1989) 347–359
37. Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA (1992)
38. Harris, T., Marlow, S., Jones, S.L.P., Herlihy, M.: Composable memory transactions. Commun. ACM 51 (8) (2008) 91–100
39. Alimarine, A., Smetsers, S., van Weelden, A., van Eekelen, M., Plasmeijer, R.: There and back again: arrows for invertible programming. In: Haskell '05: Proceedings of the 2005 ACM SIGPLAN workshop on Haskell, New York, NY, USA, ACM Press (2005) 86–97
40. Elliott, C.: Data.Bijections Haskell Module. http://haskell.org/haskellwiki/TypeCompose
41. Hughes, J.: Generalizing Monads to Arrows. Science of Computer Programming 37 (May 2000) 67–111
42. Mac Lane, S.: Categories for the Working Mathematician. Springer-Verlag, New York, NY, USA (1998)
43. Baez, J.C.: An introduction to n-categories. In: 7th Conference on Category Theory and Computer Science, Springer-Verlag (1997) 1–33
44. Martinez, C., Molinero, X.: Generic algorithms for the generation of combinatorial objects. In Rovan, B., Vojtas, P., eds.: MFCS. Volume 2747 of Lecture Notes in Computer Science, Springer (2003) 572–581
45. Ruskey, F., Proskurowski, A.: Generating binary trees by transpositions. J. Algorithms 11 (1990) 68–84
46. Myrvold, W., Ruskey, F.: Ranking and unranking permutations in linear time. Information Processing Letters 79 (2001) 281–284
47. Takahashi, M.o.: A Foundation of Finite Mathematics. Publ. Res. Inst. Math. Sci. 12 (3) (1976) 577–708
48. Kaye, R., Wong, T.L.: On Interpretations of Arithmetic and Set Theory. Notre Dame J. Formal Logic 48 (4) (2007) 497–510
49. Abian, A., Lamacchia, S.: On the consistency and independence of some set-theoretical constructs. Notre Dame Journal of Formal Logic XIX (1) (1978) 155–158
50. Avigad, J.: The Combinatorics of Propositional Provability. In: ASL Winter Meeting, San Diego (January 1997)
51. Kirby, L.: Addition and multiplication of sets. Math. Log. Q. 53 (1) (2007) 52–65
52. Leontjev, A., Sazonov, V.Y.: Capturing LOGSPACE over Hereditarily-Finite Sets. In Schewe, K.D., Thalheim, B., eds.: FoIKS. Volume 1762 of Lecture Notes in Computer Science, Springer (2000) 156–175
53. Paulson, L.C.: A Concrete Final Coalgebra Theorem for ZF Set Theory. In Dybjer, P., Nordström, B., Smith, J.M., eds.: TYPES. Volume 996 of Lecture Notes in Computer Science, Springer (1994) 120–139
54. Cégielski, P., Richard, D.: Decidability of the theory of the natural integers with the Cantor pairing function and the successor. Theor. Comput. Sci. 257 (1-2) (2001) 51–77
55. Rosenberg, A.L.: Efficient pairing functions - and why you should care. International Journal of Foundations of Computer Science 14 (1) (2003) 3–17
56. Gibbons, J., Lester, D., Bird, R.: Enumerating the rationals. Journal of Functional Programming 16 (4) (2006)

Appendix

The code in the paper is organized in a module with the following dependencies:

    module ISO where
    import Data.List
    import Data.Bits
    import Data.Graph
    import Data.Graph.Inductive
    import Graphics.Gnuplot.Simple
    import Data.Char
    import Ratio
    import Random

Bit crunching functions

The function bitcount computes the number of bits needed to represent an integer and max_bitcount computes the maximum bitcount for a list of integers.

    bitcount n = head [x | x <- [1..], (2^x) > n]
    max_bitcount ns = foldl max 0 (map bitcount ns)

The following function converts a number to binary, padded with 0s, up to maxbits.

    to_maxbits maxbits n = bs ++ (genericTake (maxbits-l)) (repeat 0) where
      bs = to_base 2 n
      l = genericLength bs

Primes

The following code implements a factoring function (to_primes), a primality test (is_prime) and a generator for the infinite stream of prime numbers (primes).

    primes = 2 : filter is_prime [3,5..]
    is_prime p = [p] == to_primes p
    to_primes n | n > 1 = to_factors n p ps where (p:ps) = primes
    to_factors n p ps | p*p > n = [n]
    to_factors n p ps | 0 == n `mod` p = p : to_factors (n `div` p) p ps
    to_factors n p ps@(hd:tl) = to_factors n hd tl

We will briefly describe here the functions used to visualize various data types with the help of Haskell libraries providing interfaces to graphviz and gnuplot.

Multiset Operations

The following functions provide multiset analogues of the usual set operations, under the assumption that multisets are represented as non-decreasing sequences.
    msetInter [] _ = []
    msetInter _ [] = []
    msetInter (x:xs) (y:ys) | x == y = (x:zs) where zs = msetInter xs ys
    msetInter (x:xs) (y:ys) | x < y = msetInter xs (y:ys)
    msetInter (x:xs) (y:ys) | x > y = msetInter (x:xs) ys

    msetDif [] _ = []
    msetDif xs [] = xs
    msetDif (x:xs) (y:ys) | x == y = zs where zs = msetDif xs ys
    msetDif (x:xs) (y:ys) | x < y = (x:zs) where zs = msetDif xs (y:ys)
    msetDif (x:xs) (y:ys) | x > y = zs where zs = msetDif (x:xs) ys

    msetSymDif xs ys = sort ((msetDif xs ys) ++ (msetDif ys xs))
    msetUnion xs ys = sort ((msetDif xs ys) ++ (msetInter xs ys) ++ (msetDif ys xs))

Building a multigraph from a natural number using a function associating to each natural number a sequence or set of natural numbers.

    fun2g ns = nat2fgs nat2fun ns
    set2g ns = nat2sgs nat2set ns
    perm2g ns = nat2fgs nat2perm ns
    pmset2g ns = nat2fgs nat2pmset ns
    bmset2g ns = nat2fgs nat2bmset ns

    nat2fg f n = nat2gx fun_edge f nat2pftree n :: Gr Nat Int
    nat2fgs f ns = nat2gsx fun_edge f nat2pftree ns :: Gr Nat Int
    nat2sg f n = nat2gx set_edge f nat2pftree n :: Gr Nat ()
    nat2sgs f ns = nat2gsx set_edge f nat2pftree ns :: Gr Nat ()

    set_edge xs (a,b,i) = (lookUp a xs,lookUp b xs,())
    fun_edge xs (a,b,i) = (lookUp a xs,lookUp b xs,i)

    nat2gx e f g n = mkGraph vs (map (e xs) es) where
      es = g f n
      (xs,vs) = labeledVertices es

    nat2gsx e f g ns = mkGraph vs (map (e xs) es) where
      es = nub (concatMap (g f) ns)
      (xs,vs) = labeledVertices es

    labeledVertices es = (xs,vs) where
      xs = fvertices es
      is = [0..(length xs)-1]
      vs = zip is xs

    nat2pftree f n = nub (nat2pftreex f (n,n,0))

    nat2pftreex f (_,n,_) = ps ++ (concatMap (nat2pftreex f) ps) where
      ps = nat2pfun f (n,n,0)

    nat2pfun _ (_,0,_) = []
    nat2pfun f (_,n,_) | n > 0 = ps where
      ps = zipWith (\x i -> (n,x,i)) (f n) [0..]

    fvertices ps = (sort . nub) (concatMap f ps) where
      f (a,b,_) = [a,b]

    lookUp n ns = i where Just i = elemIndex n ns

Building Inductive Graphs from Lists of Pairs

    pairs2gr :: [(Nat,Nat)] -> Gr Nat ()
    pairs2gr ps = mkGraph lvs les where
      vs = to_vertices ps
      lvs = zip [0..] vs
      es = to_edges vs ps
      les = map f es
      f (x,y) = (x,y,())

    to_vertices es = sort $ nub $ concat [[fst p,snd p] | p <- es]

    to_edges vs ps = map (f vs) ps where
      f vs (x,y) = (lookUp x vs,lookUp y vs)

Generating labeled edge triplets by recursing over unpairing functions

The following function represents a number as a set of triplets expressing branches of decomposition with an unpairing function f, for instance, in the case of BDDs, with function bitunpair.

    unpairing_edges f tt = nub (h f tt) where
      h _ tt | tt < 2 = []
      h f n = ys where
        (n0,n1) = f n
        ys = (n,n0,0):(n,n1,1): (h f n0) ++ (h f n1)

The function works as follows:

    *ISO> unpairing_edges bitunpair 42
    [(42,0,0),(42,7,1),(7,3,0),(7,1,1),(3,1,0),(3,1,1)]
    *JFISO> unpairing_edges pepis_unpair 42
    [(42,0,0),(42,21,1),(21,1,0),(21,5,1),(5,1,0),(5,1,1)]

Generating labeled edge triplets by recursing over untupling functions

The following function represents a number as a set of triplets expressing branches of decomposition with an untupling function f k, for instance to_tuple k.
    untupling_edges f k tt = nub (h f k tt) where
      h _ _ tt | tt < 2 = []
      h f k n = ys where
        ns = f k n
        ys = (zip3 (repeat n) ns [0..]) ++ (concatMap (h f k) ns)

The function works as follows:

    *ISO> untupling_edges to_tuple 3 2008
    [(2008,14,0),(2008,14,1),(2008,4,2),(14,2,0),(14,1,1),(14,1,2),
     (2,0,0),(2,1,1),(2,0,2),(4,0,0),(4,0,1),(4,1,2)]

Building Inductive Graphs from Unpairing and Untupling Trees

We can now turn a BDD, as well as any other tree generated by an unpairing function, into an inductive graph, as follows:

    to_unpair_graph f tt = nat2fun_graph (unpairing_edges f) tt
    to_untuple_graph f k tt = nat2fun_graph (untupling_edges f k) tt

    nat2fun_graph f n = mkGraph vs fs :: Gr Nat Int where
      es = f n
      (xs,vs) = labeledVertices es
      fs = map (fun_edge xs) es

The functions work as follows:

    *ISO> to_unpair_graph bitunpair 42
    0:0->[]
    1:1->[]
    2:3->[(1,1),(0,1)]
    3:7->[(0,2),(1,1)]
    4:42->[(1,3),(0,0)]
    *ISO> to_unpair_graph pepis_unpair 42
    0:0->[]
    1:1->[]
    2:5->[(1,1),(0,1)]
    3:21->[(1,2),(0,1)]
    4:42->[(1,3),(0,0)]
    *ISO> to_untuple_graph to_tuple 3 2008
    0:0->[]
    1:1->[]
    2:2->[(2,0),(1,1),(0,0)]
    3:4->[(2,1),(1,0),(0,0)]
    4:14->[(0,2),(2,1),(1,1)]
    5:2008->[(2,3),(1,4),(0,4)]

Visualization with graphviz

    gviz g = writeFile "iso.gv"
      ((graphviz g "" (0.0,0.0) (2,2) Portrait) ++ "\n")

    funviz f n = gviz (nat2fg f n)
    setviz f n = gviz (nat2sg f n)
    pviz t n = gviz (pairs2gr (as t nat n))
    uviz f tt = gviz (to_unpair_graph f tt)
    tviz f k tt = gviz (to_untuple_graph f k tt)

Plotting with gnuplot

    plot3d f xs ys = plotFunc3d [Title ""] [] xs ys f
    cplot3d f = plot3d (curry f)

    plotpairs m | m <= 2^8 = cplot3d bitpair ls ls where ls = [0..m-1]

    plotdyadics m = plotList [Title "Dyadics"]
      (map (fromRational . (as dyadic nat)) [0..m-1])

    sizes_to m t = map (size_as t) [0..m-1]

    plot_hf m = plotLists [Title "Bit, BDD, HFF, HFS, and HFP sizes"]
      ( [bits_to m,bsizes_to m] ++ (map (sizes_to m) [hff,hfs,hfm,hfp]) )

    plot_best m = plotLists [Title "Bit, BDD and HFF and HFF' sizes"]
      ( [bits_to m,bsizes_to m] ++ (map (sizes_to m) [hff,hff']) )

    plot_worse m = plotLists [Title "HFM, HFS and HFP sizes"]
      ( (map (sizes_to m) [hfm,hfs,hfp]) )

    plot hf m = plotx [hf] m

    plotx hfx m = plotLists [Title "HF tree size"]
      ( (map (sizes_to (2^m-1)) hfx) )

    -- plots pairs
    pplot f m = plotPath [] (map (to_ints . f) [0..2^m-1])
    zplot f m = plotPath [] (map (to_ints . f) [-(2^m)..2^m-1])
    to_ints (i,j) = (fromIntegral i,fromIntegral j)
    diplot n = plotPath [] (map to_ints (as digraph nat n))

    bsize_of n = robdd_size (as rbdd nat n)
    bsizes_to m = map bsize_of [0..m-1]
    bits_to m = map s [0..m-1] where s n = genericLength (as bits nat n)

    plot_linear_sparseness m = plotLists [Title "Linear Sparseness"]
      [(map (linear_sparseness fun) [0..m-1]),
       (map (linear_sparseness pmset) [0..m-1]),
       (map (linear_sparseness mset) [0..m-1]),
       (map (linear_sparseness set) [0..m-1]),
       (map (linear_sparseness perm) [0..m-1])]

    plot_sparseness m = plotLists [Title "Recursive Sparseness"]
      [(map (sparseness hff_pars) [0..m-1]),
       (map (sparseness hfpm_pars) [0..m-1]),
       (map (sparseness hfm_pars) [0..m-1]),
       (map (sparseness hfs_pars) [0..m-1]),
       (map (sparseness hfp_pars) [0..m-1])]

    plot_sparseness1 m = plotLists [Title "Recursive Sequence vs. Multiset Sparseness"]
      [ (map (sparseness hff_pars) [0..m-1]),
        (map (sparseness hfpm_pars) [0..m-1]) ]

    plot_sparseness2 m = plotLists [Title "Recursive Multiset Sparseness"]
      [ (map (sparseness bhfm_pars) [0..m-1]),
        (map (sparseness hfm_pars) [0..m-1]) ]

    plot_sparseness3 m = plotLists [Title "Recursive Multiset Sparseness"]
      [ (map (sparseness hff_pars) [0..m-1]),
        (map (sparseness hff_pars') [0..m-1]) ]

    plot_sparseness4 m = plotLists
      [Title "Recursive Multiset vs Multiset with Primes Sparseness"]
      [ (map (sparseness hfm_pars) [0..m-1]),
        (map (sparseness hfpm_pars) [0..m-1]) ]

    plot_sparseness5 m = plotLists [Title "Recursive Multisets vs. Sequences"]
      [ (map (sparseness hff_pars) [0..m-1]),
        (map (sparseness hfm_pars) [0..m-1]) ]

    plot_selfdels m = plotLists
      [Title "Self-delimiting codes: Undelimited vs. Elias vs. HFF"]
      [(map (genericLength . (as bits nat)) [0..m-1]),
       (map (genericLength . (as elias nat)) [0..m-1]),
       (map (genericLength . (as hff_pars nat)) [0..m-1])]

    plot_pairs_prods m = plotLists [Title "Pairs vs. products"] [ms,prods] where
      ms = [1..m]
      pairs = map bitunpair ms
      prods = map prod pairs where prod (x,y) = 2*x*y

    plot_lifted_pairs m = plotLists [Title "Lifted pairs"] [us0,us1] where
      ms = [0..m-1]
      pairs = map bitunpair ms
      us0 = map fst pairs
      us1 = map snd pairs

    plot_lifted_pairs1 m = plotLists [Title "Lifted pairs and products"]
      [ps,s0,s1,xys] where
      ms = [0..m-1]
      pairs = map bitunpair ms
      us0 = map fst pairs
      us1 = map snd pairs
      ps = zipWith (*) us0 us1
      s0 = map (^2) us0
      s1 = map (^2) us1
      xys = map f pairs where f (x,y) = x*y

    plot_primes_prods m = plotLists [Title "Primes vs. products"] [ps,prods] where
      ms = [0..m]
      ps = genericTake m primes
      pairs = map bitunpair ps
      prods = map prod pairs where prod (x,y) = 2*x*y

    plot_hypers_prods m = plotLists [Title "Hyper-primes vs. products"] [ps,prods] where
      ms = [0..m]
      ps = genericTake m (hyper_primes bitunpair)
      pairs = map bitunpair ps
      prods = map prod pairs where prod (x,y) = 2*x*y

Generated Figures

    f1 = gviz (nat2sg nat2set 2008)
    f2 = gviz (nat2fg nat2fun 2008)
    f2a = gviz (nat2fg nat2mset 2008)
    f3 = gviz (nat2fg nat2perm 2008)
    f4 = gviz (nat2fg nat2perm 2009)
    f5 = pviz digraph 2008
    f6 = plotpairs 64
    f7 = plotdyadics 256
    f8 = plot_best (2^6)
    f9 = plot_worse (2^10)
    f10 = plot_hf (2^8)
    f11a = plot_linear_sparseness (2^7)
    f11 = plot_sparseness (2^8)
    f11b = plot_sparseness1 (2^8)
    f11c = plot_sparseness2 (2^10)
    f12 = plot_sparseness (2^14)
    f13 = plot_sparseness (2^17)
    f14 = plot_selfdels (2^7)
    f15 = plotList [] (sparses_to (2^18))
    f16 = gviz (nat2fgs nat2fun [0..7])

    arp24 i = 468395662504823 + 205619*23*i
    arps24 = map arp24 [0..23]
    arp25 i = 6171054912832631 + 366384*23*i
    arps25 = map arp25 [0..24]

    f17 = gviz (fun2g arps24)
    f17a = gviz (fun2g arps25)
    f18 = gviz (fun2g [2^65+1,2^131+3])
    f18a = gviz (set2g [2^65+1,2^131+3])
    f19 = gviz (fun2g [0..7])
    f20 = gviz (pmset2g [0..7])
    f20a = gviz (bmset2g [0..7])
    f21 = gviz (set2g [0..7])
    f22 = gviz (perm2g [0..7])

    g1 tt = uviz bitunpair tt
    g2 tt = uviz pepis_unpair tt
    g2' tt = uviz pepis_unpair' tt
    g3 tt = uviz rpepis_unpair tt

    isofermat = uviz mset_unpair 65537
    isofermat1 = uviz mset_unpair 142781101
    isonfermat = uviz mset_unpair 34897
    isopairs = plot_pairs_prods 256
    isoprimes = plot_primes_prods 256
    isohypers = plot_hypers_prods 256
    isounpair1 = pplot bitunpair 10
    isounpair2 = pplot pepis_unpair 10
    isounpair3 = pplot mset_unpair 10
    isozunpair n = zplot zunpair n

    ms2pms n = as nat pmset (as mset nat n)
    pms2ms n = as nat mset (as pmset nat n)

    kms2pms 0 n = n
    kms2pms k n = ms2pms (kms2pms (k-1) n)

    kpms2ms 0 n = n
    kpms2ms k n = pms2ms (kpms2ms (k-1) n)

    lms k m = [x | x <- [0..2^m-1], kms2pms k x < kpms2ms k x]
    xms k m = [x | x <- [0..2^m-1], kms2pms k x < x]
    eqms k m = [x | x <- [0..2^m-1], kms2pms k x == x]
    xpms k m = [x | x <- [0..2^m-1], kpms2ms k x < x]
    eqpms k m = [x | x <- [0..2^m-1], kpms2ms k x == x]

    qms k m = [(toRational (kpms2ms k x)) - (toRational (kms2pms k x)) |
      x <- [1..2^m-1]]

    q1 k m = plotList [] (qms k m)

    q2 k m = plotLists [] [map (kms2pms k) xs,map (kpms2ms k) xs]
      where xs = [0..2^m-1]

    mult_vs_pairing p1 p2 = (p1*p2) % (ppair bitpair (p1,p2))
    mult_vs_mset_pairing p1 p2 = (p1*p2) % (ppair mset_pair (p1,p2))

    q3 n = plotFunc3d [Title "Prime Multiplication vs. Prime Pairing"] []
      ps ps mult_vs_pairing where ps = genericTake n primes

    q4 n = plotFunc3d [Title "Prime Multiplication vs. Prime Multiset Pairing"] []
      ps ps mult_vs_mset_pairing where ps = genericTake n primes

    n4a n = plotFunc3d [Title "Multiplication"] []
      ps ps (*) where ps = [0..2^n-1]

    n4b n = plotFunc3d [Title "Multiset Pairing"] []
      ps ps (curry mset_pair) where ps = [0..2^n-1]

    n4c n = plotFunc3d [Title "mprod operation"] []
      ps ps (mprod) where ps = [0..2^n-1]

    n4d n = plotFunc3d [Title "pmprod' operation"] []
      ps ps (pmprod') where ps = [0..2^n-1]

    n4e n = plotFunc3d [Title "mprod' operation"] []
      ps ps (mprod') where ps = [0..2^n-1]

    n4f n = plotFunc3d [Title "mprod' x y / x*y"] []
      ps ps (\x y -> (mprod' x y) % (x*y)) where ps = [1..2^n]

    expMexp k m = plotLists []
      [map (\x -> x^k) xs, map (\x -> mexp' x k) xs] where xs = [0..2^m]

    p4a n = plotFunc3d [Title "Prime Multiplication"] []
      ps ps (*) where ps = genericTake n primes

    p4b n = plotFunc3d [Title "Prime Multiset Pairing"] []
      xs ys (curry mset_pair) where
      ps = genericTake n primes
      xs = ps
      ys = ps

    p4c n = plotFunc3d [Title "mprod on primes"] []
      xs ys (mprod) where
      ps = genericTake n primes
      xs = ps
      ys = ps

    p4d n = plotFunc3d [Title "pmprod on primes"] []
      xs ys (pmprod) where
      ps = genericTake n primes
      xs = ps
      ys = ps

    p4f n = plotFunc3d [Title "mprod' x y / x*y"] []
      ps ps (\x y -> (mprod' x y) % (x*y)) where ps = genericTake n primes

    q4c n = plotFunc3d [Title "Prime Pairing"] []
      ps ps (curry bitpair) where ps = genericTake n primes

    q5 n = plotLists [Title "Prime Multiplication vs. Prime Pairing curves"]
      [prods,pairs] where
      us = map bitunpair [0..2^n-1]
      (xs,ys) = unzip us
      ps = primes
      xs' = map (from_pos_in ps) xs
      ys' = map (from_pos_in ps) ys
      prods = zipWith (*) xs' ys'
      us' = zip xs' ys'
      pairs = map (ppair mset_pair) us'

    plot_gauss_op f m = plotFunc3d title [] zs zs (curry f) where
      title = [Title "Gauss Integer operations through Pairing Functions"]
      zs = [-2^m..2^m-1]

    gsum m = plot_gauss_op gauss_sum m
    gdif m = plot_gauss_op gauss_dif m
    gprod m = plot_gauss_op gauss_prod m
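The unpairing_edges transcripts shown earlier in this appendix depend on bitunpair and pepis_unpair, which are defined in the body of the paper and not in this appendix. As a minimal sketch, assuming that bitunpair separates the bits found at even and odd positions and that pepis_unpair inverts z + 1 = 2^x (2y + 1), the following self-contained definitions appear to reproduce both transcripts. These are reconstructions for illustration only, not the paper's own code; the Nat synonym and the explicit type signatures are likewise assumptions.

```haskell
import Data.List (nub)

-- Assumption: naturals are modeled as non-negative Integers.
type Nat = Integer

-- Bits at even positions (position 0 is the least significant bit) form
-- the first component, bits at odd positions form the second one.
bitunpair :: Nat -> (Nat, Nat)
bitunpair 0 = (0, 0)
bitunpair n = (b + 2 * y, x)
  where
    (q, b) = n `divMod` 2
    (x, y) = bitunpair q

-- Inverse of bitunpair: interleave the bits of the two components.
bitpair :: (Nat, Nat) -> Nat
bitpair (0, 0) = 0
bitpair (x, y) = b + 2 * bitpair (y, q)
  where
    (q, b) = x `divMod` 2

-- Pepis-style pairing, assumed here to be z = 2^x * (2*y + 1) - 1.
pepis_pair :: (Nat, Nat) -> Nat
pepis_pair (x, y) = 2 ^ x * (2 * y + 1) - 1

-- Inverse: count the factors of 2 in z + 1, then halve the odd remainder.
pepis_unpair :: Nat -> (Nat, Nat)
pepis_unpair z = go 0 (z + 1)
  where
    go x m
      | even m    = go (x + 1) (m `div` 2)
      | otherwise = (x, m `div` 2)

-- Same recursion scheme as the appendix's unpairing_edges, with the
-- unpairing function captured instead of threaded through the helper.
unpairing_edges :: (Nat -> (Nat, Nat)) -> Nat -> [(Nat, Nat, Int)]
unpairing_edges f tt = nub (h tt)
  where
    h n
      | n < 2     = []
      | otherwise = (n, n0, 0) : (n, n1, 1) : (h n0 ++ h n1)
      where
        (n0, n1) = f n
```

With these definitions, unpairing_edges bitunpair 42 evaluates to [(42,0,0),(42,7,1),(7,3,0),(7,1,1),(3,1,0),(3,1,1)] and unpairing_edges pepis_unpair 42 to [(42,0,0),(42,21,1),(21,1,0),(21,5,1),(5,1,0),(5,1,1)], in agreement with the transcripts.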
