Ranking Catamorphisms and Unranking Anamorphisms on Hereditarily Finite Datatypes

- unpublished draft -

Paul Tarau
Department of Computer Science and Engineering
University of North Texas
tarau@cs.unt.edu

Abstract

Using specializations of unfold and fold on a generic tree data type we derive unranking and ranking functions providing natural number encodings for various Hereditarily Finite datatypes. In this context, we interpret unranking operations as instances of a generic anamorphism and ranking operations as instances of the corresponding catamorphism. Starting with Ackermann's Encoding from Hereditarily Finite Sets to Natural Numbers we define pairings and finite tuple encodings that provide building blocks for a theory of Hereditarily Finite Functions. The more difficult problem of ranking and unranking Hereditarily Finite Permutations is then tackled using Lehmer codes and factoradics.

The self-contained source code of the paper, as generated from a literate Haskell program, is available at http://logic.csci.unt.edu/tarau/research/2008/fFUN.zip.

Keywords: ranking/unranking, pairing/tupling functions, Ackermann encoding, hereditarily finite sets, hereditarily finite functions, permutations and factoradics, computational mathematics, Haskell data representations

1. Introduction

This paper is an exploration with functional programming tools of ranking and unranking problems on finite functions and bijections and their related hereditarily finite universes. The ranking problem for a family of combinatorial objects is finding a unique natural number associated to each object, called its rank. The inverse unranking problem consists of generating a unique combinatorial object associated to each natural number.

The paper is organized as follows: section 2 introduces a generic ranking/unranking framework parameterized by bijective transformers and terminating conditions based on urelements, and section 3 introduces Ackermann's encoding and its inverse as instances of the framework. After discussing some classic pairing functions, section 4 introduces new pairing/unpairing and tuple operations on natural numbers and uses them for encodings of finite functions (section 5), resulting in encodings for "Hereditarily Finite Functions" (section 6). Ranking/unranking of permutations and Hereditarily Finite Permutations as well as Lehmer codes and factoradics are covered in section 7. Sections 8 and 9 discuss related work, future work and conclusions.

The paper is part of a larger effort to cover in a declarative programming paradigm, arguably more elegantly, some fundamental combinatorial generation algorithms along the lines of (Knuth 2006). The practical expressiveness of functional programming languages (in particular Haskell) is put to the test in the process.

While the main focus of the paper was test-driving Haskell on the curvy tracks of non-trivial combinatorial generation problems, we have bumped, somewhat accidentally, into making a few new contributions to the field as such, that could be easily blamed on the quality of the vehicle we were test-driving:

1. the three ranking/unranking algorithms from finite functions to natural numbers are new
2. the universe of Hereditarily Finite Functions, as a functional analogue of the well known universe of Hereditarily Finite Sets, is new
3. the universe of Hereditarily Finite Permutations, as an analogue of the well known universe of Hereditarily Finite Sets, is new
4. the natural number tupling/untupling functions are new
5. the ranking/unranking algorithm for permutations of arbitrary sizes is new (although it is based on a known Lehmer code-based algorithm for permutations of fixed size)
6. the catamorphism/anamorphism view of ranking/unranking functions is new and it is likely to be reusable for various families of combinatorial generation problems

Throughout the paper, we will use the following set of primitive arithmetic functions:

    double n = 2 * n
    half n = n `div` 2
    exp2 n = 2^n

together with the succ, pred, even, odd and sum Haskell functions, to emphasize that this subset is easily hardware implementable (by only using boolean operations, shifts and adders) and that these functions also have O(log n) or better software implementations for integers of (arbitrary) length n.

When possible, we will use point-free notation (unnecessary function arguments omitted) to emphasize the generic function composition dataflow.

As we have put significant effort into ensuring that all our types can be inferred, we will omit type declarations, with apologies to the type-curious reader, who can have them displayed as needed while interacting with the Haskell sources of the paper available online.

2. Generic Unranking and Ranking with Higher Order Functions

We will use, throughout the paper, a generic "rose tree" type T distinguishing between atoms (tagged with A) and subforests (tagged with F).

    data T a = A a | F [T a] deriving (Eq,Ord,Read,Show)

Atoms will be mapped to natural numbers in [0..ulimit-1]. When ulimit is fixed, we denote this set by A. We denote by Nat the set of natural numbers and by T the set of trees of type T with atoms in A.
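Since type declarations are omitted throughout, the following minimal sketch spells out one possible set of signatures for the primitives and the tree type. The choice of Integer as the number type, the example value sample and the helper atomCount are our assumptions for illustration and are not taken from the paper's sources.

```haskell
-- A sketch with explicit types; Integer is an assumed choice of number type.
double, half, exp2 :: Integer -> Integer
double n = 2 * n
half n = n `div` 2
exp2 n = 2 ^ n

-- Rose tree with atoms (A) and subforests (F), as in the text.
data T a = A a | F [T a] deriving (Eq, Ord, Read, Show)

-- Hypothetical example value: a forest holding one atom and one empty forest.
sample :: T Integer
sample = F [A 0, F []]

-- Counts the atoms in a tree (a helper added only for this illustration).
atomCount :: T a -> Int
atomCount (A _) = 1
atomCount (F ts) = sum (map atomCount ts)
```

Loading this in GHCi and asking for `:t exp2` or evaluating `atomCount sample` is one way to explore the types that the paper leaves to inference.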
The unranking operation is seen here as an instance of a generic anamorphism mechanism (unfold), while the ranking operation is seen as an instance of the corresponding catamorphism (fold) (Hutton 1999; Meijer and Hutton 1995).

Unranking. As an adaptation of the unfold operation, natural numbers will be mapped to elements of T with a generic higher order function unrank_ ulimit f, defined from Nat to T, parameterized by the natural number ulimit and the transformer function f:

    unrank_ ulimit _ n | (n < ulimit) && (n >= 0) = A n
    unrank_ ulimit f n | n >= ulimit =
      (F (map (unrank_ ulimit f) (f (n-ulimit))))

A global constant default_ulimit will be used throughout the paper to fix the default range of atoms, allowing us to work with a default unrank function:

    default_ulimit = 0

    unrank = unrank_ default_ulimit

Ranking. Similarly, as an adaptation of fold, generic inverse mappings rank_ ulimit (and rank) from T to Nat are defined as:

    rank_ ulimit _ (A n) | (n < ulimit) && (n >= 0) = n
    rank_ ulimit g (F ts) =
      ulimit + (g (map (rank_ ulimit g) ts))

    rank = rank_ default_ulimit

Note that the guard in the definition of rank_ simply states correctness constraints ensuring that atoms belong to the same set A for rank and unrank. This ensures that the following holds:

PROPOSITION 1. If the transformer function $f : Nat \to [Nat]$ is a bijection with inverse $g$, such that $n \geq ulimit \wedge f(n) = [n_0, \ldots, n_i, \ldots, n_k] \Rightarrow n_i < n$, then unrank is a bijection from Nat to T, with inverse rank, and the recursive computations of both functions terminate in a finite number of steps.

Proof: by induction on the structure of Nat and T, using the fact that map preserves bijections.

Ranking functions can be traced back to Gödel numberings (Gödel 1931; Hartmanis and Baker 1974) associated to formulae. Together with their inverse unranking functions they are also used in combinatorial generation algorithms (Martinez and Molinero 2003; Knuth 2006).

3. Hereditarily Finite Sets and Ackermann's Encoding

While the Universe of Hereditarily Finite Sets is best known as a model of the Zermelo-Fraenkel Set theory with the Axiom of Infinity replaced by its negation (Takahashi 1976; Meir et al. 1983), it has been the object of renewed practical interest in various fields, from representing structured data in databases (Leontjev and Sazonov 2000) to reasoning with sets and set constraints (Dovier et al. 2000; Piazza and Policriti 2004).

3.1 Ackermann's Encoding

The Universe of Hereditarily Finite Sets is built from the empty set (or a set of Urelements) by successively applying powerset and set union operations.

A surprising bijection, discovered by Wilhelm Ackermann in 1937 (Ackermann 1937; Abian and Lamacchia 1978; Kaye and Wong 2007), maps Hereditarily Finite Sets (HFS) to Natural Numbers (Nat):

$f(x) = \text{if } x = \{\} \text{ then } 0 \text{ else } \sum_{a \in x} 2^{f(a)}$

Assuming HFS extended with Urelements (objects not containing any elements), our generic "rose tree" representation can be used for Hereditarily Finite Sets, with Urelements seen as atoms, i.e. Natural Numbers in [0..ulimit-1]. The constructor A a marks Urelements of type a (usually the arbitrary length Integer type in Haskell) and the constructor F marks a list of recursively built HFS type elements. Note that if no elements are used with the A constructor, we obtain the "pure" HFS universe with everything built from the empty set, represented as F [].

Let us note that Ackermann's encoding can be seen as the recursive application of a bijection set2nat from finite subsets of Nat to Nat, that associates to a set of (distinct!) natural numbers a (unique!) natural number. With this representation, Ackermann's encoding from HFS to Nat, hfs2nat, can be expressed in terms of our generic rank function as:

    hfs2nat = rank set2nat

    set2nat ns = sum (map exp2 ns)

To obtain the inverse of the Ackermann encoding, let's first define the inverse nat2set of the bijection set2nat. It decomposes a natural number into a list of exponents of 2 (seen as bit positions equaling 1 in its bitstring representation, in increasing order).

    nat2set n = nat2exps n 0 where
      nat2exps 0 _ = []
      nat2exps n x =
        if (even n) then xs else (x:xs) where
          xs = nat2exps (half n) (succ x)

The inverse of the Ackermann encoding, with urelements in [0..default_ulimit-1] and the empty set mapped to F [], is defined as follows:

    nat2hfs = unrank nat2set

This definition is motivated by the fact that nat2hfs and hfs2nat are obtained through recursive compositions of nat2set and set2nat, respectively. Generalizing the encoding mechanism to use other bijections with similar properties naturally leads to the anamorphism/catamorphism view of unrank/rank. The following proposition summarizes the results in this subsection:

PROPOSITION 2. Given $id = \lambda x.x$, the following function equivalences hold:

$nat2set \circ set2nat \equiv id$ (1)
$set2nat \circ nat2set \equiv id$ (2)
$nat2hfs \circ hfs2nat \equiv id$ (3)
$hfs2nat \circ nat2hfs \equiv id$ (4)

3.2 Combinatorial Generation as Iteration

Using the inverse of Ackermann's encoding, the infinite stream HFS can be generated simply by iterating over the infinite stream [0..]:

    iterative_hfs_generator = map nat2hfs [0..]
    take 5 iterative_hfs_generator
      [F [],F [F []],F [F [F []]],F [F [],F [F []]],F [F [F [F []]]]]

One can try out nat2hfs and its inverse hfs2nat and print out a canonical string representation of HFS with the setShow functions given in the Appendix:

    nat2hfs 42
      F [F [F []],F [F [],F [F []]],F [F [],F [F [F []]]]]
    hfs2nat (nat2hfs 42)
      42
    setShow 42
      "{{{}},{{},{{}}},{{},{{{}}}}}"

Note that setShow n will build a string representation of n ∈ Nat, implicitly "deforested" as an HFS with Urelements in [0..default_ulimit-1]. Figure 1 shows the directed acyclic graph obtained by merging shared nodes in the rose tree representation of the HFS associated to a natural number (with arrows pointing from sets to their elements).

Figure 1: Hereditarily Finite Set associated to 42

4. Pairing Functions and Tuple Encodings

Pairings are bijective functions $Nat \times Nat \to Nat$. Following the classic notation for pairings of (Robinson 1950), given the pairing function $J$, its left and right inverses $K$ and $L$ are such that

$J(K(z), L(z)) = z$ (5)
$K(J(x, y)) = x$ (6)
$L(J(x, y)) = y$ (7)

We refer to (Cégielski and Richard 2001) for a typical use in the foundations of mathematics and to (Rosenberg 2002) for an extensive study of various pairing functions and their computational properties.

We will start by overviewing two classic pairing functions.
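Laws (5)-(7) can be tested mechanically for any candidate triple J, K, L. The sketch below does so on a finite range, using Cantor's classic pairing function as the example instance. The names pairingLawsHold, cantorJ, cantorK, cantorL and diagonal are ours, and the unpairing functions are a standard textbook construction that we assume here, not code from the paper.

```haskell
-- Generic check of laws (5)-(7) on a finite range; a sketch, not a proof.
pairingLawsHold
  :: (Integer -> Integer -> Integer)  -- candidate J
  -> (Integer -> Integer)             -- candidate K (left inverse)
  -> (Integer -> Integer)             -- candidate L (right inverse)
  -> Integer                          -- inclusive bound for x, y and z
  -> Bool
pairingLawsHold j k l bound =
  and [j (k z) (l z) == z | z <- [0 .. bound]]
    && and [ k (j x y) == x && l (j x y) == y
           | x <- [0 .. bound], y <- [0 .. bound] ]

-- Cantor's pairing function, used here as the assumed example instance.
cantorJ :: Integer -> Integer -> Integer
cantorJ x y = (x + y) * (x + y + 1) `div` 2 + y

-- Index w of the diagonal containing z: the largest w with w(w+1)/2 <= z.
-- Linear search keeps the sketch simple; it is not meant to be efficient.
diagonal :: Integer -> Integer
diagonal z = last (takeWhile (\w -> w * (w + 1) `div` 2 <= z) [0 ..])

-- Unpairing: y is the offset of z along its diagonal, x is the remainder.
cantorK, cantorL :: Integer -> Integer
cantorL z = z - w * (w + 1) `div` 2 where w = diagonal z
cantorK z = diagonal z - cantorL z
```

A call such as `pairingLawsHold cantorJ cantorK cantorL 30` exercises all three laws; the same checker can be reused with any other pairing function and its two unpairing companions.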
4.1 Cantor's Pairing Function

Cantor's geometrically inspired pairing function (also present in earlier work by Cauchy) is defined as:

    nat_cpair x y = (x + y) * (x + y + 1) `div` 2 + y

As the following example shows, it grows symmetrically in both arguments:

    [nat_cpair i j|i<-[0..3],j<-[0..3]]
      [0,2,5,9,1,4,8,13,3,7,12,18,6,11,17,24]

4.2 The Pepis-Kalmar-Robinson Pairing Function

An interesting pairing function, growing asymmetrically (faster on the first argument), is the function pepis_J together with its left and right unpairing companions pepis_K and pepis_L. These have been used by Pepis, Kalmar and Robinson, together with Cantor's functions, in some fundamental work on recursion theory, decidability and Hilbert's Tenth Problem (Pepis 1938; Kalmar 1939; Kalmar and Suranyi 1947, 1950; Robinson 1950, 1955, 1968a,b, 1967). The function pepis_J combines two numbers reversibly by multiplying a power of 2 derived from the first and an odd number derived from the second:

$f(x, y) = 2^x (2y + 1) - 1$ (8)

Its Haskell implementation, together with its inverse, is:

    pepis_J x y = pred ((exp2 x) * (succ (double y)))

    pepis_K n = two_s (succ n)

    pepis_L n = half (pred (no_two_s (succ n)))

    -- exponent of the largest power of 2 dividing n (only called with n > 0)
    two_s n | even n = succ (two_s (half n))
    two_s _ = 0

    -- n with all its factors of 2 divided out
    no_two_s n = n `div` (exp2 (two_s n))

This pairing function (slower in the second argument) works as follows:

    pepis_J 1 10
      41
    pepis_J 10 1
      3071
    [pepis_J i j|i<-[0..3],j<-[0..3]]
      [0,2,4,6,1,5,9,13,3,11,19,27,7,23,39,55]

As Haskell provides a built-in ordered pair, it is convenient to regroup J, K, L as mappings to/from built-in ordered pairs:

    haskell2pepis (x,y) = pepis_J x y

    pepis2haskell n = (pepis_K n,pepis_L n)

4.3 The BitMerge Pairing Function

We will introduce here an unusually simple pairing function (which, as we have recently found out, is the same as the one defined in Steven Pigeon's PhD thesis on Data Compression (Pigeon 2001), page 114).
It provides compact representations for various constructs involving ordered pairs. The bijection bitmerge_pair from Nat × Nat to Nat and its inverse bitmerge_unpair are defined as follows:

    bitmerge_pair (i,j) =
      set2nat ((evens i) ++ (odds j)) where
        evens x = map double (nat2set x)
        odds y = map succ (evens y)

    bitmerge_unpair n = (f xs,f ys) where
      (xs,ys) = partition even (nat2set n)
      f = set2nat . (map half)

The function bitmerge_unpair works by splitting a number's bitstring representation into the bits at even and at odd positions, while its inverse bitmerge_pair blends the even and odd bits back together. With the help of the function to_rbits given in the Appendix, which decomposes n ∈ Nat into a list of bits (smaller units first), one can follow what happens, step by step:

    to_rbits 2008
      [0,0,0,1,1,0,1,1,1,1,1]
    bitmerge_unpair 2008
      (60,26)
    to_rbits 60
      [0,0,1,1,1,1]
    to_rbits 26
      [0,1,0,1,1]
    bitmerge_pair (60,26)
      2008

PROPOSITION 3. The following function equivalences hold:

$bitmerge\_pair \circ bitmerge\_unpair \equiv id$ (9)
$bitmerge\_unpair \circ bitmerge\_pair \equiv id$ (10)

4.4 Tuple Encodings as Generalized BitMerge

We will now generalize this pairing function to k-tuples and then we will derive an encoding for finite functions.

The function to_tuple: $Nat \to Nat^k$ converts a natural number to a k-tuple by splitting its bit representation into k groups, from which the k members of the tuple are finally rebuilt. This operation can be seen as a transposition of a bit matrix obtained by expanding the number in base $2^k$:

    to_tuple k n = map from_rbits (
        transpose (
          map (to_maxbits k) (
            to_base (exp2 k) n
          )
        )
      )

To convert a k-tuple back to a natural number we will merge their bits, k at a time.
This operation uses the transposition of a bit matrix obtained from the tuple, seen as a number in base $2^k$, with help from bit crunching functions given in the Appendix:

    from_tuple ns = from_base (exp2 k) (
        map from_rbits (
          transpose (
            map (to_maxbits l) ns
          )
        )
      ) where
          k = genericLength ns
          l = max_bitcount ns

The following example shows the decoding of 42, its decomposition in bits (right to left), the formation of a 3-tuple and the encoding of the tuple back to 42.

    to_rbits 42
      [0,1,0,1,0,1]
    to_tuple 3 42
      [2,1,2]
    to_rbits 2
      [0,1]
    to_rbits 1
      [1]
    from_tuple [2,1,2]
      42

Fig. 2 shows multiple steps of the same decomposition, with shared nodes collected in a DAG. Note that cylinders represent markers on edges indicating argument positions, the cubes indicate leaf vertices (0,1) and the small pyramid indicates the root where the expansion has started.

Figure 2: Repeated 3-tuple expansions: 42 and 2008

The following proposition states that this tupling function is a generalization of bitmerge_pair.

PROPOSITION 4. The following function equivalences hold:

$bitmerge\_unpair\ n \equiv to\_tuple\ 2\ n$ (11)
$bitmerge\_pair\ (x, y) \equiv from\_tuple\ [x, y]$ (12)

5. Encoding Finite Functions

As finite sets can be put in a bijection with an initial segment of Nat, we can narrow down the concept of finite function as follows:

DEFINITION 1. A finite function is a function defined from an initial segment of Nat to Nat.

This definition implies that a finite function can be seen as an array or a list of natural numbers, except that we do not limit the size of the representation of its values.
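Because the encodings of finite functions build on the tupling functions of section 4.4, whose helpers (to_rbits, from_rbits, to_maxbits, to_base, from_base, max_bitcount) are only given in the Appendix, a self-contained sketch may help. The version below is our own reimplementation under the assumption that bit i of the j-th tuple member sits at position i*k + j of the encoded number; the names toRbits, fromRbits, chunks, toTuple and fromTuple are hypothetical and chosen to avoid clashing with the paper's functions.

```haskell
import Data.List (transpose, unfoldr)

-- Bits of n, least significant first; 0 maps to the empty list.
toRbits :: Integer -> [Integer]
toRbits = unfoldr step
  where
    step 0 = Nothing
    step n = Just (n `mod` 2, n `div` 2)

-- Inverse of toRbits (trailing zero bits are harmless).
fromRbits :: [Integer] -> Integer
fromRbits = foldr (\b acc -> b + 2 * acc) 0

-- Splits a list into consecutive groups of k elements (k > 0 assumed).
chunks :: Int -> [a] -> [[a]]
chunks _ [] = []
chunks k xs = take k xs : chunks k (drop k xs)

-- Splits the bits of n into k interleaved groups, one per tuple member.
toTuple :: Int -> Integer -> [Integer]
toTuple k n = map fromRbits columns
  where
    bits = toRbits n
    padded = bits ++ replicate (negate (length bits) `mod` k) 0
    rows = chunks k padded            -- one row per base 2^k digit
    columns
      | null rows = replicate k []    -- n == 0 still yields k members
      | otherwise = transpose rows

-- Interleaves the bits of the tuple members back into one number.
fromTuple :: [Integer] -> Integer
fromTuple ns = fromRbits (concat (transpose padded))
  where
    bitLists = map toRbits ns
    l = maximum (0 : map length bitLists)
    padded = [bs ++ replicate (l - length bs) 0 | bs <- bitLists]
```

With these definitions, `toTuple 3 42` and `fromTuple [2,1,2]` reproduce the example of section 4.4, and `toTuple 2 2008` agrees with bitmerge_unpair 2008, in line with Proposition 4.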
5.1 Encoding Finite Functions as Tuples

We can now encode and decode a finite function from [0..k-1] to Nat (seen as the list of its values), as a natural number:

    ftuple2nat [] = 0
    ftuple2nat ns = haskell2pepis (pred k,t) where
      k = genericLength ns
      t = from_tuple ns

    nat2ftuple 0 = []
    nat2ftuple kf = to_tuple (succ k) f where
      (k,f) = pepis2haskell kf

As the length of the tuple, k, is usually smaller than the number obtained by merging the bits of the k-tuple, we have picked the Pepis pairing function, exponential in its first argument and linear in its second, to embed the length of the tuple needed for the decoding. The encoding/decoding works as follows:

    ftuple2nat [1,0,2,1,3]
      21295
    nat2ftuple 21295
      [1,0,2,1,3]
    map nat2ftuple [0..15]
      [[],[0,0],[1],[0,0,0],[2],[1,0],[3],
       [0,0,0,0],[4],[0,1],[5],[1,0,0],[6],
       [1,1],[7],[0,0,0,0,0]]

Note that

    map nat2ftuple [0..]

provides an iterative generator for the stream of finite functions.

5.2 Deriving an Encoding of Finite Functions from Ackermann's Encoding

Given that a finite set with n elements can be put in a bijection with [0..n-1], a finite function $f : [0..n-1] \to Nat$ can be represented as the list $[f(0), \ldots, f(n-1)]$. Such a list may, however, contain repeated elements. So how can we turn it into a set with distinct elements, bijectively?

The following two functions provide the answer. First, we just sum up the list of the values of the function with scanl, resulting in a monotonically growing sequence (provided that we first increment every number by 1 to ensure that 0 values do not break monotonicity).

    fun2set ns =
      map pred (tail (scanl (+) 0 (map succ ns)))

The inverse function, reverting back from a set of distinct values, collects the increments from a term to the next (and ignores the last one):

    set2fun ns = map pred (genericTake l ys) where
      l = genericLength ns
      xs = (map succ ns)
      ys = (zipWith (-) (xs ++ [0]) (0:xs))

PROPOSITION 5. The following function equivalences hold:

$set2fun \circ fun2set \equiv id$ (13)
$fun2set \circ set2fun \equiv id$ (14)

The following example shows the conversion and its inverse.

    fun2set [1,0,2,1,2]
      [1,2,5,7,10]
    set2fun [1,2,5,7,10]
      [1,0,2,1,2]

By combining this with the Ackermann encoding's basic step set2nat and its inverse nat2set, we obtain an encoding from finite functions to Nat as follows:

    nat2fun = set2fun . nat2set

    fun2nat = set2nat . fun2set

    nat2fun 2008
      [3,0,1,0,0,0,0]
    fun2nat [3,0,1,0,0,0,0]
      2008

PROPOSITION 6. The following function equivalences hold:

$nat2fun \circ fun2nat \equiv id$ (15)
$fun2nat \circ nat2fun \equiv id$ (16)

One can see that this encoding ignores 0s in the binary representation of a number, while counting 1 sequences as increments. Run Length Encoding of binary sequences (Mäkinen and Navarro 2005) encodes 0s and 1s symmetrically, by counting the numbers of 1s and 0s. This encoding is reversible, knowing that 1s and 0s alternate, and that the most significant digit is always 1:

    bits2rle [] = []
    bits2rle [_] = [0]
    bits2rle (x:y:xs) | x == y = (c+1):cs where
      (c:cs) = bits2rle (y:xs)
    bits2rle (_:xs) = 0:(bits2rle xs)

    rle2bits [] = []
    rle2bits (n:ns) =
      (genericReplicate (n+1) b) ++ xs where
        xs = rle2bits ns
        b = if [] == xs then 1 else 1-(head xs)

By composing them with converters to/from bitlists, we obtain the bijection $nat2rle : Nat \to [Nat]$ and its inverse $rle2nat : [Nat] \to Nat$:

    nat2rle = bits2rle . to_rbits0

    rle2nat = from_rbits . rle2bits

    to_rbits0 0 = []
    to_rbits0 n = to_rbits n

PROPOSITION 7. The following function equivalences hold:

$nat2rle \circ rle2nat \equiv id$ (17)
$rle2nat \circ nat2rle \equiv id$ (18)

6. Encodings for "Hereditarily Finite Functions"

One can now build a theory of "Hereditarily Finite Functions" (HFF) centered around using a transformer like nat2ftuple, nat2fun or nat2rle and its inverse ftuple2nat, fun2nat or rle2nat, in a way similar to the use of nat2set and set2nat for HFS, where the empty function (denoted F []) replaces the empty set as the quintessential "urfunction". Similarly to Urelements in the HFS theory, "urfunctions" (considered here as atomic values) can be introduced as constant functions parameterized to belong to [0..ulimit-1].

By using the generic unrank and rank functions defined in section 2 we can extend the bijections defined in the previous section to encodings of Hereditarily Finite Functions. By instantiating the transformer function in unrank to nat2ftuple, nat2fun and nat2rle we obtain:

    nat2hff = unrank nat2fun
    nat2hff1 = unrank nat2ftuple
    nat2hff2 = unrank nat2rle

By instantiating the transformer function in rank we obtain:

    hff2nat = rank fun2nat
    hff2nat1 = rank ftuple2nat
    hff2nat2 = rank rle2nat

The following examples show that nat2hff, nat2hff1 and nat2hff2 are indeed bijections, and that the resulting HFF-trees are typically more compact than the HFS-tree associated to the same natural number.
    nat2hff 0
      F []
    nat2hff 1
      F [F []]
    nat2hff1 0
      F []
    nat2hff1 1
      F [F [],F []]
    nat2hff2 0
      F []
    nat2hff2 1
      F [F []]
    nat2hff 42
      F [F [F []],F [F []],F [F []]]
    nat2hff1 42
      F [F [F [F [],F [],F []],F []]]
    nat2hff2 42
      F [F [],F [],F [],F [],F [],F []]
    nat2hfs 42
      F [F [F []],F [F [],F [F []]],F [F [],F [F [F []]]]]
    nat2hff 12345
      F [F [],F [F [F []]],F [],F [],F [F [F []],F []],F []]
    nat2hff1 12345
      F [F [F [F [F [F [],F []]],F []]],F [F [],F [],F [F [],F []]]]
    nat2hff2 12345
      F [F [],F [F []],F [F [],F []],F [F [],F [],F []],F [F []]]
    hff2nat (nat2hff 12345)
      12345
    hff2nat1 (nat2hff1 12345)
      12345
    hff2nat2 (nat2hff2 12345)
      12345

Note that map nat2hff [0..], map nat2hff1 [0..] and map nat2hff2 [0..] provide iterative generators for the (recursively enumerable!) stream of hereditarily finite functions.

The resulting HFF with urfunctions (seen as digits) can also be used as generalized numeral systems with applications to building arbitrary length integer implementations. Assuming default_ulimit=10 we obtain:

    nat2hff 1234567890
      F [A 3,A 2,A 0,A 1,A 7,A 0,A 1,A 2,A 0,A 2,A 2]
    nat2hff1 1234567890
      F [F [F [F [F [A 0,A 3]],F [F [F [A 2,A 0,A 1]]],A 1]]]
    nat2hff2 1234567890
      F [A 2,A 0,A 1,A 1,A 0,A 0,A 6,A 1,A 0,A 0,A 1,A 1,A 1,A 0,A 1,A 0]

which display with the funShow functions given in the Appendix as:

    funShow 1234567890
      "(3 2 0 1 7 0 1 2 0 2 2)"
    funShow1 1234567890
      "(((((0 3)) (((2 0 1))) 1)))"
    funShow2 1234567890
      "(2 0 1 1 0 0 6 1 0 0 1 1 1 0 1 0)"

PROPOSITION 8. The following function equivalences hold:

$nat2hff1 \circ hff2nat1 \equiv id$ (19)
$hff2nat1 \circ nat2hff1 \equiv id$ (20)
$nat2hff \circ hff2nat \equiv id$ (21)
$hff2nat \circ nat2hff \equiv id$ (22)

7. Encoding Finite Bijections

To obtain an encoding for finite bijections (permutations) we will first review a ranking/unranking mechanism for permutations that involves an unconventional numeric representation, factoradics.
7.1 The Factoradic Numeral System

The factoradic numeral system (Knuth 1997) replaces digits multiplied by a power of a base N with digits that multiply successive values of the factorial of N. In the increasing order variant fr, the first digit $d_0$ is 0, the second is $d_1 \in \{0, 1\}$ and, in general, the digit multiplying $N!$ is $d_N \in [0..N]$. The left-to-right, decreasing order variant fl is obtained by reversing the digits of fr.

    fr 42
      [0,0,0,3,1]
    rf [0,0,0,3,1]
      42
    fl 42
      [1,3,0,0,0]
    lf [1,3,0,0,0]
      42

The function fr handles the special case for 0 and calls a local helper f, which recurses and divides by increasing values of N while collecting digits with mod:

    -- factoradics of n, right to left
    fr 0 = [0]
    fr n = f 1 n where
      f _ 0 = []
      f j k = (k `mod` j) : (f (j+1) (k `div` j))

The function fl, with digits left to right, is obtained as follows:

    fl = reverse . fr

The function rf (inverse of fr) converts back to decimals by summing up results while computing the factorial progressively:

    rf ns = sum (zipWith (*) ns factorials) where
      factorials = scanl (*) 1 [1..]

Finally, lf, the inverse of fl, is obtained as:

    lf = rf . reverse

7.2 Ranking and unranking permutations of given size with Lehmer codes and factoradics

The Lehmer code of a permutation f associates to each index i the number of indices j such that $1 \leq j < i$ and $f(j) < f(i)$ (Mantaci and Rakotondrajao 2001). The implementation below uses the variant that counts, for each position, the later elements that are smaller.

PROPOSITION 9. The Lehmer code of a permutation determines the permutation uniquely.

The function perm2nth computes a rank for a permutation ps of size>0. It starts by first computing its Lehmer code ls with perm2lehmer. Then it associates a unique natural number n to ls, by converting it with the function lf from factoradics to decimals. Note that the Lehmer code ls is used as the list of digits in the factoradic representation.
    perm2nth ps = (l,lf ls) where
      ls = perm2lehmer ps
      l = genericLength ls

    perm2lehmer [] = []
    perm2lehmer (i:is) = l:(perm2lehmer is) where
      l = genericLength [j | j <- is, j < i]

The function nth2perm provides the matching unranking operation, associating a permutation ps to a given size>0 and a natural number n.

    -- generates n-th permutation of given size
    nth2perm (size,n) =
      apply_lehmer2perm (zs ++ xs) [0..size-1] where
        xs = fl n
        l = genericLength xs
        k = size-l
        zs = genericReplicate k 0

    -- converts Lehmer code to permutation
    lehmer2perm xs = apply_lehmer2perm xs is where
      is = [0..(genericLength xs)-1]

    -- extracts permutation from factoradic "digit" list
    apply_lehmer2perm [] [] = []
    apply_lehmer2perm (n:ns) ps@(x:xs) =
      y : (apply_lehmer2perm ns ys) where
        (y,ys) = pick n ps

    pick i xs = (x,ys ++ zs) where
      (ys,(x:zs)) = genericSplitAt i xs

Note also that apply_lehmer2perm (shared with lehmer2perm) is used this time to reconstruct the permutation ps from its Lehmer code, which in turn is computed from the permutation's factoradic representation.

One can try out this bijective mapping as follows:

    nth2perm (5,42)
      [1,4,0,2,3]
    perm2nth [1,4,0,2,3]
      (5,42)
    nth2perm (8,2008)
      [0,3,6,5,4,7,1,2]
    perm2nth [0,3,6,5,4,7,1,2]
      (8,2008)

7.3 A bijective mapping from permutations to Nat

One more step is needed to extend the mapping between permutations of a given length to a bijective mapping from/to Nat: we will have to "shift towards infinity" the starting point of each new block of permutations in Nat as permutations of larger and larger sizes are enumerated.

First, we need to know by how much, so we compute the sum of the factorials 0! + 1! + ... + (n-1)!, which counts the permutations of size smaller than n.

    -- fast computation of the sum 0! + 1! + ... + (n-1)!
    sf n = rf (genericReplicate n 1)

This is done by noticing that applying rf to a digit list made of n ones does just that. The stream of all such sums can now be generated as usual:

    sfs = map sf [0..]
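The factoradic and Lehmer code machinery of sections 7.1 and 7.2 can also be checked in isolation. The sketch below restates it with explicit types under a few assumptions of ours: permutations are lists of Integer over [0..size-1], the standard splitAt replaces genericSplitAt, the helper go is a local name, and the caller is assumed to pass n < size!.

```haskell
import Data.List (genericLength)

-- Factoradic digits of n, least significant first (mirrors fr in the text).
fr :: Integer -> [Integer]
fr 0 = [0]
fr n = go 1 n
  where
    go _ 0 = []
    go j k = (k `mod` j) : go (j + 1) (k `div` j)

-- Value of a least-significant-first factoradic digit list (mirrors rf).
rf :: [Integer] -> Integer
rf ds = sum (zipWith (*) ds (scanl (*) 1 [1 ..]))

-- Lehmer code: for each element, how many later elements are smaller.
perm2lehmer :: [Integer] -> [Integer]
perm2lehmer [] = []
perm2lehmer (i : is) = genericLength [j | j <- is, j < i] : perm2lehmer is

-- Rebuilds a permutation of [0..size-1] from its Lehmer code by
-- repeatedly picking the d-th remaining element of the pool.
lehmer2perm :: [Integer] -> [Integer]
lehmer2perm code = go code [0 .. genericLength code - 1]
  where
    go (d : ds) pool =
      let (before, x : after) = splitAt (fromInteger d) pool
      in x : go ds (before ++ after)
    go [] _ = []

-- Rank of a permutation, paired with its size.
perm2nth :: [Integer] -> (Integer, Integer)
perm2nth ps = (genericLength ps, rf (reverse (perm2lehmer ps)))

-- The n-th permutation of the given size (assumes 0 <= n < size!).
nth2perm :: (Integer, Integer) -> [Integer]
nth2perm (size, n) = lehmer2perm (padding ++ digits)
  where
    digits = reverse (fr n)
    padding = replicate (fromInteger size - length digits) 0
```

Evaluating `nth2perm (5,42)` and `perm2nth [1,4,0,2,3]` reproduces the session shown above; mapping nth2perm over [(4,n) | n <- [0..23]] lists all 24 permutations of size 4 without repetition.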
What we are really interested in is decomposing n into the distance n-m to the last sum of factorials m not larger than n, and the index k of that sum.

    to_sf n = (k,n-m) where
      k = pred (head [x | x <- [0..], sf x > n])
      m = sf k

Unranking of an arbitrary permutation is now easy: the index k determines the size of the permutation and n-m determines the rank. Together they select the right permutation with nth2perm.

    nat2perm 0 = []
    nat2perm n = nth2perm (to_sf n)

Ranking of a permutation is even easier: we first compute its size and its rank, then we shift the rank by the sum of all factorials up to its size, enumerating the ranks previously assigned.

    perm2nat ps = (sf l) + k where
      (l,k) = perm2nth ps

    nat2perm 2008
      [1,4,3,2,0,5,6]
    perm2nat [1,4,3,2,0,5,6]
      2008

As finite bijections are faithfully represented by permutations, this construction provides a bijection from Nat to the set of Finite Bijections.

PROPOSITION 10. The following function equivalences hold:

$nat2perm \circ perm2nat \equiv id \equiv perm2nat \circ nat2perm$ (23)

The stream of all finite permutations can now be generated as usual:

    perms = map nat2perm [0..]

7.4 Hereditarily Finite Permutations

By using the generic unrank and rank functions defined in section 2 we can extend nat2perm and perm2nat to encodings of Hereditarily Finite Permutations (HFP).

    nat2hfp = unrank nat2perm
    hfp2nat = rank perm2nat

The encoding works as follows:

    nat2hfp 42
      F [F [],F [F [],F [F []]],F [F [F []],F []],F [F []],
         F [F [],F [F []],F [F [],F [F []]]]]
    hfp2nat it
      42

Assuming default_ulimit=10 and using the string representation provided by permShow (Appendix) we obtain:

    nat2hfp 42
      F [F [],A 2,A 3,A 1,A 4]
    permShow 42
      "(0 2 3 1 4)"
    permShow 1234567890
      "(1 6 (0 1 3 2) 2 0 3 (0 1 2 3) 7 8 5 9 4 (0 2 1 3))"

PROPOSITION 11. The following function equivalences hold:

$nat2hfp \circ hfp2nat \equiv id \equiv hfp2nat \circ nat2hfp$ (24)

8. Related work

Natural Number encodings of Hereditarily Finite Sets have triggered the interest of researchers in fields ranging from Axiomatic Set Theory and Foundations of Logic to Complexity Theory and Combinatorics (Takahashi 1976; Kaye and Wong 2007; Kirby 2007; Abian and Lamacchia 1978; Booth 1990; Meir et al. 1983; Leontjev and Sazonov 2000; Sazonov 1993; Avigad 1997). Computational and Data Representation aspects of Finite Set Theory have been described in logic programming and theorem proving contexts in (Dovier et al. 2000; Piazza and Policriti 2004; Paulson 1994). Pairing functions have been used in work on decision problems as early as (Pepis 1938; Kalmar 1939; Robinson 1950, 1968b). The tuple functions we have used to encode finite functions are new. While finite functions have been used extensively in various branches of mathematics and computer science, we have not seen any formalization of Hereditarily Finite Functions or Hereditarily Finite Bijections as such in the literature.

9. Conclusion and Future Work

We have shown the expressiveness of Haskell as a metalanguage for executable mathematics, by describing natural number encodings, tupling/untupling and ranking/unranking functions for finite sets, functions and permutations and by extending them in a generic way to Hereditarily Finite Sets, Hereditarily Finite Functions and Hereditarily Finite Permutations.

In a Genetic Programming context (Koza 1992; Poli et al.), the bijections between bitvectors/natural numbers on one side, and trees/graphs representing HFSs, HFFs, HFPs on the other side, suggest exploring the mapping and its action on various transformations as a phenotype-genotype connection.

We also foresee interesting applications in cryptography and steganography.
For instance, in the case of the permutation-related encodings, something as simple as the order of the cities visited or the order of names on a greetings card, seen as a permutation with respect to their alphabetic order, can provide a steganographic encoding/decoding of a secret message by using functions like nat2perm and perm2nat. An interesting topic to investigate is whether higher-density and more random-looking steganographic loads could be incorporated on top of Hereditarily Finite Permutations.

References

Alexander Abian and Samuel Lamacchia. On the consistency and independence of some set-theoretical constructs. Notre Dame Journal of Formal Logic, XIX(1):155–158, 1978.

Wilhelm Friedrich Ackermann. Die Widerspruchsfreiheit der allgemeinen Mengenlehre. Mathematische Annalen, (114):305–315, 1937.

Jeremy Avigad. The Combinatorics of Propositional Provability. In ASL Winter Meeting, San Diego, January 1997.

David Booth. Hereditarily Finite Finsler Sets. J. Symb. Log., 55(2):700–706, 1990.

Patrick Cégielski and Denis Richard. Decidability of the theory of the natural integers with the Cantor pairing function and the successor. Theor. Comput. Sci., 257(1-2):51–77, 2001.

Agostino Dovier, Carla Piazza, and Alberto Policriti. Comparing Expressiveness of Set Constructor Symbols. In Frontiers of Combining Systems, pages 275–289, 2000.

K. Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 38:173–198, 1931.

Juris Hartmanis and Theodore P. Baker. On simple Goedel numberings and translations. In Jacques Loeckx, editor, ICALP, volume 14 of Lecture Notes in Computer Science, pages 301–316. Springer, 1974. ISBN 3-540-06841-4. URL http://dblp.uni-trier.de/db/conf/icalp/icalp74.html#HartmanisB74.

Graham Hutton. A Tutorial on the Universality and Expressiveness of Fold. J. Funct. Program., 9(4):355–372, 1999.
Laszlo Kalmar. On the reduction of the decision problem. First paper. Ackermann prefix, a single binary predicate. The Journal of Symbolic Logic, 4(1):1–9, Mar 1939. ISSN 0022-4812.

Laszlo Kalmar and Janos Suranyi. On the reduction of the decision problem. The Journal of Symbolic Logic, 12(3):65–73, Sep 1947. ISSN 0022-4812.

Laszlo Kalmar and Janos Suranyi. On the reduction of the decision problem: Third paper. Pepis prefix, a single binary predicate. The Journal of Symbolic Logic, 15(3):161–173, Sep 1950. ISSN 0022-4812.

Richard Kaye and Tin Lok Wong. On Interpretations of Arithmetic and Set Theory. Notre Dame J. Formal Logic, 48(4):497–510, 2007.

Laurence Kirby. Addition and multiplication of sets. Math. Log. Q., 53(1):52–65, 2007.

Donald Knuth. The Art of Computer Programming, Volume 4, draft, 2006. http://www-cs-faculty.stanford.edu/~/knuth/taocp.html.

Donald E. Knuth. The Art of Computer Programming, Volume 2 (3rd ed.): Seminumerical Algorithms. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1997. ISBN 0201896842. URL http://portal.acm.org/citation.cfm?id=270146.

John R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA, 1992. ISBN 0-262-11170-5.

Alexander Leontjev and Vladimir Yu. Sazonov. Capturing LOGSPACE over Hereditarily-Finite Sets. In Klaus-Dieter Schewe and Bernhard Thalheim, editors, FoIKS, volume 1762 of Lecture Notes in Computer Science, pages 156–175. Springer, 2000. ISBN 3-540-67100-5.

Roberto Mantaci and Fanja Rakotondrajao. A permutations representation that knows what "Eulerian" means. Discrete Mathematics & Theoretical Computer Science, 4(2):101–108, 2001.

Conrado Martinez and Xavier Molinero. Generic algorithms for the generation of combinatorial objects. In Branislav Rovan and Peter Vojtas, editors, MFCS, volume 2747 of Lecture Notes in Computer Science, pages 572–581. Springer, 2003. ISBN 3-540-40671-9. URL http://dblp.uni-trier.de/db/conf/mfcs/mfcs2003.html#MartinezM03.

Erik Meijer and Graham Hutton. Bananas in Space: Extending Fold and Unfold to Exponential Types. In FPCA, pages 324–333, 1995.

Amram Meir, John W. Moon, and Jan Mycielski. Hereditarily Finite Sets and Identity Trees. J. Comb. Theory, Ser. B, 35(2):142–155, 1983.

Veli Mäkinen and Gonzalo Navarro. Succinct suffix arrays based on run-length encoding. In Alberto Apostolico, Maxime Crochemore, and Kunsoo Park, editors, CPM, volume 3537 of Lecture Notes in Computer Science, pages 45–56. Springer, 2005. ISBN 3-540-26201-6. URL http://dblp.uni-trier.de/db/conf/cpm/cpm2005.html#MakinenN05.

Lawrence C. Paulson. A Concrete Final Coalgebra Theorem for ZF Set Theory. In Peter Dybjer, Bengt Nordström, and Jan M. Smith, editors, TYPES, volume 996 of Lecture Notes in Computer Science, pages 120–139. Springer, 1994. ISBN 3-540-60579-7.

Jozef Pepis. Ein Verfahren der mathematischen Logik. The Journal of Symbolic Logic, 3(2):61–76, Jun 1938. ISSN 0022-4812.

Carla Piazza and Alberto Policriti. Ackermann Encoding, Bisimulations, and OBDDs. TPLP, 4(5-6):695–718, 2004.

Stephen Pigeon. Contributions à la compression de données. Ph.D. thesis, Université de Montréal, Montréal, 2001.

Riccardo Poli, William B. Langdon, Nicholas F. McPhee, and John R. Koza. A Field Guide to Genetic Programming. URL http://www.gp-field-guide.org.uk. E-book.

Julia Robinson. General recursive functions. Proceedings of the American Mathematical Society, 1(6):703–718, Dec 1950. ISSN 0002-9939.

Julia Robinson. A note on primitive recursive functions. Proceedings of the American Mathematical Society, 6(4):667–670, Aug 1955. ISSN 0002-9939.

Julia Robinson. An introduction to hyperarithmetical functions. The Journal of Symbolic Logic, 32(3):325–342, Sep 1967. ISSN 0022-4812.

Julia Robinson. Recursive functions of one variable. Proceedings of the American Mathematical Society, 19(4):815–820, Aug 1968a. ISSN 0002-9939.

Julia Robinson. Finite generation of recursively enumerable sets. Proceedings of the American Mathematical Society, 19(6):1480–1486, Dec 1968b. ISSN 0002-9939.

Arnold L. Rosenberg. Efficient pairing functions - and why you should care. In IPDPS. IEEE Computer Society, 2002. ISBN 0-7695-1573-8.

Vladimir Yu. Sazonov. Hereditarily-Finite Sets, Data Bases and Polynomial-Time Computability. Theor. Comput. Sci., 119(1):187–214, 1993.

Moto-o Takahashi. A Foundation of Finite Mathematics. Publ. Res. Inst. Math. Sci., 12(3):577–708, 1976.

A. Appendix

To make the code in the paper fully self-contained, we list here some auxiliary functions.

String Representations

The functions setShow and funShow provide a string representation of a natural number as a "pure" HFS or HFF. They are obtained as instances of gshow, which provides a generic template parameterized with syntactic elements.

    setShow = (gshow "{" "," "}") . nat2hfs
    funShow = (gshow "(" " " ")") . nat2hff
    funShow1 = (gshow "(" " " ")") . nat2hff1
    funShow2 = (gshow "(" " " ")") . nat2hff2
    permShow = (gshow "(" " " ")") . nat2hfp

    -- intersperse comes from Data.List
    gshow _ _ _ (A n) = show n
    gshow l _ r (F []) =
      -- empty function shown as 0 rather than ()
      if default_ulimit > 1 then "0" else l ++ r
    gshow l c r (F ns) = l ++
      foldl (++) "" (intersperse c (map (gshow l c r) ns))
      ++ r

Bit crunching functions

The function bitcount computes the number of bits needed to represent an integer and max_bitcount computes the maximum bitcount for a list of integers.

    bitcount n = head [x | x <- [1..], (exp2 x) > n]
    max_bitcount ns = foldl max 0 (map bitcount ns)

The following functions implement conversion operations between bitlists and numbers. Note that our bitlists represent binary numbers by selecting exponents of 2 in increasing order (i.e. "right to left").
    -- requires: import Data.List (genericTake, genericLength)

    -- from decimals to binary as list of bits
    to_rbits n = to_base 2 n

    -- from bits to decimals
    from_rbits bs = from_base 2 bs

    -- to binary, padded with 0s, up to maxbits
    to_maxbits maxbits n = bs ++ genericTake (maxbits-l) (repeat 0) where
      bs = to_base 2 n
      l = genericLength bs

    -- conversion to base n, as list of digits
    to_base base n = d : (if q == 0 then [] else (to_base base q)) where
      (q,d) = quotRem n base

    -- conversion from any base to decimal
    from_base base [] = 0
    from_base base (x:xs) = x + base * (from_base base xs)
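To make the "right to left" digit order concrete, the conversion functions can be restated as a small standalone module and evaluated on a few inputs. This is a sketch only: the camel-case names are hypothetical stand-ins for to_base, from_base and to_maxbits, and exp2 is assumed to compute 2^x, since its definition does not appear in this appendix.

```haskell
import Data.List (genericLength, genericTake)

-- Digits of n in the given base, least significant digit first.
toBase :: Integer -> Integer -> [Integer]
toBase base n = d : (if q == 0 then [] else toBase base q)
  where (q, d) = quotRem n base

-- Inverse of toBase: Horner-style evaluation starting from the
-- least significant digit.
fromBase :: Integer -> [Integer] -> Integer
fromBase _ [] = 0
fromBase base (x : xs) = x + base * fromBase base xs

-- Binary digits of n, padded at the high-order end with zeros
-- up to maxbits digits (no padding if n already needs more).
toMaxbits :: Integer -> Integer -> [Integer]
toMaxbits maxbits n = bs ++ genericTake (maxbits - l) (repeat 0)
  where
    bs = toBase 2 n
    l = genericLength bs

-- Assumption: exp2 x is 2^x.
exp2 :: Integer -> Integer
exp2 x = 2 ^ x

-- Number of bits needed to represent n (1 for n = 0).
bitcount :: Integer -> Integer
bitcount n = head [x | x <- [1 ..], exp2 x > n]
```

With these definitions, toBase 2 6 evaluates to [0,1,1], toMaxbits 5 6 to [0,1,1,0,0], fromBase 2 [0,1,1,0,0] to 6, and bitcount 6 to 3. Trailing zeros therefore do not change the decoded value, which is why padding at the end of the list is harmless.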
