Commutative Algebra of Statistical Ranking

BERND STURMFELS AND VOLKMAR WELKER

Abstract. A model for statistical ranking is a family of probability distributions whose states are orderings of a fixed finite set of items. We represent the orderings as maximal chains in a graded poset. The most widely used ranking models are parameterized by rational functions in the model parameters, so they define algebraic varieties. We study these varieties from the perspective of combinatorial commutative algebra. One of our models, the Plackett-Luce model, is non-toric. Five others are toric: the Birkhoff model, the ascending model, the Csiszár model, the inversion model, and the Bradley-Terry model. For these models we examine the toric algebra, its lattice polytope, and its Markov basis.

1. Introduction

A statistical model for ranked data is a family $\mathcal{M}$ of probability distributions on the symmetric group $S_n$. Each distribution $p(\theta)$ in $\mathcal{M}$ depends on some model parameters $\theta$, and it associates a probability $p_\pi(\theta)$ to each permutation $\pi$ of $[n] = \{1, 2, \ldots, n\}$. Thus the model $\mathcal{M}$ is a parametrized subset of the $(n!-1)$-dimensional standard simplex $\Delta_{S_n}$. In algebraic statistics, one assumes that the probabilities $p_\pi(\theta)$ are rational functions in the model parameters $\theta$, so that $\mathcal{M}$ is a semi-algebraic set in $\Delta_{S_n}$, and one aims to characterize the prime ideal $I_{\mathcal{M}}$ of polynomials that vanish on $\mathcal{M}$.

In fact, one of the origins of the field was the spectral analysis for permutation data described by Diaconis and Sturmfels in [12, §6.1]. The corresponding Birkhoff model $\mathcal{M}$ is the toric variety of the Birkhoff polytope. This polytope consists of all bistochastic matrices, and it is the convex hull of all $n \times n$ permutation matrices. There has been a considerable amount of research on the geometric invariants of the Birkhoff model $\mathcal{M}$. The simplest such invariant is its dimension, $\dim(\mathcal{M}) = (n-1)^2$.
The degree of $\mathcal{M}$ is the normalized volume of the Birkhoff polytope, a topic of independent interest in combinatorics [6]. Diaconis and Eriksson [11] conjectured that the Markov basis of the Birkhoff model consists of binomials of degree $\leq 3$.

Besides the Birkhoff model, there are many other models for ranked data that are both relevant for statistical analysis and have an interesting algebraic structure. It is the objective of this article to conduct a comparative study of such models from the perspectives of commutative algebra and geometric combinatorics. Both toric models and non-toric models are of interest. The former include the models introduced by Csiszár [9, 10], and the latter include the Plackett-Luce model [8, 24, 29] and the generalized Bradley-Terry models [21].

The organization of this paper is as follows. In Section 2 we give an informal introduction to all our models. We write out formulas for the probabilities for the six permutations of $n = 3$ items, and we discuss the subsets they parametrize in the 5-dimensional simplex $\Delta_{S_3}$. Precise formal definitions for the four toric models are given in Section 3. We represent the states as maximal chains in a graded poset $Q$. Typically, $Q$ is the distributive lattice induced by some order constraints on the $n$ items to be ranked. If there are no such constraints then $Q = 2^{[n]}$ is the Boolean lattice whose maximal chains are all $n!$ permutations in $S_n$. Non-trivial order constraints arise frequently in applications of ranking models, for instance in computational biology [4] and machine learning [8]. Our algebraic framework based on graded posets $Q$ is well-suited for such contemporary applications of statistical ranking.
While the Birkhoff model has already received a lot of attention in the literature, we here focus on the Csiszár model (Section 4), the ascending model (Section 5) and the inversion model (Section 6). For each of these toric varieties, we characterize the corresponding lattice polytope and its Markov bases, that is, binomials that generate the toric ideal.

Section 7 is concerned with the Plackett-Luce model, which is not a toric model, but is parametrized by certain conditional probabilities that are not monomials. In algebraic geometry language, this model is obtained by blowing up the projective space $\mathbb{P}^{n-1}$ along a family of linear subspaces of codimension 2, and we study its coordinate ring. We also examine marginalizations of our models, including the widely used Bradley-Terry model.

2. Toric Models: A Sneak Preview

A toric model for complete permutation data is specified by a non-negative integer matrix $A$ with $n!$ columns that all have the same sum $S$. These column vectors $A_\pi$ are indexed by permutations $\pi \in S_n$, and they represent the sufficient statistics of the model. The article [17] serves as our general reference for toric models in statistics, their relationship with exponential families, and the role of the matrix $A$. For an introduction to algebraic statistics in general, and for further reading on toric models, we refer to the books [13, 28].

If $r = \mathrm{rank}(A)$ then the convex hull of the column vectors $A_\pi$ is a lattice polytope of dimension $r - 1$. We refer to it as the model polytope. The toric model can be identified with the non-negative points on the projective toric variety associated with the model polytope. Each data set is summarized as a function $u : S_n \to \mathbb{N}$, where $u(\pi)$ is the number of times the permutation $\pi$ has been observed.
Thinking of $u$ as a column vector, we can form the matrix-vector product $Au$, whose entries are the sufficient statistics of the data $u$. Then the sum of the entries in the vector $Au$ equals $S$ times the sample size $N = \sum_{\pi \in S_n} u(\pi)$.

In subsequent sections we will generalize to the situation where $S_n$ is replaced by a proper subset, in which case $A$ has fewer than $n!$ columns, still labeled by permutations. These will be the linear extensions of a given partial order on $[n] = \{1, 2, \ldots, n\}$. In fact, for some models we can even take the set of maximal chains in an arbitrary ranked poset. But for a first look we confine ourselves to the situation described above, where $A$ has $n!$ columns.

We now define four toric models for probability distributions on $S_n$. We do this by way of a verbal description of the sufficient statistics in each model. These sufficient statistics are numerical functions on the permutations $\pi$ of the given set $[n]$ of items to be ranked.

(a) In the ascending model, the sufficient statistics $Au$ record, for each subset $I \subset [n]$, the number of samples $\pi$ in the data $u$ that have the set $I$ at the bottom. Here, the set $I$ being at the bottom means that ($i \in I$ and $j \notin I$) implies $\pi(i) < \pi(j)$.

(b) In the Csiszár model, the sufficient statistics $Au$ count, for each $i \in I \subset [n]$, the number of samples that have $I$ at the bottom but with $i$ as winner in the group $I$. This is the model studied by Villő Csiszár [9, 10] under the name "L-decomposable".

(c) In the Birkhoff model of [12, §6.1], the sufficient statistics $Au$ of a data set $u$ record, for each $i, j \in [n]$, the number of samples $\pi$ in which object $i$ is ranked in place $j$.

(d) In the inversion model, the sufficient statistics $Au$ count, for each ordered pair $i < j$ in $[n]$, the number of samples $\pi$ in which that pair is an inversion, meaning $\pi^{-1}(i) > \pi^{-1}(j)$.
This model can be seen as a multivariate version of the Mallows model [25].

To illustrate the differences between these models let us consider the simplest case $n = 3$. In each case the toric ideal of the model is the kernel of a square-free monomial map from the polynomial ring $K[p_{123}, p_{132}, p_{213}, p_{231}, p_{312}, p_{321}]$ representing the probabilities to another polynomial ring $K[a, b, \ldots]$ that represents the model parameters. The model polytope is the convex hull of the six 0-1 vectors corresponding to the square-free monomials:

$$
\begin{array}{l|cccccc}
 & p_{123} & p_{132} & p_{213} & p_{231} & p_{312} & p_{321} \\ \hline
\text{Birkhoff} & a_{11}a_{22}a_{33} & a_{11}a_{23}a_{32} & a_{12}a_{21}a_{33} & a_{12}a_{23}a_{31} & a_{13}a_{21}a_{32} & a_{13}a_{22}a_{31} \\
\text{inversion} & b_{12}b_{13}b_{23} & b_{12}b_{13}q_{23} & q_{12}b_{13}b_{23} & q_{12}q_{13}b_{23} & b_{12}q_{13}q_{23} & q_{12}q_{13}q_{23} \\
\text{ascending} & c_{1}c_{12}c_{123} & c_{1}c_{13}c_{123} & c_{2}c_{12}c_{123} & c_{2}c_{23}c_{123} & c_{3}c_{13}c_{123} & c_{3}c_{23}c_{123} \\
\text{Csiszár} & d_{|1}d_{1|2}d_{12|3} & d_{|1}d_{1|3}d_{13|2} & d_{|2}d_{2|1}d_{12|3} & d_{|2}d_{2|3}d_{23|1} & d_{|3}d_{3|1}d_{13|2} & d_{|3}d_{3|2}d_{23|1}
\end{array}
$$

The toric ideals record the algebraic relations among these square-free monomials:

$$
\begin{aligned}
I_{inv} &= \langle\, p_{132}p_{231} - p_{123}p_{321},\; p_{213}p_{312} - p_{123}p_{321} \,\rangle && \text{has codimension } 2, \\
I_{birk} = I_{asc} &= \langle\, p_{123}p_{231}p_{312} - p_{132}p_{213}p_{321} \,\rangle && \text{has codimension } 1, \\
I_{csi} &= \langle 0 \rangle && \text{has codimension } 0.
\end{aligned}
$$

For each model, the matrix $A$ has six columns, indexed by $S_3$, and its rows are labeled by the model parameters. For example, for the ascending model, the matrix has seven rows:

$$
\mathrm{As}_3 \;=\;
\begin{array}{c|cccccc}
 & p_{123} & p_{132} & p_{213} & p_{231} & p_{312} & p_{321} \\ \hline
c_{1} & 1 & 1 & 0 & 0 & 0 & 0 \\
c_{2} & 0 & 0 & 1 & 1 & 0 & 0 \\
c_{3} & 0 & 0 & 0 & 0 & 1 & 1 \\
c_{12} & 1 & 0 & 1 & 0 & 0 & 0 \\
c_{13} & 0 & 1 & 0 & 0 & 1 & 0 \\
c_{23} & 0 & 0 & 0 & 1 & 0 & 1 \\
c_{123} & 1 & 1 & 1 & 1 & 1 & 1
\end{array}
$$

Here we use the same notation for both the matrix and the model polytope, which is the convex hull of the columns.
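The matrix for the ascending model can be rebuilt mechanically from the verbal description in (a). The following Python sketch is only an illustration: the helper names `ascending_matrix` and `rank`, the encoding of a permutation as a word whose prefixes are the sets at the bottom, and the data vector `u` are assumptions made for this example. It rebuilds the seven rows, checks that the exponent vector of the cubic generator lies in the kernel, and confirms on toy data that the entries of $Au$ sum to $S$ times the sample size with $S = 3$.

```python
from fractions import Fraction
from itertools import combinations, permutations

def ascending_matrix(n):
    """Rows: nonempty subsets I of [n]; columns: permutations as words.
    An entry is 1 when I is a prefix set of the word (I sits at the bottom)."""
    perms = list(permutations(range(1, n + 1)))
    subsets = [frozenset(c) for k in range(1, n + 1)
               for c in combinations(range(1, n + 1), k)]
    rows = [[1 if frozenset(w[:len(I)]) == I else 0 for w in perms]
            for I in subsets]
    return perms, subsets, rows

def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

perms, subsets, A = ascending_matrix(3)

# Exponent vector of p123*p231*p312 - p132*p213*p321.
sign = {(1, 2, 3): 1, (2, 3, 1): 1, (3, 1, 2): 1,
        (1, 3, 2): -1, (2, 1, 3): -1, (3, 2, 1): -1}
v = [sign[w] for w in perms]
kernel_check = [sum(a * x for a, x in zip(row, v)) for row in A]

# Hypothetical data: one count per permutation, in the order of `perms`.
u = [4, 0, 2, 1, 0, 3]
stats = [sum(a * x for a, x in zip(row, u)) for row in A]

print(rank(A) - 1)             # dimension of the model polytope
print(kernel_check)            # all zeros: the cubic lies in the toric ideal
print(sum(stats), 3 * sum(u))  # coordinate sum of Au versus S * N
```

Exact rational arithmetic is used so that the rank is not affected by rounding; the same `rank` helper works for any of the 0-1 model matrices.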
From the equality of ideals, $I_{birk} = I_{asc}$, we infer that the polytope $\mathrm{As}_3$ is affinely isomorphic to the $3 \times 3$ Birkhoff polytope, which is a cyclic 4-polytope with six vertices. The ideal $I_{inv}$ reveals that the model polytope for the inversion model is a regular octahedron, while the polytope for the Csiszár model is the full 5-simplex. To see that no two of our four models agree, we need to go to $n \geq 4$.

Example 2.1. Let $n = 4$. Then all four model polytopes have 24 vertices but their dimensions are different. The Birkhoff model has dimension 9, the inversion model has dimension 6, the ascending model has dimension 11, and the Csiszár model has dimension 17. Theorem 3.1 will explain the precise relationships and inclusions among the four models. $\square$

Our work on this project started by trying to understand a certain model whose toric closure is the ascending model. Here toric closure refers to the smallest toric model containing a given model. That non-toric model for ranking is the Plackett-Luce model [8, 24, 29]. It can be obtained from the ascending model by the following specialization of parameters:

$$
c_i \mapsto \frac{1}{\theta_i}, \qquad c_{ij} \mapsto \frac{1}{\theta_i + \theta_j}, \qquad c_{ijk} \mapsto \frac{1}{\theta_i + \theta_j + \theta_k}, \qquad \ldots
$$

The prime ideal of algebraic relations among the $p_\pi$ is a non-toric ideal which contains the toric ideal $I_{asc}$. The case $n = 3$ is worked out explicitly in Example 7.1. Geometrically, that smallest Plackett-Luce model corresponds to blowing up $\mathbb{P}^2$ at the nine points in (19).

3. Toric Models: Definitions and General Results

Let $Q$ be a poset on a finite ground set $\Omega$. A $Q$-ranking is a maximal chain $a_0 < \cdots < a_n$ in $Q$. A chain $a_0 < \cdots < a_n$ being maximal means that $a_0$ is minimal in $Q$, $a_n$ is maximal, and $a_i < a_{i+1}$ is a cover relation for $0 \leq i \leq n-1$. We write $\mathrm{M}(Q)$ for the set of maximal chains in $Q$ and $\mathrm{Cov}(Q)$ for the set of cover relations in $Q$.
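To make the definition of $\mathrm{M}(Q)$ concrete, here is a small Python sketch that enumerates the maximal chains of a finite poset given by its cover relations. The function names and the encoding of the Boolean lattice by frozensets are choices made for this illustration only. For the Boolean lattice on three elements, each chain is read off as the word of elements added at successive cover steps.

```python
from itertools import combinations
from math import factorial

def boolean_lattice_covers(n):
    """Cover relations of 2^[n]: a is covered by b when b adds one element."""
    elems = [frozenset(c) for k in range(n + 1)
             for c in combinations(range(1, n + 1), k)]
    covers = {a: [a | {x} for x in range(1, n + 1) if x not in a]
              for a in elems}
    return elems, covers

def maximal_chains(elems, covers):
    """All chains a_0 < ... < a_n that start at a minimal element, end at a
    maximal element, and use only cover relations."""
    covered = {b for ups in covers.values() for b in ups}
    minimal = [a for a in elems if a not in covered]
    chains = []

    def extend(chain):
        ups = covers[chain[-1]]
        if not ups:                      # nothing above: the chain is maximal
            chains.append(tuple(chain))
        for b in ups:
            extend(chain + [b])

    for a in minimal:
        extend([a])
    return chains

elems, covers = boolean_lattice_covers(3)
chains = maximal_chains(elems, covers)

# Read off a word from each chain: the element added at each cover step.
words = sorted(tuple(next(iter(b - a)) for a, b in zip(c, c[1:]))
               for c in chains)
print(len(chains), factorial(3))
print(words)
```

Passing a different `covers` dictionary, for instance the cover relations of a distributive lattice of order ideals, would enumerate the corresponding $Q$-rankings in the same way.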
If $Q = 2^{[n]}$ is the Boolean lattice of all subsets of $[n]$ ordered by inclusion, then the maximal chains in $Q$ are in bijection with the permutations in $S_n$, and the models below coincide with the ones described in Section 2.

We shall define four toric models whose states are the maximal chains $\pi \in \mathrm{M}(Q)$. The probability of $\pi$ is represented by an indeterminate $p_\pi$. Each toric model for $Q$-rankings is defined by a non-negative integer matrix $A$ whose columns are indexed by $\mathrm{M}(Q)$ and have a fixed coordinate sum $S$. The matrix $A$ represents a monomial map from the polynomial ring $K[p]$ in the unknowns $p_\pi$, $\pi \in \mathrm{M}(Q)$, to a suitably chosen second polynomial ring. Any data set gives a function $u : \mathrm{M}(Q) \to \mathbb{N}$, where $u(\pi)$ is the number of times the maximal chain $\pi$ has been observed. Thinking of $u$ as a column vector, we can form the matrix-vector product $Au$, whose entries are the sufficient statistics of the data set $u$. The coordinate sum of the vector $Au$ is equal to $S$ times the sample size $N = \sum_{\pi \in \mathrm{M}(Q)} u(\pi)$.

(a) In the ascending model, the sufficient statistic $Au$ records, for any given poset element $a \in Q$, the number of observed maximal chains $\pi$ that pass through $a$. The model parameters are represented by unknowns $c_a$, and the monomial map is $p_\pi \mapsto c_{a_0} c_{a_1} \cdots c_{a_n}$ for $\pi = (a_0 <$ […] $> \pi^{-1}(j)$. The model parameters are represented by unknowns $u_{ij}$ and $v_{ij}$. The monomial map is

$$
p_\pi \;\mapsto\; \prod_{\substack{1 \leq i < j \leq n \\ \pi^{-1}(i) < \pi^{-1}(j)}} u_{ij} \prod_{\substack{1 \leq i < j \leq n \\ \pi^{-1}(i) > \pi^{-1}(j)}} v_{ij} \qquad \text{for } \pi \in \mathcal{L}(P).
$$

In general, we have the following inclusions among the four toric models (a)-(d). These inclusions of toric varieties correspond to linear projections among the model polytopes.

Theorem 3.1. (i) The ascending model and the Csiszár model on a poset $Q$ satisfy $\mathcal{M}_{asc} \subseteq \mathcal{M}_{csi}$, provided $Q$ has either a unique minimal element $\hat{0}$ or a unique maximal element $\hat{1}$.
(ii) If $Q = \mathcal{O}(P)$ is a distributive lattice, then the Birkhoff model $\mathcal{M}_{birk}$, the inversion model $\mathcal{M}_{inv}$, the ascending model $\mathcal{M}_{asc}$ and the Csiszár model $\mathcal{M}_{csi}$ satisfy $\mathcal{M}_{inv} \subseteq \mathcal{M}_{csi}$ and $\mathcal{M}_{birk} \subseteq \mathcal{M}_{asc} \subseteq \mathcal{M}_{csi}$.

(iii) The inclusions in (ii) are strict in general. Moreover, if $n \geq 4$ and $Q = 2^{[n]}$ then $\mathcal{M}_{inv} \not\subset \mathcal{M}_{asc}$ and $\mathcal{M}_{birk} \not\subset \mathcal{M}_{inv}$.

Proof. We begin by establishing (iii). The fact that the inclusions in (ii) are strict follows from Example 2.1. For the second part of (iii) consider $n = 4$. A direct computation as in Section 6 reveals that the inversion model $\mathcal{M}_{inv}$ is a projective toric variety of dimension 6 and degree 180 in $\mathbb{P}^{23}$. The Markov basis of $I_{inv}$ consists of 81 quadrics. Since $\mathcal{M}_{birk}$ has dimension 9, we conclude that $\mathcal{M}_{birk} \not\subset \mathcal{M}_{inv}$. An explicit point $p$ in $\mathcal{M}_{birk} \setminus \mathcal{M}_{inv}$ is the uniform distribution on the nine derangements. This arises by setting $a_{ii} = 0$ for all $i$ and $a_{ij} = 1/\sqrt{3}$ for all $i \neq j$. The quadric $p_{1243}p_{4321} - p_{2143}p_{4312} \in I_{inv}$ does not vanish for this particular distribution.

The ascending model $\mathcal{M}_{asc}$ has dimension 11 and degree 808. The Markov basis of its toric ideal $I_{asc}$ consists of six quadrics, 64 cubics and 93 quartics. One of the cubics is

(1)   $p_{1234}\,p_{1342}\,p_{1423} - p_{1243}\,p_{1324}\,p_{1432} \in I_{asc}$.

An example of a point in $\mathcal{M}_{inv} \setminus \mathcal{M}_{asc}$ is obtained by taking the parameter values

$$
u_{12} = u_{13} = u_{14} = 0, \quad u_{23} = u_{24} = u_{34} = v_{12} = v_{13} = v_{23} = v_{24} = 1, \quad v_{34} = 2, \quad v_{14} = 1/9.
$$

The resulting distribution is supported on the six permutations in (1). Its coordinates are $p_{1234} = p_{1342} = p_{1423} = 2/9$ and $p_{1243} = p_{1324} = p_{1432} = 1/9$. This distribution is not a zero of (1), and hence it is not in the ascending model $\mathcal{M}_{asc}$.

The two probability distributions on permutations seen above can be lifted to similar counterexamples for $n \geq 5$, and we conclude that the non-inclusions are valid for all $n \geq 4$.
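The dimensions quoted for $n = 4$ and the membership of the two binomials used in this argument can be re-checked by a short computation. The Python sketch below is an illustration under stated assumptions: permutations are encoded as words, the sufficient statistics follow the conventions visible in the $n = 3$ table of Section 2 (bottom sets are prefixes, Birkhoff statistics are position-item pairs, and a pair $i < j$ is an inversion when $j$ precedes $i$), and all helper names are hypothetical. A binomial is tested for membership in a toric ideal by comparing the multisets of statistics of its two monomials.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def birkhoff(w):    # (position, item) pairs
    return [(k, x) for k, x in enumerate(w, 1)]

def inversion(w):   # each pair i < j, flagged as inverted or not
    pos = {x: k for k, x in enumerate(w)}
    return [(i, j, pos[i] > pos[j]) for i in w for j in w if i < j]

def ascending(w):   # nonempty prefix sets
    return [frozenset(w[:k]) for k in range(1, len(w) + 1)]

def csiszar(w):     # (bottom set, next item) pairs
    return [(frozenset(w[:k]), w[k]) for k in range(len(w))]

def model_dimension(stat, n):
    """Rank of the 0/1 model matrix minus one."""
    perms = list(permutations(range(1, n + 1)))
    feats = [set(stat(w)) for w in perms]
    labels = list(set().union(*feats))
    return rank([[1 if f in fs else 0 for fs in feats] for f in labels]) - 1

def balanced(stat, lhs, rhs):
    """True when both monomials have the same multiset of statistics."""
    total = lambda ws: sum((Counter(stat(w)) for w in ws), Counter())
    return total(lhs) == total(rhs)

dims = {f.__name__: model_dimension(f, 4)
        for f in (birkhoff, inversion, ascending, csiszar)}
print(dims)

cubic = ([(1, 2, 3, 4), (1, 3, 4, 2), (1, 4, 2, 3)],
         [(1, 2, 4, 3), (1, 3, 2, 4), (1, 4, 3, 2)])
quadric = ([(1, 2, 4, 3), (4, 3, 2, 1)], [(2, 1, 4, 3), (4, 3, 1, 2)])
print(balanced(ascending, *cubic), balanced(inversion, *quadric))

# Uniform distribution on the derangements, evaluated on the quadric.
derangements = [w for w in permutations(range(1, 5))
                if all(x != k for k, x in enumerate(w, 1))]
p = {w: Fraction(1, len(derangements)) for w in derangements}
value = (p.get((1, 2, 4, 3), 0) * p.get((4, 3, 2, 1), 0)
         - p.get((2, 1, 4, 3), 0) * p.get((4, 3, 1, 2), 0))
print(len(derangements), value)
```

The quadric evaluates to a nonzero number because $1243$ fixes the item $1$ and therefore has probability zero, while $2143$ and $4312$ are both derangements.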
The inclusion $\mathcal{M}_{asc} \subset \mathcal{M}_{csi}$ in (i) is seen by the specialization of parameters that sends $d_a$ […] $v_{ij}$. This shows that the inversion model $\mathcal{M}_{inv}$ is a subvariety of the Csiszár model $\mathcal{M}_{csi}$.

It remains to show that $\mathcal{M}_{birk} \subset \mathcal{M}_{asc}$. To do this, we let $A$ denote the model matrix for $\mathcal{M}_{birk}$ and $B$ the model matrix for $\mathcal{M}_{asc}$. Both matrices have their entries in $\{0, 1\}$ and they have $|\mathcal{L}(P)|$ columns. The rows $A_{ij}$ of $A$ are indexed by ordered pairs $(i, j) \in [n] \times [n]$, and the rows $B_I$ of $B$ are indexed by subsets of $[n]$. We have the identity

$$
A_{ij} \;=\; \sum \Big\{ B_I : I \in \tbinom{[n]}{j} \text{ and } i \in I \Big\} \;-\; \sum \Big\{ B_I : I \in \tbinom{[n]}{j-1} \text{ and } i \in I \Big\}.
$$

This shows that every row of $A$ is a $\mathbb{Z}$-linear combination of the rows of $B$. Hence, the kernel of $A$ contains the kernel of $B$, and this implies that the toric ideal $I_A = I_{birk}$ contains the toric ideal $I_B = I_{asc}$. We conclude that $\mathcal{M}_{birk}$ is a submodel of $\mathcal{M}_{asc}$. $\square$

In the rest of this paper we consider the ascending and Csiszár models only in the graded situation, that is, when the monomial images of all the unknowns $p_c$, $c \in \mathrm{M}(Q)$, have the same total degree. The latter is equivalent to requiring that all maximal chains in $Q$ have the same cardinality, which in turn is equivalent to $Q$ being graded. For a graded poset $Q$ we denote by $\mathrm{rk} : Q \to \mathbb{N}$ its rank function and write $Q_i$ for the set of its elements of rank $i$. By $\mathrm{rk}(Q)$ we denote the rank of $Q$, which is the maximal rank of any of its elements.

In the next three sections we undertake a detailed study of the models (b), (a) and (d), in this order. The Birkhoff model (c) has already received considerable attention in the literature [11, 12], at least for $\mathcal{L}(P) = S_n$, and we content ourselves with a few brief remarks.
Its model polytope, the Birkhoff polytope of doubly stochastic matrices, is a key player in combinatorial optimization, and it is linked to many fields of pure mathematics. The restriction of the Birkhoff model and its polytope to proper subsets $\mathcal{L}(P)$ of $S_n$ has been studied only in some special cases. For example, Chan, Robbins and Yuen [7] considered this polytope for the constraint poset $P$ given by the transitive closure of $j > j-2$ and $j > j-3$ for $3 \leq j \leq n$. They stated a conjecture on its volume which was proved by Zeilberger [34]. We close by noting a formula for the dimension of these polytopes.

Proposition 3.2. Let $P$ be an arbitrary constraint poset on $[n] = \{1, 2, \ldots, n\}$. Set

$$
Z = \{ (i,j) \in [n] \times [n] \;|\; \pi(i) \neq j \text{ for all } \pi \in \mathcal{L}(P) \}
$$

and

$$
C = \big\{ (i,j) \in [n] \times [n] \;|\; (i,j) \notin Z \text{ and } \big( (i,j') \in Z \text{ for all } j' > j \ \text{ or } \ (i',j) \in Z \text{ for all } i' > i \big) \big\}.
$$

The model polytope $\mathrm{Bi}$ of the Birkhoff model, expressed using coordinates $x_{ij}$ on $\mathbb{R}^{n \times n}$, equals the face of the classical Birkhoff polytope of bistochastic $n \times n$-matrices defined by

(2)   $x_{ij} = 0$ for all $(i,j) \in Z$.

In particular, the dimension of the Birkhoff model polytope is $\dim(\mathrm{Bi}) = n^2 - |Z| - |C|$.

Proof. Clearly, the model polytope $\mathrm{Bi}$ of the Birkhoff model is contained in the classical Birkhoff polytope. Equally obvious is that all equations (2) are valid for the model polytope. Hence $\mathrm{Bi}$ is contained in the polytope cut out from the classical Birkhoff polytope by (2). Following the lines of the Birkhoff-von Neumann Theorem (see e.g. [1, (5.2)]), we note that the vertices of the polytope cut out by (2) from the classical Birkhoff polytope are the permutation matrices for the permutations $\pi \in \mathcal{L}(P)$. The first assertion now follows. The linear relations on the Birkhoff polytope state that all row and column sums are 1. We set $x_{ij} = 0$ for $(i,j) \in Z$.
In the resulting linear relations, precisely the variables $x_{ij}$ for $(i,j) \in C$ are the leading terms with respect to the order of the variables induced by the lexicographic order on the index tuples. This proves the dimension statement. $\square$

We illustrate Proposition 3.2 with two simple examples. If $P$ is an $n$-element antichain then $Z = \emptyset$ and $C = \{(1,n), (2,n), \ldots, (n,n), (n,n-1), \ldots, (n,1)\}$. Here our formula gives the dimension $n^2 - 0 - (2n-1) = (n-1)^2$ of the classical Birkhoff polytope. If $P$ is the $n$-chain $1 < 2 < \cdots$ […] $0$. Let $c''_1, \ldots, c''_s$ be the chains from the above expansion of $x''$ that contain $a$ and for which $\lambda_{c''} > 0$. The coordinate $x'_a$ of $x'$ then equals $\sum \lambda_{c'_i}$ and the coordinate $x''_a$ of $x''$ equals $\sum \lambda_{c''_i}$. Since $x'_a$ and $x''_a$ coincide with the coordinate $x_a$ of $x$, we have $\sum \lambda_{c'_i} = \sum \lambda_{c''_i}$. After relabeling (and possibly swapping $x'$ and $x''$) we may assume that $\lambda_{c'_1}$ is the minimum of $\{\lambda_{c'_1}, \ldots, \lambda_{c'_r}, \lambda_{c''_1}, \ldots, \lambda_{c''_s}\}$. Then we replace $\lambda_{c''_1}$ by $\lambda_{c''_1} - \lambda_{c'_1}$. Let $c_1 \in \mathrm{M}(Q)$ be the concatenation of $c'_1$ and $c''_1$. Now set $\lambda_{c_1} = \lambda_{c'_1}$ and proceed with the new coefficients and the chains $c'_2, \ldots, c'_r$ and $c''_1, \ldots, c''_s$. Clearly the sums of the coefficients of $c'_2, \ldots, c'_r$ and $c''_1, \ldots, c''_s$ still coincide. Proceeding by induction and summing over all $a \in Q_{n-1}$ for which $x_a > 0$, one constructs an expansion $\sum \lambda_i c_i$ in terms of chains in $\mathrm{M}(Q)$ whose projection onto $\mathrm{M}(Q')$ equals $x'$ and whose projection onto $\mathrm{M}(Q'')$ equals $x''$. Hence $x = \sum \lambda_i c_i$, and we have $\lambda_i \geq 0$ and $\sum \lambda_i = \sum_{a \in Q_{n-1}} x_a = 1$ by (7). This proves that $x \in \mathrm{As}$. $\square$

In the preceding proof, when showing that any $x$ satisfying (7)-(9) lies in $\mathrm{As}$, we use (9) only in the induction base $n = 1$.
The equations (7) are complete and independent when $Q = 2^{[n]}$ is the Boolean lattice, so in that case the dimension of the model polytope $\mathrm{As}$ is equal to $2^n - n - 1$. In general the dimension is more subtle to calculate and we do not know any good description. For example, if the induced subposet of $Q$ on the elements of two consecutive ranks $i$ and $i+1$ is disconnected, then $\mathrm{As}$ is contained in each hyperplane defined by the equality of the sum over the variables of rank $i$ and $i+1$ in a component.

Now we turn to the toric ideal $I_{asc}$ of the ascending model. It is the kernel of the map

(10)   $K[p] \to K[t], \quad p_\pi \mapsto t_{a_0} t_{a_1} \cdots t_{a_n}$ for $\pi = (a_0 < \cdots < a_n) \in \mathrm{M}(Q)$.

If $\mathrm{rk}(Q) = 0$ then this map is injective and $I_{asc} = \{0\}$, so we assume $\mathrm{rk}(Q) \geq 1$ from now on. The case $\mathrm{rk}(Q) = 1$ serves as the base case for our inductive constructions. Here the poset $Q$ is identified with a bipartite graph on $Q_0$ and $Q_1$, and the monomial map $p_\pi \mapsto t_{a_0} t_{a_1}$ defines the toric ring associated with a bipartite graph in commutative algebra. A generating set of the kernel of this map was determined in [27, Lemma 1.1] and shown to be a universal Gröbner basis in [33, Proposition 8.1.10]. This result has already proven to be useful in algebraic statistics (see e.g. [14]).

Lemma 5.2 (Ohsugi-Hibi [27], Villarreal [33]). Let $Q$ be a graded poset of rank 1. Then a universal Gröbner basis of the toric ideal $I_{asc}$ consists of all cycles in $Q$, expressed as binomials $p_{a_0}$ […] $a_2 < \cdots\ a_{2m} = a_0)$ in the subposet $Q_{i,i+1}$ of all elements having rank $i$ or $i+1$ in $Q$.
Then the maximal chains $\pi_j, \bar\pi_j$ for $0 \leq j \leq s$ are chosen such that

$$
\pi_j = (u_{j,0} = \bar u_{j,0} < \cdots < u_{j,i} = \bar u_{j,i} = a_{2j} < a_{2j+1} = u_{j,i+1} < \cdots < u_{j,n})
$$

and

$$
\bar\pi_j = (u_{j,0} = \bar u_{j,0} < \cdots < u_{j,i} = \bar u_{j,i} = a_{2j} < a_{2j-1} = \bar u_{j,i+1} < \cdots < \bar u_{j,n})
$$

and the multisets $\{u_{j,\ell} \mid 0 \leq j \leq s,\ i \leq \ell \leq n\}$ and $\{\bar u_{j,\ell} \mid 0 \leq j \leq s,\ i \leq \ell \leq n\}$ coincide.

In Figure 1 we give a visual description of the binomial (12).

Figure 1. A binomial in the Gröbner basis of the ascending model.

For the proof of this result we shall employ Sullivant's theory of toric fiber products from [32]. We briefly review that theory. Consider two polynomial rings $K[p']$ and $K[p'']$ and a surjective multigrading $\phi : \{p'\} \cup \{p''\} \to \mathcal{A} \subseteq \mathbb{R}^d$, called the $\mathcal{A}$-grading. Then choose new variables $z_{\pi,\tau}$ for all $\pi \in \{p'\}$ and $\tau \in \{p''\}$ such that $\phi(\pi) = \phi(\tau)$. For ideals $I$ in $K[p']$ and $J$ in $K[p'']$ that are $\mathcal{A}$-homogeneous, we let $I \times_{\mathcal{A}} J$ denote the kernel of the map $z_{\pi,\tau} \mapsto p'_\pi \otimes p''_\tau$ from $K[z]$ to the tensor product $K[p']/I \otimes K[p'']/J$.

In order to describe a Gröbner basis of $I \times_{\mathcal{A}} J$ in terms of Gröbner bases of $I$ and $J$, the concept of lifting monomials turns out to be crucial [32, p. 567]. A lift of a variable $p'_\pi$ is $z_{\pi\tau}$ for some $\tau$ with $\phi(\pi) = \phi(\tau)$. Now assume that $\mathcal{A}$ is linearly independent. Let $f \in K[p']$ be an $\mathcal{A}$-homogeneous polynomial. Each monomial $m$ in $f$ factors as $m_{a_1} \cdots m_{a_r}$, where $\mathcal{A} = \{a_1, \ldots, a_r\}$ and $\phi(m_{a_i}) = \deg(m_{a_i})\, a_i$. Moreover, since $\mathcal{A}$ is linearly independent, each monomial $m$ in $f$ gives the same number $d_i := \deg(m_{a_i})$ of variables of degree $a_i$ (counted with multiplicity). Now choose a multiset of $d_i$ variables $p''$ of degree $a_i$.
A lift of $f$ is then any polynomial obtained from the above choices when lifting the variables in each monomial from $f$ in such a way that for all monomials the chosen multisets are exhausted.

Proof. We proceed by induction on $n = \mathrm{rank}(Q)$. If $n = 1$ then (11) describes an empty set of binomials and the set in (12) coincides with the Gröbner basis given in Lemma 5.2. Now assume $n \geq 2$. As in the proof of Theorem 5.1 we split $Q$ into the subposet $Q' = Q_0 \cup \cdots \cup Q_{n-1}$ consisting of ranks $0, \ldots, n-1$ and the bipartite poset $Q'' = Q_{n-1} \cup Q_n$ consisting of ranks $n-1$ and $n$. Assume $Q_{n-1} = \{a_1, \ldots, a_r\}$. Any chain in $\mathrm{M}(Q')$ ends in an element from $Q_{n-1}$, and any chain from $\mathrm{M}(Q'')$ starts in an element from $Q_{n-1}$. We consider the polynomial ring $K[p']$ with variables $p'_\pi$ for $\pi \in \mathrm{M}(Q')$ and $K[p'']$ with variables $p''_\pi$ for $\pi \in \mathrm{M}(Q'')$. Then we grade $p'_\pi$ by $e_i \in \mathbb{R}^r$ if $\pi$ ends in $a_i$ and $p''_\pi$ by $e_i \in \mathbb{R}^r$ if $\pi$ begins in $a_i$. Note that the set of degrees $\mathcal{A} = \{e_1, \ldots, e_r\}$ is linearly independent. We write $I'_{asc}$ for the ideal of the ascending model of $Q'$ and $I''_{asc}$ for the ideal of the ascending model of $Q''$. The toric ideal of interest to us is the fiber product $I_{asc} = I'_{asc} \times_{\mathcal{A}} I''_{asc}$. Since $\mathcal{A}$ is linearly independent, we can apply [32, Theorem 12] and the induction hypothesis to prove the claim. Sullivant's result tells us that a Gröbner basis of $I_{asc}$ can be found by lifting Gröbner bases of the ideals $I'_{asc}$ and $I''_{asc}$ and by adding some quadratic relations.

By induction, $I'_{asc}$ has a Gröbner basis $G'$ consisting of elements (11) and (12). We shall lift these to binomials in $I_{asc}$. Likewise, $I''_{asc}$ has a Gröbner basis $G''$ consisting of elements (12). There are no binomials of type (11) in $I''_{asc}$ because the poset $Q''$ has only rank 1.
Lifting (11): Let $p_{\pi_1} p_{\pi_2} - p_{\bar\pi_1} p_{\bar\pi_2}$ be a quadric (11) in $G'$. Since it is $\mathcal{A}$-homogeneous, the multisets of endpoints of $\pi_1, \pi_2$ and $\bar\pi_1, \bar\pi_2$ coincide. Suppose $\pi_1$ and $\bar\pi_1$ have the same endpoint. In the lifting described above we need to distinguish two cases.

Case 1: $\pi_1$ and $\pi_2$ end in different endpoints. Then, for any two maximal chains $\tau_1, \tau_2$ in $Q''$ starting in the endpoints of $\pi_1$ and $\pi_2$ respectively, the unique lift for these choices is

(13)   $p_{\pi_1\tau_1} \cdot p_{\pi_2\tau_2} - p_{\bar\pi_1\tau_1} \cdot p_{\bar\pi_2\tau_2} \in I_{asc}$.

Case 2: $\pi_1$ and $\pi_2$ end in the same endpoint. Then, for any two chains $\tau_1, \tau_2$ in $Q''$ starting in the common endpoint of $\pi_1$ and $\pi_2$, besides the lift (13) we also have the lift

(14)   $p_{\pi_1\tau_1} \cdot p_{\pi_2\tau_2} - p_{\bar\pi_1\tau_2} \cdot p_{\bar\pi_2\tau_1} \in I_{asc}$.

One easily checks that the binomials from (13) and (14) satisfy the conditions from (11).

Lifting (12): First consider a binomial $p_{\pi_1} \cdots p_{\pi_s} - p_{\bar\pi_1} \cdots p_{\bar\pi_s}$ of type (12) in the Gröbner basis $G'$. Since it is $\mathcal{A}$-homogeneous, the multisets $\{\phi(\pi_1), \ldots, \phi(\pi_s)\}$ and $\{\phi(\bar\pi_1), \ldots, \phi(\bar\pi_s)\}$ coincide. Now choose maximal chains $\pi''_1, \ldots, \pi''_s$ from $Q''$ with the same multiset of $\mathcal{A}$-degrees $\{\phi(\pi''_1), \ldots, \phi(\pi''_s)\}$. Note that the $\pi''_i$ are just single cover relations. For any $\gamma \in S_s$ such that $\phi(\bar\pi_j) = \phi(\pi''_{\gamma(j)})$, the binomial

$$
p_{\pi_1\pi''_1} \cdots p_{\pi_s\pi''_s} - p_{\bar\pi_1\pi''_{\gamma(1)}} \cdots p_{\bar\pi_s\pi''_{\gamma(s)}}
$$

lies in $I_{asc}$ and is of type (12).

We next consider a binomial $p_{\pi_1} \cdots p_{\pi_s} - p_{\bar\pi_1} \cdots p_{\bar\pi_s}$ of type (12) in the Gröbner basis $G''$. The proof is analogous to the previous case, but the multiset of $\mathcal{A}$-degrees $\{\phi(\pi_1), \ldots, \phi(\pi_s)\} = \{\phi(\bar\pi_1), \ldots, \phi(\bar\pi_s)\}$ here is actually a set. Choosing a set $\{\pi'_1, \ldots, \pi'_s\}$ of maximal chains from $Q'$ for which $\{\phi(\pi_1), \ldots, \phi(\pi_s)\}$ and $\{\phi(\pi'_1), \ldots, \phi(\pi'_s)\}$ coincide leads to a unique lift

$$
p_{\pi'_1\pi_1} \cdots p_{\pi'_s\pi_s} - p_{\pi'_1\bar\pi_1} \cdots p_{\pi'_s\bar\pi_s}
$$

in $I_{asc}$ of type (12).

All the binomials constructed by these liftings from $G'$ and $G''$ are among the binomials described in (11) and (12) for the ideal $I_{asc}$ we seek to generate. Finally, we add the quadratic binomials

$$
p_{\pi'_1\pi''_1}\, p_{\pi'_2\pi''_2} - p_{\pi'_1\pi''_2}\, p_{\pi'_2\pi''_1}
$$

for all maximal chains $\pi'_1, \pi'_2 \in \mathrm{M}(Q')$ and $\pi''_1, \pi''_2 \in \mathrm{M}(Q'')$ whose $\mathcal{A}$-degrees coincide. These binomials lie in $I_{asc}$ and they have type (11).

We have shown that the liftings of the Gröbner bases $G'$ for $I'_{asc}$ and $G''$ for $I''_{asc}$, plus the additional quadrics, form a subset of the binomials described in (11) and (12). Using [32, Theorem 12], we conclude that the binomials from (11) and (12) form a Gröbner basis of $I_{asc}$. Actually, the following converse is true as well: all binomials (11) and (12) in $I_{asc}$ arise from $I'_{asc}$ and $I''_{asc}$ using the lifting procedure we described. $\square$

Corollary 5.4. The toric algebra $K[p]/I_{asc}$ is normal and Cohen-Macaulay.

Proof. Theorem 5.3 gave a Gröbner basis for $I_{asc}$ whose leading monomials are squarefree. This shows that $K[p]/I_{asc}$ is normal. Hochster's Theorem [19, Theorem 1] implies Cohen-Macaulayness. $\square$

We could also give an alternative proof of Theorem 4.3 using toric fiber products. Namely, the toric algebra $K[p]/I_{csi}$ can be obtained as an iterated toric fiber product of suitably graded smaller polynomial rings that are attached to the pieces in a decomposition of $Q$ into antichains. The matrices $M_q$ introduced after the proof of Theorem 4.3 represent the "glueing quadrics" used for constructing larger toric ideals from smaller ones.

We close with some brief remarks on the ascending model for the Boolean lattice $Q = 2^{[n]}$.
In Section 2 we saw that, for $n = 3$, the ideal $I_{asc}$ is principal with generator $p_{123}p_{231}p_{312} - p_{132}p_{213}p_{321}$. This cubic is of type (12). It represents the unique cycle in the hexagon $Q_{1,2}$. For $n = 4$, the minimal Markov basis of the ascending model consists of 6 quadrics, 64 cubics and 93 quartics. Thus, here we encounter binomials of both types (11) and (12). The Hilbert series of the Cohen-Macaulay ring $K[p]/I_{asc}$ for $Q = 2^{[4]}$ is found to be

$$
\frac{1 + 12t + 72t^2 + 228t^3 + 291t^4 + 168t^5 + 36t^6}{(1-t)^{12}}.
$$

6. The Inversion Model

The inversion model is defined only in the case when $Q$ is the distributive lattice associated with a constraint poset $P$ on $[n]$. The maximal chains in $Q$ correspond to linear extensions $\pi \in \mathcal{L}(P)$ of the constraint poset. These are the permutations $\pi \in S_n$ that are compatible with $P$. Fix unknowns $u_{ij}$ and $v_{ij}$ for $1 \leq i < j \leq n$. Algebraically, the inversion model is defined by the toric ideal which is the kernel of the monomial map

$$
p_\pi \;\mapsto\; \prod_{\substack{1 \leq i < j \leq n \\ \pi^{-1}(i) < \pi^{-1}(j)}} u_{ij} \prod_{\substack{1 \leq i < j \leq n \\ \pi^{-1}(i) > \pi^{-1}(j)}} v_{ij}.
$$

We begin by considering the unconstrained inversion model. By this we mean the case when $P$ is an $n$-element antichain, so there are no constraints at all. In that unconstrained case, we have $Q = 2^{[n]}$ and our state space $\mathrm{M}(Q) = S_n = \mathcal{L}(P)$ consists of all $n!$ permutations.

The Mallows model [25] is a natural specialization of the unconstrained inversion model to a single parameter $q$. It is obtained by setting $u_{ij} := 1$ and $v_{ij} := q$. So, in this model, the probability of observing the permutation $\pi$ is

$$
P(\pi) = Z^{-1} q^{|\mathrm{inv}(\pi)|}, \quad \text{where } \mathrm{inv}(\pi) = \{ (i,j) : 1 \leq i < j \leq n,\ \pi^{-1}(i) > \pi^{-1}(j) \}
$$

is the set of inversions of $\pi$, and $Z$ is a normalizing constant. In contrast, our inversion model permits different parameters for the various inversions occurring in a permutation.
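As a small worked instance of the Mallows specialization, the following Python sketch computes the distribution for $n = 3$ in exact arithmetic. The function names and the choice $q = 1/2$ are assumptions made for illustration, and a permutation is encoded as a word in which the pair $i < j$ counts as an inversion when $j$ precedes $i$. The sketch also evaluates the two quadrics that generate $I_{inv}$ for $n = 3$ in Section 2.

```python
from fractions import Fraction
from itertools import permutations

def inversions(w):
    """Pairs (i, j) with i < j such that j precedes i in the word w."""
    pos = {x: k for k, x in enumerate(w)}
    return {(i, j) for i in w for j in w if i < j and pos[i] > pos[j]}

def mallows(n, q):
    """Distribution P(pi) = q^{|inv(pi)|} / Z on S_n, together with Z."""
    weights = {w: q ** len(inversions(w))
               for w in permutations(range(1, n + 1))}
    Z = sum(weights.values())
    return {w: x / Z for w, x in weights.items()}, Z

q = Fraction(1, 2)
P, Z = mallows(3, q)
print(Z)                              # normalizing constant
print(P[(1, 2, 3)], P[(3, 2, 1)])     # identity and full reversal

# The two quadrics generating the toric ideal of the inversion model, n = 3.
g1 = P[(1, 3, 2)] * P[(2, 3, 1)] - P[(1, 2, 3)] * P[(3, 2, 1)]
g2 = P[(2, 1, 3)] * P[(3, 1, 2)] - P[(1, 2, 3)] * P[(3, 2, 1)]
print(g1, g2)
```

Both quadrics vanish because each monomial involves three inversions in total, so the Mallows distribution is a point of the inversion model, as expected from the specialization $u_{ij} := 1$, $v_{ij} := q$.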
The model polytope for the unconstrained inversion model is a familiar object in combinatorial optimization, where it is known as the linear ordering polytope [15, 18]. It is known that optimizing a general linear function over the linear ordering polytope is an NP-hard problem [18]. This mirrors the fact that the facial structure of this polytope is very complicated and a complete description appears out of reach. As a result of this, we expect the toric rings associated with the inversion models to be more complicated than those studied in the previous two sections. Our study was limited to finding some computational results.

Theorem 6.1. For $n \le 6$ the toric ring of the unconstrained inversion model is normal and hence Cohen-Macaulay. For $n \le 5$ it is Gorenstein and its Markov basis consists of quadrics. For $n = 6$ it is not Gorenstein and there exists a Markov basis element of degree 3.

Proof. Computations using 4ti2 [16] show that the Markov basis for $n = 3, 4, 5$ consists of $2$, $81$, $3029$ quadratic binomials. We do not know whether there is a quadratic Gröbner basis for $n = 5$, or whether the ring is Koszul. The Hilbert series for $n \le 5$ are

  n   Hilbert series
  3   $(1 + 2t + t^2)/(1-t)^4$
  4   $(1 + 17t + 72t^2 + 72t^3 + 17t^4 + t^5)/(1-t)^7$
  5   $(1 + 109t + 2966t^2 + 22958t^3 + 61026t^4 + 61026t^5 + 22958t^6 + 2966t^7 + 109t^8 + t^9)/(1-t)^{11}$

All three numerator polynomials are symmetric. Using normaliz [5] one checks that the toric ring is normal in each case. Hochster's Theorem [19] implies that it is Cohen-Macaulay. The Gorenstein property now follows from the general result that any Cohen-Macaulay domain whose Hilbert series has a symmetric numerator polynomial is Gorenstein.

For $n = 6$, the computations are much harder, and they reveal that the above nice properties no longer hold.
The software also found that the Hilbert series of this unconstrained inversion model is the product of $1/(1-t)^{16}$ and the remarkable numerator polynomial
\[
\begin{aligned}
&1 + 704t + 117783t^2 + 5125328t^3 + 76415229t^4 + 475189840t^5 + 1372165343t^6 + 1943081264t^7 \\
&\quad + 1372165343t^8 + 475189840t^9 + 76416069t^{10} + 5127008t^{11} + 118623t^{12} + 704t^{13} + t^{14}.
\end{aligned}
\]
This polynomial is close to symmetric but not symmetric, so the ring is not Gorenstein. In addition to 130377 quadrics, a Markov basis for $n = 6$ must contain the cubic binomial
\[
p_{123456}\,p_{123645}\,p_{416253} - p_{123465}\,p_{162345}\,p_{412536}. \tag{15}
\]
Indeed, a computation shows that these are the only two cubic monomials in the fiber given by the multiset of inversions
\[
\{(1,4), (2,4), (2,6), (3,4), (3,5), (3,6), (4,6), (5,6), (5,6)\}. \qquad\square
\]

A complete description of the binomial quadrics in a Markov basis was recently found by Katthän [23]. However, the problem of characterizing a full Markov basis is wide open. We do not know whether normality holds for $n \ge 7$, but we suspect not.

To address this question, we return to the general situation of an underlying constraint poset $P$. The states $\pi$ of the $P$-constrained inversion model are elements of the subset $\mathcal{L}(P) \subset S_n$. This inclusion corresponds to passing to some coordinate hyperplanes in the ambient space of the model polytopes. Therefore, the model polytope for the $P$-constrained model is a face of the model polytope for the unconstrained model. Hence, to answer our question about normality for $n \ge 7$, it could suffice to show that the toric ring for some $P$ is not normal. At present our state of knowledge about the $P$-constrained inversion models is rather limited. We do not yet even have a useful formula for the dimension of its model polytope.
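Two of the claims made for $n = 6$ can be checked by hand-sized computations. The sketch below is illustrative only and is not the 4ti2 or Normaliz computation used in the proof. It assumes the word convention for permutations (item $j$ listed before item $i$, with $i<j$, counts as the inversion $(i,j)$), it reads the second-to-last term of the printed numerator as $704t^{13}$, and the helper names are hypothetical.

```python
from collections import Counter


def inversions(word):
    """Pairs (i, j), i < j, such that item j is listed before item i in `word`."""
    position = {item: k for k, item in enumerate(word)}
    items = sorted(word)
    return [(i, j) for a, i in enumerate(items) for j in items[a + 1:]
            if position[i] > position[j]]


def inversion_multiset(monomial):
    """Exponent vector of the v-variables for a product of unknowns p_pi."""
    total = Counter()
    for label in monomial:
        total.update(inversions(tuple(int(c) for c in label)))
    return total


# Both sides of the cubic binomial (15). Equal degree plus equal v-exponents
# forces equal u-exponents, since every pair is either an inversion or not.
lhs = ["123456", "123645", "416253"]
rhs = ["123465", "162345", "412536"]
same_fiber = inversion_multiset(lhs) == inversion_multiset(rhs)

# Numerators of the Hilbert series, coefficients listed from t^0 upward.
numerators = {
    3: [1, 2, 1],
    4: [1, 17, 72, 72, 17, 1],
    5: [1, 109, 2966, 22958, 61026, 61026, 22958, 2966, 109, 1],
    6: [1, 704, 117783, 5125328, 76415229, 475189840, 1372165343,
        1943081264, 1372165343, 475189840, 76416069, 5127008, 118623, 704, 1],
}


def is_symmetric(coefficients):
    """A numerator is symmetric if its coefficient list is a palindrome."""
    return coefficients == coefficients[::-1]


print(same_fiber, {n: is_symmetric(c) for n, c in numerators.items()})
```

The first check only confirms that the two monomials of (15) have the same image under the monomial map, so the binomial lies in the toric ideal. It does not show that the binomial is needed in a Markov basis, which requires the fiber computation cited in the proof.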
By contrast, the dimension of the unconstrained model equals $\binom{n}{2}$, as this is the dimension of the linear ordering polytope. This was shown, for example, in [30, Proposition 3.10].

We wish to mention a family of constraint posets that is important for applications of statistical ranking in data mining, e.g. in recent work of Cheng et al. [8]. For that application one would take $P$ to be any disjoint union of a chain and an antichain.

Example 6.2. Let $n \ge 4$ and $P$ be the poset consisting of the 3-chain $1 < 2 < 3$ and $n-3$ incomparable elements. If $n = 4$ then $\mathcal{L}(P) = \{1234, 1243, 1423, 4123\}$ and the toric ideal $I_{\mathrm{inv}}$ is the zero ideal in the polynomial ring in four unknowns. If $n = 5$ then the number of states is 20 and the model polytope has dimension 7, degree 82, and the Hilbert series is
\[
\frac{1 + 12t + 38t^2 + 28t^3 + 3t^4}{(1-t)^8}.
\]
The Markov basis for this $P$-constrained model consists of 40 quadrics:
\[
\begin{array}{ll}
p_{41523}p_{51423} - p_{14523}p_{54123} & p_{41253}p_{51423} - p_{14253}p_{54123} \\
p_{41235}p_{51423} - p_{14235}p_{54123} & p_{41253}p_{51243} - p_{12453}p_{54123} \\
p_{41235}p_{51243} - p_{12435}p_{54123} & p_{15423}p_{51243} - p_{15243}p_{51423} \\
p_{14253}p_{51243} - p_{12453}p_{51423} & p_{14235}p_{51243} - p_{12435}p_{51423} \\
p_{41235}p_{51234} - p_{12345}p_{54123} & p_{15423}p_{51234} - p_{15234}p_{51423} \\
p_{15243}p_{51234} - p_{15234}p_{51243} & p_{14235}p_{51234} - p_{12345}p_{51423} \\
p_{12543}p_{51234} - p_{12534}p_{51243} & p_{12435}p_{51234} - p_{12345}p_{51243} \\
p_{15423}p_{45123} - p_{14523}p_{54123} & p_{15243}p_{45123} - p_{41523}p_{51243} \\
p_{15234}p_{45123} - p_{41523}p_{51234} & p_{12543}p_{45123} - p_{12453}p_{54123} \\
p_{12534}p_{45123} - p_{41253}p_{51234} & p_{12354}p_{45123} - p_{12345}p_{54123} \\
p_{15243}p_{41253} - p_{12543}p_{41523} & p_{15234}p_{41253} - p_{12534}p_{41523} \\
p_{14523}p_{41253} - p_{14253}p_{41523} & p_{15234}p_{41235} - p_{12354}p_{41523} \\
p_{14523}p_{41235} - p_{14235}p_{41523} & p_{14253}p_{41235} - p_{14235}p_{41253} \\
p_{12534}p_{41235} - p_{12354}p_{41253} & p_{12453}p_{41235} - p_{12435}p_{41253} \\
p_{14253}p_{15243} - p_{12453}p_{15423} & p_{14235}p_{15243} - p_{12435}p_{15423} \\
p_{14235}p_{15234} - p_{12345}p_{15423} & p_{12543}p_{15234} - p_{12534}p_{15243} \\
p_{12435}p_{15234} - p_{12345}p_{15243} & p_{12543}p_{14523} - p_{12453}p_{15423} \\
p_{12534}p_{14523} - p_{14253}p_{15234} & p_{12354}p_{14523} - p_{12345}p_{15423} \\
p_{12534}p_{14235} - p_{12354}p_{14253} & p_{12453}p_{14235} - p_{12435}p_{14253} \\
p_{12435}p_{12534} - p_{12345}p_{12543} & p_{12354}p_{12453} - p_{12345}p_{12543}
\end{array}
\]

It can be asked which $P$-constrained inversion models have a Markov basis of quadrics and, more generally, which degrees appear in a Markov basis. We confirmed the quadratic Markov basis for all posets $P$ on $n \le 4$ elements, all on $n = 5$ elements arising by adding one incomparable element to a poset on 4 elements, and all unconstrained models for $n \le 5$.

Interestingly, the notion of inversion model changes if we define $i < j$ to be an inversion if $\pi(i) > \pi(j)$. The latter can be seen as a homogeneous Babington-Smith model from [25]. The defining monomial map for this alternative inversion model equals
\[
p_\pi \;\mapsto \prod_{\substack{1\le i<j\le n \\ \pi(i)<\pi(j)}} u_{ij} \prod_{\substack{1\le i<j\le n \\ \pi(i)>\pi(j)}} v_{ij} \qquad \text{for } \pi \in \mathcal{L}(P).
\]
For the 3-chain $1 < 2 < 3$ with two incomparable elements, the Markov basis now consists of
\[
\begin{array}{ll}
p_{15243}p_{51423} - p_{12543}p_{54123} & p_{15234}p_{51423} - p_{12534}p_{54123} \\
p_{15423}p_{51243} - p_{12543}p_{54123} & p_{15234}p_{51243} - p_{12354}p_{54123} \\
p_{12534}p_{51243} - p_{12354}p_{51423} & p_{15423}p_{51234} - p_{12534}p_{54123} \\
p_{15243}p_{51234} - p_{12354}p_{54123} & p_{15234}p_{51234} - p_{12345}p_{54123} \\
p_{12543}p_{51234} - p_{12354}p_{51423} & p_{12534}p_{51234} - p_{12345}p_{51423} \\
p_{12354}p_{51234} - p_{12345}p_{51243} & p_{12534}p_{15243} - p_{12354}p_{15423} \\
p_{12543}p_{15234} - p_{12354}p_{15423} & p_{12534}p_{15234} - p_{12345}p_{15423} \\
p_{12354}p_{15234} - p_{12345}p_{15243} & p_{12354}p_{12534} - p_{12345}p_{12543} \\
p_{12435}p_{12453} - p_{12345}p_{12543}, &
\end{array}
\]
and
\[
p_{14235}\,p_{14253}\,p_{14523} - p_{12345}\,p_{15243}\,p_{15423},
\]
and
\[
p_{41235}\,p_{41253}\,p_{41523}\,p_{45123} - p_{12345}\,p_{51243}\,p_{51423}\,p_{54123}.
\]
So, unlike in Example 6.2, this Markov basis is not quadratic.
The Hilbert series equals
\[
\frac{1 + 9t + 28t^2 + 51t^3 + 66t^4 + 63t^5 + 44t^6 + 21t^7 + 5t^8}{(1-t)^{11}}.
\]
Note that, if $\mathcal{L}(P)$ is closed under taking inverses, then this model coincides with the usual $P$-constrained inversion model up to a relabeling. This holds for the unconstrained inversion model. All examples tested in this alternative model had normal model polytopes.

7. Plackett-Luce Model and Bradley-Terry Model

The Plackett-Luce model is a non-toric model on the set $\mathcal{L}(P)$ of permutations $\pi \in S_n$ that are consistent with a given constraint poset $P$ on $[n]$. It can be defined by the map
\[
p_\pi \;\mapsto\; \prod_{i=1}^{n-1} \frac{1}{\sum_{j=1}^{i} \theta_{\pi(j)}} \qquad \text{for } \pi \in \mathcal{L}(P). \tag{16}
\]
We denote this model by $\mathrm{PL}_P$ and its homogeneous ideal by $I_{\mathrm{PL}_P}$. Thus $I_{\mathrm{PL}_P}$ is the kernel of the ring map $\mathbb{R}[\,p_\pi : \pi \in \mathcal{L}(P)\,] \to \mathbb{R}(\theta_1, \theta_2, \ldots, \theta_n)$ defined by the formula (16). The formula shows that the Plackett-Luce model is a submodel of the ascending model on $\mathcal{L}(P)$. In fact, the ascending model is the toric closure of the Plackett-Luce model, by which we mean that $\mathrm{As}_P$ is the smallest toric model containing $\mathrm{PL}_P$. The specialization map is
\[
t_{\pi(\{1,2,\ldots,i\})} \;\mapsto\; \bigl(\theta_{\pi(1)} + \theta_{\pi(2)} + \cdots + \theta_{\pi(i)}\bigr)^{-1}. \tag{17}
\]
We fix $K = \mathbb{C}$ and regard the Plackett-Luce model $\mathrm{PL}_P$ as a projective variety in $\mathbb{P}^{|\mathcal{L}(P)|-1}$. The toric closure property means that all binomials in $I_{\mathrm{PL}_P}$ must lie in $I_{\mathrm{asc}}$, and this follows from unique factorization in $\mathbb{R}[\theta_1, \ldots, \theta_n]$, given that the linear forms in (17) are distinct.

In order for $\mathrm{PL}_P$ to be properly defined as a statistical model, its probabilities should sum to 1. For this we would need to identify the normalizing constant, which is the image of $\sum_{\pi \in \mathcal{L}(P)} p_\pi$ under the map (16). A formula for this quantity can be derived, for many situations of interest, from equations (25) and (26) in Hunter's article [21].
The most general situation where the normalizing constant was determined can be found in [2]. Its authors make use of sophisticated methods from the algebraic and geometric theory of valuations on cones. In our situation, $\sum_{\pi \in S_n} p_\pi$ is mapped to $\frac{1}{\theta_1\theta_2\cdots\theta_n}$ under the ring map in (16).

Let us begin by examining the unconstrained case when $P$ is an antichain, $Q = 2^{[n]}$ and $\mathcal{L}(P) = \mathcal{M}(Q) = S_n$. This is the Plackett-Luce model $\mathrm{PL}_n$ familiar from the statistics literature [21, 24, 29]. With the correct normalizing constant, its parametrization equals
\[
p_\pi \;\mapsto\; \prod_{i=1}^{n} \frac{\theta_{\pi(i)}}{\sum_{j=1}^{i} \theta_{\pi(j)}} \qquad \text{for } \pi \in S_n. \tag{18}
\]
This defines a polynomial map from the non-negative orthant $\mathbb{R}^n_{\ge 0}$ to the $(n!-1)$-dimensional simplex of probability distributions on the symmetric group $S_n$. We shall regard $\mathrm{PL}_n$ as a complex projective variety in the ambient $\mathbb{P}^{n!-1}$. Being the image of a rational map from $\mathbb{P}^{n-1}$, the dimension of this variety is $\le n-1$. Theorem 7.4 shows that it equals $n-1$.

Example 7.1 ($n = 3$). The Plackett-Luce model $\mathrm{PL}_3$ is a surface of degree 7 embedded in 5-dimensional projective space $\mathbb{P}^5$. The parameterization (16) of that surface is equivalent to
\[
\begin{array}{ll}
p_{123} \mapsto \theta_2\theta_3(\theta_1+\theta_3)(\theta_2+\theta_3), & p_{132} \mapsto \theta_2\theta_3(\theta_1+\theta_2)(\theta_2+\theta_3), \\
p_{213} \mapsto \theta_1\theta_3(\theta_1+\theta_3)(\theta_2+\theta_3), & p_{231} \mapsto \theta_1\theta_3(\theta_1+\theta_2)(\theta_1+\theta_3), \\
p_{312} \mapsto \theta_1\theta_2(\theta_1+\theta_2)(\theta_2+\theta_3), & p_{321} \mapsto \theta_1\theta_2(\theta_1+\theta_2)(\theta_1+\theta_3).
\end{array}
\]
The defining ideal $I_{\mathrm{PL}_3}$ of $\mathrm{PL}_3$ is minimally generated by three quadratic polynomials, in addition to the familiar cubic binomial that specifies the ambient ascending model:
\[
\begin{aligned}
I_{\mathrm{PL}_3} = \bigl\langle\, & p_{123}(p_{321}+p_{231}) - p_{213}(p_{132}+p_{312}),\;\; p_{312}(p_{123}+p_{213}) - p_{132}(p_{231}+p_{321}), \\
& p_{231}(p_{132}+p_{312}) - p_{321}(p_{123}+p_{213}),\;\; p_{123}p_{231}p_{312} - p_{132}p_{321}p_{213} \,\bigr\rangle.
\end{aligned}
\]
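The four generators of $I_{\mathrm{PL}_3}$ can be sanity-checked numerically. The following is a minimal sketch, assuming the normalized parametrization (18) and an arbitrarily chosen rational parameter vector; the function name `plackett_luce` is hypothetical. Vanishing at one point does not prove ideal membership, but it catches transcription errors in the generators.

```python
from fractions import Fraction
from itertools import permutations


def plackett_luce(theta):
    """Probabilities (18): p_pi = prod_i theta_{pi(i)} / (theta_{pi(1)} + ... + theta_{pi(i)})."""
    n = len(theta)
    probs = {}
    for pi in permutations(range(1, n + 1)):
        value, partial = Fraction(1), Fraction(0)
        for item in pi:
            partial += theta[item - 1]            # running sum over the first i items
            value *= Fraction(theta[item - 1]) / partial
        probs["".join(map(str, pi))] = value      # key such as "123"
    return probs


p = plackett_luce([2, 3, 7])  # arbitrary positive test parameters

generators = [
    p["123"] * (p["321"] + p["231"]) - p["213"] * (p["132"] + p["312"]),
    p["312"] * (p["123"] + p["213"]) - p["132"] * (p["231"] + p["321"]),
    p["231"] * (p["132"] + p["312"]) - p["321"] * (p["123"] + p["213"]),
    p["123"] * p["231"] * p["312"] - p["132"] * p["321"] * p["213"],
]
print(sum(p.values()), generators)
```

Since the generators are homogeneous, it does not matter that (18) differs from the displayed polynomial parametrization by a common scalar factor.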
The singular locus of $\mathrm{PL}_3$ consists of the three isolated points $e_{321} - e_{231}$, $e_{123} - e_{213}$ and $e_{132} - e_{312}$ in $\mathbb{P}^5$. In particular, there are no singular points with non-negative coordinates, so this statistical model is a smooth surface in the 5-dimensional probability simplex.

From the point of view of algebraic geometry, our parametrization map represents the blow-up of the projective plane $\mathbb{P}^2$ at the following configuration of nine special points:
\[
\begin{array}{ccc}
(0:0:1) & (0:1:0) & (1:0:0) \\
(1:-1:0) & (1:0:-1) & (0:1:-1) \\
(1:1:-1) & (1:-1:1) & (-1:1:1)
\end{array}
\tag{19}
\]
This configuration has three 4-point lines and four 3-point lines. The map blows down the three 4-point lines, and this creates a rational surface in $\mathbb{P}^5$ with three singular points.

From the point of view of commutative algebra, one might ask whether the four generators of the ideal $I_{\mathrm{PL}_3}$ form a Gröbner basis with respect to some term order. A computation reveals that this is not the case. However, we do get a square-free Gröbner basis for the lexicographic term order with $p_{123} > p_{132} > p_{213} > p_{231} > p_{312} > p_{321}$. The initial ideal equals
\[
\begin{aligned}
\mathrm{in}_{\mathrm{lex}}(I_{\mathrm{PL}_3}) = {} & \langle p_{123}, p_{132}, p_{231} \rangle \cap \langle p_{123}, p_{132}, p_{312} \rangle \cap \langle p_{123}, p_{132}, p_{213} \rangle \cap \langle p_{123}, p_{213}, p_{231} \rangle \\
& \cap \langle p_{123}, p_{213}, p_{312} \rangle \cap \langle p_{123}, p_{312}, p_{321} \rangle \cap \langle p_{231}, p_{312}, p_{321} \rangle.
\end{aligned}
\]
This represents a simplicial complex of seven triangles, listed in a shelling order, so $I_{\mathrm{PL}_3}$ is Cohen-Macaulay. The Hilbert series of the ring $\mathbb{R}[p]/I_{\mathrm{PL}_3}$ equals $(1 + 3t + 3t^2)/(1-t)^3$. $\square$

Example 7.2 ($n = 4$). The Plackett-Luce model $\mathrm{PL}_4$ is a threefold of degree 191 in $\mathbb{P}^{23}$. It is obtained from $\mathbb{P}^3$ by blowing up 55 lines. The homogeneous prime ideal $I_{\mathrm{PL}_4}$ that defines $\mathrm{PL}_4$ is minimally generated by 105 quadrics and 75 cubics. Its Hilbert series equals
\[
\frac{1 + 20t + 105t^2 + 65t^3}{(1-t)^4}.
\]
We do not know whether $I_{\mathrm{PL}_n}$ is generated in degree 2 and 3 for $n \ge 5$.
$\square$

Let us now turn to the general Plackett-Luce model with a given constraint poset $P$, so only permutations $\pi$ in $\mathcal{L}(P)$ are allowed. The model $\mathrm{PL}_P$ is obtained from $\mathrm{PL}_n$ by projecting onto those coordinates. Algebraically, the prime ideal $I_P$ is obtained from $I_{\mathrm{PL}_n}$ by eliminating all unknowns $p_\pi$ where $\pi$ is a permutation that is not compatible with $P$.

Example 7.3. Let $n = 4$ and let $P$ be the poset with two covering relations $1 < 2$ and $3 < 4$. The corresponding distributive lattice $\mathcal{L}(P)$ is the product of two chains of length 3. Note that $\mathcal{L}(P)$ has six maximal chains, namely, the permutations that respect $1 < 2$ and $3 < 4$. The corresponding unknowns are mapped to products of four linear forms as follows:
\[
\begin{array}{ll}
p_{1234} \mapsto \theta_3(\theta_1+\theta_3)(\theta_3+\theta_4)(\theta_1+\theta_3+\theta_4), & p_{1324} \mapsto \theta_3(\theta_1+\theta_2)(\theta_3+\theta_4)(\theta_1+\theta_3+\theta_4), \\
p_{1342} \mapsto \theta_3(\theta_1+\theta_2)(\theta_3+\theta_4)(\theta_1+\theta_2+\theta_3), & p_{3124} \mapsto \theta_1(\theta_1+\theta_2)(\theta_3+\theta_4)(\theta_1+\theta_3+\theta_4), \\
p_{3142} \mapsto \theta_1(\theta_1+\theta_2)(\theta_3+\theta_4)(\theta_1+\theta_2+\theta_3), & p_{3412} \mapsto \theta_1(\theta_1+\theta_2)(\theta_1+\theta_3)(\theta_1+\theta_2+\theta_3).
\end{array}
\]
These reducible quartics meet in nine lines in $\mathbb{P}^3$, so the parametrization of $\mathrm{PL}_P$ blows these up. The ideal $I_P$ is a complete intersection. Its minimal generators are the cubic
\[
p_{1234}p_{1342}p_{3142} + p_{1234}p_{3142}^2 + p_{1234}p_{3142}p_{3412} - p_{1234}p_{1324}p_{3412} - p_{1324}^2p_{3412} - p_{1324}p_{3124}p_{3412}
\]
and the binomial quadric $p_{1342}p_{3124} - p_{1324}p_{3142}$ that defines the ascending model on $P$. $\square$

The following is our main result in this section. It should be useful for obtaining information about the $(n-1)$-dimensional variety $\mathrm{PL}_P$ and its homogeneous prime ideal $I_P$.

Theorem 7.4. The parameterization $\mathbb{P}^{n-1} \to \mathrm{PL}_P \subset \mathbb{P}^{|\mathcal{L}(P)|-1}$ of the Plackett-Luce model on the poset $P$ is given geometrically as the blowing up of $\mathbb{P}^{n-1}$ along an arrangement of linear subspaces of codimension 2.
These subspaces are defined by the equations $\sum_{i \in A} \theta_i = \sum_{j \in B} \theta_j = 0$ where $\{A, B\}$ runs over all incomparable pairs in the distributive lattice on $P$.

Proof. Let $\mathbb{R}[t]$ denote the polynomial ring of parameters in the ascending model (10). Its indeterminates are $t_A$ where $A$ runs over subsets of $[n]$ that are order ideals in $P$. We define $M$ to be the Stanley-Reisner ideal of the distributive lattice of order ideals in $P$. This is the ideal in $\mathbb{R}[t]$ generated by products $t_A t_B$ where $A$ and $B$ are incomparable, meaning that neither $A \subset B$ nor $B \subset A$ holds. The Alexander dual of $M$ is the monomial ideal
\[
M^* = \bigcap_{\{A,B\}} \langle t_A, t_B \rangle,
\]
where the intersection is over all incomparable pairs of order ideals. The generators of $M^*$ correspond to the associated primes of $M$, so they are indexed by compatible permutations $\pi \in \mathcal{L}(P)$. Interpreting $\pi$ as a maximal chain of order ideals, that correspondence is
\[
p_\pi \;\mapsto\; \prod_{A \notin \pi} t_A \qquad \text{for } \pi \in \mathcal{L}(P). \tag{20}
\]
The arrangement of subspaces described in the statement of Theorem 7.4 is the intersection of the variety of $M^*$ with a subspace $\mathbb{P}^{n-1}$ defined by $t_A = \sum_{i \in A} \theta_i$. By substituting this into (20) we see that the blow-up along that subspace arrangement is defined by the map
\[
p_\pi \;\mapsto\; \prod_{A \notin \pi} \Bigl( \sum_{i \in A} \theta_i \Bigr) \;=\; \mathrm{const} \cdot \prod_{A \in \pi} \frac{1}{\sum_{i \in A} \theta_i} \qquad \text{for } \pi \in \mathcal{L}(P). \tag{21}
\]
This is precisely the defining parametrization (16) of the Plackett-Luce model $\mathrm{PL}_P$. $\square$

Example 7.5. Let $n = 4$ and $P$ as in Example 7.3. Then the above Stanley-Reisner ideal is
\[
M = \langle\, t_1t_3,\; t_3t_{12},\; t_{12}t_{13},\; t_1t_{34},\; t_{12}t_{34},\; t_{13}t_{34},\; t_{34}t_{123},\; t_{12}t_{134},\; t_{123}t_{134} \,\rangle.
\]
Its Alexander dual reveals the combinatorial pattern of the map in Example 7.3:
\[
M^* = \langle\, t_3t_{13}t_{34}t_{134},\; t_3t_{12}t_{34}t_{123},\; t_1t_{12}t_{34}t_{123},\; t_3t_{12}t_{34}t_{134},\; t_1t_{12}t_{34}t_{134},\; t_1t_{12}t_{13}t_{123} \,\rangle.
\]
The model $\mathrm{PL}_P$ is the blow-up of $\mathbb{P}^3$ at nine lines, one for each of the generators of $M$.
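The combinatorics behind Example 7.5 can be reproduced by brute-force enumeration. The sketch below is illustrative and assumes the poset of Example 7.3 given by its covering relations; all function names are hypothetical. It lists the order ideals, counts the incomparable pairs (the generators of $M$), and forms the monomial $\prod_{A \notin \pi} t_A$ of (20) for every linear extension $\pi$.

```python
from itertools import combinations, permutations

n = 4
relations = [(1, 2), (3, 4)]  # covering relations i < j of the poset in Example 7.3


def order_ideals(n, relations):
    """Subsets A of {1..n} that contain i whenever they contain j and i < j."""
    ideals = []
    for size in range(n + 1):
        for subset in combinations(range(1, n + 1), size):
            s = frozenset(subset)
            if all(i in s for (i, j) in relations if j in s):
                ideals.append(s)
    return ideals


def linear_extensions(n, relations):
    """Permutations (as words) in which i is listed before j for every relation i < j."""
    return [pi for pi in permutations(range(1, n + 1))
            if all(pi.index(i) < pi.index(j) for i, j in relations)]


def name(ideal):
    """Label such as 't134' for the order ideal {1, 3, 4}."""
    return "t" + "".join(str(x) for x in sorted(ideal))


ideals = order_ideals(n, relations)
incomparable = [(A, B) for A, B in combinations(ideals, 2)
                if not (A <= B or B <= A)]

dual_generators = {}
for pi in linear_extensions(n, relations):
    chain = {frozenset(pi[:k]) for k in range(n + 1)}   # pi as a maximal chain of ideals
    off_chain = sorted((A for A in ideals if A not in chain),
                       key=lambda A: (len(A), sorted(A)))
    dual_generators["".join(map(str, pi))] = [name(A) for A in off_chain]

print(len(ideals), len(incomparable))
for label, monomial in dual_generators.items():
    print("p" + label, "->", " ".join(monomial))
```

Substituting $t_A = \sum_{i\in A}\theta_i$ into the printed monomials gives the six products of linear forms in Example 7.3.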
$\square$

Each of our unconstrained ranking models was considered as a subvariety of the complex projective space $\mathbb{P}^{n!-1}$. If $K$ is any $k$-element subset of $[n]$ then we obtain a natural rational map $\mathbb{P}^{n!-1} \dashrightarrow \mathbb{P}^{k!-1}$ which records the probabilities for each of the $k!$ orderings of $K$ only. Statistically, this map corresponds to marginalization for the induced orderings on $K$. We can now take the direct product of all of these maps, where $K$ runs over all $\binom{n}{k}$ subsets of cardinality $k$ in $[n]$. The resulting rational map into a product of projective spaces,
\[
\mathbb{P}^{n!-1} \dashrightarrow \bigl(\mathbb{P}^{k!-1}\bigr)^{\binom{n}{k}}, \tag{22}
\]
is called the complete marginalization map of order $k$. For example, if $n = 3$ and $k = 2$ then we are mapping into a product of three projective lines, with coordinates $(q_{12} : q_{21})$, $(q_{13} : q_{31})$ and $(q_{23} : q_{32})$ respectively. Here, the complete marginalization is the rational map $\mathbb{P}^5 \dashrightarrow \mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1$ which is given in coordinates as follows:
\[
\begin{aligned}
(q_{12} : q_{21}) &= (p_{123} + p_{132} + p_{312} \,:\, p_{213} + p_{231} + p_{321}), \\
(q_{13} : q_{31}) &= (p_{132} + p_{123} + p_{213} \,:\, p_{312} + p_{321} + p_{231}), \\
(q_{23} : q_{32}) &= (p_{123} + p_{213} + p_{231} \,:\, p_{132} + p_{312} + p_{321}).
\end{aligned}
\]
We shall refer to the complete marginalization of order 2 as the pairwise marginalization.

Example 7.6. The pairwise marginalization of the Plackett-Luce surface $\mathrm{PL}_3 \subset \mathbb{P}^5$ is the surface in $\mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1$ that is defined by the binomial equation $q_{12}q_{23}q_{31} = q_{21}q_{32}q_{13}$. The composition of the map in Example 7.1 with the map in (22) is a toric rational map $\mathbb{P}^2 \dashrightarrow \mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1$ that blows up the three coordinate points $(1:0:0)$, $(0:1:0)$ and $(0:0:1)$. $\square$

It is worthwhile, both algebraically and statistically, to study the various marginalizations of the Csiszár model, the ascending model, the inversion model and the Plackett-Luce model.
Of particular interest is the pairwise marginalization of the Plackett-Luce model. This is known in the literature as the Bradley-Terry model [21].

All of these marginalized models make sense relative to a fixed constraint poset $P$. Here, we regard each $k$-set $K$ as a subposet of $P$ and we write the corresponding marginalization map as
\[
\mathbb{P}^{|\mathcal{L}(P)|-1} \dashrightarrow \mathbb{P}^{|\mathcal{L}(K)|-1}. \tag{23}
\]
The complete $k$-th marginalization is the image of the direct product of these maps, as $K$ runs over all $k$-sets. For convenience, we shall here remove those $k$-sets $K$ that are totally ordered in $P$ because the corresponding maps in (23) are constant when $|\mathcal{L}(K)| = 1$.

We conclude this article with the following algebraic characterization of the Bradley-Terry model. We write $P^c$ for the bidirected graph on $[n]$ where $(i,j)$ is a directed edge if $i$ and $j$ are incomparable in $P$. Each circuit $i_1, i_2, \ldots, i_r, i_1$ in $P^c$ is encoded as a binomial:
\[
q_{i_1i_2}\,q_{i_2i_3} \cdots q_{i_{r-1}i_r}\,q_{i_ri_1} \;-\; q_{i_2i_1}\,q_{i_3i_2} \cdots q_{i_ri_{r-1}}\,q_{i_1i_r}. \tag{24}
\]
These binomials define hypersurfaces in $\mathbb{P}^{\binom{n}{2}}$. For instance, the model in Example 7.6 is the toric hypersurface in $\mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1$ thus associated to a 3-cycle.

The theorem below refers to unimodular Lawrence ideals. This class of toric ideals was introduced and studied by Bayer et al. in [3]. The associated toric varieties live naturally in a product of projective lines $\mathbb{P}^1 \times \cdots \times \mathbb{P}^1$. The case of interest here is that of unimodular Lawrence ideals arising from graphs. For these ideals and their syzygies we refer to [3, §5].

Theorem 7.7. The Bradley-Terry model with constraints $P$ is toric. It is defined by the unimodular Lawrence ideal whose generators are the circuits (24) in the bidirected graph $P^c$.
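The circuit binomials (24) are easy to test on concrete numbers. The following minimal sketch assumes the unconstrained case (every pair incomparable) and the Bradley-Terry parametrization $q_{ij} = \theta_j/(\theta_i+\theta_j)$ that is used in the proof of the theorem; the parameter values and function names are hypothetical. It checks that every circuit binomial and every linear relation $q_{ij} + q_{ji} - 1$ vanishes at the chosen point.

```python
from fractions import Fraction
from itertools import permutations


def bradley_terry(theta):
    """Pairwise probabilities q_ij = theta_j / (theta_i + theta_j) for all i != j."""
    n = len(theta)
    return {(i, j): Fraction(theta[j - 1], theta[i - 1] + theta[j - 1])
            for i in range(1, n + 1) for j in range(1, n + 1) if i != j}


def circuit_binomial(q, cycle):
    """Value of (24): product along the cycle minus product along the reversed cycle."""
    forward = backward = Fraction(1)
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        forward *= q[(a, b)]
        backward *= q[(b, a)]
    return forward - backward


q = bradley_terry([2, 3, 5, 7])  # arbitrary positive integer test parameters

# Every sequence of 3 or 4 distinct vertices is a circuit of the complete bidirected graph.
circuit_values = [circuit_binomial(q, list(cycle))
                  for r in (3, 4)
                  for cycle in permutations(range(1, 5), r)]
linear_values = [q[(i, j)] + q[(j, i)] - 1 for (i, j) in q]
print(set(circuit_values), set(linear_values))
```

The vanishing is visible from the formula: along a cycle the numerators multiply to the product of all $\theta_i$ on the cycle in either direction, and the denominators $\theta_i + \theta_j$ are symmetric in $i$ and $j$.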
From this result we can now determine the commutative algebra invariants of the Bradley-Terry model, such as its Hilbert series in the $\mathbb{Z}^n$-grading and its multidegree.

Proof. Following [21], the parametrization of the Bradley-Terry model can be written as
\[
q_{ij} \;\mapsto\; \frac{\theta_j}{\theta_i + \theta_j} \qquad \text{for } i, j \text{ incomparable in } P. \tag{25}
\]
Let $\rho_{\{i,j\}}$ be new unknowns indexed by unordered pairs $\{i,j\} \subset [n]$. The unimodular Lawrence ideal associated with the bidirected graph $P^c$ is the kernel of the monomial map
\[
q_{ij} \;\mapsto\; \rho_{\{i,j\}} \cdot \theta_j \qquad \text{for } i, j \text{ incomparable in } P. \tag{26}
\]
The specialization $\rho_{\{i,j\}} = (\theta_i + \theta_j)^{-1}$ shows that the ideal $I_{\mathrm{BT}_P}$ of the Bradley-Terry model contains the unimodular Lawrence ideal generated by the circuits (24). In addition, the ideal $I_{\mathrm{BT}_P}$ contains the linear polynomials $q_{ij} + q_{ji} - 1$. These represent the fact that, in any compatible ranking $\pi$, either item $i$ ranks before item $j$ or vice versa, but not both. Let $J$ be the ideal generated by the circuits (24) and these linear polynomials. We have seen that $J \subseteq I_{\mathrm{BT}_P}$, and we are claiming that equality holds. But this follows by observing that both ideals are prime, and their varieties have the same dimension, namely $n-1$. Indeed, $I_{\mathrm{BT}_P}$ is prime by definition, and $J$ is prime because adding the linear forms $q_{ij} + q_{ji} - 1$ to the unimodular Lawrence ideal simply amounts to dehomogenizing from $\mathbb{P}^1$ to $\mathbb{A}^1$ in each factor. Geometrically, this operation preserves the dimension of the variety. $\square$

Acknowledgments

We are very grateful to Winfried Bruns and Raymond Hemmecke for their substantial help with the computational results in Theorem 6.1. Using the developers' versions of Normaliz [5] and 4ti2 [16] respectively, they succeeded in computing the Hilbert series of the inversion model for $n = 6$ and in finding the cubic Markov basis element (15).
We also thank Eyke Hüllermeier and Seth Sullivant for helpful conversations and the referees for many suggestions that helped us to improve the paper. Bernd Sturmfels was partially supported by the U.S. National Science Foundation (DMS-0757207 and DMS-0968882). Volkmar Welker was partially supported by MSRI Berkeley.

References

[1] A. Barvinok: A Course in Convexity, Graduate Studies in Mathematics 54, AMS, Providence, 2002.
[2] A. Boussicault, V. Feray, A. Lascoux and V. Reiner: Linear extension sums as valuations of cones, arXiv:1008.3278.
[3] D. Bayer, S. Popescu and B. Sturmfels: Syzygies of unimodular Lawrence ideals, J. Reine Angew. Math. 534 (2001) 169–186.
[4] N. Beerenwinkel, N. Eriksson and B. Sturmfels: Evolution on distributive lattices, Journal of Theoretical Biology 242 (2006) 409–420.
[5] W. Bruns, B. Ichim and C. Söger: Normaliz – software for affine monoids, vector configurations, lattice polytopes, and rational cones, http://www.mathematik.uni-osnabrueck.de/normaliz/, 2010.
[6] E.R. Canfield and B.D. McKay: The asymptotic volume of the Birkhoff polytope, J. Analytic Comb. 4 (2009) article #2.
[7] C.S. Chan, D.P. Robbins and D.S. Yuen: On the volume of a certain polytope, Experiment. Math. 9 (2000) 91–99.
[8] W. Cheng, K. Dembczynski and E. Hüllermeier: Label ranking based on the Plackett-Luce model, Proc. ICML-2010, International Conference on Machine Learning, Haifa, Israel, June 2010.
[9] V. Csiszár: Markov bases of conditional independence models for permutations, Kybernetica 45 (2009) 249–260.
[10] V. Csiszár: On L-decomposability of random permutations, J. Math. Psychology 53 (2009) 294–297.
[11] P. Diaconis and N. Eriksson: Markov bases for noncommutative Fourier analysis of ranked data, J. of Symbolic Computation 41 (2006) 182–195.
[12] P. Diaconis and B. Sturmfels: Algebraic algorithms for sampling from conditional distributions, Ann. Stat. 26 (1998) 363–397.
[13] M. Drton, B. Sturmfels and S. Sullivant: Lectures on Algebraic Statistics, Oberwolfach Seminars, Vol 39, Birkhäuser, Basel, 2009.
[14] S.E. Fienberg, S. Petrović and A. Rinaldo: Algebraic statistics for a directed random graph model with reciprocation, Algebraic Methods in Statistics and Probability II, pp. 261–283, Contemporary Math. 516, Amer. Math. Soc., Providence, 2010.
[15] S. Fiorini: $\{0, \frac{1}{2}\}$-cuts and the linear ordering problem: surfaces that define facets, SIAM J. Discrete Math. 20 (2006) 893–912.
[16] 4ti2 team: 4ti2 – A software package for algebraic, geometric and combinatorial problems in linear spaces, available at www.4ti2.de.
[17] D. Geiger, C. Meek and B. Sturmfels: On the toric algebra of graphical models, Ann. Statist. 34 (2006) 1463–1492.
[18] M. Grötschel, M. Jünger and G. Reinelt: Facets of the linear ordering polytope, Math. Program. 33 (1985) 43–60.
[19] M. Hochster: Rings of invariants, Cohen-Macaulay rings generated by monomials, and polytopes, Ann. Math. 96 (1972) 318–338.
[20] G. Hommel, F. Bretz and W. Maurer: Powerful short-cuts for multiple testing procedures with special reference to gatekeeping strategies, Statist. Med. 26 (2007) 4063–4073.
[21] D.R. Hunter: MM algorithms for generalized Bradley-Terry models, Ann. Stat. 32 (2004) 384–406.
[22] A. Katsabekis and A. Thoma: Parametrizations of toric varieties over any field, J. Algebra 308 (2007) 751–763.
[23] L. Katthän: Decomposing sets of inversions.
[24] R.D. Luce: Individual Choice Behavior, Wiley, New York, 1959.
[25] J.I. Marden: Analyzing and Modeling Rank Data, Monographs on Statistics and Applied Probability 64, Chapman & Hall, London, 1995.
[26] H. Ohsugi and T. Hibi: Normal polytopes arising from finite graphs, J. Algebra 207 (1998) 409–426.
[27] H. Ohsugi and T. Hibi: Toric ideals generated by quadratic binomials, J. Algebra 218 (1999) 509–527.
[28] L. Pachter and B. Sturmfels: Algebraic Statistics for Computational Biology, Cambridge University Press, Cambridge, 2005.
[29] R.L. Plackett: Random permutations, J. R. Stat. Soc., Ser. B 30 (1968) 517–534.
[30] V. Reiner, F. Saliola and V. Welker: Spectra of symmetrized shuffling operators.
[31] B. Sturmfels: Gröbner bases and convex polytopes, Univ. Lect. Ser. 8, AMS, Providence, 1996.
[32] S. Sullivant: Toric fiber products, J. Algebra 316 (2007) 560–577.
[33] R.H. Villarreal: Monomial Algebras, Pure and Appl. Math. 238, Marcel Dekker, New York, 2001.
[34] D. Zeilberger: Proof of a conjecture of Chan, Robbins, and Yuen. In: Orthogonal polynomials: numerical and symbolic algorithms (Leganés, 1998), Electron. Trans. Numer. Anal. 9 (1999).

Department of Mathematics, University of California, Berkeley, CA 94720, USA
E-mail address: bernd@math.berkeley.edu

Fachbereich Mathematik und Informatik, Philipps-Universität, 35032 Marburg, Germany
E-mail address: welker@mathematik.uni-marburg.de
