The Quadratic Graver Cone, Quadratic Integer Minimization, and Extensions
We consider the nonlinear integer programming problem of minimizing a quadratic function over the integer points in variable dimension satisfying a system of linear inequalities. We show that when the Graver basis of the matrix defining the system is…
Authors: Jon Lee, Shmuel Onn, Lyubov Romanchuk, Robert Weismantel
Jon Lee, Shmuel Onn, Lyubov Romanchuk, Robert Weismantel

Abstract

We consider the nonlinear integer programming problem of minimizing a quadratic function over the integer points in variable dimension satisfying a system of linear inequalities. We show that when the Graver basis of the matrix defining the system is given, and the quadratic function lies in a suitable dual Graver cone, the problem can be solved in polynomial time. We discuss the relation between this cone and the cone of positive semidefinite matrices, and show that none contains the other. So we can minimize in polynomial time some non-convex and some (including all separable) convex quadrics. We conclude by extending our results to efficient integer minimization of multivariate polynomial functions of arbitrary degree lying in suitable cones.

1 Introduction

Consider the general nonlinear integer minimization problem in standard form,

$$\min\{\, f(x) : x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \,\}, \tag{1}$$

with $A \in \mathbb{Z}^{m \times n}$, $b \in \mathbb{Z}^m$, $l, u \in \mathbb{Z}_\infty^n$ with $\mathbb{Z}_\infty := \mathbb{Z} \uplus \{\pm\infty\}$, and $f : \mathbb{R}^n \to \mathbb{R}$. It is well known to be NP-hard already for linear functions. However, recently it was shown that, if the Graver basis $\mathcal{G}(A)$ of $A$ is given as part of the input, then the problem can be solved in polynomial time for the following classes of functions. First, in [1], for composite concave functions $f(x) = g(Wx)$, with $W \in \mathbb{Z}^{d \times n}$, $g : \mathbb{R}^d \to \mathbb{R}$ concave, and $d$ fixed. Second, in [3], for separable convex functions $f(x) = \sum_i f_i(x_i)$ with each $f_i$ univariate convex, and in particular for linear functions $f(x) = w^\top x$. While the Graver basis is a complex object, it can be computed in polynomial time from $A$ for many natural and useful classes of matrices as demonstrated in [1, 3].
Moreover, the results of [2] imply that there is a parameterized scheme that allows one to construct increasingly better approximations of the Graver basis of any matrix $A$ and to obtain increasingly better approximations to problem (1); see [4] for details.

In this article we continue this line of investigation and consider problem (1) for quadratic functions $f(x) = x^\top V x + w^\top x + a$ with $V \in \mathbb{R}^{n \times n}$, $w \in \mathbb{R}^n$, and $a \in \mathbb{R}$. We also discuss extensions to multivariate polynomial functions of arbitrary degree.

We begin by noting that problem (1) remains NP-hard even if the Graver basis is part of the input and even if the objective function is quadratic convex of rank 1.

Proposition 1.1 It is NP-hard to determine the optimal value of the problem

$$\min\{\, x^\top V x + w^\top x + a : x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \,\} \tag{2}$$

even when $\mathcal{G}(A)$ is given and the function is convex quadratic with matrix $V = v v^\top$.

Proof. Let $v \in \mathbb{Z}^n_+$ and $v_0 \in \mathbb{Z}_+$ be input to the subset sum problem of deciding if there exists $x \in \{0,1\}^n$ with $v^\top x = v_0$. Let $A := 0$ be the zero $1 \times n$ matrix, whose Graver basis $\mathcal{G}(A) = \{\pm \mathbf{1}_i : i = 1, \dots, n\}$ consists of the $n$ unit vectors and their negations. Let $l := 0$ and $u := \mathbf{1}$ be the zero and all-ones vectors in $\mathbb{Z}^n$, and let $b := 0$ in $\mathbb{Z}^m$ (here $m = 1$). Let $V := v v^\top$, $w := -2 v_0 v$, and $a := v_0^2$. Then problem (2) becomes

$$\min\{\, (v^\top x - v_0)^2 : x \in \{0,1\}^n \,\},$$

whose optimal value is 0 if and only if there is a subset sum, proving the claim.

This shows that to solve problem (2) in polynomial time, even when the Graver basis is given, some restrictions on the class of quadratic functions must be enforced. In Section 2 we introduce the quadratic Graver cone $\mathcal{Q}(A)$, which is a cone of $n \times n$ matrices defined via the Graver basis of $A$, and the diagonal Graver cone $\mathcal{D}(A)$, which is the diagonal projection of $\mathcal{Q}(A)$ into $\mathbb{R}^n_+$.
We discuss some elementary properties of these cones and their duals $\mathcal{Q}^*(A)$ and $\mathcal{D}^*(A)$ and give some examples.

In Section 3 we prove the following algorithmic result about the solvability of problem (1) for every quadratic function (possibly indefinite, neither convex nor concave) whose defining matrix lies in the dual quadratic Graver cone.

Theorem 1.2 There is an algorithm that, given $\mathcal{G}(A)$, solves the quadratic problem

$$\min\{\, x^\top V x + w^\top x + a : x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \,\} \tag{3}$$

in polynomial time for every integer matrix $V$ lying in the cone $\mathcal{Q}^*(A)$ dual to $\mathcal{Q}(A)$.

We point out that, in practice, the algorithm that underlies Theorem 1.2 can be applied to any quadratic function. The algorithm will always stop and output a feasible solution if one exists, which can be used as an approximation of the optimal one. And, whenever $V$ lies in $\mathcal{Q}^*(A)$, the solution produced will be truly optimal.

As a special case we obtain the following result on separable quadratic functions.

Theorem 1.3 There is an algorithm that, given $\mathcal{G}(A)$, solves the separable problem

$$\min\Big\{ \sum_{i=1}^n \big( v_i x_i^2 + w_i x_i + a_i \big) : x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \Big\} \tag{4}$$

in polynomial time for every integer vector $v$ lying in the cone $\mathcal{D}^*(A)$ dual to $\mathcal{D}(A)$. In particular, this applies to any convex separable quadratic, that is, with $v \in \mathbb{Z}^n_+$.

In particular, Theorem 1.3 enables us to solve the problem with any linear objective function $f(x) = w^\top x$, which is the special case with $v = 0$, which is always in $\mathcal{D}^*(A)$.
In Section 4 we proceed with a discussion of the relation between the dual quadratic Graver cone $\mathcal{Q}^*(A)$ and the cone $\mathcal{S}^n_+$ of symmetric positive semidefinite matrices, and establish Theorem 4.2, which provides a characterization, in terms of their matroids only, of those matrices $A$ for which the dual diagonal Graver cone $\mathcal{D}^*(A)$ strictly contains $\mathbb{R}^n_+$ and for which Theorem 1.3 assures efficient solution of problem (4) for all separable convex as well as some nonconvex quadratic functions.

In the final Section 5 we extend our results to multivariate polynomial functions of arbitrary degree. We define a hierarchy of higher degree analogues $\mathcal{P}_k(A)$ of the quadratic Graver cone, and show that the iterative algorithm of Theorem 1.2 solves the polynomial integer minimization problem (1) in polynomial time for every degree $d$ form $f$ that lies in a cone $\mathcal{K}_d(A)$ defined in terms of the dual Graver cones $\mathcal{P}_k^*(A)$.

Theorem 1.4 For every fixed $d$ there is an algorithm that, given $\mathcal{G}(A)$, solves

$$\min\{\, f(x) : x \in \mathbb{Z}^n,\ Ax = b,\ x \ge 0 \,\} \tag{5}$$

in polynomial time for every degree $d$ integer homogeneous polynomial $f$ in $\mathcal{K}_d(A)$.

2 The quadratic and diagonal Graver cones

We begin with some notation. The inner product of two $m \times n$ matrices $U, V$ is $U \cdot V := \sum_{i,j} U_{i,j} V_{i,j}$. The diagonal of an $n \times n$ matrix $V$ is the vector $v := \mathrm{diag}(V) \in \mathbb{R}^n$ defined by $v_i := V_{i,i}$ for all $i$. For $u \in \mathbb{R}^n$ we denote by $U := \mathrm{Diag}(u)$ the $n \times n$ diagonal matrix with $\mathrm{diag}(U) = u$. The pointwise product of vectors $g, h \in \mathbb{R}^n$ is the vector $g \circ h$ in $\mathbb{R}^n$ with $(g \circ h)_i := g_i h_i$ for all $i$. Note that $g, h$ lie in the same orthant of $\mathbb{R}^n$ if and only if $g \circ h \ge 0$. The tensor product of $g, h \in \mathbb{R}^n$ is the $n \times n$ matrix $g \otimes h = g h^\top$ with $(g \otimes h)_{i,j} := (g h^\top)_{i,j} = g_i h_j$ for all $i, j$. We will use the notation $g \otimes h$ and $g h^\top$ interchangeably as we find appropriate.
Note that for all $g, h \in \mathbb{R}^n$ and $V \in \mathbb{R}^{n \times n}$, we have $g \circ h = \mathrm{diag}(g \otimes h)$ and $(g \otimes h) \cdot V = g^\top V h$.

Any quadratic function $f(x) = x^\top V x + w^\top x + a$ has an equivalent description $f(x) = x^\top U x + w^\top x + a$ with $U := \frac{1}{2}(V + V^\top)$ a symmetric matrix. We therefore can and will be working with symmetric matrices, which are much better behaved than arbitrary square matrices. We denote by $\mathcal{S}^n \subset \mathbb{R}^{n \times n}$ the linear subspace of symmetric $n \times n$ matrices.

A cone is a subset $\mathcal{P}$ of a real vector space such that $\alpha x + \beta y \in \mathcal{P}$ for all $x, y \in \mathcal{P}$ and $\alpha, \beta \in \mathbb{R}_+$. The cone generated by a set $\mathcal{V}$ of vectors is the set $\mathrm{cone}(\mathcal{V})$ of nonnegative linear combinations of finitely many vectors from $\mathcal{V}$. In particular, $\mathrm{cone}(\emptyset) := \{0\}$. We will be using cones $\mathcal{D} \subseteq \mathbb{R}^n$ of vectors and cones $\mathcal{Q} \subseteq \mathcal{S}^n$ of $n \times n$ symmetric matrices. The dual of a cone $\mathcal{D} \subseteq \mathbb{R}^n$ and the (symmetric) dual of a cone $\mathcal{Q} \subseteq \mathcal{S}^n$ are, respectively, the cones

$$\mathcal{D}^* := \{ v \in \mathbb{R}^n : u^\top v \ge 0 \text{ for all } u \in \mathcal{D} \}, \qquad \mathcal{Q}^* := \{ V \in \mathcal{S}^n : U \cdot V \ge 0 \text{ for all } U \in \mathcal{Q} \}.$$

Duality reverses inclusions, that is, if $\mathcal{P} \subseteq \mathcal{K}$ are cones in $\mathbb{R}^n$ or $\mathcal{S}^n$ then $\mathcal{K}^* \subseteq \mathcal{P}^*$.

We proceed with the definition of the Graver basis of an integer matrix. The lattice of an integer $m \times n$ matrix $A$ is the set $\mathcal{L}(A) := \{ x \in \mathbb{Z}^n : Ax = 0 \}$. We denote by $\mathcal{L}^*(A)$ the set of nonzero elements in $\mathcal{L}(A)$. We use a partial order $\sqsubseteq$ on $\mathbb{R}^n$ which extends the coordinate-wise partial order $\le$ on the nonnegative orthant $\mathbb{R}^n_+$ and is defined as follows. For $x, y \in \mathbb{R}^n$ we write $x \sqsubseteq y$ and say that $x$ is conformal to $y$ if $x \circ y \ge 0$ (that is, $x, y$ lie in the same orthant) and $|x_i| \le |y_i|$ for all $i$. We write $x \sqsubset y$ if $x \sqsubseteq y$ and $x \ne y$. A simple extension of the classical Gordan Lemma implies that every subset of $\mathbb{Z}^n$ has finitely many $\sqsubseteq$-minimal elements.
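As an illustration of the conformal order just defined, the following Python sketch tests $x \sqsubseteq y$ and filters the $\sqsubseteq$-minimal elements of a finite set of integer vectors. The function names `is_conformal`, `conformal_minimal`, and `kernel_box` are hypothetical, and the brute-force enumeration over a small box is an assumption made only to keep the example finite; it is not meant as an efficient way to work with lattices.

```python
from itertools import product


def is_conformal(x, y):
    """Return True if x ⊑ y: same orthant (x_i * y_i >= 0) and |x_i| <= |y_i| for all i."""
    return all(xi * yi >= 0 and abs(xi) <= abs(yi) for xi, yi in zip(x, y))


def conformal_minimal(vectors):
    """Return the ⊑-minimal elements of a finite collection of integer vectors."""
    vs = set(map(tuple, vectors))
    return sorted(v for v in vs
                  if not any(u != v and is_conformal(u, v) for u in vs))


def kernel_box(row, bound):
    """Nonzero integer x with row . x = 0 and |x_i| <= bound (illustration only)."""
    n = len(row)
    return [x for x in product(range(-bound, bound + 1), repeat=n)
            if any(x) and sum(r * xi for r, xi in zip(row, x)) == 0]


# Nonzero lattice points of the 1 x 3 matrix A = (1 1 1) inside the box [-2, 2]^3,
# and the ⊑-minimal elements among them.
minimal = conformal_minimal(kernel_box((1, 1, 1), 2))
```

For this small matrix, `minimal` consists of the six vectors $\pm(1,-1,0)$, $\pm(1,0,-1)$, $\pm(0,1,-1)$; any lattice point conformal to a point of the box lies in the box itself, so restricting to the box loses nothing here.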
Definition 2.1 The Graver basis of an integer matrix $A$ is defined to be the finite set $\mathcal{G}(A) \subset \mathbb{Z}^n$ of $\sqsubseteq$-minimal elements in $\mathcal{L}^*(A) = \{ x \in \mathbb{Z}^n : Ax = 0,\ x \ne 0 \}$.

In this article we introduce the following objects defined via the Graver basis.

Definition 2.2 The quadratic Graver cone of an integer $m \times n$ matrix $A$ is defined to be the cone $\mathcal{Q}(A) \subseteq \mathcal{S}^n$ of $n \times n$ matrices generated by the matrices $g \otimes h + h \otimes g$ over all pairs of distinct elements $g, h \in \mathcal{G}(A)$ that lie in the same orthant, that is,

$$\mathcal{Q}(A) := \mathrm{cone}\{\, g \otimes h + h \otimes g : g, h \in \mathcal{G}(A),\ g \ne h,\ g \circ h \ge 0 \,\} \subseteq \mathcal{S}^n .$$

The dual quadratic Graver cone is its (symmetric) dual $\mathcal{Q}^*(A)$ in $\mathcal{S}^n$ given by

$$\begin{aligned}
\mathcal{Q}^*(A) &= \{\, V \in \mathcal{S}^n : U \cdot V \ge 0 \text{ for all } U \in \mathcal{Q}(A) \,\} \\
&= \{\, V \in \mathcal{S}^n : (g h^\top + h g^\top) \cdot V \ge 0,\ g, h \in \mathcal{G}(A),\ g \ne h,\ g \circ h \ge 0 \,\} \\
&= \{\, V \in \mathcal{S}^n : g^\top V h \ge 0,\ g, h \in \mathcal{G}(A),\ g \ne h,\ g \circ h \ge 0 \,\} .
\end{aligned} \tag{6}$$

We are also interested in the following cone of diagonals of matrices in $\mathcal{Q}(A)$.

Definition 2.3 The diagonal Graver cone of $A$ is the cone of nonnegative vectors

$$\mathcal{D}(A) := \mathrm{cone}\{\, g \circ h : g, h \in \mathcal{G}(A),\ g \ne h,\ g \circ h \ge 0 \,\} \subseteq \mathbb{R}^n_+ .$$

The dual diagonal Graver cone is its dual $\mathcal{D}^*(A)$ in $\mathbb{R}^n$ given by

$$\begin{aligned}
\mathcal{D}^*(A) &= \{\, v : u^\top v \ge 0 \text{ for all } u \in \mathcal{D}(A) \,\} \\
&= \{\, v : (g \circ h)^\top v \ge 0,\ g, h \in \mathcal{G}(A),\ g \ne h,\ g \circ h \ge 0 \,\} \\
&= \Big\{\, v : \sum_i g_i h_i v_i \ge 0,\ g, h \in \mathcal{G}(A),\ g \ne h,\ g \circ h \ge 0 \,\Big\} .
\end{aligned} \tag{7}$$

The following lemma provides some basic relations among the above cones and more. All inclusions can be strict, as is demonstrated in Examples 2.5 and 2.6 below. In particular, it is interesting to note that $\mathcal{D}(A)$ is the diagonal projection of $\mathcal{Q}(A)$, but $\mathcal{D}^*(A)$ is generally strictly contained in the diagonal projection of $\mathcal{Q}^*(A)$.
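The last descriptions in (6) and (7) translate directly into finite membership tests once a Graver basis is at hand. The following Python sketch illustrates this under stated assumptions: the function names are hypothetical, matrices are plain nested lists, $V$ is assumed symmetric, and the Graver basis of the $1 \times 3$ matrix $A = (1\ 1\ 1)$ is supplied by hand as input.

```python
from itertools import combinations


def same_orthant_pairs(graver):
    """Unordered pairs of distinct Graver elements g, h with g∘h >= 0."""
    return [(g, h) for g, h in combinations(graver, 2)
            if all(gi * hi >= 0 for gi, hi in zip(g, h))]


def diagonal_generators(graver):
    """Generators g∘h of the diagonal Graver cone D(A)."""
    return [tuple(gi * hi for gi, hi in zip(g, h))
            for g, h in same_orthant_pairs(graver)]


def in_dual_quadratic_cone(V, graver):
    """Check g^T V h >= 0 for all generating pairs; V is assumed symmetric."""
    n = len(V)
    return all(
        sum(g[i] * V[i][j] * h[j] for i in range(n) for j in range(n)) >= 0
        for g, h in same_orthant_pairs(graver))


def in_dual_diagonal_cone(v, graver):
    """Check (g∘h)^T v >= 0 for all generators g∘h of D(A)."""
    return all(sum(ui * vi for ui, vi in zip(u, v)) >= 0
               for u in diagonal_generators(graver))


# Graver basis of the 1 x 3 matrix A = (1 1 1), given as input.
G = [(1, -1, 0), (-1, 1, 0), (1, 0, -1), (-1, 0, 1), (0, 1, -1), (0, -1, 1)]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
indefinite = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

With these inputs, `in_dual_quadratic_cone(identity, G)` holds, while `indefinite` fails the test because $g^\top V h = -1$ for $g = (1,-1,0)$ and $h = (1,0,-1)$. Since $V$ is symmetric, $g^\top V h = h^\top V g$, so checking unordered pairs suffices.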
Lemma 2.4 The quadratic and diagonal Graver cones and their duals satisfy

$$\begin{aligned}
\mathbb{R}^n_+ \ &\supseteq\ \mathcal{D}(A) = \{ \mathrm{diag}(U) : U \in \mathcal{Q}(A) \} \ \supseteq\ \{ u : \mathrm{Diag}(u) \in \mathcal{Q}(A) \}, \\
\mathbb{R}^n_+ \ &\subseteq\ \mathcal{D}^*(A) = \{ v : \mathrm{Diag}(v) \in \mathcal{Q}^*(A) \} \ \subseteq\ \{ \mathrm{diag}(V) : V \in \mathcal{Q}^*(A) \} .
\end{aligned} \tag{8}$$

Proof. First, $\mathcal{D}(A) \subseteq \mathbb{R}^n_+$ because it is generated by nonnegative vectors. Therefore $\mathcal{D}^*(A) \supseteq (\mathbb{R}^n_+)^* = \mathbb{R}^n_+$. To establish the top equality note that the following are equivalent: $u \in \mathcal{D}(A)$; $u = \sum_k \mu_k (g_k \circ h_k)$ for some suitable $\mu_k \ge 0$, $g_k, h_k \in \mathcal{G}(A)$; $u = \mathrm{diag}(U)$ with $U = \frac{1}{2} \sum_k \mu_k (g_k \otimes h_k + h_k \otimes g_k)$; and $u = \mathrm{diag}(U)$ with $U \in \mathcal{Q}(A)$. To establish the bottom equality note that the following are equivalent: $v \in \mathcal{D}^*(A)$; $(g \circ h)^\top v \ge 0$ for all suitable $g, h \in \mathcal{G}(A)$; $V = \mathrm{Diag}(v)$ with $g^\top V h \ge 0$ for all suitable $g, h$; and $V = \mathrm{Diag}(v)$ with $V \in \mathcal{Q}^*(A)$. The two remaining inclusions on the right-hand sides follow from $\mathrm{diag}(\mathrm{Diag}(x)) = x$. This completes the proof of the lemma.

The next two examples show that all inclusions in Lemma 2.4 can be strict.

Example 2.5 Consider the zero $1 \times n$ matrix $A := 0$, whose Graver basis is given by $\mathcal{G}(A) = \{ \pm \mathbf{1}_i : i = 1, \dots, n \}$. Then $g \circ h = 0$ is the zero vector for all distinct $g, h \in \mathcal{G}(A)$ in the same orthant. So the diagonal Graver cone and its dual are $\mathcal{D}(A) = \{0\} \subsetneq \mathbb{R}^n_+$ and $\mathcal{D}^*(A) = \mathbb{R}^n \supsetneq \mathbb{R}^n_+$, so the left inclusions in (8) are strict.

Example 2.6 Consider the $1 \times 3$ matrix $A := (1\ 1\ 1)$ with Graver basis

$$\mathcal{G}(A) = \pm\{ (1,-1,0),\ (1,0,-1),\ (0,1,-1) \} .$$

The quadratic Graver cone and its dual satisfy

$$\mathcal{Q}(A) = \mathrm{cone}\left\{
\begin{pmatrix} 2 & -1 & -1 \\ -1 & 0 & 1 \\ -1 & 1 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & -1 & 1 \\ -1 & 2 & -1 \\ 1 & -1 & 0 \end{pmatrix},\
\begin{pmatrix} 0 & 1 & -1 \\ 1 & 0 & -1 \\ -1 & -1 & 2 \end{pmatrix}
\right\},$$

$$\mathcal{Q}^*(A) = \left\{
\begin{pmatrix} a & d & e \\ d & b & f \\ e & f & c \end{pmatrix} :
\begin{array}{l} a - d - e + f \ge 0 \\ b - d + e - f \ge 0 \\ c + d - e - f \ge 0 \end{array}
\right\}
\supseteq
\left\{
\begin{pmatrix} 2a & a+b & a+c \\ a+b & 2b & b+c \\ a+c & b+c & 2c \end{pmatrix} : a, b, c \in \mathbb{R}
\right\} . \tag{9}$$

The diagonal Graver cone and its dual are $\mathcal{D}(A) = \mathbb{R}^n_+$ and $\mathcal{D}^*(A) = \mathbb{R}^n_+$.
Therefore, the top and bottom inclusions on the right-hand side of equation (8) are strict,

$$\mathcal{D}(A) = \mathbb{R}^n_+ \supsetneq \{0\} = \{ u : \mathrm{Diag}(u) \in \mathcal{Q}(A) \}, \qquad
\mathcal{D}^*(A) = \mathbb{R}^n_+ \subsetneq \mathbb{R}^n = \{ \mathrm{diag}(V) : V \in \mathcal{Q}^*(A) \} .$$

3 Quadratic integer minimization

We proceed to establish our algorithmic Theorems 1.2 and 1.3. We focus on the situation of finite feasible sets, which is natural in most applications. But we do allow the lower and upper bounds $l, u \in \mathbb{Z}^n_\infty$ to have infinite components for flexibility of modeling (for instance, it is quite common in applications to have $l_i = 0$ and $u_i = \infty$ for all $i$, with the resulting feasible set typically still finite). We also require our algorithms to identify and properly stop when the set is infinite. So in all algorithmic statements, an algorithm is said to solve a (nonlinear) discrete optimization problem if, for every input, it either finds an optimal solution, or asserts that the problem is infeasible or the feasible set is infinite.

We begin with a simple lemma that shows that we can quickly minimize a given quadratic function in a given direction.

Lemma 3.1 There is an algorithm that, given bounds $l, u \in \mathbb{Z}^n_\infty$, direction $g \in \mathbb{Z}^n$, point $z \in \mathbb{Z}^n$ with $l \le z \le u$, and quadratic function $f(x) = x^\top V x + w^\top x + a$ with $V \in \mathbb{Z}^{n \times n}$, $w \in \mathbb{Z}^n$, and $a \in \mathbb{Z}$, solves in polynomial time the univariate problem

$$\min\{\, f(z + \mu g) : \mu \in \mathbb{Z}_+,\ l \le z + \mu g \le u \,\} . \tag{10}$$

Proof. Let $S := \{ \mu \in \mathbb{Z}_+ : l \le z + \mu g \le u \}$, and let $s := \sup S$, which is easy to determine. If $s = \infty$ then we conclude that $S$ is infinite and stop. Otherwise we need to minimize the univariate quadratic function $h(\mu) := f(z + \mu g) = h_2 \mu^2 + h_1 \mu + h_0$ with $h_2 := g^\top V g$, $h_1 := z^\top V g + g^\top V z + w^\top g$, and $h_0 := z^\top V z + w^\top z + a$ over $S = \{0, 1, \dots, s\}$. If $h_2 \le 0$, then $h$ is concave, and the minimum over $S$ is attained at $\mu = 0$ or $\mu = s$.
If $h_2 > 0$ then $h$ is convex with real minimum at $\mu^* := -\frac{h_1}{2 h_2}$. Then minimizing $h$ over $S$ reduces to minimizing $h$ over $S \cap \{ 0, \lfloor \mu^* \rfloor, \lceil \mu^* \rceil, s \}$.

A finite sum $u := \sum_i v_i$ of vectors in $\mathbb{R}^n$ is called conformal if $v_i \sqsubseteq u$ for all $i$, and hence all summands lie in the same orthant. The following lemma shows that a quadratic $f$ with defining matrix in the dual quadratic Graver cone is supermodular on conformal sums of nonnegative combinations of elements of the Graver basis.

Lemma 3.2 Let $A$ be any integer $m \times n$ matrix with quadratic Graver cone $\mathcal{Q}(A)$. Let $f : \mathbb{R}^n \to \mathbb{R}$ be any quadratic function $f(x) = x^\top V x + w^\top x + a$ with $V \in \mathcal{Q}^*(A)$. Let $x \in \mathbb{R}^n$ be any point, and let $\sum_i \mu_i g_i$ be any conformal sum in $\mathbb{R}^n$ with $g_i \in \mathcal{G}(A)$ distinct elements in the Graver basis of $A$ and $\mu_i \ge 0$ nonnegative scalars. Then

$$\Delta := f\Big( x + \sum_i \mu_i g_i \Big) - f(x) - \sum_i \big( f(x + \mu_i g_i) - f(x) \big) \ge 0 .$$

Proof. We have

$$f\Big( x + \sum_i \mu_i g_i \Big) - f(x) = \sum_j x^\top V \mu_j g_j + \sum_i \mu_i g_i^\top V x + \sum_{i,j} \mu_i g_i^\top V \mu_j g_j + \sum_i w^\top \mu_i g_i ,$$

and

$$\sum_i \big( f(x + \mu_i g_i) - f(x) \big) = \sum_i \big( x^\top V \mu_i g_i + \mu_i g_i^\top V x + \mu_i g_i^\top V \mu_i g_i + w^\top \mu_i g_i \big) .$$

Therefore we obtain

$$\Delta = \sum_{i,j} \mu_i g_i^\top V \mu_j g_j - \sum_i \mu_i g_i^\top V \mu_i g_i = \sum_{i \ne j} \mu_i g_i^\top V \mu_j g_j = \sum_{i \ne j} \mu_i \mu_j\, g_i^\top V g_j \ge 0 ,$$

because $g_i, g_j \in \mathcal{G}(A)$ satisfy $g_i \circ g_j \ge 0$ and $g_i \ne g_j$ for $i \ne j$, and $V$ is in $\mathcal{Q}^*(A)$.

We need two more useful properties of Graver bases. First we need the following integer analogue of Carathéodory's theorem of [6], which we state without proof.

Lemma 3.3 Let $A$ be an integer $m \times n$ matrix, and let $\mathcal{G}(A)$ be its Graver basis. Then every $x \in \mathcal{L}^*(A)$ is a conformal sum $x = \sum_{i=1}^t \mu_i g_i$ that involves $t \le 2n - 2$ Graver basis elements $g_i \in \mathcal{G}(A)$ and nonnegative integer coefficients $\mu_i \in \mathbb{Z}_+$.

The next lemma provides a Graver basis criterion for finiteness of integer programs.
Lemma 3.4 Let $\mathcal{G}(A)$ be the Graver basis of matrix $A$, and let $l, u \in \mathbb{Z}^n_\infty$. If there is some $g \in \mathcal{G}(A)$ satisfying $g_i \le 0$ whenever $u_i < \infty$ and $g_i \ge 0$ whenever $l_i > -\infty$, then every set of the form $S := \{ x \in \mathbb{Z}^n : Ax = b,\ l \le x \le u \}$ is either empty or infinite, whereas if there is no such $g$, then every set $S$ of this form is finite. Clearly, given the Graver basis, the existence of such $g$ can be checked in polynomial time.

Proof. Suppose there is such $g$ and consider such $S$ containing a point $x$. Then for all $\lambda \in \mathbb{Z}_+$ we have $l \le x + \lambda g \le u$ and $A(x + \lambda g) = Ax = b$, and hence $x + \lambda g \in S$, so $S$ is infinite. Next suppose $S$ is infinite. Then $P := \{ x \in \mathbb{R}^n : Ax = b,\ l \le x \le u \}$ is unbounded, and hence has a recession vector, which we may assume is integer, that is, a nonzero $h$ such that $x + \alpha h \in P$ for all $x \in P$ and $\alpha \ge 0$. Then $h \in \mathcal{L}^*(A)$ and $h_i \le 0$ whenever $u_i < \infty$ and $h_i \ge 0$ whenever $l_i > -\infty$. By Lemma 3.3, the vector $h$ is a conformal sum $h = \sum_i g_i$ of vectors $g_i \in \mathcal{G}(A)$, each of which also satisfies $g_i \le 0$ whenever $u_i < \infty$ and $g_i \ge 0$ whenever $l_i > -\infty$, providing such $g$.

Next we prove the main lemma underlying our algorithm, which shows that, given the Graver basis and an initial feasible point, we can minimize a quadratic function with defining matrix in the dual quadratic Graver cone in polynomial time.

Lemma 3.5 There is an algorithm that, given $A \in \mathbb{Z}^{m \times n}$, its Graver basis $\mathcal{G}(A)$, bounds $l, u \in \mathbb{Z}^n_\infty$, point $z \in \mathbb{Z}^n$ with $l \le z \le u$, and quadratic $f(x) = x^\top V x + w^\top x + a$ with integer $V \in \mathcal{Q}^*(A)$, $w \in \mathbb{Z}^n$, and $a \in \mathbb{Z}$, solves in polynomial time the program

$$\min\{\, f(x) = x^\top V x + w^\top x + a : x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \,\}, \qquad b := Az . \tag{11}$$

Proof. First, apply the algorithm of Lemma 3.4 to $\mathcal{G}(A)$ and $l, u$ and either detect that the feasible set is infinite and stop, or conclude it is finite and continue.
Next produce a sequence of feasible points $x_0, x_1, \dots, x_s$ with $x_0 := z$ the given input point, as follows. Having obtained $x_k$, solve the minimization problem

$$\min\{\, f(x_k + \mu g) : \mu \in \mathbb{Z}_+,\ g \in \mathcal{G}(A),\ l \le x_k + \mu g \le u \,\} \tag{12}$$

by applying the algorithm of Lemma 3.1 for each $g \in \mathcal{G}(A)$. If the minimal value in (12) satisfies $f(x_k + \mu g) < f(x_k)$ then set $x_{k+1} := x_k + \mu g$ and repeat, else stop and output the last point $x_s$ in the sequence. Now, $A x_{k+1} = A(x_k + \mu g) = A x_k = b$ by induction on $k$, so each $x_k$ is feasible. Because the feasible set is finite and the $x_k$ have decreasing objective values and hence are distinct, the algorithm terminates.

We now show that the point $x_s$ output by the algorithm is optimal. Let $x^*$ be any optimal solution to (11). Consider any point $x_k$ in the sequence, and suppose that it is not optimal. We claim that a new point $x_{k+1}$ will be produced and will satisfy

$$f(x_{k+1}) - f(x^*) \le \frac{2n-3}{2n-2} \big( f(x_k) - f(x^*) \big) . \tag{13}$$

By Lemma 3.3, we can write the difference $x^* - x_k = \sum_{i=1}^t \mu_i g_i$ as a conformal sum involving $1 \le t \le 2n - 2$ elements $g_i \in \mathcal{G}(A)$ with all $\mu_i \in \mathbb{Z}_+$. By Lemma 3.2,

$$f(x^*) - f(x_k) = f\Big( x_k + \sum_{i=1}^t \mu_i g_i \Big) - f(x_k) \ge \sum_{i=1}^t \big( f(x_k + \mu_i g_i) - f(x_k) \big) .$$

Adding $t \big( f(x_k) - f(x^*) \big)$ on both sides and rearranging terms, we obtain

$$\sum_{i=1}^t \big( f(x_k + \mu_i g_i) - f(x^*) \big) \le (t-1) \big( f(x_k) - f(x^*) \big) .$$

Therefore there is some summand on the left-hand side satisfying

$$f(x_k + \mu_i g_i) - f(x^*) \le \frac{t-1}{t} \big( f(x_k) - f(x^*) \big) \le \frac{2n-3}{2n-2} \big( f(x_k) - f(x^*) \big) .$$

So the point $x_k + \mu g$ attaining the minimum in (12) satisfies

$$f(x_k + \mu g) - f(x^*) \le f(x_k + \mu_i g_i) - f(x^*) \le \frac{2n-3}{2n-2} \big( f(x_k) - f(x^*) \big) ,$$

and so indeed $x_{k+1} := x_k + \mu g$ will be produced and will satisfy (13).
This shows that the last point $x_s$ produced and output by the algorithm is indeed optimal.

We proceed to bound the number $s$ of points. Consider any $i < s$ and the intermediate non-optimal point $x_i$ in the sequence produced by the algorithm. Then $f(x_i) > f(x^*)$ with both values integer, and so repeated use of (13) gives

$$1 \le f(x_i) - f(x^*) = \prod_{k=0}^{i-1} \frac{f(x_{k+1}) - f(x^*)}{f(x_k) - f(x^*)} \, \big( f(x_0) - f(x^*) \big) \le \Big( \frac{2n-3}{2n-2} \Big)^{i} \big( f(x_0) - f(x^*) \big) ,$$

and therefore

$$i \le \Big( \log \frac{2n-2}{2n-3} \Big)^{-1} \log \big( f(x_0) - f(x^*) \big) .$$

Therefore the number $s$ of points produced by the algorithm is at most one unit larger than this bound, and using a simple bound on the logarithm, we obtain

$$s = O\big( n \log ( f(x_0) - f(x^*) ) \big) .$$

Thus, the number of points produced and the total running time are polynomial.

Next we show that, given the Graver basis, we can also find an initial feasible point, or assert that the given set is empty or infinite, in polynomial time.

Lemma 3.6 There is an algorithm that, given integer $m \times n$ matrix $A$, its Graver basis $\mathcal{G}(A)$, $l, u \in \mathbb{Z}^n_\infty$, and $b \in \mathbb{Z}^m$, in polynomial time, either finds a feasible point in the set $S := \{ x \in \mathbb{Z}^n : Ax = b,\ l \le x \le u \}$ or asserts that $S$ is empty or infinite.

Proof. Assume that $l \le u$ and that $l_j < \infty$ and $u_j > -\infty$ for all $j$, because otherwise there is no feasible point. Also assume that there is no $g \in \mathcal{G}(A)$ satisfying $g_j \le 0$ whenever $u_j < \infty$ and $g_j \ge 0$ whenever $l_j > -\infty$, because otherwise $S$ is empty or infinite by Lemma 3.4. Now, either detect there is no integer solution to the system of equations $Ax = b$ (without the lower and upper bound constraints) and stop, or determine some such solution $x \in \mathbb{Z}^n$ and continue; it is well known that this can be done in polynomial time, say, using the Hermite normal form of $A$, see [5]. Let $I := \{ j : l_j \le x_j \le u_j \} \subseteq \{1, \dots, n\}$ be the set of indices of entries of $x$ that satisfy their lower and upper bounds. While $I \subsetneq \{1, \dots, n\}$ repeat the following procedure. Pick any index $i \notin I$. Then either $x_i < l_i$ or $x_i > u_i$. We describe the procedure only in the former case, the latter being symmetric. Update the lower and upper bounds by setting

$$\hat{l}_j := \min\{ l_j, x_j \}, \qquad \hat{u}_j := \max\{ u_j, x_j \}, \qquad j = 1, \dots, n .$$

Solve in polynomial time the following linear integer program, for which $x$ is feasible,

$$\max\{\, z_i : z \in \mathbb{Z}^n,\ Az = b,\ \hat{l} \le z \le \hat{u},\ z_i \le u_i \,\}, \tag{14}$$

by applying the algorithm of Lemma 3.5 using the function $f(z) := z^\top 0 z - \mathbf{1}_i^\top z + 0$ (maximizing $z_i$ amounts to minimizing $-z_i$), with $V = 0$ the zero matrix, which is always in $\mathcal{Q}^*(A)$. Now $\hat{l}_j > -\infty$ if and only if $l_j > -\infty$, and $\hat{u}_j < \infty$ if and only if $u_j < \infty$. So there is no $g \in \mathcal{G}(A)$ satisfying $g_j \le 0$ whenever $\hat{u}_j < \infty$ and $g_j \ge 0$ whenever $\hat{l}_j > -\infty$, and hence the feasible set of (14) is finite by Lemma 3.4 and has an optimal solution $z$. If $z_i < l_i$ then assert that the set $S$ is empty and stop. Otherwise, set $x := z$, $I := \{ j : l_j \le x_j \le u_j \}$, and repeat. Note that in each iteration, the cardinality of $I$ increases by at least one. Therefore, after at most $n$ iterations, either the algorithm detects infeasibility, or $I = \{1, \dots, n\}$ is obtained, in which case the current point $x$ is feasible.

We are now in position to establish our theorem.

Theorem 1.2 There is an algorithm that, given $A \in \mathbb{Z}^{m \times n}$, its Graver basis $\mathcal{G}(A)$, bounds $l, u \in \mathbb{Z}^n_\infty$, $b \in \mathbb{Z}^m$, integer matrix $V \in \mathcal{Q}^*(A)$ in the dual quadratic Graver cone, $w \in \mathbb{Z}^n$, and $a \in \mathbb{Z}$, solves in polynomial time the quadratic integer program

$$\min\{\, x^\top V x + w^\top x + a : x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \,\} .$$

Proof.
Use the algorithm underlying Lemma 3.6 to either detect that the problem is infeasible or that the feasible set is infinite and stop, or obtain a feasible point and use the algorithm underlying Lemma 3.5 to obtain an optimal solution.

An important immediate consequence of Theorem 1.2 is that we can efficiently minimize separable quadratic functions defined by vectors in the dual diagonal Graver cone. In particular, it applies to every convex separable quadratic function (which can also be deduced from the results of [3] on separable convex functions).

Theorem 1.3 There is an algorithm that, given $A \in \mathbb{Z}^{m \times n}$, its Graver basis $\mathcal{G}(A)$, bounds $l, u \in \mathbb{Z}^n_\infty$, $b \in \mathbb{Z}^m$, integer vector $v \in \mathcal{D}^*(A)$ in the dual diagonal Graver cone, and $w, a \in \mathbb{Z}^n$, solves in polynomial time the separable quadratic program

$$\min\Big\{ \sum_{i=1}^n \big( v_i x_i^2 + w_i x_i + a_i \big) : x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \Big\} .$$

In particular, this applies to any convex separable quadratic, that is, with $v \in \mathbb{Z}^n_+$.

Proof. First, for any $v \in \mathcal{D}^*(A)$ we have $V := \mathrm{Diag}(v) \in \mathcal{Q}^*(A)$ by Lemma 2.4. Hence, by Theorem 1.2, we can minimize in polynomial time the quadratic function

$$\sum_{i=1}^n \big( v_i x_i^2 + w_i x_i + a_i \big) = x^\top V x + w^\top x + \sum_{i=1}^n a_i .$$

Second, if the separable quadratic function is convex, which is equivalent to its defining vector $v$ being nonnegative, then $v \in \mathbb{R}^n_+ \subseteq \mathcal{D}^*(A)$ by Lemma 2.4 again. Hence the second statement of the theorem now follows from the first statement.

4 Nonconvex solvable quadratics and matroids

Consider the quadratic minimization problem, with the Graver basis of $A$ given,

$$\min\{\, f(x) = x^\top V x + w^\top x + a : x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \,\} . \tag{15}$$

The function $f$ is convex if and only if its defining matrix $V$ is positive semidefinite, that is, if $x^\top V x \ge 0$ for all $x \in \mathbb{R}^n$. Let $\mathcal{S}^n_+ \subset \mathcal{S}^n$ denote the cone of symmetric positive semidefinite matrices.
Now, on the one hand, if $V \in \mathcal{Q}^*(A)$ then, by Theorem 1.2, we can solve problem (15) efficiently. On the other hand, if $V \in \mathcal{S}^n_+$ then $f$ is convex, and problem (15) may seem to be easier, but remains NP-hard even for rank-1 matrices $V = v v^\top \in \mathcal{S}^n_+$ by Proposition 1.1. So it is unlikely that $\mathcal{Q}^*(A)$ contains $\mathcal{S}^n_+$, and it is interesting to consider the relation between these matrix cones.

For this, we need a couple of basic facts about positive semidefinite matrices. First, note that for any vector $u \in \mathbb{R}^n$, the rank-1 matrix $u u^\top$ is in $\mathcal{S}^n_+$ because $x^\top (u u^\top) x = (u^\top x)^2 \ge 0$ for all $x \in \mathbb{R}^n$, whereas for any two linearly independent vectors $g, h \in \mathbb{R}^n$, the rank-2 matrix $g h^\top + h g^\top$ is in $\mathcal{S}^n \setminus \mathcal{S}^n_+$ because there is an $x \in \mathbb{R}^n$ with $g^\top x = 1$ and $h^\top x = -1$ and hence $x^\top (g h^\top + h g^\top) x = 2 (g^\top x)(h^\top x) = -2 < 0$. Second, the cone of symmetric positive semidefinite matrices is self dual, that is, $(\mathcal{S}^n_+)^* = \mathcal{S}^n_+$. To see this, note that if $U \in \mathcal{S}^n \setminus \mathcal{S}^n_+$ then there is an $x \in \mathbb{R}^n$ with $(x \otimes x) \cdot U = x^\top U x < 0$, so $U \notin (\mathcal{S}^n_+)^*$; and if $V \in \mathcal{S}^n_+$ has rank $r$, then $V = \sum_{i=1}^r x_i \otimes x_i$ for some $x_i \in \mathbb{R}^n$ and hence $U \cdot V = \sum_{i=1}^r x_i^\top U x_i \ge 0$ for all $U \in \mathcal{S}^n_+$, so $V \in (\mathcal{S}^n_+)^*$.

So we can conclude the following. In the rare situation where each orthant of $\mathbb{R}^n$ contains at most one element of $\mathcal{G}(A)$, we have $\mathcal{Q}(A) = \{0\}$ and $\mathcal{Q}^*(A) = \mathcal{S}^n$, so Theorem 1.2 enables us to solve problem (15) for any quadratic function. In the more typical situation, where some orthant does contain two elements $g, h \in \mathcal{G}(A)$, the corresponding generator of $\mathcal{Q}(A)$ satisfies $g h^\top + h g^\top \in \mathcal{S}^n \setminus \mathcal{S}^n_+$ and hence $\mathcal{Q}(A) \nsubseteq \mathcal{S}^n_+$. By self duality of $\mathcal{S}^n_+$, we obtain $\mathcal{S}^n_+ = (\mathcal{S}^n_+)^* \nsubseteq \mathcal{Q}^*(A)$. So we cannot solve problem (15) for all convex quadratics, reflecting the NP-hardness of the convex problem.
But we do typically also have $\mathcal{Q}^*(A) \nsubseteq \mathcal{S}^n_+$, that is, we can solve problem (15) in polynomial time for various nonconvex quadratics. For instance, in Example 2.6, the matrix in (9) is not positive semidefinite whenever $a, b, c < 0$. Moreover, by Lemma 2.4, $\mathbb{R}^n_+ \subseteq \mathcal{D}^*(A) = \{ v : \mathrm{Diag}(v) \in \mathcal{Q}^*(A) \}$, so $\mathcal{Q}^*(A) \setminus \mathcal{S}^n_+ \ne \emptyset$ whenever $\mathcal{D}^*(A) \setminus \mathbb{R}^n_+ \ne \emptyset$.

We proceed to discuss this diagonal case, where the function $f$ is defined by a diagonal matrix $V = \mathrm{Diag}(v)$ for some $v \in \mathbb{R}^n$, that is, $f$ is separable of the form $f(x) = \sum_i (v_i x_i^2 + w_i x_i + a_i)$. In this case, $f$ is convex if and only if $v$ is nonnegative. As noted in Lemma 2.4, the dual diagonal Graver cone $\mathcal{D}^*(A)$ always contains the nonnegative orthant $\mathbb{R}^n_+$. We proceed to characterize those matrices $A$ for which this inclusion is strict, so that $\mathcal{D}^*(A) \setminus \mathbb{R}^n_+ \ne \emptyset$ and Theorem 1.3 enables us to solve problem (15) in polynomial time also for various nonconvex separable quadratics.

For this we need a few more definitions. A circuit of an integer matrix $A$ is an element $c \in \mathcal{L}^*(A)$ whose support $\mathrm{supp}(c)$ is minimal under inclusion and whose entries are relatively prime. We denote the set of circuits of $A$ by $\mathcal{C}(A)$. It is easy to see that for every integer matrix $A$, the set of circuits is contained in the Graver basis, that is, $\mathcal{C}(A) \subseteq \mathcal{G}(A)$. Recall that a finite sum $u := \sum_i v_i$ of vectors in $\mathbb{R}^n$ is conformal if $v_i \sqsubseteq u$ for all $i$, and hence all summands lie in the same orthant. The following property of circuits is well known. For a proof see, for instance, [4] or [7].

Lemma 4.1 Let $A$ be an integer matrix. Then every $x \in \mathcal{L}^*(A)$ is a conformal sum $x = \sum_i \alpha_i c_i$ involving circuits $c_i \in \mathcal{C}(A)$ and nonnegative real coefficients $\alpha_i \in \mathbb{R}_+$.
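To make the statement of Lemma 4.1 concrete, here is a small worked instance for the matrix $A = (1\ 1\ 1)$ of Example 2.6. That the circuits of this matrix are the six vectors $\pm(1,-1,0)$, $\pm(1,0,-1)$, $\pm(0,1,-1)$ is an easy direct check (the minimal supports are exactly the two-element sets), added here for illustration rather than taken from the text.

```latex
x = (3,-1,-2) \in \mathcal{L}^*(A), \qquad
x = 1 \cdot (1,-1,0) + 2 \cdot (1,0,-1), \qquad
(1,-1,0) \sqsubseteq x, \quad (2,0,-2) \sqsubseteq x .
```

Both summands lie in the orthant of $x$ and are bounded by $x$ coordinate-wise in absolute value, so the sum is conformal.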
It turns out that the matroid of linear dep enden cies o n the columns of t he in teger m × n matrix A (o v er the reals or inte gers) pla ys a cen tral role in the characterization w e a re heading for. A ma tr oid-cir cuit is any set C ⊆ { 1 , . . . , n } that is the supp ort C = supp( c ) of some circuit c ∈ C ( A ) of A . No te that a circuit c is in C ( A ) if and only if its antipo dal − c is, a nd if c, e ∈ C ( A ) are circuits with c 6 = ± e then supp( c ) 6 = supp( e ). W e denote the set of matroid- circ uits of A , that is, the set of supp orts of circuits in C ( A ), by M ( A ) := { supp( c ) : c ∈ C ( A ) } , and refer to it simply as the matr oid of A . F or instance, for the 1 × 3 matrix A := (1 2 1) w e ha v e C ( A ) = ± { (2 , − 1 , 0) , (0 , − 1 , 2) , (1 , 0 , − 1) } , M ( A ) = {{ 1 , 2 } , { 2 , 3 } ) , { 1 , 3 }} . W e no w characterize those matrices A for whic h D ∗ ( A ) strictly con tains R n + . Theorem 4.2 The dual diagonal Gr aver c one of every in t e ger m × n matrix A satisfies D ∗ ( A ) ⊇ R n + , a n d the inclusion is strict i f and only if ther e is 1 ≤ k ≤ n such that C ∩ E 6 = { k } for every two distinct matr oid-cir cuits C , E ∈ M ( A ) of A . Pr o of. W e pro ve the dual statemen ts ab out the diagonal Grav er cone. By definition D ( A ) ⊆ R n + , and the inclusion is strict if and only if some unit v ector 1 k is not in 12 D ( A ). Therefore it suffices to prov e that, for any 1 ≤ k ≤ n , we ha v e 1 k ∈ D ( A ) if and only if there are t w o distinct matr oid-circuits C , E ∈ M ( A ) with C ∩ E = { k } . Supp ose first C , E ∈ M ( A ) are distinct matro id-circuits with C ∩ E = { k } . Then there are c, e ∈ C ( A ) with c 6 = ± e suc h that supp( c ) = C and supp( e ) = E . Replacing e b y − e ∈ C ( A ) if neces sary w e may assum e that c k e k > 0. Then c ◦ e ≥ 0, c 6 = e , and c, e ∈ G ( A ) imply that c ◦ e = c k e k 1 k is a generator of D ( A ), and hence 1 k ∈ D ( A ). Con ve rsely , supp ose 1 k ∈ D ( A ). 
Because $\mathcal{D}(A) \subseteq \mathbb{R}^n_+$, some nonnegative multiple of $\mathbf{1}_k$ must be one of the generators. So there are $g, h \in \mathcal{G}(A)$ with $g \circ h \ge 0$ and $g \neq h$ such that $g \circ h$ is a nonnegative multiple of $\mathbf{1}_k$, and hence $\operatorname{supp}(g) \cap \operatorname{supp}(h) = \{k\}$. By Lemma 4.1 we have $g = \sum_i \alpha_i c_i$ and $h = \sum_j \alpha_j e_j$, conformal sums of circuits with nonnegative coefficients. Then $\operatorname{supp}(g) = \bigcup \operatorname{supp}(c_i)$ and $\operatorname{supp}(h) = \bigcup \operatorname{supp}(e_j)$, and hence there are $c_i$ and $e_j$ among these circuits such that $\operatorname{supp}(c_i) \cap \operatorname{supp}(e_j) = \{k\}$. Let $C := \operatorname{supp}(c_i)$ and $E := \operatorname{supp}(e_j)$ be the corresponding matroid-circuits of $A$. It remains to show that $C$ and $E$ are distinct. Suppose indirectly that $C = E$. Then $C = E = C \cap E = \{k\}$. This implies that the $k$-th column of $A$ is $0$ and $c_i = e_j = \pm\mathbf{1}_k$. But then $c_i \sqsubseteq g$ and $e_j \sqsubseteq h$, and therefore $g = c_i = e_j = h$, which is a contradiction. So $C \neq E$, and the proof is complete.

It is interesting to emphasize that the characterization in Theorem 4.2 is in terms of only the matroid of $A$, that is, the linear dependency structure on the columns of $A$.

The algorithm of Theorem 1.3 enables us to solve in polynomial time the program
\[ \min\Big\{ \sum_{i=1}^n \big(v_i x_i^2 + w_i x_i + a_i\big) \ :\ x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \Big\} \]
for all separable quadratics with $v \in \mathcal{D}^*(A)$ and in particular for all separable convex quadratic functions with $v \in \mathbb{R}^n_+$. So the algorithm can moreover solve the program for some separable nonconvex quadratic functions precisely when the matroid of $A$ satisfies the criterion of Theorem 4.2. Here are some concrete simple examples.

Example 4.3 Consider again Example 2.5 with $A := 0$ the zero $1 \times n$ matrix having Graver basis $\mathcal{G}(A) = \{\pm\mathbf{1}_i : i = 1, \dots, n\}$. Then the set of matroid-circuits of $A$ is $\mathcal{M}(A) = \{\{1\}, \dots, \{n\}\}$. Therefore $C \cap E = \emptyset$ for all distinct $C, E \in \mathcal{M}(A)$ and the condition of Theorem 4.2 trivially holds, so $\mathcal{D}^*(A) \supsetneq \mathbb{R}^n_+$.
In fact, here $\mathcal{D}^*(A) = \mathbb{R}^n$.

Example 4.4 Directed graphs. Let $G$ be a directed graph, and let $A$ be its $V \times E$ incidence matrix, with $A_{v,e} := 1$ if vertex $v$ is the head of directed edge $e$, $A_{v,e} := -1$ if $v$ is the tail of $e$, and $A_{v,e} := 0$ otherwise. The set $\mathcal{M}(A)$ of matroid-circuits consists precisely of all subsets $C \subseteq E$ that are circuits of the undirected graph underlying $G$. The set $\mathcal{C}(A)$ of circuits consists of all vectors $c \in \{-1, 0, 1\}^E$ obtained from some matroid-circuit $C \subseteq E$ by choosing either of its two orientations and setting $c_e := 1$ if directed edge $e \in C$ agrees with the orientation, $c_e := -1$ if $e$ disagrees, and $c_e := 0$ if $e \notin C$. The Graver basis is equal to the set of circuits, $\mathcal{G}(A) = \mathcal{C}(A)$. By Theorem 4.2 we have $\mathcal{D}^*(A) \supsetneq \mathbb{R}^E_+$ if and only if there is an edge $e \in E$ such that no two distinct circuits $C, C'$ of the underlying undirected graph satisfy $C \cap C' = \{e\}$.

Example 4.5 Generic matrices. Let $A$ be a generic integer $m \times n$ matrix, that is, a matrix for which every set of $m$ columns is linearly independent, say, the matrix defined by $A_{i,j} := j^i$ for all $i, j$, whose columns are distinct points on the moment curve in $\mathbb{R}^m$. Then the matroid of $A$ is uniform, that is, its matroid-circuits are exactly all $(m+1)$-subsets of $\{1, \dots, n\}$. Suppose $n \le 2m$. Then any two distinct $C, E \in \mathcal{M}(A)$ satisfy $|C \cap E| \ge 2$, and hence $\mathcal{D}^*(A) \supsetneq \mathbb{R}^n_+$ by Theorem 4.2. So, by Theorem 1.2,
\[ \min\Big\{ \sum_{i=1}^n \big(v_i x_i^2 + w_i x_i + a_i\big) \ :\ x \in \mathbb{Z}^n,\ Ax = b,\ l \le x \le u \Big\} \]
can be solved in polynomial time for all such $A$, all $b \in \mathbb{Z}^m$ and $l, u \in \mathbb{Z}^n_\infty$, all convex and some nonconvex separable quadratic functions defined by data $v, w, a \in \mathbb{Z}^n$.

5 Higher degree polynomial functions

The algorithm that underlies our algorithmic Theorem 1.2 using the Graver basis is conceptually quite simple.
First, it finds in polynomial time a feasible point. Then it keeps improving points iteratively, as long as possible, where, at each iteration, it takes the best possible improving step attainable along any Graver basis element. It outputs the last point from which no further Graver improvement is possible.

We now proceed to show that the results of the previous sections can be extended to multivariate polynomials of arbitrary higher degree. We will define a hierarchy of cones, and whenever a polynomial function lies in the corresponding cone, the algorithm outlined above converges to the optimal solution in polynomial time.

It will be convenient now to make more extensive use of tensor notation, and to work with the tensored, nonsymmetrized form of a polynomial function. We use
\[ \otimes_d \mathbb{R}^n := \mathbb{R}^n \otimes \cdots \otimes \mathbb{R}^n, \qquad \otimes_d x := x \otimes \cdots \otimes x, \quad x \in \mathbb{R}^n \]
for the $d$-fold tensor product of $\mathbb{R}^n$ with itself and for the rank-1 tensor that is the $d$-fold product of a vector $x$ with itself, respectively. Note that the $(i_1, \dots, i_d)$-th entry of $\otimes_d x$ is the product $x_{i_1} \cdots x_{i_d}$ of the corresponding entries of $x$. We denote the standard inner product on the tensor space by
\[ \langle U, V \rangle := \sum_{i_1=1}^n \cdots \sum_{i_d=1}^n U_{i_1,\dots,i_d} V_{i_1,\dots,i_d}, \qquad U, V \in \otimes_d \mathbb{R}^n. \]
In particular, in the vector space $\mathbb{R}^n$ we have $\langle x, y \rangle = x^{\intercal} y$ and in the matrix space $\mathbb{R}^n \otimes \mathbb{R}^n$ we have $\langle U, V \rangle = U \cdot V$. Note that for any two rank-1 tensors we have
\[ \langle x_1 \otimes \cdots \otimes x_d,\ y_1 \otimes \cdots \otimes y_d \rangle = \prod_{k=1}^d \langle x_k, y_k \rangle. \]

For simplicity, we restrict attention to homogeneous polynomials, also termed forms. A form $f(x)$ of degree $d$ in the vector of $n$ variables $x = (x_1, \dots, x_n)$ can be compactly defined by a single tensor $F \in \otimes_d \mathbb{R}^n$ that collects all coefficients, by
\[ f(x) := \langle F, \otimes_d x \rangle = \sum_{i_1=1}^n \cdots \sum_{i_d=1}^n F_{i_1,\dots,i_d}\, x_{i_1} \cdots x_{i_d}. \]
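The tensor notation above can be exercised numerically. The following Python sketch is a minimal illustration, not part of the paper. It stores a tensor as a dictionary from 0-based index tuples to entries, and the helper names `rank_one`, `inner`, `dot` and `form` as well as all numerical data are hypothetical choices made for the example. It checks the product rule for inner products of rank-1 tensors and evaluates a cubic form defined by a rank-1 coefficient tensor.

```python
from itertools import product
from math import prod


def rank_one(vectors):
    """Rank-1 tensor v_1 (x) ... (x) v_d as a dict from index tuples to entries."""
    n = len(vectors[0])
    return {
        idx: prod(v[i] for v, i in zip(vectors, idx))
        for idx in product(range(n), repeat=len(vectors))
    }


def inner(U, V):
    """Standard inner product of two tensors stored on the same index set."""
    return sum(U[idx] * V[idx] for idx in U)


def dot(x, y):
    """Ordinary inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def form(F, x, d):
    """Evaluate the degree-d form f(x) = <F, x (x) ... (x) x>."""
    return inner(F, rank_one([x] * d))


# Hypothetical data: three vectors on each side, in dimension n = 2.
xs = [(1, 2), (0, -1), (3, 1)]
ys = [(2, 1), (1, 1), (-1, 4)]
lhs = inner(rank_one(xs), rank_one(ys))
rhs = prod(dot(x, y) for x, y in zip(xs, ys))

# A cubic form with coefficient tensor a (x) a (x) a, so that f(x) = <a, x>^3.
a = (1, 2)
F = rank_one([a] * 3)
x = (3, -1)
value = form(F, x, 3)
print(lhs, rhs, value, dot(a, x) ** 3)
```

The dictionary representation has $n^d$ entries and is meant only for small checks of this kind.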
For instance, the form $f(x) = (x_1 + x_2 + x_3)^3$ of degree $d = 3$ in $n = 3$ variables can be written as
\[ f(x) = \langle F, \otimes_3 x \rangle = \langle \otimes_3 \mathbf{1}, \otimes_3 x \rangle = \langle \mathbf{1}, x \rangle^3 \]
with $\mathbf{1}$ the all-ones vector in $\mathbb{R}^3$ and $F = \otimes_3 \mathbf{1}$ the all-ones tensor in $\otimes_3 \mathbb{R}^3$, with $F_{i_1,i_2,i_3} = 1$ for $i_1, i_2, i_3 = 1, 2, 3$.

Let $A$ be any integer $m \times n$ matrix, and let $\mathcal{G}(A)$ be its Graver basis. For each degree $d \ge 2$ we now define a cone $\mathcal{P}_d(A)$ in the tensor space $\otimes_d \mathbb{R}^n$ as follows.

Definition 5.1 The Graver cone of degree $d$ of an integer $m \times n$ matrix $A$ is the cone $\mathcal{P}_d(A) \subseteq \otimes_d \mathbb{R}^n$ generated by the rank-1 tensors $g_1 \otimes \cdots \otimes g_d$ where the $g_i$ are elements of $\mathcal{G}(A)$ that lie in the same orthant and are not all the same, that is,
\[ \mathcal{P}_d(A) := \operatorname{cone}\{ g_1 \otimes \cdots \otimes g_d \ :\ g_i \in \mathcal{G}(A),\ g_i \circ g_j \ge 0 \text{ for all } i, j,\ g_i \neq g_j \text{ for some } i, j \}. \]
The dual Graver cone of degree $d$ is its dual $\mathcal{P}^*_d(A)$ in $\otimes_d \mathbb{R}^n$ given by
\[ \begin{aligned} \mathcal{P}^*_d(A) &= \{ V \in \otimes_d \mathbb{R}^n \ :\ \langle U, V \rangle \ge 0,\ U \in \mathcal{P}_d(A) \} \\ &= \{ V \ :\ \langle g_1 \otimes \cdots \otimes g_d, V \rangle \ge 0,\ g_i \in \mathcal{G}(A),\ g_i \circ g_j \ge 0 \text{ for all } i, j,\ g_i \neq g_j \text{ for some } i, j \}. \end{aligned} \]

Note that $\mathcal{P}_2(A)$ is the nonsymmetrized version of $\mathcal{Q}(A)$, that is, $\mathcal{Q}(A) = \mathcal{P}_2(A) \cap \mathcal{S}^n$.

One of the key ingredients in extending our algorithmic results to polynomials of arbitrary degree is the following analogue of Lemma 3.2, which establishes the supermodularity of polynomial functions that lie in suitable cones. We need one more piece of terminology. Let $D := \{1, \dots, d\}$ and for $0 \le k \le d$ let $\binom{D}{k}$ be the set of all $k$-subsets of $D$. A $k$-dimensional subtensor of a $d$-dimensional tensor $F = (F_{i_1,\dots,i_d} : 1 \le i_1, \dots, i_d \le n) \in \otimes_d \mathbb{R}^n$ is any of the $\binom{d}{k} n^{d-k}$ tensors $T \in \otimes_k \mathbb{R}^n$ obtained from $F$ by choosing $I \in \binom{D}{k}$, letting each index $i_j$ with $j \in I$ vary from $1$ to $n$, and fixing each index $i_j$ with $j \notin I$ at some value between $1$ and $n$.
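The definition of a subtensor can be made concrete by enumeration. The Python sketch below is an illustration under stated assumptions and is not taken from the paper. Tensors are dictionaries keyed by 0-based index tuples (the text uses 1-based indices), the generator name `subtensors` is hypothetical, and the small tensor `F` is made-up data whose entries encode their own indices so that slices are easy to read off.

```python
from itertools import combinations, product
from math import comb


def subtensors(F, n, d, k):
    """Yield (I, fixed, T) for every k-dimensional subtensor T of F.

    F maps 0-based index tuples of length d to entries. I is the tuple of free
    positions, and `fixed` holds the values assigned to the other positions.
    """
    for I in combinations(range(d), k):
        rest = [j for j in range(d) if j not in I]
        for fixed in product(range(n), repeat=d - k):
            T = {}
            for free in product(range(n), repeat=k):
                idx = [None] * d
                for j, v in zip(I, free):
                    idx[j] = v
                for j, v in zip(rest, fixed):
                    idx[j] = v
                T[free] = F[tuple(idx)]
            yield I, fixed, T


# Hypothetical tensor with n = 2, d = 3; the entry at (i, j, l) is 100*i + 10*j + l.
n, d, k = 2, 3, 2
F = {idx: 100 * idx[0] + 10 * idx[1] + idx[2] for idx in product(range(n), repeat=d)}
subs = list(subtensors(F, n, d, k))
count = len(subs)
expected = comb(d, k) * n ** (d - k)
print(count, expected)
```

Here `count` agrees with the formula $\binom{d}{k} n^{d-k}$ from the definition, and the subtensor with free positions `(0, 1)` and last index fixed at `1` is the slice with entries `F[(i, j, 1)]`.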
For instance, the $k$-dimensional tensor obtained by choosing $I = \{1, \dots, k\}$ and fixing some values $1 \le i_{k+1}, \dots, i_d \le n$ is
\[ T = \big( T_{i_1,\dots,i_k} := F_{i_1,\dots,i_k,i_{k+1},\dots,i_d} \ :\ 1 \le i_1, \dots, i_k \le n \big) \in \otimes_k \mathbb{R}^n. \]
For an integer $m \times n$ matrix $A$, let $\mathcal{K}_d(A) \subseteq \otimes_d \mathbb{R}^n$ be the cone of those tensors $F$ such that, for all $2 \le k \le d$, every $k$-dimensional subtensor of $F$ is in $\mathcal{P}^*_k(A)$.

Lemma 5.2 Let $A$ be an integer $m \times n$ matrix. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a degree $d$ form given by $f(x) = \langle F, \otimes_d x \rangle$ with $F \in \mathcal{K}_d(A)$. Let $x \in \mathbb{R}^n_+$ be nonnegative and let $\sum_{r=1}^t \mu_r g_r$ be a conformal sum in $\mathbb{R}^n$ with $g_r \in \mathcal{G}(A)$ distinct and $\mu_r \ge 0$ nonnegative scalars. Then
\[ \Delta := \Big( f\Big(x + \sum_{r=1}^t \mu_r g_r\Big) - f(x) \Big) - \sum_{r=1}^t \big( f(x + \mu_r g_r) - f(x) \big) \ \ge\ 0. \]

Proof. To simplify the derivation we assume that all $\mu_r = 1$. The same argument goes through in exactly the same way for arbitrary nonnegative $\mu_r$. For $r = 1, \dots, t$,
\[ \begin{aligned} f(x + g_r) - f(x) &= \langle F, \otimes_d (x + g_r) \rangle - \langle F, \otimes_d x \rangle \\ &= \langle F, g_r \otimes x \otimes \cdots \otimes x \rangle + \cdots + \langle F, x \otimes \cdots \otimes x \otimes g_r \rangle \\ &\quad + \sum_{k=2}^d \sum \Big\{ \langle F, u_1 \otimes \cdots \otimes u_d \rangle \ :\ I \in \tbinom{D}{k},\ u_i = \begin{cases} g_r, & i \in I \\ x, & i \notin I \end{cases} \Big\}. \end{aligned} \]
Similarly,
\[ \begin{aligned} f\Big(x + \sum_{r=1}^t g_r\Big) - f(x) &= \Big\langle F, \otimes_d \Big(x + \sum_{r=1}^t g_r\Big) \Big\rangle - \langle F, \otimes_d x \rangle \\ &= \Big\langle F, \sum_{r=1}^t g_r \otimes x \otimes \cdots \otimes x \Big\rangle + \cdots + \Big\langle F, x \otimes \cdots \otimes x \otimes \sum_{r=1}^t g_r \Big\rangle \\ &\quad + \sum_{k=2}^d \sum \Big\{ \langle F, u_1 \otimes \cdots \otimes u_d \rangle \ :\ I \in \tbinom{D}{k},\ u_i = \begin{cases} \sum_{r=1}^t g_r, & i \in I \\ x, & i \notin I \end{cases} \Big\}. \end{aligned} \]
The terms that are linear in the $g_r$ cancel by multilinearity of the inner product. Therefore,
\[ \Delta = \sum_{k=2}^d \sum \Big\{ \langle F, u_1 \otimes \cdots \otimes u_d \rangle - \sum_{r=1}^t \langle F, v_{r,1} \otimes \cdots \otimes v_{r,d} \rangle \ :\ I \in \tbinom{D}{k},\ u_i = \begin{cases} \sum_{r=1}^t g_r, & i \in I \\ x, & i \notin I \end{cases},\ v_{r,i} = \begin{cases} g_r, & i \in I \\ x, & i \notin I \end{cases} \Big\}. \tag{16} \]

Now, consider any $2 \le k \le d$ and any $I \in \binom{D}{k}$. For simplicity of the indexation, we assume that $I = \{1, \dots, k\}$. The derivation for other $I$ is completely analogous. For each choice of indices $1 \le i_{k+1}, \dots, i_d \le n$ let $T(i_{k+1}, \dots, i_d)$ be the $k$-dimensional subtensor of $F$ obtained by letting $i_1, \dots, i_k$ vary and fixing $i_{k+1}, \dots, i_d$ as chosen. Then the corresponding summand of $\Delta$ in the expression (16) above satisfies
\[ \begin{aligned} &\Big\langle F, \Big( \otimes_k \sum_{r=1}^t g_r \Big) \otimes ( \otimes_{d-k}\, x ) \Big\rangle - \sum_{r=1}^t \big\langle F, ( \otimes_k g_r ) \otimes ( \otimes_{d-k}\, x ) \big\rangle \\ &\qquad = \sum_{i_{k+1}=1}^n \cdots \sum_{i_d=1}^n x_{i_{k+1}} \cdots x_{i_d} \Big\langle T(i_{k+1}, \dots, i_d),\ \otimes_k \Big( \sum_{r=1}^t g_r \Big) - \sum_{r=1}^t \otimes_k g_r \Big\rangle. \end{aligned} \tag{17} \]
The summand in (17) above which corresponds to $1 \le i_{k+1}, \dots, i_d \le n$ satisfies
\[ \begin{aligned} &\Big\langle T(i_{k+1}, \dots, i_d),\ \otimes_k \Big( \sum_{r=1}^t g_r \Big) - \sum_{r=1}^t \otimes_k g_r \Big\rangle \\ &\qquad = \sum \big\{ \langle T(i_{k+1}, \dots, i_d),\ g_{r_1} \otimes \cdots \otimes g_{r_k} \rangle \ :\ 1 \le r_1, \dots, r_k \le t,\ \text{not all } r_i \text{ the same} \big\}. \end{aligned} \tag{18} \]

Now, because all the $g_r$ are in the same orthant, and all $k$-dimensional subtensors of $F$ lie in the dual Graver cone $\mathcal{P}^*_k(A)$, each summand on the right-hand side of (18) above satisfies $\langle T(i_{k+1}, \dots, i_d),\ g_{r_1} \otimes \cdots \otimes g_{r_k} \rangle \ge 0$, and so the left-hand side of (18) is nonnegative as well. Because $x \in \mathbb{R}^n_+$ is nonnegative, each summand on the right-hand side of (17) above is nonnegative, and so the left-hand side of (17) is nonnegative as well. Because this holds for all $2 \le k \le d$ and all $I \in \binom{D}{k}$, we obtain that each summand on the right-hand side of (16) is nonnegative, and so $\Delta \ge 0$ as claimed.

A second key ingredient is the following analogue of Lemma 3.1, which shows that we can efficiently minimize a given form of any fixed degree $d$ in a given direction.

Lemma 5.3 For every fixed $d$, there is an algorithm that, given $l, u \in \mathbb{Z}^n_\infty$, $z, g \in \mathbb{Z}^n$ with $l \le z \le u$, and $f(x) = \langle F, \otimes_d x \rangle$ with $F \in \otimes_d \mathbb{Z}^n$, solves in polynomial time
\[ \min\{ f(z + \mu g) \ :\ \mu \in \mathbb{Z}_+,\ l \le z + \mu g \le u \}. \tag{19} \]

Proof. Let $S := \{ \mu \in \mathbb{Z}_+ : l \le z + \mu g \le u \}$, and let $s := \sup S$, which is easy to determine. If $s = \infty$ then we conclude that $S$ is infinite and stop.
Otherwise we need to minimize the univariate degree $d$ polynomial $h(\mu) := \langle F, \otimes_d (z + \mu g) \rangle = \sum_{i=0}^d h_i \mu^i$, whose coefficients $h_i$ can be easily computed from $F$, over $S = \{0, 1, \dots, s\}$. Outline: use repeated bisections and Sturm's theorem, which allows us to count the number of real roots of $h$ in any interval using the Euclidean algorithm on $h(\mu) = \sum_{i=0}^d h_i \mu^i$ and its derivative $h'(\mu) = \sum_{i=0}^{d-1} (i+1) h_{i+1} \mu^i$, to find intervals $[r_i, s_i]$, $i = 1, \dots, d$ (possibly with repetitions if $h$ has multiple roots) containing each real root of $h$, and such that $s_i - r_i < 1$ for all $i$. Then minimizing $h$ over $S$ reduces to minimizing $h$ over $S \cap \{0, \lceil r_1 \rceil, \lfloor s_1 \rfloor, \dots, \lceil r_d \rceil, \lfloor s_d \rfloor, s\}$.

We can now establish our theorem on polynomial integer minimization.

Theorem 1.4 For every fixed $d$ there is an algorithm that, given an integer $m \times n$ matrix $A$, its Graver basis $\mathcal{G}(A)$, $b \in \mathbb{Z}^m$, and a degree $d$ integer homogeneous polynomial $f(x) = \langle F, \otimes_d x \rangle$ with $F \in \mathcal{K}_d(A)$, solves in polynomial time the polynomial program
\[ \min\{ f(x) = \langle F, \otimes_d x \rangle \ :\ x \in \mathbb{Z}^n,\ Ax = b,\ x \ge 0 \}. \]

Proof. First, use the algorithm of Lemma 3.6 to either detect that the problem is infeasible or that the feasible set is infinite and stop, or obtain a feasible point and continue. Now, apply the algorithm of Lemma 3.5 precisely as it is, using the given form $f(x)$ instead of a quadratic. Lemmas 5.2 and 5.3 now assure that the analysis of this algorithm in the proof of Lemma 3.5 carries through precisely as before, and guarantee that the algorithm will find an optimal solution in polynomial time.

References

[1] De Loera, J., Hemmecke, R., Onn, S., Rothblum, U.G., Weismantel, R.: Convex integer maximization via Graver bases. J. Pure App. Alg. 213:1569–1577, 2009.
[2] De Loera, J., Onn, S.: All linear and integer programs are slim 3-way transportation programs. SIAM J. Optim. 17:806–821, 2006.
[3] Hemmecke, R., Onn, S., Weismantel, R.: A polynomial oracle-time algorithm for convex integer minimization. Math. Prog. (to appear).
[4] Onn, S.: Nonlinear Discrete Optimization: An Algorithmic Theory, Nachdiplom Lectures, ETH Zürich, pp. 1–143.
[5] Schrijver, A.: "Theory of Linear and Integer Programming," 1986. Wiley.
[6] Sebö, A.: Hilbert bases, Carathéodory's theorem and combinatorial optimization. In: Proc. IPCO 1 - 1st Conference on Integer Programming and Combinatorial Optimization, 431–455, 1990. University of Waterloo Press.
[7] Sturmfels, B.: Gröbner Bases and Convex Polytopes. Univ. Lec. Ser. Volume 8, 1996. American Mathematical Society.

Jon Lee
IBM T.J. Watson Research Center, Yorktown Heights, USA
jonlee@us.ibm.com

Shmuel Onn
Technion - Israel Institute of Technology, Haifa, Israel
onn@ie.technion.ac.il

Lyubov Romanchuk
Technion - Israel Institute of Technology, Haifa, Israel
lyuba@techunix.technion.ac.il

Robert Weismantel
ETH, Zürich, Switzerland
robert.weismantel@ifor.math.ethz.ch