Multigrid preconditioning of linear systems for interior point methods applied to a class of box-constrained optimal control problems
Authors: Andrei Draganescu, Cosmin Petra
Abstract. In this article we construct and analyze multigrid preconditioners for discretizations of operators of the form $D_\lambda + K^*K$, where $D_\lambda$ is multiplication by a relatively smooth function $\lambda > 0$ and $K$ is a compact linear operator. These systems arise when applying interior point methods to the minimization problem $\min_u \frac{1}{2}(\|Ku - f\|^2 + \beta\|u\|^2)$ with box constraints $\underline{u} \le u \le \overline{u}$ on the controls. The presented preconditioning technique is closely related to the one developed by Drăgănescu and Dupont in [13] for the associated unconstrained problem, and is intended for large-scale problems. As in [13], the quality of the resulting preconditioners is shown to increase as $h \downarrow 0$, but to decrease as the smoothness of $\lambda$ declines. We test this algorithm first on a Tikhonov-regularized backward parabolic equation with box constraints on the control, and then on a standard elliptic-constrained optimization problem. In both cases it is shown that the number of linear iterations per optimization step, as well as the total number of fine-scale matrix-vector multiplications, decreases with increasing resolution, thus showing the method to be potentially very efficient for truly large-scale problems.

Key words. multigrid, interior point methods, PDE-constrained optimization

AMS subject classifications. 65M55, 90C51, 65K10, 65M32, 90C06

1. Introduction.
In this work we present a multigrid preconditioning technique for solving the linear systems arising when interior point methods are applied to the control-constrained optimal control problem

minimize $J_\beta(u) \stackrel{\mathrm{def}}{=} \frac{1}{2}\|Ku - f\|^2 + \frac{\beta}{2}\|u\|^2$, $u \in U_{ad}$,  (1.1)

where $\Omega \subset \mathbb{R}^d$ is a bounded domain, $\beta > 0$, $K : L^2(\Omega) \to L^2(\Omega)$ is a linear, compact operator, and the set of admissible solutions is given by

$U_{ad} = \{u \in L^2(\Omega) : \underline{u} \le u \le \overline{u}\ \text{a.e.}\}$,

with $\underline{u}, \overline{u} \in L^2(\Omega)$ and $\underline{u} \le \overline{u}$. The following examples form the main motivation for the present work:

Example A: Box-constrained time-reversal for a parabolic equation. Consider the linear parabolic initial value problem

$\partial_t y + Ay = 0$ on $\Omega \times (0, \infty)$, $\quad y(x,t) = 0$ on $\partial\Omega \times (0, \infty)$, $\quad y(x, 0) = u(x)$ for $x \in \Omega$,  (1.2)

where $A$ is a linear elliptic operator, and denote the solution map by $S(t)u \stackrel{\mathrm{def}}{=} y(\cdot, t)$. To formulate a control problem we define $K = S(T)$, where $T > 0$ is a fixed "end-time"; the resulting optimization problem is controlled by the initial value $u$.

∗ Department of Mathematics and Statistics, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, Maryland 21250 (draga@umbc.edu). The work of this author was supported in part by the Department of Energy under contract no. DE-SC0005455, and by the National Science Foundation under awards DMS-1016177 and DMS-0821311.
† Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S Cass Avenue, Argonne, IL 60439 (petra@mcs.anl.gov). The work of this author was supported in part by the National Science Foundation under award CCF-0728878.

Note that if the box constraints in (1.1) are left out, i.e., $U_{ad} = L^2(\Omega)$, then (1.1) is the Tikhonov-regularized formulation of the ill-posed problem

$Ku = f$.  (1.3)

Example B: Elliptic-constrained distributed optimal control problem.
In this example we let $K = \Delta^{-1}$, where $\Delta$ is the Laplace operator acting on $H^1_0(\Omega)$. In this case the problem (1.1) is usually formulated as the PDE-constrained optimization problem (e.g., see Borzì and Schulz [8])

minimize $\frac{1}{2}\|y - f\|^2 + \frac{\beta}{2}\|u\|^2$ subj. to $\Delta y = u$, $y \in H^1_0(\Omega)$, $\underline{u} \le u \le \overline{u}$ a.e. in $\Omega$.  (1.4)

Our primary motivation is rooted in solving large-scale inverse problems like the one in Example A, which is a simplified version of the problem considered by Akçelik et al. in [2]. There the question was to identify the initial concentration $u = y(\cdot, 0)$ of a contaminant released in the atmosphere in a given geographic area (the Los Angeles Basin) given later-time measurements at various fixed locations. The spatio-temporal evolution of the concentration $y$ of the contaminant is assumed, as in Example A, to be governed by an advection-diffusion equation with known wind velocities and contaminant diffusivity that can be formulated as (1.2). In Example A we consider the case where the measurements are taken at all points in space but at a single moment in time $T > 0$; therefore the data $f$ is the entire state at time $T$. The problem thus becomes to invert a compact operator which, as is well known, is not continuously invertible. As a result, a naive approach to inverting $K$ is unstable, in that small perturbations in the "measurements" $f$ result in exponentially large errors in the solution. If the "exact" measurements $f$ result from applying $K$ to a "true" initial value $u$, that is, $f = Ku$, then various regularization techniques [15] are employed so that the computed solution $u_\delta$ of the $\delta$-perturbed problem $Ku_\delta = f_\delta$ (where $\|f - f_\delta\| \le \delta$) converges to the "true" solution $u$ as $\delta \downarrow 0$, the most commonly used being Tikhonov regularization.
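To make the ill-posedness concrete, the following minimal sketch (ours, not from the paper; all discretization choices are illustrative) builds a one-dimensional analogue of the solution map $K = S(T)$ from Example A and inspects its singular values:

```python
import numpy as np

# Illustrative 1D analogue of Example A: the control-to-state map K = S(T) for
# y_t - y_xx = 0 on (0,1) with zero Dirichlet boundary values, discretized by
# finite differences in space and implicit Euler in time.
n, T, nsteps = 63, 0.05, 50
h = 1.0 / (n + 1)
dt = T / nsteps
# Discrete Laplacian with homogeneous Dirichlet conditions
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2
step = np.linalg.inv(np.eye(n) + dt * A)        # one implicit Euler step
K = np.linalg.matrix_power(step, nsteps)        # discrete solution map S(T)

# K discretizes a compact operator: its singular values decay extremely fast,
# so naively inverting K u = f amplifies data noise by enormous factors.
sv = np.linalg.svd(K, compute_uv=False)
print(sv[0], sv[10])    # the tail sits many orders of magnitude below the top
```

The rapid singular-value decay is exactly why the regularization term $\beta\|u\|^2$ (and, later, the box constraints) is needed before inversion is attempted.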
One issue not resolved by Tikhonov regularization is that the solution $u_\delta$ of the regularized problem may exhibit non-physical, or otherwise qualitatively incorrect, behavior: for example, if the concentration needs to have values in $[0, 1]$, it is well known that $u_\delta$ may exceed these limits. In addition, if the true initial contamination event is localized, then $u_\delta$ oscillates around zero and is not localized. However, if explicit constraints $\underline{u} = 0$, $\overline{u} = 1$ are set, then the solution is physically relevant, and is also localized. In Figure 1.1 we show the solution of the inverse problem where the target solution (true initial concentration) is represented by two localized sources of contamination, with one dominating the other in size. The solution of the unconstrained problem (with $\delta$ down to roundoff error and $\beta$ optimized) does not show two sources of pollution, unlike the solution of the constrained problem, from which one can clearly identify two separate components in the support of the initial value. In addition, in the presence of nonlinearities, with reaction terms that are sensitive to the sign of their arguments, the importance of physically meaningful box constraints cannot be overstated. A similar situation is encountered in image deblurring, where the target (gray-scale) solution takes values in $[0, 1]$; if these boundaries are strictly enforced, then the quality of the inversion (deblurring) is significantly improved [34].

[Figure 1.1: concentration profiles — initial concentration; recovered (unconstrained); recovered (constrained); end state (data).]
Fig. 1.1. Solution of the constrained vs. unconstrained inverse problem, with the target solution (initial concentration) being a smooth function supported on two disjoint intervals.
The unconstrained optimization problem, where $U_{ad} = L^2(\Omega)$ in (1.1), ultimately reduces to solving the normal equations

$(\beta I + K^*K)u = K^*f$.  (1.5)

For compact operators these systems either represent or closely resemble integral equations of the second kind. Starting with the work of Hackbusch [20] (see also [22]), much effort has been devoted to developing efficient multigrid methods for solving (1.5), e.g., see [28, 33, 23, 27, 5, 13] and the references therein. We should point out that multigrid methods were originally developed for solving elliptic equations [21, 10], and later significant efforts were devoted to extending these methods to other important differential equations such as advection-diffusion and the Navier-Stokes equations (e.g., see [14] and the references therein). For elliptic equations, the goal is typically to reduce the condition number of the discrete system from $O(h^{-2})$ to $O(1)$, which results in a solution process that solves the equation in a mesh-independent number of iterations. However, the discrete version of (1.5) has a condition number which is $O(\beta^{-1})$, with the bound being independent of $h$. Moreover, even for $\beta = 0$, conjugate gradient (used as a regularizer) already solves (1.5) in a mesh-independent number of iterations [15]; in other words, mesh-independence is nearly effortless for integral equations. Instead, for systems like (1.5) multigrid is used to further reduce the condition number of the preconditioned system.
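The mesh-independence of conjugate gradient for second-kind systems like (1.5) is easy to observe numerically. The sketch below (our illustrative 1D setup with a hand-rolled CG; all parameter choices are assumptions, not the paper's) counts CG iterations for the normal-equation operator $\beta I + K^TK$ across several resolutions:

```python
import numpy as np

def heat_map(n, T=0.05, nsteps=50):
    # 1D discrete heat solution map, standing in for a compact operator K
    h = 1.0 / (n + 1)
    A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    step = np.linalg.inv(np.eye(n) + (T / nsteps) * A)
    return np.linalg.matrix_power(step, nsteps)

def cg_iters(M, b, tol=1e-8):
    # plain conjugate gradient; returns the iteration count at convergence
    x = np.zeros_like(b); r = b.copy(); p = r.copy()
    rs = r @ r
    for it in range(1, 10_000):
        Mp = M @ p
        alpha = rs / (p @ Mp)
        x += alpha * p; r -= alpha * Mp
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(b):
            return it
        p = r + (rs_new / rs) * p; rs = rs_new
    return it

beta = 1e-4
counts = []
for n in (31, 63, 127):
    K = heat_map(n)
    M = beta * np.eye(n) + K.T @ K      # normal-equation operator of (1.5)
    b = K.T @ np.ones(n)
    counts.append(cg_iters(M, b))
print(counts)   # iteration counts stay bounded as the mesh is refined
```

The eigenvalues of $\beta I + K^TK$ cluster at $\beta$ with only a handful of outliers, which is why CG needs essentially the same (small) number of iterations at every resolution.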
For example, in [13] it is shown that by using specially designed multigrid preconditioners, the preconditioned version of (1.5) has a condition number bounded by $O(h^2/\beta)$, the consequence of which is interesting at least from a theoretical point of view: if $\beta > 0$ is kept fixed (which is normally not the case in practical applications), the number of iterations required to solve the problem decreases as $h \downarrow 0$, to the point where a single iteration is enough to solve the problem with sufficient accuracy. This fact, already known to Hackbusch [20], constitutes a departure from the usual multigrid framework, where, as stated before, the goal is to achieve mesh-independence (a bounded number of iterations as $h \downarrow 0$). Hence, as a result of specific multigrid preconditioning, the solution process for the unconstrained problem (1.5) requires fewer and fewer fine-scale matrix-vector multiplications (mat-vecs) as $h \downarrow 0$. The main contribution of the present work is to show that such performance can also be achieved in the presence of explicit box constraints on the control, as formulated in (1.1). Multigrid methods have long been associated with solving large-scale problems, and beginning with the work of Hackbusch [18, 19], and especially over the last decade, significant efforts were concentrated on devising efficient multilevel methods for optimal control problems [3, 7, 6, 40, 29, 4]. A more detailed discussion of the subject and many references can be found in the recent review article by Borzì and Schulz [8]. While most, though not all, of the aforementioned articles discuss unconstrained problems, the addition of box constraints poses additional challenges associated with the presence of the Lagrange multipliers related to the inequality constraints, which in general are less regular than the solutions.
Optimization methods for such problems with bound constraints typically fall into one of two categories: active-set type strategies, especially semismooth Newton methods (SSNMs), and interior point methods (IPMs). Over the last decade both IPMs and SSNMs have consistently attracted the attention of the scientific community due to their proven efficiency in solving distributed optimal control problems with PDE constraints. Both strategies lead to superlinear local convergence [38, 39, 24] and lend themselves to analysis both in a finite dimensional setting and in function space [37, 41] (see also [26]), which is a critical stepping stone to proving mesh-independence [25]. Each of the IPMs and SSNMs consists of an outer iterative process that further requires solving one or two linear systems at each outer iteration. For large-scale problems the solution of these linear systems often becomes the bottleneck of the computation. In terms of their linear algebra needs, IPMs and SSNMs exhibit significant differences and require separate treatment. For SSNMs, the linear systems involve a subset of unknowns and equations of the Hessian of the cost functional, while for IPMs the systems have the same structure as the systems arising in the unconstrained problem, but contain additional terms on the diagonal which usually are a source of extreme ill-conditioning. The question of efficient multigrid preconditioning of the linear systems arising in the semismooth Newton solution process is the subject of current research [11]. The focus of the current work is the question of efficient multigrid preconditioning of linear systems arising in the IPM solution process. Multigrid and IPMs have been shown to work well together for elliptic variational inequalities (obstacle problems) [3], and for some classes of problems where the Hessian of the cost functional is elliptic [4].
In contrast, the Hessian of the cost functional in (1.1) is a compact operator and requires a significantly different approach. In this article we treat $K$ and its discretizations as black-box operators, and we employ the first-discretize-then-optimize strategy. We apply specific primal-dual IPMs to the discrete version of (1.1), and we develop multigrid preconditioners for the linear systems arising at each IPM iterate. As shown in Section 3, if the discrete optimal control problem is formulated appropriately, then the linear systems to be solved involve matrices of the form $\mathbf{D} + \mathbf{K}^*\mathbf{K}$, where $\mathbf{D}$ is a diagonal matrix, $\mathbf{K}$ is the discrete representation of $K$, and $\mathbf{K}^*$ is the adjoint of $\mathbf{K}$ with respect to a certain discrete inner product. When using standard finite elements, the matrices $\mathbf{K}$, $\mathbf{K}^*$ are dense; consequently, for large-scale problems, they cannot be formed, and the systems are solved using iterative methods. Since residuals can be computed at the equivalent cost of two applications of $\mathbf{K}$, residual computation is expected to be very expensive; therefore efficient preconditioners are critical for minimizing the number of necessary mat-vecs. Our strategy in this work is to adapt the multigrid techniques developed in [13], where the matrix $\mathbf{D}$ had the form $\beta\mathbf{I}$. To analyze the resulting multigrid preconditioner we interpret $\mathbf{D}$ as being the discretization of an operator $D_\lambda$, where $(D_\lambda u)(x) = \lambda(x)u(x)$ is the operator representing pointwise multiplication of a function $u$ with a smooth function $\lambda$.¹ From a technical perspective, our main accomplishment consists of showing that the operators of the type $KD_\lambda$, together with their discretizations, satisfy a set of smoothing conditions shown in [13] to be sufficient for the multigrid preconditioner to have the desired qualities.
For simplicity and concreteness we restrict most of our study to the two-dimensional case, and we consider a standard finite element discretization for $K$ using triangular elements and continuous piecewise linear functions. As will follow from the analysis, these techniques can easily be generalized to three dimensions and rectangular elements; however, the extension to higher degree finite elements is not obvious and forms the subject of current research.

This article is organized as follows: After formally introducing the discrete optimization problem in Section 2, we discuss the specific linear algebra requirements of the interior point methods in Section 3. Section 4 is central to this work, as it presents the analysis of the two-grid preconditioner, the main result being Theorem 4.9. In Section 5 we develop a multigrid preconditioner that preserves the qualities of the two-grid preconditioner. Further, we apply the methods to Examples A and B in Section 6 and show some numerical results.

2. Notation and discrete problem formulation. Throughout this paper we shall denote by $W^m_p(\Omega)$, $H^m(\Omega)$, $H^m_0(\Omega)$ (with $p \in [1, \infty]$, $m \in \mathbb{N}$) the standard Sobolev spaces, while $\|\cdot\|$ and $\langle\cdot,\cdot\rangle$ are the $L^2$-norm and inner product, respectively. Let $\widetilde{H}^{-m}(\Omega)$ be the dual (with respect to the $L^2$-inner product) of $H^m(\Omega) \cap H^1_0(\Omega)$ for $m \ge 0$. If $X$ is a Banach space then $\mathcal{L}(X)$ denotes the space of bounded linear operators on $X$. We regard square $n \times n$ matrices as operators in $\mathcal{L}(\mathbb{R}^n)$, and we write matrices and vectors using bold font. If $\mathbf{A}$ is a symmetric positive definite matrix, we denote by $(\mathbf{u}, \mathbf{v})_{\mathbf{A}} = \mathbf{v}^T\mathbf{A}\mathbf{u}$ the $\mathbf{A}$-dot product of two vectors $\mathbf{u}, \mathbf{v}$, and by $|\mathbf{u}|_{\mathbf{A}} = \sqrt{(\mathbf{u}, \mathbf{u})_{\mathbf{A}}}$ the $\mathbf{A}$-norm; if $\mathbf{A} = \mathbf{I}$ we drop the subscript from the inner product and norm. The space of $m \times n$ matrices is denoted by $\mathcal{M}_{m \times n}$; if $m = n$ we write $\mathcal{M}_n$ instead of $\mathcal{M}_{m \times n}$.
Given some norm $\|\cdot\|_s$ on a vector space $X$, and $T \in \mathcal{L}(X)$, we denote by $\|T\|_s$ the induced operator norm $\|T\|_s = \sup_{u \in X,\, \|u\|_s = 1}\|Tu\|_s$. Consequently, if $T \in \mathcal{L}(L^2(\Omega))$ then $\|T\|$ (no subscripts) is the $L^2$ operator norm of $T$. If $X$ is a Hilbert space and $T \in \mathcal{L}(X)$ then $T^* \in \mathcal{L}(X)$ denotes the adjoint of $T$.

We assume that $\Omega \subset \mathbb{R}^2$ is a bounded, polygonal domain and that $K$ is discretized using continuous piecewise linear functions on triangular elements. We consider the usual multigrid framework where the operator is discretized at several resolutions. Let $\mathcal{T}_{h_0}$ be a triangulation of the domain $\Omega$, and define $\mathcal{T}_{h/2}$ inductively to be the Goursat refinement of $\mathcal{T}_h$ for all $h \in I$ with $I = \{h_0/2^i : i = 0, 1, 2, \ldots\}$, where each triangle $T \in \mathcal{T}_h$ is cut along the three lines obtained by joining the midpoints of its edges. Note that $(\mathcal{T}_h)_{h \in I}$ is a quasi-uniform family of triangulations and the usual approximation properties hold [9]. We define

$V_h = \{u \in C(\overline{\Omega}) : u|_T$ is linear for all $T \in \mathcal{T}_h$, and $u|_{\partial\Omega} \equiv 0\}$,

so that $V_{2h} \subset V_h \subset H^1_0(\Omega)$. We should note that zero boundary conditions for the controls are consistent with the examples considered, and provide a convenient framework for the analysis, but alternative boundary conditions can be considered. For Example A, setting $u$ to be zero on $\partial\Omega$ is natural, while in Example B the boundary values of the control do not enter the discrete problem unless higher order cubatures are used in the discretization.

¹This can always be accomplished: $\lambda$ can be any smooth (here $C^2$ is sufficient) interpolant of the discrete function representing the diagonal of $\mathbf{D}$.

If $N_h = \dim(V_h)$ and $P^h_1, \ldots,$
$P^h_{N_h}$ are the nodes of $\mathcal{T}_h$ that lie in the interior of $\Omega$, let $I_h : C(\Omega) \to V_h$ be the standard interpolation operator

$I_h(u) = \sum_{i=1}^{N_h} u(P^h_i)\,\varphi^h_i$,

where $\varphi^h_i$, $i = 1, \ldots, N_h$, are the standard nodal basis functions. Given a family of positive weight functions $(w_h)_{h \in I} \subset V_h$ we define the mesh-dependent inner products

$\langle u, v\rangle_h = \sum_{i=1}^{N_h} w_h(P^h_i)\,u(P^h_i)\,v(P^h_i)$, for $u, v \in V_h$,

and let $|||u|||_h = \sqrt{\langle u, u\rangle_h}$. In order to satisfy $\langle\cdot,\cdot\rangle_h \approx \langle\cdot,\cdot\rangle$ as closely as possible, we replace exact integration on each triangle $T = \Delta P_1P_2P_3$ with the cubature

$\int_T f(x)\,dx \approx Q(f) = \frac{\mathrm{area}(T)}{3}\sum_{i=1}^{3} f(P_i)$.

This defines the weight functions $w_h$:

$w_h(P^h_i) = \frac{1}{3}\sum_{P^h_i \in T \in \mathcal{T}_h} \mathrm{area}(T)$.  (2.1)

Since the cubature $Q$ is exact for linear functions [35], we have

$\langle u, v\rangle_h = \int_\Omega I_h(uv)$, for all $u, v \in V_h$.

Moreover, if the grids are quasi-uniform, then $h^{-2}w_h$ are uniformly bounded and bounded away from $0$ with respect to $h \in I$; therefore by Lemma 6.2.7 in [9] there exist positive constants $C_1, C_2$ independent of $h$ such that

$C_1\|u\| \le |||u|||_h \le C_2\|u\|$, for all $u \in V_h$.  (2.2)

We should point out that the norm equivalence (2.2) extends to operator norms for operators in $\mathcal{L}(L^2(\Omega))$, which allows us to interchange $\|T\|$ with $|||T|||_h$ when needed, as long as we factor in a mesh-independent constant. We assume that for each $h \in I$ we are given a natural discretization $K_h \in \mathcal{L}(V_h)$ of $K$, so that $K, K_h$ satisfy the Smoothed Approximation Condition (SAC):

Condition 2.1 (SAC).
An operator $\mathcal{M}$ together with its discretization $\mathcal{M}_h$ is said to satisfy the Smoothed Approximation Condition if there exists a constant $C(\mathcal{M})$ depending on $\mathcal{M}$, $\Omega$, $\mathcal{T}_{h_0}$ and independent of $h$ so that

[a] smoothing: $\|\mathcal{M}u\|_{H^m(\Omega)} \le C(\mathcal{M})\|u\|$, for all $u \in L^2(\Omega)$, $m = 0, 1, 2$;  (2.3)

[b] smoothed approximation: $\|\mathcal{M}u - \mathcal{M}_h u\|_{H^m(\Omega)} \le C(\mathcal{M})\,h^{2-m}\|u\|$, for all $u \in V_h$, $m = 0, 1$, $h \in I$.  (2.4)

Given two discrete functions $\underline{u}_h, \overline{u}_h \in V_h$ representing $\underline{u}, \overline{u}$, respectively, we now define the discrete optimization problem

minimize $J^h_\beta(u) \stackrel{\mathrm{def}}{=} \frac{1}{2}|||K_h u - f_h|||^2_h + \frac{\beta}{2}|||u|||^2_h$, $u \in U^h_{ad}$,  (2.5)

where $f_h \in V_h$ represents $f$ and the set of discrete admissible solutions is given by

$U^h_{ad} = \{u \in V_h : \underline{u}_h(P^h_i) \le u(P^h_i) \le \overline{u}_h(P^h_i)$ for all $i = 1, \ldots, N_h\}$.

The formulation (2.5) calls for a few comments. First we remark that the use of the discrete norm $|||\cdot|||_h$ instead of $\|\cdot\|$ is essential in order for the linear systems to be solved at each outer iteration to have a form amenable to efficient multigrid preconditioning. In other words, we have chosen a discretization that allows for an efficient solution process. Second, if $\overline{u}$ and $-\underline{u}$ are convex and continuous (e.g., when they are constant), then the choice

$\underline{u}_h = I_h(\underline{u})$, $\overline{u}_h = I_h(\overline{u})$  (2.6)

implies $U^h_{ad} \subset U_{ad}$. This construction can easily be generalized to three dimensions and/or tensor-product finite elements. We obtain a matrix formulation of (2.5) by representing all vectors and operators using the standard nodal basis functions $\varphi^h_i$, $i = 1, \ldots, N_h$. More precisely, if we define $T : \mathbb{R}^{N_h} \to V_h$ by

$T(\mathbf{u}) = \sum_{i=1}^{N_h} u_i\varphi^h_i$, where $\mathbf{u} = [u_1, \ldots, u_{N_h}]^T$,

then the matrix $\mathbf{K}_h = T^{-1}K_hT$, regarded as an operator in $\mathcal{L}(\mathbb{R}^{N_h})$, represents $K_h$ with respect to the nodal basis.
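The weight functions of (2.1) are straightforward to assemble from any triangulation. The sketch below (our illustrative uniform triangulation of the unit square, not the paper's mesh) accumulates $\mathrm{area}(T)/3$ per vertex and checks that the resulting lumped inner product integrates constants exactly:

```python
import numpy as np

# Assemble the weight function w_h of (2.1) on a uniform triangulation of the
# unit square: each triangle T contributes area(T)/3 to each of its vertices.
m = 8                                    # m x m cells, two triangles per cell
pts = np.array([(i / m, j / m) for j in range(m + 1) for i in range(m + 1)])
idx = lambda i, j: j * (m + 1) + i
tris = []
for j in range(m):
    for i in range(m):
        a, b, c, d = idx(i, j), idx(i + 1, j), idx(i + 1, j + 1), idx(i, j + 1)
        tris += [(a, b, c), (a, c, d)]

w = np.zeros(len(pts))
for (a, b, c) in tris:
    p, q, r = pts[a], pts[b], pts[c]
    area = 0.5 * abs((q[0] - p[0]) * (r[1] - p[1])
                     - (q[1] - p[1]) * (r[0] - p[0]))
    for v in (a, b, c):
        w[v] += area / 3.0               # formula (2.1)

# <u,v>_h = sum_i w_i u(P_i) v(P_i) integrates I_h(uv); in particular
# <1,1>_h equals the area of the domain, and each interior weight is h^2.
print(round(w.sum(), 12))                # -> 1.0
```

On this uniform mesh every interior vertex touches six triangles of area $h^2/2$, so its weight is exactly $h^2$, consistent with the $h^{-2}w_h$ bounds preceding (2.2).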
If $\mathbf{W}_h$ is the diagonal matrix with diagonal entries $(w_h(P^h_i))_{1 \le i \le N_h}$, and $\mathbf{f}_h, \underline{\mathbf{u}}_h, \overline{\mathbf{u}}_h$ represent $f_h, \underline{u}_h, \overline{u}_h$, respectively, then (2.5) is equivalent to

minimize $J_\beta(\mathbf{u}) \stackrel{\mathrm{def}}{=} \frac{1}{2}|\mathbf{K}_h\mathbf{u} - \mathbf{f}_h|^2_{\mathbf{W}_h} + \frac{\beta}{2}|\mathbf{u}|^2_{\mathbf{W}_h}$, $\underline{\mathbf{u}}_h \le \mathbf{u} \le \overline{\mathbf{u}}_h$,  (2.7)

where the inequality $\mathbf{u} \le \mathbf{v}$ between vectors is understood coordinate-wise. When operating on a single grid we will omit the subscript $h$ for matrices and vectors. Existence and uniqueness of solutions for both (1.1) and (2.5) follow from the fact that $J_\beta, J^h_\beta$ are uniformly convex and $U_{ad}, U^h_{ad}$ are convex sets (e.g., see Theorem 1.43 in [26]). Furthermore, cf. Lemma 1.12 in [26], the solution $\widehat{u}$ of (1.1) is characterized by the following condition: there exist $\underline{\lambda}, \overline{\lambda} \in L^2(\Omega)$ so that

$(\beta I + K^*K)\widehat{u} + \overline{\lambda} - \underline{\lambda} = K^*f$,
$\widehat{u} \ge \underline{u}$, $\underline{\lambda} \ge 0$, $\underline{\lambda}(\widehat{u} - \underline{u}) = 0$ a.e.,
$\widehat{u} \le \overline{u}$, $\overline{\lambda} \ge 0$, $\overline{\lambda}(\overline{u} - \widehat{u}) = 0$ a.e.  (2.8)

3. Interior point methods and linear systems. In this section we briefly discuss the specifics of the interior point method we use for solving the discrete optimization problem (2.7), and we describe in detail the linear systems that need to be solved at each outer iteration. If we denote

$\mathbf{A} \stackrel{\mathrm{def}}{=} \beta\mathbf{W} + \mathbf{K}^T\mathbf{W}\mathbf{K}$,  (3.1)

then after a rearrangement of the terms in the objective function $J_\beta$ and dropping constant terms we write (2.7) as a regular convex quadratic problem with affine constraints in standard form:

minimize $\frac{1}{2}\mathbf{u}^T\mathbf{A}\mathbf{u} - (\mathbf{K}^T\mathbf{W}\mathbf{f})^T\mathbf{u}$ subj. to $\underline{\mathbf{u}} \le \mathbf{u} \le \overline{\mathbf{u}}$.  (3.2)

Since $\mathbf{W}$ is a diagonal matrix with positive entries, the matrix $\mathbf{A}$ is positive definite; therefore the above problem has a unique solution (see [32], p. 320). Let us denote the Lagrangian corresponding to the QP (3.2) by

$L(\mathbf{u}, \mathbf{v}_1, \mathbf{v}_2) = \frac{1}{2}\mathbf{u}^T\mathbf{A}\mathbf{u} - (\mathbf{K}^T\mathbf{W}\mathbf{f})^T\mathbf{u} - \mathbf{v}_2^T(\overline{\mathbf{u}} - \mathbf{u}) - \mathbf{v}_1^T(\mathbf{u} - \underline{\mathbf{u}})$,

where $\mathbf{v}_1, \mathbf{v}_2$ are vectors of non-negative multipliers corresponding to the inequality constraints.
Then the gradient and Hessian of the Lagrangian are given by

$\nabla_{\mathbf{u}}L(\mathbf{u}, \mathbf{v}_1, \mathbf{v}_2) = \mathbf{A}\mathbf{u} - \mathbf{K}^T\mathbf{W}\mathbf{f} + \mathbf{v}_2 - \mathbf{v}_1$, $\quad \nabla^2_{\mathbf{u}\mathbf{u}}L(\mathbf{u}, \mathbf{v}) = \mathbf{A}$.

Since the linear independence constraint qualification holds, the unique solution $\widehat{\mathbf{u}}$ of (3.2) satisfies the Karush-Kuhn-Tucker (KKT) conditions

$\mathbf{A}\widehat{\mathbf{u}} + \widehat{\mathbf{v}}_2 - \widehat{\mathbf{v}}_1 = \mathbf{K}^T\mathbf{W}\mathbf{f}$,
$\widehat{\mathbf{u}} \ge \underline{\mathbf{u}}$, $\widehat{\mathbf{v}}_1 \ge 0$, $\widehat{\mathbf{v}}_1 \cdot (\widehat{\mathbf{u}} - \underline{\mathbf{u}}) = 0$,
$\widehat{\mathbf{u}} \le \overline{\mathbf{u}}$, $\widehat{\mathbf{v}}_2 \ge 0$, $\widehat{\mathbf{v}}_2 \cdot (\overline{\mathbf{u}} - \widehat{\mathbf{u}}) = 0$,  (3.3)

where $\widehat{\mathbf{v}}_1, \widehat{\mathbf{v}}_2$ are the multipliers, and $\mathbf{u} \cdot \mathbf{s}$ denotes the component-wise product. Moreover, since $\nabla^2_{\mathbf{u}\mathbf{u}}L$ is positive definite, the above KKT conditions are also sufficient. The primal-dual IPM consists of solving the perturbed KKT system

$\mathbf{A}\mathbf{u} + \mathbf{v}_2 - \mathbf{v}_1 = \mathbf{K}^T\mathbf{W}\mathbf{f}$,
$\mathbf{u} > \underline{\mathbf{u}}$, $\mathbf{v}_1 > 0$, $\mathbf{v}_1 \cdot (\mathbf{u} - \underline{\mathbf{u}}) = \mu\mathbf{e}$,
$\mathbf{u} < \overline{\mathbf{u}}$, $\mathbf{v}_2 > 0$, $\mathbf{v}_2 \cdot (\overline{\mathbf{u}} - \mathbf{u}) = \mu\mathbf{e}$,  (3.4)

whose one-parameter family of solutions $(\widehat{\mathbf{u}}(\mu), \widehat{\mathbf{v}}_1(\mu), \widehat{\mathbf{v}}_2(\mu))$ defines the central path. As usual, $\mathbf{e} = [1, 1, \ldots, 1]^T \in \mathbb{R}^{N_h}$. Practical IPM algorithms produce solutions that lie sufficiently close to the central path and converge rapidly to $(\widehat{\mathbf{u}}, \widehat{\mathbf{v}}_1, \widehat{\mathbf{v}}_2)$. An example of such a method is Mehrotra's predictor-corrector algorithm. Initially introduced for linear programming [31], the method was successfully adapted to convex QPs and has emerged in the last fifteen years as the (arguably) most practical and efficient algorithm for this class of problems. For this project we used Matlab to implement Mehrotra's method for convex QPs from OOQP (see [16] for details).
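At each iteration, Newton-type steps for (3.4) allow the multiplier updates to be eliminated through the diagonal complementarity blocks, leaving a single symmetric positive definite system in the control update alone. The sketch below (random illustrative data; all names are ad hoc) verifies that this block elimination reproduces the same update as a direct solve of the full system:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
# Illustrative ingredients: a random SPD matrix A and a strictly feasible
# iterate u between lower bound lo and upper bound hi, with positive multipliers.
B = rng.standard_normal((n, n)); A = B @ B.T + n * np.eye(n)
lo, hi = -np.ones(n), np.ones(n)
u = rng.uniform(-0.9, 0.9, n)
v1, v2 = rng.uniform(0.5, 1.5, n), rng.uniform(0.5, 1.5, n)
ru, rv1, rv2 = (rng.standard_normal(n), rng.standard_normal(n),
                rng.standard_normal(n))

V1, V2 = np.diag(v1), np.diag(v2)
Dlo, Dhi = np.diag(u - lo), np.diag(hi - u)   # the diagonal slack blocks

# Full 3x3-block Newton system in (du, dv1, dv2)
Z, I = np.zeros((n, n)), np.eye(n)
M = np.block([[A, -I, I], [V1, Dlo, Z], [-V2, Z, Dhi]])
sol = np.linalg.solve(M, np.concatenate([ru, rv1, rv2]))
du_full = sol[:n]

# Reduced SPD system: eliminate dv1, dv2 via the diagonal blocks
r = ru + rv1 / (u - lo) - rv2 / (hi - u)
Ared = A + np.diag(v1 / (u - lo)) + np.diag(v2 / (hi - u))
du = np.linalg.solve(Ared, r)
print(np.allclose(du, du_full))               # -> True
```

Because the eliminated blocks are diagonal, the reduction costs only component-wise divisions, which is why the reduced form is the one worth preconditioning.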
To describe the method we first consider the linear system defining the Newton direction $(\delta\mathbf{u}, \delta\mathbf{v}_1, \delta\mathbf{v}_2)$ for (3.4):

$\mathbf{A}\delta\mathbf{u} + \delta\mathbf{v}_2 - \delta\mathbf{v}_1 = \mathbf{K}^T\mathbf{W}\mathbf{f} - \mathbf{A}\mathbf{u} - \mathbf{v}_2 + \mathbf{v}_1$,
$\mathbf{V}_1\delta\mathbf{u} + (\mathbf{U} - \underline{\mathbf{U}})\delta\mathbf{v}_1 = \mu\mathbf{e} - \mathbf{v}_1 \cdot (\mathbf{u} - \underline{\mathbf{u}})$,
$-\mathbf{V}_2\delta\mathbf{u} + (\overline{\mathbf{U}} - \mathbf{U})\delta\mathbf{v}_2 = \mu\mathbf{e} - \mathbf{v}_2 \cdot (\overline{\mathbf{u}} - \mathbf{u})$,  (3.5)

where $\underline{\mathbf{U}}, \mathbf{U}, \overline{\mathbf{U}}, \mathbf{V}_1$, and $\mathbf{V}_2$ are diagonal matrices with diagonals given by the vectors $\underline{\mathbf{u}}, \mathbf{u}, \overline{\mathbf{u}}, \mathbf{v}_1$, and $\mathbf{v}_2$, respectively. In Mehrotra's algorithm, given the current iterate $(\mathbf{u}, \mathbf{v}_1, \mathbf{v}_2)$, one first computes the predictor direction $(\delta\mathbf{u}^a, \delta\mathbf{v}^a_1, \delta\mathbf{v}^a_2)$ as the solution of (3.5) with $\mu = 0$. Second, the corrector direction $(\delta\mathbf{u}, \delta\mathbf{v}_1, \delta\mathbf{v}_2)$ is the solution of a linear system that differs from (3.5) only in the right-hand side, namely:

$\mathbf{A}\delta\mathbf{u} + \delta\mathbf{v}_2 - \delta\mathbf{v}_1 = \mathbf{K}^T\mathbf{W}\mathbf{f} - \mathbf{A}\mathbf{u} - \mathbf{v}_2 + \mathbf{v}_1$,
$\mathbf{V}_1\delta\mathbf{u} + (\mathbf{U} - \underline{\mathbf{U}})\delta\mathbf{v}_1 = \sigma\mu\mathbf{e} - \mathbf{v}_1 \cdot (\mathbf{u} - \underline{\mathbf{u}}) + \delta\mathbf{u}^a \cdot \delta\mathbf{v}^a_1$,
$-\mathbf{V}_2\delta\mathbf{u} + (\overline{\mathbf{U}} - \mathbf{U})\delta\mathbf{v}_2 = \sigma\mu\mathbf{e} - \mathbf{v}_2 \cdot (\overline{\mathbf{u}} - \mathbf{u}) + \delta\mathbf{u}^a \cdot \delta\mathbf{v}^a_2$,  (3.6)

where $\mu = [(\mathbf{u} - \underline{\mathbf{u}})^T\mathbf{v}_1 + (\overline{\mathbf{u}} - \mathbf{u})^T\mathbf{v}_2]/(2N_h)$, and $\sigma > 0$ is a centering parameter computed according to Mehrotra's heuristic. Therefore, both the predictor and the corrector step involve a system — the augmented system — of the form

$\begin{bmatrix} \mathbf{A} & -\mathbf{I} & \mathbf{I} \\ \mathbf{V}_1 & (\mathbf{U} - \underline{\mathbf{U}}) & \mathbf{0} \\ -\mathbf{V}_2 & \mathbf{0} & (\overline{\mathbf{U}} - \mathbf{U}) \end{bmatrix}\begin{bmatrix} \delta\mathbf{u} \\ \delta\mathbf{v}_1 \\ \delta\mathbf{v}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{r}_u \\ \mathbf{r}_{v_1} \\ \mathbf{r}_{v_2} \end{bmatrix}$.  (3.7)

Since the matrices in the second and third block rows of (3.7) are diagonal, we can substitute $\delta\mathbf{v}_1 = (\mathbf{U} - \underline{\mathbf{U}})^{-1}(\mathbf{r}_{v_1} - \mathbf{V}_1\delta\mathbf{u})$ and $\delta\mathbf{v}_2 = (\overline{\mathbf{U}} - \mathbf{U})^{-1}(\mathbf{r}_{v_2} + \mathbf{V}_2\delta\mathbf{u})$ into the first block row to obtain the reduced system

$[\mathbf{A} + (\mathbf{U} - \underline{\mathbf{U}})^{-1}\mathbf{V}_1 + (\overline{\mathbf{U}} - \mathbf{U})^{-1}\mathbf{V}_2]\,\delta\mathbf{u} = \mathbf{r}$  (3.8)

with $\mathbf{r} = \mathbf{r}_u + (\mathbf{U} - \underline{\mathbf{U}})^{-1}\mathbf{r}_{v_1} - (\overline{\mathbf{U}} - \mathbf{U})^{-1}\mathbf{r}_{v_2}$. We note that the matrix of the reduced system (3.8) is symmetric positive definite, while the matrix of (3.7) is similar to the symmetric indefinite matrix

$\mathbf{C} = \begin{bmatrix} \mathbf{A} & \mathbf{I} & \mathbf{I} \\ \mathbf{I} & -\mathbf{V}_1^{-1}(\mathbf{U} - \underline{\mathbf{U}}) & \mathbf{0} \\ \mathbf{I} & \mathbf{0} & -\mathbf{V}_2^{-1}(\overline{\mathbf{U}} - \mathbf{U}) \end{bmatrix}$.

If strict complementarity holds for at least one coordinate in the pair $(\widehat{\mathbf{u}}, \widehat{\mathbf{v}}_1)$ (resp.
$(\widehat{\mathbf{u}}, \widehat{\mathbf{v}}_2)$), then the diagonal matrix $\mathbf{V}_1^{-1}(\mathbf{U} - \underline{\mathbf{U}})$ (resp. $\mathbf{V}_2^{-1}(\overline{\mathbf{U}} - \mathbf{U})$) has increasingly small and/or large entries as the interior point algorithm approaches the solution. It is easy to see that the largest eigenvalue of $\mathbf{C}$ is larger than any of the diagonal entries of $\mathbf{V}_1^{-1}(\mathbf{U} - \underline{\mathbf{U}})$ or $\mathbf{V}_2^{-1}(\overline{\mathbf{U}} - \mathbf{U})$, while the smallest (in absolute value) of its eigenvalues are $O(1)$. Therefore the matrix $\mathbf{C}$, and hence the system (3.7), is ill-conditioned. Several equivalent reformulations of the system (3.7) are proposed in [4] (also see [17]). As in [2, 13], in this article we use the reduced form (3.8) which, for the problem under study, is symmetric positive definite. While (3.8) also suffers from the well-known ill-conditioning of interior point methods, and in fact is even more ill-conditioned than (3.7) if $\beta \ll 1$, the condition number of (3.8) can be significantly reduced first by rescaling both the unknowns and the equations (left- and right-preconditioning), as shown below, and then by two-grid and multigrid preconditioning, as discussed in Sections 4 and 5.

Given a vector $\mathbf{m} = [m_1, m_2, \ldots]^T$ we denote by $\mathbf{D}_{\mathbf{m}}$ the diagonal matrix with entries $(\mathbf{D}_{\mathbf{m}})_{ii} = m_i$, and by $\mathbf{p} \cdot/ \mathbf{m}$ the vector $[p_1/m_1, p_2/m_2, \ldots]^T$. With $\mathbf{A}$ as in (3.1), $\mathbf{w} = \mathbf{w}_h = [w_1, w_2, \ldots, w_{N_h}]^T$, and

$\mathbf{m} = \mathbf{v}_1 \cdot/ (\mathbf{u} - \underline{\mathbf{u}}) + \mathbf{v}_2 \cdot/ (\overline{\mathbf{u}} - \mathbf{u})$,

the reduced system (3.8) can be written as

$(\mathbf{D}_{\mathbf{m}} + \beta\mathbf{W} + \mathbf{K}^T\mathbf{W}\mathbf{K})\,\delta\mathbf{u} = \mathbf{r}$.  (3.9)

Left multiplication by $\mathbf{W}^{-1}$ further yields

$(\mathbf{D}_{(\mathbf{m}/\mathbf{w}) + \beta\mathbf{e}} + \mathbf{W}^{-1}\mathbf{K}^T\mathbf{W}\mathbf{K})\,\delta\mathbf{u} = \mathbf{W}^{-1}\mathbf{r}$.  (3.10)

Let $\mathbf{p} = \sqrt{(\mathbf{m}/\mathbf{w}) + \beta\mathbf{e}}$ (component-wise). By rescaling $\delta\mathbf{u}' = \mathbf{D}_{\mathbf{p}}\delta\mathbf{u}$ and factoring out $\mathbf{D}_{\mathbf{p}}$ in (3.10), the system becomes

$(\mathbf{I} + \mathbf{W}^{-1}\mathbf{L}^T\mathbf{W}\mathbf{L})\,\delta\mathbf{u}' = \mathbf{D}_{1\cdot/\mathbf{p}}\mathbf{W}^{-1}\mathbf{r}$,  (3.11)

where $\mathbf{L} = \mathbf{L}_h = \mathbf{K}_h\mathbf{D}_{1\cdot/\mathbf{p}}$, and we used the fact that the diagonal matrices $\mathbf{W}^{-1}$ and $\mathbf{D}_{1\cdot/\mathbf{p}}$ commute.
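The effect of this rescaling on conditioning can be seen in a small sketch (illustrative random data; the choices of K, w, and m below are ad hoc stand-ins, with m spread over many orders of magnitude as happens when bounds become active near an IPM solution):

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 40, 1e-2
# K: a contraction with decaying singular values (stand-in for a compact map);
# w: positive lumped weights; m: IPM-style multiplier/slack ratios.
K = 0.5 * np.linalg.qr(rng.standard_normal((n, n)))[0] @ np.diag(0.9 ** np.arange(n))
w = rng.uniform(0.5, 1.5, n)
m = 10.0 ** rng.uniform(-6, 6, n)
W = np.diag(w)

# Reduced system matrix (3.9): its conditioning degrades with the spread of m
A39 = np.diag(m) + beta * W + K.T @ W @ K

# Rescaled form (3.12): I + H, with H = W^{-1} L^T W L and L = K D_{1/sqrt(p)}
p = m / w + beta
L = K @ np.diag(1.0 / np.sqrt(p))
H = np.diag(1.0 / w) @ L.T @ W @ L
G = np.eye(n) + H

c39, c312 = np.linalg.cond(A39), np.linalg.cond(G)
print(c39, c312)   # cond of (3.12) stays O(beta^{-1}||K||^2); (3.9) blows up
```

The rescaling does not change the solution (it is an exact diagonal left/right preconditioning), but it caps the condition number at a level independent of how close the iterate is to the bounds.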
We prefer to write (3.11) in the compact form

$(\mathbf{I} + \mathbf{H})\,\delta\mathbf{u}' = \mathbf{r}'$,  (3.12)

with $\mathbf{H} = \mathbf{H}_h \stackrel{\mathrm{def}}{=} (\mathbf{W}_h)^{-1}\mathbf{L}_h^T\mathbf{W}_h\mathbf{L}_h$ and $\mathbf{r}' = \mathbf{D}_{1\cdot/\mathbf{p}}\mathbf{W}^{-1}\mathbf{r}$. Note that the matrix in (3.9) is symmetric positive definite, while the matrix $\mathbf{H}$ is symmetric (and positive definite) with respect to the $\mathbf{W}$-dot product. Furthermore, it is easy to see that $\|\mathbf{H}\| = O(\beta^{-1}\|K\|^2)$, which implies that

$\mathrm{cond}(\mathbf{I} + \mathbf{H}) = O(\beta^{-1}\|K\|^2)$,

independently of the mesh parameter $h$. However, for the model problems considered, the matrix $\mathbf{I} + \mathbf{H}$ is dense (in standard representation), and therefore for large-scale problems it cannot be formed and/or stored. Matrix-vector multiplication can be performed at a cost equivalent to two applications of the matrix $\mathbf{K}$; hence residual computations are expensive. So (3.12) has to be solved using iterative methods, and for increased efficiency we need high-quality, matrix-free preconditioners. As it turns out, it is the system (3.12), rather than (3.9), that lends itself to good multigrid preconditioning. In the next sections we develop a multigrid preconditioner for (3.12) under the assumption that $\mathbf{m}$ represents a positive and relatively "smooth" function $\mu_h$.

4. The two-grid preconditioner. In this section we develop and analyze a two-grid preconditioner for the linear system (3.12). The work relies on the multigrid techniques developed by Drăgănescu and Dupont in [13] for (1.5). As will be shown, the constructed preconditioner has, under certain hypotheses, optimal order quality with respect to the discretization parameter $h$.

4.1. Algorithm design. For the purpose of algorithm design and analysis it is advantageous to regard (3.12) as an equation in $V_h$ rather than $\mathbb{R}^{N_h}$, so we have to identify the operator in $\mathcal{L}(V_h)$ that is represented by $\mathbf{H}_h$.
First we define for $\lambda \in L^\infty(\Omega)$ the multiplication-by-$\lambda$ operator $D_\lambda : L^2 \to L^2$ by $D_\lambda u = \lambda u$, and its discrete version $D_\lambda^h \in \mathcal L(V_h)$ by $D_\lambda^h u = I_h D_\lambda u$. Given a vector $m \in \mathbb R^{N_h}$ we define a function $\mu_h \in V_h$ by setting $\mu_h(P_i^h) = m_i$, $i = 1, \dots, N_h$; it follows that the diagonal matrix $D_{(m./w)+\beta}$ represents the operator $D^h_{(\mu_h/w_h)+\beta} = I_h D_{(\mu_h/w_h)+\beta}$. To simplify notation let

  $\lambda_h = (\mu_h / w_h) + \beta$.  (4.1)

Then the matrix $L_h = K_h D_{1./p}$ represents the operator

  $L_h = K_h D^h_{1/\sqrt{\lambda_h}}$.  (4.2)

If we denote by $L_h^*$ the dual of $L_h$ with respect to the $\langle\cdot,\cdot\rangle_h$-inner product, that is,

  $\langle L_h u, v\rangle_h = \langle u, L_h^* v\rangle_h$, for all $u, v \in V_h$,

then $L_h^*$ is represented by $W_h^{-1} L_h^T W_h$, so $H_h = (W_h)^{-1} L_h^T W_h L_h$ represents the operator

  $H_h \stackrel{\rm def}{=} L_h^* L_h$.  (4.3)

Hence, the operator we need to invert for solving (3.12) is

  $G_h = I + H_h$.  (4.4)

Note that the operator $G_h$ is symmetric with respect to $\langle\cdot,\cdot\rangle_h$, i.e., $G_h = G_h^*$.

The idea behind the proposed two-grid preconditioner for $G_h$ lies in the "smoothing" properties of $L_h$. More precisely, we regard $L_h$ as a discretization of $L = K D_{1/\sqrt\lambda}$ for some function $\lambda$ for which $I_h(\lambda) = \lambda_h$. If $\lambda$ (assumed to satisfy $\lambda \ge \beta > 0$) is relatively smooth (e.g., it can always be chosen to be $C^2$), then the multiplication operator $D_{1/\sqrt\lambda}$ is neither smoothing nor roughening, so the follow-on application of $K$ results in smoothing. An alternative point of view is that $D_{1/\sqrt\lambda}$ is bounded in $\mathcal L(L^2)$, and if $K$ is compact, then $K D_{1/\sqrt\lambda}$ is also compact. Hence, it is natural to assume that $H_h = L_h^* L_h$ is "smoothing", even though $L_h^*$ has no direct connection with the dual of $K D_{1/\sqrt\lambda}$ in $\mathcal L(L^2)$.

We consider the $L^2$-orthogonal splitting of the discrete space

  $V_h = V_{2h} \oplus W_{2h}$,  (4.5)

and let $\pi = \pi_{2h}$ be the $L^2$-projector onto $V_{2h}$.
Following [33, 23, 12, 13], we propose

  $N_h = \rho + G_{2h}\,\pi$  (4.6)

as a two-grid preconditioner, where $\rho = \rho_{2h} = I - \pi_{2h}$ is the projector onto $W_{2h}$, and the coarse function $\lambda_{2h}$ entering the definition of $G_{2h}$ is given by

  $\lambda_{2h} = I_{2h}\,\lambda_h$.  (4.7)

The operator $N_h$ can also be regarded as an additive Schwarz preconditioner with respect to the splitting (4.5) (see [23]). Moreover, the inverse of $N_h$ is given by

  $S_h \stackrel{\rm def}{=} N_h^{-1} = \rho + G_{2h}^{-1}\,\pi$.  (4.8)

For developing a multigrid algorithm of comparable quality with the two-grid preconditioner we follow the same strategy as in [12, 13], which we briefly outline in Section 5. As shown in the analysis, the use of the $L^2$-projector $\pi_{2h}$ in the definition of $S_h$, as opposed to other projectors or restriction operators, turns out to be critical for the quality of the preconditioner. Unfortunately $\pi_{2h}$ is not symmetric with respect to $\langle\cdot,\cdot\rangle_h$, therefore $N_h$ and $S_h$ are symmetric neither with respect to $\langle\cdot,\cdot\rangle_h$ nor to $\langle\cdot,\cdot\rangle_{L^2}$, but they are almost symmetric. At the root of this problem lies the use of mesh-dependent norms in the formulation of the discrete optimization problem (2.5), which in turn was necessary for the linear systems inner to the interior point algorithms to be of the form (3.9), that is, to have a diagonal matrix $D_{m+\beta w}$ added to $K^T W K$. Had we used the exact $L^2$-norm in the discrete formulation (2.5), then the matrix in the system (3.9) would have had the form $D_m + \beta M + K^T M K$, where $M$ is the mass matrix, and this form is less convenient for preconditioning.

In order to describe the matrices representing $N_h$ and $S_h$ we consider the prolongation operator $J_h \in \mathcal M_{N_h \times N_{2h}}$ representing the natural embedding of $V_{2h}$ into $V_h$, and we define the restriction $R_{2h} \in \mathcal M_{N_{2h} \times N_h}$ by $R_{2h} = 2^{-d} J_h^T$, $d$ being the dimension of the ambient space (here $d = 2$).
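The identity $S_h = N_h^{-1}$ in (4.8) can be verified with small dense matrices. The sketch below is a hypothetical one-dimensional analogue ($d = 1$, so $R_{2h} = 2^{-1}J_h^T$, with rescaled mass matrices $h^{-1}\langle\varphi_i,\varphi_j\rangle$ for P1 elements) and an arbitrary SPD stand-in for $G_{2h}$; the mesh sizes are illustrative only:

```python
import numpy as np

nc, nf = 7, 15                       # coarse / fine interior nodes, nf = 2*nc + 1
# prolongation J: each coarse P1 basis function expanded in the fine basis
J = np.zeros((nf, nc))
for j in range(nc):
    J[2 * j, j] += 0.5
    J[2 * j + 1, j] = 1.0
    J[2 * j + 2, j] += 0.5

# rescaled P1 mass matrix h^{-d} <phi_i, phi_j> on the fine mesh (d = 1)
Mh = np.diag(np.full(nf, 2 / 3)) \
    + np.diag(np.full(nf - 1, 1 / 6), 1) + np.diag(np.full(nf - 1, 1 / 6), -1)
d = 1
R = 2.0 ** (-d) * J.T                # restriction R_{2h} = 2^{-d} J^T
M2h = R @ Mh @ J                     # coarse rescaled mass matrix
Pi = np.linalg.solve(M2h, R @ Mh)    # matrix of the L2-projector pi_{2h}
P = J @ Pi                           # square projection matrix P_h
assert np.allclose(P @ P, P)
Q = np.eye(nf) - P

# any SPD coarse operator serves as a stand-in for G_{2h} in the algebra check
B = np.random.default_rng(0).standard_normal((nc, nc))
G2 = np.eye(nc) + B @ B.T
N = Q + J @ G2 @ Pi                  # two-grid preconditioner N_h
S = Q + J @ np.linalg.solve(G2, Pi)  # its claimed inverse S_h
assert np.allclose(S @ N, np.eye(nf))
```

The second assertion holds because $\Pi_{2h} J_h = I$ and $\Pi_{2h} Q_h = 0$, so the cross terms in $S_h N_h$ vanish.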
Then $\pi_{2h}$ is represented by the matrix

  $\Pi_{2h} = M_{2h}^{-1}\cdot R_{2h}\cdot M_h$,

where $M_h$ (resp. $M_{2h}$) is the rescaled mass matrix on the fine (resp. coarse) mesh, defined by $(M_h)_{ij} = h^{-d}\langle\varphi_i^h, \varphi_j^h\rangle$. Note that $\Pi_{2h}$ is a matrix of type $N_{2h}\times N_h$, and that $M_{2h} = R_{2h}\cdot M_h\cdot J_h$. Furthermore, the square $N_h\times N_h$ projection matrix is given by $P_h = J_h\cdot\Pi_{2h}$, so that $P_h^2 = P_h$. The projector $\rho = (I - \pi)$ is represented by the matrix $Q_h = (I - P_h) \in \mathcal M_{N_h\times N_h}$. Finally, $S_h$ is represented by

  $S_h = Q_h + J_h\,G_{2h}^{-1}\,\Pi_{2h}$.  (4.9)

Oftentimes in practice the exact projection $\Pi_{2h}$ in (4.9) is replaced by the restriction $R_{2h}$, and $Q_h$ is taken to be $(I - J_h R_{2h})$. While for certain problems this is a viable option [2], for the applications considered in this work the quality of the preconditioner is significantly diminished by this change.

4.2. Algorithm analysis. Our analysis consists of evaluating the quality of the two-grid preconditioner by eventually estimating the spectral distance $d_\sigma(S_h, G_h^{-1})$ (Theorem 4.9), and is performed over three steps. First we evaluate the norm-distance $\|G_h - N_h\|$ under the assumption that $H_h$ satisfies Condition 4.1 below. Second, we show that if $L$ and $L_h$ verify Condition 2.1, then $H_h$ satisfies Condition 4.1. Third, we show that $L$, $L_h$ satisfy Condition 2.1 (SAC) with a constant $C(L)$ depending on $\lambda$ and on $C(K)$, that is, the constant associated to $K$, $K_h$ satisfying SAC. For specific applications, the fact that $K$, $K_h$ satisfy SAC is normally verifiable, as is shown in Section 6.

Condition 4.1. The operators $H_h$ are symmetric with respect to $\langle\cdot,\cdot\rangle_h$, positive semidefinite, and uniformly bounded with respect to $h \in I$, that is, there exists a constant $C(H) > 0$ independent of $h$ such that

  $|||H_h|||_h \le C(H)$, for all $h \in I$.  (4.10)

Moreover, there exists $p > 0$ so that

  $\|(H_h - H_{2h}\pi_{2h})\,u\| \le C(H)\,h^p\,\|u\|$, for all $u \in V_h$, $h \in I$.  (4.11)

For linear splines the optimal approximation order is $p = 2$, but for certain problems and discretizations the actual rate may be suboptimal.

Lemma 4.2. If $H_h$ satisfies Condition 4.1, then

  $\|G_h - N_h\| \le C(H)\,h^p$.  (4.12)

Proof. This result is an immediate consequence of (4.11), since

  $G_h - N_h = (I + H_h) - (\rho + (I + H_{2h})\pi) = H_h - H_{2h}\pi$.

Verifying that (4.11) of Condition 4.1 holds under some general hypotheses is nontrivial for this problem due to the presence of multiple inner products that have to be taken into consideration. More precisely, if $L_h^*$ were the dual of $L_h$ with respect to $\langle\cdot,\cdot\rangle$ instead of $\langle\cdot,\cdot\rangle_h$, then Condition 4.1 would follow from the approximability of $L$ by $L_h$ together with the smoothing properties of $L$, as is shown in [13] (proof of Theorem 4.1). Hence a natural requirement is that $\langle\cdot,\cdot\rangle_h$ approximates $\langle\cdot,\cdot\rangle$ well.

Lemma 4.3. With $w_h$ chosen as in (2.1) there exists a constant $C = C(T_0) > 0$ independent of $h$ such that

  $|\langle u,v\rangle_h - \langle u,v\rangle| \le C\,h^2\,\|u\|_{H^1(\Omega)}\cdot\|v\|_{H^1(\Omega)}$, for all $u, v \in V_h$.  (4.13)

Proof. By Theorem 4.4.4 in [9], given $d = 2$, there exists a constant $C > 0$ that depends on the "chunkiness" of the initial triangulation $T_0$ but is independent of $h$ and of $T \in T_h$, so that for all $u, v \in V_h$

  $\|uv - I_h(uv)\|_{L^1(T)} \le C\,h^2\,|uv|_{W^2_1(T)} = C\,h^2\int_T |\nabla u\cdot\nabla v|$,

the equality holding because $u$ and $v$ are linear on $T$. After summing over all triangles $T \in T_h$ we obtain

  $\|uv - I_h(uv)\|_{L^1(\Omega)} \le C\,h^2\,\|\nabla u\cdot\nabla v\|_{L^1(\Omega)} \le C\,h^2\,|u|_{H^1(\Omega)}\cdot|v|_{H^1(\Omega)}$.

The conclusion follows from

  $|\langle u,v\rangle - \langle u,v\rangle_h| = \left|\int_\Omega (uv - I_h(uv))\right| \le \|uv - I_h(uv)\|_{L^1(\Omega)}$.
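The $O(h^2)$ gap of Lemma 4.3 between the mesh-dependent and the exact inner products is easy to observe numerically. The following is a minimal one-dimensional sketch (a simplification of the paper's 2-D setting; here $\langle u,u\rangle_h$ is a trapezoid-type lumped product with hypothetical weights $w_i = h$, halved at the boundary):

```python
import numpy as np

def ip_gap(n):
    # P1 interpolant u of sin(2*pi*x) on a uniform mesh of [0, 1] with n elements
    x = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n
    u = np.sin(2 * np.pi * x)
    # exact L2 product <u, u> of the piecewise-linear interpolant:
    # Simpson per element is exact, since u*u is quadratic on each element
    mid = ((u[:-1] + u[1:]) / 2) ** 2
    exact = (h / 6) * np.sum(u[:-1] ** 2 + 4 * mid + u[1:] ** 2)
    # mesh-dependent product <u, u>_h with trapezoid-type weights w_i
    w = np.full(n + 1, h)
    w[0] = w[-1] = h / 2
    lumped = np.sum(w * u * u)
    return abs(lumped - exact)

assert 3.5 < ip_gap(64) / ip_gap(128) < 4.5   # the gap decays like h^2
```

Halving $h$ reduces the gap by roughly a factor of 4, consistent with the $h^2$ bound in (4.13).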
Throughout this section we denote by $C(L)$ a generic constant that is proportional to the constant $C(L)$ of Condition 2.1, where the proportionality depends only on the domain $\Omega$ and the initial triangulation $T_0$. We define the restriction operator $R_{2h}^w : V_h \to V_{2h}$ by

  $\langle \iota_h u, v\rangle_h = \langle u, R_{2h}^w v\rangle_{2h}$, for all $u \in V_{2h}$, $v \in V_h$,

where $\iota_h : V_{2h} \to V_h$ is the inclusion operator. It follows that $R_{2h}^w$ is uniformly bounded with respect to $h \in I$, that is, there exists $C$ independent of $h \in I$ so that

  $\|R_{2h}^w u\| \le C\,\|u\|$, for all $u \in V_h$.  (4.14)

We call a triangulation $T$ locally symmetric if for every vertex $P$ the union of triangles in $T$ having $P$ as a corner is invariant with respect to the reflection through $P$ given by $r_P(x) = 2P - x$. If $T$ is locally symmetric and $\varphi_P$ is the nodal basis function at $P$, then $\varphi_P \circ r_P = \varphi_P$. Furthermore, a simple calculation shows that for any linear map $L(x) = a_1 x_1 + a_2 x_2$ we have

  $\int_\Omega \varphi_P(x)\,L(x - P)\,dx = 0$.  (4.15)

Naturally, a uniform mesh is locally symmetric. The following grouped results are either simple consequences of Condition 2.1 or extracted from [13].

Lemma 4.4.
If $L$, $L_h$ satisfy Condition 2.1 there exist constants $C(L)$ and $C' = C'(\Omega)$ independent of $h$ such that the following hold:

(a) $H^1$-, $L^2$-uniform stability of $L_h$:

  $\|L_h u\|_{H^m(\Omega)} \le C(L)\,\|u\|$, for all $u \in V_h$, $m = 0, 1$;  (4.16)

(b) smoothing in negative-index norms:

  $\|L u\| \le C(L)\,\|u\|_{\widetilde H^{-m}}$, for all $u \in V_h$, $m = 1, 2$;  (4.17)

(c) negative-index norm approximation of the identity by $\pi_{2h}$, $R_{2h}^w$:

  $\|(I - \pi_{2h})\,u\|_{\widetilde H^{-2}(\Omega)} \le C'\,h^2\,\|u\|$, for all $u \in V_h$;  (4.18)

  $\|(I - R_{2h}^w)\,u\|_{\widetilde H^{-p}(\Omega)} \le C'\,h^p\,\|u\|$, for all $u \in V_h$,  (4.19)

where $p = 1$ on an unstructured grid, and $p = 2$ on a locally symmetric grid;

(d) $L$ diminishes high frequencies:

  $\|L(I - \pi_{2h})\,u\| \le C(L)\,h^2\,\|u\|$, for all $u \in V_h$;  (4.20)

  $\|L(I - R_{2h}^w)\,u\| \le C(L)\,h^p\,\|u\|$, for all $u \in V_h$,  (4.21)

where $p = 1$ on an unstructured grid, and $p = 2$ on a locally symmetric grid;

(e) $|\langle Lu, Lv\rangle - \langle L_h u, L_h v\rangle| \le C(L)\,h^2\,\|u\|\cdot\|v\|$, for all $u, v \in V_h$.  (4.22)

Proof. The stability conditions at (a) are direct consequences of (2.3) and (2.4), and (b) follows from (2.3) (see also [13], Corollary 6.2). The estimate (4.18) is a straightforward consequence of the Bramble-Hilbert Lemma [9], while (4.19) follows from Theorem 6.6 in [13] (see Example 6.7 for the uniform mesh case). The inequalities at (d) follow from (b) and (c), and (e) follows from (2.4) and the uniform boundedness of $L_h$.

Proposition 4.5. If the operators $L$, $L_h$ satisfy Condition 2.1 with the weights given by (2.1), then Condition 4.1 holds with $C(H) = C(L)$ and $p = 2$ if the meshes are locally symmetric, or $p = 1$ otherwise.

Proof. To simplify notation we write $\pi = \pi_{2h}$, $R = R_{2h}^w$. First we have

  $\langle L_{2h}^* L_{2h}\pi u, v\rangle_h = \langle L_{2h}^* L_{2h}\pi u, R v\rangle_{2h} = \langle L_{2h}\pi u, L_{2h} R v\rangle_{2h}$,

and

  $\langle L_h^* L_h u, v\rangle_h = \langle L_h u, L_h v\rangle_h$.
Therefore

  $|\langle L_{2h}^* L_{2h}\pi u, v\rangle_h - \langle L_h^* L_h u, v\rangle_h| = |\langle L_{2h}\pi u, L_{2h} R v\rangle_{2h} - \langle L_h u, L_h v\rangle_h| \le A_1 + A_2 + A_3 + A_4 + A_5$,

where

  $A_1 = |\langle L_{2h}\pi u, L_{2h} R v\rangle_{2h} - \langle L_{2h}\pi u, L_{2h} R v\rangle|$,
  $A_2 = |\langle L_{2h}\pi u, L_{2h} R v\rangle - \langle L\pi u, L R v\rangle|$,
  $A_3 = |\langle L\pi u, L R v\rangle - \langle L u, L v\rangle|$,
  $A_4 = |\langle L u, L v\rangle - \langle L_h u, L_h v\rangle|$,
  $A_5 = |\langle L_h u, L_h v\rangle - \langle L_h u, L_h v\rangle_h|$.

For $A_1$ we have, by (4.13), (4.16), and (4.14),

  $A_1 \le C(L)\,h^2\,\|L_{2h}\pi u\|_{H^1(\Omega)}\cdot\|L_{2h} R v\|_{H^1(\Omega)} \le C(L)\,h^2\,\|\pi u\|\cdot\|R v\| \le C(L)\,h^2\,\|u\|\cdot\|v\|$,

and a similar estimate holds for $A_5$. Also, (4.22) implies that $\max(A_2, A_4) \le C(L)\,h^2\,\|u\|\cdot\|v\|$. For $A_3$ we have, by (4.20) and (4.21),

  $A_3 \le |\langle L(\pi - I)u, L R v\rangle| + |\langle L u, L(R - I)v\rangle| \le C(L)\,h^p\,\|u\|\cdot\|v\|$.

Since $p \le 2$, $A_3$ is the weak link, and we have

  $|\langle (H_{2h}\pi - H_h)\,u, v\rangle_h| = |\langle (L_{2h}^* L_{2h}\pi - L_h^* L_h)\,u, v\rangle_h| \le C(L)\,h^p\,\|u\|\cdot\|v\|$, for all $v \in V_h$,

and the conclusion follows from the equivalence of $\|\cdot\|$ with $|||\cdot|||_h$.

For $\rho \in W^2_\infty(\Omega)$ we denote

  $\|\rho\|_{W^2_\infty(\Omega)/\mathbb R} = \max_{1\le|\alpha|\le 2}\|\partial^\alpha\rho\|_{L^\infty(\Omega)}$,

which effectively is the norm on the quotient space $W^2_\infty(\Omega)/\mathbb R$.

Lemma 4.6. If $\rho \in W^2_\infty(\Omega)$ then there exists a constant $C > 0$ independent of $\rho, h$ so that

  $\|(D_\rho - D^h_{\rho_h})\,u\| \le C\,h^2\,\|\rho\|_{W^2_\infty(\Omega)/\mathbb R}\,\|u\|_{H^1(\Omega)}$, for all $u \in V_h$, $h \in I$,  (4.23)

where $\rho_h = I_h(\rho)$.

Proof. Note first that $D^h_\rho = D^h_{\rho_h} = I_h D_\rho$, since only the node values of $\rho$ enter the definition of the operator $D^h_\rho$. Given $u \in V_h$, for each triangle $T \in T_h$ we have

  $|\rho u|_{H^2(T)} \le C\,\|\rho\|_{W^2_\infty(T)/\mathbb R}\cdot\|u\|_{H^1(T)}$,  (4.24)

because $u$ is linear on $T$. Therefore

  $\|(D_\rho - D^h_\rho)\,u\| = \|\rho u - I_h(\rho u)\| \le C\,h^2\Big(\sum_{T\in T_h}|\rho u|^2_{H^2(T)}\Big)^{1/2} \le C\,h^2\Big(\sum_{T\in T_h}\|\rho\|^2_{W^2_\infty(T)/\mathbb R}\,\|u\|^2_{H^1(T)}\Big)^{1/2} \le C\,h^2\,\|\rho\|_{W^2_\infty(\Omega)/\mathbb R}\,\|u\|_{H^1(\Omega)}$.

Lemma 4.7. If $\rho \in W^2_\infty(\Omega)$ then there exists a constant $C > 0$ independent of $\rho, h$ so that

  $\|(D_\rho - D^h_{\rho_h})\,u\|_{\widetilde H^{-1}(\Omega)} \le C\,h^p\,\|\rho\|_{W^p_\infty(\Omega)/\mathbb R}\,\|u\|$, for all $u \in V_h$, $h \in I$,  (4.25)

where $\rho_h = I_h(\rho)$, and $p = 2$ if the mesh is locally symmetric, otherwise $p = 1$.

Proof. We focus on the situation when the mesh is locally symmetric and leave the general case as an exercise. Let $u \in V_h$, $v \in H^1_0(\Omega)$ be arbitrary, and denote $S_i = \mathrm{supp}(\varphi_i^h)$. The constant $C$ is assumed to be independent of $u, v, \rho, h$. First remark that each triangle in $T_h$ lies in at most three of the sets $S_i$ and that

  $\mathrm{diam}(S_i) \le C\,h$, $1 \le i \le N_h$,  (4.26)

due to the quasi-uniformity of the meshes. Also note that

  $\rho\varphi_i^h - I_h(\rho\varphi_i^h) = \varphi_i^h\,(\rho - \rho(P_i^h))$, $1 \le i \le N_h$.  (4.27)

Further we define $v_i = \frac{1}{\mathrm{area}(S_i)}\int_{S_i} v$, for $1 \le i \le N_h$, and $u_i = u(P_i^h)$. Then $u = \sum_{i=1}^{N_h} u_i\,\varphi_i^h$ and

  $\langle(D_\rho - D^h_{\rho_h})\,u, v\rangle = \int_\Omega(\rho u - I_h(\rho u))\,v \overset{(4.27)}{=} \sum_{i=1}^{N_h} u_i \int_{S_i}\varphi_i^h\,(\rho - \rho(P_i^h))\,v \le \sum_{i=1}^{N_h}|u_i|\left(\left|\int_{S_i}\varphi_i^h\,(\rho - \rho(P_i^h))\,(v - v_i)\right| + \left|\int_{S_i}\varphi_i^h\,(\rho - \rho(P_i^h))\,v_i\right|\right)$.

For the first term in the sum above, using (4.26),

  $\left|\int_{S_i}\varphi_i^h\,(\rho - \rho(P_i^h))\,(v - v_i)\right| \le \|\varphi_i^h\|\cdot\|\rho - \rho(P_i^h)\|_{L^\infty(S_i)}\cdot\|v - v_i\|_{L^2(S_i)} \le C\,h^2\,\|\varphi_i^h\|\cdot|\rho|_{W^1_\infty(S_i)}\cdot|v|_{H^1(S_i)}$.

For the second term in the sum we take advantage of the local grid symmetry: by (4.15),

  $\left|\int_{S_i}\varphi_i^h\,(\rho - \rho(P_i^h))\,v_i\right| = \left|\int_{S_i} v_i\,\varphi_i^h(x)\big(\rho(x) - \rho(P_i^h) - d\rho_{P_i^h}(x - P_i^h)\big)\,dx\right| \le \|v_i\,\varphi_i^h\|_{L^1(S_i)}\cdot\|\rho - \rho(P_i^h) - d\rho_{P_i^h}(\cdot - P_i^h)\|_{L^\infty(S_i)} \le C\,h^2\,\|v_i\|_{L^2(S_i)}\,\|\varphi_i^h\|\cdot|\rho|_{W^2_\infty(S_i)}$.
Since $\|v_i\|_{L^2(S_i)} \le \|v\|_{L^2(S_i)}$ we now have

  $\left|\int_\Omega(\rho u - I_h(\rho u))\,v\right| \le C\,h^2\sum_{i=1}^{N_h}|u_i|\,\|\varphi_i^h\|\left(|\rho|_{W^1_\infty(S_i)}\cdot|v|_{H^1(S_i)} + \|v\|_{L^2(S_i)}\,|\rho|_{W^2_\infty(S_i)}\right) \le C\,h^2\,\|\rho\|_{W^2_\infty(\Omega)/\mathbb R}\sum_{i=1}^{N_h}|u_i|\cdot\|\varphi_i^h\|\cdot\|v\|_{H^1(S_i)} \le C\,h^2\,\|\rho\|_{W^2_\infty(\Omega)/\mathbb R}\Big(\sum_{i=1}^{N_h}|u_i|^2\,\|\varphi_i^h\|^2\Big)^{1/2}\Big(\sum_{i=1}^{N_h}\|v\|^2_{H^1(S_i)}\Big)^{1/2} \le C\,h^2\,\|\rho\|_{W^2_\infty(\Omega)/\mathbb R}\,\|u\|\cdot\|v\|_{H^1(\Omega)}$,

where for the last inequality we used the quasi-uniformity of the mesh and the fact that each triangle lies in at most three of the sets $S_i$. The conclusion follows after dividing by $\|v\|_{H^1(\Omega)}$ and taking the supremum over all $v \in H^1_0(\Omega)$.

Proposition 4.8. If the operators $K$, $K_h$ satisfy Condition 2.1 with $p = 2$ on the locally-symmetric meshes $T_h$ with the weights given by (2.1), then $L$, $L_h$ also satisfy Condition 2.1 with $C(L) = C(K)\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}$. If the meshes are not locally-symmetric, then the power of $h$ in Condition 2.1[b] for the operators $L$, $L_h$ is 1 for both $m = 0, 1$.

Proof. Note that for $\rho \in L^\infty(\Omega)$, $\|D_\rho u\| \le \|\rho\|_{L^\infty(\Omega)}\cdot\|u\|$, therefore

  $\|K D_{1/\sqrt\lambda}\,u\|_{H^m(\Omega)} \le C(K)\,\|D_{1/\sqrt\lambda}\,u\| \le C(K)\,\|\lambda^{-\frac12}\|_{L^\infty(\Omega)}\cdot\|u\|$, $m = 0, 1, 2$,

which implies the smoothing condition (2.3). For $u \in V_h$, using (2.3), (2.4), and (4.23),

  $\|(K D_{1/\sqrt\lambda} - K_h D^h_{1/\sqrt\lambda})\,u\|_{H^1(\Omega)} \le \|K(D_{1/\sqrt\lambda} - D^h_{1/\sqrt\lambda})\,u\|_{H^1(\Omega)} + \|(K - K_h)\,D^h_{1/\sqrt\lambda}\,u\|_{H^1(\Omega)} \le C(K)\,\|(D_{1/\sqrt\lambda} - D^h_{1/\sqrt\lambda})\,u\| + C(K)\,h\,\|D^h_{1/\sqrt\lambda}\,u\| \le C(K)\,h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)/\mathbb R}\,\|u\|_{H^1(\Omega)} + C(K)\,h\,\|\lambda^{-\frac12}\|_{L^\infty(\Omega)}\,\|u\| \le C(K)\,h\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}\,\|u\|$,

where in the last inequality we have used an inverse estimate.
This proves (2.4) for $K$, $K_h$ with $m = 1$ and concludes the proof for the non-locally-symmetric case. For $m = 2$ and a locally-symmetric mesh, using (4.17), (2.4), and (4.25),

  $\|(K D_{1/\sqrt\lambda} - K_h D^h_{1/\sqrt\lambda})\,u\| \le \|K(D_{1/\sqrt\lambda} - D^h_{1/\sqrt\lambda})\,u\| + \|(K - K_h)\,D^h_{1/\sqrt\lambda}\,u\| \le C(K)\,\|(D_{1/\sqrt\lambda} - D^h_{1/\sqrt\lambda})\,u\|_{\widetilde H^{-1}(\Omega)} + C(K)\,h^2\,\|D^h_{1/\sqrt\lambda}\,u\| \le C(K)\,h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)/\mathbb R}\,\|u\| + C(K)\,h^2\,\|\lambda^{-\frac12}\|_{L^\infty(\Omega)}\,\|u\| \le C(K)\,h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}\,\|u\|$.

We conclude the analysis of the two-grid preconditioner with the computation of the spectral distance (SD) $d_\sigma$ between $(G_h)^{-1}$ and the two-grid preconditioner $S_h$ defined in (4.8). This step facilitates a smooth transition from the two-grid analysis to the multigrid analysis of the next section. Given a Hilbert space $(X, \langle\cdot,\cdot\rangle)$ we denote by $\mathcal L^+(X)$ the set of operators with positive definite symmetric part:

  $\mathcal L^+(X) = \{T \in \mathcal L(X) : \langle Tu, u\rangle > 0, \ \forall\, u \in X \setminus \{0\}\}$.

First we define the joined numerical range of $S, T \in \mathcal L^+(X)$ by

  $W(S, T) = \left\{\dfrac{\langle S_{\mathbb C}\,w, w\rangle}{\langle T_{\mathbb C}\,w, w\rangle} : w \in X_{\mathbb C}\setminus\{0\}\right\}$,

where $T_{\mathbb C}(u + iv) = T(u) + i\,T(v)$ is the complexification of $T$. Note that if $T$ is symmetric positive definite, then $W(S, T)$ is simply the numerical range of $T^{-\frac12} S\,T^{-\frac12}$. The spectral distance between $S, T \in \mathcal L^+(X)$, introduced in [13] as a measure of spectral equivalence between $S$ and $T$, is defined by

  $d_\sigma(S, T) = \sup\{|\ln z| : z \in W(S, T)\}$,

where $\ln$ is the branch of the logarithm corresponding to $\mathbb C\setminus(-\infty, 0]$. Following Lemma 3.2 in [13], if $W(S, T) \subseteq B_\alpha(1) = \{z \in \mathbb C : |z - 1| < \alpha\}$ with $\alpha \in (0, 1)$, then

  $d_\sigma(S, T) \le \dfrac{|\ln(1 - \alpha)|}{\alpha}\,\sup\{|z - 1| : z \in W(S, T)\}$,  (4.28)

which offers a practical way to estimate the spectral distance when it is small.
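For symmetric positive definite $S$ and $T$ the definition reduces to the largest $|\ln|$ of an eigenvalue of $T^{-\frac12}ST^{-\frac12}$, which is easy to compute at small scale. The sketch below uses illustrative matrices only, and also checks two properties of $d_\sigma$ that hold in the SPD case (it vanishes on the diagonal and is symmetric in its arguments):

```python
import numpy as np

def spectral_distance(S, T):
    # d_sigma(S, T) for SPD S, T: largest |ln z| over the spectrum of
    # T^{-1/2} S T^{-1/2} (the numerical range is then a real interval)
    d, Q = np.linalg.eigh(T)
    T_inv_half = Q @ np.diag(d ** -0.5) @ Q.T
    z = np.linalg.eigvalsh(T_inv_half @ S @ T_inv_half)
    return np.abs(np.log(z)).max()

rng = np.random.default_rng(1)
n = 6
B = rng.standard_normal((n, n))
T = B @ B.T + n * np.eye(n)                 # SPD "exact" operator
S = T + 0.05 * np.eye(n)                    # nearby SPD "preconditioner"

assert spectral_distance(T, T) < 1e-8       # d_sigma(T, T) = 0
assert np.isclose(spectral_distance(S, T), spectral_distance(T, S))
```

The symmetry follows because the eigenvalues of $S^{-\frac12}TS^{-\frac12}$ are the reciprocals of those of $T^{-\frac12}ST^{-\frac12}$, and $|\ln(1/z)| = |\ln z|$.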
The spectral distance serves both as a means to quantify the quality of a preconditioner and as a convenient analysis tool for multigrid algorithms. Essentially, if two operators $S, T$ satisfy

  $1 - \delta \le \dfrac{\langle S_{\mathbb C}\,w, w\rangle}{\langle T_{\mathbb C}\,w, w\rangle} \le 1 + \delta$, for all $w \in X_{\mathbb C}\setminus\{0\}$,

with $\delta \ll 1$, then $d_\sigma(S, T) \approx \delta$. If $N \approx G^{-1}$ is a preconditioner for $G$, then both $d_\sigma(N, G^{-1})$ and $d_\sigma(N^{-1}, G)$ (quantities which are equal if $G, N$ are symmetric) are shown to control the spectral radius $\rho(I - NG)$ (see Lemma A.2 in Appendix A for a precise formulation), which is an accepted quality-measure for a preconditioner. The advantage of using $d_\sigma$ over $\rho(I - NG)$ is that the former is a true distance function.

Theorem 4.9. If the operators $K$, $K_h$ satisfy Condition 2.1 on the locally-symmetric meshes $T_h$ with the weights given by (2.1), and $\lambda \in W^2_\infty(\Omega)$ satisfies $I_h\lambda = \lambda_h$, there exist $C, \delta > 0$ independent of $h$, $\lambda$ so that for $h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)} \le \delta$

  $d_\sigma\big(G_h^{-1}, S_h\big) \le C\,h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}$,  (4.29)

where $G_h$ and $S_h$ are defined as in (4.4) and (4.8). If the meshes are not locally-symmetric then the power of $h$ in (4.29) is 1.

Proof. Again, we restrict our attention to the locally-symmetric case. The operator $G_h$ is symmetric positive definite with respect to $\langle\cdot,\cdot\rangle_h$ and satisfies

  $\langle G_h u, u\rangle_h = |||u|||_h^2 + |||L_h u|||_h^2 \ge |||u|||_h^2$.

Therefore $\sigma(G_h) \subseteq [1, \infty)$, and $|||G_h^\nu|||_h \le 1$ for all $\nu < 0$. Due to the norm equivalence $|||\cdot|||_h \sim \|\cdot\|$ there exists $C_1 > 0$ so that $\|G_h^{-\frac12}\| \le C_1$. By Lemma 4.2 and Propositions 4.5 and 4.8 we have

  $\|G_h - N_h\| \le C_2\,h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}$

for some constant $C_2 > 0$. Hence, for $C_3 = C_1^2\,C_2$,

  $\|I - G_h^{-\frac12} N_h\,G_h^{-\frac12}\| \le \|G_h^{-\frac12}\|^2\cdot\|G_h - N_h\| \le C_3\,h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}$.
Since for any operator $T$ we have $\|I - T^{-1}\| \le (1 - \alpha)^{-1}\,\|I - T\|$ whenever $\|I - T\| \le \alpha$, and since $S_h = N_h^{-1}$,

  $\|I - G_h^{\frac12} S_h\,G_h^{\frac12}\| \le \tfrac43\,\|I - G_h^{-\frac12} N_h\,G_h^{-\frac12}\| \le \tfrac{4 C_3}{3}\,h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}$,  (4.30)

provided $h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)} \le \delta = \tfrac{1}{4 C_3}$. By further restricting $\delta$, we can assume the right-hand side of (4.30) to be $\le 1/2$, which implies that $W(S_h, G_h^{-1}) \subset B_{\frac12}(1)$. By (4.28) we obtain

  $d_\sigma(S_h, G_h^{-1}) \le \tfrac{8\ln 2}{3}\,C_3\,h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}$.

The highlight of the last result is the presence of $O(h^2)$ (or $O(h)$ for general quasi-uniform meshes) in the right-hand side of (4.29), which is the optimal order of approximation in $h$. We should stress that for classical multigrid methods for differential equations one has $O(1)$ as the right-hand side estimate, which is sufficient for mesh-independence. In this case, if the theoretically introduced smooth function $\lambda$ could be the same for all meshes, the number of $S_h$-preconditioned iterations would be expected to decrease as $h \downarrow 0$. In reality, the discrete function $\lambda_h$ is tied to the Lagrange multipliers $v_1, v_2$, which in turn are related to (actually expected to converge to, as $\mu, h \downarrow 0$) the Lagrange multipliers $\underline\lambda, \overline\lambda$ of the continuous problem. Since in general the latter are only in $L^2$, the factor $\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}$ is expected to be unbounded as $\mu \downarrow 0$. Therefore the preconditioning qualities of $S_h$ are expected to increase with $h \downarrow 0$, but decrease with $\mu \downarrow 0$. Thus for large-scale, high-resolution problems, where $h \ll 1$, the presented method is expected to perform very well, especially in connection with the multigrid method discussed in the next section. However, for fixed $h$, as $\mu \downarrow 0$ in the IPM formulation and the approximate solution approaches $\hat u$, if the inequality constraints are active then the quality of the proposed preconditioner normally degrades.
The advantages or disadvantages of this method will ultimately be discussed based on numerical experiments in Section 6.

5. The multigrid preconditioner. While the two-grid preconditioner $S_h$ may be efficient in terms of number of iterations, it is expensive to apply. In this section we develop a multigrid preconditioner $S_h^{mg}$ that also satisfies the optimal order estimate (4.29) but has a lesser cost. Since the process of passing from a two-grid to a multigrid preconditioner of comparable quality has been analyzed in [13], we give here only a brief description. In this section we assume a finite number of grids $I_{max} = \{h_i\}_{0\le i\le i_{max}}$, $h_i = h_0\,2^{-i}$, and the goal is to ultimately construct an efficient multigrid preconditioner for the operator on the finest grid. Consider the operator $\mathcal I_{i-1}^i : \mathcal L(V_{h_{i-1}}) \to \mathcal L(V_{h_i})$ given by

  $\mathcal I_{i-1}^i(M) \stackrel{\rm def}{=} (I - \pi_{h_{i-1}}) + M\,\pi_{h_{i-1}}$.

By (4.8) we have $S_{h_i} = \mathcal I_{i-1}^i(G_{h_{i-1}}^{-1})$. If we define $S_h^V$, $h \in I_{max}$, recursively by

  $S_{h_i}^V = \begin{cases} G_{h_0}^{-1} & \text{if } i = 0,\\[2pt] \mathcal I_{i-1}^i(S_{h_{i-1}}^V) & \text{if } 1 \le i \le i_{max},\end{cases}$  (5.1)

then $S_h^V$ has a V-cycle structure. However, it is shown in [13] that $S_h^V$ is suboptimal, in that it satisfies (4.29) with $h^2$ replaced by $h_0^2$. Thus the quality of $S_h^V$ does not improve with $h \downarrow 0$, as desired; it is simply mesh-independent (it depends only on $h_0$). To achieve the desired result we define the operator $\mathcal N_i : \mathcal L(V_{h_i}) \to \mathcal L(V_{h_i})$,

  $\mathcal N_i(M) \stackrel{\rm def}{=} 2M - M\,G_{h_i}\,M$.

The latter is related to Newton's method for the operator-equation $X^{-1} - G_{h_i} = 0$; namely, if $X_0$ is a good guess at the solution, i.e., approximates $G_{h_i}^{-1}$ well, then the first Newton iterate starting at $X_0$ is $X_1 = \mathcal N_i(X_0)$ (see also Remark 3.11 in [13]). We define the multigrid preconditioner using the following algorithm:

Algorithm 1: Operator-form definition of $S_{h_i}^{mg}$
1. if i = 0
2.   $S_{h_0}^{mg} := G_{h_0}^{-1}$                                % coarsest level
3. else if i < i_max
4.   $S_{h_i}^{mg} := \mathcal N_i(\mathcal I_{i-1}^i(S_{h_{i-1}}^{mg}))$   % intermediate level
5. else
6.   $S_{h_i}^{mg} := \mathcal I_{i-1}^i(S_{h_{i-1}}^{mg})$        % finest level
7. end if

The key factor in Algorithm 1 is the application of $\mathcal N_i$ at Step 4, and here is why: while $G_{h_i}^{-1}$ is well approximated by $\mathcal I_{i-1}^i(S_{h_{i-1}}^{mg})$ provided that $G_{h_{i-1}}^{-1} \approx S_{h_{i-1}}^{mg}$ (recall that $G_{h_i}^{-1} \approx S_{h_i} = \mathcal I_{i-1}^i(G_{h_{i-1}}^{-1})$), an application of $\mathcal N_i$ brings $\mathcal I_{i-1}^i(S_{h_{i-1}}^{mg})$ even closer to $G_{h_i}^{-1}$. This step is critical if we want $d_\sigma(G_h^{-1}, S_h^{mg}) = O(d_\sigma(G_h^{-1}, S_h))$. Also, there are two main reasons for splitting the cases of intermediate vs. finest resolution, as opposed to just replacing $\mathcal I_{i-1}^i(S_{h_{i-1}}^V)$ with $\mathcal N_i(\mathcal I_{i-1}^i(S_{h_{i-1}}^V))$ in (5.1). First, we would like to have $S_h^{mg} = S_h$ for $h = h_{i_{max}}$ if only two grids are used. Second, the application of $\mathcal N_i$ includes a multiplication by $G_{h_i}$; since for the intended large-scale applications the finest-level mat-vec is expected to be very costly, we prefer that no such mat-vecs are computed inside the preconditioner.

Theorem 5.1. In the hypotheses of Theorem 4.9, and with $S_h^{mg}$ defined as in Algorithm 1, there exist $C, \delta > 0$ independent of $h$ and $\lambda$ so that for $h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)} \le \delta$

  $d_\sigma\big(G_h^{-1}, S_h^{mg}\big) \le C\,h^2\,\|\lambda^{-\frac12}\|_{W^2_\infty(\Omega)}$, for $h = h_{max}$.  (5.2)

The proof of Theorem 5.1 follows closely that of Theorem 5.4 in [13] and, in the interest of brevity, we do not give further details. Suffice it to say that the use of the spectral distance is instrumental, and that an essential ingredient is the symmetry (with respect to $\langle\cdot,\cdot\rangle_h$) of $G_h$. In practice, for large-scale problems, neither $G_h$ nor $S_h^{mg}$ are ever formed, so both are to be applied matrix-free.
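The effect of the map $\mathcal N_i(M) = 2M - M\,G\,M$ is easy to see in isolation: it is one step of Newton's method (often called Newton-Schulz iteration) for $X^{-1} - G = 0$, and the error $I - XG$ squares at each application. A minimal sketch, with an arbitrary SPD stand-in for $G_{h_i}$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10
B = rng.standard_normal((n, n)) / np.sqrt(n)
G = np.eye(n) + B @ B.T                  # SPD with spectrum in [1, ...), like G_h

X = np.eye(n) / np.linalg.norm(G, 2)     # rough initial guess for G^{-1}
for _ in range(10):
    X = 2 * X - X @ G @ X                # the map N(X) = 2X - X G X
assert np.linalg.norm(X @ G - np.eye(n), 2) < 1e-8
```

Since $I - \mathcal N(X)G = (I - XG)^2$, the initial error contracts quadratically, which is why a single application of $\mathcal N_i$ at Step 4 suffices to restore two-grid quality at each intermediate level.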
A simple verification shows that, given some $M \in \mathcal L(V_{h_i})$, the vector $\tilde u = (\mathcal N_i(M))\,r$ can be computed by setting $\tilde u := u_2$, where $u_{k+1} := u_k + M(r - G_{h_i}u_k)$ with $u_0 = 0$. Thus the matrix-free application of $S_h^{mg}$ is computed by the following function:

Algorithm 2: Matrix-free implementation of the action $u = S_h^{mg}\,r$
1. function u = MG(r, i)
2. if i = 0                                                  % coarsest level
3.   $u := G_{h_0}^{-1}\,r$                                  % direct or unpreconditioned CG solve
4. else
5.   $u := (I - \pi_{h_{i-1}})\,r + MG(\pi_{h_{i-1}}\,r,\ i-1)$
6.   if i < i_max                                            % intermediate level
7.     $r_1 := r - G_{h_i}\,u$
8.     $u_1 := (I - \pi_{h_{i-1}})\,r_1 + MG(\pi_{h_{i-1}}\,r_1,\ i-1)$
9.     $u := u + u_1$
10.  end if
11. end if

As can be readily seen, Algorithm 2 has a W-cycle structure. To estimate the cost of $MG(\cdot, i_{max})$ we denote by $T(i)$ the cost of applying $MG(\cdot, i)$ for $0 < i < i_{max}$. If we assume that one residual computation at level $i$ has complexity $O(N_{h_i}^2)$, and that the cost of computing an $L^2$-projection is negligible compared to that of a residual computation (this is reasonable for most applications since mass matrices are normally easy to invert), then the resulting recursion for the function $T$ reads:

  $T(i) = O(N_{h_i}^2) + 2\,T(i-1)$.

For $i = i_{max}$ the term $O(N_{h_i}^2)$ is replaced by the potentially smaller cost of just computing an $L^2$-projection. Given that $N_{h_{i-1}} \approx 2^{-d}\,N_{h_i}$ (here $d = 2$), a standard argument shows that

  $T(i_{max}) = O(N_{h_{i_{max}}}^2)$,

that is, a cost that is proportional to that of a residual computation. Another comment refers to a detail that is not very transparent in Algorithm 2, namely that coarse versions of $\lambda_{h_i}$ are necessary for each level, because

  $G_{h_i} = I + \big(D^{h_i}_{1/\sqrt{\lambda_{h_i}}}\big)^*\,(K_{h_i})^*\,K_{h_i}\,D^{h_i}_{1/\sqrt{\lambda_{h_i}}}$.
Since the original problem is solved starting at the finest level, where $\lambda_{h_{i_{max}}}$ is given by the optimization algorithm, the functions $\lambda_{h_i}$ are obtained by simply discarding the values of $\lambda_{h_{i_{max}}}$ at the finer nodes. The parameter $\beta$ is hidden in $\lambda$ and affects the process indirectly.

6. Applications and numerical examples. In this section we discuss two applications. The first is related to the inverse contamination problem studied in [2, 1], where $K$ is a time-$T$ solution operator of a parabolic equation. The second is a standard elliptic-constrained optimal control problem with additional box-constraints on the control.

6.1. Solution strategies and metrics for success. For both applications we apply Mehrotra's algorithm and we solve the inner linear systems using the multigrid preconditioner previously defined. In the absence of multiple grids, the linear systems (3.12) are solved using conjugate gradient (CG), while for more than one level we used MG-preconditioned conjugate gradient squared (CGS), because of the slight non-symmetry of the MG preconditioner. As a first metric we record the number of inner linear iterations needed at each outer iteration; secondly, we record the total number of finest-grid mat-vecs for the entire solution process. Recall that each outer iteration requires the solution of two linear systems with identical matrices, namely one for the predictor step and one for the corrector step; in the interest of the presentation we record only the linear iterations for the predictor step. Also, a small number of mat-vecs are required in the process in addition to those needed for the predictor-corrector solves, and are reflected in the count. With respect to the second metric we remark that the proposed algorithm is intended for large-scale problems, with the most expensive computation being the finest-scale residual computation.
The ultimate goal is to significantly reduce the total number of finest-scale mat-vecs, because this is expected to be directly linked with execution time in a truly large-scale computation. With regard to the second metric we would like the total number of finest-level mat-vecs to decrease with $h \downarrow 0$. As for the first metric (number of iterations) we would like to witness the following:

[a.] The number of MG-preconditioned CGS iterations should be less than half the number of unpreconditioned CG iterations (each CGS iteration involves two mat-vecs, while a CG iteration requires only one).

[b.] For a given resolution $h$, the number of MG-preconditioned CGS iterations should be relatively bounded with respect to the number of levels used, provided the coarsest level is sufficiently fine, as stated in Theorem 5.1.

[c.] Most importantly, the number of MG-preconditioned CGS iterations should decrease with $h \downarrow 0$; in other words, MG-preconditioned CGS becomes increasingly advantageous compared to CG as the problem size increases.

One word of caution, though: linear systems of different resolutions are not necessarily related in a direct fashion, since their "$\lambda$" is dictated by the evolution of Mehrotra's algorithm, which is slightly different for each resolution. For example, the tenth linear system to be solved in the IPM process at a resolution $2h$ is not necessarily some coarse version of the tenth system to be solved at resolution $h$. Also, we should point out that in all our tests we use a cold start, that is, we do not take advantage of results from coarser levels except for in the MG-solve of the inner linear systems. While for realistic applications "warm-start" strategies are essential, we restrict our attention to the way our multigrid technology plays a role in solving the inner linear systems.

6.2.
Time-reversal for a parabolic equation. We consider the problem of finding the initial state (the control) of a system governed by a parabolic equation, given the state at a later time T, under additional box-constraints on the control. Multigrid preconditioning for the unconstrained version of this problem was studied in detail in [12, 13], and for the case of space-time measurements in [2]. Formally, we consider the following parabolic initial value problem with periodic boundary conditions:

    ∂_t y − ∂_x(a ∂_x y + b y) + c y = 0,   on [0, 1] × (0, T],
    y(0, t) = y(1, t),  ∂_x y(0, t) = ∂_x y(1, t),   for t ∈ (0, T],      (6.1)
    y(x, 0) = u(x),   for x ∈ [0, 1],

where a > 0, b, c ≥ 0 are constants, and T > 0 is the end-time. For t > 0 we denote by S(t) ∈ L(L²([0, 1])) the time-t solution operator mapping the initial value onto y(·, t), i.e., u ↦ y(·, t), and let K = S(T). The discrete K_h is obtained by using a Galerkin formulation with continuous piecewise linear elements on a uniform grid for the spatial discretization and Crank-Nicolson in time. It is shown in [30] (see also [36]) that for sufficiently small h the following estimate holds:

    ||K u − K_h π_h u||_{H^m} ≤ C h^{2−m} ||u||,   ∀ u ∈ L²([0, 1]),  m = 0, 1,      (6.2)

where C = C(T), provided the time step k is proportional to the spatial resolution, k = C₁ h, with C₁ chosen to ensure stability. Consequently, space and time resolutions are refined at the same rate, and Condition 2.1 is verified, so our theory applies. The specific details (boundary conditions, constant advection, etc.) in this example were chosen for convenience; however, two and three spatial dimensions, other types of boundary conditions, as well as smoothly varying functions in place of the constants a, b, c are supported.

A direct verification of the convergence order.
As mentioned earlier, when running Mehrotra's algorithm at different resolutions, the added diagonal terms λ may not be in direct relationship with each other. Hence, in order to verify in practice the presence of h² in the estimate (4.29), we resort to an artificial context: we construct G_h based on a fixed function λ_h = I_h(sin) + β for h_j⁻¹ = 80 · 2^j, j = 0, 1, 2, 3, and we define the corresponding two-grid preconditioners N_h. Then we compute the "distances" d_h = max{|ln α| : α ∈ σ(G_h, N_h)}. Since d_h approximates the spectral distance of interest (because N_h is close to being symmetric; in fact we have d_h ≤ d_σ), we expect to see that d_h = O(h²). We repeat the experiment for β = 1, 0.1, 0.01. The results presented in Table 6.1, while not yet converged, give a strong indication of an asymptotic rate lim_{h→0} d_{2h}/d_h = 4.

Table 6.1
d_h = max{|ln α| : α ∈ σ(G_h, N_h)} for λ(x) ≈ sin(x) + β.

            β = 1              β = 0.1            β = 0.01
  h       d_h      rate      d_h      rate      d_h      rate
  1/80    0.0206   --        0.1127   --        0.2812   --
  1/160   0.0066   3.1342    0.0363   3.1078    0.1270   2.2140
  1/320   0.0020   3.3140    0.0102   3.5488    0.0445   2.8535
  1/640   0.0006   3.5199    0.0027   3.7365    0.0123   3.6284

Numerical study. We consider the "true" initial value u₀ supported on two intervals, with u₀ reaching the value 1 on one of the intervals and 1/2 on the other, then let f = K u₀. Specific values are a = 4 · 10⁻³, b = 0.4, c = 0, and T = 0.8. In Figure 6.1 we show u₀ and f, as well as the converged solution u_min of the box-constrained optimization problem with β = 10⁻³, a value chosen because of the relatively good (visual) agreement of u₀ with u_min. We run Mehrotra's algorithm for h = 2⁻¹⁰, 2⁻¹¹, 2⁻¹², 2⁻¹³, and for each h we test the solvers with 1, 2, and 3 levels.
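In a small dense setting, the quantity d_h above can be computed from the generalized eigenvalues of the pencil (G_h, N_h), i.e., the eigenvalues of N_h⁻¹ G_h. The following NumPy sketch illustrates the computation; the matrices G and N are random SPD stand-ins, not the actual G_h, N_h of this experiment:

```python
import numpy as np

def spectral_log_distance(G, N):
    """d = max{|ln a| : a in sigma(G, N)}, the generalized eigenvalues of the
    pencil (G, N) being the eigenvalues of N^{-1} G."""
    evals = np.linalg.eigvals(np.linalg.solve(N, G))
    # for a symmetric positive definite pair the eigenvalues are real and
    # positive; keep the real part to absorb roundoff
    return np.max(np.abs(np.log(evals.real)))

# toy SPD pair standing in for (G_h, N_h); N is a small perturbation of G
rng = np.random.default_rng(1)
M = rng.standard_normal((30, 30))
G = M.T @ M + 30 * np.eye(30)
N = G + 0.01 * np.eye(30)
d = spectral_log_distance(G, N)
```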
The number of linear iterations required by each of the linear solves in the predictor step is shown in Figure 6.2, while the corresponding values of ||λ^{−1/2}||_{W^2_∞} and µ are shown in the top two pictures of Figure 6.3.

Fig. 6.1. Solution with f = K u₀, β = 10⁻³, 0 ≤ u ≤ 1.

First we remark that the number of unpreconditioned CG iterations appears to be mesh-independent (top chart in Figure 6.2): essentially the curves representing the number of iterations for each of the resolutions more or less overlap. We notice only a slight increase in the number of iterations at higher resolutions. Second, from the middle chart in Figure 6.2, representing the number of two-grid CGS iterations, we infer that the number of two-grid preconditioned iterations consistently decreases with h ↓ 0, as desired.

Fig. 6.2. Number of iterations for each of the predictor-step linear systems solved (β = 10⁻³); panels: unpreconditioned CG, CGS with two-grid preconditioner, CGS with three-grid preconditioner, for resolutions 1024, 2048, 4096.

Fig. 6.3. Top: discrete ||λ_h^{−1/2}||_{W^2_∞} based on the discrete Laplacian, which should give an idea of the size of ||λ^{−1/2}||_{W^2_∞}; middle: log-plot of µ as a function of outer iteration; bottom: λ_h for h = 1/4096 at the last outer iterate (β = 10⁻³).
For example, at the twenty-fourth iteration these numbers are 8, 6, and 4, while at the twentieth they are 7, 4, 4. This phenomenon is repeated for the three-grid preconditioner, as can be seen from the bottom chart in Figure 6.2. Moreover, for the most part, the number of MG-preconditioned CGS iterations is significantly smaller than half the number of unpreconditioned CG iterations: e.g., for h = 1/4096, up to the thirteenth iteration (where µ is already down to approximately 10⁻⁸, see Figure 6.3) only one two-grid (or three-grid) MG-preconditioned CGS iteration is necessary, while up to 15 unpreconditioned CG iterations are needed. However, after the thirteenth iteration ||λ^{−1/2}||_{W^2_∞} shows a significant increase, as seen in the top chart of Figure 6.3, and the MG-preconditioned CGS is less efficient: for h = 1/4096, at the twenty-fourth iteration 4 two-grid iterations are needed compared to 13 CG iterations, a lesser advantage than at the earlier outer iterations. The bottom chart in Figure 6.3 shows the last computed λ_h at h = 1/4096, to give an idea of why the quantity ||λ^{−1/2}||_{W^2_∞} is so large. A comparison between the bottom and middle charts in Figure 6.2 shows that the number of MG-preconditioned iterations is not very sensitive to the number of levels, provided the coarsest mesh is sufficiently fine. In this example four levels would force a much too coarse base mesh, and produce unsatisfactory preconditioners. The last piece of evidence is the total count of finest-level mat-vecs, shown in Table 6.2. In this example, a mat-vec involves solving the advection-reaction-diffusion equation on [0, T].
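To illustrate what one such mat-vec involves, the following sketch applies a Crank-Nicolson time-T solution operator for (6.1), using periodic finite differences in place of the Galerkin discretization employed in the paper; the grid size, the single-plateau initial value, and the choice C₁ = 1 for the time step are hypothetical:

```python
import numpy as np

def time_T_solve(u0, a=4e-3, b=0.4, c=0.0, T=0.8):
    """Crank-Nicolson time-T solution operator for
    y_t - d/dx(a y_x + b y) + c y = 0 with periodic BCs
    (finite differences here, in place of the paper's Galerkin scheme)."""
    n = u0.size
    h = 1.0 / n
    k = h                           # time step k = C1*h with C1 = 1 (hypothetical)
    nsteps = int(round(T / k))
    I = np.eye(n)
    # periodic central differences: (D2 y)_i = (y_{i+1} - 2 y_i + y_{i-1}) / h^2
    D2 = (np.roll(I, 1, axis=1) - 2 * I + np.roll(I, -1, axis=1)) / h**2
    D1 = (np.roll(I, 1, axis=1) - np.roll(I, -1, axis=1)) / (2 * h)
    L = -a * D2 - b * D1 + c * I    # semi-discrete system y_t + L y = 0
    A = I + 0.5 * k * L             # Crank-Nicolson: A y^{m+1} = B y^m
    B = I - 0.5 * k * L
    y = u0.copy()
    for _ in range(nsteps):
        y = np.linalg.solve(A, B @ y)
    return y

x = np.linspace(0, 1, 64, endpoint=False)
u0 = np.where((x > 0.2) & (x < 0.4), 1.0, 0.0)   # single-plateau stand-in for u_0
f = time_T_solve(u0)
```

Since the advection term is skew-symmetric and the diffusion term is positive semidefinite, each Crank-Nicolson step conserves the mean and does not increase the L² norm, which is a quick sanity check for such a solver.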
The data clearly shows that, as h ↓ 0, the two-level solver becomes increasingly efficient in this metric compared to CG: the ratio goes from 581/728 for h = 1/1024 to 377/768 for h = 1/8192 = 2⁻¹³. We should also remark that the essential impediment to a more significant improvement over CG lies in the increase of ||λ^{−1/2}||_{W^2_∞} as h ↓ 0. As shown in Theorem 4.9, non-smoothness of λ^{−1/2} decreases the preconditioner's efficiency.

Table 6.2
Total number of fine-grid mat-vecs for the 1D reversed parabolic equation.

  h \ levels    1     2     3
  1/1024      728   581   661
  1/2048      740   463   489
  1/4096      764   403   425
  1/8192      768   377   403

Table 6.3
Total number of fine-grid mat-vecs for the 2D elliptic-constrained opt. ctrl. problem.

  h \ levels    1     2     3     4
  1/256       354   282   572    --
  1/512       355   220   250   452
  1/1024      355   198   210   224
  1/2048      363   172   174   174

6.3. An elliptic-constrained control problem. In this example we discuss the elliptic-constrained optimal control problem (1.4) from Example B, which is a standard test problem in PDE-constrained optimization [8, 26] and corresponds to (1.1) with K = ∆⁻¹. We consider a square domain with a continuous piecewise linear finite element discretization based on the standard three-lines triangular mesh². Standard estimates for finite element solutions of elliptic problems show that Condition 2.1 is verified [9].

Numerical study. Let Ω = [0, 1] × [0, 1], β = 10⁻⁶, and let f be the function satisfying ∆f = u₀, f|_{∂Ω} = 0, where u₀(x, y) = (3/2) sin(2πx) sin(2πy). With this selection of f, the choice u = u₀ would be a solution of (1.4) if β = 0 and no box constraints were present (or if [−3/2, 3/2] ⊆ [u̲, ū]). Here the bounds [u̲, ū] = [−1, 1] are active: without them, given that β ≪ 1, the solution would be close to u₀, that is, it would have a maximum (resp. minimum) close to 3/2 (resp. −3/2).
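The construction of the data f can be sketched with a standard 5-point finite-difference Laplacian in place of the finite element discretization used in the paper; the grid size below is a hypothetical choice:

```python
import numpy as np

def dirichlet_laplacian_1d(n):
    """Second-difference matrix on n interior points of (0, 1), h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
            + np.diag(np.ones(n - 1), -1)) / h**2

n = 31                                     # interior points per direction (hypothetical)
x = np.linspace(0, 1, n + 2)[1:-1]         # interior grid points
X, Y = np.meshgrid(x, x, indexing="ij")
u0 = 1.5 * np.sin(2 * np.pi * X) * np.sin(2 * np.pi * Y)

L1 = dirichlet_laplacian_1d(n)
I = np.eye(n)
Lap = np.kron(L1, I) + np.kron(I, L1)      # 5-point Laplacian with Dirichlet BCs
f = np.linalg.solve(Lap, u0.ravel()).reshape(n, n)   # Delta f = u_0, f = 0 on boundary
```

Since u₀ here is an eigenfunction of the Laplacian with eigenvalue −8π², the computed f should be close to −u₀/(8π²), which gives a convenient correctness check.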
The solution with h = 1/128 is depicted in Figure 6.4. We have solved the problem with h = 2⁻⁸, 2⁻⁹, 2⁻¹⁰, 2⁻¹¹ using one, two, three, and four levels (where appropriate), following the strategy described in Section 6.1.

²The three-lines mesh is obtained by dividing the square into equally sized squares with sides parallel to the coordinate axes, and by further cutting each little square along its slope-one diagonal.

Fig. 6.4. Solution with f satisfying ∆f = (3/2) sin(2πx) sin(2πy), β = 10⁻⁶, [u̲, ū] = [−1, 1].

Fig. 6.5. Number of iterations for each of the predictor-step linear systems solved (β = 10⁻⁶); panels: unpreconditioned CG, CGS with two-grid preconditioner, CGS with three-grid preconditioner, for resolutions 256 × 256, 512 × 512, 1024 × 1024.

As with the previous example, we show in Figure 6.5 the number of iterations required by each of the linear systems at the predictor step. The top plot shows the number of unpreconditioned iterations leveling off at 21. The middle plot again shows two facts: the number of MG-preconditioned CGS iterations decreases with h ↓ 0; in addition, for the finest grid, the number of two-grid preconditioned CGS iterations is less than 1/4 the number of iterations in the unpreconditioned case, even when looking beyond the eighth outer iteration, where, as before, roughness of the λ-function lowers the quality of the MG algorithm. For example, at the tenth outer iteration with h = 1/1024, CGS required 3 iterations, while 21 iterations were needed in the unpreconditioned case; at the eleventh outer iteration the numbers are 5 vs. 20.
Of course, each two-grid preconditioned CGS iteration is significantly more expensive than an unpreconditioned CG iteration, which is why the number of levels should be maximized. The bottom plot in Figure 6.5 shows that the two-level behavior is replicated using three-level preconditioners. Moreover, at fine resolutions (h ≤ 2⁻¹⁰), the numbers of required three-level preconditioned CGS iterations are not significantly higher than those of two-level CGS iterations. However, this is not the case at the low resolution h = 2⁻⁸, where the numbers of three-level CGS iterations required for the last three systems (not shown on the plot) are quite large: 45, 17, 58. This is why we insist that the MG-preconditioner is efficient only when the coarsest resolution used is sufficiently fine and the finest resolution h is small. With respect to the second metric, we show in Table 6.3 the total number of fine mat-vecs for each of the runs, and the results confirm that the MG-preconditioner becomes increasingly efficient as h ↓ 0. Recall that for this application a mat-vec requires solving the Poisson equation.

Appendix A. Some facts about the spectral distance. Throughout this section (X, ⟨·, ·⟩) is a real, finite-dimensional Hilbert space with norm ||·||. All operators in this section are assumed to be in L⁺(X) (see Section 4.2 for the definition) unless otherwise specified. The following inequalities were proved in [13] (Lemma 3.2):

Lemma A.1. If α ∈ (0, 1) and z ∈ B_α(1), then

    (ln(1 + α)/α) |1 − z| ≤ |ln z| ≤ (|ln(1 − α)|/α) |1 − z|.      (A.1)

For |ln z| ≤ δ we have

    ((1 − e^{−δ})/δ) |ln z| ≤ |1 − z| ≤ ((e^δ − 1)/δ) |ln z|.      (A.2)

Lemma A.2. Let L, G ∈ L⁺(X) be such that min{d_σ(L⁻¹, G), d_σ(L, G⁻¹)} ≤ δ. Then

    ρ(I − LG) ≤ ((e^δ − 1)/δ) min{d_σ(L⁻¹, G), d_σ(L, G⁻¹)}.      (A.3)

Proof.
If λ ∈ σ(I − LG), then there exists a unit vector u ∈ X_C such that (I − LG)u = λu, therefore

    (1 − λ)u = LGu.      (A.4)

After left-multiplying by L⁻¹ and taking the inner product with u we obtain (1 − λ)⟨L⁻¹u, u⟩ = ⟨Gu, u⟩, therefore

    λ = 1 − ⟨Gu, u⟩ / ⟨L⁻¹u, u⟩.

If instead we substitute v = G⁻¹u in (A.4) and take the inner product with v, we have (1 − λ)⟨G⁻¹v, v⟩ = ⟨Lv, v⟩, therefore

    λ = 1 − ⟨Lv, v⟩ / ⟨G⁻¹v, v⟩.

Hence, if d_σ(L⁻¹, G) ≤ δ, then, using (A.2),

    ρ(I − LG) ≤ sup{|1 − z| : z = ⟨Gu, u⟩/⟨L⁻¹u, u⟩ for some u ∈ X_C \ {0}} ≤ ((e^δ − 1)/δ) d_σ(L⁻¹, G).

If instead d_σ(L, G⁻¹) ≤ δ, then, again by (A.2),

    ρ(I − LG) ≤ sup{|1 − z| : z = ⟨Lu, u⟩/⟨G⁻¹u, u⟩ for some u ∈ X_C \ {0}} ≤ ((e^δ − 1)/δ) d_σ(L, G⁻¹),

which proves (A.3).

REFERENCES

[1] V. Akçelik, G. Biros, A. Drăgănescu, O. Ghattas, J. Hill, and B. van Bloemen Waanders, Inversion of airborne contaminants in a regional model, in LNCS, vol. 3993, 2006, pp. 481–488.
[2] V. Akçelik, G. Biros, A. Drăgănescu, O. Ghattas, J. C. Hill, and B. G. van Bloemen Waanders, Dynamic data driven inversion for terascale simulations: real-time identification of airborne contaminants, in SC '05: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, Washington, DC, USA, 2005, IEEE Computer Society.
[3] R. E. Bank, P. E. Gill, and R. F. Marcia, Interior methods for a class of elliptic variational inequalities, in Large-Scale PDE-Constrained Optimization (Santa Fe, NM, 2001), vol. 30 of Lect. Notes Comput. Sci. Eng., Springer, Berlin, 2003, pp. 218–235.
[4] M. Benzi, E. Haber, and L. Taralli, Multilevel algorithms for large-scale interior point methods, SIAM J. Sci. Comput., 31 (2009), pp. 4152–4175.
[5] G. Biros and G. Doğan, A multilevel algorithm for inverse problems with elliptic PDE constraints, Inverse Problems, 24 (2008), 034010.
[6] A. Borzì and R. Griesse, Experiences with a space-time multigrid method for the optimal control of a chemical turbulence model, Internat. J. Numer. Methods Fluids, 47 (2005), pp. 879–885.
[7] A. Borzì and K. Kunisch, A multigrid scheme for elliptic constrained optimal control problems, Comput. Optim. Appl., 31 (2005), pp. 309–333.
[8] A. Borzì and V. Schulz, Multigrid methods for PDE optimization, SIAM Rev., 51 (2009), pp. 361–395.
[9] S. C. Brenner and L. R. Scott, The Mathematical Theory of Finite Element Methods, vol. 15 of Texts in Applied Mathematics, Springer, New York, third ed., 2008.
[10] W. L. Briggs, V. E. Henson, and S. F. McCormick, A Multigrid Tutorial, SIAM, Philadelphia, PA, second ed., 2000.
[11] A. Drăgănescu, Multigrid preconditioning of linear systems for semismooth Newton methods, in preparation.
[12] A. Drăgănescu, Two Investigations in Numerical Analysis: Monotonicity Preserving Finite Element Methods, and Multigrid Methods for Inverse Parabolic Problems, PhD thesis, University of Chicago, August 2004.
[13] A. Drăgănescu and T. F. Dupont, Optimal order multilevel preconditioners for regularized ill-posed problems, Math. Comp., 77 (2008), pp. 2001–2038.
[14] H. C. Elman, D. J. Silvester, and A. J. Wathen, Finite Elements and Fast Iterative Solvers: with Applications in Incompressible Fluid Dynamics, Numerical Mathematics and Scientific Computation, Oxford University Press, New York, 2005.
[15] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, vol. 375 of Mathematics and its Applications, Kluwer Academic Publishers, Dordrecht, 1996.
[16] E. M. Gertz and S. J. Wright, Object-oriented software for quadratic programming, ACM Trans. Math. Softw., 29 (2003), pp. 58–81.
[17] P. E. Gill, W. Murray, D. B. Ponceleón, and M. A. Saunders, Primal-dual methods for linear programming, Math. Programming, 70 (1995), pp. 251–277.
[18] W. Hackbusch, On the fast solving of parabolic boundary control problems, SIAM J. Control Optim., 17 (1979), pp. 231–244.
[19] W. Hackbusch, Fast solution of elliptic control problems, J. Optim. Theory Appl., 31 (1980), pp. 565–581.
[20] W. Hackbusch, Die schnelle Auflösung der Fredholmschen Integralgleichung, Beiträge zur Numerischen Mathematik, 9 (1981), pp. 47–62.
[21] W. Hackbusch, Multigrid Methods and Applications, vol. 4 of Springer Series in Computational Mathematics, Springer-Verlag, Berlin, 1985.
[22] W. Hackbusch, Integral Equations: Theory and Numerical Treatment, vol. 120 of International Series of Numerical Mathematics, Birkhäuser Verlag, Basel, 1995. Translated and revised by the author from the 1989 German original.
[23] M. Hanke and C. R. Vogel, Two-level preconditioners for regularized inverse problems. I. Theory, Numer. Math., 83 (1999), pp. 385–402.
[24] M. Hintermüller, K. Ito, and K. Kunisch, The primal-dual active set strategy as a semismooth Newton method, SIAM J. Optim., 13 (2002), pp. 865–888.
[25] M. Hintermüller and M. Ulbrich, A mesh-independence result for semismooth Newton methods, Math. Program., 101 (2004), pp. 151–184.
[26] M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich, Optimization with PDE Constraints, vol. 23 of Mathematical Modelling: Theory and Applications, Springer, New York, 2009.
[27] B. Kaltenbacher, V-cycle convergence of some multigrid methods for ill-posed problems, Math. Comp., 72 (2003), pp. 1711–1730.
[28] J. T. King, Multilevel algorithms for ill-posed problems, Numer. Math., 61 (1992), pp. 311–334.
[29] O. Lass, M. Vallejos, A. Borzì, and C. C. Douglas, Implementation and analysis of multigrid schemes with finite elements for elliptic optimal control problems, Computing, 84 (2009), pp. 27–48.
[30] M. Luskin and R. Rannacher, On the smoothing property of the Crank-Nicolson scheme, Applicable Anal., 14 (1982/83), pp. 117–135.
[31] S. Mehrotra, On the implementation of a primal-dual interior point method, SIAM J. Optim., 2 (1992), pp. 575–601.
[32] J. Nocedal and S. J. Wright, Numerical Optimization, Springer Series in Operations Research, Springer-Verlag, second ed., 2006.
[33] A. Rieder, A wavelet multilevel method for ill-posed problems stabilized by Tikhonov regularization, Numer. Math., 75 (1997), pp. 501–522.
[34] B. W. Rust and D. P. O'Leary, Confidence intervals for discrete approximations to ill-posed problems, J. Comput. Graph. Statist., 3 (1994), pp. 67–96.
[35] A. H. Stroud, Approximate Calculation of Multiple Integrals, Prentice-Hall, Englewood Cliffs, NJ, 1971.
[36] V. Thomée, Galerkin Finite Element Methods for Parabolic Problems, vol. 25 of Springer Series in Computational Mathematics, Springer-Verlag, Berlin, second ed., 2006.
[37] M. Ulbrich, Semismooth Newton methods for operator equations in function spaces, SIAM J. Optim., 13 (2002), pp. 805–842.
[38] M. Ulbrich and S. Ulbrich, Superlinear convergence of affine-scaling interior-point Newton methods for infinite-dimensional nonlinear problems with pointwise bounds, SIAM J. Control Optim., 38 (2000), pp. 1938–1984.
[39] M. Ulbrich and S. Ulbrich, Primal-dual interior-point methods for PDE-constrained optimization, Math. Program., 117 (2009), pp. 435–485.
[40] M. Vallejos and A. Borzì, Multigrid optimization methods for linear and bilinear elliptic optimal control problems, Computing, 82 (2008), pp. 31–52.
[41] M. Weiser, Interior point methods in function space, SIAM J. Control Optim., 44 (2005), pp. 1766–1786.