PACO: Global Signal Restoration via PAtch COnsensus


Authors: Ignacio Francisco Ramirez Paulino

Ignacio Ramírez
March 15, 2021

Abstract

Many signal processing algorithms break the target signal into overlapping segments (also called windows, or patches), process them separately, and then stitch them back into place to produce a unified output. At the overlaps, the final value of those samples that are estimated more than once needs to be decided in some way. Averaging, the simplest approach, often leads to unsatisfactory results. Significant work has been devoted to this issue in recent years. Several works explore the idea of a weighted average of the overlapped patches and/or pixels; others promote agreement (consensus) between the patches at their intersections. Agreement can be either encouraged or imposed as a hard constraint. This work develops on the latter case. The result is a variational signal processing framework, named PACO, which features a number of appealing theoretical and practical properties. The PACO framework consists of a variational formulation that fits a wide variety of problems, and a general ADMM-based algorithm for minimizing the resulting energies. As a byproduct, we show that the consensus step of the algorithm, which is the main bottleneck of similar methods, can be solved efficiently and easily for any arbitrary patch decomposition scheme. We demonstrate the flexibility and power of PACO on three different problems: image inpainting (which we have already covered in previous works), image denoising, and contrast enhancement, using different cost functions including Laplacian and Gaussian Mixture Models.

Keywords: patch-based methods, global-local methods, patch consensus, inpainting, denoising, contrast enhancement.

AMS subject classifications: 49M30, 49M37, 65K05, 65K10, 90C25, 90C30.
1 Introduction

We use the term patch restoration methods to refer to a family of techniques where the target signal is first broken down into smaller, possibly overlapping patches of some size and shape, some restoration method is applied to each patch separately, and the restored patches are stitched back together into their corresponding place to produce the complete restored signal (sometimes referred to as the global signal). This is a common technique, with many examples in audio (e.g., [26, 28, 35, 43]) and image processing (see e.g., [11, 16, 38, 39, 40, 41, 48]). A recent review of patch-based restoration methods for image processing can be found in [3]. Note that we are not considering methods such as [9, 45] (and their many variants), which estimate each sample separately using the surrounding patches as context, but only those that process the whole patch as a sub-signal.

When overlapping occurs, the final value of those samples that are estimated more than once must be resolved in some way. A traditional approach in audio processing methods is to apply a windowing function to the patches (e.g., Hanning); this procedure is backed by theoretical results related to the violation of frequency-based hypotheses during processing, but is widely applied to other methods as well. On the other hand, many recent, very successful patch-based methods do away with the overlapping issue by simply averaging the overlapped patches. This may incidentally help in covering up artifacts in the restoration process, but more often it will simply smooth out the resulting global estimate (see e.g. [16]). Overcoming such limitations via better stitching strategies has been the subject of several works over the last decade: some of them [13, 29, 33, 59, 61] are based on assigning weights to the

∗ This paper is a significant extension of work previously published in [54] and [55].
As such, some sections are heavily based on the corresponding ones in the aforementioned works. Also, some results are repeated here for the sake of completeness.

† Departamento de Procesamiento de Señales, Instituto de Ingeniería Eléctrica, Facultad de Ingeniería, Universidad de la República, Uruguay. (nacho@fing.edu.uy, http://iie.fing.edu.uy/personal/nacho/)

different patches when averaging; others such as [4, 20] assign a different weight to each pixel. We refer to these as weighted averaging methods. An alternative to weighted averaging, which sidesteps the arbitrariness of defining appropriate weights, is to promote solutions where patches coincide at their intersections. This idea appears under the keyword patch disagreement in [57]; there, an iterative method is used to give more weight to the discrepancies between the estimated patches, with the aim of reducing such disagreement in the long run. The problem of patch aggregation has also been studied from a probabilistic viewpoint in works such as [66] and [58]. The work [66] is based on the observation that, after stitching, the averaged patches may no longer follow the prior from which they were modeled. The authors of [66] seek a remedy to this by augmenting the energy to be minimized with a quadratic term that penalizes the disagreement between the patches before and after stitching. In [58], instead of directly averaging the patch estimates, the authors produce a unique probability model over the whole image by "stitching" the probability models of each independent patch via a well-defined fusion mechanism. Besides its own value as an original and powerful concept, the work [58] yields a new interpretation of the aforementioned work [66] as a particular fusion mechanism.
Back to the problem of averaging patch estimations: it was observed in [22] that the agreement imposed in [66] would become a consensus constraint on the patches as the quadratic penalty coefficient grew to infinity. They then draw on this idea to propose a general consensus-based MAP denoising algorithm and a general ADMM-based method for solving it. In this sense, the method proposed in [22] can be seen as a particular case of our proposed framework. Furthermore, the work [22] analyzes the computational cost and bottlenecks of their proposed ADMM method to some degree. However, they fall short of fully exploiting the structure of the problem, leading to an overly conservative upper bound on the overall computational cost of the method. As we show below, this cost can be significantly reduced.

1.1 Contributions

This work proposes a general variational formulation for patch-based restoration methods, under a strict patch consensus constraint, which can be applied to a wide family of problems, linear and non-linear, convex and non-convex, sparse or dense, on signals of any dimension. Below we summarize the contributions of our work.

Patch Consensus Framework. We provide a formal characterization of the Patch Consensus set, its geometrical properties, and its relationship with the space of signals. We then develop a general formulation for patch restoration problems under the PACO constraint which can be applied to any method that can be written as the minimization of a cost function of the estimated patches and/or signal; this can be seen as a generalization of the one proposed in [22].

Optimization. We propose a method for solving the aforementioned family of problems based on a splitting strategy and the standard ADMM [15] algorithm. While our algorithm bears strong similarities to that proposed in [22], the latter is limited to the denoising case.
Also, by retaining the patch extraction and stitching operations in matrix form, it does not fully exploit the geometry of the patch consensus space, which leads to a significantly higher computational cost. In contrast, our ADMM-based algorithm can be readily applied to any variational patch-based restoration method. We also develop the special case of orthogonal linear patch models, and a Linearized ADMM algorithm (a.k.a. Uzawa's Method [18]) for non-orthogonal linear patch models, which has the same convergence guarantees as ADMM while being computationally feasible even for very large signals. Both methods require significantly fewer computational resources than previously proposed patch-consensus algorithms, and inherit the convergence properties of the ADMM method.

Stitching trick. As described earlier, the bottleneck of methods such as [66, 22, 5] lies in the "patch agreement step", which might not be practical even for small problem sizes. In this work we show that the projection operator onto the consensus constraint set boils down to a simple composition of the patch stitching and extraction operations. This not only avoids the computation and/or storage of projection matrices, but is also very easy to implement, and generalizes to any arbitrary patch decomposition scheme including irregular grids, different patch sizes, etc., on signals of any dimension.

Global constraints. As mentioned above, the PACO constraint guarantees a one-to-one correspondence between patch space (sometimes called "local" space) and global signal space; this allows formulations where arbitrary

Figure 1: Patch extraction operator Y = R(x) for a signal x of length N = 6 and patches of size m = 3; patches are arranged as columns of an m × n matrix Y, where column y_k contains the patch starting at offset k in x.
Figure 2: Average patch stitching operator x̂ = S(Ŷ). Each sample x̂_j is the average of all its estimates in Ŷ; in our example, this corresponds to the average of each anti-diagonal of Ŷ.

constraints are imposed on the global signal rather than on the patches. This is a crucial advantage of imposing strict consensus rather than promoting it in a lax way; we showcase this feature in the missing samples estimation method described in section 4. Furthermore, for the types of patch extraction operations used in typical patch-based applications, the mapping between signal space and patch space can be considered to be an isometry. This allows orthogonal projections onto constraint sets to be performed in either space.

Applications and variants. We show the broad applicability and the potential of PACO in three very different settings: inpainting, denoising, and contrast enhancement. Each problem is discussed in a self-contained section where we provide a brief introduction to the problem and derive the corresponding optimization algorithms. Note that the inpainting problem has already been covered in detail in [54, 55]; we repeat the main results here for completeness. In order to further demonstrate the flexibility of the framework, different patch priors are explored in each case: we use a Laplacian prior on DCT coefficients for inpainting, and a Gaussian Mixture Model [65] for denoising. For the other settings we provide preliminary results; a complete treatment of these methods, as well as experiments using other priors, will be published elsewhere.

1.2 Organization

The rest of the document is laid out as follows.
We begin with an introduction to the general problem of patch-based signal processing in section 2. Section 3 provides a formal definition of the PACO family of problems, as well as a general solution which serves as a basis for all the algorithms developed later. The next three sections provide self-contained treatments of different signal processing problems using PACO: our previous work on inpainting is revisited in section 4, section 5 deals with denoising, and section 6 introduces PACO for contrast enhancement. We conclude our treatment in section 7.

2 Patch-based methods

The methods that we are interested in in this paper involve three stages that may be repeated until some convergence criterion is met: patch extraction, patch estimation, and patch stitching. Recall that we are interested in methods that estimate whole patches, rather than a single pixel (e.g., the center) as in [9]. We will now illustrate these concepts on a simple one-dimensional case.

2.1 Patch extraction

Given a 1D signal x = (x_1, ..., x_N) ∈ X = R^N of length N, patch extraction is a linear operator R determined by a patch extraction matrix R^T = [R_1^T | R_2^T | ··· | R_n^T], where each sub-matrix R_j, j = 1, ..., n, copies (extracts) samples from x into a sub-vector y_j, which we call a patch. Which samples, how many samples, and how many copies of each sample in x are extracted onto y_j is arbitrary and may depend on j, so that the dimension of each patch is some m_j ∈ N. The result of the extraction operator is a patches vector y^T = [y_1^T | y_2^T | ... | y_n^T] ∈ R^{Σ_{j=1}^n m_j}. The space Y = R^{Σ_{j=1}^n m_j} is called the patches space. In the common case where m_j = m for all j, it may be convenient to rearrange the patches vector y ∈ R^{mn} as a matrix Y ∈ R^{m×n}, where the patches y_j are arranged as columns; many formulations are presented in this way hereafter.
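The extraction of equal-sized, fully overlapped patches described above can be sketched in a few lines of NumPy; this is an illustrative implementation of the operator of Figure 1, not the authors' code:

```python
import numpy as np

def extract_patches(x, m, s=1):
    """Extract patches of length m with stride s from a 1D signal x,
    returning them as columns of an m-by-n matrix (as in Figure 1)."""
    N = len(x)
    starts = range(0, N - m + 1, s)
    return np.stack([x[k:k + m] for k in starts], axis=1)

x = np.arange(1, 7)            # x_1 .. x_6
Y = extract_patches(x, m=3)    # 3x4 matrix; column k holds the patch at offset k
print(Y)
```

With N = 6 and m = 3 this reproduces the four overlapping patches of Figure 1 as the columns of a 3 × 4 matrix.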
Formally, however, we will always refer to the extracted patches as a vector belonging to the patch space Y. In the example of Figure 1 we have a 1D signal of length N = 6, patches of size m = 3, and R_j = [e_j | e_{j+1} | e_{j+2}], j = 1, 2, 3, 4. More generally, we will have R_j = [e_{(j-1)s+1} | e_{(j-1)s+2} | ... | e_{(j-1)s+m}], where s ≥ 1 is called the stride, that is, the separation in space between one patch and its neighbor. The case s = 1 depicted in Figure 1 yields fully overlapped patches; the extreme case s = 3 yields non-overlapping patches.

The preceding discussion generalizes to signals x of any dimension by vectorizing x prior to extraction: y_j = R_j vec(x). Here vec(·) arranges its argument into a vector by traversing its elements in some predefined order; in our case, for 2D signals of size M × N, we follow a row-major ordering, that is, the output vector is the concatenation of the M input rows. The inverse of vec(·) is called the matrification operator and is denoted by mat(·).

Geometry of the problem. The operator R is linear and injective. Thus, its image C is a subspace of the patches space Y. If R extracts all the samples from x then it also defines a linear bijection between X and C. As defined, the elements of C which are mapped from the same sample in x are all equal; for reasons that will become clear soon, we say that these elements are in consensus, and we call C the consensus set. The sets X and C are, however, not isometric; it is easy to verify that this can only happen if C contains exactly the same number of copies of each sample in X. Most common patch extraction schemes used in the literature do not comply with this, as samples at the borders usually belong to fewer patches than those at the center.
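The unequal coverage of border samples is easy to see numerically. Below, a small sketch (our own illustration) builds the extraction matrix R explicitly for the example of Figure 1 and inspects diag(R^T R), which counts how many patches cover each sample:

```python
import numpy as np

# Explicit extraction matrix R for N = 6, m = 3, stride 1 (Figure 1):
# each block R_j consists of the rows e_j, e_{j+1}, e_{j+2} of the identity.
N, m = 6, 3
I = np.eye(N)
R = np.vstack([I[k:k + m] for k in range(N - m + 1)])   # shape (m*n, N), n = 4

# diag(R^T R) counts how many patches cover each sample; border samples
# are covered fewer times, which is why X and C are not exactly isometric.
counts = np.diag(R.T @ R)
print(counts)   # [1. 2. 3. 3. 2. 1.]
```

Only the two central samples reach the full multiplicity m = 3, while the outermost ones belong to a single patch.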
However, for most signals of practical interest, and images in particular, the border samples are few, and thus the mapping between X and C can be considered isometric for most practical purposes. This will prove important, as it allows orthogonal projections onto convex sets defined in C to be performed in the (usually much smaller) space X.

2.2 Patch averaging and the stitching trick

In a signal restoration setting one does not observe the true or clean signal x̃, but a distorted version x (which usually has the same size and dynamic range as x̃), and the task is to infer x̃ from x. We denote the result of this inference as x̂, and call it the restored or estimated signal indistinctly. The patches vector extracted from x is correspondingly denoted by y. The idea is to estimate the signal x̃ by recomposing it from an estimate ŷ, which is itself a function of y. In general, however, ŷ ∉ C. By stitching we refer to the operation that maps a patches vector ŷ ∈ Y onto an estimated signal x̂. In general, this operator is not unique. A common choice is to seek the signal estimate x̂ whose patches, when extracted, are as close as possible to the individually estimated ones in ŷ. If we measure this distance using the ℓ2 norm, we obtain the classical least squares estimator:

x̂_ls = arg min_x ‖Rx − ŷ‖² = (Σ_j R_j^T R_j)^{-1} Σ_j R_j^T ŷ_j = (R^T R)^{-1} R^T ŷ.   (1)

Equation (1) is interpreted as follows. It is easy to verify that R_j^T ŷ_j puts back the patch ŷ_j in its corresponding place in x̂; thus, R^T ŷ produces a vector of length N whose i-th element contains the sum of all the elements of ŷ that are mapped there by one or more R_j. On the other hand, R_j^T R_j is a diagonal matrix with m ones, and so R^T R = Σ_j R_j^T R_j is a diagonal matrix whose (i,i)-th element counts how many R_j's have mapped some element of ŷ to the sample i.
Thus, the i-th entry of the right-hand side of (1) is the average of all the different estimates of the i-th sample in ŷ. We call the operator defined in (1) the average stitching operator and denote it by S : Y → R^N, x̂ = S(ŷ). The extraction operator R is given by y = R(x) = Rx. If we compose it with the stitching operator S defined earlier we obtain

R[S(ŷ)] = R (R^T R)^{-1} R^T ŷ = Π_C(ŷ),   (2)

where Π_C(ŷ) is the orthogonal projection of ŷ ∈ Y onto C. Thus, we conclude that the projection operator (2) from Y onto C can be written as the composition of the extraction operator with its corresponding average stitching operator. Although (2) is a well-known result, to the best of our knowledge its straightforward implementation as the aforementioned composition has not been fully exploited in the literature; this is why we call it a trick. Rather, works such as [5, 22] compute the orthogonal projection matrix R(R^T R)^{-1} R^T explicitly, something that quickly becomes unwieldy as the signal size increases, even if the structure of such a matrix is exploited (e.g., block sparsity). This is further complicated if, instead of the patches, the stitching is written in terms of some affine transformation of the patches such as y_j = D a_j + b, as in [5, 22]. The stitching trick only works directly on patches, or some orthogonal transformation of them; later we will see how to deal with the affine case. What is also remarkable is that the procedure works for any arbitrary extraction operator R. In particular, it does not require the patches to have the same shape and/or number of samples.

3 PACO: The PAtch COnsensus problem

3.1 General consensus problems

This subsection is based on the monograph [49, Chapter 5].
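Before proceeding, the identity (2) behind the stitching trick can be checked numerically on a toy problem; the sketch below (an illustration on the setting of Figure 1, not the authors' code) compares the explicit projector with the extract-after-stitch composition:

```python
import numpy as np

# Numerical check of the stitching trick (2): composing extraction with
# average stitching equals the explicit orthogonal projector R (R^T R)^{-1} R^T.
rng = np.random.default_rng(0)
N, m = 6, 3
I = np.eye(N)
R = np.vstack([I[k:k + m] for k in range(N - m + 1)])   # extraction matrix

def S(y_hat):
    """Average stitching operator (1): per-sample sums divided by counts."""
    return (R.T @ y_hat) / np.diag(R.T @ R)

y_hat = rng.standard_normal(R.shape[0])                  # arbitrary, not in C
proj_explicit = R @ np.linalg.solve(R.T @ R, R.T @ y_hat)
proj_trick = R @ S(y_hat)
assert np.allclose(proj_explicit, proj_trick)
```

The trick never forms the projection matrix: it only needs the per-sample sums and counts, which is what makes it cheap for arbitrary patch decompositions.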
In the context of parallel and distributed optimization, a common problem is to minimize a function f(ζ) = Σ_{j=1}^r f_j(ζ), where the f_j(ζ) are functions that can only be evaluated locally, say at some node j of a spatially distributed network with r nodes, but which all depend on the same global variable ζ ∈ R^m. One strategy to solve this problem in a decentralized way is for each node to have its own local copy ζ^j of ζ, which it then optimizes in terms of its local function f_j(ζ^j). In the end, however, we want all the ζ^j to be equal. For this we define the consensus set

G = { (ζ^1, ζ^2, ..., ζ^r) ∈ R^{m×r} : ζ^1 = ζ^2 = ... = ζ^r }

and constrain the final solution to fall within G. This gives rise to the following global consensus problem:

min_{(ζ^1, ζ^2, ..., ζ^r) ∈ G} Σ_{j=1}^r f_j(ζ^j).

Clearly, G is a linear subspace of R^{m×r} (e.g., the line x = y in R²). Let Π_G(·) denote the orthogonal projection operator from R^{m×r} onto G. It is straightforward to verify that

Π_G(ζ^1, ζ^2, ..., ζ^r) = (ζ̄, ζ̄, ..., ζ̄),   ζ̄ = (1/r) Σ_j ζ^j.

Let [m] be a shorthand for the set {1, 2, ..., m}. Consider the more general case where each local copy ζ^j has an index subset c_j ⊆ [m] associated to it, and consensus with any other copy ζ^{j'} is required only for those entries indexed by c_j ∩ c_{j'}. The partial consensus set G' is defined as follows:

G' = { (ζ^1, ζ^2, ..., ζ^r) : ζ^j_i = ζ^{j'}_i, ∀ i ∈ c_j ∩ c_{j'}, ∀ j, j' ∈ [r], j ≠ j' }.   (3)

The corresponding partial consensus problem is given by:

min_{(ζ^1, ζ^2, ..., ζ^r) ∈ G'} Σ_{j=1}^r f_j(ζ^j).   (4)

It is easy to show that the projection onto G' is given by

[Π_{G'}(ζ)]^j_i = ζ̄_i = ( Σ_{j': i ∈ c_{j'}} ζ^{j'}_i ) / ( Σ_{j': i ∈ c_{j'}} 1 ),  for i ∈ c_j.   (5)
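The averaging behind this projection can be sketched with a tiny example; everything below (function name, the two copies and their index sets) is our own illustration:

```python
import numpy as np

# Projection onto the partial consensus set: each shared entry is replaced
# by the average of the values held by the copies that actually contain it.
def project_partial_consensus(copies, index_sets, m):
    sums = np.zeros(m)
    counts = np.zeros(m)
    for z, c in zip(copies, index_sets):
        sums[list(c)] += z[list(c)]
        counts[list(c)] += 1
    avg = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    out = []
    for z, c in zip(copies, index_sets):
        w = z.copy()
        w[list(c)] = avg[list(c)]
        out.append(w)
    return out

z1 = np.array([1.0, 2.0, 0.0]); c1 = [0, 1]
z2 = np.array([3.0, 4.0, 5.0]); c2 = [1, 2]
p1, p2 = project_partial_consensus([z1, z2], [c1, c2], m=3)
# Entry 1 is shared by both copies, so both now hold (2 + 4) / 2 = 3;
# the unshared entries keep their original values.
```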
In words: after projection, the i-th element of every copy whose index set c_j contains i equals the average of the values that those copies assigned to the i-th element before projection.

3.2 Patch consensus problem

Given a patch extraction operator R : X → Y with its corresponding consensus set C, a cost function f(y) : Y → R, and an optional constraint set Ω ⊆ Y, the PACO problem is stated as follows:

ŷ = arg min_z f(z)  s.t.  z ∈ Ω ∩ C.   (6)

We begin by showing that, when Ω = Y, (6) is a special case of (4); we will treat the general case Ω ≠ Y later. At this point, we remind the reader that C is isomorphic to X, so that the constraint set Ω can be defined in either X or C, or a combination of both. For example, in order to clip the estimated samples to the range [0, 255] we could define Ω = { z ∈ Y : S(z) ∈ [0, 255]^N }.

Proposition 1. The patch consensus set C is a particular case of the general consensus set G', and so the PACO problem is of the form (4).

Proof. It suffices to define ζ^j = R_j^T y_j and c_j = nonzero(diag(R_j^T R_j)). Here diag(·) returns the diagonal of an N × N square matrix as a vector of length N, and nonzero(·) returns the indices of the nonzero elements of its argument. Note that, as defined, c_j contains the indices of the samples filled in by y_j. Thus, the requirement that y_j and y_{j'} coincide at c_j ∩ c_{j'} matches the definition of the consensus set given in (3) for this particular choice of ζ^j and c_j. Now, given f(y) = Σ_j f_j(y_j) in (6), we define f'_j(ζ^j) = f_j(R_j ζ^j) = f_j(y_j) and plug it into (4).

Corollary 2. The orthogonal projection onto the patch consensus set C is given by the stitching trick, Π_C(ŷ) = R[S(ŷ)], as in (2).

Proof. We have established that the patch consensus set satisfies C = G', and that (2) gives Π_C(ŷ).
Thus, Π_C(ŷ) = Π_{G'}(ŷ) = R[S(ŷ)].

3.3 Optimization

We define the convex indicator function associated to the PACO constraint set C ∩ Ω, g(·) : Y → R ∪ {+∞},

g(ξ) = 0 if ξ ∈ C ∩ Ω;  +∞ otherwise.

This allows us to transform the PACO problem into an unconstrained one:

ŷ = arg min_ξ f(ξ) + g(ξ).   (7)

Problem (7) is convex if the function f(·) and the set Ω are convex; this allows convex restoration problems to be defined under consensus constraints. That being said, the PACO framework (7) is not limited to convex problems. The following method is based on the proximal operator form [49] of the popular Alternating Directions Method of Multipliers (ADMM) [15]. The proximal operator of a function f(·) with parameter λ is given by

prox_{λf}(y) := arg min_ξ f(ξ) + (1/2λ)‖ξ − y‖².   (8)

The proximal operator has many interesting properties (see [49] and references therein). Among them, if g(·) is the indicator function of a set A, we have that prox_{λg}(·) = Π_A(·). This is particularly useful within the PACO framework, as Π_C(·) can be computed efficiently using the stitching trick described in subsection 2.2, Equation (2). However, in general, this trick cannot be applied directly to (7). In order to do so, we need to reformulate (7) so that projecting onto the constraint set and minimizing the target cost function f(·) can be done separately. One such reformulation is the so-called splitting:

(ŷ, ẑ) = arg min_{ξ,ζ} f(ξ) + g(ζ) + (1/2λ)‖ξ − ζ‖²₂  s.t.  ξ = ζ.   (9)

The ADMM provides a general way to obtain a solution to (9) which is guaranteed to be global if the problem is convex, and local otherwise. The particular ADMM steps for this case are described in Algorithm 1.
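These ADMM steps translate directly into code. The sketch below is our own illustration: the cost is a simple quadratic (so its proximal operator has a closed form), the projection is Π_C via the stitching trick, and Ω = Y:

```python
import numpy as np

# Generic ADMM loop with pluggable proximal and projection operators.
# Illustrative cost: f(xi) = 0.5 * ||xi - y_obs||^2 with lambda = 1, so
# prox_{lambda f}(v) = (y_obs + v) / 2. All names are our own choices.
def admm_paco(prox_f, proj, z0, n_iter=100):
    z, u = z0.copy(), np.zeros_like(z0)
    for _ in range(n_iter):
        y = prox_f(z - u)          # problem dependent: f
        z = proj(y + u)            # problem dependent: here Omega = Y
        u = u + y - z
    return y, z

N, m = 6, 3
I = np.eye(N)
R = np.vstack([I[k:k + m] for k in range(N - m + 1)])
counts = np.diag(R.T @ R)
proj_C = lambda v: R @ ((R.T @ v) / counts)   # Pi_C via the stitching trick

rng = np.random.default_rng(1)
y_obs = rng.standard_normal(R.shape[0])
y, z = admm_paco(lambda v: (y_obs + v) / 2, proj_C, np.zeros_like(y_obs))
# For this quadratic cost, the minimizer over C is exactly Pi_C(y_obs),
# which the iterations recover.
```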
Algorithm 1 Generic ADMM program for solving PACO
Require: z^(0) ∈ Y, λ > 0
t ← 0, u^(0) ← 0
repeat
  y^(t+1) ← prox_{λf}(z^(t) − u^(t))    {problem dependent: f}
  z^(t+1) ← Π_{C∩Ω}(y^(t+1) + u^(t))    {problem dependent: Ω}
  u^(t+1) ← u^(t) + y^(t+1) − z^(t+1)
  t ← t + 1
until convergence
return ŷ ← y^(t), ẑ ← z^(t)

3.4 Applying PACO

PACO is a variational method and thus can be fitted to any signal restoration problem that can be written in terms of a cost function and a constraint set. This includes the large family of probabilistic methods where f(y_j) measures the likelihood of the patch y_j, either a priori or a posteriori; a few relevant examples are [1, 53, 55, 58, 65, 66]. Another group comprises penalized likelihood methods where the penalties are not necessarily probabilistic. These include Total Variation [25] and general linear models based on orthogonal transforms, wavelets [12, 42], learned dictionaries [2, 17, 36], and sets of linear convolutional filters such as [8]. Some of these methods consider the energy in terms of the whole image. Still, such images could be seen as patches of a much larger image (for example, in a distributed application for processing huge images, such as those produced by modern scientific telescopes). In [51, 52] the patches of the image, seen as points in patch space, are assumed to trace a smooth curve over a low-dimensional manifold; estimations are then obtained by displacing those points so that the resulting curves are smooth. There is also a large body of work on image processing based on Partial Differential Equations (PDEs), which are also variational in nature; see the book [10] for a complete review of the theory and applications of this approach.
More recently, this idea has been combined with the framework of Non-Local methods; the inpainting methods that we use for comparison [4, 19, 46] all fall into this category. Finally, other desired features of the patches can be expressed in terms of minimizing a cost function; examples of this are [32, 47]. As for constraint sets, these are usually avoided, as dealing with them usually (but not always) leads to slower convergence rates. Perhaps the most common ones are the ℓ2-ball constraint, used in some denoising formulations to keep the solutions close to the observed noisy signal, and the inpainting constraint (to be introduced later), which imposes that the result should coincide with the input image in those places where it is already known. The scarcity of hard constraints besides consensus is good news for PACO, as they tend to complicate the projection stages; fortunately, as some of our applications show, this is not always the case.

In order to apply the PACO framework to a particular restoration problem, the user needs to provide two ingredients: the proximal operator of the problem-dependent cost function f(·), and the projection operator onto Ω ∩ C. If Ω = Y then prox_{λg}(·) = Π_C(·) can be obtained efficiently using the stitching trick (2),

z^(t+1) = R[S(y^(t+1) + u^(t))].

If Ω ≠ Y is affine, the solution can be obtained by iteratively projecting onto Ω and C until convergence (this method is sometimes referred to as POCS). If Ω is convex but not affine, a general solution can be obtained using Dykstra's algorithm, which we provide in Algorithm 2 for reference.

3.5 Particular case: linear patch models

Note that, starting in this section, we switch to the patch matrix notation defined in section 2, where patches are arranged as columns of a matrix Y. The same goes for the corresponding coefficient vectors and all auxiliary variables.
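Dykstra's algorithm (given as Algorithm 2) admits a direct implementation; the sketch below is our own illustration, using two simple convex sets of our choosing, a hyperplane and the nonnegative orthant, whose projections are trivial:

```python
import numpy as np

def dykstra(alpha, proj_A, proj_B, n_iter=500):
    """Dykstra's alternating projections with correction terms p and q,
    following the structure of Algorithm 2."""
    a = alpha.astype(float)
    p = np.zeros_like(a)
    q = np.zeros_like(a)
    for _ in range(n_iter):
        b = proj_B(a + p)
        p = a + p - b
        a_new = proj_A(b + q)
        q = b + q - a_new
        a = a_new
    return a

# Illustrative sets: A = hyperplane {x : x1 + x2 = 1}, B = nonnegative
# orthant. Their intersection is a line segment, and Dykstra converges to
# the true projection onto it (plain alternating projections would not).
proj_A = lambda v: v - (v.sum() - 1.0) / 2.0 * np.ones(2)
proj_B = lambda v: np.maximum(v, 0.0)
x = dykstra(np.array([2.0, 0.0]), proj_A, proj_B)
# x is the projection of (2, 0) onto the segment, namely (1, 0).
```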
Algorithm 2 Dykstra's algorithm for computing the projection of a point α onto the intersection of two sets A and B
Require: point α, sets A and B
t ← 0, α^(0) ← α, p^(0) ← 0, q^(0) ← 0
repeat
  β^(t) ← Π_B(α^(t) + p^(t))
  p^(t+1) ← α^(t) + p^(t) − β^(t)
  α^(t+1) ← Π_A(β^(t) + q^(t))
  q^(t+1) ← β^(t) + q^(t) − α^(t+1)
  t ← t + 1
until convergence
return α^(t)

In many patch-based algorithms, patches are modeled in terms of some linear transformation, i.e., Y = DA, where D is some matrix and the cost function is a prior defined over the coefficients matrix A. Typical cases include the FFT or DCT, and some wavelet transforms. The corresponding splitting formulation is given in terms of the coefficients matrix A and its split variable B:

(Â, B̂) = arg min_{A,B} f(A) + g(DB) + (1/2λ)‖A − B‖²_F  s.t.  A = B.   (10)

Let g'(B) := g(DB). If D is orthogonal, it is easy to show that prox_{λg'}(·) = D^{-1} prox_{λg}(D ·). This allows us to apply such patch models with only a minor modification to the general solution given in Algorithm 1. The PACO-DCT inpainting problem described in section 4 shows an example of exploiting the orthogonality of the DCT type II.

Many recent patch-based algorithms consider the more general situation where D is non-orthogonal. A notable case is the use of sparse linear models defined in terms of over-complete dictionaries D ∈ R^{m×p} with p > m. Imposing the consensus constraint directly on the dictionary coefficients a_j, as has been proposed in previous works such as [66, 22, 5], leads to a projection operator which is unwieldy to work with even for very small signals and dictionaries. Referring back to (1), we have x̂_ls = (R^T R)^{-1} R^T vec(DÂ). Now, under the consensus constraint, we must have R_j x̂ = D â_j for all j or, equivalently, R x̂ = vec(DÂ).
Combined with (1) we obtain:

vec(DÂ) = R (R^T R)^{-1} R^T vec(DÂ)
B vec(Â) = R (R^T R)^{-1} R^T B vec(Â)
vec(Â) = (B^T B)^{-1} B^T R (R^T R)^{-1} R^T B vec(Â)
vec(Â) = P vec(Â),

where B = (I_{n×n} ⊗ D), ⊗ is the Kronecker product (see [7] for this result), and P = (B^T B)^{-1} B^T R (R^T R)^{-1} R^T B is a (non-orthogonal) projection matrix. The consensus set is given in terms of A as C_A = {A : (I − P) vec(A) = 0}, so it is again a linear subspace of the coefficients space. However, by inspecting P it is also clear that this is a matrix whose size can be extremely large for practical problems (e.g., n = 10^6, m = 64). Even considering the redundancy implied by the Kronecker product, and the sparsity of P, the number of non-zero elements in P is typically very large, making it infeasible to compute Π(A) = mat(P vec(A)) directly.

The issue with the preceding strategy is that it attempts to project the coefficients matrix A directly onto the consensus constraint. The solution presented here is based on the Linearized ADMM (LADMM), or Uzawa's Method [18], for solving (6); this method constructs the augmented Lagrangian for the constraint Z = DA and then solves a linear approximation of it around Z in each iteration. Global convergence is guaranteed as long as its parameter µ satisfies 0 < µ ≤ λ/‖D‖²₂. The LADMM algorithm is given by:

A^(t+1) ← prox_{µf}[A^(t) − (µ/λ) D^T (D A^(t) − Z^(t) + U^(t))]
Z^(t+1) ← prox_{λg}[D A^(t+1) + U^(t)]
U^(t+1) ← U^(t) + D A^(t+1) − Z^(t+1).

The corresponding LADMM algorithm for PACO is given by:

A^(t+1) ← prox_{µf}[A^(t) − (µ/λ) D^T (D A^(t) − Z^(t) + U^(t))]
Ŷ^(t+1) ← D A^(t+1) + U^(t)
Z^(t+1) ← R[S(Ŷ^(t+1))]
U^(t+1) ← Ŷ^(t+1) − Z^(t+1).

4 Inpainting

The method described below has already been treated in detail in [54, 55].
Below we provide a summary of the main results found in those publications, in particular for image inpainting. Please see [54] for results on audio and video signals. The source code can be found in the IPOL paper [55].

The task here is to infer the missing values of an input image $x$. We require that the samples of the estimated (inpainted) signal $\hat{x}$ coincide exactly with those of $x$ which are known. Let $O$ denote the subset of indexes where $x$ is known. The constraint set is then defined in signal space as

$$\Omega_0 = \{\hat{x} \in \mathcal{X} : \hat{x}_i = x_i,\ i \in O\}.$$

In order to apply PACO, we need to project onto the constraint set $\Omega$ defined in patch space $\mathcal{Y}$. In this particular case, it is easy to show that this can be done by first projecting the patches $\hat{Y}$ onto signal space using the average stitching operator $S$ defined in (1), then projecting the resulting signal estimate $\hat{x}$ onto $\Omega_0$, and finally extracting the patches again using the operator $R$:

$$\mathrm{prox}_{\lambda g}(Z) = \Pi_{C \cap \Omega}(Z) = R\{\Pi_{\Omega_0}[S(Z)]\}.$$

Starting from the standard assumption that the DCT coefficients of image patches follow a heavy-tailed distribution, we seek a solution $\hat{x}$ whose patch coefficients $a_j = D\hat{y}_j$ are most likely under a Laplacian distribution. Here $D$ is the 2D orthogonal DCT type-II transform. The cost function is the total Laplacian negative log-likelihood of the coefficient vectors, where each coefficient is allowed its own Laplacian parameter:

$$f_0(A) = \sum_{j=1}^{n}\sum_{i=1}^{m} \omega_i |a_{i,j}|.$$

The proximal operator of $f_0(A) = \sum_{i,j} \omega_i |a_{i,j}|$ is known as the soft-thresholding operator and is given by (11):

$$\mathcal{T}_{\lambda\omega_i}(a) = \begin{cases} a + \lambda\omega_i, & a < -\lambda\omega_i \\ 0, & -\lambda\omega_i \le a \le \lambda\omega_i \\ a - \lambda\omega_i, & a > \lambda\omega_i. \end{cases} \tag{11}$$

The ADMM solution to the PACO-DCT inpainting problem defined in terms of the above cost function and constraint sets is given in Algorithm 3.

Algorithm 3 Complete PACO-DCT algorithm.
Require: input signal $x$, initial estimate $\hat{x}^{(0)}$, observed-samples index set $O$
  $Z^{(0)} \leftarrow R(\hat{x}^{(0)})$; $U^{(0)} \leftarrow 0$
  repeat
    $A^{(t)} \leftarrow D^\top[Z^{(t)} + U^{(t)}]$  {DCT coefficients}
    $a^{(t+1)}_{i,j} \leftarrow \mathcal{T}_{\lambda\omega_i}(a^{(t)}_{i,j}),\ \forall i,j$  {soft thresholding}
    $Y^{(t+1)} \leftarrow DA^{(t+1)}$  {convert to patches}
    $\hat{x}^{(t+1)} \leftarrow S[Y^{(t+1)}]$  {stitch}
    $\hat{x}^{(t+1)}_i \leftarrow x_i,\ \forall i \in O$  {project onto $\Omega_0$}
    $Z^{(t+1)} \leftarrow R(\hat{x}^{(t+1)})$  {extract patches}
    $U^{(t+1)} \leftarrow U^{(t)} + Y^{(t+1)} - Z^{(t+1)}$
    $t \leftarrow t + 1$
  until convergence
  return inpainted signal $\hat{x}^{(t)}$

Figure 3: Inpainting masks 1 to 4. Masks 2 and 4 are more challenging due to the size of the erasures.

4.1 Sample results

The results below are taken from [55]. The full results can be found at http://iie.fing.edu.uy/~nacho/paco/results. Figure 3 shows the different missing-pixel masks; we tested our algorithms with those four masks on the 24 images of the Kodak dataset. Our results are compared to [6, 19, 46]. We use the PACO-DCT problem parameters that yield the best compromise between RMSE and SSIM: $w = 16$ ($m = 256$) and $s = 2$. The optimization parameters are set to $\lambda = 10$, $\kappa = 0.95$, a maximum of 256 iterations, and a minimum allowable cost-function decrease of $\epsilon = 10^{-5}$. Table 1 summarizes the results of the four methods on the 24 Kodak images in terms of the median, 25th, and 75th percentiles of the RMSE and SSIM measures, again computed only over the missing pixel locations. For simplicity, we only show the results on the luminance channel of these images; similar results are obtained for the other channels. Figure 4 shows the results obtained with these four methods on a particular case, again on the luminance channel.

5 Denoising

In this section we apply the PACO framework to the classic problem of removing i.i.d. Gaussian noise of known variance from images.
The goals here are to demonstrate: a) the simplicity of applying PACO to already existing formulations, and b) that doing so has the potential of improving their results. Please visit http://iie.fing.edu.uy/~nacho/paco/ for source code and the full set of results.

We now provide a brief introduction to the problem, and a few key references to works directly related to our approach. The input is a noisy signal $x = \tilde{x} + \eta$, where $\eta \sim \mathcal{N}(0, \sigma^2)$, and the task is to estimate the clean, unobserved signal $\tilde{x}$ from $x$. Typical approaches involve some combination of a priori information characteristic of

Table 1: Summary of inpainting results on the whole Kodak dataset (luminance channel) in terms of RMSE (smaller is better) and SSIM (higher is better); best results are in bold purple.

  metric:        RMSE                           SSIM
  mask:       1      2      4a     4c       1       2       4a      4c
  percentile 25
  PACO      8.38  15.68   8.15  11.56    0.9886  0.9562  0.9879  0.9513
  [19]     10.44  15.07   9.80  12.92    0.9868  0.9556  0.9863  0.9392
  [46]     12.12  19.64  10.88  14.77    0.9825  0.9418  0.9826  0.9299
  [6]      19.84  23.67  20.68  23.90    0.8791  0.8548  0.8628  0.8403
  percentile 50
  PACO     10.48  17.32  10.05  14.09    0.9906  0.9615  0.9900  0.9567
  [19]     12.46  16.31  12.08  15.94    0.9886  0.9607  0.9888  0.9508
  [46]     14.75  22.48  13.91  17.96    0.9856  0.9463  0.9846  0.9389
  [6]      24.15  29.25  26.21  29.23    0.9103  0.8854  0.9084  0.8777
  percentile 75
  PACO     13.43  20.73  15.20  19.66    0.9931  0.9646  0.9918  0.9618
  [19]     15.95  21.20  17.65  20.90    0.9914  0.9660  0.9910  0.9571
  [46]     18.92  27.20  20.87  24.72    0.9887  0.9494  0.9886  0.9499
  [6]      31.05  37.91  32.12  36.17    0.9414  0.9131  0.9378  0.9087

Figure 4: Visual comparison of the four methods on Kodak #19 and mask #2; please enlarge the page to view details. Top to bottom, left to right: PACO, [6, 19, 46]. All four methods perform well on narrow gaps.
For wider gaps, the results of [19] and PACO-DCT are similar, the major difference being the fence next to the life saver; [19] produces sharper results than PACO, but also more artifacts (e.g., the large gap in the fence, or above the lighthouse window); [46] and [6] produce significant artifacts in these regions (we remind the reader that [6] is, by design, not suited to such large gaps).

clean signals, such as a prior distribution $p(x)$, and the fact that the noise is i.i.d. with zero mean, so that it can be canceled by adding noisy samples together. Prior-based approaches lend themselves naturally to a variational formulation by defining a cost function $f(x) = -\log p(x)$; they usually differ in how they take the noise process into account. For instance, one might impose that the recovered signal is no farther away from $x$ than its expected $\ell_2$ distance from the clean one:

$$\hat{x} = \arg\min_\zeta f(\zeta) \quad \mathrm{s.t.}\ \|x - \zeta\|_2^2 \le \sigma^2.$$

Another popular approach is the classic Maximum a Posteriori (MAP) estimator, which leads to an unconstrained problem. This is the case of works such as [16]:

$$\hat{x} = \arg\min_\zeta f(\zeta) + \frac{1}{2\sigma^2}\|x - \zeta\|_2^2.$$

Besides MAP, other notable successful estimators are Stein's Unbiased Risk Estimator (SURE) [60] and Donoho and Johnstone's shrinkage operator (DJ) [14]. Many patch-based denoising methods translate these same estimators to patch space by considering each patch to be a small noisy image; for example, Donoho-Johnstone shrinkage is applied in [64] to DCT patch coefficients, SURE is employed in [62], and the MAP approach is used in [16, 66]. For instance, the patch-based MAP denoising formulation can be written as:

$$\hat{y} = \arg\min_\zeta \sum_{j=1}^{n}\left[ f(\zeta_j) + \frac{1}{2\sigma^2}\|y_j - \zeta_j\|_2^2 \right],$$

where $\zeta_j$ are the individual patches and $\zeta$ is the whole patches vector over which we minimize the cost.
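To make the separability of the patch-wise MAP formulation above concrete, the following sketch denoises each patch independently under an assumed Laplacian prior on its orthogonal 2D DCT coefficients, for which the proximal step is plain soft-thresholding. The threshold `lam * sigma**2` and the uniform coefficient weights are illustrative choices, not the tuned parameters used in the paper.

```python
import numpy as np

def dct_matrix(m):
    """Orthogonal DCT type-II matrix of size m x m (satisfies D @ D.T = I)."""
    k = np.arange(m)[:, None]
    n = np.arange(m)[None, :]
    D = np.sqrt(2.0 / m) * np.cos(np.pi * (2 * n + 1) * k / (2 * m))
    D[0, :] = np.sqrt(1.0 / m)
    return D

def map_denoise_patches(patches, sigma, lam):
    """Separable per-patch MAP denoising: soft-threshold the 2D DCT
    coefficients of each w x w patch independently (illustrative threshold)."""
    w = patches.shape[-1]
    D = dct_matrix(w)
    t = lam * sigma**2                                     # assumed threshold form
    out = np.empty_like(patches)
    for j, y in enumerate(patches):
        a = D @ y @ D.T                                    # 2D DCT coefficients
        a = np.sign(a) * np.maximum(np.abs(a) - t, 0.0)    # soft-thresholding
        out[j] = D.T @ a @ D                               # back to pixel space
    return out
```

With `lam = 0` the patches pass through unchanged, while a very large `lam` shrinks every coefficient to zero; the point is that, absent a consensus mechanism, each patch is processed in complete isolation.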
Since we want to show how PACO can yield relative improvements over already-existing patch-based denoising methods, our ideal starting point would be a reasonably recent and performant MAP-based method. Ideally, too, the prior should be convex, so that results do not depend on, for example, different initialization strategies. Two good candidates are GMM [65] and EPLL [66]: although neither is convex, both are recent, competitive, and fall within the patch-based MAP denoising framework. Of the two, EPLL has the advantage of using a pre-learned prior which we can use as well, thus allowing us to concentrate on the effects of the consensus constraint. Moreover, EPLL also includes a form of soft consensus between the patches, which also serves to compare the effect of different consensus strategies within the same setting. We now provide a brief introduction to EPLL, and then develop a variant of the algorithm using PACO instead of the soft consensus just mentioned.

5.1 Expected patch log-likelihood

The EPLL framework was introduced in [66]. Formally, the EPLL acronym refers to the common, aforementioned idea of imposing a probabilistic prior on patches and then producing an estimate of the restored signal whose patches are the most likely to occur given the observed, degraded signal. The EPLL algorithm is a variational, patch-based method which uses a pre-learned Gaussian Mixture Model (GMM) such as the one introduced in [65] and assumes an i.i.d. Gaussian model of known variance for the image noise, that is:

$$x = \tilde{x} + \eta,\quad \eta \sim \mathcal{N}(0, \sigma^2),\quad \tilde{x}_j \sim p(\tilde{x}_j),\ \text{with}\ p(\tilde{x}_j) = w_{k_j}\mathcal{N}(0, \Sigma_{k_j}),\ k_j \in \{1, \dots, K\},$$

where $K$ is the number of modes of the GMM. The denoising procedure is essentially a straightforward MAP estimation of the patches followed by standard patch averaging.
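The two operators at play throughout this discussion, patch extraction $R$ and average stitching $S = (R^\top R)^{-1}R^\top$, can be sketched in a few lines of NumPy. This is a minimal dense version for grayscale images, assuming the image size is compatible with the patch grid (`w` is the patch width, `s` the stride); the paper's actual implementation is of course more general.

```python
import numpy as np

def extract_patches(x, w, s):
    """R: extract all w x w patches with stride s from image x."""
    H, W = x.shape
    return np.array([x[i:i + w, j:j + w]
                     for i in range(0, H - w + 1, s)
                     for j in range(0, W - w + 1, s)])

def stitch(patches, shape, w, s):
    """S = (R^T R)^{-1} R^T: average the overlapping patches back into place."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)   # diagonal of R^T R: per-pixel overlap count
    H, W = shape
    k = 0
    for i in range(0, H - w + 1, s):
        for j in range(0, W - w + 1, s):
            acc[i:i + w, j:j + w] += patches[k]
            cnt[i:i + w, j:j + w] += 1.0
            k += 1
    return acc / cnt
```

When the patches agree on all their overlaps (i.e., they are in consensus), averaging does not change them; in particular, `stitch` composed with `extract_patches` is the identity on images.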
The key contribution of [66] is the realization that, after patch averaging, the resulting patches of the denoised image are no longer, individually, the most likely ones given the prior (for example, they may be overly smooth). Therefore, the authors include a quadratic term in the formulation which encourages the overlapped patches to be likely too. The target energy to be minimized is expressed as a function of the denoised image estimate $\hat{x}$ as:

$$\sum_{j=1}^{n} f(R_j\hat{x}) + \frac{1}{2\sigma^2}\sum_{i=1}^{N} o_i\|\hat{x}_i - x_i\|_2^2, \tag{12}$$

where $f(R_j\hat{x}) = -\log p(R_j\hat{x})$ is the log-prior of the $j$-th patch of the image and $o_i$ is the number of patches that the $i$-th sample of $x$ is mapped to. Following our notation conventions, this corresponds to the $i$-th element of the diagonal of $R^\top R$, $o_i = (R^\top R)_{ii}$.

5.2 The EPLL algorithm

Directly minimizing (12) over $\hat{x}$ runs into the same difficulty that was mentioned in Section 2, namely, it involves a huge projection matrix. The authors of EPLL [66] sidestep this problem using a splitting method proposed in [23], which can be seen as a soft version of our proposed hard splitting (9). Auxiliary (split) variables $\{\hat{z}_1, \hat{z}_2, \dots, \hat{z}_n\}$ are defined, each corresponding to a patch of the image, and a new (not equivalent) problem is solved instead:

$$\sum_{j=1}^{n} f(\hat{z}_j) + \frac{1}{2\sigma^2}\sum_{i=1}^{N} o_i\|\hat{x}_i - x_i\|_2^2 + \frac{\beta}{2\sigma^2}\sum_{j=1}^{n}\|R_j\hat{x} - \hat{z}_j\|_2^2, \tag{13}$$

where $\beta$ is a penalty parameter. The energy (13) is then minimized alternately over the split variables $\{\hat{z}_1, \hat{z}_2, \dots, \hat{z}_n\}$ and the main variable $\hat{x}$ for a fixed number of iterations, each with an increasing value of the penalty $\beta$; based on experimentation, the authors of [66] propose six iterations with $\beta$ taking on the values $(1, 4, 8, 16, 32, 64)$.
As $\beta$ is increased, the patches of the image iterate $\hat{x}$ and their split variables are encouraged to match, $R_j\hat{x} \approx \hat{z}_j$; this is a form of soft consensus which converges to exact consensus as $\beta \to \infty$.

We now reformulate the problem in a form which is more consistent with our notations and conventions by writing (13) in terms of the patches vectors of the input (noisy) image, $y = Rx$, and of its denoised estimate, $\hat{y} = R\hat{x}$, with $\tau = \sigma^2/\beta$:

$$\sum_{j=1}^{n}\left[ f(\hat{z}_j) + \frac{1}{2\sigma^2}\|\hat{y}_j - y_j\|_2^2 + \frac{1}{2\tau}\|\hat{y}_j - \hat{z}_j\|_2^2 \right]. \tag{14}$$

Also, for convenience, we define the patches vector $\hat{z} = [\hat{z}_1 \,|\, \hat{z}_2 \,|\, \dots \,|\, \hat{z}_n]$. The alternate minimization proceeds as follows. For fixed $\hat{z}$, (14) is minimized with respect to $\hat{y}$. This is a simple quadratic problem whose solution is a weighted sum of the noisy input image $x$ and the image obtained by average-stitching the auxiliary patches $\hat{z}$:

$$\hat{y}^{(t+1)} = \arg\min_\zeta \frac{1}{2\sigma^2}\|\zeta - Rx\|_2^2 + \frac{1}{2\tau}\|\zeta - \hat{z}^{(t)}\|_2^2 = \frac{x + \beta S(\hat{z}^{(t)})}{1 + \beta}. \tag{15}$$

For fixed $\hat{y}^{(t+1)}$, the solution of (14) with respect to $\hat{z}$ corresponds to the proximal operator of $f(\cdot)$ with parameter $\tau = \sigma^2/\beta$, which is separable in the patches:

$$\hat{z}^{(t+1)}_j = \arg\min_\xi f(\xi) + \frac{1}{2\tau}\|\xi - \hat{y}^{(t+1)}_j\|_2^2, \quad \forall j = 1, \dots, n. \tag{16}$$

The GMM prior is non-convex, and so (16) can only be solved approximately. The EPLL algorithm does so by first assigning each patch $\hat{y}^{(t+1)}_j$ to the Gaussian mode $k_j$ with maximum posterior probability given $\hat{y}^{(t+1)}_j$:

$$k_j = \arg\max_k \log p(k \mid \hat{y}^{(t+1)}_j) = \arg\max_k \left[ \log w_k - \frac{1}{2}\log|\Sigma_k + \tau I| - \frac{1}{2}(\hat{y}^{(t+1)}_j)^\top(\Sigma_k + \tau I)^{-1}\hat{y}^{(t+1)}_j \right], \tag{17}$$

where $\Sigma_k$ is the covariance matrix of the $k$-th Gaussian mode of the GMM (the mean of all the modes is assumed to be $0$).
Given $k_j$, the problem becomes quadratic and the patch estimate can be obtained in closed form:

$$\hat{z}^{(t+1)}_j = \arg\min_\xi \frac{1}{2}\xi^\top\Sigma_{k_j}^{-1}\xi + \frac{1}{2\tau}\|\xi - \hat{y}^{(t+1)}_j\|_2^2 = (\Sigma_{k_j} + \tau I)^{-1}\Sigma_{k_j}\hat{y}^{(t+1)}_j. \tag{18}$$

The EPLL method is summarized in Algorithm 4.

5.3 Learned vs. fixed GMM model

The above EPLL method is defined for a pre-learned GMM model. However, the implementations of both [66] and [30] allow the input model to be further adapted using the standard Expectation-Maximization algorithm for GMM models [65]. The denoising method that we present below, as well as the EPLL results that we obtained using the implementation [30], do not adapt the GMM model. This allows us to perform a comparison of both methods where the only difference lies in the splitting method used ([23] vs. ADMM). This being said, as with EPLL, nothing prevents our algorithm from updating the GMM model during the iterations.

Algorithm 4 Alternate block minimization: the EPLL algorithm
Require: noisy image $x$, noise variance $\sigma^2$
  $t \leftarrow 0$; $\hat{x}^{(0)} \leftarrow x$
  for $\beta = 1, 4, 8, 16, 32, 64$ do
    $\tau^{(t)} = \sigma^2/\beta$
    $\hat{y}^{(t)} \leftarrow R\hat{x}^{(t)}$
    $k_j = \arg\min_k\ -\log w_k + \frac{1}{2}\log|\Sigma_k + \tau^{(t)}I| + \frac{1}{2}(\hat{y}^{(t)}_j)^\top(\Sigma_k + \tau^{(t)}I)^{-1}\hat{y}^{(t)}_j$
    $\hat{z}^{(t+1)}_j \leftarrow (\Sigma_{k_j} + \tau^{(t)}I)^{-1}\Sigma_{k_j}\hat{y}^{(t)}_j$
    $\hat{x}^{(t+1)} \leftarrow (1+\beta)^{-1}\left[ x + \beta (R^\top R)^{-1}R^\top\hat{z}^{(t+1)} \right]$
    $t \leftarrow t + 1$
  end for
  return denoised image $\hat{x}^{(t)}$

5.4 PACO-GMM

By now it should be clear that, for this particular case, EPLL bears many similarities with PACO: it is a variational patch-based method which, albeit indirectly, encourages consensus between the estimated patches. Last but not least, its numerical solution is based on a splitting strategy. We now develop the "PACO equivalent" of the EPLL method, that is, we employ the same prior model and the same approximate method for estimating a denoised patch given a noisy observation, (17) and (18).
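For a single patch, steps (17) and (18) can be sketched as follows (zero-mean modes, as assumed above; `weights` and `covs` stand for the pre-learned GMM parameters, and `tau` plays the role of the penalty/noise variance; the names are illustrative):

```python
import numpy as np

def assign_mode(y, weights, covs, tau):
    """(17): select the GMM mode with maximum posterior probability
    for the (noisy) patch y, under zero-mean Gaussian modes."""
    best, best_score = 0, -np.inf
    for k, (w, S) in enumerate(zip(weights, covs)):
        C = S + tau * np.eye(len(y))
        _, logdet = np.linalg.slogdet(C)
        score = np.log(w) - 0.5 * logdet - 0.5 * y @ np.linalg.solve(C, y)
        if score > best_score:
            best, best_score = k, score
    return best

def wiener_estimate(y, cov, tau):
    """(18): closed-form estimate for the selected Gaussian mode,
    z = (Sigma + tau I)^{-1} Sigma y."""
    m = len(y)
    return np.linalg.solve(cov + tau * np.eye(m), cov @ y)
```

Note the Wiener-like shrinkage in (18): components of the patch lying along small-variance directions of the selected $\Sigma_{k_j}$ are attenuated strongly, while high-variance directions pass almost unchanged.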
Concretely, the PACO formulation of the MAP-denoising problem is given by:

$$\hat{y} = \arg\min_\zeta f(\zeta) + \frac{1}{2\sigma^2}\|\zeta - y\|_2^2 + g(\zeta), \tag{19}$$

where $g(\zeta)$ is the consensus constraint and $y$ are the patches extracted from the input noisy image. It is interesting to note that the splitting in this case can be done in two ways: the quadratic term in (19) can be merged with either $f(\cdot)$ or $g(\cdot)$. In the first case we obtain:

$$(\hat{y}, \hat{z}) = \arg\min_{(\zeta,\xi)} \left[ f(\xi) + \frac{1}{2\sigma^2}\|\xi - y\|_2^2 \right] + g(\zeta) + \frac{1}{2\tau}\|\zeta - \xi\|_2^2, \quad \mathrm{s.t.}\ \zeta = \xi; \tag{20}$$

and in the second case

$$(\hat{y}, \hat{z}) = \arg\min_{(\zeta,\xi)} f(\xi) + \left[ g(\zeta) + \frac{1}{2\sigma^2}\|\zeta - y\|_2^2 \right] + \frac{1}{2\tau}\|\zeta - \xi\|_2^2, \quad \mathrm{s.t.}\ \zeta = \xi. \tag{21}$$

Despite being formally equivalent, (20) and (21) admit different interpretations. In (20), the first cost function is the log-posterior of the patch estimate given the corresponding noisy patch, and the second function is the consensus indicator. In (21), the cost function is the log-prior of the patch estimate, and the quadratic fitting term is absorbed into the consensus indicator function. We now observe that, if we disregard the equality constraint, the solution of (21) for fixed $\xi$ with respect to $\zeta$ can be manipulated to yield a familiar form:

$$\begin{aligned}
\hat{y} &= \arg\min_\zeta g(\zeta) + \frac{1}{2\sigma^2}\|\zeta - y\|_2^2 + \frac{1}{2\tau}\|\zeta - \xi\|_2^2 \\
&= \arg\min_\zeta g(\zeta) + \frac{\tau + \sigma^2}{2\tau\sigma^2}\left\|\zeta - \frac{\tau y + \sigma^2\xi}{\tau + \sigma^2}\right\|_2^2 \\
&= \arg\min_\zeta g(\zeta) + \frac{1 + \beta}{2\sigma^2}\left\|\zeta - \frac{y + \beta\xi}{1 + \beta}\right\|_2^2 \\
&= R\left[ S\left( \frac{y + \beta\xi}{1 + \beta} \right) \right] = \hat{y}_{\mathrm{ls}}.
\end{aligned} \tag{22}$$

The resulting expression (22) is actually equivalent to (15). Furthermore, if we fix $\hat{y}$ in (21) and solve for $\hat{z}$ we arrive exactly at (18)! Based on the above facts, we choose to develop our algorithm along the splitting defined in (21).
Not only does this lead to a solution that is as close as possible to that of EPLL, but it also allows us to use, out of the box, the approximate MAP estimation proposed in [66], which is the most delicate part of the process. Of course, our splitting does not operate directly on $\hat{x}$ and $\hat{y}$ but includes the Lagrangian multiplier; this amounts to solving (22) with $y + u$ in place of $y$, (18) with $z - u$ instead of $z$, and then updating $u \leftarrow u + y - z$.

There is another important aspect in this case, which is the role of the ADMM penalty $\tau$. In standard ADMM this value is kept fixed, but there is no harm in letting it grow with the iterations. However, care must be taken if (as we have done so far) the Lagrangian multiplier $u$ is pre-scaled by $\tau$: the algorithm breaks down if $\tau$ varies during the iterations. Now, for a highly non-convex problem like (19), the choice of $\tau$, and whether to increase it or not during the iterations, may be crucial for converging to a good local minimum. A deep analysis of the numerical implications of this choice is clearly beyond this work. Instead, again, we resort to the heuristics used in EPLL in the following way: the EPLL algorithm is run for 7 fixed iterations, with $\beta = 1, 2, 4, 8, 16, 32, 64$; these values were found empirically to work best for most noise levels. As we saw in our analysis of the EPLL algorithm, the nexus between the ADMM penalty parameter and $\beta$ is given by $\tau = \sigma^2/\beta$. Thus, we follow a similar strategy and apply our algorithm for varying values of $\tau$. In our case, we found that the schedule $\beta = i^2$, $i = 1, 2, 3, 4, 5, 6, \dots$, where $i$ is the iteration number, gives excellent results, on par with or better than EPLL, as we will see below. The resulting PACO-GMM method is given in Algorithm 5. We show the differences between both methods in purple for better comparison.

Algorithm 5 PACO-GMM algorithm. Differences with the EPLL Algorithm 4 are marked in purple.
Require: noisy image $x$, noise variance $\sigma^2$, ADMM parameter $\tau$
  $t \leftarrow 1$; $\hat{y}^{(1)} \leftarrow Rx$; $u^{(1)} \leftarrow 0$
  while $t \le 8$ do
    $\beta = t^2$; $\tau \leftarrow \sigma^2/\beta$
    $k_j = \arg\min_k\ -\log w_k + \frac{1}{2}\log|\Sigma_k + \tau I| + \frac{1}{2}(\hat{y}^{(t)}_j - \tau u^{(t)}_j)^\top(\Sigma_k + \tau I)^{-1}(\hat{y}^{(t)}_j - \tau u^{(t)}_j)$
    $\hat{z}^{(t+1)}_j \leftarrow (\Sigma_{k_j} + \tau I)^{-1}\Sigma_{k_j}(\hat{y}^{(t)}_j - \tau u^{(t)}_j)$
    $\hat{x}^{(t+1)} \leftarrow (1+\beta)^{-1}\left[ x + \beta (R^\top R)^{-1}R^\top(\hat{z}^{(t+1)} + \tau u^{(t)}) \right]$
    $\hat{y}^{(t+1)} \leftarrow R\hat{x}^{(t+1)}$
    $u^{(t+1)} \leftarrow u^{(t)} + (1/\tau)(R\hat{x}^{(t+1)} - \hat{z}^{(t+1)})$
    $t \leftarrow t + 1$
  end while
  return denoised image $\hat{x}$

5.5 Qualitative comparison

As expected, and as can be seen by comparing Algorithm 4 and Algorithm 5, both methods are very similar. The main difference, of course, lies in the use of a hard constraint instead of a soft one, which in turn leads to the presence of the Lagrangian multiplier $u$ in Algorithm 5. Moreover, note that, as the Lagrange multiplier is initialized to $u^{(1)} = 0$, the first iteration of both methods is exactly the same. However, the iterates of the two methods diverge after the first iteration, precisely because $u^{(t)} \neq 0$ for $t > 1$. Despite this divergence, both methods yield very similar results in practice, particularly from the quantitative point of view, as shown in Table 2.

5.6 Variants using other estimators

Both EPLL and our PACO-GMM method estimate the denoised patches using the MAP corresponding to a Gaussian prior on the patches; this corresponds to an $\ell_2$-penalized least squares problem. However, nothing prevents us from applying a different estimator (keeping the mode assignment intact), as done for example in [65]. Concretely, we replace the $\ell_2$ estimator (18), first with an $\ell_1$ prior, and second with the DJ hard thresholding estimator.
In the former case we have

$$\hat{z}^{(t+1)}_j = \arg\min_\xi \sum_i \frac{1}{s_i}\left|(V^\top\xi)_i\right| + \frac{1}{2\tau}\|\xi - \hat{y}^{(t+1)}_j\|_2^2 = V\,\mathcal{T}_{\tau/s}(V^\top\hat{y}^{(t+1)}_j), \tag{23}$$

where $V\,\mathrm{diag}(s)\,V^\top = \Sigma_k$ is the singular value decomposition of the covariance matrix, $s$ is the vector of singular values of that decomposition, and $\mathcal{T}_{\tau/s}(\cdot)$ is the soft-thresholding operator with threshold $\tau/s_i$ applied to component $i$ of the input vector, which we repeat here for convenience:

$$\mathcal{T}_{\tau/s}(\xi)_i = \begin{cases} \xi_i + \tau/s_i, & \xi_i < -\tau/s_i \\ 0, & -\tau/s_i \le \xi_i \le \tau/s_i \\ \xi_i - \tau/s_i, & \xi_i > \tau/s_i. \end{cases}$$

As for the latter, the estimate is obtained as $V H_{3\sigma}(V^\top \hat{y}^{(t+1)}_j)$ with:

$$H_{3\sigma}(\xi)_i = \begin{cases} \xi_i, & |\xi_i| > 3\sigma \\ 0, & |\xi_i| \le 3\sigma, \end{cases} \tag{24}$$

where the threshold $3\sigma$ is the one suggested in [64].

Table 2: Summary of denoising results on the grayscale Kodak images. For each noise level and method we show the 25th, 50th (the median), and 75th percentiles of the RMSE (lower is better). The difference between the medians of all methods in all settings is very small (less than 1 intensity level). In particular, those of PACO-GMM, its $\ell_1$ variant, and EPLL are almost identical. BM3D yields slightly but consistently smaller RMSE, whereas the DJ variant of PACO is consistently worse.

  method        PACO                EPLL                BM3D                l1                  DJ
  percentile   25    50    75      25    50    75      25    50    75      25    50    75      25    50    75
  sigma = 10  4.47  5.13  6.06    4.48  5.12  6.04    4.27  4.96  5.79    4.47  5.15  6.09    4.71  5.39  6.21
  sigma = 20  6.40  7.78  9.18    6.44  7.83  9.17    6.16  7.50  8.98    6.43  7.72  9.12    6.92  8.22  9.97
  sigma = 30  8.25  9.88 12.04    8.25  9.84 11.87    7.50  9.10 11.31    8.25  9.88 12.04    8.69 10.67 12.90

5.7 Quantitative comparison

Table 2 summarizes the results obtained on the Kodak dataset, under three levels of noise, by the three variants of PACO-GMM described above, EPLL, and BM3D. Figure 7 shows a visual comparison of these results on one of the Kodak images.
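Both variants operate in the eigenbasis of the selected mode covariance; for a symmetric positive semidefinite $\Sigma_k$ the SVD above coincides with the eigendecomposition, so `numpy.linalg.eigh` applies. A minimal sketch for a single patch follows; the guard against tiny eigenvalues is an implementation detail assumed here, not part of the formulas.

```python
import numpy as np

def l1_estimate(y, cov, tau):
    """(23): soft-threshold the patch in the eigenbasis V of the mode
    covariance, with per-component threshold tau / s_i."""
    s, V = np.linalg.eigh(cov)            # cov = V diag(s) V^T, s ascending
    a = V.T @ y                           # coordinates in the eigenbasis
    thr = tau / np.maximum(s, 1e-12)      # guard against tiny eigenvalues
    a = np.sign(a) * np.maximum(np.abs(a) - thr, 0.0)
    return V @ a

def dj_estimate(y, cov, sigma):
    """(24): Donoho-Johnstone hard thresholding at 3*sigma in the same basis."""
    s, V = np.linalg.eigh(cov)
    a = V.T @ y
    a[np.abs(a) <= 3.0 * sigma] = 0.0     # kill small components outright
    return V @ a
```

Note that both estimators are invariant to the sign ambiguity of the eigenvectors, since soft- and hard-thresholding are odd functions.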
Despite the similarities in terms of RMSE, the results obtained with the different methods are clearly distinguishable. In particular, the $\ell_1$ and DJ variants exhibit fewer artifacts. Moreover, BM3D, which is better in terms of RMSE, tends to produce strong artifacts in smooth regions.

6 Contrast enhancement

As a final example application we explore a problem where the cost function is not derived from a prior probability model on the patches. This is the case of the Local Contrast Enhancement (LCE) problem: a now-ubiquitous post-processing image enhancement technique which generally provides better results than the older and simpler Global Contrast Enhancement (GCE) methods. The basic method for improving the global contrast of an image is Histogram Equalization (HEQ): given an input image $x$ taking on values in $[0, 1)$, and the empirical cumulative distribution function of its values,

$$F_x(z) = \frac{\sum_{i=1}^{N} \mathbb{1}(x_i < z)}{N},$$

the equalized image $\hat{x}$ is obtained by the mapping $\hat{x}_i \leftarrow F_x^{-1}(x_i)$. An example of this procedure is shown in Figure 8.

6.1 Contrast enhancement cost function

Let $(0 \le z_1 < z_2 < \dots < z_k \le 1)$ be the $k$ different intensity levels occurring in $x$. Histogram equalization can be seen as finding the corresponding levels of the enhanced image, $(\hat{z}_1 < \dots < \hat{z}_k)$, which minimize the following cost function:

$$h(\hat{x}) = \int_0^1 (F_{\hat{x}}(\mu) - \mu)\,d\mu = \sum_{i=1}^{k+1}\int_{\hat{z}_{i-1}}^{\hat{z}_i} (F_{\hat{x}}(\mu) - \mu)\,d\mu,$$

Figure 5: Sample denoising results for a test image and $\sigma = 20$. Each pixel of the color images indicates which of the 200 GMM modes is assigned to the corresponding pixel of the matching image. The top row shows, from left to right: the clean image, its mode assignment map (this is the ground truth of the mode assignment problem), the noisy input, and the ground truth, again, for better comparison.
The second row shows the first iterate of the denoised estimate obtained using PACO, its corresponding mode map, the first iterate of EPLL, and its map. The last row shows the corresponding images for the 7th iterate of both algorithms (which is the last of EPLL). As can be seen, the results are very close and both final maps are close to the ground truth. However, on close inspection, the EPLL result has fewer artifacts and produces a softer cluster assignment.

Figure 6: Comparison of denoising results obtained with the three PACO-GMM variants on a test image and $\sigma = 20$. The first row is identical to that of Figure 5. The second and third rows correspond to the first and last iterates obtained using soft-thresholding ($\ell_1$) and DJ for the estimates, instead of the baseline ($\ell_2$) estimator of EPLL and PACO-GMM. As can be seen, the results are significantly better, both in perceptual quality and in terms of the estimated mode assignment.

Figure 7: Visual comparison of results on the Kodak dataset and all variants tested (high-resolution image; zoom for details). From top to bottom, left to right: noisy image (kodim09 with $\sigma = 30$), PACO-GMM, EPLL, BM3D, PACO-GMM-$\ell_1$ and PACO-GMM-DJ. The results in this case are clearly different between EPLL and PACO-GMM. Furthermore, despite their relatively low performance in terms of RMSE, the $\ell_1$ and DJ variants produce fewer artifacts and overall a nicer result than any of the other methods (BM3D, which consistently yields the lower RMSE, also exhibits more visible artifacts).

Figure 8: Global Histogram Equalization. Left, from top to bottom: original image, its histogram and CDF. Right: enhanced image, its histogram, and its CDF. In this case, simple GCE is enough to produce a significantly better image.

where we have defined $z_0 = 0$ and $z_{k+1} = 1$. Let $(p_1, p_2, \ldots$
$, p_k)$ be the empirical probability distribution of each level of $x$; this vector is not modified during the minimization. We have that

$$F_{\hat{x}}(\mu) = F_{i-1} = \sum_{r=1}^{i-1} p_r, \qquad \hat{z}_{i-1} \le \mu < \hat{z}_i,$$

so that $F_{\hat{x}}$ is constant within each integral and does not depend on $(\hat{z}_1, \hat{z}_2, \dots, \hat{z}_k)$. We define $F_0 = 0$. After integration, we obtain the following cost function on $(\hat{z}_1, \hat{z}_2, \dots, \hat{z}_k)$:

$$h(\hat{x}) = f(\hat{z}_1, \hat{z}_2, \dots, \hat{z}_k) = \sum_{i=1}^{k+1}\int_{\hat{z}_{i-1}}^{\hat{z}_i} (F_{\hat{x}}(\mu) - \mu)\,d\mu = \sum_{i=1}^{k+1}\frac{1}{2}\left[ (F_{i-1} - \hat{z}_i)^2 - (F_{i-1} - \hat{z}_{i-1})^2 \right].$$

The solution to the above problem is obtained as a special case of the more general formulation developed below.

6.2 Local contrast enhancement

In certain situations, global contrast enhancement is not enough. Images such as the leftmost one in Figure 9, where large regions are subject to very different local luminosities, are not effectively improved by this method. As the name implies, local contrast enhancement works by correcting contrast in different regions. Two well-known and very successful methods for this task are Retinex [34] and Automatic Color Enhancement (ACE) [56]; see [24] for a fast, free and open-source implementation of ACE. It should be noted that both Retinex and ACE are not limited to contrast enhancement: Retinex also handles white balance, while ACE handles contrast, white balance, and exposure automatically. The algorithm that we present next is limited to contrast enhancement. Besides these well-established methods, there are many other local contrast enhancement methods which have their own strengths, e.g., being faster or being less sensitive to noise than the above ones [31, 37, 44, 47]. Many such methods are implemented in IPOL,¹ see e.g. [21, 24, 27, 37, 50]; we invite the reader to experiment with them.
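Before moving on to the patch-wise version, plain global HEQ through the empirical CDF can be sketched in a few lines. This uses the common convention of mapping each pixel through the CDF with a non-strict inequality, so the exact values assigned to tied levels differ slightly from the strict-inequality definition of $F_x$ given above.

```python
import numpy as np

def histogram_equalize(x):
    """Global HEQ: map each pixel through the empirical CDF of the image,
    so the output values are (approximately) uniformly spread on (0, 1]."""
    vals, counts = np.unique(x, return_counts=True)
    cdf = np.cumsum(counts) / x.size      # empirical CDF at each distinct level
    idx = np.searchsorted(vals, x)        # index of each pixel's level
    return cdf[idx]                       # same shape as x
```

On an image with $k$ distinct levels, the output takes exactly the $k$ values of the empirical CDF, which flattens the histogram as illustrated in Figure 8.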
In this section we explore the simple idea of applying histogram equalization on a patch-wise basis to achieve local contrast enhancement, while using PACO to obtain a globally-coherent result. We do not claim our proposed method to be better than any of the aforementioned ones. In fact, that should hardly be the case, as many of them rely on carefully studied properties of human perception, whereas our method is conceptually very simple. Because plain histogram equalization is usually very aggressive, we include an additional term in the cost function so that the result is a weighted average between the equalized result and the original patches. Also, in order to include some exposure compensation capability (the so-called gray world principle), we add an extra term which accounts for deviations of the mean patch value from the middle intensity $0.5$; the resulting cost function on a given patch is:

$$h_{\alpha,\beta}(\hat{x}_j) = \alpha\int_0^1 (F_{\hat{x}}(\mu) - \mu)\,d\mu + \frac{\beta}{2}\left\|\hat{x} - \frac{1}{2}\mathbf{1}\right\|_2^2 + \frac{1 - \alpha - \beta}{2}\|x - \hat{x}\|_2^2.$$

The fidelity term $\|x - \hat{x}\|_2^2$ can also be written in terms of the $F_i$ and $\hat{z}_i$, so that the cost function can be expressed solely in terms of the output levels of the image patch:

$$h_{\alpha,\beta}(\hat{z}_1, \hat{z}_2, \dots, \hat{z}_k) = \frac{\alpha}{2}\sum_{i=1}^{k+1}\left[ (F_{i-1} - \hat{z}_i)^2 - (F_{i-1} - \hat{z}_{i-1})^2 \right] + \frac{\beta}{2}\sum_{i=1}^{k} p_i(\hat{z}_i - 0.5)^2 + \frac{1-\alpha-\beta}{2}\sum_{i=1}^{k} p_i(\hat{z}_i - z_i)^2. \tag{25}$$

The solution to the above problem is

$$\hat{x} = \alpha F^{-1}(x) + \beta\left( 0.5 - \frac{1}{N}\sum_i x_i \right)\mathbf{1} + (1 - \alpha - \beta)\,x. \tag{26}$$

It is important to note that (25) includes a fidelity term with respect to the original input image $x$. If we want to preserve its purpose, we cannot let this term depend on an image which is an iterate within the ADMM algorithm, as the original reference would be lost.
Thus, in our implementation, the last term of (25) is always taken with respect to the input image, which is constant throughout the iterations. With this in mind, we now plug $h_{\alpha,\beta}(\cdot)$ into the PACO framework to define our PACO-HEQ problem:

$$\hat{x} = \arg\min_\zeta h_{\alpha,\beta}(\zeta) + g(\zeta).$$

As in the denoising case, the splitting can be done in more than one way. We choose the following splitting, which is consistent with our definition of the problem:

$$(\hat{x}, \hat{z}) = \arg\min_{\zeta,\xi} h_{\alpha,\beta}(\zeta) + g(\xi) + \frac{1}{2\tau}\|\zeta - \xi\|_2^2, \quad \mathrm{s.t.}\ \zeta = \xi.$$

Following the standard procedure we have been using so far, the update of $\hat{y}^{(t+1)}$ given $\hat{z}^{(t)}$ and $u^{(t)}$ is obtained by applying the stitching trick to $\hat{z}^{(t)} - u^{(t)}$. The update of $\hat{z}$ is given by

$$\hat{z}^{(t+1)} \leftarrow \frac{\alpha}{1+\tau}\,F^{-1}\left[ \hat{y}^{(t+1)} - u^{(t)} \right] + \frac{\beta}{1+\tau}\left[ 0.5 - \frac{1}{N}\sum_i (\hat{y}^{(t+1)} - u^{(t)})_i \right]\mathbf{1} + \frac{1 + \tau - \alpha - \beta}{1+\tau}\,x.$$

Notice that the last term depends on the input $x$, not on the iterate. In the patch-wise formulation, it is replaced by the original image patches vector $y$. We summarize these steps in Algorithm 6.

Algorithm 6 PACO-HEQ algorithm.
Require: image $x$, mixing ratios $0 < \alpha, \beta < 1$, ADMM parameter $\tau$
  $y \leftarrow Rx$; $t \leftarrow 0$; $\hat{y}^{(0)} \leftarrow y$; $u^{(0)} \leftarrow 0$
  repeat
    $\hat{z}^{(t+1)}_j \leftarrow \frac{\alpha}{1+\tau}\,F^{-1}\left[ \hat{y}^{(t)}_j - u^{(t)}_j \right] + \frac{\beta}{1+\tau}\left[ 0.5 - \frac{1}{N}\sum_i (\hat{y}^{(t)}_j - u^{(t)}_j)_i \right]\mathbf{1} + \frac{1 + \tau - \alpha - \beta}{1+\tau}\,y_j, \quad j = 1, \dots, n$
    $\hat{y}^{(t+1)} \leftarrow R(R^\top R)^{-1}R^\top(\hat{z}^{(t+1)} + u^{(t)})$
    $u^{(t+1)} \leftarrow u^{(t)} + \hat{y}^{(t+1)} - \hat{z}^{(t+1)}$
    $t \leftarrow t + 1$
  until convergence
  return enhanced image $\hat{x} \leftarrow S(\hat{y}^{(t)})$

¹ Image Processing On-Line, http://ipol.im: an excellent resource of well-documented and implemented image processing algorithms, which are free, open source, and can be experimented with on-line.

6.3 Sample results

Sample results are shown in Figure 9 and Figure 10. Overall, with these parameters, PACO-HEQ does produce a noticeable improvement in local contrast.
It tends to preserve the local intensity more than the other methods, but still produces significantly sharper results. The source code, as well as more examples, can be found at http://iie.fing.edu.uy/~nacho/paco/.

7 Concluding remarks and future work

We have presented PACO, a general framework for solving signal restoration problems under a patch consensus constraint. We have shown that the resulting optimization problem is feasible and easy to solve using the standard ADMM method. In particular, we described a general and practical solution to the patch reprojection step, named the "stitching trick", which sidesteps the major bottleneck of similar methods. We have also shown that the hard consensus constraint allows for the definition of constraints directly in signal space, which may allow for new ways of formulating old problems.

The PACO framework was showcased in three sample applications with good results, both in terms of quality and efficiency. Several other applications are possible; we are already working on a few of them, the results of which will be published elsewhere. Besides new applications, PACO can also be used as a platform for the distributed processing of very large images, by letting different nodes process smaller sub-images and then using the consensus mechanism to merge the results. In fact, distributed computing applications and network optimization are the fields where consensus optimization problems, as described in Section 3, were first analyzed.

Figure 9: Contrast enhancement results on the Iris image. Top row: original image, global HEQ, ACE with $\alpha = 8$. Bottom row: the PACO-HEQ equalization function (26) with $\alpha = 0.25$ and $\beta = 0.05$ applied globally; PACO-HEQ with the same parameters on patches of size $64 \times 64$ and stride 8; PACO-HEQ obtained with the same parameters except $\alpha = 0.125$ (less contrast). Differences are particularly noticeable in the leaves of the plant.
Figure 10: Contrast enhancement results on the Barril image. Left to right: original image, global HEQ, ACE with α = 8, PACO-HEQ with α = 0.25, β = 0.05, patches of size 64 × 64, and a stride of s = 8.

Finally, the patch consensus idea can be extended in various ways. For example, it could be used in combination with the classic mixture of experts, or with boosting, as a way to combine the output of the experts in a meaningful way. Another interesting extension would be a true multiscale patch consensus framework (our current framework readily allows for schemes such as dilated convolutions [63]).

References

[1] C. Aguerrebere, A. Almansa, J. Delon, Y. Gousseau, and P. Musé, A Bayesian hyperprior approach for joint image denoising and interpolation, with an application to HDR imaging, IEEE Transactions on Computational Imaging, 3 (2017), pp. 633–646, https://doi.org/10.1109/TCI.2017.2704439.
[2] M. Aharon, M. Elad, and A. Bruckstein, K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation, IEEE Transactions on Signal Processing, 54 (2006), pp. 4311–4322, https://doi.org/10.1109/TSP.2006.881199.
[3] M. H. Alkinani and M. R. El-Sakka, Patch-based models and algorithms for image denoising: a comparative review between patch-based image denoising methods for additive noise reduction, EURASIP Journal on Image and Video Processing, 2017 (2017).
[4] P. Arias, G. Facciolo, V. Caselles, and G. Sapiro, A variational framework for exemplar-based image inpainting, Int. J. Comput. Vision, 93 (2011), pp. 319–347, https://doi.org/10.1007/s11263-010-0418-7.
[5] D. Batenkov, Y. Romano, and M. Elad, On the Global-Local Dichotomy in Sparsity Modeling, Springer International Publishing, Cham, 2017, pp.
1–53, https://doi.org/10.1007/978-3-319-69802-1_1.
[6] F. Bornemann and T. März, Fast image inpainting based on coherence transport, J. Math. Imaging Vis., 1 (2007), pp. 259–278.
[7] K. B. Petersen and M. S. Pedersen, The Matrix Cookbook, tech. report, Technical University of Denmark, 2012.
[8] H. Bristow, A. Eriksson, and S. Lucey, Fast convolutional sparse coding, in 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 391–398, https://doi.org/10.1109/CVPR.2013.57.
[9] A. Buades, B. Coll, and J. Morel, A review of image denoising algorithms, with a new one, Multiscale Modeling & Simulation, 4 (2005), pp. 490–530, https://doi.org/10.1137/040616024.
[10] T. F. Chan and J. Shen, Image Processing and Analysis: Variational, PDE, Wavelet, and Stochastic Methods, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2005.
[11] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, Image denoising by sparse 3-D transform-domain collaborative filtering, IEEE Transactions on Image Processing, 16 (2007), pp. 2080–2095, https://doi.org/10.1109/TIP.2007.901238.
[12] I. Daubechies, The wavelet transform, time-frequency localization and signal analysis, IEEE Transactions on Information Theory, 36 (1990), pp. 961–1005, https://doi.org/10.1109/18.57199.
[13] Z. Dengwen and S. Xiaoliu, Image denoising using weighted averaging, in 2009 WRI International Conference on Communications and Mobile Computing, vol. 1, Jan 2009, pp. 400–403, https://doi.org/10.1109/CMC.2009.64.
[14] D. L. Donoho and I. M. Johnstone, Ideal spatial adaptation by wavelet shrinkage, Biometrika, 81 (1994), pp.
425–455, https://doi.org/10.1093/biomet/81.3.425.
[15] J. Eckstein and D. Bertsekas, On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators, Mathematical Programming, 55 (1992), pp. 293–318, https://doi.org/10.1007/BF01581204.
[16] M. Elad and M. Aharon, Image denoising via sparse and redundant representations over learned dictionaries, IEEE Transactions on Image Processing, 15 (2006).
[17] K. Engan, S. Aase, and J. Husoy, Multi-frame compression: Theory and design, Signal Processing, 80 (2000), pp. 2121–2140.
[18] E. Esser, X. Zhang, and T. Chan, A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science, SIAM Journal on Imaging Sciences, 3 (2010), pp. 1015–1046.
[19] V. Fedorov, G. Facciolo, and P. Arias, Variational Framework for Non-Local Inpainting, Image Processing On Line, 5 (2015), pp. 362–386, https://doi.org/10.5201/ipol.2015.136.
[20] J. Feng, L. Song, X. Huo, X. Yang, and W. Zhang, An optimized pixel-wise weighting approach for patch-based image denoising, IEEE Signal Processing Letters, 22 (2015), pp. 115–119, https://doi.org/10.1109/LSP.2014.2350032.
[21] S. Ferradans, R. Palma-Amestoy, and E. Provenzi, An Algorithmic Analysis of Variational Models for Perceptual Local Contrast Enhancement, Image Processing On Line, 5 (2015), pp. 219–233, https://doi.org/10.5201/ipol.2015.131.
[22] M. A. T. Figueiredo, Synthesis versus analysis in patch-based image priors, in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2017, pp. 1338–1342, https://doi.org/10.1109/ICASSP.2017.7952374.
[23] D. Geman and C. Yang, Nonlinear image recovery with half-quadratic regularization, IEEE Trans.
Image Process., 4 (1995), pp. 932–946.
[24] P. Getreuer, Automatic Color Enhancement (ACE) and its Fast Implementation, Image Processing On Line, 2 (2012), pp. 266–277, https://doi.org/10.5201/ipol.2012.g-ace.
[25] P. Getreuer, Total Variation Inpainting using Split Bregman, Image Processing On Line, 2 (2012), pp. 147–157, https://doi.org/10.5201/ipol.2012.g-tvi.
[26] V. Gnann and J. Becker, Signal reconstruction from multiresolution STFT magnitudes with mutual initialization, in Proceedings of the AES International Conference, 2012, pp. 274–279.
[27] J. G. Gomila and J. L. Lisani, Local Color Correction, Image Processing On Line, 1 (2011), pp. 260–280, https://doi.org/10.5201/ipol.2011.gl_lcc.
[28] M. M. Goodwin, Realization of arbitrary filters in the STFT domain, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009, pp. 353–356.
[29] O. G. Guleryuz, Weighted averaging for denoising with overcomplete dictionaries, IEEE Transactions on Image Processing, 16 (2007), pp. 3020–3034, https://doi.org/10.1109/TIP.2007.908078.
[30] S. Hurault, T. Ehret, and P. Arias, EPLL: An Image Denoising Method Using a Gaussian Mixture Model Learned on a Large Set of Patches, Image Processing On Line, 8 (2018), pp. 465–489, https://doi.org/10.5201/ipol.2018.242.
[31] D. Jobson, Z. Rahman, and G. Woodell, A multiscale retinex for bridging the gap between color images and the human observation of scenes, IEEE Trans. Image Process., 6 (1997), pp. 965–976, https://doi.org/10.1109/83.597272.
[32] N. Kawai, T. Sato, and N. Yokoya, Image inpainting considering brightness change and spatial locality of textures and its evaluation, in Advances in Image and Video Technology, T. Wada, F. Huang, and S. Lin, eds., Berlin, Heidelberg, 2009, Springer Berlin Heidelberg, pp. 271–282.
[33] C. Kervrann and J.
Boulanger, Optimal spatial adaptation for patch-based image denoising, IEEE Trans. Image Process., 15 (2006), pp. 2866–2878.
[34] E. Land and J. McCann, Lightness and retinex theory, Journal of the Optical Society of America, 61 (1971), pp. 1–11, https://doi.org/10.1364/JOSA.61.000001.
[35] J. Le Roux, H. Kameoka, N. Ono, and S. Sagayama, Fast signal reconstruction from magnitude STFT spectrogram based on spectrogram consistency, in 13th International Conference on Digital Audio Effects, DAFx 2010 Proceedings, 2010.
[36] M. Lewicki and B. Olshausen, Probabilistic framework for the adaptation and comparison of image codes, J. Opt. Soc. Am. A, 16 (1999), pp. 1587–1601, http://josaa.osa.org/abstract.cfm?URI=josaa-16-7-1587.
[37] J. L. Lisani, A. B. Petro, and C. Sbert, Color and Contrast Enhancement by Controlled Piecewise Affine Histogram Equalization, Image Processing On Line, 2 (2012), pp. 243–265, https://doi.org/10.5201/ipol.2012.lps-pae.
[38] Z. Liu, L. Chen, and J. Yang, An image reconstruction algorithm using patch-based locally optimal Wiener filtering, Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 36 (2014), pp. 2556–2562.
[39] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, Online learning for matrix factorization and sparse coding, Journal of Machine Learning Research, 11 (2010), pp. 19–60.
[40] J. Mairal, M. Elad, and G. Sapiro, Sparse representation for color image restoration, IEEE Transactions on Image Processing, 17 (2008), pp. 53–69.
[41] J. Mairal, G. Sapiro, and M. Elad, Learning multiscale sparse representations for image and video restoration, Multiscale Modeling and Simulation, 7 (2008), pp. 214–241.
[42] S. G.
Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 11 (1989), pp. 674–693, https://doi.org/10.1109/34.192463.
[43] J. A. Moorer, A note on the implementation of audio processing by Short-Term Fourier Transform, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, vol. 2017-October, 2017, pp. 156–159.
[44] N. Moroney, Local color correction using non-linear masking, in IS&T/SID Eighth Color Imaging Conference, 2000, pp. 108–111.
[45] G. Motta, E. Ordentlich, I. Ramirez, G. Seroussi, and M. J. Weinberger, The iDUDE framework for grayscale image denoising, IEEE Transactions on Image Processing, 20 (2011), pp. 1–21, https://doi.org/10.1109/TIP.2010.2053939.
[46] A. Newson, A. Almansa, Y. Gousseau, and P. Pérez, Non-Local Patch-Based Image Inpainting, Image Processing On Line, 7 (2017), pp. 373–385, https://doi.org/10.5201/ipol.2017.189.
[47] R. Palma-Amestoy, E. Provenzi, M. Bertalmío, and V. Caselles, A perceptually inspired variational framework for color enhancement, IEEE Transactions on Pattern Analysis and Machine Intelligence, 31 (2009), pp. 458–474, https://doi.org/10.1109/TPAMI.2008.86.
[48] V. Papyan and M. Elad, Multi-scale patch-based image restoration, IEEE Transactions on Image Processing, 25 (2016), pp. 249–261.
[49] N. Parikh and S. P. Boyd, Proximal algorithms, Foundations and Trends in Optimization, 1 (2014), pp. 127–239, https://doi.org/10.1561/2400000003.
[50] A. B. Petro, C. Sbert, and J.-M. Morel, Multiscale Retinex, Image Processing On Line, (2014), pp. 71–88, https://doi.org/10.5201/ipol.2014.107.
[51] G. Peyré, Image processing with nonlocal spectral bases, Multiscale Modeling & Simulation, 7 (2008), pp.
703–730, https://doi.org/10.1137/07068881X.
[52] G. Peyré, Manifold models for signals and images, Computer Vision and Image Understanding, 113 (2009), pp. 249–260, https://doi.org/10.1016/j.cviu.2008.09.003.
[53] I. Ramirez and G. Sapiro, Universal regularizers for robust sparse coding and modeling, IEEE Transactions on Image Processing, 21 (2012), pp. 3850–3864, https://doi.org/10.1109/TIP.2012.2197006.
[54] I. Ramírez and I. Hounie, PACO and PACO-DCT: Patch consensus and its application to inpainting, in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 5775–5779, https://doi.org/10.1109/ICASSP40776.2020.9053953.
[55] I. Ramírez and I. Hounie, Image Inpainting using Patch Consensus and DCT Priors, Image Processing On Line, 11 (2021), pp. 1–17, https://doi.org/10.5201/ipol.2021.286.
[56] A. Rizzi, C. Gatta, and D. Marini, A new algorithm for unsupervised global and local color correction, Pattern Recognition Letters, 24 (2003), pp. 1663–1677, https://doi.org/10.1016/S0167-8655(02)00323-9. Colour Image Processing and Analysis. First European Conference on Colour in Graphics, Imaging, and Vision (CGIV 2002).
[57] Y. Romano and M. Elad, Patch-disagreement as a way to improve K-SVD denoising, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2015, pp. 1280–1284, https://doi.org/10.1109/ICASSP.2015.7178176.
[58] A. Saint-Dizier, J. Delon, and C. Bouveyron, A unified view on patch aggregation, Journal of Mathematical Imaging and Vision, 62 (2020), pp. 149–168, https://doi.org/10.1007/s10851-019-00921-z.
[59] O. G. Sezer and Y. Altunbasak, Weighted average denoising with sparse orthonormal transforms, in 2009 16th IEEE International Conference on Image Processing (ICIP), Nov 2009, pp. 3849–3852, https://doi.org/10.1109/ICIP.2009.5414056.
[60] C. M. Stein, Estimation of the Mean of a Multivariate Normal Distribution, The Annals of Statistics, 9 (1981), pp. 1135–1151, https://doi.org/10.1214/aos/1176345632.
[61] B. Wang, T. Lu, and Z. Xiong, Adaptive boosting for image denoising: Beyond low-rank representation and sparse coding, in 2016 23rd International Conference on Pattern Recognition (ICPR), Dec 2016, pp. 1400–1405, https://doi.org/10.1109/ICPR.2016.7899833.
[62] Y.-Q. Wang and J.-M. Morel, SURE guided Gaussian mixture image denoising, SIAM Journal on Imaging Sciences, 6 (2013), pp. 999–1034, https://doi.org/10.1137/120901131.
[63] F. Yu and V. Koltun, Multi-scale context aggregation by dilated convolutions, in 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings, Y. Bengio and Y. LeCun, eds., 2016.
[64] G. Yu and G. Sapiro, DCT Image Denoising: a Simple and Effective Image Denoising Algorithm, Image Processing On Line, 1 (2011), pp. 292–296, https://doi.org/10.5201/ipol.2011.ys-dct.
[65] G. Yu, G. Sapiro, and S. Mallat, Solving inverse problems with piecewise linear estimators: From Gaussian mixture models to structured sparsity, IEEE Transactions on Image Processing, 21 (2012), pp. 2481–2499.
[66] D. Zoran and Y. Weiss, From learning models of natural image patches to whole image restoration, in 2011 International Conference on Computer Vision, Nov 2011, pp. 479–486, https://doi.org/10.1109/ICCV.2011.6126278.
