Multiresolution Representations for Piecewise-Smooth Signals on Graphs


Authors: Siheng Chen, Aarti Singh, Jelena Kovačević

Abstract—What is a mathematically rigorous way to describe the taxi-pickup distribution in Manhattan, or the profile information in online social networks? A deep understanding of how to represent those data not only provides insights into the data properties, but also benefits many subsequent processing procedures, such as denoising, sampling, recovery and localization. In this paper, we model those complex and irregular data as piecewise-smooth graph signals and propose a graph dictionary to effectively represent them. We first propose the graph multiresolution analysis, which provides a principle for designing good representations. We then propose a coarse-to-fine approach, which iteratively partitions a graph into two subgraphs until we reach individual nodes. This approach efficiently implements the graph multiresolution analysis, and the induced graph dictionary promotes sparse representations for piecewise-smooth graph signals. Finally, we validate the proposed graph dictionary on two tasks: approximation and localization. The empirical results show that the proposed graph dictionary outperforms eight other representation methods on six datasets, including traffic networks, social networks and point cloud meshes.

Index Terms—Signal processing on graphs, signal representations, graph dictionary

I. INTRODUCTION

Today's data is being generated from a diversity of sources, all residing on complex and irregular structures; examples include profile information in social networks, stimuli in brain connectivity networks and traffic flow in city street networks [1], [2].
The need for understanding and analyzing such complex data has led to the birth of signal processing on graphs [3], [4], which generalizes classical signal processing tools to data supported on graphs; the data is the graph signal, indexed by the nodes of the underlying graph.

Modeling real-world data using piecewise-smooth graph signals. In urban settings, the intersections around shopping areas will exhibit homogeneous mobility patterns and life-style behaviors, while the intersections around residential areas will exhibit different, yet still homogeneous, mobility patterns and life-style behaviors. Similarly, in social networks, within a given social circle users' profiles tend to be homogeneous, while within a different social circle they will be different, yet still homogeneous. We can model data generated from both cases as piecewise-smooth graph signals, as they capture large variations between pieces and small variations within pieces. Figure 1 illustrates how a piecewise-smooth signal model can be used to approximate the taxi-pickup distribution in Manhattan and users' profile information on Facebook (hard thresholding is applied for better visualization).

The piecewise-smooth signal model has been intensely studied and widely used in classical signal processing, image processing and computer graphics [5], [6], [7]. Multiresolution analysis and splines are standard representation tools to analyze piecewise-smooth signals [8].

Fig. 1: Piecewise-smooth graph signals approximate irregular, nonsmooth graph signals by capturing both large variations at boundaries and small variations within pieces. (a) Taxi-pickup distribution in Manhattan (13,679 intersections). (b) Piecewise-smooth approximation (50 coefficients). (c) Profile information on Facebook (277 users). (d) Piecewise-smooth approximation (5 coefficients).
The idea of using piecewise-smooth graph signals is not novel either; in [3], the authors show that graph wavelets can capture discontinuities in a piecewise-smooth graph signal, and in [9], the authors propose denoising for piecewise-polynomial graph signals through minimizing a generalized total-variation term. There are two gaps in the previous literature that we address here: (1) define piecewise-smooth graph signals precisely and find appropriate representations; and (2) provide theoretical results on sparse representations for piecewise-smooth graph signals.

Representations for piecewise-smooth graph signals. Signal representations are at the heart of most signal processing techniques [10], allowing for targeted signal models for tasks such as denoising, compression, sampling, recovery and detection [11]. Our aim in this paper is to find an appropriate and efficient approach to represent piecewise-smooth graph signals. We define a mathematical model for piecewise-smooth graph signals and propose a graph dictionary to sparsely represent them. Inspired by classical signal processing, we generalize the idea of multiresolution analysis to graphs as a representation tool for piecewise-smooth signals [12]. We implement the graph multiresolution analysis by using a coarse-to-fine decomposition approach; that is, we iteratively partition a graph into two connected subgraphs until we reach individual nodes. We show that the process leads to an efficient construction of a graph wavelet basis that satisfies the graph multiresolution analysis, and that the induced graph dictionary promotes sparse representations for piecewise-smooth graph signals. We validate the proposed graph dictionaries on two tasks: approximation and localization. We show that the proposed graph dictionaries outperform eight other representation methods on six graphs, including traffic networks, a citation network, social networks and point cloud meshes.
Contributions. The main contributions of this paper are:
• A novel and explicit definition of piecewise-smooth graph signals; see Section III.
• A novel graph multiresolution analysis that is implemented by a coarse-to-fine decomposition approach; see Section IV.
• A novel graph dictionary that promotes sparsity for piecewise-smooth graph signals and is well-structured and storage-friendly; see Section V.
• An extensive empirical study on a number of real-world graphs, including traffic networks, citation networks, social networks and point cloud meshes; see Section VI.

Outline of the paper. Section II reviews the background material; Section III defines piecewise-smooth graph signals; Section IV proposes the graph multiresolution analysis, which provides a principled way to represent graph signals; Section V shows that the proposed graph dictionary promotes sparse representations for piecewise-smooth graph signals. We validate the proposed methods in Section VI and conclude in Section VII.

II. BACKGROUND

We briefly introduce the framework of graph signal processing. We then overview related work on graph signal representation.

A. Graph Signal Processing

Graphs. Let $G = (\mathcal{V}, \mathcal{E}, A)$ be an undirected, irregular and non-negatively weighted graph, where $\mathcal{V} = \{v_i\}_{i=1}^N$ is the set of nodes (vertices), $\mathcal{E} = \{e_i\}_{i=1}^E$ is the set of weighted edges and $A \in \mathbb{R}^{N \times N}$ is the weighted adjacency matrix whose element $A_{i,j}$ is the edge weight between the $i$th and the $j$th nodes. Let $D \in \mathbb{R}^{N \times N}$ be the diagonal degree matrix with $D_{i,i} = \sum_j A_{i,j}$. The graph Laplacian matrix is $L = D - A \in \mathbb{R}^{N \times N}$, which is a second-order difference operator on graphs [13]. Let $\Delta \in \mathbb{R}^{E \times N}$ be the graph incidence matrix; its rows correspond to edges. If $e_i$ is an edge that connects the $j$th node to the $k$th node ($j < k$), the elements of the $i$th row of $\Delta$ are
$$\Delta_{i,\ell} = \begin{cases} -\sqrt{A_{j,k}}, & \ell = j; \\ \sqrt{A_{j,k}}, & \ell = k; \\ 0, & \text{otherwise}. \end{cases}$$
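To make these definitions concrete, here is a minimal numpy sketch on a hypothetical 4-node path graph with unit edge weights; it builds $A$, $D$, $L$ and the incidence matrix $\Delta$, and checks the identity $\Delta^T \Delta = L$ that relates the first- and second-order difference operators.

```python
import numpy as np

# Hypothetical toy graph: 4 nodes on a path, unit weights.
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
D = np.diag(A.sum(axis=1))        # diagonal degree matrix
L = D - A                         # graph Laplacian

# Incidence matrix: one row per edge (j, k) with j < k,
# entry -sqrt(A[j,k]) at column j and +sqrt(A[j,k]) at column k.
edges = [(j, k) for j in range(4) for k in range(j + 1, 4) if A[j, k] > 0]
Delta = np.zeros((len(edges), 4))
for i, (j, k) in enumerate(edges):
    w = np.sqrt(A[j, k])
    Delta[i, j], Delta[i, k] = -w, w

# The first-order difference operator squares to the Laplacian.
assert np.allclose(Delta.T @ Delta, L)
```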
The graph incidence matrix measures the first-order difference and $\Delta^T \Delta = L$.

Graph signals. Given a fixed ordering of nodes, we assign a signal coefficient to each node; a graph signal is then defined as a vector, $x = [x_1, x_2, \cdots, x_N]^T \in \mathbb{R}^N$, with $x_n$ the signal coefficient corresponding to the node $v_n$. We say that the graph signal is smooth when adjacent nodes have similar signal coefficients [3], [14].

• Smoothness in the vertex domain. Consider $\Delta x \in \mathbb{R}^E$ as an edge signal representing the first-order difference of $x$. The $i$th element of $\Delta x$,
$$(\Delta x)_i = \sqrt{A_{j,k}} \, (x_k - x_j),$$
assigns the difference between two adjacent signal coefficients to the $i$th edge, which connects the $j$th node to the $k$th node ($j < k$). When the variation $\|\Delta x\|_2^2 = x^T L x$ is small, the differences are small and $x$ is smooth.

• Smoothness in the spectral domain. We typically call this type of smoothness bandlimitedness [15], [16]. Let the graph Fourier basis $V \in \mathbb{R}^{N \times N}$ be the eigenvector matrix of $L$,¹ with $L = V \Lambda V^T$, where the diagonal elements of $\Lambda$ are arranged in ascending order. The graph spectrum is then $\hat{x} = V^T x \in \mathbb{R}^N$. Let $V_{(K)} \in \mathbb{R}^{N \times K}$ be the first $K$ columns of $V$, which span the lowpass bandlimited subspace. For $K \ll N$, when $\|V_{(K)}^T x\|_2^2 / \|x\|_2^2 = 1$, the energy concentrates in the lowpass band and $x$ is smooth.

In practice, graph signals may not satisfy the smoothness constraints above (as shown in Figure 1), which serves as motivation to further develop graph signal models and tools to represent them, the topic of this paper.

B. Related Works

Ideally, a good representation should be efficient, have some structure such as orthogonality and promote sparse representations for graph signals (at least in some subspaces). To deal with large-scale graphs, we may also need the representation itself to be sparse and storage-friendly. We categorize the previous work on graph signal representations as follows:²
1) Graph Filter Banks: Here, representations are constructed by generalizing classical filter banks to graphs. [17], [18] design critically-sampled filter banks via bipartite subgraph decomposition; [19], [20] design critically-sampled filter banks for circulant graphs; [21] designs oversampled filter banks; [22] designs iterative filter banks; [23] designs critically-sampled filter banks via community detection; and [24] designs each channel via sampling and recovery.

2) Graph Vertex-Domain Designs: Here, representations are constructed by designing each basis vector (atom) in the graph vertex domain. [25] designs spatial wavelets via neighborhoods; and [26], [27], [28], [29] consider coarse-to-fine approaches.

3) Graph Spectral-Domain Designs: Here, representations are constructed by designing graph filters in the graph spectral domain. [30] designs graph wavelets; [31], [32] design tight frames; and [33] designs the windowed graph Fourier transform by generalizing translation.

4) Graph Diffusion-Based Designs: Here, representations are constructed based on polynomials of the transition matrix. [34] designs diffusion wavelets; and [35] designs diffusion wavelet packets.

5) Graph Dictionary Learning: Here, representations are constructed by learning from the given graph signals. The representations in the other branches depend on the graph structure only; [36], [37] learn graph dictionaries that provide smoothness for given graph signals, which is adaptive but biased to the observed graph signals.

In this paper, we consider connecting graph filter banks and graph vertex-domain designs. Similarly to [23], [26], [27], [28], [29], the proposed representation considers a coarse-to-fine decomposition in the graph vertex domain.

¹The graph Fourier basis can also be defined based on the adjacency matrix [4].
²The categorization follows https://www.macalester.edu/~dshuman1/Talks/Shuman_GSP_2016.pdf.
Our goal is to implement the graph multiresolution analysis, where the coarse-to-fine approach is more efficient and straightforward than the local-to-global approach (graph filter banks). We further show that the proposed representation is efficient, orthogonal and storage-friendly; it also satisfies the graph multiresolution analysis and promotes sparsity for piecewise-constant and piecewise-smooth graph signals.

The representations of smooth graph signals have been thoroughly studied in the graph spectral domain [38], [39]. In this paper, we emphasize the representations of piecewise-smooth graph signals in the graph vertex domain. As a continuous counterpart of graph signal representations, some other works study manifold data representations [40], [41], [42].

III. GRAPH SIGNAL MODELS

Piecewise-smooth graph signals can model a number of real-world cases, as they capture large variations between pieces and small variations within pieces. In this section, we mathematically define piecewise-smooth graph signals. We start with piecewise-constant graph signals, an important subclass, and then extend them to piecewise-smooth graph signals.

A. Piecewise-Constant Model

In classical signal processing, a piecewise-constant signal is a signal that is locally constant over connected regions separated by lower-dimensional boundaries. Such a signal is often related to step functions, square waves and Haar wavelets and is widely used in image processing [10]. Piecewise-constant graph signals have been used in many applications without having been explicitly defined; for example, in community detection, community labels form a piecewise-constant graph signal for a given social network, and in semi-supervised learning, classification labels form a piecewise-constant graph signal for a graph constructed from the dataset.
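As a toy illustration of the community-detection case above (a hypothetical 6-node graph with two communities and one cross edge), community labels form a signal whose first-order difference $\Delta x$ is nonzero only on edges across communities:

```python
import numpy as np

# Hypothetical example: two 3-node communities joined by a single edge.
A = np.zeros((6, 6))
for j, k in [(0, 1), (0, 2), (1, 2),     # community {0, 1, 2}
             (3, 4), (3, 5), (4, 5),     # community {3, 4, 5}
             (2, 3)]:                    # one cross edge
    A[j, k] = A[k, j] = 1.0

# Incidence matrix, one row per edge (j, k) with j < k.
edges = [(j, k) for j in range(6) for k in range(j + 1, 6) if A[j, k] > 0]
Delta = np.zeros((len(edges), 6))
for i, (j, k) in enumerate(edges):
    Delta[i, j], Delta[i, k] = -np.sqrt(A[j, k]), np.sqrt(A[j, k])

# Community labels as a piecewise-constant signal: constant within each piece.
x = np.array([1., 1., 1., 5., 5., 5.])
cut = np.count_nonzero(Delta @ x)        # nonzero only on the single cross edge
```

Here `cut` counts the edges whose endpoints carry different labels, which is exactly the cut cost discussed next.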
While smooth graph signals emphasize slow transitions, piecewise-constant graph signals emphasize fast transitions (corresponding to boundaries) and localization in the vertex domain (corresponding to signals being nonzero in a local neighborhood).

We define a piecewise-constant graph signal through the concept of a piece, which has been implicitly used before [43], [44]. The definition is intuitive: a piecewise-constant graph signal partitions a graph into several pieces; within each piece, the signal coefficients are constant.

Definition 1. Let $S$ be a subset of the node set $\mathcal{V}$. We call $S$ a piece when its corresponding subgraph $G_S$ is connected.

We can represent a piece $S$ by using a one-piece graph signal, $1_S \in \mathbb{R}^N$. A piecewise-constant graph signal is a linear combination of several one-piece graph signals.

Definition 2. Let $\{S_c\}_{c=1}^C$ be a partition of the node set $\mathcal{V}$, where each $S_c$ is a piece. A graph signal $x \in \mathbb{R}^N$ is piecewise-constant with $C$ pieces when
$$x = \sum_{c=1}^C a_c 1_{S_c},$$
where $a_c \in \mathbb{R}$ is the piece coefficient for the piece $S_c$. Denote this class by PC($C$).

For most graph signals, two adjacent signal coefficients are typically not the same; that is, $\|\Delta x\|_0$ may be close to the number of edges $E$. For a piecewise-constant graph signal $x$ with a small number of pieces, however, $\|\Delta x\|_0$ is usually small: within a piece, the first-order difference is 0, while across pieces, $\|\Delta x\|_0$ is the cut cost to separate the pieces. For example, in an unweighted graph,
$$\|\Delta x\|_0 \le \#\{\text{edges across pieces } \{S_c\}_{c=1}^C\} \quad \text{for all } x \in \text{PC}(C),$$
where the equality is achieved when all $a_c$ are different. Thus, $\|\Delta x\|_0 \ll E$ when $C \ll N$; see a quick summary in Table I.

Table I: Property summary of some typical graph signals.

  Graph signal model                ‖Δx‖₀   ‖Δx‖₂
  Arbitrary graph signal            O(E)    large
  Smooth graph signal               O(E)    small
  Piecewise-constant graph signal   O(1)    large

B.
Piecewise-Smooth Model

Piecewise-smooth signals are widely used to represent images, where edges are captured by the piece boundaries and smooth content is captured by the pieces themselves. Piecewise-smooth graph signals arise naturally from piecewise-constant graph signals, with more flexibility to model real-world data, such as the taxi-pickup distribution supported on city-street networks and 3D point cloud information supported on meshes.

We define a piecewise-smooth graph signal as a generalization of a piecewise-constant graph signal. For a piecewise-constant signal, signal coefficients within a piece are constant; for a piecewise-smooth signal, signal coefficients within a piece form a smooth graph signal over that piece. Let $S$ be a piece, $G_S$ the corresponding subgraph and $V_S \in \mathbb{R}^{|S| \times |S|}$ the corresponding graph Fourier basis. Given a graph signal $x \in \mathbb{R}^N$, $x_S \in \mathbb{R}^{|S|}$ denotes the signal coefficients supported on $G_S$ and $x_{\bar{S}} \in \mathbb{R}^{N - |S|}$ denotes the remaining signal coefficients.

Definition 3. A graph signal $x \in \mathbb{R}^N$ is localized and bandlimited over the piece $S$ with bandwidth $K$ ($K \le |S|$) when $x_{\bar{S}} = 0$ and
$$V_{S(K)} V_{S(K)}^T x_S = x_S \in \mathbb{R}^{|S|},$$
where $V_{S(K)} \in \mathbb{R}^{|S| \times K}$ contains the first $K$ columns of the graph Fourier basis $V_S$.

Definition 3 shows a class of graph signals that is localized in both the vertex and the graph spectral domains. Since these signals are bandlimited over a piece, we consider them lowpass and smooth within this piece. A similar definition has also been proposed in [45]; the difference is that the bandlimitedness in [45] is defined for the entire graph, while the bandlimitedness in Definition 3 is defined for a subgraph only. We then consider piecewise-bandlimited graph signals as linear combinations of localized and bandlimited graph signals.

Definition 4.
A graph signal $x \in \mathbb{R}^N$ is piecewise-bandlimited with $C$ pieces and bandwidth $K$ when $x = \sum_{c=1}^C x^{(c,K)}$, where each $S_c$, $c = 1, \ldots, C$, is a valid piece and $x^{(c,K)} \in \mathbb{R}^N$ is bandlimited over the piece $S_c$ with bandwidth $K$. Denote this class by PBL($C$, $K$).

IV. GRAPH MULTIRESOLUTION

Having defined a piecewise-smooth model for the data we are interested in, we now embark upon looking for the appropriate representations. Inspired by classical signal processing, we generalize the multiresolution analysis to graph signals and propose a coarse-to-fine approach to implement it.

A. Graph Multiresolution Analysis

Definition 5. A multiresolution analysis on graphs consists of a sequence of embedded closed subspaces
$$V^{(L)} \subset \cdots \subset V^{(1)} \subset V^{(0)},$$
such that
• it satisfies upward completeness, $V^{(0)} = \mathbb{R}^N$;
• it satisfies downward completeness, $V^{(L)} = \{c 1_{\mathcal{V}}, c \in \mathbb{R}\}$;
• there exists an orthonormal basis $\{v_k^{(\ell)}\}_{k=0}^{K^{(\ell)}-1}$ for $V^{(\ell)}$;
• it satisfies generalized shift invariance; that is, for any $x \in V^{(\ell)}$, there exists a nontrivial permutation operator $\Phi \in \{0,1\}^{N \times N}$ ($\Phi 1 = \Phi^T 1 = 1$) such that $\Phi x \in V^{(\ell)}$. The permutation operator $\Phi$ only allows for swapping signal coefficients in two nonoverlapping pieces;
• it satisfies generalized scale invariance; that is, for any $x \in V^{(\ell)}$, there exists a nontrivial permutation operator $\Phi \in \{0,1\}^{N \times N}$ such that $\Phi x \in V^{(\ell)}$. When the permutation operator $\Phi$ swaps signal coefficients in two nonoverlapping pieces, each piece has at most $2^\ell$ nodes.

While similar in spirit, the proposed graph multiresolution analysis is different from the original one [12]. For example, the complete space here is $\mathbb{R}^N$ instead of $L_2(\mathbb{R})$ because of the discrete nature of the graph. We unify the shift and scale invariance axioms via a permutation operator, which reshapes a graph signal by swapping signal coefficients.
The standard shift invariance axiom ensures that the input signal shape is preserved during shifting; here, this is accomplished by requiring that the permutation operator swap only the signal coefficients supported on two nonoverlapping pieces. The standard scale invariance axiom ensures that the input signal shape is preserved during scaling; here, this is accomplished by requiring that the number of swaps scale exponentially as the multiresolution level $\ell$ grows; see Figure 3 for an illustration.

Fig. 2: Coarse-to-fine decomposition approach. At each step, we partition a larger piece into two smaller disjoint pieces and generate a pair of lowpass/highpass basis vectors. Piece $v_1^{(3)} = \mathcal{V}$ is at Level 3; pieces $v_1^{(2)}, v_2^{(2)}$ are at Level 2; and pieces $v_1, v_2, v_3, v_4$ are at Level 1.

Fig. 3: Permutation leads to the generalized shift and scale invariances. (a) Permutation in $V^{(1)}$: we swap the signal coefficients between the blue piece and the yellow piece; the total number of swaps is 2. (b) Permutation in $V^{(2)}$: we swap the signal coefficients between the blue piece and the yellow piece; the total number of swaps is 4. The permutation operator $\Phi$ shifts a graph signal $x \in V^{(\ell)}$ to another graph signal $\Phi x \in V^{(\ell)}$ by swapping signal coefficients supported on two different pieces, which leads to the generalized shift invariance; the permutation operator needs twice as many swaps to permute a graph signal in a coarser space, which leads to the generalized scale invariance.

B. Coarse-to-Fine Construction

Our goal now is to implement the graph multiresolution analysis. In classical signal processing, this is typically accomplished by using filter banks, which involve a series of downsampling and shifting operations. Filter banks start with building filters in a fine space, which captures local information, and gradually build them in coarser spaces, which capture global information.
For discrete-time signals, filter banks happen to be an efficient way to implement the multiresolution analysis because the downsampling and shifting operators follow naturally. For graph signals, however, there is no recipe to permute the nodes; thus, it is hard to obtain efficient downsampling and shifting operators; see details in Appendix A.

Instead, we consider implementing the graph multiresolution analysis using a coarse-to-fine approach. The main idea is to recursively partition each piece into two smaller disjoint child pieces as follows: given a connected graph $G_0(\mathcal{V}_0, \mathcal{E}_0, A_0)$ with $|\mathcal{V}_0| > 1$, partition $G_0$ into two smaller graphs $G_1(\mathcal{V}_1, \mathcal{E}_1, A_1)$ and $G_2(\mathcal{V}_2, \mathcal{E}_2, A_2)$ by solving
$$\min_{\mathcal{V}_1, \mathcal{V}_2} \; \big| |\mathcal{V}_1| - |\mathcal{V}_2| \big| \quad\quad (1a)$$
$$\text{subject to} \quad \mathcal{V}_1 \cap \mathcal{V}_2 = \emptyset, \quad \mathcal{V}_1 \cup \mathcal{V}_2 = \mathcal{V}_0, \quad G_1, G_2 \text{ are connected}. \quad\quad (1b)$$
In other words, we want (1) each of the two child pieces to be connected; and (2) the partition to be close to a bisection; that is, the difference between the cardinalities of the two child pieces should be as small as possible. These properties ensure that the coarse-to-fine approach implements the graph multiresolution analysis. We solve (1) in Section IV-D.

We start with the coarsest lowpass subspace $V^{(0)} = \{c 1_{\mathcal{V}}, c \in \mathbb{R}\}$ and partition the largest piece $\mathcal{V}$ into two disjoint and connected child pieces $v_1^{(1)}, v_2^{(1)} \subseteq \mathcal{V}$; that is, $v_1^{(1)} \cup v_2^{(1)} = \mathcal{V}$, $v_1^{(1)} \cap v_2^{(1)} = \emptyset$, where the subscript denotes the index at each level. The lowpass/highpass basis vectors are, respectively,
$$v_1^{(1)} = g(v_1^{(1)}, v_2^{(1)}), \quad u_1^{(1)} = h(v_1^{(1)}, v_2^{(1)}),$$
where
$$g(S_1, S_2) = \sqrt{\frac{|S_1||S_2|}{|S_1| + |S_2|}} \left( \frac{1_{S_1}}{|S_1|} + \frac{1_{S_2}}{|S_2|} \right) \in \mathbb{R}^N,$$
$$h(S_1, S_2) = \sqrt{\frac{|S_1||S_2|}{|S_1| + |S_2|}} \left( \frac{1_{S_1}}{|S_1|} - \frac{1_{S_2}}{|S_2|} \right) \in \mathbb{R}^N,$$
with $S_1, S_2 \subset \mathcal{V}$ two nonoverlapping pieces. The normalization ensures that each basis vector is of unit norm and that $1^T u_1^{(1)} = 0$. The highpass subspace is $U^{(1)} = \{c u_1^{(1)}, c \in \mathbb{R}\}$.
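A small numpy sketch of the $g$ and $h$ formulas above, on hypothetical toy pieces of a 5-node graph; it checks unit norm, the zero mean of the highpass vector, and the Haar-like orthogonality of highpass vectors at nested levels (the child highpass sums to zero where the parent is constant):

```python
import numpy as np

def indicator(S, N):
    """One-piece signal 1_S in R^N."""
    v = np.zeros(N)
    v[list(S)] = 1.0
    return v

def g(S1, S2, N):
    """Lowpass vector for two disjoint pieces; unit norm by construction."""
    c = np.sqrt(len(S1) * len(S2) / (len(S1) + len(S2)))
    return c * (indicator(S1, N) / len(S1) + indicator(S2, N) / len(S2))

def h(S1, S2, N):
    """Highpass (Haar-like) vector: opposite signs on the two pieces."""
    c = np.sqrt(len(S1) * len(S2) / (len(S1) + len(S2)))
    return c * (indicator(S1, N) / len(S1) - indicator(S2, N) / len(S2))

N = 5
u1 = h({0, 1, 2}, {3, 4}, N)   # level-1 highpass over the whole node set
u2 = h({0, 1}, {2}, N)         # highpass for the children of piece {0, 1, 2}
```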
We now partition pieces $v_1^{(1)}$ and $v_2^{(1)}$ to obtain $v_1^{(2)}, v_2^{(2)}$ and $v_3^{(2)}, v_4^{(2)}$, respectively. The lowpass/highpass basis vectors are
$$v_k^{(2)} = g(v_{2k-1}^{(2)}, v_{2k}^{(2)}), \quad u_k^{(2)} = h(v_{2k-1}^{(2)}, v_{2k}^{(2)}), \quad \text{for } k = 1, 2.$$
The lowpass subspace is $V^{(2)} = \mathrm{span}\{v_k^{(2)}\}_{k=1}^{K^{(2)}}$ and the highpass subspace is $U^{(2)} = \mathrm{span}\{u_k^{(2)}\}_{k=1}^{K^{(2)}}$, where $K^{(2)} = 2$. We keep on partitioning, building the lowpass and highpass subspaces and their corresponding bases in the process. At the $\ell$th level, we partition $v_k^{(\ell+1)}$ to obtain $v_{2k-1}^{(\ell)}, v_{2k}^{(\ell)}$. When both $v_{2k-1}^{(\ell)}, v_{2k}^{(\ell)}$ are nonempty, $v_k^{(\ell)} = g(v_{2k-1}^{(\ell)}, v_{2k}^{(\ell)})$ is a lowpass basis vector and $u_k^{(\ell)} = h(v_{2k-1}^{(\ell)}, v_{2k}^{(\ell)})$ is a highpass basis vector; when one of them is empty, the cardinality of $v_k^{(\ell+1)}$ is 1 and we cease partitioning in this branch. At the finest resolution, each piece corresponds to an individual node. Since we promote bisection, the total decomposition depth $L$ is around $1 + \log_2 N$. At the end of the process, we collect all highpass basis vectors into a Haar-like graph wavelet basis (see Algorithm 1). A toy example is shown in Figure 2.

Algorithm 1: Haar-like Graph Wavelet Basis Construction
Input: graph $G(\mathcal{V}, \mathcal{E}, A)$
Output: wavelet basis $W$
1) initialize a stack of pieces $\mathcal{S}$ and a set of vectors $W$
2) push $S = \mathcal{V}$ into $\mathcal{S}$
3) add $w = \frac{1}{\sqrt{|S|}} 1_S$ to $W$
4) while the cardinality of the largest element in $\mathcal{S}$ is larger than 1:
   4.1) pop one element from $\mathcal{S}$ as $S$
   4.2) evenly partition $S$ into two disjoint connected pieces $S_1, S_2$
   4.3) push $S_1, S_2$ into $\mathcal{S}$
   4.4) add $w = \sqrt{\frac{|S_1||S_2|}{|S_1|+|S_2|}} \left( \frac{1_{S_1}}{|S_1|} - \frac{1_{S_2}}{|S_2|} \right)$ to $W$
return $W$

C. Graph Wavelet Basis Properties

1) Efficiency: This coarse-to-fine approach involves $(N - 1)$ partitions.
The overall computational complexity is approximately $\sum_{\ell=1}^{1+\log_2 N} 2^\ell f(N/2^\ell)$, where $f(N)$ is the computational complexity of partitioning an $N$-node graph. For a sparse graph ($E = O(N)$), when we use a standard graph partitioning algorithm, METIS [46], to partition the graph, $f(N) = O(N)$ and the overall computational complexity is $O(N \log_2 N)$.

2) Graph multiresolution: When the number of nodes is $N = 2^L$ for some $L \in \mathbb{Z}^+$, the proposed graph wavelet basis in Algorithm 1 satisfies the axioms of the graph multiresolution analysis. When the nodes cannot be partitioned equally, the proposed graph wavelet basis may not exactly satisfy the generalized shift and scale invariance axioms due to the residual condition, but it still comes close to the spirit of multiresolution.

3) Orthogonality: Orthogonality also implies efficient perfect reconstruction.

Theorem 1. The proposed graph wavelet basis $W \in \mathbb{R}^{N \times N}$ in Algorithm 1 is orthonormal; that is, for any graph signal $x \in \mathbb{R}^N$, we have $x = W W^T x$.

The proof is given in Appendix B.

D. Graph Partition Algorithm

An ideal graph partitioning results in two connected subgraphs with the same number of nodes; however, connectivity and bisection may conflict in practice. Many existing graph partition algorithms can be used here. For example, METIS provides an efficient bisection, but does not ensure that the two resulting subgraphs are connected. In (1), we consider a connectivity-first approach, as the constraints (1b) require that the resulting subgraphs be connected. The objective function (1a) promotes a bisection; that is, the two subgraphs have similar numbers of nodes. The optimization problem (1) is combinatorial, and we aim to obtain a suboptimal solution with a certain theoretical guarantee.

To solve (1), we consider finding the two nodes with the longest geodesic distance as two hubs and then computing the geodesic distances from each node to the two hubs.
We rank all the nodes based on the difference between their geodesic distances to the two hubs and record the median value. We partition the nodes according to this median value; all the nodes attaining the median value form the boundary set. We further partition the boundary set to ensure connectivity and promote bisection. The details are summarized in Algorithm 2.

Algorithm 2: Graph Partition with Connectivity Guarantee
Input: original graph $G_0$
Output: two node sets $\mathcal{V}_1, \mathcal{V}_2$
1) compute the geodesic distance matrix $D \in \mathbb{R}^{|\mathcal{V}_0| \times |\mathcal{V}_0|}$;
2) select $v_i, v_j \in \mathcal{V}_0$ such that $D_{v_i, v_j}$ is maximized;
3) let $p$ be the median value of $D_{v_i,:} - D_{v_j,:}$;
4) let $S_1 = \{v \mid D_{v_i,v} - D_{v_j,v} > p\}$ and the boundary set $S_2 = \{v \mid D_{v_i,v} - D_{v_j,v} = p\}$;
5) partition $S_2$ into connected components $C_1, C_2, \ldots, C_M$ with $|C_1| < |C_2| < \ldots < |C_M|$;
6) set $q_m = |S_1 \cup C_1 \cup \ldots \cup C_m|$ for $m = 1, 2, \cdots, M$;
7) set $m^* = \arg\min_m \big| q_m - |\mathcal{V}_0|/2 \big|$;
8) $\mathcal{V}_1 = S_1 \cup C_1 \cup \ldots \cup C_{m^*}$ and $\mathcal{V}_2 = \mathcal{V}_0 \setminus \mathcal{V}_1$
return $\mathcal{V}_1, \mathcal{V}_2$

We can show that Algorithm 2 provides a near-optimal solution; see Appendix C for the proof.

Theorem 2. Let $\hat{\mathcal{V}}_1, \hat{\mathcal{V}}_2$ be the solution given by Algorithm 2. Then, $\hat{\mathcal{V}}_1, \hat{\mathcal{V}}_2$ is a feasible solution of the optimization problem (1) and
$$\big| |\hat{\mathcal{V}}_1| - |\hat{\mathcal{V}}_2| \big| \le 2 |C_{m^*}|,$$
where $C_{m^*}$ is the $m^*$th smallest connected component in the boundary set, following from Steps 5–7 in Algorithm 2.

V. GRAPH DICTIONARIES

We now use the graph dictionary induced by the graph multiresolution analysis from the previous section to represent piecewise-smooth graph signals. As before, we start with piecewise-constant graph signals and then generalize to piecewise-smooth ones.

A. Piecewise-Constant Graph Dictionary

Representing piecewise-constant graph signals is difficult because the geometry of the pieces is arbitrary.
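As a sketch (not the paper's implementation), Algorithm 1 can be coded for a path graph, where splitting a piece at its midpoint is a valid connected bisection and stands in for the partitioner of Algorithm 2:

```python
import numpy as np

def haar_basis_path(N):
    """Haar-like wavelet basis (Algorithm 1 sketch) on an N-node path graph."""
    basis = [np.ones(N) / np.sqrt(N)]        # root scaling vector 1_V / sqrt(|V|)
    stack = [list(range(N))]                 # stack of pieces to partition
    while stack:
        S = stack.pop()
        if len(S) == 1:
            continue                         # individual node: stop this branch
        # Midpoint split keeps both child pieces of a path connected.
        S1, S2 = S[:len(S) // 2], S[len(S) // 2:]
        c = np.sqrt(len(S1) * len(S2) / (len(S1) + len(S2)))
        w = np.zeros(N)
        w[S1], w[S2] = c / len(S1), -c / len(S2)   # highpass vector h(S1, S2)
        basis.append(w)
        stack += [S1, S2]
    return np.column_stack(basis)

W = haar_basis_path(8)
x = np.array([2., 2., 2., 2., 7., 7., 7., 7.])     # piecewise-constant, one jump
coeffs = W.T @ x
```

For this toy signal with a single boundary, only two of the eight wavelet coefficients are nonzero, in line with the sparsity behavior analyzed next; on a general graph the midpoint split would be replaced by a connectivity-aware partitioner such as Algorithm 2.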
We now show that the graph wavelet basis in Algorithm 1 can effectively parse the pieces and promote sparse representations for piecewise-constant graph signals.

Theorem 3. Let $W \in \mathbb{R}^{N \times N}$ be the graph wavelet basis in Algorithm 1. For a piecewise-constant graph signal $x \in \mathbb{R}^N$,
$$\| W^T x \|_0 \le 1 + \|\Delta x\|_0 L,$$
where $L$ is the decomposition depth.

The proof is given in Appendix D. Since we promote the bisection scheme, $L$ is roughly $1 + \log_2 N$. Theorem 3 shows an upper bound on the sparsity of the graph wavelet coefficients, which depends on the cut cost $\|\Delta x\|_0$ and the size of the graph. As shown in Table I, $\|\Delta x\|_0$ is usually small when $x$ is a piecewise-constant signal. In [47], we also show that this graph wavelet basis can be used to detect localized graph signals.

We can expand the graph wavelet basis from Algorithm 1 into a redundant graph dictionary, allowing for more flexibility. Each piece $v_k^{(\ell)}$ obtained from the graph partition yields a column vector (called an atom) $1_{v_k^{(\ell)}}$ in the graph dictionary; we collect all the pieces at all levels to obtain a dictionary. In other words, the piecewise-constant graph dictionary is
$$D_{\mathrm{PC}} = \{ 1_{v_k^{(\ell)}} \}_{\ell=1, k=1}^{\ell=L, k=2^\ell}. \quad (2)$$
There are $2N - 1$ pieces in total; thus, $D_{\mathrm{PC}} \in \mathbb{R}^{N \times (2N-1)}$, and the proposed graph dictionary $D_{\mathrm{PC}}$ contains a series of atoms of different sizes activating different positions. Each graph wavelet basis vector in Algorithm 1 can be represented as a linear combination of two atoms in the piecewise-constant graph dictionary. Since most atoms are sparse, the number of nonzero elements in the piecewise-constant dictionary is small, allowing for efficient storage. For example, when $N = 2^L$ for some $L \in \mathbb{Z}^+$, the number of nonzero elements is exactly $NL$.

Corollary 1. Let $D_{\mathrm{PC}} \in \mathbb{R}^{N \times (2N-1)}$ be the piecewise-constant graph dictionary.
Let the sparse coefficients of a piecewise-constant graph signal $x \in \mathrm{PC}(C)$ be
$$a^* = \arg\min_{a \in \mathbb{R}^{2N-1}} \|a\|_0, \quad \text{subject to } x = D_{\mathrm{PC}} a. \quad (3)$$
Then, we have $\|a^*\|_0 \le 1 + \|\Delta x\|_0 L$.

Corollary 1 directly follows from Theorem 3, as each graph wavelet basis vector can be linearly represented by the piecewise-constant graph dictionary. We expect that the upper bound in Corollary 1 is not tight. In practice, the corresponding sparsity is usually even smaller than the sparsity provided by the graph wavelet basis because of the redundancy and flexibility of the piecewise-constant graph dictionary.

B. Piecewise-Smooth Graph Dictionary

We now generalize the piecewise-constant graph dictionary to the piecewise-smooth graph dictionary. In the piecewise-constant graph dictionary, we use a single one-piece graph signal to activate a certain subgraph; in the piecewise-smooth graph dictionary, we can use multiple localized and bandlimited graph signals to activate the same subgraph. Since localized and bandlimited graph signals are smooth on the corresponding subgraphs, the piecewise-smooth graph dictionary provides more redundancy and flexibility to capture the localized events within a graph signal.

Let $v_k^\ell$ be the $k$th piece at the $\ell$th decomposition level, $G_{v_k^\ell}$ the corresponding subgraph and $V_{v_k^\ell} \in \mathbb{R}^{|v_k^\ell| \times |v_k^\ell|}$ the corresponding graph Fourier basis. The subdictionary corresponding to the $k$th piece at the $\ell$th decomposition level is
$$D_{v_k^\ell} = V_{v_k^\ell (K)} \in \mathbb{R}^{|v_k^\ell| \times \min(K, |v_k^\ell|)},$$
which consists of the first $\min(K, |v_k^\ell|)$ columns of $V_{v_k^\ell}$. We collect all subdictionaries across all levels to obtain the piecewise-smooth graph dictionary,
$$D_{\mathrm{PS}} = \{ D_{v_k^\ell} \}_{\ell=1, k=1}^{\ell=L, k=2^\ell}. \quad (4)$$
The total number of atoms of $D_{\mathrm{PS}}$ is upper bounded by $(2N-1)K$ for bandwidth $K$. The total number of nonzero elements of $D_{\mathrm{PS}}$ is at most $O(NK \log_2 N)$, still storage-friendly.
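A sketch of one subdictionary $D_{v}$ for a single piece (a hypothetical 5-node path with piece {0, 1, 2} and bandwidth K = 2): the piece's Fourier basis is the eigenbasis of its subgraph Laplacian, with the selected atoms embedded into $\mathbb{R}^N$ at the piece's nodes.

```python
import numpy as np

def subdictionary(A, piece, K):
    """First min(K, |piece|) Fourier atoms of the subgraph on `piece`, zero elsewhere."""
    A_S = A[np.ix_(piece, piece)]              # subgraph adjacency
    L_S = np.diag(A_S.sum(axis=1)) - A_S       # subgraph Laplacian
    _, V_S = np.linalg.eigh(L_S)               # eigenvectors, eigenvalues ascending
    k = min(K, len(piece))
    D = np.zeros((A.shape[0], k))
    D[piece, :] = V_S[:, :k]                   # embed atoms at the piece's nodes
    return D

A = np.zeros((5, 5))                           # hypothetical 5-node path graph
for j in range(4):
    A[j, j + 1] = A[j + 1, j] = 1.0
D_sub = subdictionary(A, [0, 1, 2], K=2)
```

The first atom is constant on the piece (the lowest subgraph Laplacian eigenvector, up to sign), the columns are orthonormal, and both atoms vanish outside the piece, matching Definition 3.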
We now show that $D_{\rm PS}$ promotes sparsity for piecewise-bandlimited graph signals.

Theorem 4. Let $D_{\rm PS}$ be the piecewise-smooth graph dictionary. Let the sparse coefficients of a piecewise-bandlimited graph signal $x \in {\rm PBL}(C, K)$ be
$$a^* = \arg\min_{a} \|a\|_0, \quad \text{subject to } \|x - D_{\rm PS}\, a\|_2^2 \le \epsilon_{\rm par} \|x\|_2^2,$$
where $\epsilon_{\rm par}$ is a constant determined by the graph-partitioning algorithm. Then, we have
$$\|a^*\|_0 \le 1 + 2K \|\Delta x_{\rm PC}\|_0 \, L,$$
where $L$ is the decomposition depth in the coarse-to-fine approach and $x_{\rm PC}$ is a piecewise-constant signal that shares the same pieces as $x$. The proof is given in Appendix E.

VI. EXPERIMENTAL RESULTS

A good representation can be used in compression, approximation, inpainting, denoising and localization. Here we evaluate our proposed graph dictionaries on two tasks: approximation and localization.

A. Experimental Setup

We consider six datasets, summarized in Table II.
• Sensors. This is a simulated geometric graph with 500 nodes and 2,050 edges. We simulate a piecewise-smooth graph signal following [22].
• Minnesota. This is the Minnesota road network with 2,642 intersections and 3,304 road segments. We model each intersection as a node and each road segment as an edge. We simulate a localized smooth graph signal following [22].
• Manhattan. This is the Manhattan street network with 13,679 intersections and 34,326 road segments. We model each intersection as a node and each road segment as an edge. We model the restaurant distribution and the taxi-pickup positions as signals supported on the Manhattan street network.
• Kaggle 1968. This is a social network of Facebook users with 277 nodes and 2,321 edges. It also contains 14 social circles, each of which can be modeled as a binary piecewise-constant signal supported on this social network.
• Citeseer. This is a co-authorship network with 2,120 nodes and 3,705 edges.
It also contains 7 research groups, each of which can be modeled as a binary piecewise-constant signal supported on this co-authorship network.
• Teapot. This is a dataset with 7,999 3D points representing the surface of a teapot. We construct a 10-nearest-neighbor graph to capture the geometry. The 3D coordinates can be modeled as three piecewise-smooth signals supported on this generalized mesh.

Dataset       Type          # nodes   # edges   Signals
Sensors       Simulation    500       2,050     Simulation
Minnesota     Traffic net   2,642     3,304     Simulation
Manhattan     Traffic net   13,679    34,326    Taxi
Kaggle 1968   Social net    277       2,321     Circle
Citeseer      Citation net  2,120     3,705     Attribute
Teapot        Mesh          7,999     198,035   Coordinate

TABLE II: Dataset description.

We consider the following ten competing representation methods:
• PC (dashed dark red line). This is our piecewise-constant graph dictionary (2).
• PS (solid red line). This is our piecewise-smooth graph dictionary (4). The bandwidth in each piece is 10.
• Delta (solid dark yellow line). This is the basis of Kronecker deltas.
• GFT (dashed yellow line) [3]. This is the graph Fourier basis.
• SGWT (solid blue line) [30]. This is the spectral graph wavelet transform with five wavelet scales plus the scaling functions, for a total redundancy of 6.
• Pyramid (dashed light blue line) [22]. This is the multiscale pyramid transform.
• CKWT (solid grey line) [25]. These are spatial graph wavelets with wavelet functions based on the renormalized one-sided Mexican hat wavelet, also with five wavelet scales, concatenated with the dictionary of Kronecker deltas.
• DiffusionW (dashed purple line) [34]. These are the diffusion wavelets.
• QMF (solid pink line) [17]. This is the graph-QMF filter bank transform.
• CoSubFB (solid green line) [23]. This is the subgraph-based filter bank.

B. Approximation

Approximation is a standard task used to evaluate the quality of a representation.
The goal here is to use a few expansion coefficients to approximate a graph signal. We consider two approximation strategies: nonlinear approximation and orthogonal matching pursuit. Given a budget of $K$ expansion coefficients, nonlinear approximation chooses the $K$ largest-magnitude coefficients to minimize the approximation error, while orthogonal matching pursuit greedily and sequentially selects $K$ expansion coefficients to minimize the residual error. For each representation method, we use both approximation strategies and report the results of the better one. The evaluation metric is the normalized mean square error, defined as
$${\rm Error} = \frac{\| \widehat{x} - x \|_2^2}{\| x \|_2^2}. \quad (5)$$

Fig. 4: Piecewise-smooth graph dictionary (in red) outperforms the other competitive methods on five datasets: (a) Sensors, (b) Minnesota, (c) Kaggle 1968, (d) Citeseer, (e) Teapot. The x-axis is the number of coefficients used in the approximation and the y-axis is the approximation error (5), where lower means better.

Figure 4 compares the approximation performance on five datasets. The five columns in Figure 4 show Sensors, Minnesota, Kaggle 1968, Citeseer and Teapot, respectively.
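The two approximation strategies can be sketched generically, assuming the dictionary is stored as a matrix $D$ (this is our own illustrative code, not the paper's implementation). `nonlinear_approx` assumes an orthonormal basis, in which case keeping the largest-magnitude analysis coefficients minimizes the error (5); `omp_approx` also works for redundant dictionaries such as $D_{\rm PC}$ and $D_{\rm PS}$.

```python
import numpy as np

def nonlinear_approx(D, x, K):
    """Keep the K largest-magnitude expansion coefficients.
    Optimal for Eq. (5) when D is an orthonormal basis."""
    a = D.T @ x                            # analysis coefficients
    keep = np.argsort(np.abs(a))[-K:]
    a_K = np.zeros_like(a)
    a_K[keep] = a[keep]
    return D @ a_K

def omp_approx(D, x, K):
    """Greedy orthogonal matching pursuit: pick the atom most
    correlated with the residual, then least-squares refit."""
    residual, support = x.copy(), []
    for _ in range(K):
        j = int(np.argmax(np.abs(D.T @ residual)))
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    return x - residual

def nmse(x_hat, x):
    """Normalized mean square error, Eq. (5)."""
    return np.linalg.norm(x_hat - x) ** 2 / np.linalg.norm(x) ** 2
```

For a signal that is exactly $K$-sparse in an orthonormal $D$, both strategies recover it with essentially zero error; for redundant dictionaries, only the greedy strategy applies directly.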
Each plot in the first row shows the visualization of the graph signal; each plot in the second row shows the approximation error on a logarithmic scale, where the x-axis is the number of expansion coefficients and the y-axis is the normalized mean square error. Overall, the proposed piecewise-smooth graph dictionary outperforms its competitors across various types of graphs and graph signals.
• Sensors. The graph signal is piecewise-smooth. The top three methods are the piecewise-smooth graph dictionary, the piecewise-constant graph dictionary and the diffusion wavelets; on the other end of the spectrum, the Kronecker deltas, which fit one signal coefficient at a time, fail.
• Minnesota. The graph signal is localized smooth. The top three methods are the piecewise-smooth graph dictionary, the diffusion wavelets and the spectral graph wavelet transform; on the other end of the spectrum, the spatial graph wavelets fail.
• Kaggle 1968. The graph signal is binary and piecewise-constant with a few pieces. The top three methods are the piecewise-smooth graph dictionary, the piecewise-constant graph dictionary and the spectral graph wavelet transform; on the other end of the spectrum, the multiscale pyramid transform fails.
• Citeseer. The graph signal is binary and piecewise-constant with a large number of pieces. None of the methods performs well, due to the noisy input signal. The top three methods are the piecewise-smooth graph dictionary, the subgraph-based filter bank and the graph-QMF filter bank transform; on the other end of the spectrum, the multiscale pyramid transform fails.
• Teapot. The graph signal is smooth. The top three methods are the piecewise-smooth graph dictionary, the subgraph-based filter bank and the graph Fourier basis; on the other end of the spectrum, the Kronecker deltas and the spatial graph wavelets fail.
To gain an illustrative understanding, we visualize the reconstructions in Figure 5, where each plot shows the reconstruction obtained with 100 expansion coefficients.

Fig. 5: Reconstruction visualization for Teapot: (a) Original, (b) PS, (c) PC, (d) Delta, (e) GFT, (f) SGWT, (g) Pyramid, (h) CKWT, (i) Diffusion wavelets, (j) CSFB.

Additionally, Figure 6 compares the approximations of urban data supported on the Manhattan street network. The two rows show the reconstructions of the taxi-pickup distribution and the restaurant distribution, respectively, each using 100 expansion coefficients. We see that these graph signals are nonsmooth and inhomogeneous. For each graph signal, the piecewise-smooth graph dictionary provides the largest signal-to-noise ratio (SNR) and the smallest normalized mean square error. The spectral graph wavelet transform is also competitive; the subgraph-based filter bank tends to oversmooth and the spatial graph wavelets tend to be less smooth.

C. Localization

One functionality of a graph dictionary is to detect localized graph signals [47]; applications include localizing virus attacks in cyber-physical systems, localizing stimuli in brain connectivity networks and mining traffic events in city street networks. Here we consider simulations on the Minnesota road network. We generate one-piece graph signals with Gaussian noise. Given the noisy graph signals, we use a graph dictionary to remove the noise and reconstruct a denoised graph signal that localizes the underlying activated pieces. We average over 20 random trials.

Figure 7 shows the localization performance, where the x-axis is the noise level and the y-axis is either SNR or correlation; in both cases, a higher value means better. The baseline (dark curve) naively uses the noisy graph signal as the reconstruction. We see that the piecewise-smooth graph dictionary outperforms the others in terms of both metrics,
especially when the noise level is low; when the noise level is high, the piecewise-constant graph dictionary, the piecewise-smooth graph dictionary and the multiscale pyramid transform perform similarly.

Fig. 6: Reconstruction visualization for urban data: (a) Taxi-pickup distribution, (b) PS, (c) CSFB, (d) SGWT, (e) CKWT; (f) Restaurant distribution, (g) PS, (h) CSFB, (i) SGWT, (j) CKWT.

Figure 8 compares reconstructions. Figure 8(a) shows the original one-piece graph signal and (b) shows the noisy graph signal, while (c), (d) and (e) show the graph signals denoised by the piecewise-smooth graph dictionary, the subgraph-based filter bank and the spectral graph wavelet transform, respectively. We see that the piecewise-smooth graph dictionary localizes the underlying piece well, the spectral graph wavelet transform does a reasonable job, but the subgraph-based filter bank provides an oversmooth reconstruction and fails.

VII. CONCLUSIONS AND FUTURE WORK

In this paper, we model complex and irregular data, such as urban data supported on city street networks and profile information supported on social networks, as piecewise-smooth graph signals. We propose a well-structured and storage-friendly graph dictionary to represent those graph signals. To ensure a good representation, we consider the graph multiresolution analysis. To implement it, we propose the coarse-to-fine approach, which iteratively partitions a graph into two subgraphs until we reach individual nodes. This approach efficiently implements the graph multiresolution analysis, and the induced graph dictionary promotes sparse representations for piecewise-smooth graph signals. Finally, we test the proposed graph dictionary on the tasks of approximation and localization.
The empirical results validate that the proposed graph dictionary outperforms eight other representation methods on various datasets. Future work includes developing sampling, recovery, denoising and detection strategies based on the proposed piecewise-smooth graph signal model.

Fig. 7: Localization performance as a function of noise level. The piecewise-smooth graph dictionary (in red) outperforms the other competitive methods. The x-axis is the noise level and the y-axis is the signal-to-noise ratio (SNR) or the correlation, where higher means better.

Fig. 8: Localization visualization: (a) Original, (b) Noisy, (c) PS, (d) CSFB, (e) SGWT.

REFERENCES

[1] M. Jackson, Social and Economic Networks, Princeton University Press, 2008.
[2] M. Newman, Networks: An Introduction, Oxford University Press, 2010.
[3] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Process. Mag., vol. 30, pp. 83–98, May 2013.
[4] A. Sandryhaila and J. M. F. Moura, "Discrete signal processing on graphs," IEEE Trans. Signal Process., vol. 61, no. 7, pp. 1644–1656, Apr. 2013.
[5] P. Prandoni and M. Vetterli, "Approximation and compression of piecewise smooth functions," Phil. Trans. R. Soc. Lond. A, vol. 357, no. 1760, pp. 2573–2591, 1999.
[6] M. B. Wakin, J. K. Romberg, H. Choi, and R. G. Baraniuk, "Wavelet-domain approximation and compression of piecewise smooth images," IEEE Trans. Image Process., vol. 15, no. 5, pp. 1071–1087, 2006.
[7] V. Chandrasekaran, M. B. Wakin, D. Baron, and R. G. Baraniuk, "Representation and compression of multidimensional piecewise functions using surflets," IEEE Trans. Inf. Theory, vol. 55, no. 1, pp. 374–400, 2009.
[8] M. Unser, "Splines: A perfect fit for signal and image processing," IEEE Signal Process. Mag., vol. 16, no. 6, pp. 22–38, Nov. 1999.
[9] Y.-X. Wang, J. Sharpnack, A. Smola, and R. J. Tibshirani, "Trend filtering on graphs," in AISTATS, San Diego, CA, May 2015.
[10] M. Vetterli, J. Kovačević, and V. K. Goyal, Foundations of Signal Processing, Cambridge University Press, Cambridge, 2014, http://www.fourierandwavelets.org/.
[11] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, New York, NY, third edition, 2009.
[12] M. Vetterli and J. Kovačević, Wavelets and Subband Coding, Prentice Hall, Englewood Cliffs, NJ, 1995, http://waveletsandsubbandcoding.org/.
[13] M. Belkin and P. Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation," Neur. Comput., vol. 13, pp. 1373–1396, 2003.
[14] A. Sandryhaila and J. M. F. Moura, "Big data processing with signal processing on graphs," IEEE Signal Process. Mag., vol. 31, no. 5, pp. 80–90, 2014.
[15] A. Anis, A. Gadde, and A. Ortega, "Towards a sampling theorem for signals on arbitrary graphs," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Florence, May 2014, pp. 3864–3868.
[16] S. Chen, R. Varma, A. Sandryhaila, and J. Kovačević, "Discrete signal processing on graphs: Sampling theory," IEEE Trans. Signal Process., vol. 63, no. 24, pp. 6510–6523, Dec. 2015.
[17] S. K. Narang and A. Ortega, "Perfect reconstruction two-channel wavelet filter banks for graph structured data," IEEE Trans. Signal Process., vol. 60, pp. 2786–2799, June 2012.
[18] S. K. Narang and A. Ortega, "Compact support biorthogonal wavelet filterbanks for arbitrary undirected graphs," IEEE Trans. Signal Process., vol. 61, no. 19, pp. 4673–4685, Oct. 2013.
[19] V. N. Ekambaram, G. C. Fanti, B. Ayazifar, and K. Ramchandran, "Critically-sampled perfect-reconstruction spline-wavelet filterbanks for graph signals," in GlobalSIP, Austin, TX, Dec. 2013, pp. 475–478.
[20] M. S. Kotzagiannidis and P. L. Dragotti, "The graph FRI framework: spline wavelet theory and sampling on circulant graphs," in ICASSP, Shanghai, China, Mar. 2016, pp. 6375–6379.
[21] Y. Tanaka and A. Sakiyama, "M-channel oversampled graph filter banks," IEEE Trans. Signal Process., vol. 62, no. 14, pp. 3578–3590, 2014.
[22] D. I. Shuman, M. J. Faraji, and P. Vandergheynst, "A multiscale pyramid transform for graph signals," IEEE Trans. Signal Process., vol. 64, no. 8, pp. 2119–2134, Apr. 2016.
[23] N. Tremblay and P. Borgnat, "Subgraph-based filterbanks for graph signals," IEEE Trans. Signal Process., vol. 64, no. 15, pp. 3827–3840, Aug. 2016.
[24] Y. Jin and D. I. Shuman, "An M-channel critically sampled filter bank for graph signals," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Mar. 2017, pp. 3909–3913.
[25] M. Crovella and E. Kolaczyk, "Graph wavelets for spatial traffic analysis," in Proc. IEEE INFOCOM, Mar. 2003, vol. 3, pp. 1848–1857.
[26] A. D. Szlam, M. Maggioni, R. R. Coifman, and J. C. Bremer Jr., "Diffusion-driven multiscale analysis on manifolds and graphs: top-down and bottom-up constructions," in Proceedings of the SPIE, Wavelets XI, Aug. 2005, vol. 5914, pp. 445–455.
[27] M. Gavish, B. Nadler, and R. R. Coifman, "Multiscale wavelets on trees, graphs and high dimensional data: Theory and applications to semi supervised learning," in Proc. Int. Conf. Mach. Learn., Haifa, Israel, June 2010, pp. 367–374.
[28] R. M. Rustamov, "Average interpolating wavelets on point clouds and graphs," CoRR, vol. abs/1110.2227, 2011.
[29] J. Irion and N. Saito, "Hierarchical graph Laplacian eigen transforms," JSIAM Letters, vol. 6, pp. 21–24, Jan. 2014.
[30] D. K. Hammond, P. Vandergheynst, and R. Gribonval, "Wavelets on graphs via spectral graph theory," Appl. Comput. Harmon. Anal., vol. 30, pp. 129–150, Mar. 2011.
[31] N. Leonardi and D. Van De Ville, "Tight wavelet frames on multislice graphs," IEEE Trans. Signal Process., vol. 61, no. 13, pp. 3357–3367, 2013.
[32] D. I. Shuman, C. Wiesmeyr, N. Holighaus, and P. Vandergheynst, "Spectrum-adapted tight graph wavelet and vertex-frequency frames," IEEE Trans. Signal Process., vol. 63, no. 16, pp. 4223–4235, Aug. 2015.
[33] D. I. Shuman, B. Ricaud, and P. Vandergheynst, "Vertex-frequency analysis on graphs," Appl. Comput. Harmon. Anal., vol. 40, no. 2, pp. 260–291, Mar. 2016.
[34] R. R. Coifman and M. Maggioni, "Diffusion wavelets," Appl. Comput. Harmon. Anal., pp. 53–94, July 2006.
[35] J. Bremer, R. Coifman, M. Maggioni, and A. R. Szlam, "Diffusion wavelet packets," Appl. Comput. Harmon. Anal., vol. 21, pp. 95–112, July 2006.
[36] X. Zhang, X. Dong, and P. Frossard, "Learning of structured graph dictionaries," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Kyoto, Japan, 2012, pp. 3373–3376.
[37] D. Thanou, D. I. Shuman, and P. Frossard, "Learning parametric dictionaries for signals on graphs," IEEE Trans. Signal Process., vol. 62, pp. 3849–3862, June 2014.
[38] X. Zhu and M. Rabbat, "Approximating signals supported on graphs," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Kyoto, Japan, Mar. 2012, pp. 3921–3924.
[39] B. Ricaud, D. I. Shuman, and P. Vandergheynst, "On the sparsity of wavelet coefficients for signals on graphs," in Conference on Wavelets and Sparsity XV, 2013, vol. 8858.
[40] W. K. Allard, G. Chen, and M. Maggioni, "Multiscale geometric methods for data sets I: Multiscale SVD, noise and curvature," Appl. Comput. Harmon. Anal., no. 3, pp. 504–567, 2017.
[41] W. K. Allard, G. Chen, and M. Maggioni, "Multi-scale geometric methods for data sets II: Geometric multi-resolution analysis," Appl. Comput. Harmon. Anal., no. 3, pp. 435–462, 2012.
[42] W. Liao and M. Maggioni, "Adaptive geometric multiscale approximations for intrinsically low-dimensional data," arXiv preprint arXiv:1611.01179, 2016.
[43] U. V. Luxburg, "A tutorial on spectral clustering," Statistics and Computing, vol. 17, pp. 395–416, 2007.
[44] X. Wang, P. Liu, and Y. Gu, "Local-set-based graph signal reconstruction," IEEE Trans. Signal Process., vol. 63, no. 9, May 2015.
[45] M. Tsitsvero, S. Barbarossa, and P. D. Lorenzo, "Signals on graphs: Uncertainty principle and sampling," 2015.
[46] G. Karypis and V. Kumar, "A fast and high quality multilevel scheme for partitioning irregular graphs," SIAM J. Scientific Computing, vol. 20, no. 1, pp. 359–392, 1998.
[47] S. Chen, Y. Yang, S. Zong, A. Singh, and J. Kovačević, "Detecting localized categorical attributes on graphs," IEEE Trans. Signal Process., vol. 65, pp. 2725–2740, May 2017.

APPENDIX

A. Iterated Graph Filter Bank

In this section, we generalize classical filter banks to the graph domain and point out why graph filter banks are hard to implement. Suppose we have an ordering of the nodes $\{v_1, v_2, \ldots, v_N\}$ such that two consecutive nodes $v_{2k-1}, v_{2k}$ are connected for $k = 1, 2, \ldots, K$, where $K \le \lfloor N/2 \rfloor$. We group all pairs $v_{2k-1}, v_{2k}$ to form a series of connected and nonoverlapping subgraphs.
The basis vectors for the $k$th subgraph are
$$v_k^{(1)} = \frac{1}{\sqrt{2}} \left( 1_{v_{2k-1}} + 1_{v_{2k}} \right) \in \mathbb{R}^N, \quad (6)$$
$$u_k^{(1)} = \frac{1}{\sqrt{2}} \left( 1_{v_{2k-1}} - 1_{v_{2k}} \right) \in \mathbb{R}^N, \quad (7)$$
where the subscript $k$ is the index of the subgraph and the superscript $1$ indicates the root layer; the low-pass basis vector $v_k^{(1)}$ takes the average of the two nodes within the subgraph and the high-pass basis vector $u_k^{(1)}$ takes the difference between the two nodes within the subgraph. We collect all the low-pass basis vectors and all the high-pass basis vectors to form a low-pass subspace and a high-pass subspace, respectively,
$$V^{(1)} = {\rm span}\left( \{v_k^{(1)}\}_{k=1}^{K} \right) \quad \text{and} \quad U^{(1)} = {\rm span}\left( \{u_k^{(1)}\}_{k=1}^{K} \right).$$
Different from the discrete-time scenario, $V^{(1)} \oplus U^{(1)}$ may not span the entire space $\mathbb{R}^N$, as a few nodes may be left isolated by the ordering. Let the residual subspace be
$$R^{(1)} = {\rm span}\left( \{1_{v_k}\}_{k=2K+1}^{N} \right),$$
where each basis vector activates only an individual node. Now $V^{(1)} \oplus U^{(1)} \oplus R^{(1)} = \mathbb{R}^N$. For any graph signal $x \in \mathbb{R}^N$, the reconstruction is
$$x = \underbrace{\sum_{k=1}^{K} \langle x, v_k^{(1)} \rangle \, v_k^{(1)}}_{x_{V^{(1)}}} + \underbrace{\sum_{k=1}^{K} \langle x, u_k^{(1)} \rangle \, u_k^{(1)}}_{x_{U^{(1)}}} + \underbrace{\sum_{k=2K+1}^{N} \langle x, 1_{v_k} \rangle \, 1_{v_k}}_{x_{R^{(1)}}},$$
where $x_{V^{(1)}} \in V^{(1)}$ is the low-pass projection, $x_{U^{(1)}} \in U^{(1)}$ is the high-pass projection and $x_{R^{(1)}} \in R^{(1)}$ handles the residual.

Fig. 9: As a fine-to-coarse approach, the analysis part of an iterated graph filter bank implements the graph multiresolution analysis (Definition 5); there is no residual in this case.

To summarize, based on a well-designed ordering, we partition the entire graph into a series of nonoverlapping subgraphs and then design Haar-like basis vectors on graphs. For discrete-time signals, whose underlying graph is a directed line graph, the ordering is provided by time and each subgraph contains two consecutive time stamps. As described in Section ??, because of the nice ordering by time, all the basis vectors can be efficiently obtained by filtering followed by downsampling; however, this is not true for arbitrary graphs.

Following classical discrete-time signal processing, we can iteratively decompose the low-pass subspace and obtain smoother and smoother subspaces, which is equivalent to coarsening in the graph vertex domain. This iterated graph filter bank divides the vertex-spectrum plane into more tiles, approaching the limit of the uncertainty barrier. Here we show the second layer as an example. Let a supernode (a connected node set) be $v_k^{(2)} = v_{2k-1} \cup v_{2k}$ for $k = 1, 2, \ldots, K$, where the superscript of the supernode indicates the second layer. Two supernodes $v_i^{(2)}, v_j^{(2)}$ are connected when there exists a pair of nodes $p \in v_i^{(2)}, q \in v_j^{(2)}$ such that $p, q$ are connected. Similarly to the paradigm in Section A, suppose we have an ordering of the $K$ supernodes $\{v_1^{(2)}, v_2^{(2)}, \ldots, v_K^{(2)}\}$ such that two consecutive supernodes $v_{2k-1}^{(2)}, v_{2k}^{(2)}$ are connected for $k = 1, \ldots, K^{(2)}$, where $K^{(2)} \le \lfloor K/2 \rfloor$. We group all pairs $v_{2k-1}^{(2)}, v_{2k}^{(2)}$ to form a series of connected, yet nonoverlapping, subgraphs.

Let $S_1, S_2 \subset \mathcal{V}$ be two nonoverlapping supernodes. We define the low-pass and high-pass Haar template basis vectors as, respectively,
$$g(S_1, S_2) = \sqrt{\frac{|S_1||S_2|}{|S_1| + |S_2|}} \left( \frac{1_{S_1}}{|S_1|} + \frac{1_{S_2}}{|S_2|} \right) \in \mathbb{R}^N,$$
$$h(S_1, S_2) = \sqrt{\frac{|S_1||S_2|}{|S_1| + |S_2|}} \left( \frac{1_{S_1}}{|S_1|} - \frac{1_{S_2}}{|S_2|} \right) \in \mathbb{R}^N.$$
Following the template, the basis vectors of the $k$th subgraph in the second layer are
$$v_k^{(2)} = g\left( v_{2k-1}^{(2)}, v_{2k}^{(2)} \right), \quad u_k^{(2)} = h\left( v_{2k-1}^{(2)}, v_{2k}^{(2)} \right).$$
We collect all the low-pass and high-pass basis vectors in the second layer to form a low-pass subspace and a high-pass subspace, respectively,
$$V^{(2)} = {\rm span}\left( \{v_k^{(2)}\}_{k=1}^{K^{(2)}} \right) \quad \text{and} \quad U^{(2)} = {\rm span}\left( \{u_k^{(2)}\}_{k=1}^{K^{(2)}} \right).$$
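To make the Haar template concrete, here is a minimal numerical sketch (illustrative only; the helper name is ours, not from the paper). It builds $g(S_1, S_2)$ and $h(S_1, S_2)$ for disjoint node sets of unequal size, which lets one check the unit-norm and orthogonality-to-the-constant properties used in the proof of Theorem 1.

```python
import numpy as np

def haar_templates(S1, S2, N):
    """Low-pass g and high-pass h Haar template vectors for two
    disjoint node sets S1, S2 inside a graph with N nodes."""
    ind1, ind2 = np.zeros(N), np.zeros(N)
    ind1[list(S1)], ind2[list(S2)] = 1.0, 1.0
    scale = np.sqrt(len(S1) * len(S2) / (len(S1) + len(S2)))
    g = scale * (ind1 / len(S1) + ind2 / len(S2))
    h = scale * (ind1 / len(S1) - ind2 / len(S2))
    return g, h

# Unequal set sizes: both template vectors still have unit norm, and
# the high-pass vector h sums to zero, so it is orthogonal to the
# all-ones vector (and hence to every coarser constant atom).
g, h = haar_templates({0, 1, 2}, {3}, N=6)
```

Note that $g$ and $h$ are mutually orthogonal only when $|S_1| = |S_2|$; in the filter bank only the $h$-type vectors (plus one constant vector) enter the orthonormal basis, so this is not a problem.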
Let the residual subspace be
$$R^{(2)} = {\rm span}\left( \{1_{v_n^{(2)}}\}_{n=2K^{(2)}+1}^{K} \right),$$
where each basis vector activates only an individual supernode. Now $V^{(2)} \oplus U^{(2)} \oplus R^{(2)} = V^{(1)}$. For any graph signal $x \in \mathbb{R}^N$, the reconstruction is
$$x = \underbrace{\sum_{k=1}^{K^{(2)}} \langle x, v_k^{(2)} \rangle \, v_k^{(2)}}_{\in V^{(2)}} + \underbrace{\sum_{k=1}^{K^{(2)}} \langle x, u_k^{(2)} \rangle \, u_k^{(2)}}_{\in U^{(2)}} + \underbrace{\sum_{k=1}^{K} \langle x, u_k^{(1)} \rangle \, u_k^{(1)}}_{\in U^{(1)}} + \underbrace{\sum_{k=2K+1}^{N} \langle x, 1_{v_k} \rangle \, 1_{v_k}}_{\in R^{(1)}} + \underbrace{\sum_{k=2K^{(2)}+1}^{K} \langle x, 1_{v_k^{(2)}} \rangle \, 1_{v_k^{(2)}}}_{\in R^{(2)}}.$$
We can keep decomposing the low-pass subspace until only one constant basis vector remains. During the iterated decomposition, we keep coarsening in the graph vertex domain, leading to larger supernodes and more global basis vectors; we thus call this a fine-to-coarse approach; see Figure 9. Let the decomposition depth be $L$. By induction, the general reconstruction is
$$x = \underbrace{\sum_{k=1}^{K^{(L)}} \langle x, v_k^{(L)} \rangle \, v_k^{(L)}}_{\in V^{(L)}} + \sum_{\ell=1}^{L} \underbrace{\sum_{k=1}^{K^{(\ell)}} \langle x, u_k^{(\ell)} \rangle \, u_k^{(\ell)}}_{\in U^{(\ell)}} + \sum_{\ell=1}^{L} \underbrace{\sum_{k=2K^{(\ell)}+1}^{K^{(\ell-1)}} \langle x, 1_{v_k^{(\ell)}} \rangle \, 1_{v_k^{(\ell)}}}_{\in R^{(\ell)}},$$
where $v_k^{(1)} = v_k$, $K^{(1)} = K$ and $K^{(0)} = N$.

Note that for discrete-time signals, the ordering of time stamps is naturally provided by time, leading to straightforward downsampling and shifting, and iterated filter banks, as a fine-to-coarse approach, are efficient architectures for implementing the multiresolution analysis. For graph signals, the ordering at each multiresolution level is unknown, and an efficient fine-to-coarse implementation of the graph multiresolution analysis is no longer straightforward. This is why we consider the coarse-to-fine approach in this paper; in other words, we convert the problem of node ordering into the problem of graph partitioning, which is more efficient and straightforward.

B. Proof of Theorem 1

Proof. First, we show that each vector has norm one:
$$\left\| \sqrt{\frac{|S_1||S_2|}{|S_1| + |S_2|}} \left( \frac{1_{S_1}}{|S_1|} - \frac{1_{S_2}}{|S_2|} \right) \right\|_2^2 \stackrel{(a)}{=} \left\| \sqrt{\frac{|S_1||S_2|}{|S_1| + |S_2|}} \, \frac{1_{S_1}}{|S_1|} \right\|_2^2 + \left\| \sqrt{\frac{|S_1||S_2|}{|S_1| + |S_2|}} \, \frac{1_{S_2}}{|S_2|} \right\|_2^2 = 1,$$
where $(a)$ follows from $S_1 \cap S_2 = \emptyset$.

Second, we show that each vector is orthogonal to the other vectors. We have
$$1^T w = \sqrt{\frac{|S_1||S_2|}{|S_1| + |S_2|}} \left( \sum_{i \in S_1} \frac{1}{|S_1|} - \sum_{i \in S_2} \frac{1}{|S_2|} \right) = 0.$$
Thus, each vector is orthogonal to the first vector, $1_{\mathcal{V}} / \sqrt{|\mathcal{V}|}$. Every other individual vector is generated from two node sets. Let $S_1, S_2$ generate $w_i$ and $S_3, S_4$ generate $w_j$. Due to the construction, only two cases arise: the two node sets of one vector both belong to one node set of the other vector, or all four node sets are pairwise disjoint. For the first case, without loss of generality, let $(S_3 \cup S_4) \cap S_1 = S_3 \cup S_4$; since $w_i$ is constant on $S_1$,
$$w_i^T w_j = \sqrt{\frac{|S_1||S_2|}{|S_1| + |S_2|}} \, \frac{1}{|S_1|} \, \sqrt{\frac{|S_3||S_4|}{|S_3| + |S_4|}} \left( \sum_{i \in S_3} \frac{1}{|S_3|} - \sum_{i \in S_4} \frac{1}{|S_4|} \right) = 0.$$
For the second case, the inner product between $w_i$ and $w_j$ is zero because their supports do not overlap.

Third, we show that $W$ spans $\mathbb{R}^N$. Since we recursively partition the node set until the cardinalities of all node sets are smaller than 2, there are $N$ vectors in $W$.

C. Proof of Theorem 2

Proof. We first show that $\mathcal{V}_1, \mathcal{V}_2$ are connected and then bound the cardinality difference. Since the original graph is connected, $D_{v_i, v_j}$ is finite, where $v_i, v_j$ are the two hubs. In Step 4, we partition the nodes according to their distances to the two hubs. Every node in the node set $S_1$ is connected to $v_j$; thus, the subgraph induced by the node set $S_1$ is connected. In Step 5, we partition the boundary set $S_2$ into connected node sets $C_1, C_2, \cdots, C_M$, each of which connects to $S_1$; otherwise, the maximum element in the geodesic distance matrix $D$ would be infinite.
We thus have that $S_1 \cup C_1 \cup C_2 \cdots \cup C_m$ is connected for all $m = 1, \cdots, M$. When we set $m = m^*$ as obtained in Step 7, $\mathcal{V}_1 = S_1 \cup C_1 \cup C_2 \cdots \cup C_{m^*}$ is connected. Similarly, we can show that $\mathcal{V}_2$ is also connected.

In Step 3, we set $p$ as the median value of the differences of the distances to the two hubs, which sets $|S_1|$ to around $|\mathcal{V}_0|/2$. In Step 6, we sequentially add connected components to $S_1$ and finally choose the one whose cardinality is closest to $|\mathcal{V}_0|/2$. The last component added to $S_1$ is $C_{m^*}$, which ensures that $\big| |\mathcal{V}_1| - |\mathcal{V}_0|/2 \big| \le |C_{m^*}|$ and $\big| |\mathcal{V}_2| - |\mathcal{V}_0|/2 \big| \le |C_{m^*}|$.

D. Proof of Theorem 3

Proof. When an edge $e \in {\rm supp}(\Delta w)$, where $w$ is a basis vector in the graph wavelet basis $W$, $\Delta$ is the graph incidence matrix, and ${\rm supp}$ denotes the edge indices activated by the nonzero elements of $\Delta w$, we say that the edge $e$ is activated by the wavelet basis vector $w$. Since the pieces in each level are disjoint, each edge is activated at most once per level; in total, each edge is activated by at most $L$ wavelet basis vectors, where $L$ is the decomposition depth. Let ${\rm activations}(e)$ be the number of wavelet basis vectors in $W$ that activate $e$. Then
$$\left\| W^T x \right\|_0 \le 1 + \sum_{e \in {\rm supp}(\Delta x)} {\rm activations}(e) \le 1 + \|\Delta x\|_0 \, L,$$
where the $1$ comes from the activation of the first column vector, which is constant. Since we promote the bisection scheme, the decomposition depth $L$ is roughly $1 + \log_2 N$.

E. Proof of Theorem 4

Proof. The main idea is to approximate a bandlimited signal on the original graph by using bandlimited signals on the subgraphs. Based on the eigenvectors of the graph Laplacian matrix, we define the bandlimited space, where each signal can be represented as $x = V(K) a$, with $V(K)$ the submatrix containing the first $K$ columns of $V$. We can show that this bandlimited space is a subspace of the small-variation space $\{x : x^T L x \le \lambda_K x^T x\}$.
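The counting argument in the proof of Theorem 3 can be checked numerically. The sketch below is a toy stand-in for Algorithm 1 (the function name is ours): it builds a Haar wavelet basis by dyadic bisection of an 8-node path graph, applies it to a piecewise-constant signal with a single cut edge ($\|\Delta x\|_0 = 1$, $L = 1 + \log_2 8 = 4$), and confirms that the number of nonzero coefficients respects the bound $1 + \|\Delta x\|_0 L = 5$.

```python
import numpy as np

def haar_wavelet_basis(nodes, N):
    """Recursively bisect a node list and emit the unit-norm Haar
    vectors h(S1, S2); together with the constant vector 1/sqrt(N)
    they form an orthonormal basis W."""
    if len(nodes) < 2:
        return []
    mid = len(nodes) // 2
    S1, S2 = nodes[:mid], nodes[mid:]
    w = np.zeros(N)
    scale = np.sqrt(len(S1) * len(S2) / (len(S1) + len(S2)))
    w[S1], w[S2] = scale / len(S1), -scale / len(S2)
    return [w] + haar_wavelet_basis(S1, N) + haar_wavelet_basis(S2, N)

N = 8
W = np.column_stack([np.ones(N) / np.sqrt(N)]
                    + haar_wavelet_basis(list(range(N)), N))

# Piecewise-constant signal on a path with one cut edge (between
# nodes 3 and 4): Theorem 3 predicts at most 1 + 1 * 4 = 5 nonzero
# wavelet coefficients.
x = np.array([1.0] * 4 + [0.0] * 4)
nnz = int(np.sum(np.abs(W.T @ x) > 1e-10))
```

In this example only two coefficients are nonzero (the constant vector and the top-level difference vector), well below the worst-case bound, which matches the remark after Corollary 1 that the bound is not tight.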
$$x^T L x = \sum_{(i,j) \in \mathcal{E}} W_{i,j} (x_i - x_j)^2 = \sum_{S_c} \sum_{(i,j) \in \mathcal{E}_{S_c}} W_{i,j} (x_i - x_j)^2 + \sum_{(i,j) \in \mathcal{E} \setminus \cup_c \mathcal{E}_{S_c}} W_{i,j} (x_i - x_j)^2 = \sum_{S_c} x_{S_c}^T L_{S_c} x_{S_c} + x^T L_{\rm cut}\, x \le \lambda_K x^T x,$$
where $L_{S_c}$ is the graph Laplacian matrix of the subgraph $G_{S_c}$ and $L_{\rm cut}$ stores the residual edges, which are cut by the graph-partitioning algorithm. Thus, $\{x : x^T L x \le \lambda_K x^T x\}$ is a subset of $\bigcup_{S_c} \{x_{S_c} : x_{S_c}^T L_{S_c} x_{S_c} \le \lambda_K x^T x - x^T L_{\rm cut}\, x\}$; that is, any small-variation graph signal on the whole graph can be precisely represented by small-variation graph signals on the subgraphs.

In each local set, when we use the bandlimited space $\{x : x = V_{S_c}(K)\, a\}$ to approximate the space $\{x_{S_c} : x_{S_c}^T L_{S_c} x_{S_c} \le c\, x_{S_c}^T x_{S_c}\}$, the maximum error we suffer is $c\, x_{S_c}^T x_{S_c} / \lambda_{K+1}^{(S_c)}$, which is obtained from the following optimization problem:
$$\max_x \left\| x - V_{S_c}(K) V_{S_c}(K)^T x \right\|_2^2 \quad \text{subject to: } x^T L_{S_c} x \le c\, x^T x.$$
In other words, in each local set, the maximum error in representing $\{x_{S_c} : x_{S_c}^T L_{S_c} x_{S_c} \le \lambda_K x^T x - x^T L_{\rm cut}\, x\}$ is $(\lambda_K x^T x - x^T L_{\rm cut}\, x) / \lambda_{K+1}^{(S_c)}$. Since all the local sets share the variation budget $\lambda_K x^T x$ together, the maximum error we suffer is
$$\epsilon_{\rm par} = \frac{x^T (\lambda_K I - L_{\rm cut})\, x}{\min_{S_c} \lambda_{K+1}^{(S_c)} \, \|x\|_2^2},$$
which depends on the properties of the graph partitioning. In Corollary 1, we showed that we need at most $2L \|\Delta x_{\rm PC}\|_0$ local sets to represent the piecewise-constant template of $x$. Since we use at most $K$ eigenvectors in each local set, we obtain the result in Theorem 4.
