Blind Community Detection from Low-rank Excitations of a Graph Filter

Hoi-To Wai, Santiago Segarra, Asuman E. Ozdaglar, Anna Scaglione, Ali Jadbabaie∗

April 16, 2019

Abstract

This paper considers a new framework to detect communities in a graph from the observation of signals at its nodes. We model the observed signals as noisy outputs of an unknown network process, represented as a graph filter that is excited by a set of unknown low-rank inputs/excitations. Application scenarios of this model include diffusion dynamics, pricing experiments, and opinion dynamics. Rather than learning the precise parameters of the graph itself, we aim at retrieving the community structure directly. The paper shows that communities can be detected by applying a spectral method to the covariance matrix of graph signals. Our analysis indicates that the community detection performance depends on a 'low-pass' property of the graph filter. We also show that the performance can be improved via a low-rank matrix plus sparse decomposition method when the latent parameter vectors are known. Numerical experiments demonstrate that our approach is effective.

1 Introduction

The emerging field of network science and the availability of big data have motivated researchers to extend signal processing techniques to the analysis of signals defined on graphs, propelling a new area of research referred to as graph signal processing (GSP) [1–3]. As opposed to time signals defined on a regular topology, the properties of graph signals are intimately related to the generally irregular topology of the graph where they are defined. The goal of GSP is to develop mathematical tools to leverage this topological structure in order to enhance our understanding of graph signals.
A suitable way to capture the graph's structure is via the so-called graph shift operator (GSO), which is a matrix that reflects the local connectivity of the graph and is a generalization of the time shift or delay operator in classical discrete signal processing [1]. Admissible choices for the GSO include the graph's adjacency matrix and the Laplacian matrix. When the GSO is known, the algebraic and spectral characteristics of a given graph signal can be analyzed in a way analogous to time-series analysis [1]. Furthermore, signal processing tools such as sampling [4, 5], interpolation [6, 7] and filtering [8, 9] can be extended to the realm of graph signals.

∗ H.-T. Wai is with the Department of SEEM, The Chinese University of Hong Kong, Shatin, Hong Kong. E-mail: htwai@se.cuhk.edu.hk. S. Segarra is with the Department of ECE, Rice University, TX, USA. E-mail: segarra@rice.edu. A. E. Ozdaglar is with LIDS, Massachusetts Institute of Technology, MA, USA. A. Jadbabaie is with IDSS, Massachusetts Institute of Technology, MA, USA. E-mails: {asuman,jadbabai}@mit.edu. A. Scaglione is with the School of ECEE, Arizona State University, Tempe, AZ, USA. E-mail: Anna.Scaglione@asu.edu

This paper considers an inverse problem in GSP where our focus is to infer information about the GSO (or the graph) from the observed graph signals. Naturally, graph or network inference is relevant to network and data science, and has been studied extensively. Classical methods are based on partial correlations [10], Gaussian graphical models [11], and structural equation models [12], among others. Recently, GSP-based methods for graph inference have emerged, which tackle the problem as a system identification task. They postulate that the unknown graph is a structure encoded in the observed signals and that the signals are obtained from observations of network dynamical processes defined on the graph [13, 14].
Different assumptions are put forth in the literature to aid the graph topology inference, such as smoothness of the observed signals [15–17], richness of the inputs to the network process [18–21], and partial knowledge of the network process [12, 22]. A drawback common to the prior GSP work on graph inference [15–21] is that it requires the observed graph signals to be full-rank. Equivalently, the observed signals are the results of a network dynamical process excited by a set of input signals that span a space with the same dimension as the number of nodes in the graph. Such an assumption can be unnecessarily stringent for a number of applications, especially when the graph contains a large number of nodes. This is the case, for example, whenever graph inference experiments can only be performed by exciting a few nodes on the graph (such as rumor spreading initiated by a small number of sources and the gene perturbation experiments in [23]), or when the amount of data collected is limited due to cost and time constraints.

Oftentimes, inferring the entire graph structure is only the first step, since the ultimate goal is to obtain interpretable information from the set of graph signals. To this end, a feature that is often sought in network science is the community structure [24], which offers a coarse description of graphs. For this task, applying conventional methods necessitates a two-step procedure comprising a graph learning step and a community detection step.

This paper departs from the conventional methods by developing a direct analysis framework to recover the communities based on the observation of graph signals. We consider a setting where the observations are graph signals modeled as the outputs of an unknown network process represented by a graph filter.
Such a signal model is applicable to observations from, e.g., diffusion dynamics, pricing experiments in consumer networks [25, 26], and DeGroot dynamics [27] with stubborn agents. In addition, unlike the prior works on graph learning, we allow the excitations to the graph filter to be low-rank. This is a challenging yet practical scenario, as we demonstrate later.

We propose and analyze two blind community detection (BlindCD) methods that require neither learning the graph topology nor explicitly knowing the dynamics governing the generation of graph signals. The first method applies spectral clustering on the sampled covariance matrix, which is akin to a common heuristic used in data clustering, e.g., [28]. Here our contribution lies in showing when the sampled covariance carries information about the communities. Under a mild assumption that the underlying graph filter is low-pass with the GSO taken as the graph Laplacian, we show that the covariance matrix of observed graph signals is a sketch of the Laplacian matrix that retains coarse topological features of the graph, like communities. We quantify the suboptimality of the BlindCD method compared to the minimizer of a convex relaxation of the RatioCut objective defined on the actual graph Laplacian. Our result helps in justifying the successful application of such heuristics on real data.

Furthermore, the theoretical analysis of BlindCD identifies the key bottleneck in the spectral method applied to some GSP models. This leads to the development of our second method, called boosted BlindCD. The method works under an additional assumption that the latent parameter vectors are available, and boosts the performance of the first method by leveraging a low-rank plus sparse structure in the linear transformation between excitations and observed graph signals. A performance bound is also derived for this method.
The organization of this paper is as follows. In Section 2, we introduce notations by describing the graph model and a formal definition for communities on a graph. Section 3 presents the GSP signal model with real world examples. In Sections 4 and 5, we describe and analyze the proposed BlindCD method and its boosted version. In Section 6, we present numerical results on synthetic and real data to validate our findings.

Notation: We use boldface lower-case (resp. upper-case) letters to denote vectors (resp. matrices). For a vector $\mathbf{x}$, the notation $x_i$ denotes its $i$th element and we use $\|\mathbf{x}\|_2$ to denote the standard Euclidean norm. For a matrix $\mathbf{X}$, the notation $X_{ij}$ denotes its $(i,j)$th element, whereas $[\mathbf{X}]_{i,:}$ denotes its $i$th row vector and $[\mathbf{X}]_{\mathcal{I},:}$ denotes the collection of its row vectors in $\mathcal{I}$. Also, $\mathcal{R}(\mathbf{X}) \subseteq \mathbb{R}^N$ denotes the range space of $\mathbf{X} \in \mathbb{R}^{N \times M}$. Moreover, $\|\mathbf{X}\|_F$ (resp. $\|\mathbf{X}\|_2$) denotes the Frobenius norm (resp. spectral norm). For a symmetric matrix $\mathbf{E}$, $\beta_i(\mathbf{E})$ denotes its $i$th largest eigenvalue. For a matrix $\mathbf{M} \in \mathbb{R}^{P \times N}$, $\sigma_i(\mathbf{M})$ denotes its $i$th largest singular value and $[\mathbf{M}]_K$ denotes its rank-$K$ approximation. Moreover, $\mathbf{M}$ admits the partition $\mathbf{M} = [\mathbf{M}_K \;\; \mathbf{M}_{N-K}]$, where $\mathbf{M}_K$ (resp. $\mathbf{M}_{N-K}$) denotes the matrix consisting of the left-most $K$ (resp. right-most $N-K$) columns of $\mathbf{M}$. Similarly, $\mathbf{m} \in \mathbb{R}^N$ is partitioned into $\mathbf{m} = [\mathbf{m}_K; \mathbf{m}_{N-K}]$, where $\mathbf{m}_K$ (resp. $\mathbf{m}_{N-K}$) consists of its top $K$ (resp. bottom $N-K$) elements. For any integer $K$, we denote $[K] := \{1, \ldots, K\}$.

2 Preliminaries

2.1 Graph Signal Processing

Consider an undirected graph $G = (\mathcal{V}, \mathcal{E}, \mathbf{A})$ with $N$ nodes such that $\mathcal{V} = [N] := \{1, \ldots, N\}$ and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the set of edges, where $(i,i) \notin \mathcal{E}$ for all $i$. The graph $G$ is also associated with a symmetric and weighted adjacency matrix $\mathbf{A} \in \mathbb{R}_+^{N \times N}$ such that $A_{ij} = A_{ji} > 0$ if and only if $(i,j) \in \mathcal{E}$.
The graph Laplacian matrix for $G$ is defined as $\mathbf{L} := \mathbf{D} - \mathbf{A}$, where $\mathbf{D} := \mathrm{Diag}(\mathbf{A}\mathbf{1})$ is a diagonal matrix containing the weighted degrees of $G$. As $\mathbf{L}$ is symmetric and positive semidefinite, it admits the following eigendecomposition

$$\mathbf{L} = \mathbf{V} \boldsymbol{\Lambda} \mathbf{V}^\top, \tag{1}$$

where $\boldsymbol{\Lambda} = \mathrm{Diag}([\lambda_1, \ldots, \lambda_N])$ and the eigenvalues are sorted in ascending order such that $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$.

A graph signal is defined as a function on the nodes of $G$, $f: \mathcal{V} \to \mathbb{R}$, and can be equivalently represented as a vector $\mathbf{x} := [x_1, x_2, \ldots, x_N] \in \mathbb{R}^N$, where $x_i$ is the signal value at the $i$th node. The graph is endowed with a graph shift operator (GSO) that is set as the graph Laplacian $\mathbf{L}$. Note that it is also possible to define alternative GSOs such as the adjacency matrix $\mathbf{A}$ and its normalized versions (see [1] for an overview of the subject), yet the analysis results in this paper may differ slightly for the latter cases. Having defined the GSO, the graph Fourier transform (GFT) [1] of $\mathbf{x}$ is given by

$$\tilde{\mathbf{x}} := \mathbf{V}^\top \mathbf{x}. \tag{2}$$

The vector $\tilde{\mathbf{x}}$ is called the frequency domain representation of $\mathbf{x}$ with respect to (w.r.t.) the GSO $\mathbf{L}$ [1, 3].

The GSO can be used to define linear graph filters. These are linear graph signal operators that can be expressed as matrix polynomials in $\mathbf{L}$:

$$\mathcal{H}(\mathbf{L}) := \sum_{t=0}^{T_d - 1} h_t \mathbf{L}^t = \mathbf{V} \Big( \sum_{t=0}^{T_d - 1} h_t \boldsymbol{\Lambda}^t \Big) \mathbf{V}^\top, \tag{3}$$

where $T_d$ is the order of the graph filter. Note that by the Cayley-Hamilton theorem, any matrix polynomial (even of infinite degree) can be represented using the form (3) with $T_d \le N$. For a given excitation graph signal $\mathbf{x} \in \mathbb{R}^N$, the output of the filter is simply $\mathbf{y} = \mathcal{H}(\mathbf{L})\mathbf{x}$, and carries the classical interpretation of being a linear combination of shifted versions of the input. The graph filter $\mathcal{H}(\mathbf{L})$ may also be represented by its frequency response $\tilde{\mathbf{h}}$, defined as

$$\tilde{h}_i := h(\lambda_i) = \sum_{t=0}^{T_d - 1} h_t \lambda_i^t. \tag{4}$$

We refer to the polynomial $h(\lambda) := \sum_{t=0}^{T_d - 1} h_t \lambda^t$ as the generating function of the graph filter. From (3) it follows that the frequency representations of the input $\mathbf{x}$ and the output $\mathbf{y}$ of a filter are related by

$$\tilde{\mathbf{y}} = \tilde{\mathbf{h}} \odot \tilde{\mathbf{x}}, \tag{5}$$

where $\odot$ denotes the element-wise product. This is analogous to the convolution theorem for time signals. In Section 3, we utilize GSP to model the relationship between the observed data and the unknown graph $G$.

2.2 Community Structure and its Detection

Intuitively, a community on the graph $G$ is a subset of nodes, $\mathcal{C}_k^\star \subseteq \mathcal{V}$, that induces a densely connected subgraph while being loosely connected with the nodes not in $\mathcal{C}_k^\star$. To formally describe a community structure, in this paper we refer to the common notion of ratio-cut [24], which measures the total cut weight across the boundary between the parts of a disjoint partition of $G = (\mathcal{V}, \mathcal{E}, \mathbf{A})$. In particular, for any disjoint $K$-partition of $\mathcal{V}$, i.e., $\mathcal{V} = \mathcal{C}_1 \cup \cdots \cup \mathcal{C}_K$, define the function:

$$\mathrm{RatioCut}(\mathcal{C}_1, \ldots, \mathcal{C}_K) := \sum_{k=1}^{K} \frac{1}{|\mathcal{C}_k|} \sum_{i \in \mathcal{C}_k} \sum_{j \notin \mathcal{C}_k} A_{ij}. \tag{6}$$

Throughout this paper, we assume that there are $K$ non-overlapping communities in $G$, given by $\mathcal{C}_1^\star, \ldots, \mathcal{C}_K^\star$, where the latter is a minimizer of the ratio-cut function and results in a small objective value. That is, for any disjoint $K$-partition $\mathcal{C}_1, \ldots, \mathcal{C}_K$ of $\mathcal{V}$,

$$\delta^\star := \mathrm{RatioCut}(\mathcal{C}_1^\star, \ldots, \mathcal{C}_K^\star) \le \mathrm{RatioCut}(\mathcal{C}_1, \ldots, \mathcal{C}_K), \tag{7}$$

where $\delta^\star \ll 1$ indicates that the graph has $K$ communities.

Having defined the above notion, the community detection problem is solved by minimizing (6) with the given number of communities $K$ and graph adjacency matrix $\mathbf{A}$. However, the ratio-cut minimization problem is combinatorial and difficult to solve. As such, a popular remedy is to apply a convex relaxation, a method known as spectral clustering [29, 30].
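The definitions in this section can be checked numerically. The following is a minimal sketch, assuming NumPy; the toy graph (two triangles joined by one edge), the filter taps `h`, and the helper names are illustrative assumptions, not part of the paper. It builds a polynomial graph filter as in (3), verifies that filtering acts as an element-wise product in the graph frequency domain, and evaluates the RatioCut objective (6) for two candidate partitions.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian L = D - A with D = Diag(A 1)."""
    return np.diag(A.sum(axis=1)) - A

def graph_filter(L, h):
    """Polynomial graph filter H(L) = sum_t h[t] * L^t, cf. (3)."""
    H = np.zeros_like(L, dtype=float)
    P = np.eye(L.shape[0])          # holds L^t
    for h_t in h:
        H += h_t * P
        P = P @ L
    return H

def ratio_cut(A, labels):
    """RatioCut (6): for each community, cut weight divided by its size."""
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        inside = labels == k
        total += A[np.ix_(inside, ~inside)].sum() / inside.sum()
    return total

# Toy graph (an assumption for illustration): two triangles joined by one edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

L = laplacian(A)
lam, V = np.linalg.eigh(L)          # eigenvalues in ascending order, cf. (1)

h = [1.0, -0.3, 0.02]               # hypothetical filter taps h_0, h_1, h_2
H = graph_filter(L, h)
h_tilde = sum(h_t * lam**t for t, h_t in enumerate(h))   # frequency response (4)

x = np.arange(6, dtype=float)       # arbitrary excitation
y = H @ x
# Filtering is an element-wise product in the graph frequency domain.
assert np.allclose(V.T @ y, h_tilde * (V.T @ x))

good = ratio_cut(A, [0, 0, 0, 1, 1, 1])   # split along the bridge edge
bad = ratio_cut(A, [0, 1, 0, 1, 0, 1])    # arbitrary split
```

In this toy case the partition along the bridge edge cuts a single edge, so its RatioCut value is much smaller than that of the arbitrary split, which is the behaviour the ratio-cut notion is meant to capture.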
To describe the method, let us define the left-$K$ eigenmatrix of the graph Laplacian $\mathbf{L}$ as

$$\mathbf{V}_K := \big( \mathbf{v}_1 \;\; \mathbf{v}_2 \;\; \cdots \;\; \mathbf{v}_K \big) \in \mathbb{R}^{N \times K}, \tag{8}$$

where $\mathbf{v}_i$ is the $i$th eigenvector of $\mathbf{L}$ corresponding to the $i$th eigenvalue $\lambda_i$ [cf. (1)]. The $K$-means method [31] is applied on the row vectors of $\mathbf{V}_K$; it seeks a partition $\mathcal{C}_1, \ldots, \mathcal{C}_K$ that minimizes the distance of each row vector to the mean of its cluster. That is, spectral clustering minimizes

$$F(\mathcal{C}_1, \ldots, \mathcal{C}_K) := \sum_{k=1}^{K} \sum_{i \in \mathcal{C}_k} \Big\| \mathbf{v}_i^{\rm row} - \frac{1}{|\mathcal{C}_k|} \sum_{j \in \mathcal{C}_k} \mathbf{v}_j^{\rm row} \Big\|_2^2, \tag{9}$$

where $\mathbf{v}_j^{\rm row} := [\mathbf{V}_K]_{j,:}$ is the $j$th row vector of $\mathbf{V}_K$. For general $K$, [32] proposed a polynomial-time algorithm that finds a $(1+\epsilon)$-optimal solution, $\tilde{\mathcal{C}}_1, \ldots, \tilde{\mathcal{C}}_K$, to the $K$-means problem (9) satisfying

$$F(\tilde{\mathcal{C}}_1, \ldots, \tilde{\mathcal{C}}_K) \le (1+\epsilon) \min_{\mathcal{C}_1, \ldots, \mathcal{C}_K \subseteq \mathcal{V}} F(\mathcal{C}_1, \ldots, \mathcal{C}_K), \tag{10}$$

under some statistical assumptions on $\{\mathbf{v}_i^{\rm row}\}_{i=1}^N$.

The spectral clustering method has been shown to be effective both in theory and in practice. In particular, when $K = 2$ and the graph of interest is drawn from a stochastic block model (SBM) satisfying certain spectral gap conditions, the spectral method exactly recovers the ground truth clusters in the SBM when $N \to \infty$ [which also gives a minimizer to (6)]; see [29].

3 Graph Signal Model

Consider a graph signal $\mathbf{y}^\ell \in \mathbb{R}^N$ defined on the graph $G$ described in Section 2.1. The graph signal is obtained by exciting the graph filter $\mathcal{H}(\mathbf{L})$ with an excitation $\mathbf{x}^\ell \in \mathbb{R}^N$,

$$\mathbf{y}^\ell = \mathcal{H}(\mathbf{L}) \mathbf{x}^\ell + \mathbf{w}^\ell, \quad \ell = 1, \ldots, L, \tag{11}$$

where $\mathbf{w}^\ell \in \mathbb{R}^N$ includes both the modeling and measurement error in data collection. We assume that $\mathbf{w}^\ell$ is zero mean and sub-Gaussian with $\mathbb{E}[\mathbf{w}^\ell (\mathbf{w}^\ell)^\top] = \sigma_w^2 \mathbf{I}$.

Consider a low-rank excitation setting where $\{\mathbf{x}^\ell\}_{\ell=1}^L$ belong to an $R$-dimensional subspace of $\mathbb{R}^N$. Assume $K \le R \ll N$, where $K$ is the number of communities specified in Section 2.2.
Let $\mathbf{B} \in \mathbb{R}^{N \times R}$ and

$$\mathbf{x}^\ell = \mathbf{B} \mathbf{z}^\ell, \tag{12}$$

where $\mathbf{z}^\ell \in \mathbb{R}^R$ is a latent parameter vector controlling the excitation signal. Under this model, the sampled covariance matrix of $\{\mathbf{y}^\ell\}_{\ell=1}^L$ is, up to the noise contribution, low rank with rank at most $R$. As mentioned, under such a setting it is difficult to reconstruct $\mathbf{L}$ from $\{\mathbf{y}^\ell\}_{\ell=1}^L$ using the existing methods [15–20].

Before discussing the proposed methods for inferring communities from $\{\mathbf{y}^\ell\}_{\ell=1}^L$ in Sections 4 and 5, let us justify the model (11), (12) with three motivating examples.

3.1 Example 1: Diffusion Dynamics

The first example describes graph signals resulting from a diffusion process. For example, this model is commonly applied to temperatures within a geographical region [21]. Under this model, each node in the graph of Section 2.1 is a location and the weights $A_{ij} = A_{ji}$ represent the strengths of relative influence between $i$ and $j$ such that $\sum_{j=1}^N A_{ij} = 1$ for $i = 1, \ldots, N$. The $\ell$th sample graph signal obtained is the result of a diffusion over $T$ steps, described as

$$\mathbf{y}^\ell = \big( (1-\alpha)\mathbf{I} + \alpha \mathbf{A} \big)^T \mathbf{x}^\ell + \mathbf{w}^\ell = (\mathbf{I} - \alpha \mathbf{L})^T \mathbf{x}^\ell + \mathbf{w}^\ell, \tag{13}$$

where the exponent $T$ is the number of diffusion steps and $\alpha \in (0,1)$ is the speed of the diffusion process. As $(\mathbf{I} - \alpha \mathbf{L})^T$ is a polynomial of the graph's Laplacian, we observe that $\mathbf{y}^\ell$ is an output of a graph filter as in (11).

On the other hand, the excitation signal $\mathbf{x}^\ell$ may model the changes in temperature in the region due to a weather condition. The number of modes of temperature changes may be limited, e.g., a typical hurricane in North America affects the east coast of the US. This effect can be captured by having a tall matrix $\mathbf{B}$, i.e., the excitation lies in a low-dimensional space. The columns of $\mathbf{B}$ represent the potential modes through which weather conditions may affect the region.

3.2 Example 2: Pricing Experiments in Consumers' Game

This example is concerned with graph signals obtained as the equilibrium consumption levels of a consumers' game subject to pricing experiments [25, 26].
Here, the graph described in Section 2.1 represents a network of $N$ agents where $A_{ij} = A_{ji} \ge 0$ is the influence strength between agents $i$ and $j$. We assume that $\mathbf{A}\mathbf{1} = c\mathbf{1}$ such that each agent experiences the same level of influence from the others. It has been suggested in [26] that conducting a set of pricing experiments and observing the equilibrium behavior of agents can unveil the influence network between agents.

Let $\ell$ be the index of a pricing experiment. Agent $i$ chooses to consume $y_i$ units of a product depending on (i) the price of the product $p_i^\ell$ and (ii) the consumption levels of the other agents who are his/her neighbors in the network, weighted by the influence strength $A_{ij}$. The consumption level $y_i$ is determined by maximizing the utility

$$u_i(y_i, \mathbf{y}_{-i}, p_i^\ell) := a y_i - \frac{b}{2} y_i^2 + y_i \sum_{j=1}^N A_{ij} y_j - p_i^\ell y_i, \tag{14}$$

where $\mathbf{y}_{-i} := (y_j)_{j \ne i}$ and $a, b \ge 0$ are model parameters. As the utility function above depends on $\mathbf{y}_{-i}$, the equilibrium consumption level for the $i$th agent can be found by solving the following network game:

$$y_i^\ell = \arg\max_{y_i \in \mathbb{R}_+} u_i(y_i, \mathbf{y}_{-i}^\ell, p_i^\ell), \quad \forall\, i. \tag{15}$$

Under the conditions that $b > \sum_{j=1}^N A_{ij}$ and $a > p_i^\ell$, the equilibrium of the above game is unique [25] and it satisfies

$$\mathbf{y}^\ell = (b\mathbf{I} - \mathbf{A})^{-1} (a\mathbf{1} - \mathbf{p}^\ell). \tag{16}$$

Removing the mean from $\mathbf{y}^\ell$ gives the graph signal:

$$\tilde{\mathbf{y}}^\ell := \frac{1}{L} \sum_{\tau=1}^L \mathbf{y}^\tau - \mathbf{y}^\ell = (b\mathbf{I} - \mathbf{A})^{-1} \tilde{\mathbf{p}}^\ell, \tag{17}$$

where $\tilde{\mathbf{p}}^\ell := \mathbf{p}^\ell - (1/L)\sum_{l=1}^L \mathbf{p}^l$ can be interpreted as a vector of discounts to agents during the $\ell$th pricing experiment. In fact, (17) can be interpreted as a filtered graph signal as in (11) by recognizing $\tilde{\mathbf{p}}^\ell$ as the excitation signal and $\tilde{\mathbf{y}}^\ell$ as the observed graph signal. Since $b > c$ and $\mathbf{A}\mathbf{1} = c\mathbf{1}$, we have $\mathbf{L} = c\mathbf{I} - \mathbf{A}$ and therefore

$$(b\mathbf{I} - \mathbf{A})^{-1} = \big( (b-c)\mathbf{I} + \mathbf{L} \big)^{-1} = \frac{1}{b-c} \sum_{t=0}^{\infty} \Big( -\frac{1}{b-c} \mathbf{L} \Big)^t, \tag{18}$$

where the series converges when $\lambda_N < b - c$. In any case, the inverse in (18) is a function of $\mathbf{L}$ and, by the Cayley-Hamilton theorem, can be written as a matrix polynomial in $\mathbf{L}$. This shows that the linear operator $(b\mathbf{I} - \mathbf{A})^{-1}$ is indeed a graph filter.
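As a sanity check on this example, the sketch below (assuming NumPy; the ring graph and the values of `a`, `b`, `c` are hypothetical choices that satisfy $b > c$ and $a > p_i$) computes the equilibrium (16), confirms that it satisfies the first-order optimality condition of the utility (14), and confirms that $(b\mathbf{I} - \mathbf{A})^{-1}$ shares the eigenvectors of $\mathbf{L}$ with frequency response $1/(b - c + \lambda_i)$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, c, a, b = 8, 1.0, 5.0, 3.0          # hypothetical parameters with b > c

# Ring graph where every agent has two neighbours of weight c/2, so A 1 = c 1.
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = c / 2
L = c * np.eye(N) - A                   # Laplacian, since D = c I

p = rng.uniform(0.0, 1.0, size=N)       # prices, all below a
y = np.linalg.solve(b * np.eye(N) - A, a - p)     # equilibrium (16)

# Each y_i is a best response: the first-order condition of (14) reads
# a - b*y_i + sum_j A_ij y_j - p_i = 0.
foc = a - b * y + A @ y - p

# (b I - A)^{-1} shares the eigenvectors of L, with response 1/(b - c + lambda_i).
lam, V = np.linalg.eigh(L)
G = np.linalg.inv(b * np.eye(N) - A)
G_from_L = V @ np.diag(1.0 / (b - c + lam)) @ V.T
```

The frequency response $1/(b - c + \lambda)$ is decreasing in $\lambda$, which is the low-pass behaviour exploited later in the paper.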
Next, we study the types of discounts offered in the pricing experiment. A practical case is that, due to the limitations of the market, the pricing experiments only control the prices of $R$ agents, while the prices of the rest are unchanged across experiments. This gives rise to a low-rank structure for the excitation signal. Note that $\tilde{\mathbf{p}}^\ell = \mathbf{B}\mathbf{z}^\ell$ holds with

$$[\mathbf{B}]_{\mathcal{I},:} = \mathbf{I}, \quad [\mathbf{B}]_{[N] \setminus \mathcal{I},:} = \mathbf{0}, \tag{19}$$

where $\mathcal{I} \subset [N]$ is the index set of the $R$ agents whose prices are controlled, and $\mathbf{z}^\ell \in \mathbb{R}^R$ is simply a vector of the price variations from the mean. The latter can be assumed to be known in a controlled experiment setting. The discount offered in the $\ell$th experiment is thus a special case of low-rank excitation.

3.3 Example 3: DeGroot Dynamics with Stubborn Agents

The last example is related to a social network with $N$ agents where the graph signals are opinions sampled from the agents on different topics, e.g., votes cast by Senators on different topics [33]. The network is represented by a directed graph $G = (\mathcal{V}, \mathcal{E}, \mathbf{A})$ such that $A_{ij} \ge 0$ captures the amount of 'trust' that agent $i$ has in agent $j$. The agents are influenced by $R$ stubborn agents, i.e., agents whose opinions are not influenced by the others [34–36]. Consider the discussions on the $\ell$th topic, where the agents exchange opinions according to the DeGroot opinion dynamics [27]. Let $y_i^\ell(\tau)$ (resp. $z_j^\ell$) be the opinion of the $i$th agent (resp. $j$th stubborn agent) at time $\tau$, e.g., $y_i^\ell(\tau) \in [0,1]$ represents the probability for agent $i$ to agree. We have

$$\mathbf{y}^\ell(\tau + 1) = \mathbf{A}\mathbf{y}^\ell(\tau) + \mathbf{B}\mathbf{z}^\ell, \quad \tau = 1, 2, \ldots, \tag{20}$$

where $\mathbf{B} \in \mathbb{R}^{N \times R}$ is a weight matrix describing the bipartite graph that connects the stubborn agents to the agents in $G$.
We assume that the concatenated matrix is stochastic such that $[\mathbf{A}, \mathbf{B}]\mathbf{1} = \mathbf{1}$, and therefore the updated opinions are convex combinations of the opinions of neighboring agents; see [22] for a detailed description of the model. Note that it is possible to estimate the latent parameter $\mathbf{z}^\ell$ as well, since the latter represents the opinions of stubborn agents.

Let us focus on the steady-state opinions, i.e., the opinions when $\tau \to \infty$. Under mild assumptions, it holds [22, 37]

$$\mathbf{y}^\ell := \lim_{\tau \to \infty} \mathbf{y}^\ell(\tau) = (\mathbf{I} - \mathbf{A})^{-1} \mathbf{B}\mathbf{z}^\ell = \big( \mathrm{Diag}(\mathbf{1} - \mathbf{A}\mathbf{1}) + \mathbf{L} \big)^{-1} \mathbf{B}\mathbf{z}^\ell \approx c^{-1} (\mathbf{I} + c^{-1}\mathbf{L})^{-1} \mathbf{B}\mathbf{z}^\ell, \tag{21}$$

where the last approximation holds when there exists $c > 0$ such that $c\mathbf{1} \approx \mathbf{1} - \mathbf{A}\mathbf{1} = \mathbf{B}\mathbf{1}$, e.g., when the out-degrees of the stubborn agents are almost the same. From (21) it follows that the steady-state opinions are a special case of (11), (12).

4 Blind Community Detection

We study the blind community detection problem, whose goal is to infer a disjoint partition of the nodes $\mathcal{V}$ that corresponds to the communities, $\mathcal{C}_1^\star, \ldots, \mathcal{C}_K^\star$, in the graph $G = (\mathcal{V}, \mathcal{E}, \mathbf{A})$ as defined in Section 2.2, when the only given inputs are the observed graph signals $\{\mathbf{y}^\ell\}_{\ell=1}^L$ [cf. (11), (12)] and the desired number of communities $K$. Only in this section, we assume that the latent parameter vector $\mathbf{z}^\ell$ is a random, zero-mean, sub-Gaussian vector with $\mathbb{E}[\mathbf{z}^\ell (\mathbf{z}^\ell)^\top] = \mathbf{I}$. The covariance matrix of $\mathbf{y}^\ell$ is given by

$$\mathbf{C}_y := \mathbb{E}[\mathbf{y}^\ell (\mathbf{y}^\ell)^\top] = \mathcal{H}(\mathbf{L}) \mathbf{B}\mathbf{B}^\top \mathcal{H}^\top(\mathbf{L}) + \sigma_w^2 \mathbf{I}. \tag{22}$$

We also denote by $\overline{\mathbf{C}}_y := \mathcal{H}(\mathbf{L}) \mathbf{B}\mathbf{B}^\top \mathcal{H}^\top(\mathbf{L})$ the covariance of $\mathbf{y}^\ell$ in the absence of measurement error. Observe that

$$\mathcal{H}(\mathbf{L})\mathbf{B} = \mathbf{V}\, \mathrm{Diag}(\tilde{\mathbf{h}})\, \mathbf{V}^\top \mathbf{B}, \tag{23}$$

which is due to (3), (4). We can interpret $\mathcal{H}(\mathbf{L})\mathbf{B}$ as a sketch of the graph filter $\mathcal{H}(\mathbf{L})$, where $\mathbf{B}$ is a sketch matrix that compresses the right dimension from $N$ to $R$.

Algorithm 1 Blind Community Detection (BlindCD).
1: Input: Graph signals $\{\mathbf{y}^\ell\}_{\ell=1}^L$; desired number of communities $K$.
2: Compute the sample covariance $\hat{\mathbf{C}}_y$ as
$$\hat{\mathbf{C}}_y = (1/L) \sum_{\ell=1}^L \mathbf{y}^\ell (\mathbf{y}^\ell)^\top. \tag{24}$$
3: Find the top-$K$ eigenvectors of $\hat{\mathbf{C}}_y$ (with the eigenvalues sorted in descending order). Denote the set of eigenvectors as $\hat{\mathbf{V}}_K \in \mathbb{R}^{N \times K}$.
4: Apply the $K$-means method, which seeks to optimize
$$\min_{\mathcal{C}_1, \ldots, \mathcal{C}_K \subseteq \mathcal{V}} \; \sum_{k=1}^K \sum_{i \in \mathcal{C}_k} \Big\| \hat{\mathbf{v}}_i^{\rm row} - \frac{1}{|\mathcal{C}_k|} \sum_{j \in \mathcal{C}_k} \hat{\mathbf{v}}_j^{\rm row} \Big\|_2^2, \tag{25}$$
where $\hat{\mathbf{v}}_i^{\rm row} := [\hat{\mathbf{V}}_K]_{i,:} \in \mathbb{R}^K$.
5: Output: $K$ communities $\hat{\mathcal{C}}_1, \ldots, \hat{\mathcal{C}}_K$.

To perform blind community detection based on $\{\mathbf{y}^\ell\}_{\ell=1}^L$, let us gain intuition by considering the scenario where the noise is small ($\sigma_w^2 \approx 0$), the first $K$ elements of $\tilde{\mathbf{h}}$ are non-zero and have larger magnitudes than the remaining elements, and the columns of $\mathbf{B}$ span the same space as $\mathrm{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_K\}$. In this scenario, from (22) and (23), we observe that $\mathbf{V}_K$ can be estimated (up to a rotation) by simply obtaining the top-$K$ eigenvectors of $\mathbf{C}_y$. This intuition suggests that we can detect communities by applying spectral clustering on $\mathbf{C}_y$, similar to the one applied to the Laplacian $\mathbf{L}$ in Section 2.2. The proposed BlindCD method is summarized in Algorithm 1.

The computational complexity of BlindCD is dominated by the covariance estimation and eigenvalue decomposition in Lines 2-3, which cost $\mathcal{O}(N^2(L+K))$ FLOPS for large $N$. This is significantly less complex than a two-step procedure using a sophisticated graph learning step, e.g., [18]. In addition to estimating the covariance, the latter requires solving a linear program with $\mathcal{O}(N^2)$ variables and constraints. This learning step entails a total complexity of $\mathcal{O}(N^2 L + N^7 \log \epsilon_{\rm acc}^{-1})$ FLOPS with the interior point method in [38] (see footnote 1), where $\epsilon_{\rm acc} > 0$ is the accuracy.
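A minimal sketch of Algorithm 1 follows, assuming NumPy. The helper `kmeans` is a plain Lloyd iteration with a deterministic farthest-point initialisation, used here only as a stand-in for the $(1+\epsilon)$-optimal solver of [32]; the synthetic data (two 10-node cliques joined by one edge, a 10-step diffusion filter, Gaussian $\mathbf{B}$ and $\mathbf{z}^\ell$) are illustrative assumptions rather than the paper's experimental setup.

```python
import numpy as np

def kmeans(X, K, n_iter=100):
    """Plain Lloyd iterations with a deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, K):
        d = np.min([np.sum((X - ctr) ** 2, axis=1) for ctr in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                        else centers[k] for k in range(K)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def blind_cd(Y, K):
    """Algorithm 1: Y is N x L with one observed graph signal per column."""
    C_hat = Y @ Y.T / Y.shape[1]              # sample covariance (24)
    _, U = np.linalg.eigh(C_hat)              # eigenvalues in ascending order
    V_hat = U[:, -K:]                         # top-K eigenvectors
    return kmeans(V_hat, K)                   # K-means on the rows (25)

# Synthetic test case (all sizes and parameters are assumptions).
rng = np.random.default_rng(0)
n, K, R, num_samples = 10, 2, 5, 500
N = 2 * n
A = np.zeros((N, N))
A[:n, :n] = 1.0
A[n:, n:] = 1.0
np.fill_diagonal(A, 0.0)
A[n - 1, n] = A[n, n - 1] = 1.0               # single bridge edge
Lap = np.diag(A.sum(axis=1)) - A
lam = np.linalg.eigvalsh(Lap)
alpha = 0.5 / lam[-1]
H = np.linalg.matrix_power(np.eye(N) - alpha * Lap, 10)   # diffusion filter

B = rng.standard_normal((N, R))               # low-rank excitation basis (12)
Z = rng.standard_normal((R, num_samples))     # latent vectors z^l as columns
Y = H @ B @ Z + 0.01 * rng.standard_normal((N, num_samples))   # model (11)

labels = blind_cd(Y, K)
truth = np.repeat([0, 1], n)
```

Note that the graph itself is never passed to `blind_cd`; only the signals `Y` and the number of communities `K` are used, which is what makes the procedure blind.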
Methods similar to BlindCD have been proposed in the data clustering literature [28], offering a simple interpretation of $\mathbf{C}_y$ as the similarity graph between nodes. We provide a different interpretation here. Precisely, we view $\mathbf{C}_y$ as a spectral sketch of the Laplacian $\mathbf{L}$ and analyze the performance of BlindCD as an indirect algorithm to approximately find the ground truth communities in $\mathbf{L}$.

Footnote 1: In practice, the said linear program can be solved efficiently with a tailor-made solver such as [39].

4.1 Low-pass Graph Filters

Following (23) and the ensuing discussion, the performance of BlindCD depends on $\tilde{\mathbf{h}}$, the frequency response of the graph filter. In particular, a desirable situation would be one where $\tilde{\mathbf{h}}$ contains significant entries only over its first $K$ elements; in this way, the graph filter $\mathcal{H}(\mathbf{L})$ is approximately rank $K$ and retains all the eigenvectors required for spectral clustering. To quantify the above conditions, we formally introduce the notion of a low-pass graph filter (LPGF) as follows.

Definition 1 A graph filter $\mathcal{H}(\mathbf{L})$ is a $(K, \eta)$-LPGF if

$$\eta := \frac{\max\{ |\tilde{h}_{K+1}|, \ldots, |\tilde{h}_N| \}}{\min\{ |\tilde{h}_1|, \ldots, |\tilde{h}_K| \}} < 1, \tag{26}$$

where $\tilde{h}_i$ is defined in (4). The LPGF is ideal if $\eta = 0$.

Note that a small $\eta$ implies a 'good' LPGF, since $\eta \ll 1$ implies that most of the energy is concentrated in the first $K$ frequency bins of the graph filter. In fact, as we show later in Section 4.2, the low-pass coefficient $\eta$ plays an important role in the performance of BlindCD. We now survey a few graph filter designs that are LPGF and comment on their low-pass coefficients $\eta$.

Example 1 Consider the filter order $T_d < \infty$ and

$$\mathcal{H}_1(\mathbf{L}) = (\mathbf{I} - \alpha \mathbf{L})^{T_d - 1}, \quad \alpha \in (0, 1/\lambda_N). \tag{27}$$

This filter models a discrete time diffusion process after $(T_d - 1)$ time instances on the graph [40]. In particular,

$$\eta_1 = \Big( \frac{1 - \alpha \lambda_{K+1}}{1 - \alpha \lambda_K} \Big)^{T_d - 1}. \tag{28}$$

Observe that the coefficient $\eta_1$ improves exponentially with $T_d$.

Example 2 Consider

$$\mathcal{H}_2(\mathbf{L}) = (\mathbf{I} + c^{-1}\mathbf{L})^{-1}, \tag{29}$$

for some $c > 0$. This filter is analogous to a single-pole infinite impulse response (IIR) filter in classical signal processing. Its low-pass coefficient is given by

$$\eta_2 = \frac{1 + c^{-1}\lambda_K}{1 + c^{-1}\lambda_{K+1}} = 1 - c^{-1} \frac{\lambda_{K+1} - \lambda_K}{1 + c^{-1}\lambda_{K+1}}. \tag{30}$$

Observe that the coefficient $\eta_2 \approx 1$ for $\lambda_{K+1} \ll 1$ or $c \gg 1$.

Example 1 is related to the diffusion dynamics in Section 3.1, while Example 2 is related to the consumers' game and opinion dynamics in Sections 3.2 and 3.3. For further reference, an overview of graph filters and their relevant network processes can be found in [1, 3].

We conclude this subsection by characterizing the low-pass coefficient $\eta$ from the properties of the generating function $h(\lambda)$. To simplify the analysis, we consider the class of filters such that $h(\lambda)$ satisfies the following assumption.

Assumption 1 The generating function $h(\lambda)$ is non-negative and non-increasing for all $\lambda \ge 0$.

Note that Assumption 1 holds for the graph filters in Examples 1 and 2. The following observation gives a bound on $\eta$ using the first and second order derivatives of $h(\lambda)$.

Observation 1 Suppose that Assumption 1 holds and that $h(\lambda)$ is $L_h$-smooth and $\mu_h$-strongly convex for $\lambda \in [\lambda_K, \lambda_{K+1}]$, where $0 \le \mu_h \le L_h$. Then, the graph filter $\mathcal{H}(\mathbf{L})$ is a $(K, \eta)$-LPGF with

$$\eta \le 1 - \frac{1}{\tilde{h}_K} \Big( \frac{\mu_h}{2} \Delta\lambda_K^2 - h'(\lambda_{K+1}) \Delta\lambda_K \Big), \qquad \eta \ge 1 - \frac{1}{\tilde{h}_K} \Big( \frac{L_h}{2} \Delta\lambda_K^2 - h'(\lambda_{K+1}) \Delta\lambda_K \Big), \tag{31}$$

where $\Delta\lambda_K := \lambda_{K+1} - \lambda_K$ is the spectral gap of $\mathbf{L}$.

The observation can be verified using the definitions of $L_h$-smooth and $\mu_h$-strongly convex functions [41]. Note that $\mu_h \ge 0$ means that $h(\lambda)$ is convex on $[\lambda_K, \lambda_{K+1}]$, while Assumption 1 implies that the derivative $h'(\lambda_{K+1})$ is non-positive. Consequently, the upper bound on $\eta$ depends on the spectral gap $\Delta\lambda_K$ and the magnitude of $\tilde{h}_K$.
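The low-pass coefficients of Examples 1 and 2 can be evaluated directly from Definition 1 and compared against the closed forms (28) and (30). The sketch below assumes NumPy; the five-point spectrum `lam` and the values of `alpha`, `T_d`, and `c` are hypothetical and chosen only so that $\alpha < 1/\lambda_N$.

```python
import numpy as np

def low_pass_coefficient(h_tilde, K):
    """eta of Definition 1; h_tilde is ordered by ascending Laplacian eigenvalue."""
    h_abs = np.abs(h_tilde)
    return h_abs[K:].max() / h_abs[:K].min()

# Hypothetical Laplacian spectrum with K = 2 communities (lambda_1 = 0 <= ...).
lam = np.array([0.0, 0.2, 3.0, 3.5, 4.0])
K, alpha, T_d, c = 2, 0.2, 6, 1.0          # alpha < 1 / lambda_N = 0.25

h1 = (1 - alpha * lam) ** (T_d - 1)        # Example 1, diffusion filter (27)
h2 = 1.0 / (1.0 + lam / c)                 # Example 2, single-pole IIR filter (29)

eta1 = low_pass_coefficient(h1, K)
eta2 = low_pass_coefficient(h2, K)

# Closed forms; note lam[K] is lambda_{K+1} and lam[K - 1] is lambda_K (0-based).
eta1_closed = ((1 - alpha * lam[K]) / (1 - alpha * lam[K - 1])) ** (T_d - 1)   # (28)
eta2_closed = (1 + lam[K - 1] / c) / (1 + lam[K] / c)                          # (30)
```

With this spectrum the diffusion filter is a much better low-pass filter than the single-pole IIR filter, consistent with the remark that $\eta_1$ improves exponentially with $T_d$.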
In particular, by the upper bound in (31), for a constant spectral gap, a small $\tilde{h}_K$ leads to $\eta \approx 0$ and thus a good LPGF.

4.2 Performance Analysis

This subsection shows that under the GSP model (11), (12) and using Definition 1, we can bound the 'suboptimality' of the communities obtained by BlindCD compared to the 'optimal' ones found using spectral clustering on $\mathbf{L}$ [cf. Section 2.2]. Together with recent advances in the theoretical analysis of spectral clustering [29], this result allows us to quantify the accuracy of BlindCD in performing blind community detection and provides new insights on how to improve its performance.

To proceed, first let us take the $K$-means objective function $F(\cdot)$ in (9), constructed from the eigenvectors of $\mathbf{L}$, as our performance metric. Let us denote

$$F^\star := \min_{\mathcal{C}_1, \ldots, \mathcal{C}_K \subseteq \mathcal{V}} F(\mathcal{C}_1, \ldots, \mathcal{C}_K) \tag{32}$$

as the optimal objective value. Furthermore, $\hat{\mathbf{C}}_y$ is the sampled covariance of $\{\mathbf{y}^\ell\}_{\ell=1}^L$ and $\overline{\mathbf{C}}_y$ is the covariance of $\mathbf{y}^\ell$ in the absence of noise. The ensuing performance guarantee follows:

Theorem 1 Under the following conditions:
1. $\mathcal{H}(\mathbf{L})$ is a $(K, \eta)$-LPGF [cf. Definition 1],
2. $\mathrm{rank}(\mathbf{V}_K \mathrm{diag}(\tilde{\mathbf{h}}_K) \mathbf{V}_K^\top \mathbf{B} \mathbf{Q}_K) = K$, where $\mathbf{Q}_K$ is the matrix of the top-$K$ right singular vectors of $\mathcal{H}(\mathbf{L})\mathbf{B}$,
3. $\mathrm{rank}(\mathcal{H}(\mathbf{L})\mathbf{B}) \ge K$,
4. There exists $\delta > 0$ such that
$$\delta := \beta_K(\overline{\mathbf{C}}_y) - \beta_{K+1}(\overline{\mathbf{C}}_y) - \|\hat{\mathbf{C}}_y - \overline{\mathbf{C}}_y\|_2 > 0, \tag{33}$$
where $\beta_K(\overline{\mathbf{C}}_y)$ is the $K$th largest eigenvalue of $\overline{\mathbf{C}}_y$.
For any $\epsilon > 0$, if the partition $\hat{\mathcal{C}}_1, \ldots, \hat{\mathcal{C}}_K$ found by BlindCD is a $(1+\epsilon)$-optimal solution (see footnote 2) to problem (25), then,

$$\sqrt{F(\hat{\mathcal{C}}_1, \ldots, \hat{\mathcal{C}}_K)} - \sqrt{(1+\epsilon) F^\star} \le (2+\epsilon)\sqrt{2K} \left( \sqrt{\frac{\gamma^2}{1+\gamma^2}} + \frac{\|\hat{\mathbf{C}}_y - \overline{\mathbf{C}}_y\|_2}{\delta} \right), \tag{34}$$

where $\gamma$ is bounded by

$$\gamma \le \eta\, \|\mathbf{V}_{N-K}^\top \mathbf{B} \mathbf{Q}_K\|_2\, \|(\mathbf{V}_K^\top \mathbf{B} \mathbf{Q}_K)^{-1}\|_2. \tag{35}$$

Footnote 2: This means that the objective value obtained is at most $(1+\epsilon)$ times the optimal value. See [32] for a polynomial-time algorithm achieving this.

The proof (inspired by [42], also see [43]) can be found in Appendix A. Condition 1) requires that the graph filter involved is an LPGF. This natural requisite imposes that the frequency response must be higher for those eigenvectors that capture the community structure in the graph. Conditions 2) and 3) are technical requirements implying that the rank $R$ of the excitation matrix $\mathbf{B}$ cannot be smaller than the number of clusters $K$ that we are trying to recover. Lastly, condition 4) imposes a restriction on the distance between the true covariance $\overline{\mathbf{C}}_y$ and the observed one $\hat{\mathbf{C}}_y$. This condition may be violated if the spectral gap $\beta_K(\overline{\mathbf{C}}_y) - \beta_{K+1}(\overline{\mathbf{C}}_y)$ is small or, relying on Lemma 1, if the noise power $\sigma_w^2$ is large.

Moreover, Eq. (34) in Theorem 1 bounds the optimality gap for the communities found by applying BlindCD compared to $F^\star$ in (32). We first observe that the performance decreases when the number of communities $K$ increases, which is natural. This bound consists of the sum of two contributions. The first term is a function of $\gamma$, which in turn depends on the low-pass coefficient $\eta$ of the LPGF involved as well as the alignment between the matrices $\mathbf{B}\mathbf{Q}_K$ and $\mathbf{V}_{N-K}$. From (35), the recovered communities are more accurate when: 1) the LPGF is close to ideal ($\eta \approx 0$) and 2) the distortion induced by $\mathbf{B}$ on the relevant eigenvectors $\mathbf{V}_K$ is minimal. The second term in (34) depends on the distance between $\hat{\mathbf{C}}_y$ and $\overline{\mathbf{C}}_y$, capturing the combined effect of noise in the observations (via $\sigma_w^2$) as well as the finite sample size. To further control this term, if we define $\boldsymbol{\Delta} := \hat{\mathbf{C}}_y - \overline{\mathbf{C}}_y$, the next result follows.

Lemma 1 [44, Remark 5.6.3, Exercise 5.6.4] Suppose that i) $\mathbf{y}^1, \ldots, \mathbf{y}^L$ are independent, and ii) they are bounded almost surely with $\|\mathbf{y}^\ell\|_2 \le Y$.
Let the effective rank of $\mathbf{C}_y$ be $r := \operatorname{Tr}(\mathbf{C}_y)/\|\mathbf{C}_y\|_2$, then for every $c > 0$, with probability at least $1 - c$, one has that

$$\|\boldsymbol{\Delta}\|_2 \le \sigma_w + C\left(\sqrt{\frac{Y^2 r \log(N/c)}{L}} + \frac{Y^2 r \log(N/c)}{L}\right), \tag{36}$$

for some constant $C$ that is independent of $N$, $r$, $L$, $c$, and $\sigma_y$.

Condition ii) in Lemma 1 is satisfied if $\mathbf{y}_\ell$ is sub-Gaussian and $N \gg 1$. From Lemma 1 it follows that the error converges to $\sigma_w$ at the rate of $O(\sqrt{r K^2 \log(N)/L})$. For our model, it can be verified that $r \approx R \ll N$, where $R$ is the rank of $\mathbf{B}$, and the sampling complexity is significantly reduced compared to a signal model with full-rank excitations.

In a nutshell, Theorem 1 illustrates the effects that the observation noise, the finite number of observations, and the low-pass structure of the filter have on the suboptimality of the communities obtained. As discussed above, the low-pass coefficient $\eta$ plays an important role in the performance of BlindCD. While $\eta$ is determined by the dynamics that generate the graph signals $\{\mathbf{y}_\ell\}_{\ell=1}^{L}$, it is possible to improve this coefficient, as described in the next section.

5 Boosted Blind Community Detection

The performance analysis in the previous section shows that the performance of BlindCD depends on the low-pass filter coefficient $\eta$. While it is impossible to change the graph filter that generates the data, this section presents a 'boosting' technique that extracts an improved low-pass filtered component, i.e., one with a smaller $\eta$, from the observed graph signals.

For the application of the boosting technique, we shall work with low-pass graph filters satisfying Assumption 1 and consider a data model where, apart from the access to the graph signals $\mathbf{y}_\ell \in \mathbb{R}^N$, we also have access to the latent parameter vector $\mathbf{z}_\ell \in \mathbb{R}^R$ [cf. (11), (12)]. This scenario can be justified in the example of pricing experiments [cf.
Section 3.2] when the price discounts are directly controlled by the seller attempting to estimate the network; or in the example of DeGroot dynamics [cf. Section 3.3] where the latent parameter vectors are the opinions of the stubborn agents.

First, the input-output pairs $\{\mathbf{z}_\ell, \mathbf{y}_\ell\}_{\ell=1}^{L}$ enable us to estimate the $N \times R$ matrix $\mathcal{H}(\mathbf{L})\mathbf{B}$ via the least-squares estimator

$$\mathbf{H}^\star \in \arg\min_{\widehat{\mathbf{H}} \in \mathbb{R}^{N \times R}} \frac{1}{L}\sum_{\ell=1}^{L} \big\|\mathbf{y}_\ell - \widehat{\mathbf{H}}\mathbf{z}_\ell\big\|_2^2, \tag{37}$$

where the solution is unique when $L \ge R$ and $\{\mathbf{z}_\ell\}_{\ell=1}^{L}$ spans $\mathbb{R}^R$. Importantly, we note the decomposition

$$\mathcal{H}(\mathbf{L})\mathbf{B} = \widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B} + \tilde{h}_N \mathbf{B}, \tag{38}$$

where

$$\widetilde{\mathcal{H}}(\mathbf{L}) := \mathcal{H}(\mathbf{L}) - \tilde{h}_N \mathbf{I} \tag{39}$$

is a graph filter with the generating function $\tilde{h}(\lambda) = h(\lambda) - \tilde{h}_N$. The graph filter $\widetilde{\mathcal{H}}(\mathbf{L})$ is called a boosted LPGF as it has a smaller low-pass coefficient, denoted by $\tilde{\eta}$, than the low-pass coefficient of the original $\mathcal{H}(\mathbf{L})$. This can be seen since (i) the magnitude of the boosted $K$th frequency response is reduced to $\tilde{h}_K - \tilde{h}_N$; (ii) the first and second order derivatives of $\tilde{h}(\lambda)$ are the same as those of $h(\lambda)$. Applying Observation 1, it follows that $\widetilde{\mathcal{H}}(\mathbf{L})$ has a smaller low-pass coefficient $\tilde{\eta}$ by replacing $\tilde{h}_K$ by $\tilde{h}_K - \tilde{h}_N$ in (31). Concretely, consider the following example.

Example 3 (Boosted single-pole IIR filter). Consider

$$\mathcal{H}_3(\mathbf{L}) := \mathcal{H}_2(\mathbf{L}) - (1 + c^{-1}\lambda_N)^{-1}\mathbf{I}, \tag{40}$$

where $\mathcal{H}_2(\mathbf{L})$ was defined in (29) and we note that $\tilde{h}_N = (1 + c^{-1}\lambda_N)^{-1}$. We have

$$\eta_3 = \frac{\lambda_N - \lambda_{K+1}}{\lambda_N - \lambda_K}\cdot\frac{1 + c^{-1}\lambda_K}{1 + c^{-1}\lambda_{K+1}} = \left(\frac{\lambda_N - \lambda_{K+1}}{\lambda_N - \lambda_K}\right)\eta_2 .$$

It follows that $\eta_3 \ll \eta_2$ whenever $\lambda_{K+1} \gg \lambda_K$.

In general, the discussion above shows that it is possible to reduce the low-pass coefficient $\eta$ significantly by adjusting the constant level of the frequency responses in graph filters. As a result, applying spectral clustering based on the top-$K$ left singular vectors of $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$ will return a more accurate community detection result.
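The identity in Example 3 can be checked numerically. The sketch below is illustrative only: it assumes that Definition 1 measures $\eta$ as the ratio of the largest frequency-response magnitude among $\lambda_{K+1},\dots,\lambda_N$ to the smallest among $\lambda_1,\dots,\lambda_K$, with the Laplacian eigenvalues sorted in ascending order. The spectrum `lam`, the value of `c`, and the helper `lowpass_coefficient` are hypothetical choices made for this illustration.

```python
import numpy as np

def lowpass_coefficient(h, K):
    """Assumed form of eta: max |h| over the high frequencies / min |h| over the low ones.

    h[i] is the frequency response at the (i+1)-th smallest Laplacian eigenvalue,
    so h[:K] are the K responses that carry the community structure.
    """
    h = np.abs(np.asarray(h, dtype=float))
    return h[K:].max() / h[:K].min()

# Hypothetical Laplacian spectrum (ascending) with a gap after the K-th eigenvalue.
lam = np.array([0.0, 0.3, 0.5, 4.0, 4.5, 5.0, 5.5, 6.0])
K, c = 3, 10.0

h2 = 1.0 / (1.0 + lam / c)   # single-pole IIR response h(lambda) = (1 + lambda/c)^{-1}
h3 = h2 - h2[-1]             # boosted response: subtract the response at lambda_N

eta2 = lowpass_coefficient(h2, K)
eta3 = lowpass_coefficient(h3, K)

# Closed form from Example 3 (0-based indexing: lam[K] is lambda_{K+1}, lam[K-1] is lambda_K).
eta3_closed = (lam[-1] - lam[K]) / (lam[-1] - lam[K - 1]) * eta2

print(f"eta2 = {eta2:.4f}, eta3 = {eta3:.4f}, closed form = {eta3_closed:.4f}")
```

For this toy spectrum the boosted coefficient is noticeably smaller than the original one, in line with the discussion above; how large the reduction is depends on where $\lambda_{K+1}$ sits relative to $\lambda_K$ and $\lambda_N$.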
In order to estimate $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$ from $\mathbf{H}^\star$ as in (38), one needs, in principle, to have access to $\mathbf{B}$ and the frequency response $\tilde{h}_N$. However, our goal is to obtain a boosting effect in the absence of knowledge about $\mathbf{B}$ and $\tilde{h}_N$. A key towards achieving this goal is to notice that $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$ is close to a rank-$K$ matrix since $\widetilde{\mathcal{H}}(\mathbf{L})$ has a small low-pass coefficient $\tilde{\eta}$. Hence, for $R > K$, it follows from (38) that $\mathbf{H}^\star$ can be decomposed into a low-rank matrix and a scaled version of the sketch matrix $\mathbf{B}$. This motivates us to consider the noisy matrix decomposition problem proposed in [45]:

$$\min_{\widehat{\mathbf{S}},\, \widehat{\mathbf{B}} \in \mathbb{R}^{N \times R}} \ \frac{1}{2}\|\mathbf{H}^\star - \widehat{\mathbf{S}} - \widehat{\mathbf{B}}\|_F^2 + \kappa\|\widehat{\mathbf{S}}\|_{\sigma,1} + \rho\, g(\widehat{\mathbf{B}}) \quad \text{s.t.} \quad g^\star(\widehat{\mathbf{S}}) \le \alpha, \tag{41}$$

where $\|\widehat{\mathbf{S}}\|_{\sigma,1}$ is the trace norm of the matrix $\widehat{\mathbf{S}}$, $\mathbf{H}^\star$ is a solution to (37), $\alpha, \kappa, \rho > 0$ are predefined parameters, $g(\cdot)$ is a decomposable regularizer of $\widehat{\mathbf{B}}$, which is a norm chosen according to the prior knowledge on the unknown sketch matrix $\mathbf{B}$, and $g^\star(\cdot)$ is its dual norm.

Algorithm 2 Boosted BlindCD method.
1: Input: Graph signals and excitation signals $\{\mathbf{y}_\ell, \mathbf{z}_\ell\}_{\ell=1}^{L}$; desired number of communities $K$.
2: Solve the convex optimization problem (41) and denote its solution as $(\widehat{\mathbf{S}}^\star, \widehat{\mathbf{B}}^\star)$.
3: Find the top-$K$ left singular vectors of $\widehat{\mathbf{S}}^\star$ and denote the set of singular vectors as $\widetilde{\mathbf{S}}_K \in \mathbb{R}^{N \times K}$.
4: Apply the K-means method on the row vectors of $\widetilde{\mathbf{S}}_K$.
5: Output: $K$ communities $\tilde{\mathcal{C}}_1,\dots,\tilde{\mathcal{C}}_K$.

A few examples for choices of $g(\cdot)$ are listed below.

• Localized excitation: We set
$$g_1(\widehat{\mathbf{B}}) = \|\operatorname{vec}(\widehat{\mathbf{B}})\|_1, \qquad g_1^\star(\widehat{\mathbf{S}}) = \|\operatorname{vec}(\widehat{\mathbf{S}})\|_\infty . \tag{42}$$
This regularization forces the solution $\widehat{\mathbf{B}}^\star$ to (41) to be an element-wise sparse matrix. This corresponds to the scenario where each element of the latent variables in $\mathbf{z}_\ell$ excites only a few of the nodes in our graph.

• Small number of excited nodes: Let $\hat{\mathbf{b}}_i^{\rm row}$ be the $i$th row vector of $\widehat{\mathbf{B}}$.
We then set
$$g_2(\widehat{\mathbf{B}}) = \sum_{i=1}^{N} \|\hat{\mathbf{b}}_i^{\rm row}\|_2, \qquad g_2^\star(\widehat{\mathbf{S}}) = \max_{i=1,\dots,N} \|\hat{\mathbf{s}}_i^{\rm row}\|_2 . \tag{43}$$
This regularization is motivated by the group-sparsity formulation in [46], which forces the solution $\widehat{\mathbf{B}}^\star$ to (41) to be row-sparse. Notice that this is relevant when the graph filter is excited on a small number of nodes.

• Small perturbation: We set
$$g_3(\widehat{\mathbf{B}}) = \|\widehat{\mathbf{B}}\|_F, \qquad g_3^\star(\widehat{\mathbf{S}}) = \|\widehat{\mathbf{S}}\|_F . \tag{44}$$
This regularization models each entry of $\tilde{h}_N \mathbf{B}$ as a Gaussian random variable of small, identical variance. This can be used when there is no prior knowledge on $\mathbf{B}$.

Notice that for every choice of the regularizer $g(\cdot)$ discussed, (41) is a convex problem that can be solved in polynomial time. Let the optimal solution to (41) be $\widehat{\mathbf{S}}^\star, \widehat{\mathbf{B}}^\star$. We apply the spectral method to $\widehat{\mathbf{S}}^\star$ based on its top-$K$ left singular vectors. The boosted BlindCD method is summarized in Algorithm 2.

5.1 Performance Analysis

This section analyzes the performance of the boosted BlindCD method, mimicking the ideas in Section 4.2. Due to the space limitation, we focus on the special case where $\mathbf{B}$ is sparse and select $g_1(\widehat{\mathbf{B}})$ in (42) when solving (41).

Our first step towards deriving a theoretical bound for the performance of boosted BlindCD is to characterize the estimation error of $\mathcal{H}(\mathbf{L})\mathbf{B}$ when solving (37), defined as

$$\mathbf{E} := \mathbf{H}^\star - \mathcal{H}(\mathbf{L})\mathbf{B}. \tag{45}$$

Lemma 2 Suppose that $L \ge R$, $\{\mathbf{z}_\ell\}_{\ell=1}^{L}$ spans $\mathbb{R}^R$, and $\|\mathbf{w}_\ell(\mathbf{z}_\ell)^\top\| < \infty$ almost surely. For every $c > 0$ and with probability at least $1 - 2c$, it holds that

$$\|\mathbf{E}\|_2 = O\!\left(\frac{\sigma_w \log((N+R)/c)}{\sqrt{L}}\right). \tag{46}$$

The proof can be found in Appendix B. Lemma 2 captures the expected behavior of a vanishing estimation error when $L \to \infty$. Next, we show that $\widehat{\mathbf{S}}^\star$ from (41) is close to $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$ by leveraging the fact that the latter is approximately rank-$K$.
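The least-squares step in (37) and the decay of the estimation error $\mathbf{E}$ in (45) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the matrix `M` is a hypothetical random stand-in for the unknown $\mathcal{H}(\mathbf{L})\mathbf{B}$, the excitations are drawn uniformly from $[-1, 1]$, the noise is Gaussian, and the sizes `N`, `R`, `sigma_w` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N, R, sigma_w = 20, 4, 0.1

# Hypothetical stand-in for the unknown N x R matrix H(L)B.
M = rng.standard_normal((N, R))

def ls_estimate(Y, Z):
    """Least-squares fit of H in Y ~= H Z, with the L samples stored as columns."""
    # np.linalg.lstsq solves Z^T H^T ~= Y^T for H^T in the least-squares sense.
    Ht, *_ = np.linalg.lstsq(Z.T, Y.T, rcond=None)
    return Ht.T

def estimation_error(L):
    """Spectral norm of E = H_star - M for one random draw of L input-output pairs."""
    Z = rng.uniform(-1.0, 1.0, size=(R, L))              # known excitations z_l
    Y = M @ Z + sigma_w * rng.standard_normal((N, L))    # noisy graph signals y_l
    return np.linalg.norm(ls_estimate(Y, Z) - M, 2)

errors = {L: estimation_error(L) for L in (100, 10_000)}
print(errors)
```

Increasing the number of samples by a factor of 100 should shrink the error by roughly a factor of 10 in this setup, which is the $1/\sqrt{L}$ behavior that Lemma 2 describes.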
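Lemma 3 [45, Corollary 1] Consider problem (41) with

$$\kappa \ge 4\|\mathbf{E}\|_2, \qquad \rho \ge 4\left(\frac{\alpha}{\sqrt{NR}} + \|\operatorname{vec}(\mathbf{E})\|_\infty\right), \qquad \alpha \ge \sqrt{NR}\,\|\operatorname{vec}(\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B})\|_\infty . \tag{47}$$

Let $R \ge K$. There exist constants $c_1, c_2$ such that

$$\|\widehat{\mathbf{S}}^\star - \widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}\|_F^2 + \|\widehat{\mathbf{B}}^\star - \tilde{h}_N\mathbf{B}\|_F^2 \le c_1\kappa^2\left(K + \frac{1}{\kappa}\sum_{j=K+1}^{R}\sigma_j(\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B})\right) + c_2\rho^2\|\operatorname{vec}(\mathbf{B})\|_0 . \tag{48}$$

The term $\sum_{j=K+1}^{R}\sigma_j(\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B})$ is negligible when $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$ is approximately rank-$K$. Therefore, the implication is that the distance between $\widehat{\mathbf{S}}^\star$ and $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$ can be bounded by the sum of two terms: one that depends on $\mathbf{E}$, and one that depends on $\alpha/\sqrt{NR}$. Overall, it shows that the error reduces when the excitation rank $R$ and the number of observations $L$ increase. On the other hand, (47) suggests that one should set $\kappa = c_1/\sqrt{L}$, $\rho = c_2/\sqrt{RL}$ in (41) for some $c_1, c_2$ for the optimal performance.

Having established these results, the boosted BlindCD method can be viewed as an approximation of BlindCD operating on the boosted LPGF $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$. Next, we define the SVD of $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$ as $\widetilde{\mathbf{V}}\widetilde{\boldsymbol{\Sigma}}\widetilde{\mathbf{Q}}^\top$ and analyze the performance of the boosted BlindCD through a minor modification of Theorem 1.

Corollary 1 Suppose that Conditions 1 to 3 in Theorem 1 are met when replacing $\mathbf{Q}_K$ by $\widetilde{\mathbf{Q}}_K$ and $\mathcal{H}(\mathbf{L})$ by $\widetilde{\mathcal{H}}(\mathbf{L})$. Let $\widetilde{\boldsymbol{\Delta}} := \widehat{\mathbf{S}}^\star - \widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$ and assume that

$$\tilde{\delta} := \sigma_K(\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}) - \sigma_{K+1}(\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}) - \|\widetilde{\boldsymbol{\Delta}}\|_2 > 0 . \tag{49}$$

If Step 4 in the boosted BlindCD method finds a $(1+\epsilon)$-optimal solution to the K-means problem, where $\epsilon > 0$, then

$$\sqrt{F(\tilde{\mathcal{C}}_1,\dots,\tilde{\mathcal{C}}_K)} - \sqrt{(1+\epsilon)F^\star} \le (2+\epsilon)\sqrt{2K}\left(\sqrt{\frac{\tilde{\gamma}^2}{1+\tilde{\gamma}^2}} + \frac{\|\widetilde{\boldsymbol{\Delta}}\|_2}{\tilde{\delta}}\right), \tag{50}$$

where $F(\cdot)$, $F^\star$ are defined in (9), (32), respectively, and

$$\tilde{\gamma} \le \tilde{\eta}\, \|\mathbf{V}_{N-K}^\top\mathbf{B}\widetilde{\mathbf{Q}}_K\|_2\, \|(\mathbf{V}_K^\top\mathbf{B}\widetilde{\mathbf{Q}}_K)^{-1}\|_2, \tag{51}$$

where $\tilde{\eta}$ is the low-pass coefficient of the boosted LPGF $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$.

The proof of Corollary 1 can be found in Appendix C.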
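To make Steps 2 and 3 of Algorithm 2 concrete, the following sketch solves a simplified version of (41) with the regularizer $g_1$ by alternating exact minimization over $\widehat{\mathbf{S}}$ and $\widehat{\mathbf{B}}$. It is not the solver used in the paper: it assumes that the constraint $g^\star(\widehat{\mathbf{S}}) \le \alpha$ is inactive and drops it, and the synthetic matrix `H`, the parameters `kappa` and `rho`, and all function names are hypothetical.

```python
import numpy as np

def svt(X, tau):
    """Singular-value thresholding: proximal operator of tau * trace norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    """Entry-wise soft thresholding: proximal operator of tau * l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def objective(H, S, B, kappa, rho):
    fit = 0.5 * np.linalg.norm(H - S - B, "fro") ** 2
    return fit + kappa * np.linalg.norm(S, "nuc") + rho * np.abs(B).sum()

def lowrank_plus_sparse(H, kappa, rho, n_iter=100):
    """Alternating minimization for (41) with g_1; the dual-norm constraint is omitted."""
    S, B = np.zeros_like(H), np.zeros_like(H)
    history = [objective(H, S, B, kappa, rho)]
    for _ in range(n_iter):
        S = svt(H - B, kappa)    # exact minimizer over S for fixed B
        B = soft(H - S, rho)     # exact minimizer over B for fixed S
        history.append(objective(H, S, B, kappa, rho))
    return S, B, history

# Synthetic stand-in for H_star: rank-K part + sparse part + small noise.
rng = np.random.default_rng(0)
N, R, K = 30, 10, 3
low_rank = rng.standard_normal((N, K)) @ rng.standard_normal((K, R))
sparse = np.zeros((N, R))
sparse[rng.integers(0, N, 15), rng.integers(0, R, 15)] = 3.0
H = low_rank + sparse + 0.01 * rng.standard_normal((N, R))

S_hat, B_hat, history = lowrank_plus_sparse(H, kappa=0.5, rho=0.2)

# Step 3 of Algorithm 2: top-K left singular vectors of the low-rank component.
U, _, _ = np.linalg.svd(S_hat, full_matrices=False)
S_K = U[:, :K]
```

Because each update minimizes the objective exactly over one block, the recorded objective values never increase. The rows of `S_K` would then be passed to a K-means routine (Step 4), which is omitted here.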
We see that the performance of the boosted BlindCD method depends on $\tilde{\eta}$, the low-pass coefficient of the boosted LPGF. As $\tilde{\eta} \ll \eta$ due to our prior discussions, it is anticipated that the boosted method achieves a much better performance, especially when the original LPGF is not markedly low-pass.

While the bound in (50) is similar to that in Theorem 1, we observe that applying Lemma 2 and Lemma 3 yields

$$\|\widetilde{\boldsymbol{\Delta}}\|_2 \le \|\widetilde{\boldsymbol{\Delta}}\|_F = O\!\left(\sigma_{K+1}(\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}) + \frac{1}{\sqrt{L}} + \frac{1}{\sqrt{NR}}\right).$$

From the definition of $\tilde{\delta}$ we have

$$\frac{\|\widetilde{\boldsymbol{\Delta}}\|_2}{\tilde{\delta}} = O\!\left(\frac{\sigma_{K+1}(\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}) + 1/\sqrt{L} + 1/\sqrt{NR}}{\sigma_K(\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}) - C\,\sigma_{K+1}(\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B})}\right), \tag{52}$$

for some constant $C$. Substituting (52) into (50) shows that the suboptimality of boosted BlindCD can be minimized when 1) the spectral gap for the sketched matrix $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$, 2) the number of samples $L$, and 3) the excitation rank $R$, are large.

We remark that it is possible to undertake analogous performance analysis for the other proposed regularizers on $\mathbf{B}$ [cf. (43) and (44)]. For example, this can be done using [47] and replacing Lemma 3 with the corresponding result. These extensions, however, are beyond the scope of the current paper.

6 Numerical Examples

To illustrate the efficacy of the BlindCD methods, we study three application examples that pertain to consensus dynamics, consumer networks, and social networks. Numerical examples will be given for these applications, which were introduced in Sections 3.1 through 3.3.

Unless otherwise specified, the graphs used in the simulations will be generated according to a stochastic block model (SBM) [48], denoted by $G \sim \mathrm{SBM}(N, K, a, b)$, such that $G$ has $N$ nodes, $K$ equal-sized non-overlapping communities and the intra (resp. inter) community connectivity probability is $a \in [0,1]$ (resp. $b \in [0,1]$). The weights on the graph, $A_{ij}$, are set to 1 if $(i,j) \in E$ and 0 otherwise.
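The SBM described above can be sampled in a few lines. This is a minimal sketch, assuming an undirected graph without self-loops in which node pairs are connected independently and $K$ divides $N$; the function name `sample_sbm` and the seed are hypothetical.

```python
import numpy as np

def sample_sbm(N, K, a, b, rng):
    """Sample a binary symmetric adjacency matrix with K equal-sized communities.

    Node pairs in the same community are connected with probability a, pairs in
    different communities with probability b. Assumes K divides N.
    """
    labels = np.repeat(np.arange(K), N // K)           # ground-truth memberships
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, a, b)
    upper = np.triu(rng.random((N, N)) < probs, k=1)   # edges above the diagonal only
    A = (upper | upper.T).astype(float)                # symmetrize; diagonal stays zero
    return A, labels

rng = np.random.default_rng(1)
N, K = 150, 3
A, labels = sample_sbm(N, K, 8 * np.log(N) / N, np.log(N) / N, rng)
print(A.shape, int(A.sum()) // 2, "edges")
```

The parameters in the usage lines match the setting $\mathrm{SBM}(N, K, 8\log N/N, \log N/N)$ with $N = 150$ and $K = 3$ used in the experiments; the returned `labels` play the role of the ground-truth memberships.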
We use the ground-truth community membership from the SBM generation when evaluating the accuracy. The error rate is given by

$$P_e := \mathbb{E}\left[\frac{1}{N}\min_{\pi:[K]\to[K]}\sum_{i=1}^{N}\mathbb{1}_{\pi(c_i) \ne c_i^{\rm true}}\right], \tag{53}$$

and the above is approximated via Monte-Carlo simulations, where $\mathbb{1}_{E}$ is an indicator function for the event $E$, $\pi : [K] \to [K]$ is a permutation function and $c_i \in [K]$ (resp. $c_i^{\rm true}$) is the detected (resp. true) community membership of node $i$.

[Figure 1 plot: error rate $P_e$ versus sample size $L$; curves: SC on $\mathbf{S}$, $T_d = 11$, $T_d = 16$, $T_d = 21$, $T_d = 26$.]
Figure 1: Community detection performance versus sample size $L$. We consider graphs generated as $G \sim \mathrm{SBM}(N, K, 8\log N/N, \log N/N)$ with $N = 150$, $K = 3$ and fix the excitation rank at $R = 15$. The solid (resp. dashed) lines show the performance of BlindCD on the sampled output covariance $\widehat{\mathbf{C}}_y$ (resp. true and noiseless covariance $\mathbf{C}_y$).

6.1 Diffusion Dynamics

We first evaluate the performance of BlindCD using graph signals generated according to the observation model in (11). We focus on the diffusion dynamics in Section 3.1. We perform Monte-Carlo simulations to evaluate the community detection performance on random graphs. In this example, the SBM graphs generated are $G \sim \mathrm{SBM}(N, K, 8\log N/N, \log N/N)$ with $N = 150$ and $K = 3$. We simulate a scenario where the graph filter is excited on only $R$ nodes. In this case, the sketch matrix $\mathbf{B}$ is generated by first picking $R$ rows uniformly from the $N$ available rows, and each element in the selected rows is set to one with probability $p_b = 0.5$. For the boosted BlindCD method, we test the formulation of (41) with regularizers $g_1(\widehat{\mathbf{B}})$ and $g_2(\widehat{\mathbf{B}})$ [cf. (42) and (43)] by setting $\kappa = 2/\sqrt{L}$ and $\rho = 0.5/\sqrt{RL}$. The variance of the observation noise is $\sigma_w^2 = 10^{-2}$ and each element of $\mathbf{z}_\ell$ is generated independently as $[\mathbf{z}_\ell]_i \sim U[-1, 1]$.
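For a single trial, the quantity inside the expectation in (53) can be computed by brute force over label permutations. The sketch below assumes labels take values in $\{0,\dots,K-1\}$ and that $K$ is small enough for enumerating all $K!$ permutations; the function name `error_rate` and the toy label vectors are hypothetical.

```python
import numpy as np
from itertools import permutations

def error_rate(detected, truth, K):
    """Fraction of misclassified nodes, minimized over all relabelings of the K communities."""
    detected = np.asarray(detected)
    truth = np.asarray(truth)
    best = 1.0
    for perm in permutations(range(K)):
        relabeled = np.asarray(perm)[detected]   # apply pi to every detected label
        best = min(best, float(np.mean(relabeled != truth)))
    return best

# Toy example with 9 nodes and K = 3: one node is misplaced after the best relabeling.
truth = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
detected = np.array([2, 2, 2, 0, 0, 1, 1, 1, 1])
print(error_rate(detected, truth, K=3))
```

Averaging this value over independent trials gives the Monte-Carlo approximation of $P_e$. For larger $K$, the enumeration could be replaced by an assignment solver, which is outside the scope of this sketch.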
The first example examines the effect of the graph filter's low-pass coefficient $\eta$ and sample size $L$ on the performance of BlindCD. In particular, Fig. 1 shows the performance of community detection for different filter orders $T_d$ against the number of samples $L$ accrued. Notice that the low-pass coefficient $\eta$ decreases with the filter order $T_d$ [cf. (28)]. As such, we observe that the performance improves with $T_d$. The error rate approaches that achieved by applying spectral clustering on the actual $\mathbf{L}$. An interesting observation is that for sample covariances, as $T_d$ increases, the sample size $L$ required to reach the performance of the noiseless covariance also increases. This can be explained with the condition (33) in Theorem 1. In particular, as $T_d$ increases, the absolute value of $\beta_K(\mathbf{C}_y) - \beta_{K+1}(\mathbf{C}_y)$ decreases, therefore restricting $\|\widehat{\mathbf{C}}_y - \mathbf{C}_y\|_2$ to be smaller [cf. (33)]. The latter is satisfied when the number of samples accrued is sufficiently large.

The second example shows the effect of the excitation rank $R$. The results are shown in Fig. 2, where we have fixed $L = 10^3$ and $T_d = 16$. In this example, we have compared the performance of BlindCD to a 2-step procedure which uses [18] (with efficient implementation in [39]) to recover the GSO, then applies spectral clustering on the recovered GSO to detect communities. For BlindCD, we observe that the performance improves with the rank $R$, while the 2-step procedure performs poorly (see footnote 3).

[Figure 2 plot: error rate $P_e$ versus excitation rank $R$; curves: SC on $\mathbf{S}$, BlindCD on $\mathbf{C}_y$, BlindCD on $\widehat{\mathbf{C}}_y$, Boosted w/ $g_1(\widehat{\mathbf{B}})$, Boosted w/ $g_2(\widehat{\mathbf{B}})$, 2-step w/ [18].]
Figure 2: Community detection performance versus excitation rank $R$. We consider graphs generated as $G \sim \mathrm{SBM}(N, K, 8\log N/N, \log N/N)$ with $N = 150$, $K = 3$ and fixed $T_d = 16$, $L = 10^3$.
As predicted by Corollary 1, the boosting technique enhances the performance of BlindCD.

Footnote 3: The 2-step method with [18] provides accurate results only when $R \ge 100$.

The example in Fig. 3 shows the performance of an instance of BlindCD on Zachary's Karate Club network when the graph signals are generated from the diffusion dynamics. To capitalize on the benefit of the boosting technique, we consider a scenario with a filter order of $T_d = 6$, observation rank of $R = 5$ (the graph is excited on just 5 nodes) and we observe $L = 10^3$ noisy samples of the graph signals. Observe that the low-pass coefficient for the filter may be close to 1 as $T_d$ is small. This explains the poorer performance of BlindCD in Fig. 3(b). The boosted BlindCD, instead, delivers good performance as it identifies the two communities in the network except for a misclassification of agent 17. Through sorting the row sums of the estimated $\widehat{\mathbf{B}}$, we also detected the sites of the excitations, as shown in Fig. 3(a).

[Figure 3 panels: (a) Zachary's karate club network. Highlighted nodes in magenta are the actual sites of excitations, while nodes marked as rectangles are the detected sites of excitations using boosted BlindCD. The only mismatches were nodes '4' and '34'. (b) BlindCD result. (c) Boosted BlindCD result.]
Figure 3: Experiments on Zachary's karate club network. The network consists of $N = 34$ nodes and (approximately) $K = 2$ communities. The graph filter models a diffusion dynamics with an order of $T_d = 6$ and the graph signals observed have only rank $R = 5$ as only 5 nodes are injected with input signals. The bottom plots show the result of both BlindCD methods.

6.2 Network Dynamics Models

We describe applications of our BlindCD methods on detecting communities in consumer and social networks, where the models have been studied in Sections 3.2 and 3.3. In the Monte-Carlo simulations below, we generate the graphs as $G \sim \mathrm{SBM}(150, 3, 8\log N/N, \log N/(2N))$, $N = 150$. For the consumer games, $\mathbf{A}$ is taken as the binary adjacency matrix of $G$ and $\mathbf{B}$ is chosen as in (19), where the set of affected agents $\mathcal{I}$ is selected uniformly. Furthermore, in the utility (14), we set $b = 2\|\mathbf{A}\mathbf{1}\|_\infty$ and $a = 2\max_\ell \|\mathbf{p}_\ell\|_\infty$ such that the equilibrium always satisfies (16). For the social networks, we first generate the support of $\mathbf{B}$ as a sparse bipartite graph with connectivity $2\log N/N$, then the weights on $\mathbf{A}$, $\mathbf{B}$ are assigned uniformly such that all the rows in the concatenated matrix $[\mathbf{A}, \mathbf{B}]$ sum up to one. This models a setting where the stubborn agents are connected sparsely to the others, i.e., they are located at the periphery of the communities. Note that the support of $\mathbf{A}$ is symmetric with $A_{ij} \ne 0 \Leftrightarrow A_{ji} \ne 0$. Snapshots of the set-ups for both networks are found in Fig. 4.

Despite the similarity to the previous examples, it is important to note that for the social networks, the Laplacian matrix $\mathbf{L}$ can be asymmetric. Nevertheless, we anticipate that BlindCD would work in this case provided that $\mathbf{L}$ is approximately symmetric. This symmetry in $\mathbf{L}$ is consistent with assuming that trust in social networks is of mutual nature (see footnote 4).

Footnote 4: Additional numerical experiments show that, on a directed graph where the trusts are not mutual, the BlindCD method recovers the same sets of nodes that are discovered by performing spectral clustering on the eigenvectors of $\mathbf{L}$. We omit these interesting results here since their interpretation requires a different notion of community for directed graphs, e.g., see [49]. Further investigations on this subject are left for future work.

[Figure 4 panels: (Left) consumers network; (Right) social network.]
Figure 4: Snapshots of set-ups for case studies on network dynamics. (Left) A consumers network, where the highlighted nodes are the agents that the pricing experiments were performed on. (Right) A social network, where the highlighted nodes are the stubborn agents. Both networks are generated according to $\mathrm{SBM}(150, 3, 8\log N/N, \log N/(2N))$.

The consumption levels and steady-state opinions can both be generated from the graph filter in Example 2. The difference between the two cases rests on the design of the sketch matrix $\mathbf{B}$. In the following, we fix the number of samples at $L = 10^4$ with a noise variance of $\sigma_w^2 = (10^{-1}/b^2)^2$ for consumer games and $\sigma_w^2 = 10^{-2}$ for social networks. For the boosted BlindCD method, we set $\kappa = 2/\sqrt{L}$, $\rho = 4/\sqrt{RL}$ for consumer games and $\kappa = 2/\sqrt{L}$, $\rho = 1/\sqrt{RL}$ for social networks; and we test the formulation of (41) with the regularizer $g_1(\widehat{\mathbf{B}})$. For the social network, we included a comparison to a 2-step procedure which first recovers the graph topology using [22], and then applies spectral clustering on the inferred topology.

The results of our numerical experiments are shown in Fig. 5, where we compare the community detection performance as the excitation rank $R$ increases in both systems. Similar to the previous experiment in Fig. 2, for both cases we observe that the performance improves with $R$ and the boosted BlindCD method delivers the best performance consistently. Overall, the performance improvement with boosted BlindCD is greater than in the previous example [cf. Fig. 2]. The reason behind this is the fact that the IIR graph filter has a poor low-pass coefficient depending on the parameter $c \gg 1$ for the scenario we have considered.
Another observation is that the community detection performance of the un-boosted BlindCD saturates at $R \approx 25$ for the opinion dynamics experiments while it continues to improve with $R$ for the pricing ones. This is due to the different model used for the sketch matrix $\mathbf{B}$. In particular, for the pricing experiments, $\mathbf{B}$ is merely a sub-matrix of the identity matrix. Recall from Theorem 1 that the performance of BlindCD depends on the product $\|\mathbf{V}_{N-K}^\top\mathbf{B}\mathbf{Q}_K\|_2\,\|(\mathbf{V}_K^\top\mathbf{B}\mathbf{Q}_K)^{-1}\|_2$, which is anticipated to decrease since $\mathbf{B}$ approaches a permutation of $\mathbf{I}$ as $R$ approaches $N$, yielding a better performance. The same observation does not apply for opinion dynamics as the sketch matrix does not approximate the identity matrix as $R$ grows.

[Figure 5 plots: error rate $P_e$ versus excitation rank $R$. Left panel curves: BlindCD-noiseless cov., BlindCD-sampled cov., Boosted BlindCD. Right panel curves: BlindCD-noiseless cov., BlindCD-sampled cov., Boosted BlindCD, 2-step w/ [22].]
Figure 5: Community detection performance on cases of network dynamics. (Left) Pricing experiments on consumers network. (Right) Opinion dynamics with stubborn agents on social networks. In both cases, we consider networks generated as $G \sim \mathrm{SBM}(N, 3, 8\log N/N, \log N/(2N))$, $N = 150$.

We then illustrate an application on real network topologies for the two network dynamics. Fig. 6 shows an example of simulated pricing experiments on the network highschool from [50], which is a friendship network between $N = 70$ high school students with $|E| = 273$ undirected edges. On the other hand, Fig. 7 shows the case study for opinion dynamics on the Facebook network of ReedCollege [51], which is a friendship network with $N = 962$ college students and $|E| = 18{,}812$ undirected edges, and we influence the network using $R = 150$ stubborn agents.
To handle the high dimensionality, we applied the fast algorithm from [52] to solve the robust PCA problem in (41). In both cases, we observe that the boosted BlindCD method recovers the communities in the networks, as evidenced from the illustrations and the ratio-cut scores.

6.3 Application to US Senate Rollcall Records

We consider applying the BlindCD methods to the US Senate rollcall record on https://voteview.com for the 110th congress. The dataset contains 657 rollcalls during the period from 2007 to 2009. To represent the opinions of the states during a rollcall, we consider the votes from the two Senators of a state by counting a 'yay' as 1, while a 'nay' or 'absent' is counted as 0. By treating each state as a node on a graph with 50 nodes, this results in $L = 657$ samples of graph signals with values $\{0, 1, 2\}$.

As argued in [33], the rollcall data may be modeled as the equilibrium of an opinion dynamics process with stubborn agents. Therefore, we selected 4 states, namely Massachusetts (MA), New York (NY), Alabama (AL), and Louisiana (LA), which are the most liberal/conservative states [53], as the 'stubborn' states modeled in Section 3.3. We then apply the BlindCD methods to detect communities for the remaining $N = 46$ states. For the boosted method, we use the sparse regularizer $g_1(\widehat{\mathbf{B}})$ to promote sparsity in the $\widehat{\mathbf{B}}^\star$ component of the solution.

Fig. 8 shows the $K = 2$ communities detected using the proposed methods. We observe that the boosted BlindCD method successfully identifies Maine to be in the same community as Texas, where both states were controlled by Republicans in this congress. Fig. 9 shows the inferred $\widehat{\mathbf{B}}^\star$ matrix modeled in Section 3.3, where we labeled the rows as the stubborn states and the columns as the regular states. A large number in the table indicates strong influence from the stubborn to the regular state. We observe consistent results, e.g., NY (resp.
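LA) positively influencing Illinois (resp. Arkansas) as both are Democrat (resp. Republican) states in this congress; NY is negatively influencing Idaho (Republican in this congress).

7 Conclusions

This paper proposes two blind community detection methods for inferring community structure from graph signals. We consider a challenging and realistic setting where the observed graph signals are outcomes of a graph filter with low-rank excitations. The BlindCD methods rely on an intrinsic low-pass property of the graph filters that model the network dynamics. This property holds for common network processes and the accuracy of BlindCD is analyzed by viewing the graph signals as sketches of the graph filters. We propose a boosting technique to improve the performance of BlindCD. The technique leverages the latent 'low-rank plus sparse' structure related to the graph signals. Extensive numerical experiments verify our findings.

A Proof of Theorem 1

To simplify the notation while proving the theorem, let us define the following indicator matrices for the communities. Firstly, the matrix $\widehat{\mathbf{X}} \in \mathbb{R}^{N \times K}$ is associated with the communities $\{\hat{\mathcal{C}}_1,\dots,\hat{\mathcal{C}}_K\}$ found with BlindCD and defined as

$$\widehat{X}_{ij} := \begin{cases} 1/\sqrt{|\hat{\mathcal{C}}_j|}, & \text{if } i \in \hat{\mathcal{C}}_j, \\ 0, & \text{otherwise}. \end{cases} \tag{54}$$

We have

$$\|\widehat{\mathbf{V}}_K - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top\widehat{\mathbf{V}}_K\|_F^2 = \sum_{k=1}^{K}\sum_{i \in \hat{\mathcal{C}}_k}\Big\|\hat{\mathbf{v}}_i^{\rm row} - \frac{1}{|\hat{\mathcal{C}}_k|}\sum_{j \in \hat{\mathcal{C}}_k}\hat{\mathbf{v}}_j^{\rm row}\Big\|_2^2 .$$

Define $\mathcal{X}$ as the set of all possible indicator matrices of partitions. Using the $(1+\epsilon)$-optimality assumption in Theorem 1, we have that

$$\|\widehat{\mathbf{V}}_K - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top\widehat{\mathbf{V}}_K\|_F^2 \le (1+\epsilon)\min_{\mathbf{X} \in \mathcal{X}}\|\widehat{\mathbf{V}}_K - \mathbf{X}\mathbf{X}^\top\widehat{\mathbf{V}}_K\|_F^2 \le (1+\epsilon)\|\widehat{\mathbf{V}}_K - \mathbf{X}^\star(\mathbf{X}^\star)^\top\widehat{\mathbf{V}}_K\|_F^2, \tag{55}$$

where we have defined $\mathbf{X}^\star \in \mathbb{R}^{N \times K}$ by replacing $\hat{\mathcal{C}}_i$ in (54) with $\mathcal{C}_i^\star$ such that $\mathcal{C}_1^\star,\dots,\mathcal{C}_K^\star$ is an optimal set of communities found by minimizing $F(\mathcal{C}_1,\dots,\mathcal{C}_K)$ [cf. (9)]. On the other hand, by the definition,

$$\|\mathbf{V}_K - \mathbf{X}^\star(\mathbf{X}^\star)^\top\mathbf{V}_K\|_F^2 = \min_{\mathbf{X} \in \mathcal{X}}\|\mathbf{V}_K - \mathbf{X}\mathbf{X}^\top\mathbf{V}_K\|_F^2 = \min_{\mathcal{C}_1,\dots,\mathcal{C}_K} F(\mathcal{C}_1,\dots,\mathcal{C}_K) = F^\star, \tag{56}$$

and furthermore $\|\mathbf{V}_K - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top\mathbf{V}_K\|_F^2 = F(\hat{\mathcal{C}}_1,\dots,\hat{\mathcal{C}}_K)$.

Define the error matrix as $\mathbf{E} := \mathbf{V}_K\mathbf{V}_K^\top - \widehat{\mathbf{V}}_K\widehat{\mathbf{V}}_K^\top$. We observe the following chain of inequalities:

$$\begin{aligned} \|\mathbf{V}_K - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top\mathbf{V}_K\|_F &= \|(\mathbf{I} - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top)(\widehat{\mathbf{V}}_K\widehat{\mathbf{V}}_K^\top + \mathbf{E})\|_F \\ &\le \|(\mathbf{I} - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top)\widehat{\mathbf{V}}_K\widehat{\mathbf{V}}_K^\top\|_F + \|(\mathbf{I} - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top)\mathbf{E}\|_F \\ &\le \|(\mathbf{I} - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top)\widehat{\mathbf{V}}_K\widehat{\mathbf{V}}_K^\top\|_F + \|\mathbf{E}\|_F, \end{aligned} \tag{57}$$

where the first equality is due to $\mathbf{V}_K^\top\mathbf{V}_K = \mathbf{I}$ and the last inequality holds because $\mathbf{I} - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top$ is a projection matrix. Using (55), we have that

$$\begin{aligned} \|(\mathbf{I} - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top)\widehat{\mathbf{V}}_K\widehat{\mathbf{V}}_K^\top\|_F + \|\mathbf{E}\|_F &\le \sqrt{1+\epsilon}\,\|(\mathbf{I} - \mathbf{X}^\star(\mathbf{X}^\star)^\top)(\mathbf{V}_K\mathbf{V}_K^\top - \mathbf{E})\|_F + \|\mathbf{E}\|_F \\ &\le \sqrt{1+\epsilon}\,\|(\mathbf{I} - \mathbf{X}^\star(\mathbf{X}^\star)^\top)\mathbf{V}_K\mathbf{V}_K^\top\|_F + \sqrt{1+\epsilon}\,\|(\mathbf{I} - \mathbf{X}^\star(\mathbf{X}^\star)^\top)\mathbf{E}\|_F + \|\mathbf{E}\|_F \\ &\le \sqrt{1+\epsilon}\,\|(\mathbf{I} - \mathbf{X}^\star(\mathbf{X}^\star)^\top)\mathbf{V}_K\mathbf{V}_K^\top\|_F + (2+\epsilon)\|\mathbf{E}\|_F \\ &= \sqrt{(1+\epsilon)F^\star} + (2+\epsilon)\|\mathbf{E}\|_F, \end{aligned} \tag{58}$$

where we have used the fact that $\mathbf{I} - \mathbf{X}^\star(\mathbf{X}^\star)^\top$ is a projection matrix and $\sqrt{1+\epsilon} \le 1+\epsilon$ in the third inequality. The final step is to bound $\|\mathbf{E}\|_F$, where we rely on the following results.

Lemma 4 [42, Lemma 7] For any $\mathbf{A}, \mathbf{B} \in \mathbb{R}^{N \times K}$ with $N \ge K$ and $\mathbf{A}^\top\mathbf{A} = \mathbf{B}^\top\mathbf{B} = \mathbf{I}$, it holds that

$$\|\mathbf{A}\mathbf{A}^\top - \mathbf{B}\mathbf{B}^\top\|_F^2 \le 2K\|\mathbf{A}\mathbf{A}^\top - \mathbf{B}\mathbf{B}^\top\|_2^2 . \tag{59}$$

Proposition 1 Under Conditions 1 to 3 in Theorem 1, we have

$$\|\mathbf{V}_K\mathbf{V}_K^\top - \overline{\mathbf{V}}_K\overline{\mathbf{V}}_K^\top\|_2^2 = (1+\gamma^2)^{-1}\gamma^2, \tag{60}$$

where the columns of $\overline{\mathbf{V}}_K$ are the top $K$ eigenvectors of $\mathbf{C}_y$ and $\gamma$ is bounded as stated in (35).

Proposition 2 Under Condition 4 in Theorem 1, it holds that

$$\|\overline{\mathbf{V}}_K\overline{\mathbf{V}}_K^\top - \widehat{\mathbf{V}}_K\widehat{\mathbf{V}}_K^\top\|_2 \le \|\widehat{\mathbf{C}}_y - \mathbf{C}_y\|_2/\delta . \tag{61}$$

The proofs of the propositions can be found in Sections A.1 and A.2 of this appendix. Applying Lemma 4 we obtain that

$$\|\mathbf{E}\|_F \le \sqrt{2K}\,\|\mathbf{V}_K\mathbf{V}_K^\top - \widehat{\mathbf{V}}_K\widehat{\mathbf{V}}_K^\top\|_2 . \tag{62}$$

Combining (60), (61) and using the triangle inequality yields

$$\sqrt{F(\hat{\mathcal{C}}_1,\dots,\hat{\mathcal{C}}_K)} = \|(\mathbf{I} - \widehat{\mathbf{X}}\widehat{\mathbf{X}}^\top)\mathbf{V}_K\mathbf{V}_K^\top\|_F \le \sqrt{(1+\epsilon)F^\star} + (2+\epsilon)\sqrt{2K}\left(\sqrt{\frac{\gamma^2}{1+\gamma^2}} + \frac{\|\widehat{\mathbf{C}}_y - \mathbf{C}_y\|_2}{\delta}\right),$$

concluding the proof.

A.1 Proof of Proposition 1

We begin our proof by establishing the relationships between $\mathbf{V}_K$, $\overline{\mathbf{V}}_K$ and the left singular vectors of $\mathcal{H}(\mathbf{L})\mathbf{B}$. Denote the rank-$K$ approximation to $\mathcal{H}(\mathbf{L})$ as $[\mathcal{H}(\mathbf{L})]_K := \mathbf{V}_K\operatorname{diag}(\tilde{\mathbf{h}}_K)\mathbf{V}_K^\top$. This expression is valid due to the low-pass property of $\mathcal{H}(\mathbf{L})$. Defining $\widetilde{\mathbf{B}} := \mathbf{B}\mathbf{Q}_K$, we observe that

$$\mathcal{R}([\mathcal{H}(\mathbf{L})]_K) = \mathcal{R}([\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}}), \tag{63}$$

which is due to Condition 2 in Theorem 1, such that the linear transformation $\widetilde{\mathbf{B}}$ does not modify the range space of $[\mathcal{H}(\mathbf{L})]_K$. Similarly, $[\mathbf{C}_y]_K := \overline{\mathbf{V}}_K\operatorname{diag}(\boldsymbol{\sigma}_K)^2\overline{\mathbf{V}}_K^\top$ is the rank-$K$ approximation to $\mathbf{C}_y$. We observe the equivalences

$$\mathcal{R}([\mathbf{C}_y]_K) = \mathcal{R}([\mathcal{H}(\mathbf{L})\mathbf{B}]_K) = \mathcal{R}(\mathcal{H}(\mathbf{L})\widetilde{\mathbf{B}}), \tag{64}$$

where the last equality is due to $\mathcal{H}(\mathbf{L})\widetilde{\mathbf{B}} = \mathcal{H}(\mathbf{L})\mathbf{B}\mathbf{Q}_K = \overline{\mathbf{V}}_K\operatorname{diag}(\boldsymbol{\sigma}_K)$, as we recall that the columns of $\mathbf{Q}_K$ are the top $K$ right singular vectors of $\mathcal{H}(\mathbf{L})\mathbf{B}$. Furthermore,

$$\mathcal{R}([\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}}) \perp \mathcal{R}((\mathcal{H}(\mathbf{L}) - [\mathcal{H}(\mathbf{L})]_K)\widetilde{\mathbf{B}}). \tag{65}$$

Let the columns of $\widetilde{\mathbf{V}}_K$ and $\widetilde{\overline{\mathbf{V}}}_K$ be respectively the top-$K$ singular vectors of $[\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}}$ and $\mathcal{H}(\mathbf{L})\widetilde{\mathbf{B}}$; therefore (63) and (64) imply that $\mathbf{V}_K\mathbf{V}_K^\top = \widetilde{\mathbf{V}}_K\widetilde{\mathbf{V}}_K^\top$ and $\overline{\mathbf{V}}_K\overline{\mathbf{V}}_K^\top = \widetilde{\overline{\mathbf{V}}}_K\widetilde{\overline{\mathbf{V}}}_K^\top$. Invoking (65) with [54, Lemma 8] through setting $\mathbf{D} = \mathcal{H}(\mathbf{L})\widetilde{\mathbf{B}}$, $\mathbf{C} = [\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}}$ and $\mathbf{E} = (\mathcal{H}(\mathbf{L}) - [\mathcal{H}(\mathbf{L})]_K)\widetilde{\mathbf{B}}$ therein, and applying [55, Theorem 2.6.1], we obtain that

$$\|\mathbf{V}_K\mathbf{V}_K^\top - \overline{\mathbf{V}}_K\overline{\mathbf{V}}_K^\top\|_2^2 = \|\widetilde{\mathbf{V}}_K\widetilde{\mathbf{V}}_K^\top - \widetilde{\overline{\mathbf{V}}}_K\widetilde{\overline{\mathbf{V}}}_K^\top\|_2^2 = 1 - \beta_K\big([\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}}\,\boldsymbol{\Pi}^\dagger([\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}})^\top\big), \tag{66}$$

where we have defined $\boldsymbol{\Pi} := (\mathcal{H}(\mathbf{L})\widetilde{\mathbf{B}})^\top\mathcal{H}(\mathbf{L})\widetilde{\mathbf{B}}$ and $\beta_K(\cdot)$ denotes the $K$th largest eigenvalue. Under Condition 3 in Theorem 1, the $K \times K$ matrix $\boldsymbol{\Pi}$ is non-singular.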
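We observe the following chain of equalities

$$\begin{aligned} \beta_K\big([\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}}\,\boldsymbol{\Pi}^{-1}([\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}})^\top\big) &= \beta_K\big(\operatorname{diag}(\tilde{\mathbf{h}}_K)\mathbf{V}_K^\top\widetilde{\mathbf{B}}\,\boldsymbol{\Pi}^{-1}(\operatorname{diag}(\tilde{\mathbf{h}}_K)\mathbf{V}_K^\top\widetilde{\mathbf{B}})^\top\big) \\ &= \frac{1}{\beta_1\big((\operatorname{diag}(\tilde{\mathbf{h}}_K)\mathbf{V}_K^\top\widetilde{\mathbf{B}})^{-\top}\,\boldsymbol{\Pi}\,(\operatorname{diag}(\tilde{\mathbf{h}}_K)\mathbf{V}_K^\top\widetilde{\mathbf{B}})^{-1}\big)}, \end{aligned} \tag{67}$$

where the first equality is due to $\beta_K(\mathbf{U}\mathbf{A}\mathbf{U}^\top) = \beta_K(\mathbf{A})$ for any symmetric $\mathbf{A}$ and $\mathbf{U} \in \mathbb{R}^{N \times K}$ with orthonormal columns, and the second equality follows since the argument in $\beta_K(\cdot)$ is of rank $K$. Moreover, $\boldsymbol{\Pi}$ admits the decomposition

$$\boldsymbol{\Pi} = (\mathcal{H}(\mathbf{L})\widetilde{\mathbf{B}})^\top\mathcal{H}(\mathbf{L})\widetilde{\mathbf{B}} = \widetilde{\mathbf{B}}^\top\mathcal{H}(\mathbf{L})^\top\mathcal{H}(\mathbf{L})\widetilde{\mathbf{B}} = \widetilde{\mathbf{B}}^\top\mathbf{V}_K\operatorname{diag}(\tilde{\mathbf{h}}_K)^2\mathbf{V}_K^\top\widetilde{\mathbf{B}} + \widetilde{\mathbf{B}}^\top\mathbf{V}_{N-K}\operatorname{diag}(\tilde{\mathbf{h}}_{N-K})^2\mathbf{V}_{N-K}^\top\widetilde{\mathbf{B}}. \tag{68}$$

This yields

$$\begin{aligned} \beta_K\big([\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}}\,\boldsymbol{\Pi}^{-1}([\mathcal{H}(\mathbf{L})]_K\widetilde{\mathbf{B}})^\top\big) &= \Big(1 + \beta_1\big((\operatorname{diag}(\tilde{\mathbf{h}}_K)\mathbf{V}_K^\top\widetilde{\mathbf{B}})^{-\top}\widetilde{\mathbf{B}}^\top\mathbf{V}_{N-K}\operatorname{diag}(\tilde{\mathbf{h}}_{N-K})^2\mathbf{V}_{N-K}^\top\widetilde{\mathbf{B}}(\operatorname{diag}(\tilde{\mathbf{h}}_K)\mathbf{V}_K^\top\widetilde{\mathbf{B}})^{-1}\big)\Big)^{-1} \\ &= \frac{1}{1 + \|\operatorname{diag}(\tilde{\mathbf{h}}_{N-K})\mathbf{V}_{N-K}^\top\widetilde{\mathbf{B}}(\operatorname{diag}(\tilde{\mathbf{h}}_K)\mathbf{V}_K^\top\widetilde{\mathbf{B}})^{-1}\|_2^2} = (1+\gamma^2)^{-1}, \end{aligned}$$

where we have defined $\gamma$ such that

$$\gamma := \|\operatorname{diag}(\tilde{\mathbf{h}}_{N-K})\mathbf{V}_{N-K}^\top\widetilde{\mathbf{B}}(\operatorname{diag}(\tilde{\mathbf{h}}_K)\mathbf{V}_K^\top\widetilde{\mathbf{B}})^{-1}\|_2 \le \eta\,\|\mathbf{V}_{N-K}^\top\mathbf{B}\mathbf{Q}_K\|_2\,\|(\mathbf{V}_K^\top\mathbf{B}\mathbf{Q}_K)^{-1}\|_2 . \tag{69}$$

Substituting the above into (66) concludes the proof.

A.2 Proof of Proposition 2

Denote the SVD of the sampled covariance as $\widehat{\mathbf{C}}_y = \widehat{\mathbf{V}}\widehat{\boldsymbol{\Sigma}}\widehat{\mathbf{V}}^\top$. The left hand side of (61) can be written as

$$\|\overline{\mathbf{V}}_K\overline{\mathbf{V}}_K^\top - \widehat{\mathbf{V}}_K\widehat{\mathbf{V}}_K^\top\|_2 = \|\widehat{\mathbf{V}}_{N-K}^\top\overline{\mathbf{V}}_K\|_2, \tag{70}$$

where the equality is due to [55, Theorem 2.6.1]. Define $\boldsymbol{\Delta} := \widehat{\mathbf{C}}_y - \mathbf{C}_y$. Condition 4 in Theorem 1 implies that the largest eigenvalue in $\widehat{\boldsymbol{\Sigma}}_{N-K}$ will never exceed $\beta_K(\mathbf{C}_y) - \delta$ since

$$\beta_{\max}(\widehat{\boldsymbol{\Sigma}}_{N-K}) = \beta_{K+1}(\widehat{\mathbf{C}}_y) \le \beta_{K+1}(\mathbf{C}_y) + \beta_1(\boldsymbol{\Delta}) \le \beta_{K+1}(\mathbf{C}_y) + \|\boldsymbol{\Delta}\|_2, \tag{71}$$

where the first inequality is due to Weyl's inequality [55]. The perturbed matrix $\widehat{\mathbf{C}}_y$ thus satisfies the requirement of the Davis-Kahan $\sin(\Theta)$ theorem [56]:

$$\|\widehat{\mathbf{V}}_{N-K}^\top\overline{\mathbf{V}}_K\|_2 \le \delta^{-1}\|\widehat{\mathbf{V}}_{N-K}^\top\boldsymbol{\Delta}\overline{\mathbf{V}}_K\|_2 . \tag{72}$$

The inequality in (61) is obtained by observing that both $\overline{\mathbf{V}}_K$ and $\widehat{\mathbf{V}}_{N-K}$ have orthonormal columns.

B Proof of Lemma 2

Fix $1 \ge c > 0$. Under the conditions stated in the lemma, the least-squares optimization (37) admits a closed form solution

$$\mathbf{H}^\star - \mathcal{H}(\mathbf{L})\mathbf{B} = \Big(\sum_{\ell=1}^{L}\mathbf{w}_\ell(\mathbf{z}_\ell)^\top\Big)\Big(\sum_{\ell=1}^{L}\mathbf{z}_\ell(\mathbf{z}_\ell)^\top\Big)^{-1}, \tag{73}$$

where $\mathbf{w}_\ell$ was introduced in (11). Denoting the right hand side in (73) by $\mathbf{E}$, we have that

$$\|\mathbf{E}\|_2 = \Big\|\Big(\frac{1}{L}\sum_{\ell=1}^{L}\mathbf{w}_\ell(\mathbf{z}_\ell)^\top\Big)\Big(\frac{1}{L}\sum_{\ell=1}^{L}\mathbf{z}_\ell(\mathbf{z}_\ell)^\top\Big)^{-1}\Big\|_2 \le \Big\|\frac{1}{L}\sum_{\ell=1}^{L}\mathbf{w}_\ell(\mathbf{z}_\ell)^\top\Big\|_2\,\Big\|\Big(\frac{1}{L}\sum_{\ell=1}^{L}\mathbf{z}_\ell(\mathbf{z}_\ell)^\top\Big)^{-1}\Big\|_2 . \tag{74}$$

Observe that $\frac{1}{L}\sum_{\ell=1}^{L}\mathbf{z}_\ell(\mathbf{z}_\ell)^\top$ converges to $\mathbf{I}$ such that with probability at least $1 - c$,

$$\Big\|\frac{1}{L}\sum_{\ell=1}^{L}\mathbf{z}_\ell(\mathbf{z}_\ell)^\top - \mathbf{I}\Big\|_2 \le C_0\sqrt{\frac{R\log(1/c)}{L}}, \tag{75}$$

for some constant $C_0$. Applying [57, Proposition 2.1] we get that

$$\Big\|\Big(\frac{1}{L}\sum_{\ell=1}^{L}\mathbf{z}_\ell(\mathbf{z}_\ell)^\top\Big)^{-1}\Big\|_2 \le \Big(1 - \Big\|\frac{1}{L}\sum_{\ell=1}^{L}\mathbf{z}_\ell(\mathbf{z}_\ell)^\top - \mathbf{I}\Big\|_2\Big)^{-1} \le \Big(1 - C_0\sqrt{\frac{R\log(1/c)}{L}}\Big)^{-1}.$$

On the other hand, observe that $\mathbb{E}[\mathbf{w}_\ell(\mathbf{z}_\ell)^\top] = \mathbf{0}$ and $\|\mathbf{w}_\ell(\mathbf{z}_\ell)^\top\| \le C_w$ almost surely. Applying the matrix Bernstein inequality [58, Theorem 1.6] shows that with probability at least $1 - c$ and for sufficiently large $L$,

$$\Big\|\frac{1}{L}\sum_{\ell=1}^{L}\mathbf{w}_\ell(\mathbf{z}_\ell)^\top\Big\|_2 \le C_1\sqrt{\frac{\sigma_w^2\log((N+R)/c)}{L}}, \tag{76}$$

for some constant $C_1$. Finally, with probability at least $1 - 2c$,

$$\|\mathbf{E}\|_2 \le \frac{C_1\sqrt{\sigma_w^2\log((N+R)/c)}}{\sqrt{L} - C_0\sqrt{R\log(1/c)}} = O(\sigma_w/\sqrt{L}). \tag{77}$$

C Proof of Corollary 1

Let $\widetilde{\mathbf{V}}_K$ and $\widetilde{\mathbf{S}}_K$ be the top $K$ left singular vectors of $\widetilde{\mathcal{H}}(\mathbf{L})\mathbf{B}$ and $\widehat{\mathbf{S}}^\star$, respectively. We can repeat the proof for Theorem 1 up to (58) by re-defining the error matrix $\mathbf{E}$ therein as $\widetilde{\mathbf{E}} = \mathbf{V}_K\mathbf{V}_K^\top - \widetilde{\mathbf{S}}_K\widetilde{\mathbf{S}}_K^\top$. This entails

$$\sqrt{F(\tilde{\mathcal{C}}_1,\dots,\tilde{\mathcal{C}}_K)} - \sqrt{(1+\epsilon)F^\star} \le (2+\epsilon)\|\widetilde{\mathbf{E}}\|_F . \tag{78}$$

Next, we bound $\|\widetilde{\mathbf{E}}\|_F$.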
Applying Lemma 4 and using the triangle inequality, we get that

$$\|\widetilde{E}\|_F \le \sqrt{2K}\, \|V_K V_K^\top - \widetilde{S}_K \widetilde{S}_K^\top\|_2 \le \sqrt{2K}\, \Big(\|V_K V_K^\top - \widetilde{V}_K \widetilde{V}_K^\top\|_2 + \|\widetilde{V}_K \widetilde{V}_K^\top - \widetilde{S}_K \widetilde{S}_K^\top\|_2\Big).$$

Proposition 1 implies that

$$\|V_K V_K^\top - \widetilde{V}_K \widetilde{V}_K^\top\|_2 \le \sqrt{\tilde{\gamma}/(1+\tilde{\gamma})}, \tag{79}$$

where $\tilde{\gamma}$ is bounded as in (51). Our remaining task is to bound $\|\widetilde{V}_K \widetilde{V}_K^\top - \widetilde{S}_K \widetilde{S}_K^\top\|_2$. Observe that

$$\|\widetilde{V}_K \widetilde{V}_K^\top - \widetilde{S}_K \widetilde{S}_K^\top\|_2 = \|\widetilde{S}_{R-K}^\top \widetilde{V}_K\|_2 \tag{80}$$

and

$$\sigma_K(\widehat{S}^\star) \ge \sigma_K(\widetilde{H}(L)B) - \|\widetilde{\Delta}\|_2, \tag{81}$$

where we recalled the definition $\widetilde{\Delta} = \widehat{S}^\star - \widetilde{H}(L)B$ and applied Weyl's inequality [55]. From (49), we have that

$$\sigma_K(\widetilde{H}(L)B) - \|\widetilde{\Delta}\|_2 = \sigma_{K+1}(\widetilde{H}(L)B) + \tilde{\delta}, \tag{82}$$

with $\tilde{\delta} > 0$. Finally, applying the Wedin theorem [59] yields

$$\|\widetilde{S}_{R-K}^\top \widetilde{V}_K\|_2 \le (\tilde{\delta})^{-1} \|\widetilde{\Delta}\|_2. \tag{83}$$

References

[1] A. Sandryhaila and J. M. Moura, “Discrete signal processing on graphs,” IEEE Trans. Signal Process., vol. 61, no. 7, pp. 1644–1656, 2013.
[2] D. Shuman, S. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, May 2013.
[3] A. Ortega, P. Frossard, J. Kovačević, J. M. Moura, and P. Vandergheynst, “Graph signal processing,” arXiv preprint arXiv:1712.00468, 2017.
[4] A. G. Marques, S. Segarra, G. Leus, and A. Ribeiro, “Sampling of graph signals with successive local aggregations,” IEEE Trans. Signal Process., vol. 64, no. 7, pp. 1832–1843, April 2016.
[5] S. Chen, R. Varma, A. Sandryhaila, and J. Kovačević, “Discrete signal processing on graphs: Sampling theory,” IEEE Trans. Signal Process., vol. 63, no. 24, pp. 6510–6523, Dec 2015.
[6] D. Romero, M. Ma, and G. B. Giannakis, “Kernel-based reconstruction of graph signals,” IEEE Trans. Signal Process., vol. 65, no. 3, pp. 764–778, Feb 2017.
[7] S. Segarra, A. G. Marques, G. Leus, and A. Ribeiro, “Reconstruction of graph signals through percolation from seeding nodes,” IEEE Trans. Signal Process., vol. 64, no. 16, pp. 4363–4378, Aug 2016.
[8] S. Segarra, A. G. Marques, and A. Ribeiro, “Optimal graph-filter design and applications to distributed linear network operators,” IEEE Trans. Signal Process., vol. 65, no. 15, pp. 4117–4131, Aug 2017.
[9] E. Isufi, A. Loukas, A. Simonetto, and G. Leus, “Autoregressive moving average graph filtering,” IEEE Trans. Signal Process., vol. 65, no. 2, pp. 274–288, Jan 2017.
[10] J. Friedman, T. Hastie, and R. Tibshirani, “Sparse inverse covariance estimation with the graphical lasso,” Biostatistics, vol. 9, no. 3, pp. 432–441, 2008.
[11] E. Pavez and A. Ortega, “Generalized Laplacian precision matrix estimation for graph signal processing,” in IEEE Intl. Conf. Acoust., Speech and Signal Process. (ICASSP), Shanghai, China, Mar. 20-25, 2016.
[12] Y. Shen, B. Baingana, and G. B. Giannakis, “Kernel-based structural equation models for topology identification of directed networks,” IEEE Trans. Signal Process., vol. 65, no. 10, pp. 2503–2516, 2017.
[13] X. Cai, J. A. Bazerque, and G. B. Giannakis, “Sparse structural equation modeling for inference of gene regulatory networks exploiting genetic perturbations,” PLoS Computational Biology, Jun. 2013.
[14] H.-T. Wai, A. Scaglione, U. Harush, B. Barzel, and A. Leshem, “Rids: Robust identification of sparse gene regulatory networks from perturbation experiments,” arXiv preprint arXiv:1612.06565, 2016.
[15] X. Dong, D. Thanou, P. Frossard, and P. Vandergheynst, “Learning Laplacian matrix in smooth graph signal representations,” IEEE Trans. Signal Process., vol. 64, no. 23, pp. 6160–6173, Dec 2016.
[16] V. Kalofolias, “How to learn a graph from smooth signals,” in Intl. Conf. Artif. Intel. Stat. (AISTATS). J Mach. Learn. Res., 2016, pp. 920–929.
[17] S. P. Chepuri, S. Liu, G. Leus, and A. O. Hero, “Learning sparse graphs under smoothness prior,” in IEEE Intl. Conf. Acoust., Speech and Signal Process. (ICASSP), March 2017, pp. 6508–6512.
[18] S. Segarra, A. G. Marques, G. Mateos, and A. Ribeiro, “Network topology inference from spectral templates,” IEEE Trans. Signal and Info. Process. over Networks, vol. 3, no. 3, pp. 467–483, 2017.
[19] S. Segarra, G. Mateos, A. G. Marques, and A. Ribeiro, “Blind identification of graph filters,” IEEE Trans. Signal Process., vol. 65, no. 5, pp. 1146–1159, March 2017.
[20] R. Shafipour, S. Segarra, A. G. Marques, and G. Mateos, “Network topology inference from non-stationary graph signals,” in Proc. ICASSP, March 2017, pp. 5870–5874.
[21] H. E. Egilmez, E. Pavez, and A. Ortega, “Graph learning from filtered signals: Graph system and diffusion kernel identification,” IEEE Transactions on Signal and Information Processing over Networks, 2018.
[22] H.-T. Wai, A. Scaglione, and A. Leshem, “Active sensing of social networks,” IEEE Trans. Signal and Info. Process. over Networks, vol. 2, no. 3, pp. 406–419, 2016.
[23] D. Marbach, J. C. Costello, R. Küffner, N. M. Vega, R. J. Prill, D. M. Camacho, K. R. Allison, A. Aderhold, R. Bonneau, Y. Chen et al., “Wisdom of crowds for robust gene network inference,” Nature Methods, vol. 9, no. 8, p. 796, 2012.
[24] S. Fortunato, “Community detection in graphs,” Physics Reports, vol. 486, no. 3, pp. 75–174, 2010.
[25] O. Candogan, K. Bimpikis, and A. Ozdaglar, “Optimal pricing in networks with externalities,” Operations Research, vol. 60, no. 4, pp. 883–905, 2012.
[26] B. Ata, A. Belloni, and O. Candogan, “Latent agents in networks: Estimation and pricing,” arXiv:1808.04878v1, August 2018.
[27] M. H. DeGroot, “Reaching a consensus,” Journal of the American Statistical Association, vol. 69, no. 345, pp. 118–121, 1974.
[28] H. Zha, X. He, C. Ding, M. Gu, and H. D. Simon, “Spectral relaxation for k-means clustering,” in Advances in Neural Information Processing Systems, 2002, pp. 1057–1064.
[29] E. Abbe, “Community detection and stochastic block models: recent developments,” The Journal of Machine Learning Research, vol. 18, no. 1, pp. 6446–6531, 2017.
[30] U. von Luxburg, “A tutorial on spectral clustering,” Statistics and Computing, vol. 17, no. 4, pp. 395–416, Dec 2007.
[31] J. A. Hartigan and M. A. Wong, “Algorithm AS 136: A k-means clustering algorithm,” Journal of the Royal Statistical Society. Series C (Applied Statistics), vol. 28, no. 1, pp. 100–108, 1979.
[32] A. Kumar, Y. Sabharwal, and S. Sen, “A simple linear time (1+ε)-approximation algorithm for k-means clustering in any dimensions,” in FOCS. IEEE, 2004, pp. 454–462.
[33] S. X. Wu, H.-T. Wai, and A. Scaglione, “Estimating social opinion dynamics models from voting records,” IEEE Transactions on Signal Processing, vol. 66, no. 16, pp. 4193–4206, 2018.
[34] D. Acemoglu, A. Ozdaglar, and A. ParandehGheibi, “Spread of (mis) information in social networks,” Games and Economic Behavior, vol. 70, no. 2, pp. 194–227, 2010.
[35] E. Yildiz, A. Ozdaglar, D. Acemoglu, A. Saberi, and A. Scaglione, “Binary opinion dynamics with stubborn agents,” ACM Transactions on Economics and Computation (TEAC), vol. 1, no. 4, p. 19, 2013.
[36] P. Jia, A. MirTabatabaei, N. E. Friedkin, and F. Bullo, “Opinion dynamics and the evolution of social power in influence networks,” SIAM Review, vol. 57, no. 3, pp. 367–397, 2015.
[37] M. E. Yildiz and A. Scaglione, “Computing along routes via gossiping,” IEEE Trans. on Signal Process., vol. 58, no. 6, pp. 3313–3327, 2010.
[38] A. Ben-Tal and A. Nemirovski, Lectures on modern convex optimization: analysis, algorithms, and engineering applications. SIAM, 2001, vol. 2.
[39] R. Shafipour, A. Hashemi, G. Mateos, and H. Vikalo, “Online topology inference from streaming stationary graph signals,” submitted to DSW 2019, 2019.
[40] J. Tsitsiklis, “Problems in decentralized decision making and computation,” Ph.D. dissertation, Dept. of Electrical Engineering and Computer Science, M.I.T., Boston, MA, 1984.
[41] D. P. Bertsekas, Nonlinear programming. Athena Scientific Belmont, 1999.
[42] C. Boutsidis, P. Kambadur, and A. Gittens, “Spectral clustering via the power method-provably,” in International Conference on Machine Learning, 2015, pp. 40–48.
[43] N. Tremblay, G. Puy, R. Gribonval, and P. Vandergheynst, “Compressive spectral clustering,” in International Conference on Machine Learning, 2016, pp. 1002–1011.
[44] R. Vershynin, High-Dimensional Probability. Cambridge University Press, 2017.
[45] A. Agarwal, S. Negahban, and M. J. Wainwright, “Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions,” The Annals of Statistics, pp. 1171–1197, 2012.
[46] J. Huang, T. Zhang, and D. Metaxas, “Learning with structured sparsity,” JMLR, vol. 12, no. Nov, pp. 3371–3412, 2011.
[47] S. N. Negahban, P. Ravikumar, M. J. Wainwright, and B. Yu, “A unified framework for high-dimensional analysis of m-estimators with decomposable regularizers,” Statistical Science, pp. 538–557, 2012.
[48] P. W. Holland, K. B. Laskey, and S. Leinhardt, “Stochastic blockmodels: First steps,” Social Networks, vol. 5, no. 2, pp. 109–137, 1983.
[49] F. D. Malliaros and M. Vazirgiannis, “Clustering and community detection in directed networks: A survey,” Physics Reports, vol. 533, no. 4, pp. 95–142, 2013.
[50] J. S. Coleman et al., “Introduction to mathematical sociology.” Introduction to mathematical sociology., 1964.
[51] A. L. Traud, P. J. Mucha, and M. A. Porter, “Social structure of Facebook networks,” Physica A: Statistical Mechanics and its Applications, vol. 391, no. 16, pp. 4165–4180, 2012.
[52] A. Aravkin, S. Becker, V. Cevher, and P. Olsen, “A variational approach to stable principal component pursuit,” in UAI, July 2014.
[53] P. R. Center, “Political ideology by state.” [Online]. Available: http://www.pewforum.org/religious-landscape-study/compare/political-ideology/by/state/
[54] A. Gittens, P. Kambadur, and C. Boutsidis, “Approximate spectral clustering via randomized sketching,” 2013.
[55] G. H. Golub and C. F. Van Loan, Matrix computations. JHU Press, 2012, vol. 3.
[56] C. Davis and W. M. Kahan, “The rotation of eigenvectors by a perturbation. III,” SIAM Journal on Numerical Analysis, vol. 7, no. 1, pp. 1–46, 1970.
[57] R. Vershynin, “How close is the sample covariance matrix to the actual covariance matrix?” Journal of Theoretical Probability, vol. 25, no. 3, pp. 655–686, 2012.
[58] J. A. Tropp, “User-friendly tail bounds for sums of random matrices,” Foundations of Computational Mathematics, vol. 12, no. 4, pp. 389–434, 2012.
[59] P.-A. Wedin, “Perturbation bounds in connection with singular value decomposition,” BIT Numerical Mathematics, vol. 12, no. 1, pp. 99–111, 1972.

(a) Highschool network. The pricing experiments modify prices for the agents marked with a square. The above clustering on the network is computed from the true Laplacian matrix $L$, which has a RatioCut of 3.618.
(b) BlindCD, RatioCut = 6.769.
(c) Boosted BlindCD, RatioCut = 4.701.

Figure 6: Pricing experiments on the Highschool network. The network comprises $N = 70$ agents and (approximately) $K = 3$ communities. The utility parameters in (14) are $a = R$ and $b = 2\|A\mathbf{1}\|_\infty$. The network's equilibrium consumption levels are collected for $L = 10^3$ samples, each observed with a noise variance of $\sigma_w^2 = 10^{-2}/b^4$. The pricing experiments are conducted through controlling the prices for $R = 18$ agents.

(a) ReedCollege network. The clustering above is found by applying spectral clustering on the true $L$. The obtained RatioCut is 0.1419 w.r.t. $A$.
(b) Applying the BlindCD method. The obtained RatioCut is 0.8953 w.r.t. $A$.
(c) Applying the Boosted BlindCD method. The obtained RatioCut is 0.5249 w.r.t. $A$.

Figure 7: Opinion dynamics experiments on the ReedCollege network. The network comprises $N = 962$ agents and (approximately) $K = 2$ communities. The network's steady-state opinions are collected with $L = 10^4$ samples, each observed with a noise variance of $\sigma_w^2 = 10^{-2}$. There are $R = 150$ stubborn agents in the experiments, which are connected to the social network through a random bipartite graph with connectivity $\log N / N$.

Figure 8: Applying BlindCD methods on the 110th US Senate Rollcall records. The states marked in red / blue are found to be in different communities, while the states marked in gray are marked as the ‘stubborn’ states as explained in the text. (Left) Results of BlindCD. (Right) Results of boosted BlindCD.
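|    | CT | ME | NH | RI | VT | DE | NJ | PA | IL | IN | MI | OH | WI | IA | KS | MN |
|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| MA | 0  | 0  | 0  | 9  | 4  | 0  | 0  | 8  | 0  | 0  | 0  | 5  | 0  | 0  | 0  | 0  |
| NY | 11 | 3  | 0  | 5  | 4  | 0  | 14 | 0  | 29 | 9  | 6  | 0  | 0  | 0  | 0  | 0  |
| AL | 0  | 0  | 31 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 5  | 5  | 0  |
| LA | 0  | 0  | 0  | -2 | 0  | 0  | 0  | 0  | 0  | 2  | 13 | 0  | 0  | 0  | 0  | 0  |

|    | TN | WV | AZ | CO | ID | MT | NV | NM | UT | WY | CA | OR | WA | AK | HI |
|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| MA | 2  | 8  | 0  | 4  | 6  | 0  | 0  | 0  | 2  | 0  | 0  | 6  | 0  | 0  | 0  |
| NY | 0  | 0  | 0  | 0  | -9 | 0  | 0  | 0  | 0  | 0  | 10 | 0  | 0  | 0  | 0  |
| AL | 19 | 0  | 7  | 0  | 19 | 0  | 18 | 0  | 28 | 34 | 0  | 0  | 0  | 9  | 0  |
| LA | 0  | 5  | 0  | 0  | 0  | 28 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |

|    | OK | MO | NE | ND | SD | VA | AR | FL | GA | MS | NC | SC | TX | KY | MD |
|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| MA | -7 | 6  | 0  | 0  | 1  | 0  | 1  | 0  | 0  | -2 | 0  | 0  | 0  | 0  | 0  |
| NY | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 8  | 0  | 0  | 0  | 0  | 0  | 0  | 8  |
| AL | 15 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 18 | 16 | 11 | 26 | 4  | 12 | 0  |
| LA | 0  | 19 | 11 | 20 | 2  | 0  | 34 | 5  | 0  | 5  | 0  | 0  | 5  | 0  | 0  |

Figure 9: Illustrating the $\widehat{B}^\star$ matrix found with the boosted BlindCD method [cf. (20), (41)]. Note that the values within the table have been rescaled.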
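As a numerical companion to the proof of Proposition 2 in Appendix A.2, the subspace identity (70) and the Davis-Kahan bound (72) can be checked on synthetic data. The Python sketch below is an illustration only and is not part of the reported experiments: the synthetic covariance, the perturbation scale, and the helper names `sorted_eig` and `spec_norm` are assumptions, and the gap `delta` is taken as $\beta_K(C_y) - \beta_{K+1}(C_y) - \|\Delta\|_2$, which is one reading of Condition 5 that is consistent with (71).

```python
import numpy as np

def sorted_eig(C):
    """Eigen-decomposition of a symmetric matrix, eigenvalues in descending order."""
    vals, vecs = np.linalg.eigh(C)          # eigh returns ascending eigenvalues
    return vals[::-1], vecs[:, ::-1]

def spec_norm(M):
    """Spectral norm (largest singular value) of a matrix."""
    return np.linalg.norm(M, 2)

rng = np.random.default_rng(0)
N, K = 30, 3

# Synthetic stand-in for C_y (assumption): K dominant eigenvalues, the rest in [0, 1].
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
eigs = np.concatenate([[10.0, 8.0, 6.0], rng.uniform(0.0, 1.0, N - K)])
C = Q @ np.diag(eigs) @ Q.T

# Small symmetric perturbation playing the role of Delta = C_hat - C_y.
G = rng.standard_normal((N, N))
Delta = 0.05 * (G + G.T) / 2
C_hat = C + Delta

beta, V = sorted_eig(C)
_, V_hat = sorted_eig(C_hat)
V_bar_K = V[:, :K]                                 # top-K eigenvectors of C_y
V_hat_K, V_hat_rest = V_hat[:, :K], V_hat[:, K:]   # top-K and remaining N-K of C_hat

# Both sides of (70): projection distance versus sin-theta form.
proj_dist = spec_norm(V_bar_K @ V_bar_K.T - V_hat_K @ V_hat_K.T)
sin_theta = spec_norm(V_hat_rest.T @ V_bar_K)

# Assumed gap: beta_K(C_y) - beta_{K+1}(C_y) - ||Delta||_2, required to be positive.
delta = beta[K - 1] - beta[K] - spec_norm(Delta)

# Right-hand side of (72) and the cruder bound ||Delta||_2 / delta used in (61).
dk_bound = spec_norm(V_hat_rest.T @ Delta @ V_bar_K) / delta
crude_bound = spec_norm(Delta) / delta

print(f"projection distance  : {proj_dist:.6f}")
print(f"sin-theta form       : {sin_theta:.6f}")
print(f"Davis-Kahan bound    : {dk_bound:.6f}")
print(f"||Delta||_2 / delta  : {crude_bound:.6f}")
```

Increasing the perturbation scale until `delta` becomes non-positive shows where the argument stops applying, since the separation required by the $\sin(\Theta)$ theorem is then no longer guaranteed.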
