Including Node Textual Metadata in Laplacian-constrained Gaussian Graphical Models


Authors: Jianhua Wang, Killian Cressant, Pedro Braconnot Velloso, Arnaud Breloy

Jianhua Wang*†, Killian Cressant†, Pedro Braconnot Velloso†, Arnaud Breloy†
*Université Paris-Nanterre, LEME, IUT Ville d'Avray, Ville d'Avray, France
†Conservatoire National des Arts et Métiers, CNAM, Paris, France

Abstract—This paper addresses graph learning in Gaussian Graphical Models (GGMs). In this context, data matrices often come with auxiliary metadata (e.g., textual descriptions associated with each node) that is usually ignored in traditional graph estimation processes. To fill this gap, we propose a graph learning approach based on Laplacian-constrained GGMs that jointly leverages the node signals and such metadata. The resulting formulation yields an optimization problem, for which we develop an efficient majorization-minimization (MM) algorithm with closed-form updates at each iteration. Experimental results on a real-world financial dataset demonstrate that the proposed method significantly improves graph clustering performance compared to state-of-the-art approaches that use either signals or metadata alone, thus illustrating the interest of fusing both sources of information.

Index Terms—Graph signal processing, graph learning, textual metadata, majorization-minimization

I. INTRODUCTION

Graphs are fundamental mathematical structures in both theoretical and applied sciences, providing a natural framework to model entities (nodes) and their relationships (edges). This representation enables the development of a wide range of methodologies for classical tasks such as signal filtering, anomaly detection, clustering, and classification. Representative examples include graph clustering methods [1], [2], graph signal processing [3]–[5], and graph neural networks [6]. Most of the aforementioned tools, however, are built upon the assumption that the underlying graph topology is known.
In many practical scenarios, this assumption does not hold, and the graph structure must instead be inferred from data, giving rise to the problem of graph learning. A prominent line of research in statistical learning relates the graph topology to the conditional dependency structure among variables associated with the nodes. Within this framework, Gaussian graphical models (GGMs), also referred to as Gaussian Markov random fields (GMRFs), assume that the data is sampled from a multivariate Gaussian distribution and allow us to estimate the graph from the support of the precision matrix (the inverse of the covariance matrix) [7]. Furthermore, Laplacian-constrained GGMs impose that the estimated precision matrix be a Laplacian matrix [8]. This framework bridges statistics (Gaussian models) with the field of graph signal processing [9]: the resulting learned graphs are interpreted as those that favor smooth signal representations [10], [11]. Learned eigenvectors of the precision matrix are also linked to a graph Fourier basis [4], [12].

This work was supported by the MASSILIA project (ANR21-CE23-0038-01) of the French National Research Agency (ANR).

In many real-world applications, additional side information describing node attributes is often available. For example, textual metadata such as variable descriptions can be appended to the data matrix [13]–[16]. Classical graph learning methods usually discard such auxiliary information, which can result in degraded graph estimates and limit their effectiveness in downstream tasks. To address this limitation:

• We investigate the problem of learning a Laplacian matrix within the GMRF framework [8], [14], while explicitly incorporating side information in the objective function.
• We develop an optimization algorithm based on the majorization-minimization (MM) principle [17], leading to a computationally efficient algorithm with closed-form solutions at each iteration step.
• We illustrate the interest of our method on a real-world financial dataset. Notably, it improves clustering performance compared to state-of-the-art approaches that use either the graph signals or the metadata only.

II. BACKGROUND

We consider an undirected, weighted graph represented by the triplet G = (V, E, W), where V = {1, 2, ..., p} denotes the set of vertices (nodes), and E ⊆ {{u, v} : u, v ∈ V, u ≠ v} is the edge set, i.e., a subset of all possible unordered pairs of nodes. The matrix W ∈ R_+^{p×p} denotes the symmetric weighted adjacency matrix satisfying W_ii = 0, W_ij > 0 if {i, j} ∈ E, and W_ij = 0 otherwise. The combinatorial graph Laplacian matrix L is defined as L ≜ D − W, where D ≜ diag(W1) is the degree matrix.

In this work, we restrict our attention to estimating a combinatorial Laplacian matrix, where the set of Laplacian matrices associated with connected graphs can be defined as

S_L = { L ∈ S_+^p | L_ij = L_ji ≤ 0, ∀ i ≠ j, L1 = 0, rank(L) = p − 1 }.   (1)

The objective of sparse graph learning under the Laplacian-constrained GMRF is to estimate the precision matrix of a Gaussian model x ∼ N(0, Σ) under the constraint that Σ^† = L ∈ S_L (with † denoting the pseudo-inverse, since L is rank-deficient). To define a parameterization that respects this constraint, we adopt the linear operator L(·) from Definition 3.2 of [14], which maps a vector w ∈ R^m (with m = p(p − 1)/2) to a matrix L(w) ∈ R^{p×p}. Concretely, this operator simply maps the vectorization of the upper-triangular elements of the adjacency matrix W to the corresponding Laplacian matrix. The Laplacian set S_L, defined in (1), can thus be equivalently expressed as

S_L = { L(w) ∈ S_+^p | w ∈ (R_+)^m },   (2)

where the element-wise constraint w ≥ 0 enforces the non-negativity of all edge weights. In the following, L_w will be used in place of L(w) for notational simplicity.
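The operator L(·) above can be sketched in a few lines of code. The edge ordering assumed here (upper-triangular pairs (i, j), i < j, in row-major order) is a simplification for illustration; the exact ordering in Definition 3.2 of [14] may differ, but the resulting set of Laplacians is the same.

```python
import numpy as np

def laplacian_op(w, p):
    """Map a weight vector w in R^{p(p-1)/2} to the Laplacian L(w) = D - W.

    Assumed edge ordering: upper-triangular pairs (i, j), i < j, row-major
    (an illustrative choice; the paper's operator may order edges differently).
    """
    W = np.zeros((p, p))
    W[np.triu_indices(p, k=1)] = w   # fill the upper triangle with edge weights
    W = W + W.T                      # symmetric adjacency matrix, W_ii = 0
    return np.diag(W.sum(axis=1)) - W  # L = D - W

# Any w >= 0 yields a valid member of the set S_L in (1)-(2):
L = laplacian_op(np.ones(3), p=3)    # complete graph on 3 nodes
```

For w = 1 and p = 3, the resulting L has zero row sums, non-positive off-diagonal entries, and rank p − 1 = 2, matching the constraints in (1).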
In summary, instead of directly optimizing the Laplacian matrix L, we will optimize the vector w.

III. PROBLEM FORMULATION

The data matrix is denoted by X = [x_1, x_2, ..., x_n] ∈ R^{p×n}, with x_i ∈ R^p being one observation of a graph signal on all p nodes. The objective is to learn the non-negative graph weight vector w from the signals X. In many practical scenarios, additional side information associated with the graph nodes is also available. We denote the embedding of this side information for node i by a vector y_i ∈ R^d, where d may vary depending on the embedding method. A representative example, used in this paper's experiments, is the S&P 500 stock dataset. In that case, nodes represent companies, graph signals are their stock market values, and node metadata consists of textual descriptions of each stock (e.g., the company's principal activities extracted from the official website). These textual metadata can then be transformed into embedding vectors y_i using representation learning techniques such as Word2vec [18].

a) Graph learning from smooth signals: It is commonly assumed that the observed signals are smooth over the underlying graph. In the Laplacian-constrained GMRF, the data is assumed to be zero-mean Gaussian, i.e., x ∼ N(0, Σ), in which Σ^† = L is the precision matrix. Hence the precision matrix is directly identified with the graph Laplacian. Sparse graph learning under the Laplacian-constrained GGM can be formulated as a penalized maximum likelihood estimation problem. Following [14], the objective function to be minimized is given by

f_X^λ(w) = tr(S L_w) − log det(L_w + J) + Σ_i h_λ(w_i),   (3)

where S = (1/n) Σ_{t=1}^n x_t x_t^⊤ denotes the sample covariance matrix and h_λ(·) is a sparsity-promoting penalty with regularization parameter λ ≥ 0. As advocated in [14], we adopt the non-convex SCAD penalty [19].
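To make (3) concrete, the sketch below evaluates the signal-driven objective for a given weight vector. Two simplifying assumptions are made for illustration: an ℓ1 penalty λΣ|w_i| stands in for the SCAD penalty, and J = (1/p)11^⊤ as in [8].

```python
import numpy as np

def f_signal(w, S, lam=0.0):
    """Evaluate eq. (3): tr(S L_w) - log det(L_w + J) + penalty.

    An l1 penalty (lam * sum |w_i|) stands in for the paper's SCAD
    penalty -- a simplification for illustration only.
    """
    p = S.shape[0]
    W = np.zeros((p, p))
    W[np.triu_indices(p, k=1)] = w
    W = W + W.T
    L = np.diag(W.sum(axis=1)) - W          # L_w = D - W
    J = np.ones((p, p)) / p                 # J = (1/p) 11^T keeps log det finite
    _, logdet = np.linalg.slogdet(L + J)    # stable log-determinant
    return np.trace(S @ L) - logdet + lam * np.abs(w).sum()
```

For p = 3, w = 1, and S = I, the matrix L_w + J has eigenvalues (1, 3, 3), so without the penalty the objective equals 6 − log 9.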
The matrix J = (1/p) 11^⊤ ensures that the log-det term is well defined [8].

b) Gaussian kernel graph from side information: In contrast to the observed signals, the side-information vectors y_i are generally not smooth over the graph. For instance, textual descriptions transformed into embeddings do not necessarily satisfy a smoothness assumption. However, a spatial graph is often of relevance for such embeddings. It can be obtained from Gaussian kernel similarities, leading to the adjacency matrix

W_{i,j} = exp(−Z_{i,j} / σ²),   (4)

with Z_{i,j} = ||y_i − y_j||₂² encoding semantic distances between node features, and where σ² adjusts the width of the spatial neighborhood. In fact, the Gaussian kernel similarity in (4) can be interpreted as the closed-form solution of a more general graph weight optimization problem proposed in [11]. The corresponding objective function is written as

f_Y^σ(w) = w^⊤ z + σ² Σ_i w_i (log(w_i) − 1),   (5)

where w is constrained to have positive entries, and where z ∈ R_+^m collects the upper-triangular elements of Z.

c) Learning from smooth signals with side information: To interpolate between the two aforementioned approaches, we propose to jointly integrate signal observations and side information within a unified optimization framework. The resulting objective function of the graph learning problem is formulated as

f(w) = α f_X^λ(w) + (1 − α) f_Y^σ(w),   (6)

where the hyperparameter α ∈ [0, 1] controls the relative confidence assigned to the signal-driven and side-information-driven terms. For analytical convenience, we rewrite the objective function as

f(w) = α f_1(w) + (1 − α) f_2(w) + α f_3(w),   (7)

in which we define

f_1(w) ≜ −log det(L_w + J) + tr(S L_w),
f_2(w) ≜ w^⊤ z + σ² Σ_i w_i (log(w_i) − 1),
f_3(w) ≜ Σ_i h_λ(w_i).   (8)

IV. PROPOSED ALGORITHM
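The kernel weights of (4) can be computed directly from the embedding matrix; a minimal sketch, assuming the embeddings y_i are stacked row-wise in a matrix Y:

```python
import numpy as np

def kernel_adjacency(Y, sigma2=1.0):
    """Gaussian kernel adjacency of eq. (4): W_ij = exp(-||y_i - y_j||^2 / sigma^2)."""
    sq = np.sum(Y ** 2, axis=1)
    Z = sq[:, None] + sq[None, :] - 2.0 * (Y @ Y.T)  # Z_ij = ||y_i - y_j||_2^2
    Z = np.maximum(Z, 0.0)               # guard tiny negatives from round-off
    W = np.exp(-Z / sigma2)
    np.fill_diagonal(W, 0.0)             # W_ii = 0: no self-loops
    return W
```

The upper-triangular elements of Z are exactly the vector z entering (5), and σ² widens or narrows the effective neighborhood.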
To solve the problem, we adopt the majorization-minimization (MM) framework [17], which iteratively minimizes a sequence of surrogate functions. Each MM iteration consists of two steps. In the majorization step, a surrogate function f(w | w^(k)) is constructed to locally upper-bound the original objective F(w) at the current iterate w^(k), such that

f(w | w^(k)) ≥ F(w),   f(w^(k) | w^(k)) = F(w^(k)).   (9)

In the minimization step, the surrogate function f(w | w^(k)) is minimized to produce the next iterate w^(k+1):

w^(k+1) = arg min_w f(w | w^(k)).   (10)

From (9) and (10), it immediately follows that

F(w^(k+1)) ≤ f(w^(k+1) | w^(k)) ≤ f(w^(k) | w^(k)) = F(w^(k)),   (11)

which implies that the sequence {F(w^(k))}_{k≥0} generated by the MM algorithm is non-increasing. In the following, we derive an MM algorithm to minimize the objective in (7). The key step is to construct a tight surrogate function for each component of the objective. To this end, we first establish the following lemmas.

Lemma 1. The function f_1(w) in (8) admits the following upper bound at a given point w_0:

f_1(w) ≤ tr(R diag(w)) + tr(Q diag(w̃)^{-1}),   (12)

with equality for w = w_0, where

Q = diag(w̃_0) G^⊤ (G diag(w̃_0) G^⊤)^{-1} G diag(w̃_0),   (13)

and R = E^⊤ S E, with w̃_0 ≜ [w_0^⊤, 1/p]^⊤, w̃ = [w^⊤, 1/p]^⊤ ∈ R^{m+1}, and G = [E, 1] ∈ R^{p×(m+1)}. The matrix E = [ξ_1, ..., ξ_m] ∈ R^{p×m} consists of column vectors ξ_k = e_{i,k} − e_{j,k} corresponding to the edge (i, j) with i > j, where e_{i,k} is the i-th canonical basis vector. The index k maps the edge to the column of E via k = i − j + ((j − 1)/2)(2p − j).

Proof. The proof of this lemma comes from [20], and is reported here to introduce some required matrix notations.
The log-determinant function is concave over the cone of positive definite matrices and therefore satisfies the following first-order upper-bound inequality:

log det(X) ≤ log det(X_0) + tr(X_0^{-1}(X − X_0)),   (14)

for any X ≻ 0 and any reference point X_0 ≻ 0 [21]. Applying this inequality yields a majorizer for the negative log-determinant term:

−log det(L_w + J) = log det((L_w + J)^{-1}) ≤ tr(F_0 (L_w + J)^{-1}) + const.,   (15)

where F_0 = L_{w_0} + J. Using the Laplacian factorization [22],

L_w + J = E diag(w) E^⊤ + J = G diag(w̃) G^⊤,   (16)

and the cyclic property of the trace, we obtain

tr(S L_w) = tr(S E diag(w) E^⊤) = tr(R diag(w)).   (17)

Applying Example 19 in [17] for w̃ ≥ 0 yields

(G diag(w̃) G^⊤)^{-1} ⪯ G_0^{-1} G diag(w̃_0) diag(w̃)^{-1} diag(w̃_0) G^⊤ G_0^{-1},   (18)

where G_0 ≜ G diag(w̃_0) G^⊤. Then, applying the cyclic property of the trace, we have

tr(G_0 (G diag(w̃) G^⊤)^{-1}) ≤ tr(Q diag(w̃)^{-1}),   (19)

with Q as defined in (13). Combining these results completes the proof, where the majorizing function for f_1(w) is

F_1(w | w̃_0) ≜ tr(R diag(w)) + tr(Q diag(w̃)^{-1}).   (20)

Lemma 2. The function f_2(w) in (8) admits the following upper bound at a given point w_0:

f_2(w) ≤ w^⊤ z + σ² Σ_{i=1}^m [ (1/w_{i0}) w_i² + (log w_{i0} − 2) w_i ],   (21)

with equality for w = w_0, where w_{i0} denotes the i-th element of w_0.

Proof. The logarithm function is concave, so for any x, w_0 > 0 we have log x ≤ log w_0 + (1/w_0)(x − w_0).
Multiplying both sides by x > 0 gives

x log x ≤ x log w_0 + x(x − w_0)/w_0 = (1/w_0) x² + (log w_0 − 1) x.

Thus, we have

w_i log(w_i) ≤ (1/w_{i0}) w_i² + (log w_{i0} − 1) w_i,

which directly gives the majorizing function for f_2(w):

F_2(w | w̃_0) ≜ w^⊤ z + σ² Σ_{i=1}^m [ (1/w_{i0}) w_i² + (log w_{i0} − 2) w_i ].   (22)

Lastly, the penalty term f_3(w) is majorized thanks to the concavity of the SCAD penalty h_λ: it allows us to construct a linear majorizer via a first-order Taylor approximation,

f_3(w) ≤ F_3(w | w̃_0) ≜ Σ_i h'_λ(w_{i0}) w_i + const.,   (23)

with equality for w = w_0. Combining Lemma 1, Lemma 2, and the term (23) leads to the following surrogate objective function:

F(w | w^(k)) = α F_1(w | w̃_0) + (1 − α) F_2(w | w̃_0) + α F_3(w | w̃_0).   (24)

Minimizing F(w | w^(k)) yields the update w^(k+1) by solving decoupled minimization problems over each component of w. Specifically, the stationary condition for the i-th component is given by

0 = ∂F(w | w^(k)) / ∂w_i = α (R_i − Q_i / w_i² + h'_λ(w_{i0})) + (1 − α) (z_i + σ² (2 w_i / w_{i0} + log w_{i0} − 2)),   (25)

where R_i = [diag(R)]_i and Q_i = [diag(Q_m)]_i, with Q_m denoting the leading m × m principal submatrix of Q, i.e., Q_m = Q_{1:m,1:m}. Then, defining

a_i ≜ (1 − α) (2σ² / w_{i0}),   (26)
C_i ≜ α (R_i + h'_λ(w_{i0})) + (1 − α) (z_i + σ² (log w_{i0} − 2)),   (27)

leads to the cubic equation

a_i w_i³ + C_i w_i² − α Q_i = 0.   (28)

Equation (28) is solved by computing the eigenvalues of the companion matrix [23] and retaining the positive real root. The complete MM procedure is summarized in Algorithm 1.

Algorithm 1: MM algorithm for problem (24)
Input: R, S, initial vector w^(0) = 1, ϵ, α, σ², λ
Init: k ← 0
1  while k ≤ maxiter do
     // Majorization step at w^(k)
2    Update Q using (13) evaluated at w^(k);
     // Minimization step
3    for each component i = 1, ..., m do
4      Compute a_i and C_i using (26) and (27);
5      Solve the cubic equation (28) and set w_i^(k+1) to its positive real root;
6    end
7    Update w^(k+1) = [w_1^(k+1), ..., w_m^(k+1)]^⊤;
8    if ||w^(k+1) − w^(k)|| ≤ ϵ then
9      Stop
10   k ← k + 1;
11 end

V. EXPERIMENTAL RESULTS

In this section, we conduct experiments on real-world financial data to evaluate the performance of our algorithm. The dataset consists of p = 30 stocks selected from the S&P 500 index over the period from January 2018 to July 2018. This time span yields n = 200 daily observations for each stock. The data are represented by a matrix of log-returns, defined entrywise as X_{i,j} = log P_{i,j} − log P_{i,j−1}, where P_{i,j} denotes the closing price of the i-th stock on day j. The p = 30 stocks are grouped into 3 sectors: Utilities, Materials, and Health Care. The ground-truth sector labels are obtained according to the Global Industry Classification Standard (GICS).

In addition to price-based information, we incorporate stock metadata to enrich the similarity structure. The metadata associated with the selected stocks are retrieved using the yfinance package (https://ranaroussi.github.io/yfinance/). Specifically, we extract the description field for each stock and employ Sentence-BERT [24] to embed the textual descriptions. Pairwise distances between the resulting metadata representations are then computed and used as side information, denoted by the matrix Z, in the graph learning process.

We evaluate the performance of the proposed method on a stock clustering task, with the objective of recovering the three underlying market sectors. Clustering performance is assessed using standard evaluation metrics, including modularity (MOD) [25].
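The per-component update of Algorithm 1 amounts to finding the positive real root of the cubic (28). A sketch using numpy.roots, which itself computes the eigenvalues of the polynomial's companion matrix, consistent with the approach of [23]:

```python
import numpy as np

def cubic_positive_root(a_i, C_i, alpha_Q_i):
    """Positive real root of a_i w^3 + C_i w^2 - alpha*Q_i = 0, eq. (28).

    np.roots builds the companion matrix of the polynomial and returns its
    eigenvalues. For a_i > 0 and alpha*Q_i > 0, the coefficient sequence has
    exactly one sign change, so there is a unique positive real root.
    """
    roots = np.roots([a_i, C_i, 0.0, -alpha_Q_i])
    real = roots[np.abs(roots.imag) < 1e-9].real
    positive = real[real > 0]
    return positive[0]
```

For instance, with a_i = 1, C_i = 0, and αQ_i = 8, the equation reduces to w³ = 8 and the update is w = 2.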
Additionally, edge detection performance is evaluated through the F-score (FS), defined as

FS = 2tp / (2tp + fp + fn),   (29)

where tp is the number of true positives (correctly identified edges), fp the number of false positives (incorrectly identified edges), and fn the number of false negatives (missed true edges). The F-score takes values in the interval [0, 1], with FS = 1 indicating perfect recovery of the underlying graph structure. In this evaluation, the ground-truth graph is assumed to contain edges only between stocks belonging to the same sector.

We first perform cross-validation over a coarse grid of 20 values of λ ranging from 0.1 to 10, using only signal-based information. The value of λ yielding the best clustering performance is then fixed and not further refined for the rest of the experiments. In the following, we rather focus on studying the effect of the fusion parameter α.

As illustrated in Fig. 1, we analyze the graph estimated from stock log-returns (price-based information) and from side information, as well as from their joint optimization, corresponding to α ∈ (0, 1). When only price data is used (i.e., α = 1), the method reduces to a purely signal-driven graph learning approach, which is representative of most existing Laplacian-constrained graph learning algorithms [14]. In this case, the Utilities cluster (shown in green) is reasonably well separated, whereas the separation between the Health Care (red) and Materials (blue) sectors remains ambiguous. By incorporating the side information (i.e., α < 1), a clearer separation between the Health Care and Materials clusters emerges, demonstrating the benefit of fusing heterogeneous sources of information. However, when relying solely on metadata (the case α = 0), the resulting clustering becomes less meaningful, as shown in Fig. 1(a).
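The F-score of (29), applied to edge supports, can be sketched as follows (W_est and W_true are hypothetical adjacency matrices of the estimated and ground-truth graphs):

```python
import numpy as np

def edge_f_score(W_est, W_true, tol=1e-8):
    """F-score of eq. (29), FS = 2tp / (2tp + fp + fn), over edge supports."""
    iu = np.triu_indices(W_true.shape[0], k=1)  # count each undirected edge once
    est = np.abs(W_est[iu]) > tol
    true = np.abs(W_true[iu]) > tol
    tp = np.sum(est & true)      # correctly identified edges
    fp = np.sum(est & ~true)     # spurious edges
    fn = np.sum(~est & true)     # missed edges
    return 2 * tp / (2 * tp + fp + fn)
```

If the ground truth has edges {(0,1), (1,2)} and the estimate recovers only (0,1), then tp = 1, fp = 0, fn = 1, giving FS = 2/3.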
This observation highlights that neither price-based data nor metadata alone is sufficient to fully capture the underlying sector structure.

VI. CONCLUSION

In this paper, we proposed a side-information-aware graph learning formulation that simultaneously exploits observed signals and additional node attributes. To efficiently solve the resulting optimization problem, we developed an MM algorithm, leading to a sequence of tractable subproblems with closed-form updates. Experimental results on financial data demonstrated that the proposed method significantly improves clustering performance, outperforming existing graph learning approaches that rely only on signal observations. These results highlight the importance of incorporating heterogeneous side information for enhanced graph inference and downstream learning tasks. Future work will analyze the joint tuning of the fusion parameter α and the sparsity parameter λ.

REFERENCES

[1] A. Hollocou, T. Bonald, and M. Lelarge, "Modularity-based sparse soft graph clustering," in The 22nd International Conference on Artificial Intelligence and Statistics. PMLR, 2019, pp. 323–332.
[2] S. Wang, J. Yang, J. Yao, Y. Bai, and W. Zhu, "An overview of advanced deep graph node clustering," IEEE Transactions on Computational Social Systems, vol. 11, no. 1, pp. 1302–1314, 2023.
[3] A. Ortega, P. Frossard, J. Kovačević, J. M. Moura, and P. Vandergheynst, "Graph signal processing: Overview, challenges, and applications," Proceedings of the IEEE, vol. 106, no. 5, pp. 808–828, 2018.
[4] E. Isufi, F. Gama, D. I. Shuman, and S. Segarra, "Graph filters for signal processing and machine learning on graphs," IEEE Transactions on Signal Processing, vol. 72, pp. 4745–4781, 2024.

Fig. 1: Evolution of the reconstructed graph as a function of α ∈ [0, 1]; panels (a)–(e) show α = 0, 0.4, 0.6, 0.8, 1, and panel (f) shows the F-score and modularity as functions of α.
The extreme case α = 0 corresponds to using only side information, whereas α = 1 corresponds to using only signal information. Intra-sector connections are represented by the corresponding sector colors, while gray-colored edges indicate connections between nodes belonging to different sectors.

[5] G. Leus, A. G. Marques, J. M. Moura, A. Ortega, and D. I. Shuman, "Graph signal processing: History, development, impact, and outlook," IEEE Signal Processing Magazine, vol. 40, no. 4, pp. 49–60, 2023.
[6] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, and P. S. Yu, "A comprehensive survey on graph neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 1, pp. 4–24, 2020.
[7] S. L. Lauritzen, Graphical Models. Clarendon Press, 1996, vol. 17.
[8] H. E. Egilmez, E. Pavez, and A. Ortega, "Graph learning from data under Laplacian and structural constraints," IEEE Journal of Selected Topics in Signal Processing, vol. 11, no. 6, pp. 825–841, 2017.
[9] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83–98, 2013.
[10] X. Dong, D. Thanou, P. Frossard, and P. Vandergheynst, "Learning Laplacian matrix in smooth graph signal representations," IEEE Transactions on Signal Processing, vol. 64, no. 23, pp. 6160–6173, 2016.
[11] V. Kalofolias, "How to learn a graph from smooth signals," in Artificial Intelligence and Statistics. PMLR, 2016, pp. 920–929.
[12] S. Kumar, J. Ying, J. V. de Miranda Cardoso, and D. Palomar, "Structured graph learning via Laplacian spectral constraints," Advances in Neural Information Processing Systems, vol. 32, 2019.
[13] B. Lake and J. Tenenbaum, "Discovering structure by learning sparse graphs," in Proceedings of the Annual Meeting of the Cognitive Science Society, vol. 32, no. 32, 2010.
[14] J. Ying, J. V. de Miranda Cardoso, and D. Palomar, "Nonconvex sparse graph learning under Laplacian constrained graphical model," Advances in Neural Information Processing Systems, vol. 33, pp. 7101–7113, 2020.
[15] T. H. Phi, A. Hippert-Ferrer, F. Bouchard, and A. Breloy, "Leveraging low-rank factorizations of conditional correlation matrices in graph learning," IEEE Transactions on Signal Processing, 2026.
[16] A. Hippert-Ferrer, F. Bouchard, A. Mian, T. Vayer, and A. Breloy, "Learning graphical factor models with Riemannian optimization," in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2023, pp. 349–366.
[17] Y. Sun, P. Babu, and D. P. Palomar, "Majorization-minimization algorithms in signal processing, communications, and machine learning," IEEE Transactions on Signal Processing, vol. 65, no. 3, pp. 794–816, 2016.
[18] S. J. Johnson, M. R. Murty, and I. Navakanth, "A detailed review on word embedding techniques with emphasis on word2vec," Multimedia Tools and Applications, vol. 83, no. 13, pp. 37979–38007, 2024.
[19] J. Fan and R. Li, "Variable selection via nonconcave penalized likelihood and its oracle properties," Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001.
[20] A. Javaheri, A. Amini, F. Marvasti, and D. P. Palomar, "Joint signal recovery and graph learning from incomplete time-series," in ICASSP 2024 – 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024, pp. 13511–13515.
[21] S. P. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[22] S. Kumar, J. Ying, J. V. de Miranda Cardoso, and D. P. Palomar, "A unified framework for structured graph learning via spectral constraints," Journal of Machine Learning Research, vol. 21, no. 22, pp. 1–60, 2020.
[23] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge University Press, 2012.
[24] N. Reimers and I. Gurevych, "Sentence-BERT: Sentence embeddings using Siamese BERT-networks," in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Nov. 2019. [Online]. Available: https://arxiv.org/abs/1908.10084
[25] M. E. Newman, "Modularity and community structure in networks," Proceedings of the National Academy of Sciences, vol. 103, no. 23, pp. 8577–8582, 2006.
