Polynomial Selection in Spectral Graph Neural Networks: An Error-Sum of Function Slices Approach

Spectral graph neural networks are proposed to harness spectral information inherent in graph-structured data through the application of polynomial-defined graph filters, recently achieving notable success in graph-based web applications. Existing st…

Authors: Guoming Li, Jian Yang, Shangsong Liang, Dongsheng Luo

Guoming Li* (Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; paskardli@outlook.com), Jian Yang (University of Chinese Academy of Sciences, Beijing, China; jianyang0227@gmail.com), Shangsong Liang (Sun Yat-sen University, Guangzhou, China; liangshangsong@gmail.com), Dongsheng Luo* (Florida International University, Miami, United States; dluo@fiu.edu)

Abstract

Spectral graph neural networks are proposed to harness spectral information inherent in graph-structured data through the application of polynomial-defined graph filters, recently achieving notable success in graph-based web applications. Existing studies reveal that various polynomial choices greatly impact spectral GNN performance, underscoring the importance of polynomial selection. However, this selection process remains a critical and unresolved challenge. Although prior work suggests a connection between the approximation capabilities of polynomials and the efficacy of spectral GNNs, there is a lack of theoretical insight into this relationship, rendering polynomial selection a largely heuristic process. To address the issue, this paper examines polynomial selection from an error-sum of function slices perspective. Inspired by conventional signal decomposition, we represent graph filters as a sum of disjoint function slices. Building on this, we then bridge polynomial capability and spectral GNN efficacy by proving that the construction error of the graph convolution layer is bounded by the sum of polynomial approximation errors on function slices. This result leads us to develop an advanced filter based on trigonometric polynomials, a widely adopted option for approximating narrow signal slices. The proposed filter retains provable parameter efficiency, with a novel Taylor-based parameter decomposition that achieves a streamlined, effective implementation.
With this foundation, we propose TFGNN, a scalable spectral GNN operating in a decoupled paradigm. We validate the efficacy of TFGNN via benchmark node classification tasks, along with an example graph anomaly detection application to show its practical utility.

CCS Concepts

• Computing methodologies → Machine learning.

*Corresponding Authors

This work is licensed under a Creative Commons Attribution 4.0 International License. WWW '25, Sydney, NSW, Australia. © 2025 Copyright held by the owner/author(s). ACM ISBN 979-8-4007-1274-6/25/04. https://doi.org/10.1145/3696410.3714760

Keywords

Spectral graph neural networks, Polynomial graph filters, Polynomial approximation, Node classification

ACM Reference Format:
Guoming Li, Jian Yang, Shangsong Liang, and Dongsheng Luo. 2025. Polynomial Selection in Spectral Graph Neural Networks: An Error-Sum of Function Slices Approach. In Proceedings of the ACM Web Conference 2025 (WWW '25), April 28-May 2, 2025, Sydney, NSW, Australia. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3696410.3714760

1 Introduction

Graph neural networks (GNNs) [75, 86] have emerged as powerful tools to capture structural information from graph data, facilitating advanced performance across numerous web applications, such as web search [5, 81], recommender systems [27, 76], social network analysis [9], anomaly detection [14, 18, 48], etc. Among GNN varieties, spectral GNNs stand out for their ability to exploit the spectral properties of graph data using polynomial-defined graph filters, recently achieving notable success in graph-related tasks [3]. Numerous existing studies have empirically revealed that various polynomial choices greatly impact spectral GNN performance [24-26, 31, 38, 74], underscoring the importance of polynomial selection.
However, despite the various works that incorporate different polynomials, their primary focus has been on other factors, such as convergence rate [24, 74], rather than explicitly targeting the enhancement of spectral GNN efficacy. As far as we are aware, no existing work directly associates spectral GNN efficacy with polynomial capability, which renders polynomial selection a crucial yet unresolved challenge, often approached heuristically.

To tackle this issue, we investigate polynomial selection through a novel lens of error-sum of function slices. Drawing inspiration from signal decomposition techniques [21], we uniformly represent graph filters as a sum of disjoint function slices. We present the first proof establishing that the construction error of graph convolution layers is bounded by the sum of polynomial approximation errors on these function slices. This explicitly links the capability of polynomials to the effectiveness of spectral GNNs, supported by intuitive numerical validations that affirm the practicality of our theoretical framework. This finding emphasizes that enhanced spectral GNN efficacy can be attained by utilizing graph filters built from "narrow slice-preferred polynomials". Consequently, we introduce an innovative filter based on trigonometric polynomials [88], a standard approach for approximating narrow signal slices in the signal processing domain. Our proposed filter showcases proven parameter efficiency, leveraging a novel Taylor-based parameter decomposition that facilitates a streamlined and effective implementation. Building upon this foundation, we introduce TFGNN, a scalable spectral GNN operating in a widely adopted decoupled GNN architecture [10, 19, 25, 38, 83].
Empirically, we validate TFGNN's capacity via benchmark node classification tasks and highlight its real-world efficacy with an example graph anomaly detection application. Our contributions are summarized below:

• We provide the inaugural proof that connects the efficacy of spectral GNNs to their polynomial capabilities, framed through the lens of approximation error on function slices. Our numerical experiments reinforce the practical utility of this connection. This finding offers an informed strategy to refine polynomial selection, leading to enhanced spectral GNNs.
• We introduce an advanced graph filter based on trigonometric polynomials, showcasing provable parameter efficiency. Our novel approach incorporates a Taylor-based parameter decomposition to achieve a streamlined implementation. Based on this filter, we further develop TFGNN, a scalable spectral GNN characterized by its decoupled architecture.
• We validate TFGNN's effectiveness with extensive experiments in benchmark node classification and an illustrative application in graph anomaly detection. The results reveal that TFGNN not only exceeds previous methods in standard tasks but also yields results comparable to specialized models in real-world settings, demonstrating its significant practical value.

2 Backgrounds and Preliminaries

Graph notations. Let $\mathcal{G} = (\mathbf{A}, \mathbf{X})$ be an undirected and unweighted graph with adjacency matrix $\mathbf{A} \in \{0, 1\}^{n \times n}$ and node features $\mathbf{X} \in \mathbb{R}^{n \times m}$. In addition, $\mathbf{L} = \mathbf{I} - \mathbf{D}^{-\frac{1}{2}} \mathbf{A} \mathbf{D}^{-\frac{1}{2}}$ is the normalized graph Laplacian [11], with $\mathbf{I}$, $\mathbf{D}$ being the identity matrix and the degree matrix, respectively. The eigen-decomposition of $\mathbf{L}$ is given by $\mathbf{L} = \mathbf{U} \, \mathrm{diag}(\boldsymbol{\lambda}) \, \mathbf{U}^T$, where $\mathbf{U} \in \mathbb{R}^{n \times n}$ denotes the eigenvectors, and $\boldsymbol{\lambda} \in [0, 2]^n$ represents the corresponding eigenvalues.

Graph filters.
The concept of graph filters originates in the field of Graph Signal Processing (GSP) [57, 63, 66], a field dedicated to developing specialized tools for processing signals generated on graphs, grounded in spectral graph theory [11]. A graph filter is specifically a point-wise mapping $f: [0, 2] \mapsto \mathbb{R}$ applied to the graph Laplacian's eigenvalues $\boldsymbol{\lambda}$, facilitating the processing of a graph signal $\mathbf{x} \in \mathbb{R}^n$ through the filtering operation shown below [64]:

$$\mathbf{z} \triangleq \mathbf{U} \, \mathrm{diag}(f(\boldsymbol{\lambda})) \, \mathbf{U}^T \mathbf{x}, \qquad (1)$$

where $\mathbf{z} \in \mathbb{R}^n$ represents the filtered output. This formulation is often identified as the graph convolution [63] operation. Due to the intensive computational cost associated with eigen-decomposition, the mapping $f$ is typically implemented via polynomial approximation in practice, rewriting Eq. 1 as:

$$\mathbf{z} = \mathbf{U} \, \mathrm{diag}\left(\sum_{d=0}^{D} \theta_d \mathrm{T}_d(\boldsymbol{\lambda})\right) \mathbf{U}^T \mathbf{x} = \sum_{d=0}^{D} \theta_d \mathrm{T}_d(\mathbf{L}) \, \mathbf{x}. \qquad (2)$$

$\mathrm{T}_d$ denotes the $d$-th term of a polynomial, with coefficient $\theta_d$.

Figure 1: Example of function slicing. $f(x)$ is dissected into three components, determined by its eigenvalues.

Spectral-based GNNs. Spectral-based GNNs emerge from the integration of graph filters with graph-structured data. By treating each column of the node feature matrix $\mathbf{X}$ as an individual graph signal, an $L$-layer spectral GNN is architected as a multi-layer neural network that processes hidden features through filtering operations, as formulated below [3]:

$$\mathbf{H}^{(l+1)} = \sigma^{(l)}\left[\sum_{d=0}^{D} \theta_{dl} \mathrm{T}_d(\mathbf{L}) \, \mathbf{H}^{(l)} \mathbf{W}^{(l)}\right], \qquad \mathbf{H}^{(0)} \triangleq \mathbf{X}. \qquad (3)$$

Here, $\mathbf{H}^{(l)}$ and $\mathbf{W}^{(l)}$ correspond to the hidden representation and weight matrix at the $l$-th layer, respectively, with $\sigma^{(l)}$ representing a non-linear function commonly applied in neural networks. Each $l$-th layer is termed a graph convolution layer, a critical building block in spectral GNNs and subsequent developments in the field [13, 23, 24, 26, 30, 31, 38-41, 74, 83].
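The filtering operation of Eqs. 1 and 2 can be made concrete with a short sketch. The snippet below (NumPy; the toy graph and coefficients are illustrative, and a monomial basis $\mathrm{T}_d(\mathbf{L}) = \mathbf{L}^d$ is assumed for simplicity) applies a polynomial filter with repeated matrix-vector products and cross-checks it against the explicit spectral form of Eq. 1:

```python
import numpy as np

def normalized_laplacian(A):
    """L = I - D^{-1/2} A D^{-1/2} for an undirected 0/1 adjacency matrix."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    mask = d > 0
    d_inv_sqrt[mask] = d[mask] ** -0.5
    return np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def poly_filter(L, x, theta):
    """Eq. 2: z = sum_d theta_d L^d x via repeated mat-vecs (no eigen-decomposition)."""
    z = np.zeros_like(x)
    p = x.copy()                     # p holds L^d x, starting at d = 0
    for t in theta:
        z += t * p
        p = L @ p                    # advance to the next power of L
    return z

# Toy 4-node path graph and a degree-2 filter
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
L = normalized_laplacian(A)
x = np.array([1.0, -1.0, 2.0, 0.5])
theta = [0.5, 0.3, 0.2]

z = poly_filter(L, x, theta)

# Cross-check against the explicit spectral form of Eq. 1
lam, U = np.linalg.eigh(L)
f_lam = sum(t * lam ** d for d, t in enumerate(theta))
z_spec = U @ np.diag(f_lam) @ U.T @ x
assert np.allclose(z, z_spec)
```

Because the polynomial form only needs (sparse) multiplications by $\mathbf{L}$, it avoids the cubic-cost eigen-decomposition that the spectral form requires.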
3 Connecting Polynomial Capability with Spectral GNN Efficacy

This section seeks to connect polynomial capability with the efficacy of spectral GNNs. We examine the relationship between polynomial approximation errors and feature construction errors in the graph convolution layer, providing theoretical analysis alongside intuitive numerical evaluations. This exploration yields vital insights that contribute to the progression of spectral GNNs in a polynomial context. We begin by defining several essential concepts.

Definition 3.1 (Function slices). Let $f: [0, 2] \mapsto \mathbb{R}$ be a continuous and differentiable filter mapping. Denote the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of $\mathbf{L}$, satisfying $0 = \lambda_1 \leq \lambda_2 \leq \ldots \leq \lambda_n \leq 2$. The function slices of $f(x)$ are given by a set of disjoint functions $f_s$, $s = 1, 2, \ldots, n$, satisfying the following conditions:

$$f_s(x) = \begin{cases} f(x) & x \in [\lambda_{s-1}, \lambda_s], \\ 0 & \text{otherwise}. \end{cases} \qquad (4)$$

Therefore, for any arbitrary function $f$, we can represent it by summing its slices, as illustrated below:

$$f(x) = \sum_{s=1}^{n} f_s(x). \qquad (5)$$

Figure 1 provides an intuitive example of function slicing. This concept parallels the signal decomposition techniques found in the conventional signal processing field [21].

Definition 3.2 (Polynomial's approximation error). Let $\mathrm{T}_{0:D}(x; f)$ represent a polynomial function of degree $D$ that achieves the least squares error (LSE) [60, 67] in approximating a specified function $f(x)$. Accordingly, we can define both continuous and discrete forms of the approximation error relative to the target filter function $f(x)$ using $\mathrm{T}_{0:D}$ as follows:

$$\text{(Continuous)} \quad \epsilon \triangleq \int_0^2 |\mathrm{T}_{0:D}(x; f) - f(x)|^2 \, dx, \qquad (6)$$

$$\text{(Discrete)} \quad \epsilon \triangleq \|\mathrm{T}_{0:D}(\boldsymbol{\lambda}; f) - f(\boldsymbol{\lambda})\|_F, \qquad (7)$$

where $\|\cdot\|_F$ denotes the Frobenius norm [69].
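The slicing of Definition 3.1 can be illustrated numerically. The sketch below (NumPy; the boundary values are illustrative stand-ins for sorted Laplacian eigenvalues) builds the slices of an arbitrary filter and checks the reconstruction property of Eq. 5; half-open intervals are used so that the slices stay disjoint at shared endpoints:

```python
import numpy as np

# Illustrative slice boundaries, standing in for sorted Laplacian eigenvalues
bounds = np.array([0.0, 0.4, 0.9, 1.5, 2.0])

def make_slice(f, lo, hi, last=False):
    """f_s(x): equals f(x) on [lo, hi), zero elsewhere.

    Half-open intervals keep the slices disjoint at shared endpoints;
    the last slice is closed so the whole of [0, 2] is covered."""
    def fs(x):
        mask = (x >= lo) & ((x <= hi) if last else (x < hi))
        return np.where(mask, f(x), 0.0)
    return fs

f = lambda x: np.exp(-x) * np.cos(np.pi * x)     # an arbitrary target filter
slices = [make_slice(f, bounds[i], bounds[i + 1], last=(i == len(bounds) - 2))
          for i in range(len(bounds) - 1)]

xs = np.linspace(0.0, 2.0, 1001)
total = sum(fs(xs) for fs in slices)
assert np.allclose(total, f(xs))                 # Eq. 5: f equals the sum of its slices
```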
Our analysis centers on the continuous form, with the derived insights adapted to the discrete form for application in spectral GNNs.

Definition 3.3 (Construction error of graph convolution layer). Let $\mathbf{Y}$ denote the target output of a graph convolution layer, expressed as $\mathbf{Y} = \mathbf{U} \, \mathrm{diag}(f(\boldsymbol{\lambda})) \, \mathbf{U}^T \mathbf{X} \mathbf{W}$, where $f$ serves as the "optimal" filter function for constructing $\mathbf{Y}$. The construction error of the graph convolution layer on $\mathbf{Y}$ through a $D$-degree polynomial filter function $\mathrm{T}_{0:D}$ is defined as:

$$\xi \triangleq \|\mathbf{U} \, \mathrm{diag}(\mathrm{T}_{0:D}(\boldsymbol{\lambda}; f) - f(\boldsymbol{\lambda})) \, \mathbf{U}^T \mathbf{X} \mathbf{W}\|_F. \qquad (8)$$

Note that the error $\xi$, analogous to $\epsilon$ in Definition 3.2, is measured as the difference between the target function $f$ and the polynomial $\mathrm{T}_{0:D}$ that achieves the least squares error (LSE) approximation. The graph convolution layer introduced in Definition 3.3 aligns with a one-layer linear GNN, a configuration similarly explored in previous studies [74, 78]. These prior works have examined the effectiveness of a one-layer linear GNN in constructing node labels to evaluate the overall performance of GNNs, which inspired us to examine the construction error within the graph convolution layer.

3.1 Theoretical insights

Polynomial capability is quantified by the function approximation error [60, 67], whereas spectral GNN efficacy is typically reflected by prediction error in downstream tasks [10, 30, 35, 38, 70, 74, 83]. Consequently, a natural step toward linking polynomial capability with spectral GNN efficacy is to establish a bridge between the polynomial approximation error $\epsilon$, as defined in Definition 3.2, and the graph convolution layer's construction error $\xi$. In particular, as described in Definition 3.1, for an "optimal" filter function $f(x)$, the approximation error of a $D$-degree polynomial $\mathrm{T}_{0:D}(x; f)$ satisfies the conditions outlined in the following lemma:

Lemma 3.4. Let $f(x)$ be a function composed of function slices $f_s(x)$, $s = 1, 2, \ldots, n$.
Let $\mathrm{T}_{0:D}(x; f)$ be a $D$-degree polynomial that provides the LSE approximation of $f(x)$ with error $\epsilon$. Specifically, define $\epsilon_s$, $s = 1, 2, \ldots, n$, as the polynomial approximation error of each slice $f_s(x)$ when approximated by the $D$-degree polynomial $\mathrm{T}_{0:D}(x; f_s)$. An inequality that bounds $\epsilon$ in terms of $\epsilon_s$ is formulated below:

$$\sum_{s=1}^{n} \epsilon_s \leq \epsilon \leq \left(\sum_{s=1}^{n} \sqrt{\epsilon_s}\right)^2. \qquad (9)$$

The proof can be found in the Appendix. Lemma 3.4 establishes both upper and lower bounds for the approximation error of a polynomial in relation to an arbitrary function $f$, based on the errors associated with its slices $f_s$. This result suggests that the capacity of the polynomial can be equivalently evaluated through the approximation error of its slices.

Figure 2: The functions served as target filters (panels (a)-(f): $f_1(x)$ through $f_6(x)$, each plotted over $x \in [0, 2]$). Additional mathematical details are available in the Appendix.

Drawing from the insights of the bounded error above, we can now propose an inequality that bounds the construction error of the graph convolution layer, utilizing the polynomial approximation error as outlined in the theorem below:

Theorem 3.5. Let $\delta_{\mathbf{X}}$ and $\delta_{\mathbf{W}}$ denote the minimum singular values of $\mathbf{X}$ and $\mathbf{W}$, respectively. Consider a regularization on the weight matrix $\mathbf{W}$, namely $L_2$ regularization, expressed as $\|\mathbf{W}\|_F \leq r$. The construction error $\xi$ satisfies the following inequality:

$$\delta_{\mathbf{X}} \delta_{\mathbf{W}} \sum_{s=1}^{n} \epsilon_s \leq \xi \leq r \|\mathbf{X}\|_F \left(\sum_{s=1}^{n} \sqrt{\epsilon_s}\right)^2. \qquad (10)$$

The proof can be found in the Appendix. Theorem 3.5 establishes a direct connection between the polynomial approximation error and the construction error of the graph convolution layer through the approximation error of function slices, $\epsilon_s$. This insight is novel and, to our knowledge, has not been documented before.
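The upper bound of Lemma 3.4 can be checked numerically: since the LSE fit is a linear (orthogonal) projection onto the degree-$D$ polynomial space, the triangle inequality gives $\epsilon \leq (\sum_s \sqrt{\epsilon_s})^2$ directly. The sketch below (NumPy; the filter, slice boundaries, and degree are illustrative) verifies this on sampled points; the lower bound relies on conditions handled in the paper's appendix, so it is not asserted here:

```python
import numpy as np

xs = np.linspace(0.0, 2.0, 2001)
f = lambda x: np.exp(-x) * np.cos(np.pi * x)     # illustrative target filter
D = 8                                            # polynomial degree

def lse_error(y):
    """Squared L2 error of the degree-D least-squares polynomial fit to y on xs."""
    coef = np.polynomial.polynomial.polyfit(xs, y, D)
    resid = np.polynomial.polynomial.polyval(xs, coef) - y
    return float(np.sum(resid ** 2))

bounds = np.array([0.0, 0.4, 0.9, 1.5, 2.0])     # illustrative slice boundaries
slice_ys = [np.where((xs >= lo) & (xs < hi), f(xs), 0.0)
            for lo, hi in zip(bounds[:-1], bounds[1:])]
slice_ys[-1][xs == 2.0] = f(2.0)                 # close the final interval

eps = lse_error(f(xs))                           # error on the whole filter
eps_s = [lse_error(y) for y in slice_ys]         # error on each slice

# Upper bound of Eq. 9: follows from the triangle inequality because the
# LSE fit is a linear projection onto the degree-D polynomial space.
assert eps <= (sum(np.sqrt(e) for e in eps_s)) ** 2 + 1e-9
```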
3.2 Numerical validation

We conduct extensive numerical experiments to validate our theoretical findings. Inspired by the filter learning experiments from prior spectral GNN studies [24, 25, 49, 72], we design more challenging tasks with (i) increased graph sizes and (ii) complex target functions for learning. Specifically, we generate random graphs with 50,000 nodes, substantially larger than the typical 10,000-node setups in previous studies. Additionally, we utilize the six intricate target filters visualized in Figure 2. The experiments comprise two primary tasks:

• Using eigenvalue-based slices of each function, we assess the approximation quality of five polynomials commonly adopted in the spectral GNN literature, with the sum of squared errors (SSE) across the 50,000 slices as the metric.
• With a random 50,000 × 100 matrix as node features $\mathbf{X}$, we apply the six target functions as filters, obtaining outputs $\mathbf{Y}_1$ to $\mathbf{Y}_6$. We train spectral GNNs on $(\mathbf{X}, \mathbf{Y})$ to learn the target functions with polynomial filters, with the Frobenius norm of the difference between the learned and target filters as the metric.

Table 1: Numerical experiment results. #Avg Rank 1 denotes the average rank in polynomial approximation, and #Avg Rank 2 refers to the average rank in filter learning.
(Left block: slice-wise approximation SSE; right block: filter learning error.)

| Polynomial | GNN | f1 | f2 | f3 | f4 | f5 | f6 | f1 | f2 | f3 | f4 | f5 | f6 | #Avg Rank 1 | #Avg Rank 2 |
| Monomial | GPRGNN [10] | 139.9 | 289.1 | 466.1 | 398.3 | 1.83 | 97.83 | 167.2 | 366.4 | 566.3 | 468.7 | 15.91 | 139.2 | 5 | 5 |
| Bernstein | BernNet [25] | 32.78 | 247.3 | 398.5 | 306.5 | 0.058 | 22.92 | 68.23 | 313.2 | 448.2 | 415.2 | 7.79 | 95.84 | 4 | 4 |
| Chebyshev | ChebNetII [26] | 23.45 | 85.19 | 244.8 | 187.2 | 0.018 | 13.13 | 64.22 | 168.4 | 402.5 | 347.5 | 6.83 | 86.25 | 3 | 3 |
| Jacobi | JacobiConv [74] | 22.18 | 80.77 | 239.2 | 155.3 | 0.017 | 11.82 | 48.56 | 95.92 | 338.1 | 266.4 | 5.33 | 65.13 | 2 | 2 |
| Learnable | OptBasis [24] | 20.75 | 80.53 | 225.7 | 152.7 | 0.017 | 11.20 | 43.44 | 89.48 | 289.5 | 238.1 | 4.98 | 61.70 | 1 | 1 |

Numerical insights. Table 1 reveals that reducing the sum of the polynomial approximation error over function slices yields lower filter learning errors in spectral GNNs, with consistent rankings across both tasks. Although these results are derived from numerical experiments and may introduce certain biases, they confirm our theoretical analysis, showing a strong positive relationship between a polynomial's capability and the efficacy of spectral GNNs.

3.3 Summary

In this section, we summarize the significant findings from the preceding analysis and discuss how to enhance spectral GNNs through informed polynomial selection. Specifically, as discussed in Sections 3.1 and 3.2, the construction error of spectral GNNs is intricately connected to the polynomial approximation error summed over function slices. Moreover, referring to Theorem 3.5 and noting that $\mathbf{X}$ is typically a constant property of the graph data, the construction error of the graph convolution layer, $\xi$, therefore depends entirely on the slice-wise errors $\epsilon_s$, $s = 1, 2, \ldots, n$. Consequently, for graph data $\mathcal{G} = (\mathbf{A}, \mathbf{X})$ with node labels $\mathbf{Y}$, an intuitive way to reduce spectral GNN construction error is to utilize polynomials adept at approximating these slices.
Furthermore, practical graphs often contain millions of nodes [29, 46] and complex target filters [10, 25, 47, 77]. These characteristics result in very narrow and sharp slices of the target functions. Consequently, to minimize construction errors in spectral GNNs and improve their effectiveness, it is vital to incorporate "narrow function-preferred" polynomials in the development of graph filters. This insight not only represents a key contribution of this paper but also illuminates potential avenues for advancing spectral GNNs, paving the way for the introduction of a more advanced method.

4 The proposed TFGNN

Based on the previous analysis, this section introduces a novel trigonometric polynomial-based graph filter to enhance spectral GNNs. We begin with the trigonometric filter, discuss its efficient implementation via Taylor-based parameter decomposition, and present our Trigonometric Filter Graph Neural Network (TFGNN) as a decoupled GNN. A complexity analysis concludes the section.

4.1 Parameter-efficient trigonometric filter

Trigonometric polynomials, among the most extensively utilized, have found widespread application in approximating functions with complicated patterns [21, 65, 88]. More importantly, extensive prior studies in the traditional signal processing domain have consistently highlighted the effectiveness of trigonometric polynomials over other polynomial types in modeling functions localized within narrow intervals [12, 16, 22, 73, 80, 82]. This prompts us to pioneer the development of graph filters that leverage the power of trigonometric polynomials. Explicitly, our trigonometric graph filter is defined as follows:

$$f_{\mathrm{Trigo}}(\boldsymbol{\lambda}) = \sum_{k=0}^{K} \left[\alpha_k \sin(k\omega\boldsymbol{\lambda}) + \beta_k \cos(k\omega\boldsymbol{\lambda})\right]. \qquad (11)$$

Here, $K$ denotes the order of the truncated trigonometric polynomial.
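For concreteness, the sketch below evaluates the truncated filter of Eq. 11 at a set of eigenvalues (NumPy; the coefficients and base frequency are illustrative values, not learned ones):

```python
import numpy as np

def f_trigo(lam, alpha, beta, omega):
    """Evaluate Eq. 11: sum_k alpha_k sin(k w lam) + beta_k cos(k w lam)."""
    lam = np.asarray(lam, dtype=float)
    k = np.arange(len(alpha))[:, None]            # orders 0..K, shape (K+1, 1)
    return alpha @ np.sin(k * omega * lam[None, :]) + \
           beta @ np.cos(k * omega * lam[None, :])

alpha = np.array([0.0, 0.8, 0.2, 0.05])           # illustrative, K = 3
beta = np.array([0.5, 0.3, 0.1, 0.02])
omega = np.pi / 2                                  # base frequency in (0, pi)

lam = np.linspace(0.0, 2.0, 5)
vals = f_trigo(lam, alpha, beta, omega)

# At lam = 0 every sine term vanishes and every cosine term equals 1,
# so the filter value is exactly sum_k beta_k.
assert np.isclose(vals[0], beta.sum())
```

In the full model these coefficients are trainable, and the explicit eigenvalue evaluation is replaced by the Taylor-based decomposition of Section 4.2.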
The coecients 𝛼 𝑘 , and 𝛽 𝑘 are parameterized, while the hyper- parameter 𝜔 (base frequency) is chosen from within the range ( 0 , 𝜋 ) , enabling the trigonometric polynomial approximation to cover the interval [ 0 , 2 ] , which corresponds to the range of 𝝀 . similar to other types of polynomials, trigonometric polynomials oer considerable approximation capability for arbitrar y functions, thus ensuring comprehensive lter cov erage in practical applications [65, 88]. Guaranteed parameter-eciency . Apart from their recognized approximation capacities, trigonometric polynomials grant the graph lter 𝑓 Trigo with a unique, pr ovable eciency regarding its parameters ( 𝛼 𝑘 , 𝛽 𝑘 ) , 𝑘 ∈ N , as demonstrated in the the orem below . Theorem 4.1. (Parameter-eciency). Given a 𝑓 ( 𝑥 ) formulated as 𝑓 Trigo ( 𝑥 ) , its coecients 𝛼 𝑘 and 𝛽 𝑘 satises: lim 𝑘 →+∞ 𝛼 𝑘 = 0 , lim 𝑘 →+∞ 𝛽 𝑘 = 0 . (12) Proof can be found in Appendix. Theorem 4.1 establishes a solid basis by rev ealing that polynomial terms with larger values of 𝑘 within 𝑓 Trigo ( 𝑥 ) correspond to smaller weights. This insight indi- cates that the contribution of high-order terms is relatively insignif- icant, allowing for their practical omission without substantial loss in approximation accuracy . As a r esult, 𝑓 Trigo ( 𝑥 ) can achieve sub- stantial eectiveness with only a small 𝐾 , reducing the complexity of the lters while retaining signicant accuracy . This reinfor ces the superiority of 𝑓 Trigo ( 𝑥 ) over other graph lter designs. 4.2 T aylor-base d parameter de composition As detailed in Eq. 11, implementing the standar d 𝑓 Trigo ( 𝑥 ) requires an eigen-decomposition, which imp oses substantial computational complexity and limits scalability compared to alternative meth- ods. W e tackle this challenge through the introduction of T aylor- based parameter decomposition (TPD). 
TPD rst reformulates the trigonometric terms sin ( 𝑘 𝜔 𝝀 ) and cos ( 𝑘 𝜔 𝝀 ) into p olynomial forms through the T aylor expansion [1, 60, 67], as shown below: sin ( 𝑘 𝜔 𝝀 ) = 𝐷  𝑑 = 0 𝛾 𝑘𝑑 𝝀 𝑑 , cos ( 𝑘 𝜔 𝝀 ) = 𝐷  𝑑 = 0 𝜃 𝑘𝑑 𝝀 𝑑 . (13) Polynomial Selection in Spectral Graph Neural Networks: An Error-Sum of Function Slices Approach WW W ’25, April 28-May 2, 2025, Sydney , NSW , Australia The constants 𝛾 𝑘𝑑 and 𝜃 𝑘𝑑 depend exclusively on the typ es of functions (sine and cosine), the index 𝑘 , and the hyp erparame- ter 𝜔 . The eectiveness of T aylor expansion for mo deling functions within localized intervals has been thoroughly established in the literature [ 32 , 52 , 54 , 56 , 59 , 85 ], especially for trigonometric func- tions [ 7 , 33 , 36 ]. Since 𝝀 is restricted to the range [ 0 , 2 ] , the T aylor expansion emerges as a viable and crucial strategy for eciently approximating these trigonometric functions. With the updated formulations, TPD alters the convolution op- eration with 𝑓 Trigo ( 𝑥 ) on the node feature 𝑿 as detailed below: 𝒁 = 𝑼 𝐾  𝑘 = 0 " 𝛼 𝑘 𝐷  𝑑 = 0 𝛾 𝑘𝑑 𝑑 𝑖 𝑎𝑔 ( 𝝀 𝑑 ) + 𝛽 𝑘 𝐷  𝑑 = 0 𝜃 𝑘𝑑 𝑑 𝑖 𝑎𝑔 ( 𝝀 𝑑 ) # 𝑼 𝑇 𝑿 , = 𝐷  𝑑 = 0 𝑳 𝑑 𝑿 ( 𝜶 𝚪 : 𝑑 + 𝜷 𝚯 : 𝑑 ) . (14) Here, 𝜶 and 𝜷 denote the 𝐾 + 1 -dimensional vectors with elements being 𝛼 𝑘 and 𝛽 𝑘 , r espectively . 𝚪 and 𝚯 refer to the ( 𝐾 + 1 ) × ( 𝐷 + 1 ) matrices formed with 𝛾 𝑘𝑑 , 𝜃 𝑘𝑑 . Eq. 14 illustrates a streamlined convolution with 𝑓 Trigo ( 𝑥 ) , oering two signicant benets: • Reduced complexity . Utilizing the TPD, the graph convolution with 𝑓 Trigo ( 𝑥 ) eliminates the need for computation-intensive eigen-decomposition. This reduction in computational ov erhead brings the costs in line with those of standard polynomial-based lters, leading to signicant eciency gains. • Parameter de composition. 
TPD integrates all trigonometric functions into polynomial forms, so that increasing $K$ affects only the trivial computations of $\boldsymbol{\alpha}\boldsymbol{\Gamma}$ and $\boldsymbol{\beta}\boldsymbol{\Theta}$, enhancing the precision of $f_{\mathrm{Trigo}}(x)$ at negligible additional cost.

4.3 Modeling TFGNN as decoupled paradigm

TFGNN is a decoupled GNN architecture that separates graph convolution from feature transformation. This design principle, first proposed by [19], has become a de facto choice in modern spectral GNNs for its significant efficacy and computational efficiency [10, 24-26, 30, 31, 37, 38, 41, 74], and even stands out as a promising solution for scalable GNNs [44, 45]. Specifically, incorporating the trigonometric filter $f_{\mathrm{Trigo}}(x)$ with the introduced Taylor-based parameter decomposition, we present two versions of TFGNN to cater to different graph sizes:

❶ For medium-to-large graphs like Cora [79] and Arxiv [29], TFGNN operates as described below:

$$\mathbf{Z} = \sum_{d=0}^{D} \mathbf{L}^d \mathbf{H} \left(\boldsymbol{\alpha} \boldsymbol{\Gamma}_{:d} + \boldsymbol{\beta} \boldsymbol{\Theta}_{:d}\right), \qquad \mathbf{H} = \mathrm{MLP}(\mathbf{X}). \qquad (15)$$

$\mathrm{MLP}(\cdot)$ denotes a multi-layer perceptron for feature transformation.

❷ For exceptionally large graphs, such as Wiki [46] and Papers100M [29], TFGNN is implemented as follows:

$$\mathbf{Z} = \mathrm{MLP}(\mathbf{H}), \qquad \mathbf{H} = \sum_{d=0}^{D} \mathbf{L}^d \mathbf{X} \left(\boldsymbol{\alpha} \boldsymbol{\Gamma}_{:d} + \boldsymbol{\beta} \boldsymbol{\Theta}_{:d}\right). \qquad (16)$$

The different implementations of TFGNN stem from hardware constraints and introduce notable benefits: (i) for medium-to-large graphs, the graph data can be fully stored on GPUs; therefore, by simply reducing the feature dimensions with the MLP, the subsequent convolution achieves high efficiency; (ii) for exceptionally large graphs, where GPU memory limitations become a substantial challenge, TFGNN precomputes the features $\mathbf{L}^d \mathbf{X}$, $d = 1, 2, \ldots, D$, and stores them as static data files. This allows for efficient graph convolution via repeated reads of the precomputed features, mitigating the intense computational complexity associated with GNN training; an efficient MLP is applied afterwards.

Table 2: Complexity comparison of TFGNN against others. The complexity pertains to the graph convolution layers.

| Method | Computation | Parameter |
| GPRGNN [10] | O(mED) | O(K) |
| ChebNetII [26] | O(mED) | O(K) |
| OptBasis [24] | O(mED) | O(K) |
| JacobiConv [74] | O(mED) | O(K) |
| UniFilter [31] | O(mED) | O(K) |
| TFGNN (ours) | O(mED) | O(K) |

4.4 Complexity analysis of TFGNN

This subsection presents the complexity analysis of TFGNN, with particular emphasis on the graph convolution layers, as the complexity of feature transformation MLPs is already well understood. To start, we consider a graph $\mathcal{G}$ with $n$ nodes, $E$ edges, and $m$ feature dimensions. Across all spectral GNNs, the maximum polynomial order is set to $D$; the trigonometric polynomial degree is capped at $K$. A summary of the complexity analysis is outlined in Table 2.

Computational complexity. As shown in Eq. 14, for each order $d$, TFGNN first computes $\boldsymbol{\alpha}\boldsymbol{\Gamma}_{:d}$ and $\boldsymbol{\beta}\boldsymbol{\Theta}_{:d}$, requiring $2(K+1)$ operations, followed by propagation with $\mathbf{L}$, which requires $mE$ computations. Since the number of edges $E$ is typically orders of magnitude larger than both $K$ and $D$, the practical complexity is governed by $mE$ for each order $d$, leading to an overall complexity of $O(mED)$. This is comparable to other spectral GNNs like GPRGNN [10], which uses recursive computation of the propagated features $\mathbf{L}^d \mathbf{X}$. Thus, our TFGNN achieves complexity on par with prior methods.
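The computation just described — building the constant Taylor coefficient matrices $\boldsymbol{\Gamma}, \boldsymbol{\Theta}$ of Eq. 13 once, then propagating per Eq. 14 with repeated multiplications by $\mathbf{L}$ — can be sketched as follows (NumPy; the toy graph, coefficients, and truncation orders are illustrative):

```python
import numpy as np
from math import factorial

def tpd_matrices(K, D, omega):
    """Taylor coefficient matrices of Eq. 13: Gamma[k, d] and Theta[k, d] are the
    degree-d Taylor coefficients (at 0) of sin(k*omega*x) and cos(k*omega*x)."""
    Gamma = np.zeros((K + 1, D + 1))
    Theta = np.zeros((K + 1, D + 1))
    for k in range(K + 1):
        a = k * omega
        for d in range(D + 1):
            if d % 2 == 1:                     # sine expands over odd powers only
                Gamma[k, d] = (-1) ** ((d - 1) // 2) * a ** d / factorial(d)
            else:                              # cosine expands over even powers only
                Theta[k, d] = (-1) ** (d // 2) * a ** d / factorial(d)
    return Gamma, Theta

def tfgnn_convolution(L, X, alpha, beta, Gamma, Theta):
    """Eq. 14: Z = sum_d L^d X (alpha Gamma_{:d} + beta Theta_{:d}).

    The per-order scalar weights c_d = alpha @ Gamma[:, d] + beta @ Theta[:, d]
    cost only O(K) each, so increasing K barely affects the convolution."""
    c = alpha @ Gamma + beta @ Theta           # shape (D+1,)
    Z = np.zeros_like(X)
    P = X.copy()                               # P = L^0 X, advanced by mat-mults
    for cd in c:
        Z += cd * P
        P = L @ P
    return Z

# Toy check on a 2-node graph: the TPD convolution should match the exact
# spectral filter U diag(f_Trigo(lam)) U^T X once D is large enough.
K, D, omega = 2, 20, np.pi / 4
Gamma, Theta = tpd_matrices(K, D, omega)
alpha = np.array([0.0, 0.5, 0.25])
beta = np.array([0.3, 0.2, 0.1])

A = np.array([[0.0, 1.0], [1.0, 0.0]])
L = np.eye(2) - A                              # normalized Laplacian of one edge
X = np.array([[1.0, 2.0], [3.0, -1.0]])

Z = tfgnn_convolution(L, X, alpha, beta, Gamma, Theta)

lam, U = np.linalg.eigh(L)                     # eigenvalues are 0 and 2
f_lam = sum(alpha[k] * np.sin(k * omega * lam) + beta[k] * np.cos(k * omega * lam)
            for k in range(K + 1))
Z_exact = U @ np.diag(f_lam) @ U.T @ X
assert np.allclose(Z, Z_exact, atol=1e-7)
```

In practice $\mathbf{L}$ is sparse, so each `L @ P` step costs $O(mE)$, matching the complexity analysis above; the dense toy matrices here are only for the correctness check.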
Additionally, for exceptionally large graphs, TFGNN reduces complexity further by precomputing all 𝑳^𝑑𝑿, thus avoiding redundant repeated computations during training.

Parameter complexity. Our TFGNN achieves a parameter complexity of O(2(K + 1)), in contrast to traditional spectral GNNs' O(K + 1), where each polynomial basis order is assigned a parameterized coefficient. This increase is, however, trivial: the feature transformation MLP constitutes the majority of parameters, greatly outweighing the graph convolution layers. As such, TFGNN's parameter complexity remains effectively on par with that of other spectral GNNs when considering the entire model.

5 Empirical Studies

This section details the empirical evaluations, including numerical experiments identical to those in Section 3.2, a benchmark node classification task, and a practical application in graph anomaly detection. A demo code implementation is available through the GitHub repository: https://github.com/vasile-paskardlgm/TFGNN.

WWW '25, April 28-May 2, 2025, Sydney, NSW, Australia. Li et al.

Table 3: Node classification results on medium-to-large graphs. #Improv. denotes the performance gain of TFGNN over the best baseline result. Boldface represents the first result, while underlined indicates the runner-up.

| Type | Method | Cora | Cite. | Pubmed | Arxiv | Roman. | Amazon. | Ques. | Gamers | Genius |
|---|---|---|---|---|---|---|---|---|---|---|
| i | H2GCN | 87.33±0.6 | 75.11±1.2 | 88.39±0.6 | 71.93±0.4 | 61.38±1.2 | 37.17±0.5 | 64.42±1.3 | 64.71±0.4 | 90.12±0.2 |
| i | GLOGNN | 88.12±0.4 | 76.23±1.4 | 88.83±0.2 | 72.08±0.3 | 71.17±1.2 | 42.19±0.6 | 74.42±1.3 | 65.62±0.3 | 90.39±0.3 |
| i | LINKX | 84.51±0.6 | 73.25±1.5 | 86.36±0.6 | 71.14±0.2 | 67.55±1.2 | 41.57±0.6 | 63.85±0.8 | 65.82±0.4 | 91.12±0.5 |
| i | OrderGNN | 87.55±0.2 | 75.46±1.2 | 88.31±0.3 | 71.90±0.5 | 71.69±1.6 | 40.93±0.5 | 70.82±1.0 | 66.09±0.3 | 89.45±0.4 |
| i | LRGNN | 87.48±0.3 | 75.29±1.0 | 88.65±0.4 | 71.69±0.3 | 72.35±1.4 | 42.56±0.4 | 71.82±1.1 | 66.29±0.5 | 90.38±0.7 |
| ii | GCN | 86.48±0.4 | 75.23±1.0 | 87.29±0.2 | 71.77±0.1 | 72.33±1.6 | 42.09±0.6 | 75.17±0.8 | 63.29±0.5 | 86.73±0.5 |
| ii | GCNII | 86.77±0.2 | 76.57±1.5 | 88.86±0.4 | 71.72±0.4 | 71.62±1.7 | 40.89±0.4 | 72.32±1.0 | 65.11±0.3 | 90.60±0.6 |
| ii | ChebNet | 86.83±0.7 | 74.39±1.3 | 85.92±0.5 | 71.52±0.3 | 64.44±1.5 | 38.81±0.7 | 70.42±1.2 | 63.62±0.4 | 87.42±0.2 |
| ii | ACMGCN | 87.21±0.4 | 76.03±1.4 | 87.37±0.4 | 71.70±0.3 | 66.48±1.2 | 39.53±0.9 | 67.84±0.5 | 64.73±0.3 | 83.45±0.7 |
| ii | Specformer | 88.19±0.6 | 75.87±1.5 | 88.74±0.2 | 71.88±0.2 | 71.69±1.4 | 42.06±0.8 | 70.75±1.2 | 65.80±0.2 | 89.39±0.6 |
| iii | GPRGNN | 88.26±0.5 | 76.24±1.2 | 88.81±0.2 | 71.89±0.2 | 64.49±1.6 | 41.48±0.6 | 64.58±1.2 | 66.23±0.1 | 90.92±0.6 |
| iii | BernNet | 87.57±0.4 | 75.81±1.8 | 88.48±0.3 | 71.72±0.3 | 65.44±1.4 | 40.74±0.7 | 65.53±1.6 | 65.74±0.3 | 89.75±0.3 |
| iii | ChebNetII | 88.17±0.4 | 76.41±1.3 | 88.98±0.4 | 72.13±0.3 | 66.77±1.2 | 42.44±0.9 | 71.28±0.6 | 66.44±0.5 | 90.60±0.2 |
| iii | OptBasis | 88.35±0.6 | 76.22±1.4 | 89.38±0.3 | 72.10±0.2 | 64.28±1.8 | 41.63±0.8 | 69.60±1.2 | 66.81±0.4 | 90.97±0.5 |
| iii | JacobiConv | 88.53±0.8 | 76.27±1.3 | 89.51±0.2 | 71.87±0.3 | 70.10±1.7 | 42.18±0.4 | 72.16±1.3 | 64.17±0.3 | 89.32±0.5 |
| iii | NFGNN | 88.06±0.4 | 76.22±1.4 | 88.43±0.4 | 72.15±0.3 | 72.46±1.2 | 42.19±0.3 | 75.49±0.9 | 66.64±0.4 | 90.87±0.5 |
| iii | AdaptKry | 88.23±0.7 | 76.54±1.2 | 88.38±0.6 | 72.33±0.3 | 71.40±1.3 | 42.31±1.1 | 72.55±1.0 | 66.27±0.3 | 90.55±0.3 |
| iii | UniFilter | 88.31±0.7 | 76.38±1.1 | 89.30±0.4 | 72.87±0.4 | 71.22±1.5 | 41.37±0.6 | 73.83±0.8 | 65.75±0.4 | 90.66±0.2 |
| Ours | TFGNN | **89.21±0.4** | **77.68±0.8** | **90.00±0.2** | **75.23±0.2** | **74.94±1.1** | **45.04±0.6** | **81.55±0.9** | **69.46±0.2** | **92.40±0.2** |
| | #Improv. | 0.68% | 1.11% | 0.49% | 2.36% | 2.48% | 2.48% | 6.06% | 2.65% | 1.28% |

Table 4: Results of slice approximation (Poly. approx.) and filter learning (Filter Learn.).

| Poly. | GNN | Poly. approx. f2(x) | Poly. approx. f3(x) | Poly. approx. f4(x) | Filter Learn. f2(x) | Filter Learn. f3(x) | Filter Learn. f4(x) |
|---|---|---|---|---|---|---|---|
| Cheby. | ChebNetII | 85.19 | 244.8 | 187.2 | 168.4 | 402.5 | 347.5 |
| Jacobi. | JacobiConv | 80.77 | 239.2 | 155.3 | 95.92 | 338.1 | 266.4 |
| Learn. | OptBasis | 80.53 | 225.7 | 152.7 | 89.48 | 289.5 | 238.1 |
| Trigo. | TFGNN | **23.69** | **71.13** | **59.88** | **65.19** | **102.3** | **105.3** |

5.1 Slice approximation and filter learning

We conduct the same numerical experiments as outlined in Section 3.2 to evaluate the proposed trigonometric graph filters and TFGNN. To ensure a fair and informative comparison, the trigonometric polynomial used in our numerical experiments is not in its naive form; rather, we employ the formulation that incorporates a degree-10 Taylor-based parameter decomposition, akin to that of Section 4.2. The polynomial degree K is set to 10, yielding K + 1 coefficients in total. Thus, TFGNN preserves both the maximum order of 𝝀 and the number of coefficients used by other counterparts.

Results. Table 4 summarizes the performance of TFGNN alongside the three leading alternatives (Chebyshev, Jacobi, and Learnable), with boldface marking the highest scores. Due to space constraints, more results can be found in the Appendix. According to these results, trigonometric polynomials and TFGNN consistently outperform other methods. Particularly, for target functions exhibiting complex patterns, such as f2(x), f3(x), and f4(x), TFGNN obtains notable improvements over competitors, showing the efficacy of our method.
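As a rough, self-contained illustration of the degree-10 Taylor-based parameter decomposition used above (a sketch of the idea, not the paper's exact implementation), the snippet below expands sin(ωx) into an ordinary degree-D polynomial, which is what allows a trigonometric filter to be applied through standard polynomial propagation:

```python
import math

def taylor_sin_coeffs(omega: float, degree: int) -> list[float]:
    """Coefficients c_d such that sin(omega * x) ≈ sum_d c_d * x**d,
    i.e. the truncated Taylor series around x = 0 (sin has only odd terms)."""
    coeffs = [0.0] * (degree + 1)
    for d in range(1, degree + 1, 2):
        coeffs[d] = (-1) ** (d // 2) * omega ** d / math.factorial(d)
    return coeffs

def eval_poly(coeffs: list[float], x: float) -> float:
    """Horner evaluation of the polynomial at x."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

omega, D = 0.5 * math.pi, 10          # settings mirroring the D = 10 used above
coeffs = taylor_sin_coeffs(omega, D)  # the "decomposed" polynomial parameters
approx = eval_poly(coeffs, 1.0)
exact = math.sin(omega * 1.0)         # sin(pi/2) = 1
assert abs(approx - exact) < 1e-5     # truncation error is tiny at this degree
```

With ω = 0.5π and D = 10 the truncation error at x = 1 is on the order of 10⁻⁶, consistent with Table 6's observation that accuracy stabilizes once the expansion degree is moderate.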
The following sections will show how TFGNN attains leading performance on real-world datasets, affirming that the numerical outcomes correspond well to real-world scenarios.

5.2 Benchmark node classification tasks

We further assess TFGNN via benchmark node classification tasks.

5.2.1 Datasets and baselines.

Datasets. We utilize 13 benchmark datasets with varied sizes and heterophily levels [84]. For homophilic datasets, we include citation graphs (Cora, CiteSeer, PubMed) [79] and large OGB graphs (ogbn-Arxiv, ogbn-Products, ogbn-Papers100M) [29]. For heterophilic datasets, we select three recent datasets (Roman-empire, Amazon-ratings, Questions) [61] and four large ones (Gamers, Genius, Snap-patents, Pokec) [46]. (We exclude conventional dataset choices [58] due to the recognized data-leakage issues in these datasets [61].)

Baselines and settings. We include 18 advanced baselines tailored for both heterophilic and homophilic scenarios, which can be categorized into three classes as follows:
• Non-spectral GNNs: H2GCN [87], GLOGNN [42], LINKX [46], OrderGNN [68], LRGNN [43].
• Non-decoupled spectral GNNs: GCN [35], GCNII [8], ChebNet [13], ACMGCN [51], Specformer [2].
• Decoupled spectral GNNs: GPRGNN [10], BernNet [25], ChebNetII [26], OptBasis [24], NFGNN [83], JacobiConv [74], AdaptKry [30], UniFilter [31].
For the widely adopted baselines (GCN and ChebNet), we adopt consistent implementations drawn from previous research [24-26, 30, 41, 70, 74, 83]. For the remaining baselines, we inherit the hyperparameter tuning settings from their original publications.

Implementation of TFGNN.
To ensure experimental fairness, we fix the order of the Trigonometric Polynomial Decomposition, denoted as D, to 10, aligning with other baselines such as GPRGNN and ChebNetII. We employ a grid search to optimize the parameters ω within {0.2π, 0.3π, 0.5π, 0.7π} and K from {2, 4, 6, 8, 10, 15, 20}. Additional details are in the Appendix.

Table 5: Node classification and runtime (hours) results on exceptionally large graphs. "OOM" denotes "Out-Of-Memory".

| Method | Products Test acc | Products Runtime | Papers100M Test acc | Papers100M Runtime | Snap Test acc | Snap Runtime | Pokec Test acc | Pokec Runtime |
|---|---|---|---|---|---|---|---|---|
| GCN | 76.37±0.2 | 1.2 | OOM | - | 46.66±0.1 | 1.9 | 74.78±0.2 | 1.2 |
| SGC | 75.16±0.2 | 0.9 | 64.02±0.2 | 10.2 | 31.11±0.2 | 1.6 | 60.29±0.1 | 0.9 |
| GPRGNN | 79.45±0.1 | 1.3 | 66.13±0.2 | 11.1 | 48.88±0.2 | 2.0 | 79.55±0.3 | 1.2 |
| BernNet | 79.82±0.2 | 1.3 | 66.08±0.2 | 11.2 | 47.48±0.3 | 2.1 | 80.55±0.2 | 1.3 |
| ChebNetII | 81.66±0.3 | 1.2 | 67.11±0.2 | 11.0 | 51.74±0.2 | 1.9 | 81.88±0.3 | 1.2 |
| JacobiConv | 79.35±0.2 | 1.0 | 65.45±0.2 | 10.5 | 50.66±0.2 | 1.7 | 73.83±0.2 | 1.0 |
| OptBasis | 81.33±0.2 | 1.3 | 67.03±0.3 | 11.2 | 53.55±0.1 | 2.1 | 82.09±0.3 | 1.3 |
| NFGNN | 81.11±0.2 | 1.3 | 66.38±0.2 | 11.3 | 57.83±0.3 | 2.1 | 81.56±0.3 | 1.4 |
| AdaptKry | 81.70±0.3 | 1.4 | 67.07±0.2 | 11.3 | 55.92±0.2 | 2.1 | 82.16±0.2 | 1.4 |
| UniFilter | 80.33±0.2 | 1.2 | 66.79±0.3 | 11.0 | 52.06±0.1 | 2.1 | 82.23±0.3 | 1.3 |
| TFGNN (Ours) | **84.05±0.2** | 1.2 | **68.65±0.2** | 11.0 | **64.38±0.2** | 1.9 | **85.55±0.2** | 1.2 |
| #Improv. | 2.35% | - | 1.54% | - | 6.55% | - | 3.32% | - |

[Figure 3: Ablation studies on K and ω on (a) Cora, (b) Citeseer, (c) Roman, (d) Amazon, sweeping K ∈ {2, 4, 6, 8, 10, 15, 20}. Darker shades indicate higher results. Additional results are in the Appendix.]

Table 6: Ablation studies on Taylor expansion degree.

| Degree | 5 | 10 | 15 | 20 | 25 |
|---|---|---|---|---|---|
| Cora | 88.66±0.3 | 89.21±0.4 | 89.15±0.2 | 89.53±0.3 | 89.28±0.3 |
| Arxiv | 73.14±0.2 | 75.23±0.2 | 74.74±0.2 | 75.06±0.2 | 74.92±0.2 |
| Roman. | 72.67±1.0 | 74.94±1.1 | 74.83±1.1 | 74.92±1.2 | 75.02±1.1 |
| Genius | 90.02±0.3 | 92.40±0.2 | 91.88±0.3 | 91.83±0.2 | 92.05±0.3 |

5.2.2 Main results and discussions.

Effectiveness of TFGNN. Our TFGNN achieves remarkable performance gains on both heterophilic and homophilic graphs. Specifically, across all 13 datasets, TFGNN not only leads in performance but does so with improvements of up to 6.55% over the closest competitor on the Snap-patents dataset. Furthermore, the advantages of TFGNN are significantly more pronounced when evaluated on heterophilic datasets. This trend is corroborated by the numerical findings in Table 4, which reveal TFGNN's enhanced capacity to construct functions that accommodate complex patterns. Existing studies have empirically shown that heterophilic graphs generally require significantly more complex target filters than the low-pass filters used for homophilic graphs [25, 30, 77]. While these complex functions can complicate performance for other methods, TFGNN utilizes its advanced trigonometric filters to navigate these challenges, yielding substantial improvements in heterophilic scenarios.

Scalability and Efficiency. Table 5 presents a comparative analysis of our TFGNN method alongside leading counterparts, with each baseline recognized for its exceptional scalability and efficiency on large graphs.
Notably, TFGNN demonstrates superior performance, significantly exceeding all baselines across every dataset while maintaining efficiency comparable to the top-performing methods. These findings align with our expectations, as the model complexity of TFGNN, in terms of both computation and parameters, is on par with that of other approaches, as detailed in Section 4.4.

5.2.3 Ablation studies.

Ablation studies on K and ω. We conduct ablation studies on the two pivotal hyperparameters, K and ω, associated with our trigonometric filters. Partial results are illustrated in Figure 3, while a more comprehensive analysis can be found in the Appendix. The results reveal a notable trend: for all datasets, the optimal values for K and ω tend to fall within low ranges, specifically K ∈ {2, 4, 6} and ω ∈ {0.2π, 0.3π, 0.5π}. Furthermore, their product K·ω consistently converges to a similar range across all datasets, approximately K·ω ∈ (0.6π, 1.2π). This finding aligns with Theorem 4.1, which indicates that high-degree terms contribute unnecessary complexity. We thus recommend initializing K, ω, and K·ω within these ranges for efficient use of our models, with further fine-tuning as needed for performance optimization.

Ablation studies on D. We perform ablation studies on the degree of Taylor expansion, D. Table 6 shows that while increasing the degree improves performance to a certain extent, accuracy eventually stabilizes. This is consistent with prior studies and can be understood in terms of polynomial approximation: higher-degree orthogonal bases tend to minimize approximation loss; however, beyond an optimal degree, further gains become negligible [60, 67].

Table 7: Graph anomaly detection results. ‡ Improvements are relative to general-purpose methods rather than GAD baselines.
| Type | Method | YelpChi (1%) F1-macro | YelpChi (1%) AUROC | Amazon (1%) F1-macro | Amazon (1%) AUROC | T-Finance (1%) F1-macro | T-Finance (1%) AUROC |
|---|---|---|---|---|---|---|---|
| GAD Models | PC-GNN | 60.55 | 75.29 | 82.62 | 91.61 | 83.40 | **91.85** |
| GAD Models | CARE-GNN | 61.68 | 73.95 | 75.78 | 88.79 | 86.03 | 91.17 |
| GAD Models | GDN | 65.72 | 75.33 | 90.49 | **92.07** | 77.38 | 89.42 |
| GAD-specialized spectral GNNs | BWGNN | **66.52** | 77.23 | 90.28 | 89.19 | 85.56 | 91.38 |
| GAD-specialized spectral GNNs | GHRN | 62.77 | 74.64 | 86.65 | 87.09 | 80.70 | 91.55 |
| General-purpose spectral GNNs | GCN | 50.66 | 54.31 | 69.79 | 85.18 | 75.26 | 87.05 |
| General-purpose spectral GNNs | GPRGNN | 60.45 | 67.44 | 83.71 | 85.28 | 77.53 | 85.69 |
| General-purpose spectral GNNs | OptBasis | 62.03 | 68.32 | 86.12 | 85.02 | 79.28 | 86.22 |
| General-purpose spectral GNNs | AdaptKry | 63.40 | 66.18 | 83.30 | 84.58 | 80.67 | 85.41 |
| General-purpose spectral GNNs | NFGNN | 60.66 | 67.36 | 85.61 | 86.88 | 82.38 | 86.59 |
| Ours | TFGNN | 65.60 | **78.79** | **91.10** | 90.12 | **87.02** | 91.42 |
| | #Improv.‡ | 2.20% | 10.47% | 4.98% | 3.24% | 4.64% | 4.37% |

5.3 Application on graph anomaly detection

We investigate an application example of TFGNN for the graph anomaly detection (GAD) task, which is typically recognized as a binary node classification task (normal vs. abnormal) [53, 62].

5.3.1 Datasets and baselines.

Datasets. We adopt three datasets (YelpChi, Amazon, and T-Finance) with a low label rate of 1% set across all datasets, following [71].

Baselines and model implementations. We include 10 baseline methods, organized into three types below:
• GAD models: PC-GNN [48], CARE-GNN [14], GDN [18].
• GAD-specialized spectral GNNs: BWGNN [71], GHRN [17].
• General-purpose spectral GNNs: GCN [35], GPRGNN [10], OptBasis [24], AdaptKry [30], NFGNN [83].
The specifications for implementing common baselines (PC-GNN, CARE-GNN, BWGNN, GCN) are derived from [71]. In our TFGNN and other general-purpose methods, we utilize a two-layer MLP with 64 hidden units for the feature transformation module, maintaining alignment with the GAD-specialized models. The hyperparameters for TFGNN are optimized as detailed in Section 5.2, while the other baselines follow the configurations outlined in their original papers. More experimental details are in the Appendix.

5.3.2 Main results and discussions.

Improvements on the specific task. Table 7 highlights the #Improv. metric, showing that TFGNN outperforms general-purpose models significantly, achieving increases of up to 11.34% on the YelpChi dataset. This suggests that while general-purpose spectral GNNs can perform well on benchmark node classification tasks, they often underperform in specialized applications. In contrast, TFGNN, with its advanced graph filters, consistently provides notable improvements across both standard and specialized tasks, demonstrating the effectiveness and versatility of our approach.

Comparable to GAD-specialized spectral GNNs. Table 7 indicates that TFGNN's performance rivals that of specialized spectral GNNs for GAD. Models like BWGNN and GHRN, which are built on the same graph-spectrum principles, incorporate specific features aimed at enhancing performance. For example, BWGNN [71] effectively addresses the "right-shift" phenomenon with its customized beta wavelets, while GHRN [17] focuses on filtering out high-frequency components to prune inter-class edges in heterophilic graphs. In contrast, TFGNN offers a unique and effective filtering strategy for GAD tasks, showcasing impressive outcomes. This reflects a promising direction for improving spectral GNNs through the introduction of more advanced polynomial graph filters.

6 Conclusions

In this paper, we address the polynomial selection problem in spectral GNNs, linking polynomial capabilities to their effectiveness. We present the first proof that the construction error of graph convolution layers is bounded by the sum of polynomial approximation errors on function slices, supported by intuitive numerical validations.
This insight motivates the use of "narrow function-preferred" polynomials, leading to the introduction of our advanced trigonometric graph filters. The proposed filters not only demonstrate provable parameter efficiency but also employ a Taylor-based parameter decomposition for streamlined implementation. Building upon this, we introduce TFGNN, a scalable spectral GNN featuring a decoupled architecture. The efficacy of TFGNN is confirmed through benchmark node classification tasks and a practical example in graph anomaly detection, highlighting the adaptability and real-world relevance of our theoretical contributions.

Limitations and future works. Our theoretical framework is grounded in the concept of function slices, which are inherently linked to target filters. However, in practical scenarios, the diversity and variability of target filters can hinder the specificity of our theoretical results, potentially leading to suboptimal solutions. Therefore, a promising future direction is to categorize these filters and analyze their numerical properties, thereby enabling more consistent enhancements in the design of spectral GNNs.

References

[1] Muhammet Balcilar, Guillaume Renton, Pierre Héroux, Benoit Gaüzère, Sébastien Adam, and Paul Honeine. 2021. Analyzing the Expressive Power of Graph Neural Networks in a Spectral Perspective. In International Conference on Learning Representations. https://openreview.net/forum?id=-qh0M9XWxnv
[2] Deyu Bo, Chuan Shi, Lele Wang, and Renjie Liao. 2023. Specformer: Spectral Graph Neural Networks Meet Transformers. In The Eleventh International Conference on Learning Representations. https://openreview.net/forum?id=0pdSt3oyJa1
[3] Deyu Bo, Xiao Wang, Yang Liu, Yuan Fang, Yawen Li, and Chuan Shi. 2023. A Survey on Spectral Graph Neural Networks.
arXiv:2302.05631 [cs.LG]
[4] Salomon Bochner and Komaravolu Chandrasekharan. 1951. Fourier Transforms. The Mathematical Gazette 35, 312 (1951), 140–141. doi:10.2307/3609365
[5] Fedor Borisyuk, Shihai He, Yunbo Ouyang, Morteza Ramezani, Peng Du, Xiaochen Hou, Chengming Jiang, Nitin Pasumarthy, Priya Bannur, Birjodh Tiwana, Ping Liu, Siddharth Dangi, Daqi Sun, Zhoutao Pei, Xiao Shi, Sirou Zhu, Qianqi Shen, Kuang-Hsuan Lee, David Stein, Baolei Li, Haichao Wei, Amol Ghoting, and Souvik Ghosh. 2024. LiGNN: Graph Neural Networks at LinkedIn (KDD '24). Association for Computing Machinery, New York, NY, USA, 4793–4803. doi:10.1145/3637528.3671566
[6] Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann Lecun. 2014. Spectral networks and locally connected networks on graphs. In International Conference on Learning Representations (ICLR 2014).
[7] Claudio Brunelli, Heikki Berg, and David Guevorkian. 2009. Approximating sine functions using variable-precision Taylor polynomials. In 2009 IEEE Workshop on Signal Processing Systems. 57–62. doi:10.1109/SIPS.2009.5336225
[8] Ming Chen, Zhewei Wei, Zengfeng Huang, Bolin Ding, and Yaliang Li. 2020. Simple and Deep Graph Convolutional Networks. In Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 119). PMLR, 1725–1735. https://proceedings.mlr.press/v119/chen20v.html
[9] Zhengdao Chen, Lisha Li, and Joan Bruna. 2019. Supervised Community Detection with Line Graph Neural Networks. In International Conference on Learning Representations. https://openreview.net/forum?id=H1g0Z3A9Fm
[10] Eli Chien, Jianhao Peng, Pan Li, and Olgica Milenkovic. 2021. Adaptive Universal Generalized PageRank Graph Neural Network. In International Conference on Learning Representations. https://openreview.net/forum?id=n6jl7fLxrP
[11] Fan Chung. 1997. Spectral Graph Theory. Vol. 92. CBMS Regional Conference Series in Mathematics. doi:10.1090/cbms/092
[12] T.N. Davidson, Zhi-Quan Luo, and J.F. Sturm. 2002. Linear matrix inequality formulation of spectral mask constraints with applications to FIR filter design. IEEE Transactions on Signal Processing 50, 11 (2002), 2702–2715. doi:10.1109/TSP.2002.804079
[13] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. 2016. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. In Advances in Neural Information Processing Systems, D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett (Eds.), Vol. 29. Curran Associates, Inc.
[14] Yingtong Dou, Zhiwei Liu, Li Sun, Yutong Deng, Hao Peng, and Philip S. Yu. 2020. Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (Virtual Event, Ireland) (CIKM '20). Association for Computing Machinery, New York, NY, USA, 315–324. doi:10.1145/3340531.3411903
[15] Matthias Fey and Jan Eric Lenssen. 2019. Fast Graph Representation Learning with PyTorch Geometric. doi:10.48550/ARXIV.1903.02428
[16] Dengwei Fu and A.N. Willson. 1999. Design of an improved interpolation filter using a trigonometric polynomial. In 1999 IEEE International Symposium on Circuits and Systems (ISCAS), Vol. 4. 363–366. doi:10.1109/ISCAS.1999.780017
[17] Yuan Gao, Xiang Wang, Xiangnan He, Zhenguang Liu, Huamin Feng, and Yongdong Zhang. 2023. Addressing Heterophily in Graph Anomaly Detection: A Perspective of Graph Spectrum. In Proceedings of the ACM Web Conference 2023 (Austin, TX, USA) (WWW '23). Association for Computing Machinery, New York, NY, USA, 1528–1538. doi:10.1145/3543507.3583268
[18] Yuan Gao, Xiang Wang, Xiangnan He, Zhenguang Liu, Huamin Feng, and Yongdong Zhang. 2023. Alleviating Structural Distribution Shift in Graph Anomaly Detection. In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining (Singapore, Singapore) (WSDM '23). Association for Computing Machinery, New York, NY, USA, 357–365. doi:10.1145/3539597.3570377
[19] Johannes Gasteiger, Aleksandar Bojchevski, and Stephan Günnemann. 2019. Predict then Propagate: Graph Neural Networks meet Personalized PageRank. In International Conference on Learning Representations. https://openreview.net/forum?id=H1gL-2A9Ym
[20] Chenghua Gong, Yao Cheng, Xiang Li, Caihua Shan, and Siqiang Luo. 2024. Learning from Graphs with Heterophily: Progress and Future. arXiv:2401.09769 [cs.SI] https://arxiv.org/abs/2401.09769
[21] John G. Proakis. 1975. Digital signal processing. IEEE Transactions on Acoustics, Speech, and Signal Processing 23, 4 (1975), 392–394. doi:10.1109/TASSP.1975.1162707
[22] Lucy J. Gudino, Joseph X. Rodrigues, and S. N. Jagadeesha. 2008. Linear phase FIR filter for narrow-band filtering. In 2008 International Conference on Communications, Circuits and Systems. 776–779. doi:10.1109/ICCCAS.2008.4657886
[23] Yuhe Guo and Zhewei Wei. 2023. Clenshaw Graph Neural Networks. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '23). Association for Computing Machinery, New York, NY, USA, 614–625. doi:10.1145/3580305.3599275
[24] Yuhe Guo and Zhewei Wei. 2023. Graph Neural Networks with Learnable and Optimal Polynomial Bases. In Proceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 202), Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett (Eds.). PMLR, 12077–12097. https://proceedings.mlr.press/v202/guo23i.html
[25] Mingguo He, Zhewei Wei, Zengfeng Huang, and Hongteng Xu. 2021. BernNet: Learning Arbitrary Graph Spectral Filters via Bernstein Approximation. In Advances in Neural Information Processing Systems, A. Beygelzimer, Y. Dauphin, P. Liang, and J. Wortman Vaughan (Eds.). https://openreview.net/forum?id=WigDnV-_Gq
[26] Mingguo He, Zhewei Wei, and Ji-Rong Wen. 2022. Convolutional Neural Networks on Graphs with Chebyshev Approximation, Revisited. In Advances in Neural Information Processing Systems, Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Kyunghyun Cho (Eds.). https://openreview.net/forum?id=jxPJ4QA0KAb
[27] Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang, and Meng Wang. 2020. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation. doi:10.48550/ARXIV.2002.02126
[28] Roland F. Hoskins. 2009. Delta functions: Introduction to generalised functions. Horwood Publishing.
[29] Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open Graph Benchmark: Datasets for Machine Learning on Graphs. In Advances in Neural Information Processing Systems, Vol. 33. 22118–22133. https://proceedings.neurips.cc/paper_files/paper/2020/file/fb60d411a5c5b72b2e7d3527cfc84fd0-Paper.pdf
[30] Keke Huang, Wencai Cao, Hoang Ta, Xiaokui Xiao, and Pietro Liò. 2024. Optimizing Polynomial Graph Filters: A Novel Adaptive Krylov Subspace Approach. In Proceedings of the ACM Web Conference 2024 (Singapore, Singapore) (WWW '24). Association for Computing Machinery, New York, NY, USA, 1057–1068. doi:10.1145/3589334.3645705
[31] Keke Huang, Yu Guang Wang, Ming Li, and Pietro Lio. 2024. How Universal Polynomial Bases Enhance Spectral Graph Neural Networks: Heterophily, Over-smoothing, and Over-squashing. In Forty-first International Conference on Machine Learning. https://openreview.net/forum?id=Z2LH6Va7L2
[32] Kafetzis Ioannis, Moysis Lazaros, and Volos Christos. 2023. Assessing the chaos strength of Taylor approximations of the sine chaotic map. Nonlinear Dynamics 111 (2023), 2755–2778. doi:10.1007/s11071-022-07929-y
[33] E. Stine James and J. Schulte Michael. 1999. The Symmetric Table Addition Method for Accurate Function Approximation. Journal of VLSI signal processing systems for signal, image and video technology 21 (1999), 167–177. doi:10.1023/A:1008004523235
[34] Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. doi:10.48550/ARXIV.1412.6980
[35] Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations. https://openreview.net/forum?id=SJU4ayYgl
[36] B. Lee and N. Burgess. 2003. Some results on Taylor-series function approximation on FPGA. In The Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, 2003, Vol. 2. 2198–2202. doi:10.1109/ACSSC.2003.1292370
[37] Runlin Lei, Zhen Wang, Yaliang Li, Bolin Ding, and Zhewei Wei. 2022. EvenNet: Ignoring Odd-Hop Neighbors Improves Robustness of Graph Neural Networks. In Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh (Eds.), Vol. 35. Curran Associates, Inc., 4694–4706. https://openreview.net/forum?id=SPoiDLr3WE7
[38] Bingheng Li, Erlin Pan, and Zhao Kang. 2024. PC-Conv: Unifying Homophily and Heterophily with Two-Fold Filtering. Proceedings of the AAAI Conference on Artificial Intelligence 38, 12 (Mar. 2024), 13437–13445. doi:10.1609/aaai.v38i12.29246
[39] Guoming Li, Jian Yang, and Shangsong Liang. 2025. ERGNN: Spectral Graph Neural Network With Explicitly-Optimized Rational Graph Filters. arXiv:2412.19106 [cs.LG] https://arxiv.org/abs/2412.19106
[40] Guoming Li, Jian Yang, Shangsong Liang, and Dongsheng Luo. 2024. Elevating Spectral GNNs through Enhanced Band-pass Filter Approximation. arXiv:2404.15354 [eess.SP]
[41] Guoming Li, Jian Yang, Shangsong Liang, and Dongsheng Luo. 2024. Spectral GNN via Two-dimensional (2-D) Graph Convolution. arXiv:2404.04559 [cs.LG]
[42] Xiang Li, Renyu Zhu, Yao Cheng, Caihua Shan, Siqiang Luo, Dongsheng Li, and Weining Qian. 2022. Finding Global Homophily in Graph Neural Networks When Meeting Heterophily. In Proceedings of the 39th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 162), Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato (Eds.). PMLR, 13242–13256. https://proceedings.mlr.press/v162/li22ad.html
[43] Langzhang Liang, Xiangjing Hu, Zenglin Xu, Zixing Song, and Irwin King. 2023. Predicting Global Label Relationship Matrix for Graph Neural Networks under Heterophily. In Advances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (Eds.), Vol. 36. Curran Associates, Inc., 10909–10921. https://proceedings.neurips.cc/paper_files/paper/2023/file/23aa2163dea287441ebebc1295d5b3fc-Paper-Conference.pdf
[44] Ningyi Liao, Siqiang Luo, Xiang Li, and Jieming Shi. 2023. LD2: Scalable Heterophilous Graph Neural Network with Decoupled Embeddings. In Thirty-seventh Conference on Neural Information Processing Systems. https://openreview.net/forum?id=7zkFc9TGKz
[45] Ningyi Liao, Dingheng Mo, Siqiang Luo, Xiang Li, and Pengcheng Yin. 2024. Scalable decoupling graph neural network with feature-oriented optimization. The VLDB Journal 33, 3 (2024), 667–683.
[46] Derek Lim, Felix Matthew Hohne, Xiuyu Li, Sijia Linda Huang, Vaishnavi Gupta, Omkar Prasad Bhalerao, and Ser-Nam Lim. 2021. Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods. In Advances in Neural Information Processing Systems, A. Beygelzimer, Y. Dauphin, P. Liang, and J. Wortman Vaughan (Eds.). https://openreview.net/forum?id=DfGu8WwT0d
[47] Vijay Lingam, Manan Sharma, Chanakya Ekbote, Rahul Ragesh, Arun Iyer, and Sundararajan Sellamanickam. 2023. A Piece-Wise Polynomial Filtering Approach for Graph Neural Networks. In Machine Learning and Knowledge Discovery in Databases. Springer International Publishing, Cham, 412–452.
[48] Yang Liu, Xiang Ao, Zidi Qin, Jianfeng Chi, Jinghua Feng, Hao Yang, and Qing He. 2021. Pick and Choose: A GNN-based Imbalanced Learning Approach for Fraud Detection. In Proceedings of the Web Conference 2021 (Ljubljana, Slovenia) (WWW '21). Association for Computing Machinery, New York, NY, USA, 3168–3177. doi:10.1145/3442381.3449989
[49] Kangkang Lu, Yanhua Yu, Hao Fei, Xuan Li, Zixuan Yang, Zirui Guo, Meiyu Liang, Mengran Yin, and Tat-Seng Chua. 2024. Improving Expressive Power of Spectral Graph Neural Networks with Eigenvalue Correction. Proceedings of the AAAI Conference on Artificial Intelligence 38, 13 (Mar. 2024), 14158–14166. doi:10.1609/aaai.v38i13.29326
[50] Sitao Luan, Chenqing Hua, Qincheng Lu, Liheng Ma, Lirong Wu, Xinyu Wang, Minkai Xu, Xiao-Wen Chang, Doina Precup, Rex Ying, Stan Z. Li, Jian Tang, Guy Wolf, and Stefanie Jegelka. 2024. The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges. arXiv:2407.09618 [cs.LG] https://arxiv.org/abs/2407.09618
[51] Sitao Luan, Chenqing Hua, Qincheng Lu, Jiaqi Zhu, Mingde Zhao, Shuyuan Zhang, Xiao-Wen Chang, and Doina Precup. 2022. Revisiting Heterophily For Graph Neural Networks. In Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh (Eds.), Vol. 35. Curran Associates, Inc., 1362–1375. https://proceedings.neurips.cc/paper_files/paper/2022/file/092359ce5cf60a80e882378944bf1be4-Paper-Conference.pdf
[52] Exl Lukas, J. Mauser Norbert, and Zhang Yong. 2016. Accurate and efficient computation of nonlocal potentials based on Gaussian-sum approximation. J. Comput. Phys. 327 (2016), 629–642. doi:10.1016/j.jcp.2016.09.045
[53] Xiaoxiao Ma, Jia Wu, Shan Xue, Jian Yang, Chuan Zhou, Quan Z. Sheng, Hui Xiong, and Leman Akoglu. 2023. A Comprehensive Survey on Graph Anomaly Detection With Deep Learning. IEEE Transactions on Knowledge and Data Engineering 35, 12 (2023), 12012–12038. doi:10.1109/TKDE.2021.3118815
[54] P. Del Moral and A. Niclas. 2018. A Taylor expansion of the square root matrix function. J. Math. Anal. Appl. 465, 1 (2018), 259–266. doi:10.1016/j.jmaa.2018.05.005
[55] Mark Newman. 2010. Networks: An Introduction. Oxford University Press. doi:10.1093/acprof:oso/9780199206650.001.0001
[56] Peter Nilsson, Ateeq Ur Rahman Shaik, Rakesh Gangarajaiah, and Erik Hertz. 2014. Hardware implementation of the exponential function using Taylor series. In 2014 NORCHIP. 1–4. doi:10.1109/NORCHIP.2014.7004740
[57] Antonio Ortega, Pascal Frossard, Jelena Kovačević, José M. F. Moura, and Pierre Vandergheynst. 2018. Graph Signal Processing: Overview, Challenges, and Applications. Proc. IEEE 106, 5 (2018), 808–828. doi:10.1109/JPROC.2018.2820126
[58] Hongbin Pei, Bingzhe Wei, Kevin Chen-Chuan Chang, Yu Lei, and Bo Yang. 2020. Geom-GCN: Geometric Graph Convolutional Networks. In International Conference on Learning Representations. https://openreview.net/forum?id=S1e2agrFvS
[59] Chen Peng, Villa Umberto, and Ghattas Omar. 2019. Taylor approximation and variance reduction for PDE-constrained optimal control under uncertainty. J. Comput. Phys. 385 (2019), 163–186. doi:10.1016/j.jcp.2019.01.047
[60] George M. Phillips. 2003. Interpolation and approximation by polynomials. Vol. 14. Springer New York. doi:10.1007/b97417
[61] Oleg Platonov, Denis Kuznedelev, Michael Diskin, Artem Babenko, and Liudmila Prokhorenkova. 2023. A critical look at the evaluation of GNNs under heterophily: Are we really making progress?. In The Eleventh International Conference on Learning Representations. https://openreview.net/forum?id=tJbbQfw-5wv
[62] Hezhe Qiao, Hanghang Tong, Bo An, Irwin King, Charu Aggarwal, and Guansong Pang. 2024. Deep Graph Anomaly Detection: A Survey and New Perspectives. arXiv:2409.09957 [cs.LG] https://arxiv.org/abs/2409.09957
[63] Aliaksei Sandryhaila and José M. F. Moura. 2013. Discrete Signal Processing on Graphs. IEEE Transactions on Signal Processing 61, 7 (2013), 1644–1656. doi:10.1109/TSP.2013.2238935
[64] Aliaksei Sandryhaila and José M. F. Moura. 2013. Discrete signal processing on graphs: Graph filters. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. 6163–6166. doi:10.1109/ICASSP.2013.6638849
[65] A. Sharma and A. K. Varma. 1965. Trigonometric interpolation. Duke Mathematical Journal 32, 2 (1965), 341–357. doi:10.1215/S0012-7094-65-03235-7
[66] David I. Shuman, Sunil K. Narang, Pascal Frossard, Antonio Ortega, and Pierre Vandergheynst. 2013. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine 30, 3 (2013), 83–98. doi:10.1109/MSP.2012.2235192
[67] Gordon K. Smyth. 1998. Polynomial approximation. Encyclopedia of Biostatistics 13 (1998).
[68] Yunchong Song, Chenghu Zhou, Xinbing Wang, and Zhouhan Lin. 2023. Ordered GNN: Ordering Message Passing to Deal with Heterophily and Over-smoothing. In The Eleventh International Conference on Learning Representations. https://openreview.net/forum?id=wKPmPBHSnT6
[69] Gilbert Strang. 2006. Linear algebra and its applications. Belmont, CA: Thomson, Brooks/Cole.
[70] Jiaqi Sun, Lin Zhang, Guangyi Chen, Peng Xu, Kun Zhang, and Yujiu Yang. 2023. Feature Expansion for Graph Neural Networks. In Proceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 202). PMLR, 33156–33176.
[71] Jianheng Tang, Jiajin Li, Ziqi Gao, and Jia Li. 2022. Rethinking Graph Neural Networks for Anomaly Detection. In Proceedings of the 39th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 162), Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato (Eds.). PMLR, 21076–21089. https://proceedings.mlr.press/v162/tang22b.html
[72] Qian Tao, Zhen Wang, Wenyuan Yu, Yaliang Li, and Zhewei Wei. 2023. LON-GNN: Spectral GNNs with Learnable Orthonormal Basis. arXiv:2303.13750 [cs.LG]
[73] Gabriel Taubin, Tong Zhang, and Gene Golub. 1996. Optimal surface smoothing as filter design. In ECCV '96. Springer Berlin Heidelberg, 283–292.
[74] Xiyuan Wang and Muhan Zhang. 2022. How Powerful are Spectral Graph Neural Networks. In Proceedings of the 39th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 162), Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato (Eds.). PMLR, 23341–23362. https://proceedings.mlr.press/v162/wang22am.html
[75] Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. 2021. A Comprehensive Survey on Graph Neural Networks. IEEE Transactions on Neural Networks and Learning Systems 32, 1 (2021), 4–24. doi:10.1109/TNNLS.2020.2978386
[76] Lianghao Xia, Yong Xu, Chao Huang, Peng Dai, and Liefeng Bo. 2021. Graph Meta Network for Multi-Behavior Recommendation. In The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 757–766. doi:10.1145/3404835.3462972
[77] Junjie Xu, Enyan Dai, Dongsheng Luo, Xiang Zhang, and Suhang Wang. 2023. Learning Graph Filters for Spectral GNNs via Newton Interpolation. arXiv:2310.10064 [cs.LG]
[78] Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, and Kenji Kawaguchi. 2021. Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 11592–11602. https://proceedings.mlr.press/v139/xu21k.html
[79] Zhilin Yang, William W. Cohen, and Ruslan Salakhutdinov. 2016. Revisiting Semi-Supervised Learning with Graph Embeddings. doi:10.48550/ARXIV.1603.08861
[80] Pavel Zahradnik and Miroslav Vlcek. 2012. Perfect Decomposition Narrow-Band FIR Filter Banks. IEEE Transactions on Circuits and Systems II: Express Briefs 59, 11 (2012), 805–809. doi:10.1109/TCSII.2012.2218453
[81] Yuan Zhang, Dong Wang, and Yan Zhang. 2019. Neural IR Meets Graph Embedding: A Ranking Model for Product Search. In The World Wide Web Conference (San Francisco, CA, USA) (WWW '19). Association for Computing Machinery, New York, NY, USA, 2390–2400. doi:10.1145/3308558.3313468
[82] Ruijie Zhao and David B. Tay. 2023. Minimax design of two-channel critically sampled graph QMF banks. Signal Processing 212 (2023), 109129. doi:10.1016/j.sigpro.2023.109129
[83] Shuai Zheng, Zhenfeng Zhu, Zhizhe Liu, Youru Li, and Yao Zhao. 2023. Node-Oriented Spectral Filtering for Graph Neural Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 46, 1 (2023), 388–402. doi:10.1109/TPAMI.2023.3324937
[84] Xin Zheng, Yi Wang, Yixin Liu, Ming Li, Miao Zhang, Di Jin, Philip S. Yu, and Shirui Pan. 2024. Graph Neural Networks for Graphs with Heterophily: A Survey. arXiv:2202.07082 [cs.LG] https://arxiv.org/abs/2202.07082
[85] Yun Zhiwei and Zhang Wei. 2017. Shtukas and the Taylor expansion of L-functions. Annals of Mathematics 186, 3 (2017), 767–911. doi:10.4007/annals.2017.186.3.2
[86] Yu Zhou, Haixia Zheng, Xin Huang, Shufeng Hao, Dengao Li, and Jumin Zhao. 2022. Graph Neural Networks: Taxonomy, Advances, and Trends. ACM Transactions on Intelligent Systems and Technology 13, 1, Article 15 (Jan. 2022), 54 pages.
doi:10.1145/3495161 [87] Jiong Zhu, Y ujun Y an, Lingxiao Zhao, Mark Heimann, Leman Akoglu, and Danai Koutra. 2020. Beyond Homophily in Graph Neural Netw orks: Current Limitations Polynomial Selection in Spectral Graph Neural Networks: An Error-Sum of Function Slices Approach WW W ’25, April 28-May 2, 2025, Sydney , NSW , Australia and Eective Designs. In Advances in Neural Information Processing Systems , V ol. 33. Curran Associates, Inc., 7793–7804. https://proceedings.neurips.cc/ paper_les/paper/2020/le/58ae23d878a47004366189884c2f8440- Pap er .p df [88] A. Zygmund and Robert Feerman. 2003. Trigonometric Series (3 ed.). Cambridge University Press. doi:10.1017/CBO9781316036587 A Proof of Lemma 3.4 Proof. W e establish this inequality by pro ving its right-hand and left-hand sides independently . Proof of right hand side . For convenience , we dene the L 2 norm of a function 𝑔 over the interval [ 0 , 2 ] , denoted by ∥ 𝑔 ∥ 2 , as follows: ∥ 𝑔 ∥ 2 ≜ ( ∫ 2 0 | 𝑔 ( 𝑥 ) | 2 𝑑 𝑥 ) 1 2 (17) Using the norm expression dened earlier , and recalling the expr es- sion for 𝜖 from Eq. 6, w e can derive the follo wing inequalities by applying the Cauchy-Schwarz inequality: 𝜖 = ∥ 𝑓 ( 𝑥 ) − T 0: 𝐷 ( 𝑥 ; 𝑓 ) ∥ 2 2 , = ∥ 𝑛  𝑠 = 1 𝑓 𝑠 ( 𝑥 ) − 𝑛  𝑠 = 1 T 0: 𝐷 ( 𝑥 ; 𝑓 𝑠 ) ∥ 2 2 , (18) ≤ ( 𝑛  𝑠 = 1 ∥ 𝑓 𝑠 ( 𝑥 ) − T 0: 𝐷 ( 𝑥 ; 𝑓 𝑠 ) ∥ 2 ) 2 = ( 𝑛  𝑠 = 1 √ 𝜖 𝑠 ) 2 , (19) which is our right-hand side. Proof of Le-hand side. T o proceed without loss of generality , we consider 𝑓 to be nonnegative over the entire interval. Recalling the denition of 𝜖 from Eq. 6, it follows that 𝜖 = ∥ T 0: 𝐷 ( 𝑥 ; 𝑓 ) − 𝑓 ( 𝑥 ) ∥ 2 2 , = ∥ 𝑛  𝑠 = 1 𝑓 𝑠 ( 𝑥 ) − 𝑛  𝑠 = 1 T 0: 𝐷 ( 𝑥 ; 𝑓 𝑠 ) ∥ 2 2 , (20) = 2  1 ≤ 𝑝 ≤ 𝑞 ≤ 𝑛 ∥  𝑑 𝑒 𝑡     𝑓 𝑝 ( 𝑥 ) 𝑓 𝑞 ( 𝑥 ) T 0: 𝐷 ( 𝑥 ; 𝑓 𝑞 ) T 0: 𝐷 ( 𝑥 ; 𝑓 𝑝 )     ∥ 2 2 + 𝑛  𝑠 = 1 √ 𝜖 𝑠 2 , (21) ≥ 0 + 𝑛  𝑠 = 1 √ 𝜖 𝑠 2 = 𝑛  𝑠 = 1 𝜖 𝑠 , (22) which is our left-hand side. 
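The two bounds derived above can be sanity-checked numerically. The sketch below is illustrative only: a least-squares polynomial fit stands in for the truncation operator $\mathcal{T}_{0:D}$ (least-squares fitting is linear in the data, so $\mathcal{T}(f) = \sum_s \mathcal{T}(f_s)$ holds exactly), and only the always-valid right-hand bound is asserted; the left-hand bound additionally relies on the slice conditions assumed in the lemma.

```python
import numpy as np

# Grid over [0, 2] and a smooth nonnegative target filter f.
x = np.linspace(0.0, 2.0, 2001)
dx = x[1] - x[0]
f = np.exp(-20 * (x - 0.5) ** 2) + np.exp(-20 * (x - 1.5) ** 2)

# Split f into n disjoint slices f_s: each agrees with f on its window, 0 elsewhere.
n = 4
edges = np.linspace(0.0, 2.0, n + 1)
slices = [np.where((x >= edges[s]) & (x < edges[s + 1]), f, 0.0) for s in range(n)]
slices[-1] = np.where(x >= edges[-2], f, 0.0)  # include the right endpoint x = 2

# Degree-D least-squares polynomial fit as a stand-in for T_{0:D}
# (fit on the shifted variable x - 1 in [-1, 1] for better conditioning).
D = 10
def fit(g):
    return np.polyval(np.polyfit(x - 1.0, g, D), x - 1.0)

def sq_norm(g):  # squared L2 norm over [0, 2]
    return np.sum(g ** 2) * dx

eps = sq_norm(f - fit(f))                          # overall error
eps_s = [sq_norm(fs - fit(fs)) for fs in slices]   # per-slice errors

# Right-hand side of Lemma 3.4 (triangle/Cauchy-Schwarz, always valid):
upper = sum(np.sqrt(e) for e in eps_s) ** 2
assert eps <= upper + 1e-9
print(sum(eps_s), eps, upper)  # lower bound requires the lemma's slice conditions
```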
Thus, combining the two parts of the proof above, we confirm that Lemma 3.4 holds for the continuous form of the error. □

Adaptation to the discrete error form. The right-hand side carries over because the Cauchy–Schwarz inequality applies equally to the discrete version of the norm inequalities; the left-hand side, which rests solely on fundamental non-negativity relations, also holds in the discrete setting. Consequently, Lemma 3.4 can be directly adapted to the discrete scenario.

B Proof of Theorem 3.5

Proof. We establish this inequality by proving its right-hand and left-hand sides independently.

Proof of right-hand side. Recalling the expression of $\xi$ from Eq. 8, we can derive the following inequality using the submultiplicative property of the Frobenius norm:

$$\xi = \bigl\| \boldsymbol{U} \operatorname{diag}\bigl( \mathcal{T}_{0:D}(\boldsymbol{\lambda}; f) - f(\boldsymbol{\lambda}) \bigr) \boldsymbol{U}^T \boldsymbol{X} \boldsymbol{W} \bigr\|_F \le \bigl\| \boldsymbol{U} \operatorname{diag}\bigl( \mathcal{T}_{0:D}(\boldsymbol{\lambda}; f) - f(\boldsymbol{\lambda}) \bigr) \boldsymbol{U}^T \bigr\|_F \cdot \| \boldsymbol{X} \|_F \cdot \| \boldsymbol{W} \|_F \quad (23)$$

$$\le r \cdot \bigl\| \boldsymbol{U} \operatorname{diag}\bigl( \mathcal{T}_{0:D}(\boldsymbol{\lambda}; f) - f(\boldsymbol{\lambda}) \bigr) \boldsymbol{U}^T \bigr\|_F \cdot \| \boldsymbol{X} \|_F. \quad (24)$$

Note that $\boldsymbol{U}$ is an orthogonal matrix, which does not change the Frobenius norm of any matrix it multiplies. Thus, the inequality above can be further derived as:

$$\xi \le r \cdot \bigl\| \boldsymbol{U} \operatorname{diag}\bigl( \mathcal{T}_{0:D}(\boldsymbol{\lambda}; f) - f(\boldsymbol{\lambda}) \bigr) \boldsymbol{U}^T \bigr\|_F \cdot \| \boldsymbol{X} \|_F = r \cdot \bigl\| \operatorname{diag}\bigl( \mathcal{T}_{0:D}(\boldsymbol{\lambda}; f) - f(\boldsymbol{\lambda}) \bigr) \bigr\|_F \cdot \| \boldsymbol{X} \|_F \quad (25)$$

$$= r \cdot \epsilon \cdot \| \boldsymbol{X} \|_F \quad (26)$$

$$\le r \, \| \boldsymbol{X} \|_F \Bigl( \sum_{s=1}^{n} \sqrt{\epsilon_s} \Bigr)^2, \quad (27)$$

which is the right-hand side.

Proof of left-hand side. Using basic matrix perturbation theory, we can derive the following inequality:

$$\xi = \bigl\| \boldsymbol{U} \operatorname{diag}\bigl( \mathcal{T}_{0:D}(\boldsymbol{\lambda}; f) - f(\boldsymbol{\lambda}) \bigr) \boldsymbol{U}^T \boldsymbol{X} \boldsymbol{W} \bigr\|_F \ge \delta_{\boldsymbol{X}} \delta_{\boldsymbol{W}} \bigl\| \boldsymbol{U} \operatorname{diag}\bigl( \mathcal{T}_{0:D}(\boldsymbol{\lambda}; f) - f(\boldsymbol{\lambda}) \bigr) \boldsymbol{U}^T \bigr\|_F \quad (28)$$

$$\ge \delta_{\boldsymbol{X}} \delta_{\boldsymbol{W}} \sum_{s=1}^{n} \epsilon_s, \quad (29)$$

which is the left-hand side. Thus, combining the two parts of the proof above, we confirm that Theorem 3.5 holds. □

C Proof of Theorem 4.1

Proof.
To begin with, note that $\alpha_k$ and $\beta_k$ can be computed as follows:

$$\alpha_k = \frac{\omega}{\pi} \int_0^{\frac{2\pi}{\omega}} f(x) \sin(k\omega x) \, dx, \qquad \beta_k = \frac{\omega}{\pi} \int_0^{\frac{2\pi}{\omega}} f(x) \cos(k\omega x) \, dx. \quad (30)$$

We alternatively consider a real function $f'(x)$ defined as follows:

$$f'(x) = \begin{cases} f(x), & x \in \bigl[ 0, \frac{2\pi}{\omega} \bigr]; \\ 0, & \text{otherwise}. \end{cases} \quad (31)$$

Since $f'(x)$ satisfies the Dirichlet conditions [28], the Fourier transform of $f'(x)$ exists according to [21] and is defined as follows:

$$F'(\Omega) = \int_{-\infty}^{+\infty} f'(x) e^{-i\Omega x} \, dx = \int_{-\infty}^{+\infty} f'(x) \bigl[ \cos(\Omega x) - i \sin(\Omega x) \bigr] \, dx, \quad (32)$$

where $\Omega$ denotes the variable in the frequency domain, and $i$ is the imaginary unit. Furthermore, $f' \in L^1(\mathbb{R}^n)$ is an integrable function, so $f'(x)$ satisfies the Riemann–Lebesgue lemma, stated as follows:

Theorem C.1 (Riemann–Lebesgue lemma [4]). Let $g \in L^1(\mathbb{R}^n)$ be an integrable function, and let $G$ be the Fourier transform of $g$. Then $G$ vanishes at infinity:

$$\lim_{|\Omega| \to \infty} |G(\Omega)| = 0. \quad (33)$$

Thus, the limit of $F'(\Omega)$ as $\Omega$ approaches $+\infty$ equals 0:

$$\lim_{\Omega \to +\infty} F'(\Omega) = \lim_{\Omega \to +\infty} \underbrace{\int_{-\infty}^{+\infty} f'(x) \bigl[ \cos(\Omega x) - i \sin(\Omega x) \bigr] \, dx}_{\text{Formula 1}} = 0. \quad (34)$$

Since $f'(x)$ is a real function, Formula 1 equals 0 if and only if the following equations hold:

$$\lim_{\Omega \to +\infty} \int_{-\infty}^{+\infty} f'(x) \cos(\Omega x) \, dx = 0, \qquad \lim_{\Omega \to +\infty} \int_{-\infty}^{+\infty} f'(x) \sin(\Omega x) \, dx = 0. \quad (35)$$

Combining the definition of $f'(x)$ in Eq. 31, Eq. 35 can be further derived as follows:

$$\lim_{\Omega \to +\infty} \int_0^{\frac{2\pi}{\omega}} f(x) \cos(\Omega x) \, dx = 0, \qquad \lim_{\Omega \to +\infty} \int_0^{\frac{2\pi}{\omega}} f(x) \sin(\Omega x) \, dx = 0. \quad (36)$$

Finally, replacing $\Omega, \Omega \to +\infty$ with $k\omega, k \to +\infty$ in Eq. 36, and further considering Eq. 30, we obtain the following equations:

$$\lim_{k \to +\infty} \alpha_k = \frac{\omega}{\pi} \cdot \lim_{k \to +\infty} \int_0^{\frac{2\pi}{\omega}} f(x) \sin(k\omega x) \, dx = 0, \qquad \lim_{k \to +\infty} \beta_k = \frac{\omega}{\pi} \cdot \lim_{k \to +\infty} \int_0^{\frac{2\pi}{\omega}} f(x) \cos(k\omega x) \, dx = 0, \quad (37)$$

and the proof is completed. □

D Related Works

D.1 Spectral-based graph neural networks

Spectral-based graph neural networks form a unique branch of GNNs designed to process graph-structured data by applying graph filters to execute graph convolution (filtering) operations [3]. The pioneering spectral GNN, SpectralCNN [6], was developed as a generalization of convolutional neural networks for graph data, using principles from spectral graph theory. Subsequent refinements, such as ChebNet [13] and GCN [35], have built upon this foundation.

In recent advancements, the design of spectral GNNs has increasingly focused on incorporating various graph filters, which are central to their functionality. Polynomial approximation has become the prevailing approach for constructing these filters, providing both enhanced performance and operational efficiency. As a result, many contemporary spectral GNNs are predominantly defined by polynomial frameworks. For instance, GPRGNN [10] introduces a monomial-based graph filter, interpreted as a generalized PageRank algorithm. BernNet [25] leverages Bernstein polynomials to create nonnegative graph filters, demonstrating significant effectiveness in real-world applications. JacobiConv [74] unifies different methods by employing Jacobi polynomials. OptBasis [24] improves the design of spectral GNNs by introducing filters with optimal polynomial bases. UniFilter [31] introduces the notion of universal bases, bridging polynomial filters with graph heterophily.

D.2 Node classification with heterophily

In recent years, heterophilic graphs have drawn considerable interest in the field of graph learning.
Unlike traditional homophilic graphs, where linked nodes usually share the same label, heterophilic graphs connect nodes with contrasting labels. This unique structure presents significant challenges for graph neural networks (GNNs) [20, 50, 84], which are typically designed for homophilic settings. To address these challenges, a range of GNNs tailored to heterophily have emerged. For example, H2GCN [87] introduces specialized mechanisms for embedding nodes in heterophilic environments, OrderGNN [68] restructures message-passing to account for heterophily, and LRGNN [43] leverages a global label relationship matrix to improve performance under heterophily.

Addressing heterophily with spectral GNNs. Most recently, spectral-based GNNs have shown promise in addressing these challenges by learning dataset-specific filters that extend beyond the standard low-pass filters used in conventional GNNs. In doing so, spectral GNNs demonstrate improved performance on heterophilic graphs, achieving superior results in node classification under heterophily [10, 24–26, 31, 41, 74].

D.3 Graph anomaly detection

Graph-based anomaly detection (GAD) is a specialized task within anomaly detection, aimed at identifying anomalies within graph-structured data [53, 62]. The primary goal in GAD is to detect anomalous nodes (outliers) in the graph by leveraging a limited set of labeled samples, including both anomalous and normal nodes. Effectively, GAD can be viewed as a binary node classification task, where the classes represent anomaly and normalcy. The recent success of graph neural networks (GNNs) in node classification has spurred the development of GAD-specialized GNN methods, such as CARE-GNN [14], PC-GNN [48], and GDN [18], each significantly enhancing detection performance.

Spectral GNNs in GAD. Building on the success of GNN-based approaches for GAD, recent studies leveraging spectral GNNs have yielded promising results. By framing GAD through graph spectrum analysis, these methods introduce novel perspectives on the problem. For example, BWGNN [71] utilizes beta graph wavelets for signal filtering, effectively addressing the "right-shift" phenomenon in GAD. Similarly, GHRN [17] enhances GAD by pruning inter-class edges, focusing on high-frequency graph components to improve detection performance.

E Experimental Details

This section outlines the experimental settings relevant to the studies conducted in Section 3.2 and Section 5. Experiments are conducted using an NVIDIA Tesla V100 GPU with 32GB of memory, running on Ubuntu 20.04 with CUDA 11.8.

E.1 Experimental details of numerical validation

Descriptions of target functions. The six target functions employed in the numerical experiments are defined by the expressions presented in Table 8.

Random graph construction. We construct random graphs using the Erdős-Rényi model, specifically denoted as G(n, p) [55]. In our experiments, we set the number of nodes n to 50,000 and the edge creation probability p to 0.5.

Random node feature construction. We construct random node features X drawn from a Gaussian distribution. Each entry in the feature matrix X is independently sampled from the standard normal distribution N(0, 1).

Experimental settings. To start, we randomly generate ten pairs of graphs and features, denoted as (G_1, X_1), (G_2, X_2), ..., (G_10, X_10). For each pair (G_j, X_j), we apply six different graph filters, resulting in six filtered outputs: Y_j^1, Y_j^2, ..., Y_j^6.
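Generating one filtered output amounts to an exact spectral convolution. A minimal sketch (using a much smaller graph than the paper's n = 50,000 so that a dense eigendecomposition stays cheap; f_1 is the first target function from Table 8):

```python
import numpy as np

rng = np.random.default_rng(0)

# Small stand-in for the paper's Erdos-Renyi G(n, p) setup (n = 50,000, p = 0.5 there).
n, p = 200, 0.5
A = (rng.random((n, n)) < p).astype(float)
A = np.triu(A, 1)
A = A + A.T                                   # symmetric adjacency, no self-loops

# Symmetric normalized Laplacian; its eigenvalues lie in [0, 2].
d = A.sum(axis=1)
Dinv = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
L = np.eye(n) - Dinv @ A @ Dinv
lam, U = np.linalg.eigh(L)

# Gaussian node features and one target filter (f_1 from Table 8).
X = rng.standard_normal((n, 16))
f1 = lambda t: np.exp(-20 * (t - 0.5) ** 2) + np.exp(-20 * (t - 1.5) ** 2)

# Exact spectral convolution: Y = U diag(f(lambda)) U^T X.
Y = U @ (f1(lam)[:, None] * (U.T @ X))
print(Y.shape)  # (200, 16)
```

Dense eigendecomposition is O(n^3) and only feasible here because the sketch is small; the paper's polynomial filters exist precisely to avoid this step on large graphs.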
This process performs graph convolution on X using the target functions. As discussed in Section 3.2, we pursue two tasks: the first involves approximating function slices, while the second focuses on filter learning.

• Approximation of function slices. We generate 50,000 function slices for each target function based on the eigenvalues of the graph G_j. Various polynomial bases are employed for the approximation, with the minimum sum of squared errors (SSE) serving as the evaluation metric. The final results are averaged over the ten randomly generated graphs.

• Graph filter learning. We implement a one-layer linear spectral GNN that operates without a weight matrix W, utilizing various polynomial bases. The input consists of pairs (G_j, X_j) used to approximate the target output Y. The learned graph filters are then compared against the target filters, using the Frobenius norm as the evaluation metric. Finally, the results are averaged across the ten randomly generated graphs to ensure robustness.

E.2 Experimental details for node classification

Dataset statistics. The statistics of the 13 datasets used in Section 5.2 are provided in Tables 9 and 10.

Baseline implementations. We provide code URLs for the public implementations of all baselines referenced in this paper. In particular, for the well-established baselines GCN and ChebNet, we employ standardized implementations based on previous research [24–26, 30, 41, 70, 74, 83]; for the remaining baselines, we resort to the publicly released code, accessible via the URLs below.
• H2GCN: https://github.com/GemsLab/H2GCN
• GloGNN: https://github.com/RecklessRonan/GloGNN
• LINKX: https://github.com/CUAI/Non-Homophily-Large-Scale
• OrderGNN: https://github.com/lumia-group/orderedgnn
• LRGNN: https://github.com/Jinx-byebye/LRGNN
• GCN: https://github.com/ivam-he/ChebNetII
• SGC: https://github.com/ivam-he/ChebNetII
• GCNII: https://github.com/chennnM/GCNII
• ChebNet: https://github.com/ivam-he/ChebNetII
• ACMGCN: https://github.com/SitaoLuan/ACM-GNN
• Specformer: https://github.com/DSL-Lab/Specformer
• GPRGNN: https://github.com/jianhao2016/GPRGNN
• BernNet: https://github.com/ivam-he/BernNet
• ChebNetII: https://github.com/ivam-he/ChebNetII
• OptBasis: https://github.com/yuziGuo/FarOptBasis
• NFGNN: https://github.com/SsGood/NFGNN
• JacobiConv: https://github.com/GraphPKU/JacobiConv
• AdaptKry: https://github.com/kkhuang81/AdaptKry
• UniFilter: https://github.com/kkhuang81/UniFilter

Implementation of TFGNN. As introduced in Section 4.3, TFGNN is implemented in two distinct configurations to accommodate graphs of varying sizes. For the graphs detailed in Table 3, we utilize the architecture represented by Eq. (15); for the larger graphs listed in Table 5, we employ the architecture shown in Eq. (16). The MLP architecture within TFGNN is dataset-specific. For medium-sized graphs (Cora, Citeseer, Pubmed, Roman-empire, Amazon-ratings, and Questions), we use a two-layer MLP with 64 hidden units. In contrast, larger datasets are assigned three-layer MLPs with varying hidden units: 128 for Gamers and Genius, 256 for Snap-patents and Pokec, 512 for Ogbn-arxiv, and 1024 for Ogbn-papers100M.

To ensure experimental fairness, we fix the order of the Taylor-based parameter decomposition, denoted as D, to 10, aligning with baselines such as GPRGNN and ChebNetII. We employ a grid search to optimize the weight decay over {5e-1, 5e-2, 5e-3, 5e-4, 0}, the learning rate over {0.5, 0.1, 0.05, 0.01, 0.005, 0.001}, the dropout rate over {0, 0.2, 0.5, 0.7, 0.9}, ω within {0.2π, 0.3π, 0.5π, 0.7π}, and K from {2, 4, 6, 8, 10, 15, 20}.

Model training and testing. We follow the dataset splitting protocols established in the literature. For the Cora, Citeseer, and Pubmed datasets, we utilize the established 60%/20%/20% train/val/test split, which has been widely adopted across numerous studies [24–26, 30, 31, 38, 41, 70, 74]. For the Roman-empire, Amazon-ratings, and Questions datasets, we implement a 50%/25%/25% train/val/test split, aligning with the protocols outlined in their original publication [61]. This 50%/25%/25% split strategy is also applied to the Gamers, Genius, Snap-patents, and Pokec datasets, as recommended in [46]. Finally, for Ogbn-arxiv, Ogbn-products, and Ogbn-papers100M, we adopt the fixed splits defined in the original OGB dataset paper [29].

Models are trained for a maximum of 1,000 epochs, with early stopping after 200 epochs without improvement in validation accuracy. To handle exceptionally large graphs, we employ a mini-batch training strategy using batches of 20,000 nodes. The optimization process employs the Adam optimizer [34]. For each dataset, we generate 10 random node splits and perform 10 random initializations for each baseline on these splits. This process yields a total of 100 evaluations per dataset. The reported results for each baseline represent the average of these 100 evaluations.

Table 8: Mathematical expressions of six target functions.

  f_1(x) = e^{-20(x-0.5)^2} + e^{-20(x-1.5)^2}
  f_2(x) = e^{-100(x-0.8)^2} + e^{-100(x-1.2)^2} + 0.5 (1 + cos(2πx)) for x ∈ [0, 0.5] ∪ [1.5, 2];
           e^{-100(x-0.8)^2} + e^{-100(x-1.2)^2}                     for x ∈ (0.5, 1.5)
  f_3(x) = e^{-100(x-0.5)^2} + e^{-100(x-1.5)^2} + 1.5 e^{-50(x-1)^2}
  f_4(x) = e^{-100 x^2} + e^{-100(x-2)^2}
  f_5(x) = 1 - e^{-10 x^2}
  f_6(x) = e^{-10(x-0.4)^2} + 2 e^{-10(x-1.5)^2}

Table 9: Statistics for medium-to-large datasets, with # Edge homo indicating the edge homophily measure from [87].

  Dataset         # Nodes   # Edges     # Features  # Classes  # Edge homo [87]
  Cora            2,708     5,278       1,433       7          0.81
  CiteSeer        3,327     4,552       3,703       6          0.74
  PubMed          19,717    44,324      500         5          0.80
  Ogbn-arxiv      169,343   1,157,799   128         40         0.65
  Roman-empire    22,662    32,927      300         18         0.05
  Amazon-ratings  24,492    93,050      300         5          0.38
  Questions       48,921    153,540     301         2          0.84
  Gamers          168,114   6,797,557   7           2          0.55
  Genius          421,961   922,868     12          2          0.62

Table 10: Statistics for exceptionally large datasets. # Edge homo for Ogbn-papers100M is unavailable due to runtime exceedance.

  Dataset          # Nodes      # Edges        # Features  # Classes  # Edge homo [87]
  Ogbn-products    2,449,029    61,859,140     100         47         0.81
  Ogbn-papers100M  111,059,956  1,615,685,872  128         172        -
  Snap-patents     2,923,922    13,975,788     269         5          0.07
  Pokec            1,632,803    30,622,564     65          2          0.45

E.3 Experimental details for graph anomaly detection

Dataset statistics. Table 11 presents the statistics of the datasets used in Section 5.3.

Table 11: Statistics of datasets utilized for graph anomaly detection. # Anomaly represents the rate of abnormal nodes.

  Dataset    # Nodes  # Edges     # Features  # Anomaly
  YelpChi    45,954   3,846,979   32          14.53%
  Amazon     11,944   4,398,392   25          6.87%
  T-Finance  39,357   21,222,543  10          4.58%

Baseline implementations. We provide code URLs to the official implementations of all baseline models referenced in this paper. Specifically, for general-purpose spectral GNNs such as GPRGNN, OptBasis, AdaptKry, and NFGNN, which were initially introduced as uniform, decoupled GNN architectures, we implement them in alignment with the TFGNN variant defined in Eq. (15).
Each model uses a fixed maximum polynomial degree of 10 and a two-layer MLP with 64 hidden units for feature transformation, consistent with BWGNN [71]. The GCN baseline is similarly implemented with a two-layer setup featuring 64 hidden dimensions. For the other baselines, we rely on their official implementations (links provided below). All models are rebuilt and evaluated in the PyG [15] framework to maintain experimental fairness.

• PC-GNN: https://github.com/PonderLY/PC-GNN
• CARE-GNN: https://github.com/YingtongDou/CARE-GNN
• GDN: https://github.com/blacksingular/wsdm_GDN
• BWGNN: https://github.com/squareRoot3/Rethinking-Anomaly-Detection
• GHRN: https://github.com/blacksingular/GHRN
• GPRGNN: https://github.com/jianhao2016/GPRGNN
• OptBasis: https://github.com/yuziGuo/FarOptBasis
• AdaptKry: https://github.com/kkhuang81/AdaptKry
• NFGNN: https://github.com/SsGood/NFGNN

Implementation of TFGNN. In pursuit of fairness, TFGNN incorporates a decoupled architecture consistent with the general-purpose spectral GNNs, featuring a maximum polynomial degree of 10 and a two-layer MLP comprising 64 hidden units for feature transformation. This also ensures parameter fairness in alignment with BWGNN. For hyperparameter tuning, we adhere to the setups detailed in Appendix E.2.

Training and testing. Following the training protocol established in the BWGNN paper [71], we maintain a validation-to-test set split of 1:2, and employ training ratios of 1% (across all datasets) and 40% (additionally for T-Finance). Baselines are trained for 100 epochs using the Adam optimizer, without early stopping. We report the test results of the models achieving the highest Macro-F1 score on the validation set, averaging results across 10 random seeds to ensure robustness.
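The split protocol above can be sketched as follows. This is a simple random-permutation sketch, not the official BWGNN splitting code, which may differ in details such as stratification by class:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 11_944  # e.g., the Amazon GAD dataset size from Table 11

idx = rng.permutation(n)
n_train = int(0.01 * n)            # 1% training ratio
rest = idx[n_train:]
n_val = len(rest) // 3             # validation : test = 1 : 2
train, val, test = idx[:n_train], rest[:n_val], rest[n_val:]

assert len(train) + len(val) + len(test) == n
assert abs(len(test) - 2 * len(val)) <= 2   # 1:2 split, up to rounding
print(len(train), len(val), len(test))
```

With only 1% of nodes labeled for training, the validation set carries most of the model-selection burden, which is why the highest-validation-Macro-F1 checkpoint is the one reported.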
F Additional Results

In this section, we present additional results that bolster the experiments detailed in the main text, further substantiating our conclusions.

F.1 Full numerical experiment results

We present a detailed overview of our numerical experiment results in Table 12, including those for our TFGNN. The data illustrate that both the trigonometric polynomial and TFGNN achieve outstanding performance, underscoring the advantages of our approach. Additionally, these results are consistent with the node classification outcomes outlined in Section 5.2, validating the real-world applicability of our analysis.

F.2 Additional ablation studies of K and ω

In this section, we present an extended ablation study of the key hyperparameters K and ω, complementing our findings in Section 5.2.3, with Figure 4 illustrating the outcomes. The figures indicate a trend similar to that highlighted in Section 5.2.3, showing that the best-performing values for K, ω, and the product K·ω typically lie within low ranges.

Table 12: Full numerical experiment results introduced in Section 3.2. Both the trigonometric polynomial and TFGNN are included for comprehensive evaluation.

  Slice-wise approximation (SSE):

  Polynomial     GNN              f_1    f_2    f_3    f_4    f_5    f_6
  Monomial       GPRGNN [10]      139.9  289.1  466.1  398.3  1.83   97.83
  Bernstein      BernNet [25]     32.78  247.3  398.5  306.5  0.058  22.92
  Chebyshev      ChebNetII [26]   23.45  85.19  244.8  187.2  0.018  13.13
  Jacobi         JacobiConv [74]  22.18  80.77  239.2  155.3  0.017  11.82
  Learnable      OptBasis [24]    20.75  80.53  225.7  152.7  0.017  11.20
  Trigonometric  TFGNN            12.35  23.69  71.13  59.88  0.017  6.52

  Filter learning (Frobenius norm), with average ranks over both tasks:

  Polynomial     GNN              f_1    f_2    f_3    f_4    f_5    f_6    # Avg Rank 1  # Avg Rank 2
  Monomial       GPRGNN [10]      167.2  366.4  566.3  468.7  15.91  139.2  6             6
  Bernstein      BernNet [25]     68.23  313.2  448.2  415.2  7.79   95.84  5             5
  Chebyshev      ChebNetII [26]   64.22  168.4  402.5  347.5  6.83   86.25  4             4
  Jacobi         JacobiConv [74]  48.56  95.92  338.1  266.4  5.33   65.13  3             3
  Learnable      OptBasis [24]    43.44  89.48  289.5  238.1  4.98   61.70  2             2
  Trigonometric  TFGNN            27.23  65.19  102.3  105.3  4.05   48.08  1             1

Figure 4: Additional ablation studies on K and ω. Darker shades indicate higher performance values. (Ten heatmap panels, one per dataset: (a) Cora, (b) Citeseer, (c) Roman, (d) Amazon, (e) Pubmed, (f) Ques., (g) Arxiv, (h) Products, (i) Gamers, (j) Genius, each over K ∈ {2, 4, 6, 8, 10, 15, 20} and ω ∈ {0.2π, 0.3π, 0.5π, 0.7π}; panels not reproducible in text.)
