FEKAN: Feature-Enriched Kolmogorov-Arnold Networks

Reading time: 5 minutes
...

📝 Original Info

  • Title: FEKAN: Feature-Enriched Kolmogorov-Arnold Networks
  • ArXiv ID: 2602.16530
  • Date: 2026-02-18
  • Authors: Not specified (author information was not provided in the source)

📝 Abstract

Kolmogorov-Arnold Networks (KANs) have recently emerged as a compelling alternative to multilayer perceptrons, offering enhanced interpretability via functional decomposition. However, existing KAN architectures, including spline-, wavelet-, and radial-basis variants, suffer from high computational cost and slow convergence, limiting scalability and practical applicability. Here, we introduce Feature-Enriched Kolmogorov-Arnold Networks (FEKAN), a simple yet effective extension that preserves all the advantages of KAN while improving computational efficiency and predictive accuracy through feature enrichment, without increasing the number of trainable parameters. By incorporating these additional features, FEKAN accelerates convergence, increases representation capacity, and substantially mitigates the computational overhead characteristic of state-of-the-art KAN architectures. We investigate FEKAN across a comprehensive set of benchmarks, including function-approximation tasks, physics-informed formulations for diverse partial differential equations (PDEs), and neural operator settings that map between input and output function spaces. For function approximation, we systematically compare FEKAN against a broad family of KAN variants: FastKAN, WavKAN, ReLUKAN, HRKAN, ChebyshevKAN, RBFKAN, and the original SplineKAN. Across all tasks, FEKAN demonstrates substantially faster convergence and consistently higher approximation accuracy than the underlying baseline architectures. We also establish the theoretical foundations for FEKAN, showing its superior representation capacity compared to KAN, which contributes to improved accuracy and efficiency.

💡 Deep Analysis

📄 Full Content

Feature-space enrichment has long been a central theme in machine learning, with wide applicability to both regression and classification tasks. In its simplest form, linear regression models the response variable as a linear function of input vectors x ∈ R^n. While effective in many settings, this formulation becomes inadequate when the underlying relationships between variables are nonlinear. In such cases, linear models fail to capture higher-order interactions and complex dependencies present in the data. A principled yet practical remedy is to introduce a feature map γ : R^n → R^(n+m) that lifts the original input space into a higher-dimensional representation. This transformation augments the feature set with nonlinear or higher-order terms, which can then be treated as independent predictors within a linear modelling framework. By operating in the transformed space, linear regression can effectively approximate nonlinear relationships without altering its fundamental structure. An analogous rationale applies to classification problems. Through suitable feature enrichment, data that are not linearly separable in the original space may become separable by a linear decision boundary in the transformed space. In this sense, feature encoding may be viewed as a change of basis, projecting the data onto a representation that more faithfully captures the structure of the underlying problem. Common choices of basis expansions include polynomial functions, Fourier series and radial basis functions. The selection of basis is often guided by prior knowledge of the data-generating process. For example, when the data exhibit periodic structure, Fourier bases provide a natural and efficient representation, whereas problems characterised by localised structure may benefit from radial basis expansions.
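As a concrete illustration of this idea, the minimal NumPy sketch below fits ordinary least squares twice on the same synthetic 1-D data: once on the raw coordinate and once after a hand-chosen feature map γ that appends polynomial and Fourier terms. The target function, the map γ, and the basis choices are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Illustrative target: a nonlinear 1-D function that plain linear regression cannot fit.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(4.0 * np.pi * x[:, 0]) + 0.5 * x[:, 0] ** 2

def gamma(x, degree=3, freqs=(1, 2, 4)):
    """Feature map gamma: R^n -> R^(n+m).

    Lifts the raw inputs with polynomial and Fourier terms, so that a linear
    model in the enriched space can represent nonlinear structure.
    """
    feats = [x]                                           # original coordinates
    feats += [x ** d for d in range(2, degree + 1)]       # higher-order polynomial terms
    for k in freqs:                                       # periodic (Fourier) terms
        feats += [np.sin(2 * np.pi * k * x), np.cos(2 * np.pi * k * x)]
    return np.concatenate(feats, axis=1)

def ols_fit_predict(phi, y):
    """Ordinary least squares with a bias column; returns in-sample predictions."""
    A = np.c_[phi, np.ones(len(phi))]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ w

mse_raw = np.mean((ols_fit_predict(x, y) - y) ** 2)
mse_enriched = np.mean((ols_fit_predict(gamma(x), y) - y) ** 2)
print(f"MSE, raw features:      {mse_raw:.4f}")
print(f"MSE, enriched features: {mse_enriched:.4f}")
```

The model class is unchanged in both fits; only the basis in which the data are represented differs, which is precisely the change-of-basis view described above.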

Recent years have witnessed a marked resurgence in the use of multi-layer perceptrons (MLPs), driven largely by advances in hardware accelerators that enable efficient large-scale training via backpropagation. Despite their expressive capacity, MLPs exhibit an inherent spectral bias, favouring the learning of low-frequency components while struggling to represent high-frequency structure. This limitation can be detrimental in tasks where fine-scale detail is critical. Feature enrichment has emerged as an effective strategy to mitigate this shortcoming. Tancik et al. [1] introduced Fourier feature mappings as a simple yet powerful mechanism for enabling coordinate-based MLPs to learn high-frequency functions in low-dimensional domains. Through a series of image regression and reconstruction experiments, they demonstrated that augmenting inputs with Fourier features substantially improves the model’s ability to capture high-frequency content, without requiring modifications to the underlying network architecture. Building on this framework of positional encoding using fixed Fourier features, Sun et al. [2] proposed sinusoidal positional encoding (SPE), in which the frequencies are learned adaptively rather than specified a priori as hyperparameters. This formulation enhances flexibility and reduces manual tuning. The authors validated the effectiveness of SPE across a range of applications, including speech synthesis and image reconstruction, highlighting its capacity to model complex, high-frequency signals.
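The following sketch shows a fixed random Fourier feature mapping in the spirit of Tancik et al. [1], applied to low-dimensional coordinates before they are fed to an MLP. The projection matrix B, its scale σ, and the dimensions are illustrative hyperparameters; in the SPE formulation of Sun et al. [2] the frequencies would instead be learned rather than fixed.

```python
import numpy as np

def fourier_features(v, B):
    """Fourier feature mapping gamma(v) = [cos(2*pi*B v), sin(2*pi*B v)].

    B is a fixed random projection; its scale controls the bandwidth of
    frequencies the downstream coordinate-based MLP can easily represent.
    """
    proj = 2.0 * np.pi * v @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(1024, 2))   # e.g. normalised pixel coordinates
sigma, num_feats = 10.0, 128                     # sigma is a bandwidth hyperparameter (illustrative)
B = sigma * rng.standard_normal((num_feats, 2))  # fixed, not trained

enriched = fourier_features(coords, B)           # shape (1024, 256), fed to the MLP instead of raw coords
print(enriched.shape)
```

The mapping leaves the network architecture untouched; it only changes the representation of the inputs, which is what mitigates the spectral bias discussed above.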

The Kolmogorov-Arnold Network (KAN) [3] is a recently introduced neural network architecture that replaces conventional linear weights with learnable univariate functions. This design is motivated by the Kolmogorov-Arnold representation theorem and aims to model nonlinear relationships through structured functional decomposition rather than fixed affine transformations. Within the KAN framework, feature enrichment offers a potentially valuable complement to the underlying basis functions that constitute the model. By transforming the input representation into a more expressive feature space, the functional components of KAN may be relieved of modelling highly intricate structure directly, thereby facilitating faster convergence and reducing overall training time. Furthermore, such enrichment may enhance parametric efficiency, enabling comparatively lightweight models to attain performance on par with larger architectures trained in the original feature space without transformation. Although KAN exhibits appealing properties, including interpretability and favourable parameter scaling, its training cost remains substantially higher than that of conventional MLPs, limiting its practicality in large-scale settings. To date, feature enrichment strategies have not been systematically explored in conjunction with KAN-based architectures. We posit that integrating such transformations could provide a simple yet effective means of improving computational efficiency, particularly in scientific and engineering applications, without introducing additional architectural complexity.
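To make the idea concrete, here is a hedged sketch of prepending a fixed, parameter-free feature map to a KAN-style layer built from Gaussian radial basis functions, in the spirit of RBF/FastKAN-type variants. The enrichment function, the layer, and the way they are combined are illustrative assumptions; this excerpt does not specify FEKAN's actual construction, and note that in this naive version the first layer widens with the enriched input, whereas the paper reports achieving enrichment without increasing the number of trainable parameters.

```python
import numpy as np

def enrich(x):
    """Hypothetical feature map: augment raw inputs with fixed nonlinear features.
    The concrete enrichment used by FEKAN is not specified in this excerpt."""
    return np.concatenate([x, np.sin(2 * np.pi * x), np.cos(2 * np.pi * x)], axis=-1)

class RBFKANLayer:
    """KAN-style layer: every input-output edge carries a learnable univariate
    function, here expressed in a Gaussian radial basis. Output j is the sum
    over inputs i of phi_ij(x_i)."""

    def __init__(self, in_dim, out_dim, num_centers=8, rng=None):
        rng = rng or np.random.default_rng(0)
        self.centers = np.linspace(-1.0, 1.0, num_centers)      # shared grid of RBF centres
        self.width = self.centers[1] - self.centers[0]
        # coeffs[i, j, k]: weight of basis k on the edge from input i to output j (trainable)
        self.coeffs = 0.1 * rng.standard_normal((in_dim, out_dim, num_centers))

    def __call__(self, x):
        # basis values for every sample and input coordinate: (batch, in_dim, num_centers)
        basis = np.exp(-(((x[..., None] - self.centers) / self.width) ** 2))
        # phi_ij(x_i) summed over inputs i -> (batch, out_dim)
        return np.einsum("bik,ijk->bj", basis, self.coeffs)

x = np.random.default_rng(1).uniform(-1.0, 1.0, size=(32, 2))
layer = RBFKANLayer(in_dim=enrich(x).shape[-1], out_dim=4)
y = layer(enrich(x))   # the enrichment itself adds no trainable parameters
print(y.shape)         # (32, 4)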

Compared to conventional MLPs, KAN exhibits stronger interpretability and more favourable parameter scaling, though at a substantially higher training cost.

References

This content is AI-processed based on open access ArXiv data.
