Qonvolution: Towards Learning High-Frequency Signals with Queried Convolution

Reading time: 5 minutes

📝 Original Info

  • Title: Qonvolution: Towards Learning High-Frequency Signals with Queried Convolution
  • ArXiv ID: 2512.12898
  • Date: 2025-12-15
  • Authors: Abhinav Kumar, Tristan Aumentado-Armstrong, Lazar Valkov, Gopal Sharma, Alex Levinshtein, Radek Grzeszczuk, Suren Kumar

📝 Abstract

Accurately learning high-frequency signals is a challenge in computer vision and graphics, as neural networks often struggle with these signals due to spectral bias or optimization difficulties. While current techniques like Fourier encodings have made great strides in improving performance, there remains scope for improvement when presented with high-frequency information. This paper introduces Queried-Convolutions (Qonvolutions), a simple yet powerful modification using the neighborhood properties of convolution. Qonvolution convolves a low-frequency signal with queries (such as coordinates) to enhance the learning of intricate high-frequency signals. We empirically demonstrate that Qonvolutions enhance performance across a variety of high-frequency learning tasks crucial to both the computer vision and graphics communities, including 1D regression, 2D super-resolution, 2D image regression, and novel view synthesis (NVS). In particular, by combining Gaussian splatting with Qonvolutions for NVS, we showcase state-of-the-art performance on real-world complex scenes, even outperforming powerful radiance field models on image quality.

💡 Deep Analysis

Figure 1

📄 Full Content

QONVOLUTION: TOWARDS LEARNING HIGH-FREQUENCY SIGNALS WITH QUERIED CONVOLUTION

Abhinav Kumar1, Tristan Aumentado-Armstrong2∗, Lazar Valkov1∗, Gopal Sharma1, Alex Levinshtein2, Radek Grzeszczuk1,2, Suren Kumar1
1Samsung Research America, AI Center – Mountain View, CA, USA
2Samsung Research, AI Center – Toronto, ON, Canada
{a.kumar4,tristan.a,lazar.valkov,gopal.sharma}@samsung.com
{alex.lev,radek.g,suren.kumar}@samsung.com
Project Page: https://abhi1kumar.github.io/qonvolution/

ABSTRACT

Accurately learning high-frequency signals is a challenge in computer vision and graphics, as neural networks often struggle with these signals due to spectral bias or optimization difficulties. While current techniques like Fourier encodings have made great strides in improving performance, there remains scope for improvement when presented with high-frequency information. This paper introduces Queried-Convolutions (Qonvolutions), a simple yet powerful modification using the neighborhood properties of convolution. Qonvolution convolves a low-frequency signal with queries (such as coordinates) to enhance the learning of intricate high-frequency signals. We empirically demonstrate that Qonvolutions enhance performance across a variety of high-frequency learning tasks crucial to both the computer vision and graphics communities, including 1D regression, 2D super-resolution, 2D image regression, and novel view synthesis (NVS). In particular, by combining Gaussian splatting with Qonvolutions for NVS, we showcase state-of-the-art performance on real-world complex scenes, even outperforming powerful radiance field models on image quality.

Figure 1: Learning high-frequency with Qonvolution. We provide an example on novel view synthesis of 3D Gaussian Splatting (Kerbl et al., 2023) and adding QNN. Adding QNN faithfully reconstructs high-frequency details in various regions and results in higher quality synthesis. We highlight the differences in inset figures.
∗Equal Second Authors.
arXiv:2512.12898v1 [cs.CV] 15 Dec 2025

1 INTRODUCTION

Neural networks are now fundamental to computer vision and graphics, for deciphering a wide range of signals, from intricate 1D data like time series (Kazemi et al., 2019) and natural language (Vaswani et al., 2017) to rich 2D images (Tancik et al., 2020) and immersive 3D scenes (Barron et al., 2023). However, these networks often struggle to capture high-frequency details, a challenge often attributed to spectral bias (Rahaman et al., 2019) or optimization difficulties (Tab. 10) due to complicated landscapes (Li et al., 2018).

The challenge of capturing high-frequency details in neural networks has spurred a rich and diverse body of research. A key strategy involves modifying positional encodings, using Fourier encodings (Tancik et al., 2020). Another approach focuses on altering activation functions, such as those in SIREN (Sitzmann et al., 2020). Other methods aim to predict Fourier series coefficients (Lee et al., 2021), use high-frequency weighted losses (Zhang et al., 2024) or tune weight initialization (Saratchandran et al., 2024). While all these innovations improve performance, the spectrum of frequencies effectively learned remains limited, highlighting a compelling need for further advancements in high-frequency signal representation.

Figure 2: Overview of MLPs, CNNs and QNNs. Panels: (a) MLP-based Neural Fields, (b) CNN, (c) QNN. MLPs take (encoded) queries $\gamma(q_i)$ and use linear layers. CNNs take the low-frequency signal $\mathbf{f}_{low}$ and use convolutions. QNNs concatenate the low-frequency signal $\mathbf{f}_{low}$ to the (encoded) queries $\gamma(q_i)$ and use convolutions. [Key: Freq = Frequency, ⊕ = Concatenation]

There are two classes of tasks that attempt to learn high-frequency signals. One stream, which includes the popular novel view synthesis (NVS) task, uses Multi-Layer Perceptrons (MLPs) (Fig. 2a) to directly fit signals.
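As a rough illustration of the QNN input described in the Figure 2 caption, the sketch below concatenates a low-frequency 1D signal with Fourier-encoded queries and passes the result through a single convolution. It is a minimal toy under stated assumptions, not the authors' implementation: the function names (`fourier_encode`, `qnn_inputs`, `conv1d`), the sin/cos encoding with power-of-two frequencies, the kernel size, and the placeholder weights are all hypothetical choices made for this example.

```python
import math

def fourier_encode(q, num_freqs=4):
    """Encode a scalar query q as [sin(2^k*pi*q), cos(2^k*pi*q)] for k < num_freqs.
    Assumption: one common form of Fourier encoding, not necessarily the paper's."""
    feats = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * q
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats

def qnn_inputs(f_low, queries, num_freqs=4):
    """Concatenate each low-frequency sample with its encoded query (the ⊕ step)."""
    return [[f] + fourier_encode(q, num_freqs) for f, q in zip(f_low, queries)]

def conv1d(x, weights, bias):
    """Zero-padded 'same' 1D convolution with one output channel.
    x: N positions, each a list of C_in channel values.
    weights: K taps (K odd), each a list of C_in values."""
    n, k, c_in = len(x), len(weights), len(x[0])
    half = k // 2
    out = []
    for i in range(n):
        acc = bias
        for t in range(k):
            j = i + t - half
            if 0 <= j < n:  # positions outside the signal count as zeros
                for c in range(c_in):
                    acc += weights[t][c] * x[j][c]
        out.append(acc)
    return out

# Toy setup: 8 query coordinates in [0, 1) and a smooth low-frequency signal.
n = 8
queries = [i / n for i in range(n)]
f_low = [math.sin(2 * math.pi * q) for q in queries]

# Each position now carries 1 signal channel + 2 * num_freqs query channels.
x = qnn_inputs(f_low, queries, num_freqs=2)

# Untrained placeholder weights: a 3-tap kernel over all 5 input channels.
weights = [[0.1] * 5 for _ in range(3)]
y = conv1d(x, weights, bias=0.0)
```

The point of the sketch is the difference in what each output position sees: an MLP would receive only `fourier_encode(q)` for one query, a CNN would convolve only `f_low`, while the concatenated input lets every kernel tap mix neighboring signal values with neighboring query encodings.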
However, MLPs lack the necessary inductive biases (Cohen & Welling, 2016; LeCun et al., 1998) to capture the local neighborhood dependencies inherent in most 1D and 2D signals. We conjecture that these local relationships are crucial for learning high-frequency information effectively. By processing data points or pixels in isolation, existing MLP networks often neglect these local connections, which limits their capacity to fully represent high-frequency signals. On the other hand, if a low-frequency signal is given, the second class of problems, which includes, e.g., 2D super-resolution (SR), convolves the low-frequency signal with a CNN (Fig. 2b) and thus uses neighborhood information. However, these approaches (Karras et al., 2018) often do not utilize the information present in the input queries (e.g., spatial coordinates) when predicting the output high-frequency signal. As shown in previous work (Liu et al., 2018), architectures that are aware of locality, such as CNNs, often fail at tasks that require even simple transformations of coordinates. Thus, the potential for


Reference

This content is AI-processed based on open access ArXiv data.
