Neurovariety Dimension and Global Identifiability of Polynomial Neural Networks

📝 Abstract

We study neurovarieties for polynomial neural networks and fully characterize when they attain the expected dimension in the single-output case. As consequences, we establish non-defectiveness and global identifiability for multi-output architectures.

📄 Content

The field of Neuroalgebraic Geometry is an emerging area of research focused on studying the function spaces defined by machine learning models built on algebraic architectures such as Polynomial Neural Networks (PNNs). PNNs are a distinct class of neural network architectures in which the traditional, typically non-polynomial activation functions (such as ReLU, sigmoid, or tanh) are replaced with polynomials. This substitution gives PNNs a unique theoretical and practical profile, making them a significant object of study in machine learning.

These networks achieve competitive experimental performance across a wide range of tasks. Critically, the polynomial activation functions naturally capture high-order interactions between the input features. Unlike simpler activation functions, which may require much greater depth or width to model complex relationships implicitly, PNNs explicitly incorporate multiplicative terms that represent these intricate, high-order dependencies.
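To make this concrete, here is a minimal sketch of a PNN forward pass in NumPy (our own illustration, not code from the paper), where each activation raises its input element-wise to a fixed power. Expanding a single squared hidden unit already produces the multiplicative cross term x1*x2 in one layer:

```python
import numpy as np

def pnn_forward(x, weights, exps):
    """Forward pass of a polynomial neural network.

    weights : list of matrices W_1, ..., W_L
    exps    : activation exponents d_1, ..., d_{L-1}
    """
    h = x
    for W, d in zip(weights[:-1], exps):
        h = (W @ h) ** d   # element-wise power activation sigma_i(z) = z**d_i
    return weights[-1] @ h

# One hidden unit computing (x1 + x2)**2 = x1**2 + 2*x1*x2 + x2**2:
# the 2*x1*x2 cross term is a high-order interaction captured in one layer.
W1 = np.array([[1.0, 1.0]])   # single hidden neuron
W2 = np.array([[1.0]])        # single output
x = np.array([3.0, 4.0])
print(pnn_forward(x, [W1, W2], exps=[2]))   # -> [49.] == (3 + 4)**2
```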

This powerful capability has led to their successful employment across numerous domains, ranging from computer vision (tasks like image recognition and object detection [CMB+20, CGD+22, YHN+21]) and image representation (learning efficient and meaningful feature encodings for visual data [YBJ+22]), to physics (solving complex differential equations or modeling physical systems [BK21]) and finance (applications such as time series prediction or risk modeling [NM18]).

On the theoretical side, the use of polynomials provides a remarkable benefit: it allows for a fine-grained theoretical analysis of the network's properties. The set of functions that a PNN can compute, i.e. the function space associated with the architecture, possesses a specific geometric structure. These function spaces are frequently referred to as neuromanifolds, and their Zariski closures are called neurovarieties.

Since polynomials are the building blocks, tools from algebraic geometry can be rigorously applied to analyze these neurovarieties. Investigating the properties of such spaces, particularly their dimension, yields crucial insights into the network's behavior: the dimension and structure of the neurovariety shed light on how a PNN's architectural choices (such as the layer widths and the activation degrees of the polynomials) affect the expressivity (the set of functions it can approximate) of various PNN subtypes, including standard feedforward, convolutional, and self-attention architectures.

All these opportunities have made Neuroalgebraic Geometry an emerging area of research that has attracted many contributions in recent years [KTB19, BT20, Xiu21, BBCV21, Sha23, LYPL21, SMK24, MSM+25, KLW24, HMK25, SMK25].

This paper investigates the algebraic structure underlying polynomial neural networks and their neurovarieties $V_{\mathbf{n},\mathbf{d}}$. Aimed also at a computer science audience interested in the theoretical capacity and complexity of network architectures, this work connects deep learning structures to fundamental results in algebraic geometry.

We focus on a class of feed-forward networks defined by a width vector $\mathbf{n} = (n_0, \ldots, n_L)$ and a tuple of activation exponents $\mathbf{d} = (d_1, \ldots, d_{L-1})$. These networks consist of weight matrices $W_i$ and activation functions $\sigma_i$ that raise their inputs element-wise to the power $d_i$. The output $F$ of this architecture is a tuple of $n_L$ homogeneous polynomials of total degree $d = \prod_{i=1}^{L-1} d_i$. The neurovariety $V_{\mathbf{n},\mathbf{d}}$ is the Zariski closure of the image of the map that takes the network weights $(W_1, \ldots, W_L)$ to the projective space of coefficients of the output polynomials. Essentially, $V_{\mathbf{n},\mathbf{d}}$ captures all output polynomials that can be realized by the given architecture, together with their limits. A key mathematical challenge is determining the dimension of this variety, which represents the actual complexity or expressivity of the network: the number of independent parameters required to generate the output space. Indeed, the dimension reflects the intrinsic degrees of freedom of the model and is a measure of expressivity more precise than the raw parameter count.
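As a toy illustration of this parameterization (our own worked example in the notation just introduced, not one taken from the paper), take $\mathbf{n} = (2, 2, 1)$ and $\mathbf{d} = (2)$, writing the rows of $W_1$ as $(a, b)$ and $(c, e)$ and $W_2 = (u \;\; v)$. The network then computes

```latex
\[
F(x_1, x_2) = W_2\,\sigma_1(W_1 x)
            = u\,(a x_1 + b x_2)^2 + v\,(c x_1 + e x_2)^2 ,
\]
\[
(W_1, W_2) \;\longmapsto\;
\bigl[\, u a^2 + v c^2 \,:\, 2(u a b + v c e) \,:\, u b^2 + v e^2 \,\bigr]
\;\in\; \mathbb{P}^2 = \mathbb{P}\bigl(\operatorname{Sym}_2(\mathbb{C}^2)\bigr),
\]
```

so the weight-to-coefficient map lands in the projective space of binary quadrics. Note the built-in ambiguity: scaling a row of $W_1$ by $\lambda$ and the corresponding entry of $W_2$ by $\lambda^{-2}$ leaves $F$ unchanged; the expected dimension discussed below corrects for exactly such redundancies.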

The study of the dimension of neurovarieties has recently been taken up and promoted in various papers, for instance [KTB19, MSM+25, FRWY25]. The neurovariety has an expected dimension $\operatorname{expdim}(V_{\mathbf{n},\mathbf{d}})$, typically calculated from the total number of parameters adjusted for the obvious representation ambiguities. If the actual dimension is strictly less than this expected dimension, the neurovariety is called defective, indicating a redundant parameterization.
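The actual dimension can be probed numerically: since $V_{\mathbf{n},\mathbf{d}}$ is the closure of the image of a polynomial map, the dimension of its affine cone equals the generic rank of the Jacobian of that map. Below is a minimal sketch of this standard check (our own illustration, not the paper's method). Instead of extracting coefficients symbolically, it evaluates the network at sufficiently many generic points; evaluation is a linear, generically injective operation on the space of degree-$d$ polynomials, so it preserves the Jacobian rank:

```python
import numpy as np
from math import comb

def pnn_output(weights, X, exps):
    # Forward pass: alternating linear maps and element-wise powers.
    h = X
    for W, d in zip(weights[:-1], exps):
        h = (W @ h) ** d
    return weights[-1] @ h

def affine_cone_dimension(widths, exps, seed=0, eps=1e-6):
    """Estimate the dimension of the affine cone over V_{n,d} as the
    generic Jacobian rank of the weights -> evaluations map."""
    rng = np.random.default_rng(seed)
    L = len(widths) - 1
    shapes = [(widths[i + 1], widths[i]) for i in range(L)]
    theta = rng.standard_normal(sum(r * c for r, c in shapes))

    # Enough generic evaluation points to separate all degree-d monomials.
    total_deg = int(np.prod(exps))
    n_coeffs = widths[-1] * comb(widths[0] + total_deg - 1, total_deg)
    X = rng.standard_normal((widths[0], 2 * n_coeffs))

    def unpack(t):
        Ws, k = [], 0
        for r, c in shapes:
            Ws.append(t[k:k + r * c].reshape(r, c))
            k += r * c
        return Ws

    f = lambda t: pnn_output(unpack(t), X, exps).ravel()

    # Central-difference Jacobian at a random (hence generic) weight vector.
    J = np.stack([(f(theta + eps * e) - f(theta - eps * e)) / (2 * eps)
                  for e in np.eye(theta.size)], axis=1)
    return np.linalg.matrix_rank(J, tol=1e-4)

# Example: n = (2, 2, 1), d = (2). Six raw parameters, but the affine cone
# over V is only 3-dimensional (binary quadrics), exposing the redundancy.
print(affine_cone_dimension((2, 2, 1), (2,)))   # -> 3
```

Random weights serve as a generic point, so with high probability the numerical rank equals the true generic rank; the projective dimension of $V_{\mathbf{n},\mathbf{d}}$ is this rank minus one. For the toy architecture from the worked example above, the result 3 confirms that the variety fills the whole $\mathbb{P}^2$ of binary quadrics.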

From the viewpoint of learning theory and practice, the actual dimension $\dim V_{\mathbf{n},\mathbf{d}}$ is the relevant capacity measure of a polynomial neural network, sharper than the raw parameter count. Knowing $\dim V_{\mathbf{n},\mathbf{d}}$ has several consequences:

- Model selection and redundancy. If $V_{\mathbf{n},\mathbf{d}}$ is defective, many weights are intrinsically redundant (flat directions of the parameterization).
