SELF: A Robust Singular Value and Eigenvalue Approach for LLM Fingerprinting
📝 Original Info
Title: SELF: A Robust Singular Value and Eigenvalue Approach for LLM Fingerprinting
ArXiv ID: 2512.03620
Date: 2025-12-03
Authors: Hanxiu Zhang, Yue Zheng
📝 Abstract
The protection of Intellectual Property (IP) in Large Language Models (LLMs) represents a critical challenge in contemporary AI research. While fingerprinting techniques have emerged as a fundamental mechanism for detecting unauthorized model usage, existing methods, whether behavior-based or structural, suffer from vulnerabilities such as false claim attacks or susceptibility to weight manipulations. To overcome these limitations, we propose SELF, a novel intrinsic weight-based fingerprinting scheme that eliminates dependency on input and inherently resists false claims. SELF achieves robust IP protection through two key innovations: 1) unique, scalable and transformation-invariant fingerprint extraction via singular value and eigenvalue decomposition of LLM attention weights, and 2) effective neural network-based fingerprint similarity comparison based on few-shot learning and data augmentation. Experimental results demonstrate that SELF maintains high IP infringement detection accuracy while showing strong robustness against various downstream modifications, including quantization, pruning, and fine-tuning attacks. Our code is available at https://github.com/HanxiuZhang/SELF_v2.
📄 Full Content
SELF: A ROBUST SINGULAR VALUE AND
EIGENVALUE APPROACH FOR LLM FINGERPRINTING
Hanxiu Zhang, Yue Zheng∗
The Chinese University of Hong Kong, Shenzhen
hanxiuzhang@link.cuhk.edu.cn, zhengyue@cuhk.edu.cn
ABSTRACT
The protection of Intellectual Property (IP) in Large Language Models (LLMs) represents a critical
challenge in contemporary AI research. While fingerprinting techniques have emerged as a fundamental
mechanism for detecting unauthorized model usage, existing methods, whether behavior-based
or structural, suffer from vulnerabilities such as false claim attacks or susceptibility to weight manipulations.
To overcome these limitations, we propose SELF, a novel intrinsic weight-based fingerprinting
scheme that eliminates dependency on input and inherently resists false claims. SELF achieves robust
IP protection through two key innovations: 1) unique, scalable and transformation-invariant fingerprint
extraction via singular value and eigenvalue decomposition of LLM attention weights, and 2) effective
neural network-based fingerprint similarity comparison based on few-shot learning and data augmentation.
Experimental results demonstrate that SELF maintains high IP infringement detection accuracy
while showing strong robustness against various downstream modifications, including quantization,
pruning, and fine-tuning attacks. Our code is available at github.com/HanxiuZhang/SELF_v2.
Keywords Large Language Model · Intellectual Property Protection · Fingerprinting · Singular Values · Eigenvalues
1 Introduction
Large language models (LLMs) are increasingly being adopted as versatile tools to enhance productivity in various
fields, such as medical assistance ([1]) and code generation ([2]). Developing a functional LLM requires
substantial investments, including high-quality datasets, significant computational resources, and specialized human
expertise. Consequently, protecting the intellectual property (IP) of LLMs is of paramount importance ([3]), particularly
in the current era, where open-source trends clash with model creators’ need to maintain naming conventions for
attribution on derivative works.
Current model IP infringement detection methods primarily fall into two categories: watermarking and fingerprinting.
Watermarking approaches embed identifiable features (watermarks) invasively into target models while trying to
preserve their original functionality ([4, 5]). In contrast, fingerprinting methods extract unique model identifiers without
modifying the model, either by analyzing the model’s input-output behavioral patterns ([6]) (i.e., behavior fingerprinting)
or structural information (i.e., structural fingerprinting) such as weight distributions ([7]), intermediate representations
([8]), or gradient profiles ([9]). Compared to watermarking-based methods, fingerprinting schemes eliminate the need
for retraining and avoid the potential performance degradation associated with watermark insertion ([10]).
Despite these advantages, existing fingerprinting methods face critical limitations. Behavior-based techniques are
vulnerable to false claim attacks ([11]), wherein malicious actors can falsely claim the ownership of independently
trained models by crafting (transferable) adversarial samples. Although [12] propose mitigating this attack by
constructing fingerprints from targeted adversarial examples, the risk persists because such adversarial examples can still
transfer, albeit with greater difficulty. Structural approaches analyze a model’s internal parameters but lack robustness
against weight manipulations such as permutation or linear mapping. For schemes like HuRef ([7]), where the input
actively participates in fingerprint computation, we further extend the scope of the false claim attack: a malicious
accuser can manipulate ownership verification results through carefully crafted inputs. Under this broader definition, we
conducted a false claim attack on the HuRef scheme and successfully manipulated its similarity score output (see Appendix
B).
∗Corresponding author
arXiv:2512.03620v1 [cs.CR] 3 Dec 2025
To address these issues, we propose a structural fingerprinting method named SELF, which depends purely on the model
weights. Figure 1 describes SELF’s pipeline. The owner first extracts a fingerprint from the target model and trains
a Similarity Network (SimNet) for verification. If the model is stolen, the owner can detect the piracy from SimNet’s high
similarity output. SELF comprises two key components: (1) Fingerprint Extraction, which derives unique, robust and
scalable fingerprints from model weights; and (2) Similarity Computation, where a neural network learns fingerprint
patterns to enable robust and efficient similarity assessment.
Figure 1: IP infringement detection pipeline using SELF.
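The two-stage flow described above can be sketched roughly as follows. This is a toy illustration only, not the authors' implementation: `extract_fingerprint` and `spectrum_distance` are hypothetical names, random matrices stand in for real attention weights, and a plain Euclidean distance is swapped in for the trained SimNet, whose architecture this section does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_fingerprint(weights):
    # Hypothetical extractor: singular values of each matrix, concatenated.
    return np.concatenate([np.linalg.svd(w, compute_uv=False) for w in weights])

def spectrum_distance(fp_a, fp_b):
    # Stand-in for SimNet (an assumption): smaller distance means more similar.
    return float(np.linalg.norm(fp_a - fp_b))

# Toy "models": two 32x32 matrices each, standing in for attention weights.
owner = [rng.standard_normal((32, 32)) for _ in range(2)]
# A derived copy: the owner's weights plus small noise (mimicking a light modification).
derived = [w + 1e-3 * rng.standard_normal(w.shape) for w in owner]
# An unrelated model, here simply fresh random weights.
unrelated = [rng.standard_normal((32, 32)) for _ in range(2)]

fp_owner = extract_fingerprint(owner)
d_derived = spectrum_distance(fp_owner, extract_fingerprint(derived))
d_unrelated = spectrum_distance(fp_owner, extract_fingerprint(unrelated))

# Report whether the derived copy sits closer to the owner than the unrelated model.
print(d_derived < d_unrelated)
```

A fixed distance is only a placeholder here; the paper instead trains a network to compare fingerprints, which this sketch does not attempt to reproduce.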
In the fingerprint extraction module, we address potential model weight tampering caused by transformation attacks
(e.g., permutation and linear-mapping ([7])) by identifying invariant attributes.
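To make the idea of invariant attributes concrete, the following minimal sketch checks numerically that singular values and eigenvalues survive both kinds of transformation. Several things here are assumptions for illustration rather than details taken from the paper: the choice of the query-key product `W_q @ W_k.T` as the matrix to decompose, the specific forms of the permutation and linear-mapping attacks, the toy dimension `d`, and all helper names.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size (assumption)

# Toy query/key projections; real attention weights would be loaded instead.
W_q = rng.standard_normal((d, d))
W_k = rng.standard_normal((d, d))
M = W_q @ W_k.T  # matrix whose spectrum serves as the toy fingerprint

def singular_values(m):
    return np.linalg.svd(m, compute_uv=False)  # returned in descending order

def sorted_eigvals(m):
    ev = np.linalg.eigvals(m)
    # Round before sorting so conjugate pairs keep a stable order under round-off.
    order = np.lexsort((np.round(ev.imag, 6), np.round(ev.real, 6)))
    return ev[order]

# Permutation attack (assumed form): reorder hidden units with a permutation P,
# which turns M into P M P^T, a similarity transform by an orthogonal matrix.
P = np.eye(d)[rng.permutation(d)]
M_perm = (P @ W_q) @ (P @ W_k).T

# Linear-mapping attack (assumed form): W_q -> W_q C, W_k -> W_k C^{-T},
# so the product W_q C C^{-1} W_k^T equals M again up to round-off.
C = rng.standard_normal((d, d)) + 2 * d * np.eye(d)  # large diagonal shift for invertibility
W_q_lin = W_q @ C
W_k_lin = W_k @ np.linalg.inv(C).T
M_lin = W_q_lin @ W_k_lin.T

for name, M_att in [("permutation", M_perm), ("linear mapping", M_lin)]:
    sv_ok = np.allclose(singular_values(M), singular_values(M_att))
    ev_ok = np.allclose(sorted_eigvals(M), sorted_eigvals(M_att))
    print(name, "singular values preserved:", sv_ok, "eigenvalues preserved:", ev_ok)

# By contrast, the spectrum of W_q alone changes under the linear mapping.
print("W_q alone preserved:", np.allclose(singular_values(W_q), singular_values(W_q_lin)))
```

The last check illustrates why a fingerprint would need to be built from quantities that the attacker's transformation cancels out, rather than from a single weight matrix taken in isolation.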