Efficient Test-Time Adaptation through Latent Subspace Coefficients Search
Real-world deployment often exposes models to distribution shifts, making test-time adaptation (TTA) critical for robustness. Yet most TTA methods are unfriendly to edge deployment, as they rely on backpropagation, activation buffering, or test-time mini-batches, leading to high latency and memory overhead. We propose $\textbf{ELaTTA}$ ($\textit{Efficient Latent Test-Time Adaptation}$), a gradient-free framework for single-instance TTA under strict on-device constraints. ELaTTA freezes model weights and adapts each test sample by optimizing a low-dimensional coefficient vector in a source-induced principal latent subspace, pre-computed offline via truncated SVD and stored with negligible overhead. At inference, ELaTTA encourages prediction confidence by optimizing the $k$-D coefficients with CMA-ES, effectively optimizing a Gaussian-smoothed objective and improving stability near decision boundaries. Across six benchmarks and multiple architectures, ELaTTA achieves state-of-the-art accuracy under both strict and continual single-instance protocols, while reducing compute by up to $63\times$ and peak memory by $11\times$. We further demonstrate on-device deployment on a ZYNQ-7020 platform. Code will be released upon acceptance.
💡 Research Summary
The paper introduces ELaTTA (Efficient Latent Test‑Time Adaptation), a gradient‑free test‑time adaptation (TTA) framework designed for strict on‑device constraints. Traditional TTA methods rely on back‑propagation, activation buffering, or mini‑batch statistics, which are unsuitable for edge devices that must process single inputs with minimal latency and memory. ELaTTA circumvents these issues by freezing all model weights and instead adapting each test sample through a low‑dimensional coefficient vector that lives in a source‑derived principal latent subspace.
Offline preparation: A small set of source samples (as few as 20 for ImageNet‑scale tasks) is passed through the encoder to collect latent vectors. A truncated singular value decomposition (SVD) yields the top‑k orthogonal basis vectors Vₖ (k ≪ D), which define a subspace that captures the dominant variations of the source domain. Vₖ is stored on the device with negligible overhead (≈0.01 % of backbone parameters).
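The offline step above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released code: the latent dimension `D = 64`, `n = 20` source samples, and `k = 4` are toy values chosen to mirror the description (small source set, k ≪ D).

```python
import numpy as np

def build_subspace(source_latents: np.ndarray, k: int) -> np.ndarray:
    """Return the top-k orthonormal basis V_k (D, k) of the source latents.

    source_latents: (n, D) array of encoder outputs for source samples.
    The rows of Vt from a truncated SVD are the principal directions
    capturing the dominant variations of the source domain.
    """
    Z = source_latents - source_latents.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Vt[:k].T  # columns are orthonormal basis vectors

# toy stand-in for 20 source latents of dimension 64 (assumed shapes)
rng = np.random.default_rng(0)
Vk = build_subspace(rng.standard_normal((20, 64)), k=4)
print(Vk.shape)                            # (64, 4)
print(np.allclose(Vk.T @ Vk, np.eye(4)))   # True: orthonormal columns
```

Only `Vk` needs to be stored on-device; for a ResNet-scale backbone, a D×k matrix with k ≪ D is consistent with the ≈0.01 % overhead quoted above.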
Test‑time adaptation: For a given OOD input x, the encoder produces a latent zₜ. The method searches for a coefficient vector p ∈ ℝᵏ such that the adapted latent zₐₚₜ = zₜ + Vₖp leads to a confident prediction. Confidence is encouraged by minimizing the Shannon entropy of the classifier output. Because direct pointwise entropy minimization is brittle near decision boundaries, the authors instead optimize the Gaussian‑smoothed objective J(m, Σ) = E_{p∼N(m,Σ)}[H(f(zₜ + Vₖp))], where H is the Shannon entropy of the classifier's output distribution f. CMA‑ES minimizes this objective implicitly: it samples coefficient vectors from N(m, Σ) and updates (m, Σ) from the best‑scoring samples, so no gradients through the network are ever required.
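The search loop can be illustrated with a small NumPy sketch. Everything here is a toy stand-in: `W` is an assumed frozen linear classifier head, the shapes are arbitrary, and the sampling loop is a simplified isotropic (μ, λ) evolution strategy rather than full CMA-ES (no covariance adaptation), kept only to show how sampling p ∼ N(m, σ²I) realizes the Gaussian-smoothed entropy objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy stand-ins (assumed shapes, not the paper's actual model)
D, k, C = 64, 4, 10
W = rng.standard_normal((C, D)) * 0.1            # frozen linear head
Vk = np.linalg.qr(rng.standard_normal((D, k)))[0]  # orthonormal basis V_k
z_t = rng.standard_normal(D)                     # latent of one test sample

def entropy(logits):
    """Shannon entropy of the softmax output."""
    e = np.exp(logits - logits.max())
    q = e / e.sum()
    return -(q * np.log(q + 1e-12)).sum()

def loss(p):
    """Entropy of the prediction on the adapted latent z_t + V_k p."""
    return entropy(W @ (z_t + Vk @ p))

# simplified (mu, lambda) Gaussian ES standing in for CMA-ES:
# averaging the loss over samples p ~ N(m, sigma^2 I) is a Monte Carlo
# estimate of the Gaussian-smoothed objective J(m, Sigma)
m, sigma, lam, mu = np.zeros(k), 0.5, 16, 4
best_p, best_f = np.zeros(k), loss(np.zeros(k))
for _ in range(50):
    P = m + sigma * rng.standard_normal((lam, k))  # sample candidates
    f = np.array([loss(p) for p in P])             # forward passes only
    idx = np.argsort(f)
    if f[idx[0]] < best_f:
        best_f, best_p = f[idx[0]], P[idx[0]]
    m = P[idx[:mu]].mean(axis=0)                   # recombine elites
    sigma *= 0.95                                  # shrink search radius
print(best_f <= loss(np.zeros(k)))  # True: entropy never worse than unadapted
```

Note that the loop costs only λ extra forward passes per generation in a k-dimensional space, with no backpropagation or activation buffering, which is what makes the approach viable on edge hardware.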