LION-DG: Layer-Informed Initialization with Deep Gradient Protocols for Accelerated Neural Network Training


๐Ÿ“ Original Info

  • Title: LION-DG: Layer-Informed Initialization with Deep Gradient Protocols for Accelerated Neural Network Training
  • ArXiv ID: 2601.02105
  • Date: 2026-01-05
  • Authors: Hyunjun Kim

๐Ÿ“ Abstract

Weight initialization remains decisive for neural network optimization, yet existing methods are largely layer-agnostic. We study initialization for deeply-supervised architectures with auxiliary classifiers, where untrained auxiliary heads can destabilize early training through gradient interference. We propose LION-DG, a layer-informed initialization that zero-initializes auxiliary classifier heads while applying standard He initialization to the backbone. We prove that this implements Gradient Awakening: auxiliary gradients are exactly zero at initialization, then phase in naturally as the weights grow, providing an implicit warmup without hyperparameters. Experiments on CIFAR-10 and CIFAR-100 with DenseNet-DS and ResNet-DS architectures demonstrate:

  • DenseNet-DS: +8.3% faster convergence on CIFAR-10 with comparable accuracy
  • Hybrid approach: combining LSUV with LION-DG achieves the best accuracy (81.92% on CIFAR-10)
  • ResNet-DS: positive speedup on CIFAR-100 (+11.3%) with a side-tap auxiliary design

We identify architecture-specific trade-offs and provide clear guidelines for practitioners. LION-DG is simple, requires zero hyperparameters, and adds no computational overhead.
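
The recipe described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration based only on the abstract, not the paper's released code: the module layout (`backbone`, `aux_heads`) and the assumption that each auxiliary head ends in a single linear classifier are hypothetical.

```python
# Minimal sketch of LION-DG-style initialization, assuming each auxiliary
# head ends in a linear classifier. Module names are illustrative only.
import torch.nn as nn


def lion_dg_init(backbone: nn.Module, aux_heads: nn.ModuleList) -> None:
    # Standard He (Kaiming) initialization for the backbone.
    for m in backbone.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
            if m.bias is not None:
                nn.init.zeros_(m.bias)
        elif isinstance(m, nn.BatchNorm2d):
            nn.init.ones_(m.weight)
            nn.init.zeros_(m.bias)

    # Zero-initialize each auxiliary classifier head: with zero weights and
    # zero bias its logits are zero, so the gradient it sends back into the
    # backbone is exactly zero at step 0.
    for head in aux_heads:
        for m in head.modules():
            if isinstance(m, nn.Linear):
                nn.init.zeros_(m.weight)
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
```

The "Gradient Awakening" behaviour follows directly from this setup: for a zeroed head with logits z = Wx + b, the signal flowing back into the backbone is Wᵀ ∂L/∂z = 0, while ∂L/∂W = (∂L/∂z) xᵀ is generally nonzero, so W grows during training and the auxiliary gradient phases in without any explicit warmup schedule.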

Full Content

...(๋ณธ๋ฌธ ๋‚ด์šฉ์ด ๊ธธ์–ด ์ƒ๋žต๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ์‚ฌ์ดํŠธ์—์„œ ์ „๋ฌธ์„ ํ™•์ธํ•ด ์ฃผ์„ธ์š”.)
