Hi-Light: A Path to High-Fidelity, High-Resolution Video Relighting with a Novel Evaluation Paradigm
Video relighting offers immense creative potential and commercial value, but it is hindered by several challenges: the absence of an adequate evaluation metric, severe light flickering, and the degradation of fine-grained details during editing. To overcome these challenges, we introduce Hi-Light, a novel, training-free framework for high-fidelity, high-resolution, robust video relighting. Our approach introduces three technical innovations: a lightness-prior-anchored guided relighting diffusion that stabilizes the intermediate relit video, a Hybrid Motion-Adaptive Lighting Smoothing Filter that leverages optical flow to ensure temporal stability without introducing motion blur, and a LAB-based Detail Fusion module that preserves high-frequency detail from the original video. Furthermore, to address the critical gap in evaluation, we propose the Light Stability Score, the first quantitative metric designed specifically to measure lighting consistency. Extensive experiments demonstrate that Hi-Light significantly outperforms state-of-the-art methods in both qualitative and quantitative comparisons, producing stable, highly detailed relit videos.
💡 Research Summary
Hi‑Light tackles two long‑standing challenges in video relighting (temporal lighting flicker and loss of high‑frequency detail) by introducing a training‑free, backbone‑agnostic pipeline and a novel evaluation metric. The method first down‑samples the input video to 480p and runs a state‑of‑the‑art image relighting diffusion model (e.g., IC‑Light) on each frame. To prevent the diffusion process from producing oscillating luminance, a lightness‑prior anchor is injected at every diffusion step: the L channel of the original low‑resolution video is high‑pass filtered, producing a static lightness residual ΔL that is added with a fixed weight γ. This anchors the L channel across time, dramatically reducing frame‑to‑frame brightness variance without altering overall exposure.
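As a rough illustration, the anchoring step might look like the following sketch (Python/OpenCV). The paper injects the residual at every diffusion step; here it is applied in image space as an approximation, and the blur kernel size and weight γ are illustrative placeholders rather than values from the paper:

```python
import cv2
import numpy as np

def lightness_prior_residual(frame_bgr: np.ndarray, blur_ksize: int = 31) -> np.ndarray:
    """High-pass filter the L channel of the original frame: L minus its
    Gaussian low-pass version gives the static lightness residual delta_L.
    (blur_ksize = 31 is a hypothetical choice, not from the paper.)"""
    lab = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    L = lab[..., 0]
    return L - cv2.GaussianBlur(L, (blur_ksize, blur_ksize), 0)

def anchor_lightness(relit_bgr: np.ndarray, delta_L: np.ndarray, gamma: float = 0.3) -> np.ndarray:
    """Inject the residual into the relit frame's L channel with a fixed weight.
    (gamma = 0.3 is a hypothetical value; the paper uses a fixed but unspecified weight.)"""
    lab = cv2.cvtColor(relit_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    lab[..., 0] = np.clip(lab[..., 0] + gamma * delta_L, 0, 255)
    return cv2.cvtColor(lab.astype(np.uint8), cv2.COLOR_LAB2BGR)
```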
After the guided diffusion loop, the intermediate relit video still suffers from flicker and blurred details. Hi‑Light therefore applies the Hybrid Motion‑Adaptive Lighting Smoothing Filter (HMA‑LSF). Optical flow (Farneback) estimates pixel‑wise motion between consecutive frames. The previous smoothed frame is warped according to this flow, aligning moving objects before blending with the current frame. An adaptive blending weight α, inversely proportional to motion magnitude, ensures that fast motion relies more on the current frame, avoiding ghosting, while static regions benefit from temporal averaging. A short frame window further stabilizes the estimate. In parallel, a bilateral filter preserves edges while smoothing illumination, giving the filter its “hybrid” character.
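A minimal sketch of such a filter, assuming OpenCV's Farneback flow and bilateral filter. The blend formula α = α₀ / (1 + s·‖flow‖) and all parameter values are assumptions, and the sketch keeps a recursive one-frame history rather than the paper's short frame window:

```python
import cv2
import numpy as np

def hma_lsf(frames, base_alpha=0.8, motion_scale=0.5, d=9, sigma_color=75, sigma_space=75):
    """Hybrid Motion-Adaptive Lighting Smoothing Filter (sketch).
    frames: list of uint8 BGR frames from the intermediate relit video."""
    smoothed = [frames[0].astype(np.float32)]
    prev_gray = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    h, w = prev_gray.shape
    grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                                 np.arange(h, dtype=np.float32))
    for frame in frames[1:]:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        # Warp the previous smoothed frame onto the current frame
        # (backward mapping: sample prev at p - flow(p)).
        map_x = grid_x - flow[..., 0]
        map_y = grid_y - flow[..., 1]
        warped_prev = cv2.remap(smoothed[-1], map_x, map_y, cv2.INTER_LINEAR)
        # Adaptive weight: large motion -> trust the current frame (avoid ghosting);
        # static regions -> lean on the warped temporal average.
        motion_mag = np.linalg.norm(flow, axis=-1, keepdims=True)
        alpha = base_alpha / (1.0 + motion_scale * motion_mag)
        blended = alpha * warped_prev + (1.0 - alpha) * frame.astype(np.float32)
        # Edge-preserving spatial smoothing of illumination: the "hybrid" part.
        blended = cv2.bilateralFilter(blended.astype(np.uint8),
                                      d, sigma_color, sigma_space).astype(np.float32)
        smoothed.append(blended)
        prev_gray = gray
    return [f.astype(np.uint8) for f in smoothed]
```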
Finally, the LAB‑Detail‑Preserving Fusion (LAB‑DF) module transfers the stabilized lighting to the original high‑resolution video. The L channel of the smoothed low‑resolution relit video supplies the new illumination, while the A and B channels (and high‑frequency texture) are taken directly from the high‑resolution source. This decouples color and texture from lighting, preventing the detail degradation typical of diffusion‑based approaches.
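In code, the fusion could be sketched as below (Python/OpenCV). The Gaussian low/high-frequency split of the L channel is an assumed decomposition standing in for the module's texture transfer:

```python
import cv2
import numpy as np

def lab_detail_fusion(relit_lowres_bgr, source_highres_bgr, sigma=3.0):
    """Transfer illumination (L) from the smoothed low-res relit frame onto the
    high-res source, keeping the source's A/B color and high-frequency texture."""
    h, w = source_highres_bgr.shape[:2]
    relit_up = cv2.resize(relit_lowres_bgr, (w, h), interpolation=cv2.INTER_CUBIC)
    lab_relit = cv2.cvtColor(relit_up, cv2.COLOR_BGR2LAB).astype(np.float32)
    lab_src = cv2.cvtColor(source_highres_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    # New L = low-frequency lighting from the relit frame plus high-frequency
    # texture from the source (this exact split is an assumption, not the paper's).
    L_light = cv2.GaussianBlur(lab_relit[..., 0], (0, 0), sigma)
    L_detail = lab_src[..., 0] - cv2.GaussianBlur(lab_src[..., 0], (0, 0), sigma)
    fused = lab_src.copy()  # A and B come straight from the high-res source
    fused[..., 0] = np.clip(L_light + L_detail, 0, 255)
    return cv2.cvtColor(fused.astype(np.uint8), cv2.COLOR_LAB2BGR)
```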
For evaluation, the authors propose the Light Stability Score (LSS), the first metric dedicated to quantifying lighting consistency. Each frame is converted to grayscale, and a brightness threshold τ isolates the set of bright pixels Pₜ. Three time series are derived: the average intensity of bright pixels Iₜ, the count of bright pixels Cₜ, and the first derivative of Iₜ (Ĩₜ). For each series, the mean absolute change M is normalized by the series' peak‑to‑peak range R, yielding an unsmoothness measure Uₙₒᵣₘ = M/R. An exponential decay maps Uₙₒᵣₘ to a smoothness score S ∈ (0, 1], so that steadier series score closer to 1; the per‑series scores are then combined into the final LSS.
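A sketch of the metric under these definitions; the threshold τ, decay rate k, and the averaging of the three per-series scores are assumptions:

```python
import cv2
import numpy as np

def light_stability_score(frames, tau=200, k=5.0):
    """Light Stability Score (sketch). Higher = steadier lighting.
    tau and k are hypothetical values; the aggregation by mean is an assumption."""
    I, C = [], []
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
        bright = gray > tau                      # bright-pixel set P_t
        C.append(float(bright.sum()))            # count of bright pixels C_t
        I.append(float(gray[bright].mean()) if bright.any() else 0.0)  # intensity I_t
    I, C = np.asarray(I), np.asarray(C)
    dI = np.diff(I)                              # first derivative of I_t

    def smoothness(series):
        M = np.abs(np.diff(series)).mean()       # mean absolute change
        R = series.max() - series.min()          # peak-to-peak range
        U = M / R if R > 0 else 0.0              # normalized unsmoothness
        return float(np.exp(-k * U))             # exponential decay -> S in (0, 1]

    return float(np.mean([smoothness(s) for s in (I, C, dI)]))
```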