Yesnt: Are Diffusion Relighting Models Ready for Capture Stage Compositing? A Hybrid Alternative to Bridge the Gap

Reading time: 2 minutes
...

📝 Original Info

  • Title: Yesnt: Are Diffusion Relighting Models Ready for Capture Stage Compositing? A Hybrid Alternative to Bridge the Gap
  • ArXiv ID: 2510.23494
  • Date: 2025-10-27
  • Authors: Not available (author information was not provided in the source paper).

📝 Abstract

Volumetric video relighting is essential for bringing captured performances into virtual worlds, but current approaches struggle to deliver temporally stable, production-ready results. Diffusion-based intrinsic decomposition methods show promise for single frames, yet suffer from stochastic noise and instability when extended to sequences, while video diffusion models remain constrained by memory and scale. We propose a hybrid relighting framework that combines diffusion-derived material priors with temporal regularization and physically motivated rendering. Our method aggregates multiple stochastic estimates of per-frame material properties into temporally consistent shading components, using optical-flow-guided regularization. For indirect effects such as shadows and reflections, we extract a mesh proxy from Gaussian Opacity Fields and render it within a standard graphics pipeline. Experiments on real and synthetic captures show that this hybrid strategy achieves substantially more stable relighting across sequences than diffusion-only baselines, while scaling beyond the clip lengths feasible for video diffusion. These results indicate that hybrid approaches, which balance learned priors with physically grounded constraints, are a practical step toward production-ready volumetric video relighting.
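
The abstract describes two mechanisms: collapsing several stochastic diffusion estimates of per-frame material maps into one robust map, and stabilizing the result over time with optical-flow-guided regularization. Below is a minimal illustrative sketch of those two ideas, not the authors' implementation; the function names (`aggregate_samples`, `warp_by_flow`, `temporally_regularize`), the blending weight `alpha`, the backward-flow convention, and the nearest-neighbor warp are all assumptions made for this example.

```python
# Minimal sketch (NOT the paper's code) of: (1) aggregating N stochastic
# diffusion estimates into one robust per-frame map, and (2) blending with
# the flow-warped previous frame for temporal stability.
import numpy as np

def aggregate_samples(samples: np.ndarray) -> np.ndarray:
    """Collapse N stochastic estimates (N, H, W, C) into one map.

    A per-pixel median suppresses the sample-to-sample noise that makes
    single diffusion estimates flicker across frames.
    """
    return np.median(samples, axis=0)

def warp_by_flow(prev: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Warp the previous frame's map toward the current frame.

    `flow` holds per-pixel (dx, dy) offsets from current to previous
    frame (backward flow), shape (H, W, 2). Nearest-neighbor lookup
    keeps the sketch dependency-free; a real pipeline would use
    bilinear interpolation and occlusion masking.
    """
    h, w = prev.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev[src_y, src_x]

def temporally_regularize(current: np.ndarray, prev: np.ndarray,
                          flow: np.ndarray, alpha: float = 0.8) -> np.ndarray:
    """Blend the current aggregate with the flow-warped previous result.

    `alpha` (an assumed hyperparameter) trades responsiveness to new
    frames against temporal stability.
    """
    warped = warp_by_flow(prev, flow)
    return alpha * warped + (1.0 - alpha) * current

# Toy usage: 4 stochastic albedo estimates for two 64x64 RGB frames.
rng = np.random.default_rng(0)
frame0 = aggregate_samples(rng.random((4, 64, 64, 3)))
frame1_raw = aggregate_samples(rng.random((4, 64, 64, 3)))
flow = np.zeros((64, 64, 2))  # identity flow for this toy example
frame1 = temporally_regularize(frame1_raw, frame0, flow)
```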

📸 Image Gallery

  • 00002.png, 00002_mean.png, 00014.png, 00014_mean.png, 00028.png, 00028_mean.png
  • Cello_syferfontein_6d_clear_3_blendersynth.png, Cello_syferfontein_6d_clear_3_ourssynth.png
  • ChessScene_rogland_clear_night_1_blendersynth.png, ChessScene_rogland_clear_night_1_ourssynth.png
  • Grammophone_umhlanga_sunrise_3_blendersynth.png, Grammophone_umhlanga_sunrise_3_ourssynth.png
  • sample_113_gof.jpg, sample_113_ours.jpg
  • sample_1_close_diff_renderer.jpg, sample_1_close_gof.jpg, sample_1_close_iclight.jpg, sample_1_close_neural_gaffer.jpg, sample_1_close_ours.jpg, sample_1_close_r3dg.jpg
  • sample_1_diff_renderer.jpg, sample_1_gof.jpg, sample_1_iclight.jpg, sample_1_neural_gaffer.jpg, sample_1_ours.jpg, sample_1_r3dg.jpg
  • sample_2_close_diff_renderer.jpg, sample_2_close_gof.jpg, sample_2_close_iclight.jpg, sample_2_close_neural_gaffer.jpg, sample_2_close_ours_new.jpg, sample_2_close_r3dg.jpg
  • sample_2_diff_renderer.jpg, sample_2_gof.jpg, sample_2_iclight.jpg, sample_2_neural_gaffer.jpg, sample_2_ours_new.jpg, sample_2_r3dg.jpg
  • teaser_130.jpg, vram_fig.png

Reference

This content is AI-processed based on open-access arXiv data.
