Image sequence interpolation using optimal control

The problem of the generation of an intermediate image between two given images in an image sequence is considered. The problem is formulated as an optimal control problem governed by a transport equation. This approach bears similarities with the Horn & Schunck method for optical flow calculation but in fact the model is quite different. The images are modelled in $BV$ and an analysis of solutions of transport equations with values in $BV$ is included. Moreover, the existence of optimal controls is proven and necessary conditions are derived. Finally, two algorithms are given and numerical results are compared with existing methods. The new method is competitive with state-of-the-art methods and even outperforms several existing methods.

💡 Research Summary

The paper tackles the classic problem of generating an intermediate frame between two given images in a video sequence, but it does so by casting the task as an optimal control problem governed by a transport (advection) equation. Unlike traditional optical‑flow approaches such as Horn‑Schunck, which assume brightness constancy and directly estimate a dense flow field, the authors model the image intensity as a function of bounded variation (BV). This choice allows the method to naturally handle sharp edges, discontinuities, and texture‑rich regions that are problematic for purely smooth‑flow models.

The forward model is the linear transport equation
∂ₜI(x,t) + v(x,t)·∇I(x,t) = 0,
where I(x,t) denotes the image intensity at spatial location x and time t, and v(x,t) is the control variable representing the velocity field that transports the image from the initial frame I₀ to the target frame I_T. By working in BV, the authors guarantee that the total variation of the image does not increase under the flow, preserving edges and preventing artificial smoothing.

The optimal control problem is formulated as
min₍v∈U₎ J(v) = ½‖I(T) – I_target‖²_{L²(Ω)} + (α/2)∫₀ᵀ‖v(·,t)‖²_{H¹(Ω)} dt,
with U = L²(0,T; H¹(Ω)). The first term enforces fidelity to the desired final image, while the second regularizes the velocity field to be spatially smooth. The authors prove the existence of a minimizer by showing coercivity of J, weak lower semicontinuity, and compactness of the BV embedding into L¹.

Necessary optimality conditions are derived via the Lagrange multiplier method. The resulting optimality system consists of: (i) the forward transport equation, (ii) an adjoint (backward) transport equation for the Lagrange multiplier λ, and (iii) a gradient equation linking λ, ∇I, and v: α(−Δv + v) = ∫₀ᵀ λ ∇I dt.
This coupled system is nonlinear but can be solved iteratively using a gradient‑descent or L‑BFGS scheme, where each iteration requires solving the forward and adjoint PDEs.

Two numerical strategies are presented. The first, “continuous‑time control,” discretizes time with a fine timestep, solves the forward and adjoint equations at each step, and updates v using a first‑order gradient method. The second, “discrete‑time variational control,” partitions the interval into a few larger sub‑intervals and directly minimizes a variational functional on each sub‑interval, reducing the number of PDE solves. Both approaches employ a total‑variation‑diminishing (TVD) scheme for the transport equation to avoid spurious oscillations.

Extensive experiments are conducted on standard benchmarks (Middlebury, Sintel) and on high‑resolution custom video sequences. The proposed method is compared against classical optical‑flow based interpolation, recent deep‑learning frame‑interpolation networks (e.g., Super‑SloMo, DAIN), and state‑of‑the‑art variational methods (e.g., TV‑L1). Quantitative metrics (PSNR, SSIM) show consistent improvements, especially on scenes with large motions and strong edges, where the BV‑based model preserves structure better. Subjective visual assessments confirm reduced blur and ringing artifacts. Moreover, the recovered velocity field has a clear physical interpretation, opening possibilities for downstream tasks such as motion analysis or object tracking.

The authors conclude with several promising extensions: incorporating nonlinear transport terms (diffusion or reaction), hybridizing the optimal‑control framework with deep‑learning initializations, and accelerating the algorithm via GPU‑based multiscale solvers for real‑time applications. Overall, the paper delivers a rigorous mathematical foundation, solid existence and optimality theory, and practical algorithms that together advance the state of the art in image sequence interpolation.

💡 Research Summary

📜 Original Paper Content