Recent Advances on HEVC Inter-frame Coding: From Optimization to Implementation and Beyond
High Efficiency Video Coding (HEVC) has doubled the video compression ratio at equivalent subjective quality compared with its predecessor, H.264/AVC. This significant improvement in coding efficiency is attributed to many new techniques. Inter-frame coding is one of the most powerful yet complicated of these techniques, and its high computational burden is a main obstacle to HEVC-based real-time applications. Recently, a great deal of research has been done to optimize inter-frame coding, either to reduce its complexity for real-time applications or to further enhance encoding efficiency. In this paper, we provide a comprehensive review of the state-of-the-art techniques for HEVC inter-frame coding from three aspects: fast inter coding solutions, implementation on different hardware platforms, and advanced inter coding techniques. The algorithms in each aspect are further subdivided into sub-categories and compared in terms of pros, cons, coding efficiency, and coding complexity. To the best of our knowledge, this is the first such comprehensive review of recent advances in inter-frame coding for HEVC, and we hope it will help the improvement, implementation, and application of HEVC as well as the ongoing development of the next-generation video coding standard.
💡 Research Summary
This paper presents a comprehensive survey of recent advances in inter‑frame coding for High Efficiency Video Coding (HEVC), focusing on three major dimensions: fast coding algorithms, hardware implementation techniques, and advanced coding concepts. HEVC achieves roughly a 50 % reduction in bitrate compared with its predecessor H.264/AVC while delivering comparable subjective quality, but this gain comes at the cost of a 4–10× increase in computational complexity and up to 5 000× more processing power required for real‑time operation. The authors dissect the inter‑frame coding pipeline into three core components—CU/PU partitioning, motion estimation (ME) (including integer‑pixel and fractional‑pixel stages), and motion compensation (MC)—and review the state‑of‑the‑art acceleration methods for each.
Fast Inter‑Coding Solutions
The survey categorizes fast algorithms into three families. Top‑down methods follow the standard recursive quadtree traversal but prune the search early based on criteria such as RD‑cost thresholds, Coded Block Flags, SKIP mode detection, or Bayesian probability models. Several works employ machine‑learning classifiers (K‑NN, Markov Random Fields) to predict whether a CU should be split, achieving notable reductions in the number of RD calculations. Bottom‑up approaches start from the smallest CU size and work upward, using depth‑prediction and reverse traversal to prioritize texture‑rich or high‑motion regions; however, they are less effective in smooth areas. Prediction‑based techniques directly infer the optimal CU depth or depth range from spatial‑temporal features of neighboring CUs, motion‑vector variance, edge gradients, or similarity classes. These methods often achieve the highest speed‑up while preserving coding efficiency because they eliminate entire branches of the partition tree before any costly motion search is performed.
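The top-down early-termination idea described above can be illustrated with a toy sketch. It is not taken from the survey or from the HM reference software: the cost model (sum of squared errors around the block mean plus a fixed header cost), the constants LAMBDA and HEADER_BITS, and the threshold are all hypothetical choices made only to show how pruning reduces the number of RD evaluations.

```python
# Toy top-down CU partitioning with RD-cost early termination.
# All constants and the cost model are illustrative assumptions.

LAMBDA = 10.0      # hypothetical Lagrange multiplier
HEADER_BITS = 8    # hypothetical signalling cost per coded CU
MIN_CU = 8         # smallest CU size considered


def unsplit_cost(frame, x, y, size):
    """Proxy RD cost of coding the block as one CU: SSE around the mean + lambda * bits."""
    pixels = [frame[y + j][x + i] for j in range(size) for i in range(size)]
    mean = sum(pixels) / len(pixels)
    distortion = sum((p - mean) ** 2 for p in pixels)
    return distortion + LAMBDA * HEADER_BITS


def partition(frame, x, y, size, threshold=None, stats=None):
    """Return (cost, tree); tree is the CU size for a leaf or a list of four subtrees.

    If `threshold` is given and the unsplit cost is already at or below it,
    the four sub-CUs are never evaluated (early termination).
    """
    if stats is not None:
        stats["rd_checks"] += 1
    cost = unsplit_cost(frame, x, y, size)
    if size == MIN_CU:
        return cost, size
    if threshold is not None and cost <= threshold:
        return cost, size  # prune: skip the whole subtree
    half = size // 2
    split_cost, children = 0.0, []
    for dy in (0, half):
        for dx in (0, half):
            c, t = partition(frame, x + dx, y + dy, half, threshold, stats)
            split_cost += c
            children.append(t)
    if split_cost < cost:
        return split_cost, children
    return cost, size


# A 64x64 block whose top-left quadrant differs from the rest.
frame = [[0 if (x < 32 and y < 32) else 255 for x in range(64)] for y in range(64)]
full, fast = {"rd_checks": 0}, {"rd_checks": 0}
full_result = partition(frame, 0, 0, 64, stats=full)
fast_result = partition(frame, 0, 0, 64, threshold=2 * LAMBDA * HEADER_BITS, stats=fast)
print(full["rd_checks"], fast["rd_checks"], full_result == fast_result)
```

On this input both searches choose the same partition, but the pruned search evaluates far fewer candidate CUs. With a real encoder the threshold would have to be tuned or learned, and an aggressive threshold can cost coding efficiency.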
Hardware Implementations
The authors review implementations on GPUs, FPGAs, ASICs, and multi‑core CPUs. GPU‑based designs exploit massive data parallelism to accelerate integer‑pixel ME and fractional‑pixel interpolation, carefully arranging memory accesses to avoid bandwidth bottlenecks. FPGA solutions focus on pipelined architectures, fixed‑point arithmetic, and on‑chip storage of motion‑vector candidates, achieving low power consumption suitable for embedded devices. ASIC designs integrate dedicated motion‑vector search engines and high‑throughput interpolation filters, enabling real‑time 8K encoding. Across platforms, trade‑offs among latency, power, and area are discussed, and emerging hybrid CPU‑GPU collaborative frameworks and dynamic voltage/frequency scaling (DVFS) techniques are highlighted as ways to improve energy efficiency without sacrificing throughput.
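To make the data-parallelism argument concrete, here is a minimal pure-Python sketch of full-search integer-pixel motion estimation with a sum-of-absolute-differences (SAD) cost. The function names, the block size, and the search range are assumptions for illustration; a real GPU or FPGA design would map each candidate (or each pixel difference) to its own thread or processing element rather than loop over them.

```python
import random


def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between an n x n block in cur and one in ref."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustive integer-pixel search; returns ((dx, dy), sad) of the best candidate."""
    h, w = len(ref), len(ref[0])
    candidates = [
        (dx, dy)
        for dy in range(-search_range, search_range + 1)
        for dx in range(-search_range, search_range + 1)
        if 0 <= bx + dx <= w - n and 0 <= by + dy <= h - n
    ]
    # Each SAD below depends only on its own candidate, so all of them could be
    # computed concurrently; this list comprehension is the parallelizable part.
    costs = [sad(cur, ref, bx, by, bx + dx, by + dy, n) for dx, dy in candidates]
    # Reduction step: pick the lowest cost, preferring shorter vectors on ties.
    best = min(
        range(len(candidates)),
        key=lambda k: (costs[k], abs(candidates[k][0]) + abs(candidates[k][1])),
    )
    return candidates[best], costs[best]


# Synthetic test: the current frame is the reference shifted by a known offset,
# so the block at (8, 8) in `cur` is found at (8 - 3, 8 + 2) in `ref`.
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
cur = [[ref[(y + 2) % 32][(x - 3) % 32] for x in range(32)] for y in range(32)]
mv, cost = full_search(cur, ref, bx=8, by=8, n=8, search_range=4)
print(mv, cost)
```

The split into an independent cost-computation stage followed by a reduction is the structure that parallel implementations typically exploit; memory layout and candidate reuse, which the reviewed designs optimize, are not modelled here.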
Advanced Inter‑Frame Coding Techniques
Beyond acceleration, the paper surveys novel coding tools that extend the basic inter‑frame model. Affine motion compensation introduces rotation and scaling parameters, allowing more accurate prediction of complex object motion. Multiple reference frame selection dynamically adapts the reference set based on content characteristics, reducing search space while maintaining prediction quality. Recent deep‑learning approaches train convolutional or recurrent networks to predict motion vectors and to refine cost functions with perceptual loss terms, surpassing traditional Advanced Motion Vector Prediction (AMVP) in both accuracy and visual quality. Hybrid codecs that combine block‑based prediction with neural‑network‑based frame synthesis are also examined, showing promising gains in bitrate reduction for challenging sequences.
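As an illustration of how affine motion compensation captures rotation and scaling, the sketch below uses one widely used formulation, the four-parameter model driven by two control-point motion vectors at the top-left and top-right corners of a block. The summary does not say which affine model the reviewed works use, so this choice, the function names, and the 4x4 sub-block size are assumptions.

```python
def affine_mv(mv0, mv1, width, x, y):
    """Motion vector at position (x, y) inside a block of the given width.

    mv0 is the control-point vector at the top-left corner (0, 0) and
    mv1 the one at the top-right corner (width, 0).
    """
    a = (mv1[0] - mv0[0]) / width  # combined scaling/rotation term
    b = (mv1[1] - mv0[1]) / width  # combined rotation/scaling term
    return (a * x - b * y + mv0[0], b * x + a * y + mv0[1])


def subblock_mv_field(mv0, mv1, width, height, sub=4):
    """One motion vector per sub x sub sub-block, sampled at the sub-block centre."""
    field = []
    for sy in range(0, height, sub):
        row = []
        for sx in range(0, width, sub):
            row.append(affine_mv(mv0, mv1, width, sx + sub / 2, sy + sub / 2))
        field.append(row)
    return field


# With identical control-point vectors the model degenerates to pure translation.
translation = subblock_mv_field((2, 1), (2, 1), 16, 16)
# With different control-point vectors each sub-block gets its own vector.
rotation_like = subblock_mv_field((2, 1), (4, 3), 16, 16)
print(translation[0][0], rotation_like[0][0], rotation_like[3][3])
```

Deriving one vector per sub-block rather than per pixel keeps the existing block-based interpolation hardware usable, which is one reason sub-block affine prediction is attractive in practice.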
The survey quantitatively compares each technique in terms of coding efficiency (BD‑Rate savings), computational complexity reduction (percentage of original encoding time saved), and implementation feasibility. It also discusses practical considerations such as standard compliance, integration into the reference software (HM), and adaptability to emerging use‑cases like mobile streaming, broadcast contribution, and cloud‑based transcoding.
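The two comparison metrics named above can be sketched as follows. This is a simplified version of the Bjøntegaard delta rate, assuming exactly four rate/PSNR points per curve: it interpolates log10(bitrate) as a cubic in PSNR and averages the gap between the two curves over their common PSNR range. The function names are hypothetical, and published results are normally produced with the standard reporting tools rather than code like this.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _mean_log_rate(psnr, rate, lo, hi):
    """Average of the cubic log10(rate)-vs-PSNR curve over [lo, hi]."""
    log_rate = [math.log10(r) for r in rate]
    mid = (lo + hi) / 2
    f_lo, f_mid, f_hi = (_lagrange(psnr, log_rate, d) for d in (lo, mid, hi))
    # Simpson's rule is exact for cubics, so no finer sampling is needed.
    integral = (hi - lo) / 6 * (f_lo + 4 * f_mid + f_hi)
    return integral / (hi - lo)


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference in percent; negative means the test codec saves bits."""
    if not (len(rate_anchor) == len(psnr_anchor) == len(rate_test) == len(psnr_test) == 4):
        raise ValueError("this sketch assumes four RD points per curve")
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    diff = _mean_log_rate(psnr_test, rate_test, lo, hi) - _mean_log_rate(
        psnr_anchor, rate_anchor, lo, hi
    )
    return (10 ** diff - 1) * 100


def time_saving(t_anchor, t_test):
    """Percentage of the anchor encoding time saved by the fast method."""
    return 100 * (t_anchor - t_test) / t_anchor


# Made-up numbers for illustration only, not results from the survey.
anchor_rate, anchor_psnr = [1000, 2000, 4000, 8000], [32.0, 35.0, 38.0, 41.0]
test_rate = [0.9 * r for r in anchor_rate]  # same quality at 10% fewer bits
print(bd_rate(anchor_rate, anchor_psnr, test_rate, anchor_psnr))
print(time_saving(100.0, 60.0))
```

A fast algorithm is usually judged on both numbers together: a large time saving is only attractive if the accompanying BD-Rate increase stays small.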
In summary, this paper provides the most extensive review to date of HEVC inter‑frame coding research, covering over 200 recent publications. It serves as a valuable roadmap for researchers and engineers seeking to balance the competing demands of high compression efficiency, low latency, and hardware constraints in current and future video coding standards.