Shape-Adaptive Motion Estimation Algorithm for MPEG-4 Video Coding

📝 Original Info

  • Title: Shape-Adaptive Motion Estimation Algorithm for MPEG-4 Video Coding
  • ArXiv ID: 1002.1168
  • Date: 2010-02-08
  • Authors: Researchers from original ArXiv paper

📝 Abstract

This paper presents a gradient-based motion estimation algorithm based on shape-motion prediction, which takes advantage of the correlation between neighboring Binary Alpha Blocks (BABs) to match the MPEG-4 shape coding case and to speed up the estimation process. The PSNR and computation time achieved by the proposed algorithm seem to be better than those obtained by most popular motion estimation techniques.

📄 Full Content

Motion estimation (ME) and compensation are key components of high-quality video compression, and they are characterized by high computational complexity and memory requirements. Motion estimation is considered the most time-consuming stage in MPEG processing [1] (up to 90% of the total execution time [2]). Therefore, to achieve the performance required for real-time applications, it is imperative to consider a hardware architecture and to use a motion estimation algorithm that reduces computational complexity. The best performance, in terms of PSNR, is achieved by exhaustive search (ES) ME algorithms, since they examine all possible motion vectors; however, they increase the computation time and slow down the compression process [3]. Fast search algorithms, such as the 2-D log search scheme [4], the Three Step Search (TSS) [5], the Four Step Search (FSS) [6] and the Diamond Search (DS) [7], have been proposed; all of them try to achieve the same PSNR as ES by considering only the most probable motion vectors.

Many researchers have focused on ME algorithms for texture coding. However, one of the most important concepts introduced by the MPEG-4 visual standard is the use of the video object (VO) as an entity the user can access and manipulate. The instance of a VO at a particular point in time is called a video object plane (VOP) [8]. To support the coding of arbitrarily shaped objects, each position in the picture is associated with a Binary Alpha Block (BAB), and the macroblocks of the image are thus classified as opaque (fully 'inside' the VOP), transparent (not part of the VOP) or on the boundary of the VOP. Therefore, in MPEG-4 video coding, ME of shape is also imperative for real-time VOP-based encoding. Several papers have proposed software implementation methods for shape coders [9], [10], in which shape information is used to reduce the number of search points per macroblock and only valid predecessors are evaluated [10] for boundary macroblocks.

Since a hardware implementation is usually better suited to the complexity constraints of real-time applications, we propose in this document a gradient-based algorithm in which ME for shape coding is combined with ME for texture. We will use it in a hardware implementation of an MPEG-4 encoder IP to accelerate the convergence process. The algorithm uses shape ME for boundary macroblocks and texture ME for opaque macroblocks.
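
The macroblock classification and the choice between shape ME and texture ME described above can be illustrated with a minimal Python sketch. It assumes a BAB is given as a 16x16 list of 0/1 alpha values. The function names `classify_bab` and `select_me_mode` are hypothetical, and skipping transparent macroblocks is an assumption of this sketch rather than something stated in the paper.

```python
def classify_bab(bab):
    """Classify a binary alpha block (rows of 0/1 values).

    Returns 'transparent' if no pixel belongs to the VOP, 'opaque' if
    every pixel does, and 'boundary' otherwise.
    """
    total = sum(len(row) for row in bab)
    inside = sum(1 for row in bab for alpha in row if alpha)
    if inside == 0:
        return "transparent"
    if inside == total:
        return "opaque"
    return "boundary"


def select_me_mode(bab):
    """Pick the estimation mode for the macroblock covered by `bab`.

    Boundary blocks use shape ME and opaque blocks use texture ME, as in
    the text. Mapping transparent blocks to 'skip' is an assumption.
    """
    modes = {
        "transparent": "skip",
        "opaque": "texture_me",
        "boundary": "shape_me",
    }
    return modes[classify_bab(bab)]


if __name__ == "__main__":
    # A block whose left half lies inside the VOP is a boundary block.
    half = [[1] * 8 + [0] * 8 for _ in range(16)]
    print(classify_bab(half), select_me_mode(half))
```

Keeping the classification separate from the mode selection lets the same test drive either a software dispatcher or, in a hardware setting, the enable signals of the two estimation paths.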

To check its performance, we implemented the proposed algorithm and tested it on many test video sequences. The results show that the algorithm achieves a good PSNR with a net decrease in the number of iterations and in computation time. The next section presents background information about video coding and motion estimation; the main idea of the proposed algorithm is described in Section 3, and the evaluation of the results is presented in Section 4.

In video compression, the goal is to remove the redundancy in images and reduce the number of bits required to represent the video sequence. In addition to the discrete cosine transform (DCT) and the quantization block used to remove spatial redundancy, a typical MPEG encoder uses a motion estimation (ME) and compensation system to remove temporal redundancy between successive frames of the video. In block-based video coding standards such as MPEG-4, the first video encoding stage performs motion estimation and compensation for each frame of the video sequence. In this step, we compare the content of the current and previous images and encode only the displaced difference blocks, together with their motion vectors, instead of encoding all original blocks. Conventional algorithms generally use matching-based or gradient-based techniques to compute motion vectors.

Matching-based techniques: in these approaches, true motion vectors are determined from the differences of pixel intensities. The best match is the one with the smallest differences between the pixel intensities of the current and reference frames.

Gradient-based techniques: in these approaches, which rely on the assumption of "intensity conservation over time", the spatiotemporal derivatives of pixel intensities are measured to determine true motion vectors. The total derivative of the image intensity function (I) should be zero at every time instant and at each position in the image.
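
In its standard form, this intensity-conservation constraint (the optical-flow equation) reads as follows. The symbols $v_x$ and $v_y$ for the horizontal and vertical velocity components are notation assumed here, not taken from the paper.

$$\frac{dI}{dt} = \frac{\partial I}{\partial x}\, v_x + \frac{\partial I}{\partial y}\, v_y + \frac{\partial I}{\partial t} = 0$$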

In the search process, the problem is to find the motion vector MV for the current block B(y, x) in the current frame such that the error, measured as the sum of absolute differences (SAD) between block B and the matching block C in the reference frame, is minimized.
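
As a point of reference for the SAD criterion, the following Python sketch computes the SAD of a candidate displacement and runs an exhaustive search over a small window. It is the ES baseline mentioned earlier, not the proposed gradient-based method. The function names, the list-of-lists frame layout and the convention that the motion vector (dy, dx) points from the current block position to the matching block in the reference frame are all assumptions made for illustration.

```python
def sad(cur, ref, by, bx, dy, dx, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (by, bx) and the block of `ref` displaced by (dy, dx)."""
    return sum(
        abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
        for i in range(n)
        for j in range(n)
    )


def full_search(cur, ref, by, bx, n, search_range):
    """Exhaustive search: try every displacement in the window
    [-search_range, search_range]^2 and keep the one with minimal SAD."""
    height, width = len(ref), len(ref[0])
    best_mv = (0, 0)
    best_cost = sad(cur, ref, by, bx, 0, 0, n)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = by + dy, bx + dx
            # Skip candidates that fall partly outside the reference frame.
            if y < 0 or x < 0 or y + n > height or x + n > width:
                continue
            cost = sad(cur, ref, by, bx, dy, dx, n)
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


if __name__ == "__main__":
    # Synthetic 8x8 frames: the current frame is the reference frame
    # sampled one row down and two columns right.
    ref = [[10 * y * y + x * x for x in range(8)] for y in range(8)]
    cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
            for x in range(8)] for y in range(8)]
    print(full_search(cur, ref, 2, 2, 4, 2))
```

The window of (2 * search_range + 1)^2 candidates per block is what makes exhaustive search expensive, and it is the cost that fast search and gradient-based schemes try to avoid.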

For commonly used motion estimation algorithms, there is no limit on the number of steps that the search algorithm can take. We therefore exploit the optical-flow principle and use recursive motion estimation, which is a less complex method for computing dense displacement fields [10]. The proposed algorithm can be divided into two main steps, as shown in Fig. 2: the first step is a Block recur

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
