An optimal mode selection algorithm for scalable video coding

L. Balaji
Department of ECE, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai 600062, India
Email: maildhanabal@gmail.com

K.K. Thyagharajan
Department of ECE, RMD Engineering College, Chennai 601206, India
Email: kkthyagharajan@yahoo.com

C. Raja
Department of ECE, Vignan’s Foundation for Science Technology and Research, Vadlamudi, Guntur Dt., Andhra Pradesh, 522213, India
Email: rajachandru82@yahoo.co.in

A. Dhanalakshmi
Department of EIE, Panimalar Engineering College, Chennai 600123, India
Email: dhanalakshmi248@gmail.com

Abstract: Scalable video coding (SVC) extends its predecessor, advanced video coding (AVC), to allow flexible transmission to all types of gadgets. SVC is more flexible and scalable than AVC, but it is also computationally more complex. The traditional full search method in the standard H.264 SVC consumes considerable encoding time. Many fast mode decision (FMD) algorithms have been developed to reduce this computational complexity, but many fail to balance all three measures: peak signal to noise ratio (PSNR), encoding time and bit rate. In this paper, the proposed optimal mode selection algorithm, based on the orientation of pixels, achieves better time saving, good PSNR and coding efficiency. Compared with the standard H.264 JSVM reference software, the proposed algorithm achieves 57.44% time saving, a 0.43 dB increment in PSNR and 0.23% compression in bit rate.

Keywords: scalable video coding; SVC; computation; mode selection; peak signal to noise ratio; PSNR; time; bit rate.

1 Introduction

The H.264 scalable video coding (H.264/SVC) standard creates a bit stream by encoding a video signal once and allows various sub-streams with different bit rates and resolutions to be extracted from the same stream.
The sub-stream can be scaled down repeatedly until it reaches its smallest description, called the base layer (BL) stream. The BL is the lowest layer and has the lowest resolution, the lowest frame rate and the lowest quality of content (Schwarz et al., 2007). This layer is backward compatible with H.264 advanced video coding (H.264/AVC) and can be decoded independently. The video data can be enhanced to a higher resolution or a higher frame rate by adding bit streams of higher layers, called enhancement layers (ELs). Determining or predicting the EL data from the lower layer (reference layer) at the same instant of time is known as the inter-layer prediction process. This allows the video to have various spatial resolutions, which is called spatial scalability.

SVC supports three scalabilities: spatial, temporal and quality. In a multi-layer spatial scalability approach, each layer has a distinct size and is referred to by an identifier D, called the dependency identifier (Segall and Sullivan, 2007). In SVC, two inter-layer prediction concepts (Schwarz et al., 2004) have been introduced:

1. macroblock (MB) mode prediction and associated motion parameter estimation
2. residual signal prediction.

The offset between the location of the current (target) MB and the best-matched block in the reference MB is called the motion vector (MV). The offset between the location of the current block and the predicted block in the reference frame is called the predicted motion vector (PMV). Spatial correlation is used to predict the PMV. The difference between the PMV and the MV of the MB is called the motion vector difference (MVD). If the PMV is predicted accurately, a small MVD will be obtained.

Temporal scalability in H.264/SVC is achieved by determining the content of a frame at a higher temporal level from the frames at lower levels.
The number of frames played in a given interval of time (the temporal resolution) can be changed by applying temporal scalability. An I or P frame in a video is referred to as a key picture or key frame. This type of frame is the first frame of a group of pictures (GOP). A GOP is the set of pictures between two key frames, and it contains 2, 4, 8, 16 or 32 frames depending on the number of layers used for temporal scalability. The number of frames in a GOP may be given as 2^N, where N is the number of layers and its value is 0 for the BL (Lee and Kim, 2012). Figure 1 shows the temporal scalability levels in a spatial base layer and a spatial EL. Each spatial layer, either base or enhancement, includes the corresponding temporal scalability levels chosen by the coder. For a GOP size of 8, there are four temporal scalability levels (TSL0, TSL1, TSL2 and TSL3) in each spatial base and spatial EL, where N ranges from 0 to 3 (N = 3). The key frame (I or P) is always at TSL0 and is used to code the upper-level (TSL1, TSL2 and TSL3) temporal frames. Figure 1 shows that only I or P frames are used at TSL0, while B frames are used at TSL1, TSL2 and TSL3 in both the spatial base and spatial ELs. Since four temporal levels are included, the total number of frames in the spatial base and spatial enhancement layers is 8 (2^3 = 8).

Figure 1 Temporal scalability levels in a spatial base and a spatial EL (see online version for colours)

Quality scalability is the feature enhancement for pictures of the same spatio-temporal resolution. An adaptive distortion-based intra-rate estimation algorithm was proposed in Yan and Wang (2009) for improving the video quality of H.264/AVC. An improved context-based adaptive variable length coding scheme was proposed in Heo and Ho (2010), which modifies the relative entropy coding parts in H.264/AVC to achieve lossless intra-coding.
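For the temporal scalability structure described earlier in this section, the relation between GOP size and temporal levels can be sketched as follows. This is only an illustration: it assumes the usual dyadic hierarchical B-picture arrangement, and the function names `gop_size` and `temporal_level` are hypothetical helpers, not part of JSVM.

```python
from collections import Counter


def gop_size(n: int) -> int:
    """Number of frames in a GOP with top temporal level n (2^n)."""
    return 2 ** n


def temporal_level(frame_index: int, n: int) -> int:
    """Temporal scalability level (TSL) of a frame in a dyadic GOP.

    Assumption: frames are counted from 1 inside the GOP and the key
    picture (I or P) is the frame whose index is a multiple of the GOP size.
    """
    pos = frame_index % gop_size(n)
    if pos == 0:
        return 0  # key picture sits at TSL0
    # Count trailing zero bits: the more often pos divides by 2,
    # the lower (coarser) the temporal level of that B frame.
    trailing_zeros = (pos & -pos).bit_length() - 1
    return n - trailing_zeros


levels = [temporal_level(i, 3) for i in range(1, gop_size(3) + 1)]
print(levels)                            # [3, 2, 3, 1, 3, 2, 3, 0]
print(sorted(Counter(levels).items()))   # [(0, 1), (1, 1), (2, 2), (3, 4)]
```

For N = 3 the sketch yields one frame at each of TSL0 and TSL1, two at TSL2 and four at TSL3, which adds up to the GOP size of 8 stated above.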
Kau and Leng (2015) proposed a simple gradient evaluation method that evaluates the texture orientation inside the coding block to speed up the encoding of H.264/AVC intra-prediction modes. In Thyagharajan and Ramachandran (2006), methods were suggested to analyse the quality of video streaming created by a sequence of intra-coded and inter-coded frames. These works improve the coding efficiency but increase the computational complexity. The scalabilities should be achieved by balancing both decoder complexity and coding efficiency.

The computational complexity in H.264/SVC depends on the methods used to decide the search range and modes for MBs, and also on the methods used for motion estimation. Conventional JSVM uses the full search method: it calculates the rate distortion cost (RDC) for all possible modes and chooses the mode with minimal RDC as the best mode. This method, however, consumes more encoder time and increases computational complexity. To reduce encoder time and complexity, various fast mode decision (FMD) algorithms have been proposed for SVC; these are discussed in Section 2. The proposed mode selection algorithm, which reduces the encoder time without sacrificing peak signal to noise ratio (PSNR) or bit rate, is discussed in Section 3. Section 4 provides the experimental results, followed by the conclusion in Section 5.

2 Related work

Ri et al. (2009) classify the existing low-complexity algorithms for inter-mode decision as:

• rate distortion (RD) estimation-based algorithms
• rate distortion optimisation (RDO)-based algorithms
• non-rate distortion optimisation (non-RDO) algorithms.

The RD estimation-based algorithms estimate the rate and distortion with quantised coefficients of the discrete cosine transform (DCT) used for coding.
The distortion is proportional to the quantisation error (Tu et al., 2006; Ichigaya et al., 2006) and the rate is related to the number and sum of non-zero quantised DCT coefficients. The RDO-based algorithms use statistical relationships between layers and modes to predict the best mode. The non-RDO-based algorithms use features such as texture and edge information to select the optimal mode.

Nine prediction modes are used in H.264/AVC for its single layer coding. H.264/SVC, however, includes seven macroblock modes for inter-prediction (16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8 and 4 × 4), 13 prediction modes for intra-coding and a SKIP mode. Inter-mode decision requires the estimation of MVs for all possible block types for each MB, so optimising the mode decision reduces the computational complexity. BL prediction mode and quarter pixel refinement mode have been added for encoding the EL. In Yu et al. (2008), an algorithm has been proposed for mode selection of inter-frame coding based on Lagrangian cost. Computation complexity reduction methods either reduce the number of modes or reduce the search range for motion estimation.

In Li et al. (2006), an FMD algorithm for spatial and temporal SVC is presented. In this method, instead of choosing the mode that has minimum RDC, the redundant modes of the EL are minimised based on the relation between the BL and EL. The duplicate modes at the higher layer are removed using the mode information available at the lower layer. This algorithm attains good PSNR, but increases the encoding time due to the full search method used for the BL. A layer adaptive mode decision algorithm and a motion search scheme for hierarchical B-frames have been proposed in Lin et al. (2010). In this method the RDC is estimated for different modes in the ELs and mode conditional probabilities are calculated for different temporal layers.
Inter-layer prediction is used for the EL if the quantisation parameter (QP) is less than 33; otherwise exhaustive searches are used for mode selection. In this type of adaptive mode selection, the computational complexity for coding the ELs is remarkably reduced, but the bit rate increases by 1% and the average Y-PSNR loss increases by 0.05 dB. An FMD algorithm presented in Yeh et al. (2010) predicts the mode for the EL by statistical analysis. Using Bayes' theorem, it confirms whether the mode is the best or not and refines the decision using a Markov process. This method, however, degrades the PSNR and increases the bit rate. The algorithm proposed in Kim et al. (2009) uses the modes of a co-located MB and its neighbouring MBs at the BL to predict the mode of each MB at the EL.

H.264/SVC distributes all hierarchical B-pictures within two consecutive key pictures at different temporal scalability levels to achieve various temporal resolutions. The inter-prediction methods used for B-pictures increase the computational complexity. To reduce this complexity, a fast inter-prediction mode decision method is proposed in Lee and Kim (2012). In this method, the pixel values of the current block to be encoded are compared with those of a motion compensated reference block using statistical analysis. In our former work (Balaji and Thyagharajan, 2014), a set of modes based on probability is built for the BL and the mode selection for the EL is obtained using the correlation between the frames. In Balaji and Thyagharajan (2015), an FMD algorithm based on a likelihood model identifies the prime mode for all types of sequences. Although this algorithm identifies the modes faster, it lags in terms of PSNR and bit rate due to the use of the full search method. In Liu et al. (2009), the MBs are classified into five activity classes using MVs for efficient mode detection.
Frame sequences that contain slow motion or uniform motion tend to have more SKIP mode macroblocks (Grecos and Yang, 2005). A few algorithms (Dhanalakshmi et al., 2019; Wang et al., 2019) have been proposed to enhance the performance of the scalable extension of HEVC (SHVC). In Dhanalakshmi et al. (2019), a superior step search algorithm is introduced to enhance the coding efficiency and PSNR without much increase in the computational complexity, while Wang et al. (2019) proposed early termination of the mode decision process in SHVC using the depth level probabilities for the coding tree unit (CTU).

The analysis shows that no existing algorithm finds the best mode without loss of either PSNR or bit rate. The algorithms already available reduce the number of modes but do not improve the matching criteria. To improve the matching criteria even for noisy videos and to enhance the mode selection with reduced complexity, a novel macroblock pre-classification algorithm is proposed. The algorithm proposed in this paper shows significant improvement in mode decision, PSNR and bit rate as compared to JSVM and reduces the computational complexity. We have also compared our algorithm with the FMD algorithms already proposed.

3 Proposed optimal mode selection algorithm

Generally, to encode an MB, called the current MB or target MB, a close match of that MB is searched for in a reference frame. The reference frame is in the previous layer if inter-coded spatial scalability is required, or in the previous temporal scalability level (TSL); in the case of intra-coding, the match may even be within the same frame. The search can be either on macroblock boundaries or on pixel boundaries. The current MB to be encoded is compared with the reference MB by estimating the sum of squared differences of the corresponding pixels. This sum is called the error or residue. Noise in the video will increase this error.
Positive and negative noise will result in cumulative error even if the noise has zero mean, as noted in our previous work (Balaji and Thyagharajan, 2017). To diminish the effect of noise and reduce the computational complexity, it is proposed to calculate the sum of differences (SOD) between the corresponding pixels as shown in equation (1), where MB_c(i, j) is the current MB to be encoded and MB_r(i, j) is the MB in the reference frame that is checked for a match.

The motion of any object is closely related to the center of gravity (COG) of that object. If an MB contains an object, the movement of the object affects the COG of that MB. If the COG of the current MB is denoted as (GX_c, GY_c) and the COG of the reference MB is denoted as (GX_r, GY_r), then the movement of the COG in the horizontal direction (X_d) can be given as shown in equation (4). Since the horizontal movement of an object in an MB is represented by X_d and the vertical movement is represented by Y_d, the change (or movement) in the COG from the reference MB to the target MB can be given as shown in equation (8), where DCOG is the distance from the COG of the reference MB to the COG of the target MB.

DCOG = \sqrt{X_d^2 + Y_d^2}    (8)

This DCOG is used to classify the MBs as discussed below.

Table 1 Classification of macroblocks

MB’s class   Nature of the video                                      Remarks
C1           Stationary background OR static foreground              SOD and DCOG are small, SKIP mode
C2           Uniform motion foreground OR smooth motion background   SOD and DCOG values are larger than the previous case (C1)
C3           Non-uniform slow motion background AND foreground       SOD and DCOG are medium
C4           Fast OR complex motion                                  SOD and DCOG are large

In the adaptive mode selection strategy, a subset of intra- or inter-prediction modes is sent to the RDO process.
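The two measures defined in this section, SOD and DCOG, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The SOD is taken as the signed sum of pixel differences, following the wording around equation (1). The COG is assumed to be the intensity-weighted centroid of the block, since its defining equations are not reproduced in this text. All function names are hypothetical.

```python
import math


def sum_of_differences(mb_cur, mb_ref):
    """Signed sum of pixel differences (assumed form of equation (1))."""
    return sum(c - r
               for row_c, row_r in zip(mb_cur, mb_ref)
               for c, r in zip(row_c, row_r))


def sum_of_squared_differences(mb_cur, mb_ref):
    """Conventional matching error, shown only for comparison."""
    return sum((c - r) ** 2
               for row_c, row_r in zip(mb_cur, mb_ref)
               for c, r in zip(row_c, row_r))


def center_of_gravity(mb):
    """Intensity-weighted centroid (GX, GY); an ASSUMED definition of COG."""
    total = sum(sum(row) for row in mb)
    if total == 0:
        # All-zero block: fall back to the geometric centre.
        return ((len(mb[0]) - 1) / 2, (len(mb) - 1) / 2)
    gx = sum(x * v for row in mb for x, v in enumerate(row)) / total
    gy = sum(y * v for y, row in enumerate(mb) for v in row) / total
    return gx, gy


def dcog(mb_cur, mb_ref):
    """Distance between the two COGs, as in equation (8)."""
    gx_c, gy_c = center_of_gravity(mb_cur)
    gx_r, gy_r = center_of_gravity(mb_ref)
    x_d = gx_c - gx_r  # horizontal movement
    y_d = gy_c - gy_r  # vertical movement
    return math.hypot(x_d, y_d)


def block_with_dot(size, x, y, value=10):
    """Toy block: all zeros except one bright pixel at (x, y)."""
    mb = [[0] * size for _ in range(size)]
    mb[y][x] = value
    return mb


ref = block_with_dot(8, 0, 0)
cur = block_with_dot(8, 3, 4)
print(sum_of_differences(cur, ref))          # 0
print(sum_of_squared_differences(cur, ref))  # 200
print(dcog(cur, ref))                        # 5.0
```

In this toy case the positive and negative differences cancel in the signed sum, whereas the squared differences accumulate. The SOD is 0 even though the bright pixel has moved, and DCOG reports the displacement of 5 pixels, which suggests why the two measures are used together.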
Gorur and Amrutur (2014) reduce the bandwidth and computational cost for surveillance video encoders by performing SKIP mode selection using Gaussian mixture model-based segmentation. They also classify the MBs as foreground and background MBs and hence reduce the cost of coding uncovered background regions. In Yu et al. (2008), SKIP mode for the current MB is chosen when either the co-located macroblock in the reference frame is encoded with SKIP mode or at least one SKIP mode MB is found above or to the left of the current MB. When SKIP mode is used, the content of the reference macroblock is directly copied to the co-located MB in the current frame, and neither motion compensation nor Lagrangian estimation is required.

In our method, a larger block size is used for coding MBs if the background of a video is stationary and the foreground is static. If the foreground has uniform motion with rigid objects, or if the background has smooth motion, then the residual in encoding will be small (SOD is also small) and hence larger block sizes (16 × 16) are used for encoding. For regions with motions of different objects, or generally for regions with complex motions, smaller block sizes are used. If the MB to be encoded belongs to a background scene, there will not be any motion and the MV will be almost zero. In this case SOD and DCOG will be very small and hence SKIP mode is used. If the MB contains slow moving objects, then the MV will be of reasonable value and hence all inter modes are used. In this case the SOD and DCOG will be of medium values. If the MB belongs to a fast-moving object, then the MV will be large. Hence the search range should be maximum with all possible inter and intra modes. In this case SOD and DCOG will also be large. Table 1 summarises these discussions.

The mode in an inter-layer prediction is determined by the mode of the co-located MB in the frame of the BL or the previous temporal level.
The mode for intra-coding is determined by the modes of the neighbouring MBs in the same frame. First, we estimate the SOD and DCOG for the co-located MBs in the target and reference frames. The following algorithm is used to decide the mode selection, where k_1 and k_2 are constants. Since the values of SOD and DCOG depend on the QP used, the constants k_1 and k_2 are also related to QP in the algorithm.

In practice, video encoders use a fixed search range in order to obtain uniform quality. Most of the time, however, the MVs are very small compared to the given search range. The search range for hierarchical B-pictures is fixed to 32 in the reference software. A fixed search range has much redundancy. Therefore, in our method we use an adaptive search range. Smaller values of SOD and DCOG indicate that there is less motion, so the search range is set to be small. The proposed mode selection algorithm defines the size of the search range and the modes to be checked in an adaptive manner. The model parameters k_1 to k_3 and d_1 to d_3 are set over several iterations to define the search range adaptively. As noted above, there are 7 modes to be checked for inter-layer coding and 13 modes for intra-layer coding. The algorithm works as follows.

The search range is set to 2 if SOD lies below the product of QP and k_1 and DCOG lies below the product of QP and d_1. The set of modes to be checked is then only 2 modes (SKIP, 16 × 16).

The search range is set to 4 if SOD lies below the product of QP and k_2 and DCOG lies below the product of QP and d_2. The set of modes to be checked is then only 4 modes (16 × 16, 16 × 8, 8 × 16, 8 × 8).

The search range is set to 8 if SOD lies below the product of QP and k_3 and DCOG lies below the product of QP and d_3. The set of modes to be checked is then only 7 modes (16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, 4 × 4).
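The threshold rules above, together with the fall-through case of Section 3.1 (search range 32 with all inter and intra modes), can be sketched as a small function. This is a hedged illustration only. The paper does not report numeric values for k_1 to k_3 and d_1 to d_3, so the constants `K` and `D` below are hypothetical placeholders. The function name `select_search` and the `INTRA_i` labels are likewise assumptions made for the example.

```python
INTER_MODES = ["16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4"]
# Placeholder labels for the 13 intra modes; the names are hypothetical.
INTRA_MODES = [f"INTRA_{i}" for i in range(13)]


def select_search(sod, dcog, qp, k, d):
    """Return (search_range, candidate_modes) for one macroblock.

    k = (k1, k2, k3) and d = (d1, d2, d3) must be increasing; the
    thresholds scale with QP as in the pseudocode of Section 3.1.
    """
    k1, k2, k3 = k
    d1, d2, d3 = d
    if sod <= qp * k1 and dcog <= qp * d1:
        return 2, ["SKIP", "16x16"]
    if sod <= qp * k2 and dcog <= qp * d2:
        return 4, INTER_MODES[:4]
    if sod <= qp * k3 and dcog <= qp * d3:
        return 8, list(INTER_MODES)
    # Either measure exceeds its largest threshold: full search.
    return 32, INTER_MODES + INTRA_MODES


# Hypothetical constants chosen only to exercise each branch.
K = (1.0, 2.0, 3.0)
D = (0.05, 0.1, 0.2)

print(select_search(10, 0.5, 20, K, D))       # (2, ['SKIP', '16x16'])
print(select_search(30, 1.5, 20, K, D)[0])    # 4
print(select_search(50, 3.0, 20, K, D)[0])    # 8
print(select_search(100, 10.0, 20, K, D)[0])  # 32
```

Because both conditions must hold in each branch, a block with a small SOD but a large DCOG still falls through to the wider search, which matches the conjunctions used in the pseudocode.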
The algorithm checks all inter and intra modes if SOD lies above the product of QP and k_3 and DCOG lies above the product of QP and d_3. The algorithm is depicted below.

3.1 Mode selection algorithm

Input: k_1, k_2, k_3 with k_1 < k_2 < k_3
       d_1, d_2, d_3 with d_1 < d_2 < d_3
MBsizeInterM = {16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, 4 × 4}
MBsizeIntraM = {13 modes}

if ((SOD ≤ (QP × k_1)) & (DCOG ≤ (QP × d_1))) {
    Search range = 2
    mode ∈ (SKIP, 16 × 16)
} else if ((SOD ≤ (QP × k_2)) & (DCOG ≤ (QP × d_2))) {
    Search range = 4
    mode ∈ (16 × 16, 16 × 8, 8 × 16, 8 × 8)
} else if ((SOD ≤ (QP × k_3)) & (DCOG ≤ (QP × d_3))) {
    Search range = 8
    mode ∈ (16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, 4 × 4)
} else {
    Search range = 32
    mode ∈ (any one of all the inter and intra modes)
}

4 Results and discussion

The proposed optimal mode selection algorithm is executed using the standard JSVM (joint scalable video model) reference software, version 9.19.15 (Reichel et al., 2007). The test platform is an Intel i3 processor at 2.40 GHz with 2 GB RAM. The standard benchmark test sequences Bus, City, Foreman, Crew, Soccer and Mobile are used for the performance evaluation of the proposed algorithm. All the test sequences are set with the parameters in Table 2.

Table 2 Simulation parameters

Parameters                  BL        EL1       EL2
Resolution                  QCIF      CIF       CIF
Frame rate                  15 Hz     15 Hz     30 Hz
QP (BL/EL1/EL2)             16/20/24, 20/24/30, 24/28/32, 28/32/36
No. of frames               150
Maximum delay               1,200 ms
GOP size                    16
No. of reference frames     1
Search mode                 Fast search
Search range                32
EL search range             4
Fast bidirectional search   On
Iterative search range      8
Codec                       JSVM 9.19.15

The proposed algorithm was evaluated under three performance measures: PSNR, bit rate and time. Table 3 shows the experimental results of the six video sequences averaged over different QPs. Each video sequence was run for 150 frames to analyse the performance measures for the proposed algorithm.
The results are compared with the algorithms proposed by Li et al. (2006), Yeh et al. (2010), Kim et al. (2009) and Lee and Kim (2012).

Table 3 Average change in BD YPSNR (dB) (Bjontegaard, 2001) with respect to JSVM 9.19.15

Algorithms   Bus     Foreman   City    Crew    Mobile   Soccer
Li           −0.12   −0.20     −0.11   −0.12   −0.17    −0.17
Yeh          −0.11   −0.19     −0.10   −0.11   −0.10    −0.16
Kim          −0.08   −0.11     −0.14   −0.06   −0.09    −0.07
Lee          −0.11   −0.13     −0.14   −0.13   −0.11    −0.12
Prop         −0.07   −0.09     −0.10   −0.04   −0.05    −0.09

Tables 3, 4 and 5 compare the proposed algorithm with the previously proposed algorithms in terms of BD PSNR, bit rate and time. The proposed algorithm is 57.44% faster, with a PSNR improvement of 0.43 dB and a 0.23 kbps reduction in bit rate compared with the standard JSVM reference software. These results are averaged over all six sequences. All the existing FMD algorithms perform better and save time compared to JSVM at all quantisation values. Although the proposed algorithm lags behind a few existing algorithms in terms of FMD, it outperforms the previously proposed algorithms in terms of PSNR and bit rate. Table 3 lists the BD PSNR obtained for the previously proposed algorithms and the proposed mode selection algorithm. It is interesting to note that the proposed algorithm outperforms all algorithms in terms of PSNR. The proposed algorithm provides better visual quality than all other algorithms except for the Soccer sequence, which is due to fast motion with large spatial details.

Table 4 Average change in BD bit rate (%) (Bjontegaard, 2001) with respect to JSVM 9.19.15

Algorithms   Bus    Foreman   City   Crew   Mobile   Soccer
Li           2.06   3.33      2.12   2.56   2.15     2.38
Yeh          1.06   2.32      0.84   1.44   0.92     1.23
Kim          1.59   2.46      1.79   2.55   1.79     2.45
Lee          1.44   2.80      1.75   1.93   1.68     1.53
Prop         0.82   1.28      0.59   0.63   1.12     0.68

Table 4 lists the BD bit rate obtained for the previously proposed algorithms and the proposed mode selection algorithm.
It is interesting to note that the proposed algorithm outperforms all algorithms in terms of bit rate. The proposed algorithm provides better coding efficiency than all other algorithms except for the Mobile sequence, which is due to slow motion convergence with small spatial details. Table 5 lists the BD encoder time saving for the previously proposed algorithms and the proposed mode selection algorithm. It is more interesting to note that the proposed algorithm outperforms all existing algorithms in terms of saving time for the encoder. The proposed mode decision algorithm is able to minimise the complexity of the encoder and achieves fast mode selection.

Table 5 Average percentage change in computation time, BD Time (Bjontegaard, 2001) with respect to JSVM 9.19.15

Algorithms   Bus     Foreman   City    Crew    Mobile   Soccer
Li           31.67   26.98     30.41   34.35   27.21    31.96
Yeh          34.15   31.29     32.51   38.03   31.83    35.84
Kim          37.03   31.91     31.90   36.22   35.64    36.14
Lee          34.85   31.64     33.10   39.03   31.59    35.93
Prop         38.47   31.66     34.35   36.78   41.23    39.56

In general, the proposed mode selection algorithm provides good time saving for the encoder along with better coding efficiency and good visual quality.

5 Conclusions

SVC, with its scalable features, is more flexible across all types of gadgets than advanced video coding (AVC), but its computation is also heavier than that of AVC. Many FMD algorithms have been developed to save encoding time by reducing the computational complexity, but they compromise on PSNR and bit rate. The proposed optimal mode selection algorithm, based on the orientation of pixels, achieves better time saving without any degradation in PSNR or increment in bit rate. The proposed algorithm is simulated with the standard JSVM reference software and is found to perform better in all three measures, with 57.44% time saving, a 0.43 dB increment in PSNR and 0.23% compression in bit rate.

References

Balaji, L. and Thyagharajan, K.K. (2014) ‘H.264/SVC mode decision based on mode correlation and desired mode list’, Intl. Journ. of Automation and Computing, Vol. 11, No. 5, pp.510–516.
Balaji, L. and Thyagharajan, K.K. (2015) ‘H.264 SVC complexity reduction based on likelihood mode decision’, The Scientific World Journal, No. 418437, pp.1–10.
Balaji, L. and Thyagharajan, K.K. (2017) ‘A pixel orientation and adaptive search range-based complexity reduction in H.264 scalable video coding’, Proc. of ICACCS2017, January, India.
Bjontegaard, G. (2001) ‘Calculation of average PSNR differences between RD-curves’, Presented at the 13th VCEG-M33 Meeting, Austin, TX, 2–4 April.
Dhanalakshmi, A., Nagarajan, G. and Balaji, L. (2019) ‘SHVC performance enhancement using superior step search algorithm’, ICT Express, Article in Press, April.
Gorur, P. and Amrutur, B. (2014) ‘Skip decision and reference frame selection for low-complexity H.264/AVC surveillance video coding’, IEEE Trans. Circuits Syst. Video Technol., Vol. 24, No. 7, pp.1156–1169.
Grecos, C. and Yang, M.Y. (2005) ‘Fast inter mode prediction for P-slices in the H.264 video coding standard’, IEEE Trans. Broadcast, Vol. 51, No. 2, pp.256–263.
Heo, J. and Ho, Y. (2010) ‘Efficient level and zero coding methods for H.264/AVC lossless intra coding’, IEEE Signal Process. Lett., Vol. 17, No. 1, pp.87–90.
Kau, L-J. and Leng, J-W. (2015) ‘A gradient intensity-adapted algorithm with adaptive selection strategy for the fast decision of H.264/AVC intra-prediction modes’, IEEE Trans. Circuits Syst. Video Technol., Vol. 25, No. 6, pp.944–957.
Kim, S.T., Konda, K.R., Park, C.S., Cho, S.J. and Ko, S.J. (2009) ‘Fast mode decision algorithm for inter-layer coding in scalable video coding’, IEEE Trans. Consum. Electron., Vol. 55, No. 3, pp.1572–1580.
Lee, B. and Kim, M. (2012) ‘An efficient inter-prediction mode decision method for temporal scalability coding with hierarchical B-picture structure’, IEEE Trans. Broadcast, Vol. 58, No. 2, pp.285–290.
Li, H., Li, Z.G. and Wen, C. (2006) ‘Fast mode decision algorithm for inter-frame coding in fully scalable video coding’, IEEE Trans. Circuits Syst. Video Technol., Vol. 16, No. 7, pp.889–895.
Lin, H-C., Peng, W-H. and Hang, H-M. (2010) ‘Fast context-adaptive mode decision algorithm for scalable video coding with combined coarse-grain quality scalability (CGS) and temporal scalability’, IEEE Trans. Circuits Syst. Video Technol., Vol. 20, No. 5, pp.732–748.
Liu, Z., Shen, L. and Zhang, Z. (2009) ‘An efficient inter mode decision algorithm based on motion homogeneity for H.264/AVC’, IEEE Trans. Circuits Syst. Video Technol., Vol. 19, No. 1, pp.128–132.
Reichel, J., Schwarz, H. and Wien, M. (2007) ‘Joint scalable video model 11 (JSVM 11)’, Joint Video Team, Doc. JVT-X202.
Ri, S-H., Vatis, Y. and Ostermann, J. (2009) ‘Fast inter-mode decision in an H.264/AVC encoder using mode and Lagrangian cost correlation’, IEEE Trans. Circuits Syst. Video Technol., Vol. 19, No. 2, pp.302–306.
Schwarz, H., Marpe, D. and Wiegand, T. (2004) Inter-Layer Prediction of Motion and Residual Data, ISO/IEC JTC, Doc. MPEG2004/M11043.
Schwarz, H., Marpe, D. and Wiegand, T. (2007) ‘Overview of the scalable video coding extension of the H.264/AVC standard’, IEEE Trans. Circuits Syst. Video Technol., Vol. 17, No. 9, pp.1103–1120.
Segall, C.A. and Sullivan, G.J. (2007) ‘Spatial scalability within the H.264/AVC scalable video coding extension’, IEEE Trans. Circuits Syst. Video Technol., Vol. 17, No. 9, pp.1121–1135.
Thyagharajan, K.K. and Ramachandran, V. (2006) ‘Optimal buffering requirement analysis for jitter-free variable bit rate video streaming’, Computer Syst. Science Engg., Vol. 21, No. 3, pp.161–172.
Wang, D., Zhu, C., Sun, Y., Dufaux, F. and Huang, Y. (2019) ‘Efficient multi-strategy intra prediction for quality scalable high efficiency video coding’, IEEE Transactions on Image Processing, April, Vol. 28, No. 4, pp.2063–2074.
Yan, B. and Wang, M. (2009) ‘Adaptive distortion-based intra-rate estimation for H.264/AVC rate control’, IEEE Signal Process. Lett., Vol. 16, No. 3, pp.145–148.
Yeh, C.H., Fan, K.J., Chen, M.J. and Li, G.L. (2010) ‘Fast mode decision algorithm for scalable video coding using Bayesian theorem detection and Markov process’, IEEE Trans. Circuits Syst. Video Technol., Vol. 20, No. 4, pp.563–574.
Yu, A.C.W., Martin, G.R. and Park, H. (2008) ‘Fast inter-mode selection in H.264/AVC standard using a hierarchical decision process’, IEEE Trans. Circuits Syst. Video Technol., Vol. 18, No. 2, pp.186–195.
