Advanced
Fast Intraframe Coding for High Efficiency Video Coding
Fast Intraframe Coding for High Efficiency Video Coding
KSII Transactions on Internet and Information Systems (TIIS). 2014. Mar, 8(3): 1093-1104
Copyright © 2014, Korean Society For Internet Information
  • Received : December 26, 2013
  • Accepted : February 09, 2014
  • Published : March 28, 2014
Download
PDF
e-PUB
PubReader
PPT
Export by style
Share
Article
Author
Metrics
Cited by
TagCloud
About the Authors
Han Huang
Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China
Yao Zhao
Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China
Chunyu Lin
Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China
Huihui Bai
Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China

Abstract
The High Efficiency Video Coding (HEVC) is a new video coding standard that can provide much better compression efficiency than its predecessor H.264/AVC. However, it is computationally more intensive due to the use of flexible quadtree coding unit structure and more choices of prediction modes. In this paper, a fast intraframe coding scheme is proposed for HEVC. Firstly, a fast bottom-up pruning algorithm is designed to skip the mode decision process or reduce the candidate modes at larger block size coding unit. Then, a low complexity rough mode decision process is adopted to choose a small candidate set, followed by early DC and Planar mode decision and mode filtering to further reduce the number of candidate modes. The proposed method is evaluated by the HEVC reference software HM8.2. Averaging over 5 classes of HEVC test sequences, 41.39% encoding time saving is achieved with only 0.77% bitrate increase.
Keywords
1. Introduction
T he High Efficiency Video Coding (HEVC) is a new video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) formed by the ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG). In comparison with its predecessor H.264/AVC, 50% bitrate reduction is achieved for equal perceptual quality [1] . While the same block-based hybrid coding structure as in H.264/AVC is adopted in HEVC, each coding tool is improved with new technical features and characteristics. As opposed to the fixed size macro-block (MB) in H.264/AVC, the coding tree unit (CTU) in HEVC is configurable and its size can be chosen from 64x64, 32x32 and 16x16. A CTU consists of one coding unit (CU) or can be recursively split into a quadtree of CUs. The largest size of a CU is equal to that of a CTU, and the smallest is 8x8. An example of CU splitting and its corresponding quadtree is shown in Fig. 1 . The dotted lines with arrows represent the processing order of the CUs in the CTU, the notation “1” in the quadtree indicates further splitting and “0” indicates leaf node. A CU can be further split into multiple prediction units (PU) for inter or intra prediction. In intra coding, the PU partition type can be either PART_2Nx2N or PART_NxN, as shown in Fig. 2 . However, PART_NxN is only allowed for the minimum CU size, otherwise it is similar as four equal-size sub-CUs.
PPT Slide
Lager Image
Example of a 64x64 CTU and its CU partition and processing order. (a) CU partitions (b) quadtree structure
PPT Slide
Lager Image
PU types for intra prediction
During the transform and quantization process, a CU can also be recursively split into a quadtree of transform unit (TU). The intra prediction is applied for each TU sequentially instead of applying the intra prediction at the PU level [2] . In this way, the nearest neighboring reference samples from the already reconstructed TU are used for prediction. The HEVC defines 35 intra prediction modes, namley Intra_DC, Intra_Planar and 33 angular prediction modes denoded as Intra_Angular[k] , k = 2,…,34. In comparison with H.264/AVC, the intra prediction modes defined in HEVC are able to capture more directional structures.
In the HEVC test mode (HM), a rate-distortion (RD) optimized bottom-up pruning algorithm [3] is adopted to find the best CU partition of a CTU. Given a CU, the best PU partition type, prediction mode and TU partition are found by minimizing the Lagrangian cost. While the flexible block structure and more choices of prediction mode provide better coding performance, it dramatically increases the encoder complexity. Even for intraframe coding, it is still far away from real-time application [4] . The computational complexity mainly comes from two folds. First, an optimal CU partition within a CTU can consist of various sizes of CU ranging from 8x8 to 64x64. Second, finding an optimal prediction mode requires a RD optimization process. Therefore, fast algorithms in both aspects are desirable. Research works regarding fast CU size decision include [5 - 8] ; works targeting fast intra mode decision include [9 - 15] . In our previous work [16] , a novel fast bottom-up pruning (FBUP) algorithm was proposed and proved to be efficient. In this paper, three techniques are proposed for fast intra mode decision and combined with the FBUP algorithm. Starting from the smallest CU size, a full quadtree is processed in a bottom-up manner. First, we check whether mode decision process is needed for a current CU node. If the mode decision process is performed, then a set of intra candidate modes are derived from the coding information obtained from its sub-blocks (sub-CUs or PART_NXN PUs), followed by a rough decision process to choose a smaller set of candidates. The number of candidates in the set is adaptively chosen based on the low complexity RD cost. In this way, we can either skip the mode decision process or reduce the number of candidate modes. Thus, the computational complexity is reduced.
In Section 2, the previous works in the literature are reviewed. Then the proposed method is described in Section 3. In Section 4, the experimental results are presented to prove the efficiency of the proposed algorithm. Compared to the original HM encoder, averaging over all the test sequences and rate points, 41.39% encoding time saving is achieved while the bitrate increase is only 0.77%. Finally, the conclusion is given in Section 5.
2. Related Works
2.1 Rate-Distortion Optimized Bottom-up Pruning
In HEVC, the CTU size is 64x64 and the maximum number of depth levels of the quadtree is D =4. The HM encoder adopts the rate-distortion optimized bottom-up pruning algorithm to obtain the optimal CU quadtree partition. Consider the full quadtree as shown in Fig. 3 , we locate a node by its depth level from top to bottom and its position from left to right. Let ( i , j ) be the index of j th node at depth level i , then j <4 i . Denote X i,j , as a CU at node ( i , j ) without further splitting and C i,j as the optimal tree after rate-distortion optimized pruning. Let J (·) be the operation to calculate the best RD cost. The bottom-up pruning algorithm traverses the full quadtree in a depth-first order. The sub-CUs of node ( i , j ) is pruned if and only if
PPT Slide
Lager Image
.
PPT Slide
Lager Image
Full quadtree
- 2.2 Fast Intra Mode Decision Algorithm in HM
Full RD optimized search in the 35 intra prediction modes is computational complex. To reduce the encoder complexity, the HM encoder adopts a fast intra mode decision algorithm, which consists of two steps described as follows.
  • 1. Rough mode decision: choose theNbest candidate modes by minimizingJSATD=DSATD+λRmode.
  • 2. Given the N candidate modes in step 1 and themost probable modes(MPM) derived from neighboring PUs, do R-D optimized mode decision. The best intra mode is found by minimizingJfullRD=Drec+λRtotal.
DSATD is the Sum of Absolute Transformed Difference (SATD) which calculates the sum of abolute value of hadamard transformed coefficients of the residual signal, R mode is the number of bits to encode mode information, Drec is the distortion of the reconstructed block, Rtotal is the total number of bits to encode current block and λ is the Lagrangian multiplier. Calculation of JSATD doesn't require the DCT transform, quantization and entropy coding process, thus it’s much simpler than calculation of the full RD cost JfullRD . Therefore, the computational complexity can be reduced if N is set to a smaller value. In the HM encoder N is set to 3, 3, 3, 8 and 8 for the PU sizes of 64×64 , 32×32 , 16×16 , 8×8 and 4×4 respectively.
3. Proposed Methods
- 3.1 Fast bottom-up Pruning Algorithm
- 3.1.1 Analysis of Bottom-up Pruning Algorithm
The bottom-up pruning algorithm performs mode decision at each node of the full quadtree. Therefore, a total of
PPT Slide
Lager Image
mode decisions for each CTU. However, given a CU at node ( i , j ), the best partition and prediction modes of its sub-CUs C i+1,4j+k , k =0,1,2,3 are known. Thus, we can reuse the coding information in sub-CUs to check whether mode decision at current node is necessary.
Let Q (·)=0 indicates a leaf node and Q (·)=1 denotes a tree. Then Q ( C i+1,4j+k )=1 suggests that the strutural information in the k th sub-CU is complex. Hence, the current node is likely to be split. In Table 1 , it shows the probability p 0 of a node is split when
PPT Slide
Lager Image
, i.e.
PPT Slide
Lager Image
. The results are obtained by coding the first second (50 frames) of HEVC test sequence BasketballDrill . Similar statistics are found in coding the other sequences. Note that at level 3, it reaches the leaf node of the quadtree, thus the CU doesn’t have any sub-CUs. However, HEVC allows PART_NxN type PU at the bottom level, in which the CU can also be considered as a tree.
The probability
PPT Slide
Lager Image
The probability
- 3.1.2 Fast bottom-up Pruning Algorithm
Based on the above observations, we can conclude that there is a high probability that a CU remains split if its sub-CUs are sub-trees. In other words, lots of mode decision process in a large CU is unnecessary. Therefore, they could be avoided to decrease the computational complexity. In this paper,
PPT Slide
Lager Image
is set as the necessary condition to skip the intra mode decision process at node ( i , j ). An example is shown in Fig. 4 . If C 1,1 is a tree, then the sub-CUs of node C 0,0 are not pruned. Implementation details of the proposed fast bottom-up pruning algorithmis is described by the pseudocode in Algorithm 1 .
PPT Slide
Lager Image
Example of a pruned quadtree.
Fast bottom-up pruning algorithm
PPT Slide
Lager Image
Fast bottom-up pruning algorithm
- 3.1.2 Analysis of the Performance of Fast Bottom-up Pruning Algorithm
To analyse the performance of the proposed FBUP algorithm, we can evaluate how often a node ( i , j ) is skipped by FBUP while its RD cost J ( X i,j ) is actually less than J ( C i,j ), denoted as
PPT Slide
Lager Image
, and use
PPT Slide
Lager Image
as a messure of loss in RD performance. An example is shown in Table 2 , in which the results are obtained by encoding the HEVC test sequence BasketballDrill . We can see that generally the values of p 1 and ∇ J are small. Similar results are observed when coding other HEVC test sequences. It indicates that the probability of the proposed FBUP algorithm fails to predict the result of RD optimized mode decision is quite small, and the loss in RD performance is also trivial.
Analysis of the performance of proposed fast bottom-up pruning algorithm
PPT Slide
Lager Image
Analysis of the performance of proposed fast bottom-up pruning algorithm
- 3.2 Proposed Fast Intra Mode Decision Algorithm
In this paper, three techniques are proposed for fast intra mode decision, namely, bottom-up prediction , adaptive intra mode filtering , and DC and Planar early decision .
- 3.2.1 Bottom-up Prediction
In the proposed FBUP algorithm, mode decision at a CU node is selectively skipped based on the coding information of its sub-CUs. Similarly, we can also predict the intra mode of current CU node if it’s not skipped. An example is shown in Fig. 5 , in which the red arrows indicate the intra prediction directions. The intra prediction mode of the block is the same as the prediction modes of its two sub-blocks.
PPT Slide
Lager Image
Example of bottom-up prediction.
The straightforward way of bottom-up prediction is to use the chosen best modes of sub-CUs, instead of all 35 modes, as the candidate modes for the parent CU. Though the number of candidate modes can be largely reduced by such method, however, we find that the coding efficiency is decreased significantly. Therefore, a more robust scheme is required. In this paper, the rough mode decision results of sub-CUs are employed to select the candidate modes for parent CU. Let Ω i,j be the set of candidate modes of a CU at node ( i , j ) and Փ i+1,4j+k be the set of N modes chosen in the rough mode decision process of its k th sub-CU, then
PPT Slide
Lager Image
. By using the N modes that are chosen by rough mode decision instead of only one mode, prediction from sub-CUs becomes more robust. On the other hand, N is usually a small number, so it can achieve a good tradeoff between coding efficiency and encoder complexity.
- 3.2.2 Adaptive Intra Mode Filtering
In the fast intra mode decision algorithm in HM, the number N is set to a fixed value. In this paper, an adaptive method is proposed based on the JSATD costs in the rough mode decision process.
  • 1. For each mode in candidate set Ω, calculate its costJSATD, sort the candidate set in the ascending order ofJSATD.
  • 2. LetCandi,i=0,1,…,7 be the first 8 modes after sorting, calculate the mean value of their cost, denoted as.
  • 3. The new candidate set is obtained as Ω'={Candi|JSATD(Candi)<μ&i
- 3.2.3 DC and Planar Early Decision
The intra mode in HEVC can be categorized into two classes: one is DC and Planar modes, the other is directional intra prediction mode. The DC and Planar modes are designed for smooth area, and the directional modes are designed for coding blocks that have directional edges. It’s assumed that the rough mode decision process can effectively distinguish these two classes. So, in this paper, if the best mode in rough mode decision process is Planar or DC mode, then the directional modes are removed from the candidate set.
- 3.2.4 The Overall Algorithm
The three techniques described in the previous subsections can be combined. Similar to the original intra mode decision algorithm in HM, the most probable modes (MPM) derived from neighboring coded blocks are also included in the final candidate set for RD optimized mode decision. The overall algorithm is as follows:
  • 1. For a CU at node (i,j), the set of candidate intra modes is
  • 2. CalculateJSATDfor each candidate mode in Ωi,j, sort the candidate modes in the decending order ofJSATDand denoted asCandi,i=0,1,⋯,M-1
  • 3. IfCand0equals to 0 or 1, i.e. Planar or DC mode, thenm=1+(Cand1<2) and go to step 6, elsem=8
  • 4. Calculate
  • 5. Fori=7,⋯,0 , ifJSATD(Candi)<μthenm=m-1
  • 6. Փi,j={Candi,i=0,⋯,m-1}∪{MPM}, search the best mode in Փi,jby RD optimized mode decision as described in section 2.2.
4. Experimental Results
The proposed algorithm was evaluated in the HEVC reference software HM8.2 [17] . The main profile intra-only encoder configuration in the common test condition [18] is used to code the HEVC test sequences in classes A to E. The maximum CU size (CTU size) is 64x64, the maximum number of CU depth levels is 4, and maximum number of TU levels is 3. Rate-distortion optimized quantization, sample adaptive offset, transform skipping, and fast transform skipping are on. Four quantization parameters (QP) {22, 27, 32, and 37} are used to encode each sequence.
To demonstrate the encoding time saving (ETS) of the proposed method, an isolated PC with Intel Core 2 3.0GHz CPU and 2.0GB RAM was used to encode the first second of each sequence. Compared with the original fast intra mode decision algorithm in HM8.2 encoder, the ETS, in percentages, and the corresponding BD-rate [19] for the proposed fast intra mode decision algorithm are shown in Table 3 . Averaging over all test sequences, the ETS for different QP settings are around 25% and the Y BD-rate is only 0.40%. It’s also observed that the ETS is consistent for different test sequences and the maximum Y BD-rate is 0.70%. Thus, we can conclude that the proposed fast intra mode decision algorithm can efficiently reduce the encoder complexity without significant loss in rate-distortion performance. When the proposed fast intra mode decision algorithm is combined with the FBUP algorithm, the results are shown in Table 4 . It shows that the combined algorithm can further reduce the encoder complexity. Averaging over all test sequences, the ETS for different QP settings are around 40% and the Y BD-rate is 0.77%. The maximum average bitrate increase is only 1.05%. Thus, the overall algorithm also does not compromise the RD performance.
Encoding time saving (%) and corresponding Y BD-rate (%) of the proposed fast intra mode decision algorithm.
PPT Slide
Lager Image
Encoding time saving (%) and corresponding Y BD-rate (%) of the proposed fast intra mode decision algorithm.
Encoding time saving (%) and corresponding Y BD-rate (%) of the proposed fast intra mode decision algorithm combined with the FBUP algorithm.
PPT Slide
Lager Image
Encoding time saving (%) and corresponding Y BD-rate (%) of the proposed fast intra mode decision algorithm combined with the FBUP algorithm.
5. Conclusion
In this paper, a fast intra mode decision algorithm is proposed for HEVC intraframe coding. The proposed algorithm consists of three techniques to reduce the number of candidate modes for RD optimized mode decision. Compared to the original fast intra mode decision algorithm in HM encoder, the proposed algorithm can achieve about 25% encoding time saving with only 0.40% bitrate increase. Also, the proposed fast intra mode decision algorithm can be well combined with the proposed fast bottom-up pruning algorithm. The encoder complexity is further reduced, about 41% encoding time savings are achieved, and the average bitrate increase is only 0.77%.
BIO
Han Huang received the B.S. degree in computer science and Ph.D. degree in signal and information processing from Beijing Jiaotong University, Beijing, China, in 2007 and 2014, respectively. He is now with MediaTek (Beijing) Inc. From 2009 to 2011, he was a visiting student in the Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA. His current research interests include video compression and processing.
Yao Zhao received the B.S. degree from Fuzhou University, Fuzhou, China, in 1989, and the M.E. degree from Southeast University, Nanjing, China, in 1992, both in radio engineering, and the Ph.D. degree from the Institute of Information Science, Beijing Jiaotong University (BJTU), Beijing, China, in 1996. In 1998, he was an Associate Professor at BJTU, where he became a Professor in 2001. From 2001 to 2002, he was a Senior Research Fellow in the Information and Communication Theory Group, Faculty of Information Technology and Systems, Delft University of Technology, Delft, The Netherlands. He is currently the Director of the Institute of Information Science, BJTU. He is currently leading several national research projects such as 973 Program, 863 Program, and the National Science Foundation of China. His current research interests include image/video coding, digital watermarking and forensics, and video analysis and understanding. Dr. Zhao serves on the editorial boards of several international journals, including as an Associate Editor of the IEEE Signal Processing Letters, an Area Editor of Signal Processing: Image Communication (Elsevier), and as an Associate Editor of Circuits, System & Signal Processing (Springer). He was the recipient of National Science Foundation of China for Distinguished Young Scholars in 2010.
Chunyu Lin was born in LiaoNing Province, China. He works as a lecturer in Beijing Jiaotong University. He obtained his doctor dgree in Beijing Jiaotong University in 2011. From 2009 to 2010, he was a visiting researcher at the ICT group of Delft University of Technology, Netherlands. From 2011 to 2012, He was a postdoc in Gent Unveristy, Belgium. His research interests are in the areas of image/video compression and robust transmission, stereo matching and 3D video coding.
Huihui Bai received the Ph.D. degree in signal and information processing from Beijing Jiaotong University (BJTU), Beijing, China, in 2008. She is currently an Associate Professor at the Institute of Information Science, BJTU. She has been involved in research and development work in video coding technologies and standards such as high efficiency video coding, 3-D video compression, multiple description video coding, and distributed video coding. She is leading or participating in several research projects such as 973 Program, 863 Program, the National Natural Science Foundation of China, Beijing Natural Science Foundation, and Jiangsu Provincial Natural Science Foundation.
References
Sullivan G. J. , Ohm J.-R. , Han W.-J. , Wiegand T. , Wiegand T. 2012 "Overview of the High Efficiency Video Coding (HEVC) standard" IEEE Trans. Circuits Syst. Video Technol. Article (CrossRef Link) 22 (12) 1649 - 1668    DOI : 10.1109/TCSVT.2012.2221191
Lainema J. , Bossen F. , Han Woo-Jin , Min Junghye , Ugur K. 2012 “Intra coding of the HEVC standard” IEEE Trans. Circuits Syst. Video Technol. Article (CrossRef Link) 22 (12) 1792 - 1801    DOI : 10.1109/TCSVT.2012.2221525
Sullivan G.J. , Baker R.L. 1994 “Efficient quadtree coding of images and video” IEEE Trans. Circuits Syst. Image Process. Article (CrossRef Link) 3 (3) 327 - 331    DOI : 10.1109/83.287030
Bossen F. , Bross B. , Suhring K. , Flynn D. 2012 “HEVC complexity and implementation analysis” IEEE Trans. Circuits Syst. Video Technol. Article (CrossRef Link) 22 (12) 1685 - 1696    DOI : 10.1109/TCSVT.2012.2221255
Shen Xiaolin , Yu Lu , Chen Jie 2012 “Fast coding unit size selection for HEVC based on bayesian decision rule” in Proc. of Picture Coding Symposium (PCS) May 7-9 Article (CrossRef Link) 453 - 456
Tan Hui Li , Liu Fengjiao , Tan Yih Han , Yeo Chuohao 2012 “On fast coding tree block and mode decision for High-Efficiency Video Coding (HEVC)” Speech and Signal Process. in Proc. of IEEE Conf. Acoust. March 25-30 Article (CrossRef Link) 825 - 828
Shen L. , Liu Z. , Zhang X. , Zhao W. , Zhang Z. 2013 “An effective CU size decision method for HEVC encoders” IEEE Trans. Multimedia Article (CrossRef Link) 15 (2) 465 - 470    DOI : 10.1109/TMM.2012.2231060
Choi Kiho , Jang Euee S. 2012 “Fast coding unit decision method based on coding tree pruning for high efficiency video coding” Optical Engineering Article (CrossRef Link) 51 (3) 030502–1 - 030502–3    DOI : 10.1117/1.OE.51.3.030502
Piao Yinji , Min Junghye , Chen Jianle 2010 “Encoder improvement of unified intra prediction,” in document JCTVC-C207
Zhao Liang , Zhang Li , Ma Siwei , Zhao Debin 2011 “Fast mode decision algorithm for intra prediction in HEVC” in IEEE Proc. Visual Commun. Image Process. November 6-9 Article (CrossRef Link) 1 - 4
Silva T.L. da , Agostini L. V. , Silva Cruz L. A. da 2012 “Fast HEVC intra prediction mode decision based on edge direction information” in Proc. of The 20th European Signal Process. Conf. (EUSIPCO) August Article (CrossRef Link) 1214 - 1218
Zhang H. , Ma Z. 2013 “Early termination schemes for fast intra mode decision in High Efficiency Video Coding” in Proc. of IEEE Int. Symp. Circuits and Syst. May 19-23 Article (CrossRef Link) 25 - 48
Sun Heming , Zhou Dajiang , Goto S. 2012 “A low complexity HEVC intra prediction algorithm based on level and mode filtering” in IEEE Proc. Multimedia and Expo July 9-13 Article (CrossRef Link) 1085 - 1090
Jiang Wei , Ma Hanjie , Chen Yaowu 2012 “Gradient based fast mode decision algorithm for intra prediction in HEVC” in Proc. of IEEE Proc. Consumer Electronics, Communications and Networks (CECNet) April 21-23 Article (CrossRef Link) 1836 - 1840
Zhang Yongfei , Li Zhe , Li Bo 2012 “Gradient-based fast decision for intra prediction in HEVC” in IEEE Proc. Visual Commun. Image Process. November 27-30 Article (CrossRef Link) 1 - 6
Huang Han , Zhao Yao , Lin Chunyu , Bai Huihui 2013 “Fast bottom-up pruning for HEVC intraframe coding” IEEE Proc. Visual Commun. Image Process. Article (CrossRef Link)
HEVC test model reference software HM8.2 [Online]. Available:
Bossen F. 2012 “Common test conditions and software reference configurations,” in document JCTVC-I1100