From HEVC to VVC: Evolution of Transformation Techniques

Video coding standards have changed substantially over the past decade, driven by growing demand for higher compression performance. This article traces the evolution of transform techniques from the HEVC (High Efficiency Video Coding) standard to the latest VVC (Versatile Video Coding) standard.

Transform Techniques in HEVC

The HEVC standard, adopted in 2013, uses DCT2 (Discrete Cosine Transform Type 2) as its primary transform. DCT2 was chosen for its low implementation complexity and relatively good coding gain. The DCT2 kernel was designed with symmetry properties so that it supports two computation methods: direct matrix multiplication and a partial butterfly. This flexibility allows efficient hardware implementation and reduces computational complexity.
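The two computation methods can be compared on the 4-point case. The following is a minimal sketch, assuming the well-known HEVC 4-point integer DCT2 coefficients (64, 83, 36); the function names are illustrative, and the intermediate shifts and rounding used by a real codec are omitted.

```python
# 4-point forward DCT2 core matrix with the HEVC integer coefficients.
DCT2_4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]

def dct2_4_matrix(x):
    """Direct matrix multiplication: 16 multiplications."""
    return [sum(c * v for c, v in zip(row, x)) for row in DCT2_4]

def dct2_4_butterfly(x):
    """Partial butterfly: fold the input first, then 6 multiplications."""
    e0, e1 = x[0] + x[3], x[1] + x[2]   # even (symmetric) part
    o0, o1 = x[0] - x[3], x[1] - x[2]   # odd (antisymmetric) part
    return [
        64 * (e0 + e1),
        83 * o0 + 36 * o1,
        64 * (e0 - e1),
        36 * o0 - 83 * o1,
    ]

if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(dct2_4_matrix(sample))
    print(dct2_4_butterfly(sample))
```

Both functions return the same coefficients because the even rows of the matrix are symmetric and the odd rows are antisymmetric, which is the property the butterfly exploits.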

The HEVC standard also adopted the DST Type 7 (Discrete Sine Transform Type 7, DST7) proposed by Samsung. This transform was derived mathematically and shown to be optimal for intra-prediction residuals along the prediction direction, for example vertical prediction. However, only the 4×4 DST7 was adopted, and only for intra-prediction residuals; all other block sizes and all inter-prediction residuals still use DCT2.
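As an illustration, the integer 4×4 DST7 matrix can be derived from the continuous DST7 basis function. This is a sketch under one assumption: the basis is scaled by 64·sqrt(N) before rounding, which reproduces the familiar HEVC 4-point values; nothing is claimed about kernels of other sizes. The function name is hypothetical.

```python
import math

def dst7_int_matrix(n):
    """Integer DST7 matrix: round(64 * sqrt(n) * orthonormal DST7 basis).

    Basis function: sqrt(4 / (2n + 1)) * sin(pi * (2i + 1) * (j + 1) / (2n + 1)),
    where i is the frequency index (row) and j the sample index (column).
    """
    scale = 64 * math.sqrt(n) * math.sqrt(4 / (2 * n + 1))
    return [
        [round(scale * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1)))
         for j in range(n)]
        for i in range(n)
    ]

if __name__ == "__main__":
    for row in dst7_int_matrix(4):
        print(row)
```

The first basis row rises from 29 to 84, matching an intra residual that is small next to the reference samples and grows with distance from them.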

The Evolution of Transform Technology after HEVC

The transforms adopted by HEVC were designed around fixed statistical characteristics of images, but actual prediction residuals are clearly non-stationary: their statistics change from block to block. This limits the coding gain a fixed transform can deliver. Research showed that offering multiple candidate transforms adapts better to these changing statistics and yields a significant further increase in coding gain.

In 2009, the first multi-candidate transform scheme was proposed at the MPEG meeting in Xi'an (m16926). However, this proposal did not support fast algorithms and was poorly matched to the computing power of hardware. In 2015, Qualcomm pioneered EMT (Enhanced Multiple Transform), which defined three transform sets drawn from DST7, DST1, DCT5, and DCT8. This scheme is more effective than a single conventional DCT2 because the DCT2 basis functions suit only residuals whose energy is evenly distributed across the block, whereas the EMT candidates can be combined to fit residuals whose energy is unevenly distributed.
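The benefit of multiple candidates can be seen on small one-dimensional residuals. The sketch below is illustrative only: it uses floating-point orthonormal DCT2, DST7, and DCT8 bases and picks the candidate with the smallest sum of absolute coefficients. That L1 cost is an assumed stand-in for the rate-distortion decision a real encoder makes, and all function names are hypothetical.

```python
import math

def dct2(n):
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
             for j in range(n)] for k in range(n)]

def dst7(n):
    s = math.sqrt(4 / (2 * n + 1))
    return [[s * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for k in range(n)]

def dct8(n):
    s = math.sqrt(4 / (2 * n + 1))
    return [[s * math.cos(math.pi * (2 * k + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for k in range(n)]

CANDIDATES = {"DCT2": dct2, "DST7": dst7, "DCT8": dct8}

def l1_cost(matrix, x):
    """Sum of absolute transform coefficients (a crude sparsity measure)."""
    return sum(abs(sum(c * v for c, v in zip(row, x))) for row in matrix)

def select_transform(residual):
    """Return the cheapest candidate name and the cost of every candidate."""
    n = len(residual)
    costs = {name: l1_cost(build(n), residual)
             for name, build in CANDIDATES.items()}
    return min(costs, key=costs.get), costs

if __name__ == "__main__":
    for r in ([3, 3, 3, 3], [1, 2, 3, 4], [4, 3, 2, 1]):
        print(r, select_transform(r)[0])
```

Under this cost, a flat residual selects DCT2, a residual that grows across the block selects DST7, and a residual that decays across the block selects DCT8.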

The Transform Technique of VVC

The VVC standard adopted a new generation of transform design based on multiple candidate transforms, a significant step forward for video coding standards. VVC retains fast algorithms for all primary transform kernels, DCT2 as well as DST7/DCT8. The 64-point butterfly used by the forward DCT2 fully contains the 32-point, 16-point, 8-point, and 4-point forward DCT2 butterflies.
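The nesting of the butterflies follows from the DCT2 definition: the even-indexed outputs of an N-point DCT2 are the N/2-point DCT2 of the folded input x[j] + x[N-1-j]. The sketch below demonstrates this with an unnormalized floating-point DCT2. It is not the integer kernel of the standard, and the function names are hypothetical.

```python
import math

def dct2_direct(x):
    """Unnormalized DCT2 by direct summation (N*N multiplications)."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                for j in range(n))
            for k in range(n)]

def dct2_butterfly(x):
    """Recursive partial butterfly for power-of-two lengths."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    even_in = [x[j] + x[n - 1 - j] for j in range(half)]
    odd_in = [x[j] - x[n - 1 - j] for j in range(half)]
    # Even outputs: the embedded N/2-point transform of the folded sums.
    even_out = dct2_butterfly(even_in)
    # Odd outputs: direct multiplication of the folded differences.
    odd_out = [sum(odd_in[j]
                   * math.cos(math.pi * (2 * j + 1) * (2 * k + 1) / (2 * n))
                   for j in range(half))
               for k in range(half)]
    out = [0.0] * n
    out[0::2] = even_out
    out[1::2] = odd_out
    return out

if __name__ == "__main__":
    x = [math.sin(0.37 * j) + 0.1 * j for j in range(64)]
    err = max(abs(a - b) for a, b in zip(dct2_direct(x), dct2_butterfly(x)))
    print("max difference:", err)
```

A 64-point call recurses through 32, 16, 8, and 4 points, so a single butterfly structure serves every smaller transform size.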

However, the main complexity of MTS (Multiple Transform Selection) lies in the large number of multiplications required by DST7 and DCT8. To address this bottleneck, several companies proposed schemes to reduce the computational complexity of DST7/DCT8, including the fast DST7/DCT8 algorithm proposed by the Tencent audio and video laboratory in January 2019. This algorithm supports two equivalent implementations, matrix multiplication and a fast algorithm, while reducing the number of multiplications by about 50%.
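The idea of a fast algorithm that gives exactly the same result as matrix multiplication can be illustrated on the 4-point DST7. The sketch below is not the adopted algorithm, whose internals are not described here. It uses the familiar 4-point integer coefficients, where the relation 29 + 55 = 84 lets products be shared, and the function names are hypothetical.

```python
# 4-point integer DST7 matrix (the familiar HEVC coefficients).
DST7_4 = [
    [29,  55,  74,  84],
    [74,  74,   0, -74],
    [84, -29, -74,  55],
    [55, -84,  74, -29],
]

def dst7_4_matrix(x):
    """Direct matrix multiplication: 16 multiplications."""
    return [sum(c * v for c, v in zip(row, x)) for row in DST7_4]

def dst7_4_fast(x):
    """Equivalent fast form: 8 multiplications, using 29 + 55 = 84."""
    c0 = x[0] + x[3]
    c1 = x[1] + x[3]
    c2 = x[0] - x[1]
    c3 = 74 * x[2]
    return [
        29 * c0 + 55 * c1 + c3,
        74 * (x[0] + x[1] - x[3]),
        29 * c2 + 55 * c0 - c3,
        55 * c2 - 29 * c1 + c3,
    ]

if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(dst7_4_matrix(sample))
    print(dst7_4_fast(sample))
```

In this small case the fast form needs 8 multiplications instead of 16 and yields bit-identical output, so an implementation may choose either path.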

Sub-Block Transform for Inter Blocks

In January 2019, the Marrakesh meeting adopted the sub-block transform (SBT) technology proposed by Huawei. This technique assumes that the inter-prediction residual is concentrated in a local part of the block. Only that sub-block is transformed and the rest of the block is treated as zero residual, which reduces the high-frequency components of the transform coefficients, lowers the cost of coding zero-residual regions, and improves compression performance.
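The assumption behind SBT can be sketched as follows. This is a simplified illustration: it considers only four half-block partitions and keeps the one holding the most residual energy, whereas a real encoder compares candidates by rate-distortion cost. The function names and the region labels are hypothetical.

```python
def energy(block, rows, cols):
    """Sum of squared residual samples over the given rows and columns."""
    return sum(block[r][c] ** 2 for r in rows for c in cols)

def choose_sbt_half(block):
    """Pick the half-block with the most energy; the rest is treated as zero.

    Returns (region name, kept sub-block, energy discarded by zeroing the rest).
    """
    h, w = len(block), len(block[0])
    regions = {
        "left":   (range(h), range(0, w // 2)),
        "right":  (range(h), range(w // 2, w)),
        "top":    (range(0, h // 2), range(w)),
        "bottom": (range(h // 2, h), range(w)),
    }
    total = energy(block, range(h), range(w))
    name = max(regions, key=lambda k: energy(block, *regions[k]))
    rows, cols = regions[name]
    kept = [[block[r][c] for c in cols] for r in rows]
    dropped = total - energy(block, rows, cols)
    return name, kept, dropped

if __name__ == "__main__":
    residual = [
        [0, 0,  3, -2],
        [0, 0,  1,  4],
        [0, 0, -1,  2],
        [0, 0,  2,  1],
    ]
    print(choose_sbt_half(residual))
```

When the residual really is localized, as in this example, the discarded energy is zero and only a 4×2 sub-block needs to be transformed and coded.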

Summary

This article summarized the design of the primary transform in the HEVC standard and traced the industry's continued exploration of transform techniques after HEVC, up to the transform design of the next-generation standard, VVC. The VVC standard adopts a new generation of transform design based on multiple candidate transforms, a significant step forward for video coding standards.

References

[1] A. Fuldseth, G. Bjøntegaard, M. Sadafale, M. Budagavi, “Transform design for HEVC with 16 bit intermediate data representation”, JCTVC-E0243, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 5th Meeting: Geneva, CH, March 16-23, 2011.

[2] J. Han, A. Saxena, and K. Rose, “Towards jointly optimal spatial prediction and adaptive transform in video/image coding,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2010, pp. 726-729.

[3] C. Yeo, Y. H. Tan, Z. Li, S. Rahardja, “Mode-dependent fast separable KLT for block-based intra coding”, IEEE International Symposium on Circuits and Systems (ISCAS), pp. 621-624, May 2011.

[4] B. Zeng and J. Fu, “Directional discrete cosine transforms: A new framework for image coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 3, pp. 305-313, Mar. 2008.

[5] X. Zhao, L. Zhang, S. Ma, and W. Gao, “Video coding with rate-distortion optimized transform,” IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 1, pp. 138-151, 2012.

[6] X. Cao and Y. He, “Singular vector decomposition based adaptive transform for motion compensation residuals,” in Proc. IEEE International Conference on Image Processing, pp. 4127-4131, 2014.

[7] X. Zhao, L. Zhang, S. Ma, W. Gao, “Rate-distortion optimized transform”, ISO/IEC JTC1/SC29/WG11, MPEG m16926, Oct. 2009.

[8] X. Zhao, J. Chen, M. Karczewicz, L. Zhang, X. Li and W.-J. Chien, “Enhanced multiple transform for video coding,” in Proc. Data Compression Conference, pp. 73-82, 2016.

[9] X. Zhao, X. Li, S. Liu, “CE6: On 8-bit primary transform core (Test 6.1.3),” JVET-L0285, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Macao, CN, Oct. 2018.

[10] X. Zhao, X. Li, Y. Luo, S. Liu, “CE6: Fast DST-7/DCT-8 with dual implementation support (Test 6.2.3),” JVET-M0497, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Marrakesh, MA, January 2019.

[11] Z. Zhang, X. Zhao, X. Li, Z. Li, S. Liu, “Fast adaptive multiple transform for versatile video coding,” in Proc. Data Compression Conference, 2019.

[12] M. Koo, M. Salehifar, J. Lim, S. Kim, “CE6-related: 32 point MTS based on skipping high frequency coefficients,” JVET-M0297, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Marrakesh, MA, January 2019.

[13] Y. Zhao, H. Gao, H. Yang, J. Chen, “CE6: Sub-block transform for inter blocks (Test 6.4.1),” JVET-M0140, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Marrakesh, MA, January 2019.