HEVC VMAF-oriented Perceptual Rate Distortion Optimization using CNN

Abstract

Video coding standards such as HEVC and VVC have achieved significant gains in coding performance. However, the rate-distortion optimization (RDO) module in these coding frameworks ignores the characteristics of the human visual system (HVS), which limits its effectiveness for perceptual video coding. The learning-based objective quality metric VMAF was developed recently and has been shown to assess quality more accurately than conventional metrics. This paper proposes a perceptual RDO scheme that incorporates VMAF into RDO to improve perceptual coding efficiency. A CNN-based online training method is first explored to determine the VMAF-related distortion estimation coefficient. Based on this coefficient and the R-D model, a VMAF-based Lagrangian multiplier is then proposed to adjust the R-D performance of each coding block. Experiments demonstrate that the proposed method achieves an average VMAF-based BD-Rate of -2.80% compared with the original HEVC, which effectively improves coding performance.
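The role of a block-level Lagrangian multiplier can be illustrated with a minimal sketch. This is not the formulation from the paper, which the abstract does not give. It assumes the standard RDO cost J = D + λR and, purely for illustration, that a block's perceptual distortion is approximately a coefficient k times its SSE distortion, so that minimizing k·D + λR is equivalent to minimizing D + (λ/k)·R. The names `rd_cost`, `perceptual_lambda`, `pick_mode`, and the coefficient `k` are hypothetical, and the candidate numbers are made up.

```python
def rd_cost(distortion, rate_bits, lam):
    """Standard rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def perceptual_lambda(base_lambda, k):
    """Rescale the multiplier for one block.

    Assumption (illustrative only): perceptual distortion ~= k * SSE,
    so the equivalent SSE-domain multiplier is base_lambda / k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return base_lambda / k


def pick_mode(candidates, lam):
    """Return the name of the candidate with the lowest R-D cost.

    Each candidate is a tuple (name, sse_distortion, rate_bits).
    """
    best = min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))
    return best[0]


# Made-up candidates for a single coding block.
candidates = [("skip", 900.0, 10), ("intra", 300.0, 120)]
base_lambda = 5.0

# Unmodified multiplier.
mode_default = pick_mode(candidates, base_lambda)

# Block where distortion is assumed to be less visible (k < 1):
# the multiplier grows, so saving bits is weighted more heavily.
mode_low_k = pick_mode(candidates, perceptual_lambda(base_lambda, 0.5))

print(mode_default, mode_low_k)
```

With these numbers the unmodified multiplier selects the lower-distortion candidate, while the rescaled multiplier selects the cheaper one, which shows how a per-block coefficient can shift the rate-distortion trade-off.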

Publication
2021 Picture Coding Symposium (PCS)
Li Song
Professor, IEEE Senior Member

Professor and doctoral supervisor; Deputy Director of the Institute of Image Communication and Network Engineering at Shanghai Jiao Tong University; double-appointed professor at the Institute of Artificial Intelligence and the Collaborative Innovation Center of Future Media Network; Deputy Secretary-General of the China Video User Experience Alliance and head of the standards group.