A VMAF Directed Perceptual Rate Distortion Optimization for Video Coding


Video Multimethod Assessment Fusion (VMAF), a state-of-the-art learning-based objective quality metric, has been shown to correlate more strongly with the human visual system than other metrics. However, because it fuses multiple metrics with a Support Vector Machine (SVM), its high complexity and non-differentiability make direct integration into an encoder impractical. To alleviate this problem, this paper proposes a novel VMAF-directed perceptual distortion metric (D-VMAF) for optimizing perceptual quality. D-VMAF fits the relationship between frame-level VMAF and block-level content using a combination of spatial and temporal factors. Based on the proposed metric, the Lagrangian multiplier in rate-distortion optimization (RDO) is adjusted adaptively. Experimental results show that the method achieves a 3% coding performance gain in terms of BD-VMAF over the original HM encoder.
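As a rough illustration of what adjusting the Lagrangian multiplier means in practice, the sketch below picks the coding mode that minimizes the standard RDO cost J = D + λ·R and scales λ per block. The abstract does not give the D-VMAF formula, so `block_weight`, `adapted_lambda`, and the division-based scaling rule are hypothetical placeholders, not the paper's method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    distortion: float  # block distortion, e.g. sum of squared errors
    bits: float        # estimated rate in bits


def adapted_lambda(base_lambda: float, block_weight: float) -> float:
    # Hypothetical rule (an assumption, not taken from the paper):
    # a weight above 1 marks a perceptually sensitive block, so lambda
    # is lowered and the encoder spends more bits on that block.
    return base_lambda / block_weight


def best_mode(candidates, base_lambda: float, block_weight: float = 1.0) -> Candidate:
    # Standard RDO decision: minimize J = D + lambda * R over the candidates.
    lam = adapted_lambda(base_lambda, block_weight)
    return min(candidates, key=lambda c: c.distortion + lam * c.bits)


candidates = [
    Candidate("skip", distortion=100.0, bits=2.0),
    Candidate("intra", distortion=40.0, bits=20.0),
]
default_choice = best_mode(candidates, base_lambda=4.0)
sensitive_choice = best_mode(candidates, base_lambda=4.0, block_weight=2.0)
```

With the unmodified multiplier the cheaper mode wins; halving λ for a sensitive block shifts the decision toward the lower-distortion mode.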

2020 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)
Li Song
Professor, IEEE Senior Member

Professor and doctoral supervisor; Deputy Director of the Institute of Image Communication and Network Engineering at Shanghai Jiao Tong University; double-appointed professor of the Institute of Artificial Intelligence and the Collaborative Innovation Center of Future Media Network; Deputy Secretary-General of the China Video User Experience Alliance and head of its standards group.