Learning a No Reference Quality Assessment Metric for Encoded 4K-UHD Video


4K-UHD videos have become popular since they significantly improve users' visual experience. As video coding, transmission, and enhancement technologies develop rapidly, existing quality assessment metrics are unsuitable for the 4K-UHD scenario because of the expanded resolution and a lack of training data. In this paper, we present a no-reference video quality assessment model that achieves high performance in the 4K-UHD scenario by approximating the full-reference metric VMAF. Our approach extracts deep spatial features and optical-flow-based temporal features from cropped frame patches. The overall score for a video clip is obtained as a weighted average of the patch results, so as to fully reflect the content of high-resolution video frames. The model is trained on an automatically generated HEVC-encoded 4K-UHD dataset labeled by VMAF. This dataset-construction strategy can easily be extended to other scenarios, such as HD resolution and other distortion types, by modifying the dataset and adjusting the network. Without any reference video, our proposed model achieves considerable accuracy against VMAF labels and high correlation with human ratings, as well as relatively fast processing speed.
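The patch-based aggregation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`crop_patches`, `aggregate_scores`), the 224-pixel patch size, and the non-overlapping stride are assumptions for demonstration; the per-patch scores and weights would in practice come from the trained network.

```python
import numpy as np

def crop_patches(frame, patch_size=224, stride=224):
    """Crop non-overlapping patches from a frame of shape (H, W, C).

    Patch size and stride are illustrative choices, not the paper's.
    """
    h, w = frame.shape[:2]
    patches = []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patches.append(frame[y:y + patch_size, x:x + patch_size])
    return patches

def aggregate_scores(patch_scores, patch_weights):
    """Weighted average of per-patch quality scores into one clip score."""
    scores = np.asarray(patch_scores, dtype=np.float64)
    weights = np.asarray(patch_weights, dtype=np.float64)
    return float((scores * weights).sum() / weights.sum())
```

For example, a 448x448 frame yields four 224x224 patches, and patch scores of 80 and 90 with weights 1 and 3 aggregate to 87.5.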

Digital TV and Wireless Multimedia Communication
Li Song
Professor, IEEE Senior Member

Professor and Doctoral Supervisor; Deputy Director of the Institute of Image Communication and Network Engineering at Shanghai Jiao Tong University; Double-Appointed Professor of the Institute of Artificial Intelligence and the Collaborative Innovation Center of Future Media Network; Deputy Secretary-General of the China Video User Experience Alliance and head of its standards group.