Learning-Based Quality Enhancement For Scalable Coded Video Over Packet Lossy Networks


The layered structure of scalable video coding (SVC) allows a bitstream to adapt to unreliable transmission. When network conditions degrade sharply, the enhancement layers are dropped and only the base layers are delivered. However, this causes noticeable visual artifacts because of the quality gap between layers. To alleviate this problem, we introduce a deep learning-based method into the video reconstruction phase of scalable bitstreams. A super-resolution-motivated recurrent network is proposed to extract and fuse features from both previous high-resolution frames and the current low-resolution frame. To the best of our knowledge, this is the first attempt to improve the reconstruction of scalable bitstreams with a specifically designed super-resolution network. By seamlessly integrating the accessible features, the method achieves significant video quality improvements in terms of PSNR, SSIM, and VMAF. It also clearly improves the stability of overall visual quality over packet-lossy networks, indicating both the efficiency and the robustness of our approach.
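
For readers unfamiliar with the first of the quality metrics named above, the following is a minimal sketch of how PSNR is conventionally computed between a reference frame and a reconstructed frame. It is not the authors' evaluation code: the function name `psnr`, the flat-list frame representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Return PSNR in dB between two equal-length sequences of pixel values.

    Hypothetical helper: frames are assumed to be flattened to 1-D sequences,
    and max_value=255 assumes 8-bit samples.
    """
    if len(reference) != len(distorted):
        raise ValueError("inputs must have the same number of pixels")
    if not reference:
        raise ValueError("inputs must not be empty")

    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)

    # Identical frames have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    # PSNR = 10 * log10(MAX^2 / MSE).
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the reconstructed frame is closer to the reference. SSIM and VMAF are perceptual metrics with more involved definitions and are not sketched here.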

2020 IEEE International Conference on Multimedia and Expo (ICME)
Li Song
Professor, IEEE Senior Member

Professor and doctoral supervisor; Deputy Director of the Institute of Image Communication and Network Engineering at Shanghai Jiao Tong University; double-appointed professor at the Institute of Artificial Intelligence and the Collaborative Innovation Center of Future Media Network; Deputy Secretary-General of the China Video User Experience Alliance and head of the standards group.