Compression Priors Assisted Convolutional Neural Network for Fractional Interpolation


Fractional interpolation is widely used in video coding standards to generate fractional-precision predictions that remove temporal redundancy between consecutive frames. In traditional interpolation-filter-based methods, fractional samples are interpolated as a linear combination of neighboring integer samples. This approach is simple but cannot accurately characterize nonstationary video signals. Recently, convolutional neural networks (CNNs) have been applied to fractional interpolation and have shown superior performance compared with the traditional methods. However, these methods use only the reconstruction of the reference frame as the information source for inference. All other information contained in the bitstream or generated during encoding and decoding, denoted as compression priors, is left unused. In this paper, we make the first attempt to incorporate compression priors into a CNN to improve the performance of an in-loop coding tool. Specifically, we propose a Compression Priors assisted Convolutional Neural Network (CPCNN) to further improve fractional interpolation efficiency. In addition to the reconstructed component, we utilize two other compression priors to boost performance: the corresponding residual component and the co-located high-quality component. The residual component, which indicates the prediction efficiency and contains effective texture information, serves as a complementary input to the reconstructed one, while the co-located component provides additional high-quality information that helps the reconstruction overcome quality fluctuation. Furthermore, a special network structure is designed to learn powerful representations of these three input components. Comprehensive experiments have been conducted to demonstrate the effectiveness of the proposed CPCNN. The experimental results show that, compared with HEVC, the proposed CPCNN achieves average BD-rate savings of 5.3%, 2.8%, and 1.9% under the LDP, LDB, and RA configurations, respectively.
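For readers unfamiliar with the traditional baseline mentioned in the abstract, the following is a minimal sketch of filter-based fractional interpolation, in which each half-sample value is a fixed linear combination of neighboring integer samples. It uses the 8-tap half-sample luma coefficients from HEVC; the function name, the 1-D row setting, the border clamping, and the single-stage rounding are simplifying assumptions for illustration (a real codec filters in two dimensions with higher intermediate precision). It does not represent the CPCNN method itself.

```python
# HEVC 8-tap luma filter for the half-sample position (coefficients sum to 64).
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def interpolate_half_pel(samples, bit_depth=8):
    """Return the half-sample values between samples[i] and samples[i + 1].

    Hypothetical helper for illustration: works on a single 1-D row of
    integer samples and clamps indices at the borders (edge padding).
    """
    n = len(samples)
    max_val = (1 << bit_depth) - 1
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            # The taps cover integer positions i-3 .. i+4.
            idx = min(max(i - 3 + k, 0), n - 1)
            acc += tap * samples[idx]
        # Round, normalize by 64, and clip to the valid sample range.
        out.append(min(max((acc + 32) >> 6, 0), max_val))
    return out


if __name__ == "__main__":
    row = [0, 0, 0, 0, 255, 255, 255, 255]
    print(interpolate_half_pel(row))
```

Because the coefficients are fixed regardless of content, the same weights are applied to flat regions and sharp edges alike, which is the limitation on nonstationary signals that learned interpolation aims to address.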

IEEE Transactions on Circuits and Systems for Video Technology
Li Song
Professor, IEEE Senior Member

Professor and doctoral supervisor; Deputy Director of the Institute of Image Communication and Network Engineering at Shanghai Jiao Tong University; double-appointed professor at the Institute of Artificial Intelligence and the Collaborative Innovation Center of Future Media Network; Deputy Secretary-General of the China Video User Experience Alliance and head of its standards group.