Video Frame Interpolation Using Recurrent Convolutional Layers

Abstract

Frame interpolation attempts to generate intermediate frames from existing ones, which is challenging because of complex video scenes and motion. Standard methods first estimate motion between two consecutive frames and then synthesize new ones. In this paper, we propose a novel frame interpolation method based on the video synthesis approach deep voxel flow (DVF). In DVF, a deep convolutional encoder-decoder predicts 3D voxel flow, and a volume sampling layer then synthesizes the intermediate frame guided by the flow. To improve the accuracy of the voxel flow, we employ recurrent convolutional layers (RCL) in the encoder-decoder module to refine the flow step by step; we call the resulting method DVF-RCL. We also incorporate a perceptual loss to increase visual quality. Experiments demonstrate that our method greatly improves on the performance of the original DVF and produces results that compare favorably to state-of-the-art methods both quantitatively and qualitatively.
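
To make the volume sampling step concrete, the following is a minimal NumPy sketch, not the implementation from the paper. It assumes grayscale frames, a per-pixel flow with channels (dx, dy, dt), and the symmetric convention in which the earlier frame is sampled at (x - dx, y - dy), the later frame at (x + dx, y + dy), and the two samples are blended with temporal weight dt. The function names bilinear_sample and volume_sample are hypothetical, and the encoder-decoder that would predict the flow is omitted.

```python
import numpy as np


def bilinear_sample(frame, xs, ys):
    """Sample a 2D frame at float coordinates, clamping to the image border."""
    h, w = frame.shape
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = xs - x0  # horizontal interpolation weight
    wy = ys - y0  # vertical interpolation weight
    top = (1 - wx) * frame[y0, x0] + wx * frame[y0, x1]
    bottom = (1 - wx) * frame[y1, x0] + wx * frame[y1, x1]
    return (1 - wy) * top + wy * bottom


def volume_sample(frame0, frame1, flow):
    """Blend two frames using a voxel flow of shape (H, W, 3) = (dx, dy, dt).

    Assumed convention: frame0 is read at (x - dx, y - dy), frame1 at
    (x + dx, y + dy), and dt in [0, 1] weights the later frame.
    """
    h, w = frame0.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    dx, dy, dt = flow[..., 0], flow[..., 1], flow[..., 2]
    from0 = bilinear_sample(frame0, xs - dx, ys - dy)
    from1 = bilinear_sample(frame1, xs + dx, ys + dy)
    return (1 - dt) * from0 + dt * from1
```

With zero spatial flow and dt = 0.5 this reduces to a plain average of the two frames; non-zero (dx, dy) shifts where each frame is read before blending.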

Publication
2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM)
Li Song
Professor, IEEE Senior Member

Professor and Doctoral Supervisor; Deputy Director of the Institute of Image Communication and Network Engineering at Shanghai Jiao Tong University; Double-Appointed Professor of the Institute of Artificial Intelligence and the Collaborative Innovation Center of Future Media Network; Deputy Secretary-General of the China Video User Experience Alliance and head of its standards group.