Frame interpolation aims to generate intermediate frames from existing ones, a task made challenging by complex video scenes and motion. Standard methods first estimate motion between two consecutive frames and then synthesize new frames. In this paper, we propose a novel frame interpolation method based on deep voxel flow (DVF), a video synthesis approach. In DVF, a deep convolutional encoder-decoder predicts a 3D voxel flow, and a volume sampling layer then synthesizes the intermediate frame guided by that flow. To improve the accuracy of the voxel flow, we employ recurrent convolutional layers (RCL) in the encoder-decoder module to refine the flow step by step; we call this model DVF-RCL. We also incorporate a perceptual loss to increase visual quality. Experiments demonstrate that our method greatly improves on the original DVF and produces results that compare favorably to state-of-the-art methods both quantitatively and qualitatively.
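To make the flow-guided synthesis step concrete, the following is a minimal NumPy sketch of volume sampling for the middle frame at t = 0.5: each frame is warped along the predicted per-pixel flow and the two warped frames are blended with a per-pixel weight. All function names here are illustrative, and the sketch simplifies the full trilinear spatiotemporal sampling to bilinear spatial sampling plus a temporal blend on a single-channel image; it is not the authors' implementation.

```python
import numpy as np

def bilinear_sample(img, ys, xs):
    """Sample a 2D image at float coordinates with bilinear
    interpolation, clamping out-of-range coordinates to the border."""
    h, w = img.shape
    ys = np.clip(ys, 0, h - 1)
    xs = np.clip(xs, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = ys - y0
    wx = xs - x0
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bot = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bot * wy

def synthesize_middle_frame(frame0, frame1, flow_dy, flow_dx, blend):
    """Synthesize the frame at t = 0.5: warp frame0 backward and
    frame1 forward along the flow, then blend the two samples."""
    h, w = frame0.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    s0 = bilinear_sample(frame0, ys - flow_dy, xs - flow_dx)
    s1 = bilinear_sample(frame1, ys + flow_dy, xs + flow_dx)
    return blend * s0 + (1 - blend) * s1
```

In the full model, `flow_dy`, `flow_dx`, and `blend` are the outputs of the encoder-decoder network, and because the sampling is differentiable, reconstruction and perceptual losses on the synthesized frame can be backpropagated through it to train the flow predictor end to end.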