Frame interpolation aims to generate intermediate frames between existing ones, which is challenging because of complex video scenes and motion. Standard methods first estimate the motion between two consecutive frames and then synthesize the new ones. In this paper, we propose a novel frame interpolation method based on the video synthesis approach deep voxel flow (DVF). In DVF, a deep convolutional encoder-decoder predicts a 3D voxel flow, and a volume sampling layer then synthesizes the intermediate frame guided by this flow. To improve the accuracy of the voxel flow, we employ recurrent convolutional layers (RCL) in the encoder-decoder module to refine the flow step by step; we call the resulting model DVF-RCL. We also incorporate a perceptual loss to improve visual quality. Experiments demonstrate that our method greatly improves on the performance of the original DVF and produces results that compare favorably to state-of-the-art methods both quantitatively and qualitatively.
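To make the synthesis step concrete, the following is a minimal, hypothetical sketch of how a volume sampling layer can turn a predicted voxel flow into an intermediate frame. It is not the paper's implementation: images are plain nested lists of grayscale values, the flow at each pixel is assumed to be a triple (dx, dy, dt) with a spatial displacement and a temporal blending weight in [0, 1], and nearest-neighbour sampling stands in for the differentiable trilinear sampling used in DVF.

```python
def synthesize_middle_frame(frame0, frame1, flow):
    """Synthesize an intermediate frame from two input frames and a
    per-pixel voxel flow.

    frame0, frame1: H x W grayscale images (nested lists of floats).
    flow: H x W list of (dx, dy, dt) triples; (dx, dy) is a spatial
          displacement, dt in [0, 1] a temporal blending weight.
    """
    h, w = len(frame0), len(frame0[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy, dt = flow[y][x]
            # Nearest-neighbour sample of each frame along the flow,
            # clamped to the image border (DVF uses trilinear sampling).
            x0 = min(max(round(x - dx), 0), w - 1)
            y0 = min(max(round(y - dy), 0), h - 1)
            x1 = min(max(round(x + dx), 0), w - 1)
            y1 = min(max(round(y + dy), 0), h - 1)
            # The temporal component dt blends the two warped samples.
            out[y][x] = (1 - dt) * frame0[y0][x0] + dt * frame1[y1][x1]
    return out
```

With zero spatial displacement and dt = 0.5 everywhere, the synthesized frame is simply the average of the two inputs; a learned flow instead moves each sample along the estimated motion before blending.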