Video coding standards such as HEVC and VVC have achieved significant gains in coding performance. However, the rate-distortion optimization (RDO) module in these coding frameworks ignores the characteristics of the human visual system (HVS), which limits its effectiveness for perceptual video coding. Recently, the learning-based objective quality metric VMAF was developed and has been shown to assess quality more accurately than conventional metrics. To incorporate VMAF into RDO and thereby improve perceptual coding efficiency, this paper proposes a perceptual RDO scheme. A CNN-based online training method is first explored to determine the VMAF-related distortion estimation coefficient. Based on this coefficient and the R-D model, a VMAF-based Lagrangian multiplier is then proposed to adjust the R-D performance of each coding block. Experiments demonstrate that the proposed method achieves an average VMAF-based BD-rate of -2.80% relative to the original HEVC, that is, a bitrate reduction at equal VMAF quality, which effectively improves perceptual coding performance.
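As a rough illustration of how a block-level coefficient can reshape the Lagrangian trade-off, the sketch below assumes, purely for illustration, that the VMAF-related distortion of a block is proportional to its conventional distortion, D_vmaf ≈ k · D. Under that assumption, minimizing k · D + λR is equivalent to minimizing D + (λ/k)R, so the block uses the multiplier λ/k. The linear model, the function names `perceptual_lambda` and `best_mode`, and the candidate values are all hypothetical and are not taken from the paper.

```python
def perceptual_lambda(base_lambda, k):
    """Scale the multiplier for one block, ASSUMING D_vmaf ~= k * D (hypothetical model)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return base_lambda / k


def best_mode(candidates, lam):
    """Return the (name, distortion, rate) tuple minimizing J = D + lam * R."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])


# Made-up candidates for one coding block: (mode name, distortion, rate in bits).
candidates = [("skip", 400.0, 10.0), ("intra", 100.0, 60.0)]
base_lambda = 5.0

# A small k (distortion matters less perceptually) raises lambda and favors low rate;
# a large k lowers lambda and favors low distortion.
low_sensitivity = best_mode(candidates, perceptual_lambda(base_lambda, 0.5))
high_sensitivity = best_mode(candidates, perceptual_lambda(base_lambda, 2.0))
```

With these made-up numbers, the block with k = 0.5 selects the cheaper "skip" mode, while the block with k = 2.0 selects the lower-distortion "intra" mode, which is the kind of per-block rebalancing a perceptually weighted multiplier is meant to produce.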