This paper presents several novel GPU optimization techniques to accelerate the Super-Resolution Convolutional Neural Network (SRCNN), one of the best super-resolution algorithms. We first parallelize and implement SRCNN directly, then accelerate the convolutions by exploiting the GPU memory hierarchy. We explore different optimization methods for each convolution and select the fastest combination. Further acceleration is achieved by fusing the convolution and ReLU (rectified linear unit) operations, which eliminates the memory accesses of a separate ReLU pass. Our experiments show that the overall execution time for 1080p-to-4K upscaling is reduced from 300 s/frame to 0.15 s/frame, while the image quality remains exactly the same as that of the original SRCNN.
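To illustrate why fusing the convolution with ReLU can leave the output unchanged while removing one pass over memory, the following is a minimal CPU reference sketch in C++. It is not the GPU implementation described in this paper: the function names `conv2d` and `relu_inplace`, the single-channel "valid" convolution, and the `fuse_relu` flag are all assumptions made for illustration. In the fused path, ReLU is applied to the accumulator before the single store, so the intermediate feature map is never written out and read back.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical single-channel "valid" 2D convolution (cross-correlation form).
// in: H x W, row-major; k: K x K, row-major; output: (H-K+1) x (W-K+1).
// If fuse_relu is true, ReLU is applied to the accumulator before the store,
// so no separate pass over the output buffer is needed.
std::vector<float> conv2d(const std::vector<float>& in, int H, int W,
                          const std::vector<float>& k, int K,
                          float bias, bool fuse_relu) {
    const int OH = H - K + 1;
    const int OW = W - K + 1;
    std::vector<float> out(static_cast<std::size_t>(OH) * OW);
    for (int y = 0; y < OH; ++y) {
        for (int x = 0; x < OW; ++x) {
            float acc = bias;
            for (int i = 0; i < K; ++i) {
                for (int j = 0; j < K; ++j) {
                    acc += in[static_cast<std::size_t>(y + i) * W + (x + j)] *
                           k[static_cast<std::size_t>(i) * K + j];
                }
            }
            // Fused path: clamp in the accumulator, then store once.
            out[static_cast<std::size_t>(y) * OW + x] =
                fuse_relu ? std::max(acc, 0.0f) : acc;
        }
    }
    return out;
}

// Unfused path: a second full read and write over the feature map.
void relu_inplace(std::vector<float>& t) {
    for (float& v : t) {
        v = std::max(v, 0.0f);
    }
}
```

Calling `conv2d(..., true)` yields the same values as `conv2d(..., false)` followed by `relu_inplace`, since ReLU is applied element-wise to the same accumulated sums. On a GPU, the analogous saving would come from skipping the extra global-memory round trip of a separate ReLU kernel.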