Abstract
Recently, learning-based models have enhanced the performance of single-image super-resolution (SISR). However, applying SISR successively to each video frame leads to a lack of temporal coherency. Convolutional neural networks (CNNs) outperform traditional approaches in terms of image quality metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). On the other hand, generative adversarial networks (GANs) offer a competitive advantage in their ability to mitigate the loss of finer texture details usually seen with CNNs when super-resolving at large upscaling factors. We present iSeeBetter, a novel GAN-based spatio-temporal approach to video super-resolution (VSR) that renders temporally consistent super-resolved videos. iSeeBetter extracts spatial and temporal information from the current and neighboring frames using the concept of recurrent back-projection networks as its generator. Furthermore, to improve the "naturality" of the super-resolved output while eliminating artifacts seen with traditional algorithms, we utilize the discriminator from the super-resolution generative adversarial network (SRGAN). Although minimizing mean squared error (MSE) as the primary objective improves PSNR/SSIM, these metrics may fail to capture fine details in the image, leading to a misrepresentation of perceptual quality. To address this, we use a four-fold loss function combining MSE, perceptual, adversarial, and total-variation losses. Our results demonstrate that iSeeBetter offers superior VSR fidelity and surpasses state-of-the-art performance.
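As an illustrative sketch only (not the paper's implementation), the four-fold objective can be expressed as a weighted sum of its components. The weights and the `perceptual`/`adversarial` terms below are placeholders; in practice the perceptual term would come from a pretrained feature network and the adversarial term from the discriminator:

```python
import numpy as np

def mse_loss(sr, hr):
    # Pixel-wise mean squared error between super-resolved and ground-truth frames.
    return np.mean((sr - hr) ** 2)

def tv_loss(img):
    # Total-variation regularizer: mean absolute difference between
    # vertically and horizontally adjacent pixels, encouraging smoothness.
    dh = np.abs(img[1:, :] - img[:-1, :]).mean()
    dw = np.abs(img[:, 1:] - img[:, :-1]).mean()
    return dh + dw

def fourfold_loss(sr, hr, perceptual, adversarial, w=(1.0, 6e-3, 1e-3, 2e-8)):
    # Weighted combination of the four terms; the weights `w` are
    # illustrative placeholders, not values from the paper.
    return (w[0] * mse_loss(sr, hr)
            + w[1] * perceptual
            + w[2] * adversarial
            + w[3] * tv_loss(sr))
```

In a training loop, `perceptual` and `adversarial` would be scalars produced per batch by the feature extractor and discriminator, and the combined scalar would be backpropagated through the generator.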