Video is becoming the “biggest big data”: the most important and valuable source of insights and information.

Research in our lab covers a broad range of topics in video signals. Inspired by state-of-the-art representation models such as deep learning, and driven by modern biological computational models of visual experience, we are working on better solutions for video analysis, video processing, and video compression. Specifically, we explore each research topic from three aspects: algorithm, computation, and data.

@sjtu-medialab   @media_tech

Follow our GitHub for academic open source projects.
Follow our WeChat Official Account for the latest progress in media technology.

Recent Publications

A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution
Complexity-Oriented Per-Shot Video Coding Optimization
Generative Compression for Face Video: A Hybrid Scheme
Hiding Among Your Neighbors: Face Image Privacy Protection with Differential Private k-Anonymity
Low-Complexity Multi-Model CNN in-Loop Filter for AVS3
Multi-Scale Coarse-to-Fine Transformer for Frame Interpolation
Reinforcement Learning Based Cross-Layer Congestion Control for Real-Time Communication
Dense 3D Coordinate Code Prior Guidance for High-Fidelity Face Swapping and Face Reenactment

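Abstract

In face synthesis tasks, commonly used 2D face representations (such as 2D facial landmarks and face segmentation maps) are usually sparse and discontinuous, and therefore cannot guide face synthesis precisely. To overcome these shortcomings, we adopt a dense, continuous-valued face representation, the Projected Normalized Coordinate Code (PNCC) [1], as guidance, and propose a PNCC-Spatio-Normalization (PSN) method to synthesize faces with arbitrary pose and expression. Based on PSN, we design a framework for face swapping and face reenactment, together with a simple and effective blending algorithm, the Appearance-Blending Module (ABM). Our method requires no additional training or fine-tuning, and experiments demonstrate its superiority.

Introduction

Photorealistic face synthesis is an emerging research topic in computer vision and graphics, in which face swapping and face reenactment are two promising subtasks. Face swapping transfers the identity of a source face onto a target face, whereas face reenactment uses the pose and expression of a target face to drive a source face. Both have attracted attention for their potential applications in entertainment, privacy, virtual reality, and video dubbing.

In face swapping and face reenactment, 2D face representations (such as facial landmarks and face segmentation maps) are often used to guide face synthesis. However, these representations are frequently too sparse to guide synthesis precisely, and a purely 2D representation cannot disentangle face attributes such as identity, pose, and expression. Other methods encode faces in a latent space and synthesize from there; although they achieve good experimental results, the synthesis process lacks controllability and flexibility.

To address these problems, we propose using a denser face representation, the Projected Normalized Coordinate Code (PNCC) [1], to guide face synthesis precisely and to disentangle and control face attributes. PNCC is a 2D face representation based on 3D face reconstruction; it is dense, and its values are continuous within local neighborhoods. In previous work, PNCC has served only as auxiliary information to help synthesize faces, so its potential has not been fully exploited.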
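To make the idea of a dense, continuous-valued representation concrete, the following is a minimal sketch of how a PNCC-style image might be built, based on how PNCC is commonly described rather than on the authors' code. It assumes that each vertex of a reference 3D face gets a color equal to its per-axis normalized coordinates, and that these colors are then drawn at the projected positions of the posed face with a z-buffer. The function names, the point-splatting renderer, and the convention that larger z is closer to the camera are all illustrative assumptions; the PSN and ABM modules are not shown.

```python
def normalized_coordinate_code(vertices):
    """Rescale each axis of a list of (x, y, z) vertices to [0, 1].

    Hypothetical helper: the result is one RGB-like code per vertex.
    """
    codes = [[0.0, 0.0, 0.0] for _ in vertices]
    for axis in range(3):
        values = [v[axis] for v in vertices]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # guard against a degenerate (flat) axis
        for i, value in enumerate(values):
            codes[i][axis] = (value - lo) / span
    return codes


def project_ncc(posed_vertices, codes, height, width):
    """Splat per-vertex codes into an image, keeping the closest vertex.

    Assumptions: posed_vertices are already in pixel coordinates
    (x to the right, y downward) and larger z means closer to the camera.
    A real renderer would rasterize triangles; splatting points leaves holes.
    """
    image = [[(0.0, 0.0, 0.0) for _ in range(width)] for _ in range(height)]
    depth = [[float("-inf")] * width for _ in range(height)]
    for (x, y, z), code in zip(posed_vertices, codes):
        row, col = round(y), round(x)
        inside = 0 <= row < height and 0 <= col < width
        if inside and z > depth[row][col]:
            depth[row][col] = z
            image[row][col] = tuple(code)
    return image
```

Because the code of each vertex is fixed on the reference face, the same facial point keeps the same color under any pose or expression, which is what lets such an image act as a dense correspondence map between two faces.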

Modeling Acceleration Properties for Flexible INTRA HEVC Complexity Control
3D-BitNet: Flow-Agnostic and Precise Network for Video Bit-Depth Expansion
SpaAbr: Size Prediction Assisted Adaptive Bitrate Algorithm for Scalable Video Coding Contents