These days, you don’t need much to generate stunning videos with AI. Hunyuan Sonic is a nifty approach to breathing life into static images. It uses temporal audio learning for accurate lip-sync and natural expressions. With its motion-decoupled controller, head motion and expression movement “are disentangled and independently controlled by intra-audio clips.”
Sonic generates videos from an image and an audio input, and it can produce long videos of up to 10 minutes. As the above video shows, the results are dynamic and natural-looking. Sonic also works well with images of subjects that are not real humans.
[HT: Zhejiang University, Tencent]