It is no secret that top AI models are trained using a ton of H100 GPU hours. Seaweed 7B aims to be a more cost-effective option. It has 7B parameters and learns from multi-modal data such as video, images, and text to generate realistic videos. It is a versatile model that can generate a variety of realistic landscapes and human actions.
Seaweed lets you use an image as a reference to generate your video. It can also generate video driven by audio input. As the researchers explain:
“Seaweed is adapted to generate content conditioned on audio inputs by Omnihuman, enabling the creation of realistic human characters that perfectly match the voice in the audio.”
It can make 20-second videos without any extension. With an extension, it can generate videos up to a minute long. Seaweed is also capable of generating audio and video together.