AI video models keep improving, and Alibaba’s Wan2.1 is one of the strongest yet. It handles both text-to-video and image-to-video generation, and it reportedly outperforms other open-source models and many commercial solutions.
Wan2.1 also supports video editing, text-to-image, and video-to-audio generation. The Wan2.1-T2V-14B model can produce 720p video. It is billed as the first video model that can render both Chinese and English text in its output. The smaller T2V-1.3B model requires 8.19 GB of VRAM, so it should run on consumer-grade GPUs; it can generate a 5-second 480p video on an RTX 4090 in about 4 minutes.