SkyReels-V1 Open Source Human Centric AI Video Model

SkyReels V1 is an open source, human-centric AI video model that produces videos comparable to Kling and Hailuo. It was built by fine-tuning HunyuanVideo on O(10M) high-quality film and television clips, and it offers advanced facial animation with 33 distinct facial expressions and over 400 natural movement combinations.

According to the project's website, the model was trained with a multi-stage image-to-video pretraining pipeline inspired by the HunyuanVideo design:

Stage 1: Model Domain Transfer Pretraining: We use a large dataset (O(10M) of film and television content) to adapt the text-to-video model to the human-centric video domain.
Stage 2: Image-to-Video Model Pretraining: We convert the text-to-video model from Stage 1 into an image-to-video model by adjusting the conv-in parameters. This new model is then pretrained on the same dataset used in Stage 1.
Stage 3: High-Quality Fine-Tuning: We fine-tune the image-to-video model on a high-quality subset of the original dataset, ensuring superior performance and quality.
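
The project does not spell out how the conv-in parameters are adjusted in Stage 2. One common way to turn a text-to-video model into an image-to-video model is to widen the first convolution so it accepts extra image-conditioning channels, copying the pretrained weights and zero-initializing the new ones. The sketch below illustrates that idea with NumPy under those assumptions; the function name `expand_conv_in` and all tensor sizes are hypothetical and are not taken from the SkyReels code.

```python
import numpy as np


def expand_conv_in(weight: np.ndarray, extra_in_channels: int) -> np.ndarray:
    """Return a conv weight with extra zero-initialized input channels.

    `weight` has shape (out_channels, in_channels, *kernel_dims).
    The pretrained weights are copied unchanged; the new input
    channels start at zero.
    """
    if extra_in_channels < 0:
        raise ValueError("extra_in_channels must be non-negative")
    out_ch, in_ch = weight.shape[:2]
    expanded = np.zeros(
        (out_ch, in_ch + extra_in_channels, *weight.shape[2:]),
        dtype=weight.dtype,
    )
    expanded[:, :in_ch] = weight
    return expanded


# Hypothetical sizes: 16 latent channels in, 8 features out, 1x2x2 kernel.
rng = np.random.default_rng(0)
t2v_weight = rng.standard_normal((8, 16, 1, 2, 2)).astype(np.float32)

# Assume the image condition is concatenated as 16 extra latent channels.
i2v_weight = expand_conv_in(t2v_weight, extra_in_channels=16)

# Apply both kernels to a single patch to compare their outputs.
noisy_latent = rng.standard_normal((16, 1, 2, 2)).astype(np.float32)
image_latent = rng.standard_normal((16, 1, 2, 2)).astype(np.float32)
i2v_input = np.concatenate([noisy_latent, image_latent], axis=0)

t2v_out = np.einsum("oikhw,ikhw->o", t2v_weight, noisy_latent)
i2v_out = np.einsum("oikhw,ikhw->o", i2v_weight, i2v_input)
```

With zero-initialized new channels, the widened layer initially gives the same output as the original one, so further pretraining starts from the text-to-video behavior and gradually learns to use the image condition.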

SkyReels V1 scored 82.43 on VBench, higher than the open source models VideoCrafter 2.0 VEnhancer and CogVideoX1.5-5b.

