Zonos Open Weight Text to Speech Model

There are already plenty of tools that can generate humanlike AI voices. Zonos-v0.1 is an open-weight text-to-speech model that, according to its developer Zyphra, delivers expressive audio on par with top TTS providers. High-fidelity voice cloning needs a reference clip of 5 to 30 seconds of speech. You can also choose from several preset voices (American Male/Female, British Male/Female, Random).
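Since cloning depends on a reference clip of 5 to 30 seconds, it can help to check a recording's length before using it. The following is a minimal sketch, not part of Zonos itself: it assumes the clip is an uncompressed PCM WAV file and uses only Python's standard wave module. The function names and the two constants are hypothetical, chosen for illustration.

```python
import wave

# Bounds taken from the 5 to 30 second range quoted above (illustrative constants).
MIN_SECONDS = 5.0
MAX_SECONDS = 30.0


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration is the number of audio frames divided by frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def is_usable_reference(path):
    """Return True if the clip length falls within the 5 to 30 second window."""
    return MIN_SECONDS <= clip_duration_seconds(path) <= MAX_SECONDS
```

Calling is_usable_reference("my_voice.wav") on your own recording returns False for clips that are too short or too long. Compressed formats such as MP3 would need a different reader.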

The model outputs speech natively at 44 kHz. It was trained on about 200k hours of speech, most of it English, and it offers zero-shot text to speech, meaning it can speak in a new voice from a short reference clip without fine-tuning. It supports English, Japanese, Chinese, French, and German.

Today, we’re excited to announce a beta release of Zonos, a highly expressive TTS model with high fidelity voice cloning.
We release both transformer and SSM-hybrid models under an Apache 2.0 license.
Zonos performs well vs leading TTS providers in quality and expressiveness. pic.twitter.com/jaliZNJecm
— Zyphra (@ZyphraAI) February 10, 2025
