MOSS-TTSD v0.5

Developed by fnlp
MOSS-TTSD is an open-source bilingual spoken dialogue synthesis model. It supports Chinese and English and converts dialogue scripts into natural, expressive conversational speech.
Downloads 182
Release Time: 7/4/2025

Model Overview

MOSS-TTSD is a Text-to-Speech Dialogue (TTSD) model designed to generate natural conversational speech between two speakers. It is suited to scenarios such as AI podcast production.

Model Features

Highly expressive dialogue speech
Trained on millions of hours of TTS data and 400,000 hours of synthetic and real dialogue speech, it generates human-like conversational speech with natural dialogue rhythm.
Dual-speaker voice cloning
Supports zero-shot voice cloning for two speakers and switches speakers accurately according to the dialogue script.
Chinese-English bilingual support
Generates expressive speech in both Chinese and English.
Long-form speech generation
Generates up to 960 seconds (16 minutes) of speech in a single session.
Fully open source, commercial use permitted
Licensed under Apache-2.0, which permits free commercial use.
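
The 960-second single-session limit matters when preparing long scripts. The following is a minimal sketch, not part of MOSS-TTSD itself, of how one might estimate whether a two-speaker script fits within that limit and split it into sessions if it does not. The Turn class, the "S1"/"S2" speaker labels, and the 150 words-per-minute speaking rate are all illustrative assumptions. The whitespace word count only suits English text, so Chinese scripts would need a character-based estimate instead.

```python
from dataclasses import dataclass

MAX_SECONDS = 960  # single-session limit stated in the model card

# Assumed average speaking rate (hypothetical); tune for your content.
WORDS_PER_MINUTE = 150


@dataclass
class Turn:
    speaker: str  # hypothetical speaker label, e.g. "S1" or "S2"
    text: str


def estimate_seconds(turns, words_per_minute=WORDS_PER_MINUTE):
    """Rough duration estimate from a whitespace-separated word count."""
    words = sum(len(t.text.split()) for t in turns)
    return words * 60.0 / words_per_minute


def split_into_sessions(turns, max_seconds=MAX_SECONDS,
                        words_per_minute=WORDS_PER_MINUTE):
    """Greedily group whole turns so each group's estimate stays within
    max_seconds. A single turn longer than the limit is kept alone in its
    own group and would still need manual trimming."""
    sessions, current, current_secs = [], [], 0.0
    for turn in turns:
        secs = estimate_seconds([turn], words_per_minute)
        if current and current_secs + secs > max_seconds:
            sessions.append(current)
            current, current_secs = [], 0.0
        current.append(turn)
        current_secs += secs
    if current:
        sessions.append(current)
    return sessions
```

Splitting on turn boundaries keeps each speaker's line intact. The estimate is deliberately rough, so leaving some headroom below 960 seconds is advisable.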

Model Capabilities

Text-to-speech
Dialogue speech synthesis
Bilingual speech generation
Voice cloning
Long-form speech generation

Use Cases

Content creation
AI podcast production
Automatically convert dialogue scripts into natural, fluent podcast audio
Generate expressive dialogue speech to improve the listening experience
Voice interaction
Virtual assistant dialogue
Generate more natural conversational speech for virtual assistants
Make human-computer interaction feel more natural and friendly