CosyVoice2 0.5B

Developed by FunAudioLLM
CosyVoice is a multilingual text-to-speech (TTS) model that also supports voice conversion, providing high-quality speech synthesis.
Downloads 4,573
Release Time: 12/20/2024

Model Overview

CosyVoice is an advanced text-to-speech model that supports zero-shot speech synthesis, cross-lingual speech synthesis, and voice conversion. It can generate natural and fluent speech from text input and supports multiple languages and speech styles.

Model Features

Multilingual Support
Supports speech synthesis in multiple languages including Chinese, English, Japanese, Cantonese, and Korean.
Zero-shot Speech Synthesis
Synthesizes speech in a target speaker's voice and style from a short reference sample, without training on data from that speaker.
Cross-lingual Speech Synthesis
Uses speech samples from one language to synthesize speech in another language.
Voice Conversion
Converts source speech into the voice and style of a target speaker.
Streaming Inference
Supports real-time streaming speech generation without quality degradation.
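
As a rough illustration of what streaming inference means for a caller, the sketch below consumes audio chunk by chunk instead of waiting for the whole utterance. It does not use the CosyVoice API: `fake_stream_synthesize` is a hypothetical stand-in that yields silent chunks, and the sample rate and chunk size are illustrative assumptions.

```python
from typing import Dict, Iterator, List

SAMPLE_RATE = 24_000  # assumed output rate, for illustration only


def fake_stream_synthesize(text: str, chunk_samples: int = 4_800) -> Iterator[List[float]]:
    """Hypothetical stand-in for a streaming TTS call: yields one silent chunk per word."""
    for _ in text.split():
        yield [0.0] * chunk_samples


def play_streaming(chunks: Iterator[List[float]]) -> Dict[str, float]:
    """Consume chunks as they arrive; a real player would pass each one to the audio device."""
    n_chunks = 0
    total_samples = 0
    first_chunk_samples = 0
    for chunk in chunks:
        if n_chunks == 0:
            # Audio is already available here, before synthesis has finished.
            first_chunk_samples = len(chunk)
        n_chunks += 1
        total_samples += len(chunk)
    return {
        "chunks": n_chunks,
        "first_chunk_seconds": first_chunk_samples / SAMPLE_RATE,
        "total_seconds": total_samples / SAMPLE_RATE,
    }


stats = play_streaming(fake_stream_synthesize("hello from a streaming voice"))
print(stats)
```

In this toy run, playback could begin after the first chunk rather than after the full utterance, which is the practical benefit of a streaming interface.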

Model Capabilities

Text-to-Speech
Voice Style Conversion
Multilingual Speech Synthesis
Zero-shot Speech Synthesis
Cross-lingual Speech Synthesis
Streaming Speech Generation
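
The synthesis modes listed above differ mainly in what the caller has to supply. The following sketch records that in a small lookup table. The mode names and input names (`text`, `reference_audio`, `source_audio`) are hypothetical labels chosen for this example, not the model's actual API, and a real implementation may require more inputs than shown.

```python
from typing import Dict, FrozenSet, Set

# Assumed minimum inputs per mode, inferred from the feature descriptions.
REQUIRED_INPUTS: Dict[str, FrozenSet[str]] = {
    "zero_shot": frozenset({"text", "reference_audio"}),
    "cross_lingual": frozenset({"text", "reference_audio"}),
    "voice_conversion": frozenset({"source_audio", "reference_audio"}),
}


def missing_inputs(mode: str, provided: Set[str]) -> Set[str]:
    """Return the inputs a request still needs for the given (hypothetical) mode."""
    if mode not in REQUIRED_INPUTS:
        raise ValueError(f"unknown mode: {mode}")
    return set(REQUIRED_INPUTS[mode]) - provided


# Voice conversion works on speech, so supplying text alone is not enough.
print(sorted(missing_inputs("voice_conversion", {"text", "reference_audio"})))
```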

Use Cases

Voice Assistants
Multilingual Voice Assistant
Provides natural and fluent multilingual speech output for voice assistants.
Outcome: High-quality speech synthesis results
Audio Content Creation
Audiobook Production
Quickly converts text content into natural speech.
Outcome: Efficient content production workflow
Game Development
Game Character Voices
Generates diverse voices for game characters.
Outcome: Rich character voice expressions