CSM 1B Safetensors Quants

Developed by lunahr
CSM (Conversational Speech Model) is a 1-billion-parameter speech generation model developed by Sesame that generates RVQ audio codes from text and audio inputs.
Downloads: 37
Release date: 3/15/2025

Model Overview

CSM is a speech generation model built on a Llama backbone and a lightweight audio decoder. It supports text-to-speech and outputs Mimi audio codes.
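
The model emits RVQ (residual vector quantization) codes rather than a raw waveform. The following is a minimal sketch of the general RVQ idea, not the Mimi codec itself: each stage picks the nearest entry in its codebook and passes the leftover residual to the next stage. The two tiny codebooks and the function names are made up for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]

def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Return the index of the codebook entry closest to x (squared L2 distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)),
    )

def rvq_encode(x: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize x stage by stage; each stage encodes what the previous ones missed."""
    residual = list(x)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes

def rvq_decode(codes: Sequence[int], codebooks: Sequence[Sequence[Vector]]) -> List[float]:
    """Reconstruct an approximation by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out

# Toy codebooks (illustrative values only): a coarse stage and a fine stage.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)],
    [(0.0, 0.0), (0.25, 0.0), (0.0, -0.25)],
]

codes = rvq_encode((1.2, 1.0), CODEBOOKS)
approx = rvq_decode(codes, CODEBOOKS)
print(codes, approx)
```

Here one input vector becomes one small integer per stage, which is the kind of discrete token sequence a language-model backbone can predict.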

Model Features

Multi-speaker Support
Selects the speaker voice via the speaker parameter
Context-aware Generation
Improves generation quality when prior audio segments are supplied as context
Safetensors Format
Provides the weights in multiple Safetensors variants, with download statistics tracked
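
For readers unfamiliar with the Safetensors container, here is a small sketch of how its header can be inspected using only the Python standard library. It relies on the published file layout: an 8-byte little-endian header length, then a JSON header, then raw tensor bytes. The demo file, the tensor name `demo.weight`, and the helper names are invented for this example and have nothing to do with the actual model weights.

```python
import json
import os
import struct
import tempfile
from typing import Any, Dict

def read_safetensors_header(path: str) -> Dict[str, Any]:
    """Read only the JSON header of a Safetensors file, without loading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # unsigned 64-bit, little-endian
        return json.loads(f.read(header_len).decode("utf-8"))

def write_demo_file(path: str) -> None:
    """Write a tiny stand-in file holding one float16 tensor of shape [2] (4 bytes)."""
    header = {"demo.weight": {"dtype": "F16", "shape": [2], "data_offsets": [0, 4]}}
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(b"\x00" * 4)  # two float16 zeros

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    write_demo_file(demo_path)
    header = read_safetensors_header(demo_path)

for name, info in header.items():
    if name != "__metadata__":  # optional free-form metadata entry
        print(name, info["dtype"], info["shape"])
```

Run against a downloaded variant, the same reader would list each tensor's dtype, which is one way to tell the variants apart before loading anything into memory.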

Model Capabilities

Text-to-speech
Multi-speaker Speech Generation
Context-aware Speech Synthesis
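
As a rough illustration of how the speaker parameter and contextual segments might fit together, the sketch below assembles a generation request from a conversation history. Every class and function name here is hypothetical and is not the model's actual API. It only shows the shape of the data: a target text, a speaker ID, and a short list of prior utterances used as conditioning.

```python
from dataclasses import dataclass, field
from typing import List, Sequence

@dataclass
class Segment:
    """One prior utterance: who spoke, what was said, and its audio samples."""
    speaker: int
    text: str
    audio: List[float] = field(default_factory=list)

@dataclass
class GenerationRequest:
    """Hypothetical container for everything a single generation call would need."""
    text: str
    speaker: int
    context: List[Segment] = field(default_factory=list)

def build_request(
    text: str, speaker: int, history: Sequence[Segment], max_context: int = 2
) -> GenerationRequest:
    """Keep only the most recent max_context segments as conditioning context."""
    recent = list(history[-max_context:]) if max_context > 0 else []
    return GenerationRequest(text=text, speaker=speaker, context=recent)

history = [
    Segment(0, "Hi, how are you?"),
    Segment(1, "Pretty good, thanks."),
    Segment(0, "What are you up to today?"),
]
request = build_request("Recording a podcast.", speaker=1, history=history)
print(request.speaker, [seg.text for seg in request.context])
```

Capping the context length is a design choice in this sketch: recent turns usually carry the most relevant tone and pacing, and a shorter context keeps the input small.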

Use Cases

Voice Interaction
Dialogue System Voice Output
Can be combined with an LLM to build a complete dialogue system
An interactive voice demo has been showcased on the Sesame blog
Content Creation
Audio Content Generation
Automatically generates voice content such as podcasts and audiobooks