Speechless Llama3.2 V0.1

Developed by homebrewltd
Speechless is a compact open-source text-to-semantic model (1 billion parameters) that converts text directly into discrete semantic tokens representing audio, without relying on a traditional text-to-speech (TTS) model.
Downloads 28
Release Time: 12/28/2024

Model Overview

By converting text directly into semantic speech tokens, the model simplifies training, saves resources, and scales more easily, which makes it especially suitable for low-resource languages.
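As a rough illustration of what "semantic speech tokens" could look like downstream, the sketch below pulls integer token IDs out of a generated string. The `<|sound_NNNN|>` format, the `<|sound_start|>` and `<|sound_end|>` markers, and the helper name `extract_semantic_ids` are all hypothetical and are not taken from this page; the model's actual vocabulary may differ.

```python
import re
from typing import List

# Hypothetical token format: the real model vocabulary may differ.
SOUND_TOKEN = re.compile(r"<\|sound_(\d+)\|>")


def extract_semantic_ids(generated: str) -> List[int]:
    """Return the integer IDs of all numeric sound tokens in the text.

    Non-numeric markers such as a start or end token are ignored.
    """
    return [int(match) for match in SOUND_TOKEN.findall(generated)]


if __name__ == "__main__":
    # Made-up sample output, for illustration only.
    sample = "<|sound_start|><|sound_0012|><|sound_0456|><|sound_end|>"
    print(extract_semantic_ids(sample))
```

A list of integer IDs like this is the kind of discrete representation that a separate decoder or analysis step could consume.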

Model Features

Direct Text-to-Semantic Conversion
Converts text directly into discrete semantic speech tokens without relying on a traditional text-to-speech (TTS) model.
Resource Efficient
Simplifies training and saves resources, which makes it especially suitable for low-resource languages.
Multilingual Support
Supports English and Vietnamese, trained on over 400 hours of English and 1000 hours of Vietnamese data.

Model Capabilities

Text-to-Semantic Token Generation
Multilingual Processing
Efficient Resource Utilization

Use Cases

Speech Processing
Speech Token Generation
Converts text directly into discrete semantic speech tokens for subsequent processing or analysis.
Word error rate is 3.99 on the Vietnamese test set and 3.27 on the English test set.
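For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The sketch below is a minimal, generic implementation that returns a percentage. It assumes the figures above are percentages, which this page does not state, and it is not the evaluation script used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a six-word reference.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Real evaluations usually normalize case and punctuation before scoring, so scores from this sketch are only comparable when both transcripts are normalized the same way.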