
Granite Speech 3.2 8b

Developed by ibm-granite
Granite-speech-3.2-8b is a compact and efficient speech language model specifically designed for automatic speech recognition (ASR) and automatic speech translation (AST).
Downloads 3,335
Release date: 3/26/2025

Model Overview

This model uses a two-stage design. The first call transcribes an audio file into text. Any further processing of the transcript requires a separate, explicit call to the underlying Granite language model. The model is suited to enterprise applications that process speech input.
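The two-stage flow described above can be sketched as follows. This is a minimal illustration and not the model's actual API: `run_two_stage`, `PipelineResult`, and the `transcribe` and `generate` callables are hypothetical names standing in for whatever inference code wraps the speech model and the underlying Granite language model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PipelineResult:
    """Holds the transcript and, if requested, the follow-up output."""
    transcript: str
    followup: Optional[str] = None


def run_two_stage(
    audio_path: str,
    transcribe: Callable[[str], str],
    generate: Optional[Callable[[str], str]] = None,
    instruction: Optional[str] = None,
) -> PipelineResult:
    """Hypothetical wrapper around the two calls.

    `transcribe` and `generate` are assumed callables supplied by the caller;
    they are placeholders, not functions from a real library.
    """
    # Stage 1: speech to text. This call always runs.
    transcript = transcribe(audio_path)

    # Stage 2 runs only when the caller explicitly asks for it.
    if generate is None or instruction is None:
        return PipelineResult(transcript=transcript)

    # The language model sees only text: the instruction plus the transcript.
    prompt = f"{instruction}\n\n{transcript}"
    return PipelineResult(transcript=transcript, followup=generate(prompt))
```

Because the second stage is opt-in, a caller that only needs a transcript never invokes the language model, and the transcript can be inspected or filtered before it is passed on.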

Model Features

Two-stage design
The first call transcribes audio into text. Further processing requires an explicit second call to the underlying language model. This separation improves modularity and security.
Modality alignment technology
Trained on corpora that pair audio inputs with text targets, aligning the speech modality with the underlying language model.
Efficient architecture
Combines Conformer blocks, windowed query transformers, and LoRA adapters for efficient speech processing.

Model Capabilities

English speech-to-text
English-to-other-language speech translation
Automatic speech recognition
Automatic speech translation

Use Cases

Speech processing
Enterprise-grade speech transcription
Transcribes English speech content such as meeting recordings and customer service calls into text.
High-accuracy English speech-to-text
Cross-language speech translation
Translates English speech into French, Spanish, Italian, German, Portuguese, Japanese, or Chinese.
Supports speech translation in multiple languages
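A caller might guard translation requests against the target languages listed above. The sketch below assumes a hypothetical helper, `build_translation_instruction`, and a hypothetical instruction wording; neither comes from the model's documentation. Only the list of languages is taken from this page.

```python
# Target languages named in the use case above.
SUPPORTED_TARGETS = (
    "French", "Spanish", "Italian", "German",
    "Portuguese", "Japanese", "Chinese",
)


def build_translation_instruction(target_language: str) -> str:
    """Return an instruction string for a supported target language.

    The wording of the returned instruction is an assumption made for
    illustration, not the model's documented prompt format.
    """
    # Normalize input such as " german " to "German".
    normalized = target_language.strip().title()
    if normalized not in SUPPORTED_TARGETS:
        raise ValueError(
            f"Unsupported target language: {target_language!r}. "
            f"Expected one of: {', '.join(SUPPORTED_TARGETS)}"
        )
    return f"Translate the speech into {normalized}."
```

Rejecting unsupported languages before inference avoids sending the model a request it was not trained to handle.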