AST Finetuned Speech Commands V2

Developed by MIT
An Audio Spectrogram Transformer (AST) model fine-tuned on the Speech Commands v2 dataset for audio classification, achieving 98.12% accuracy.
Downloads 10.94k
Release Time: 11/14/2022

Model Overview

This model converts audio into a spectrogram, treats that spectrogram as an image, and applies a vision transformer architecture to it. This version is fine-tuned for voice command classification.
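To make the spectrogram-as-image idea concrete, the following minimal sketch counts how many patches a vision transformer would receive for a spectrogram of a given size. The 16x16 patch size, the stride of 10, and the 128x128 input shape are assumptions based on the commonly described AST setup; none of them are stated on this page, and the function names are hypothetical.

```python
def patches_along(size: int, patch: int = 16, stride: int = 10) -> int:
    """Number of patch positions along one axis, without padding."""
    if size < patch:
        return 0
    return (size - patch) // stride + 1


def count_patches(n_mels: int, n_frames: int, patch: int = 16, stride: int = 10) -> int:
    """Total patches for a spectrogram with n_mels frequency bins and n_frames time frames."""
    return patches_along(n_mels, patch, stride) * patches_along(n_frames, patch, stride)


if __name__ == "__main__":
    # Assumed input: 128 mel bins by 128 time frames (roughly one second of audio, padded).
    print(count_patches(128, 128))
```

Under these assumptions a 128x128 spectrogram yields 12 positions per axis, so the transformer sees a sequence of 144 patches.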

Model Features

High accuracy
Achieves 98.12% classification accuracy on the Speech Commands v2 dataset
Spectrogram conversion
Converts audio signals into spectrograms, then applies a vision transformer to them
End-to-end learning
Learns its audio representation directly from spectrogram patches, with no hand-designed features beyond the spectrogram itself

Model Capabilities

Voice command recognition
Audio classification
Short audio processing
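As a rough illustration of these capabilities, the sketch below classifies a short audio clip with the Hugging Face `transformers` audio-classification pipeline. The Hub identifier `MIT/ast-finetuned-speech-commands-v2` is an assumption inferred from the model name and developer shown above, and `top_prediction` and `classify_file` are hypothetical helper names. Running `classify_file` requires `transformers`, a backend such as PyTorch, and an audio decoder to be installed.

```python
from typing import Dict, List

# Assumed Hub identifier; not stated on this page.
MODEL_ID = "MIT/ast-finetuned-speech-commands-v2"


def top_prediction(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring entry from a list of {'label', 'score'} dicts."""
    if not predictions:
        raise ValueError("no predictions to choose from")
    return max(predictions, key=lambda p: p["score"])


def classify_file(path: str) -> Dict:
    """Classify one audio file and return its best label and score."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=MODEL_ID)
    return top_prediction(classifier(path))
```

A call such as `classify_file("command.wav")` (a hypothetical file name) would return a dictionary with the predicted command and its score.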

Use Cases

Smart home control
Voice-controlled devices
Recognizes user voice commands to control smart home devices
High-accuracy recognition of common control commands
Accessibility applications
Voice assistance tools
Provides voice control interfaces for users with mobility impairments
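For use cases like these, a recognized command still has to be mapped to a device action. The sketch below shows one possible approach: ignore low-confidence predictions and look up the rest in a table. The action names, the 0.8 threshold, and the `command_to_action` function are all hypothetical; the labels are words assumed to be in the Speech Commands v2 vocabulary.

```python
from typing import Optional

# Hypothetical mapping from recognized command words to device actions.
ACTIONS = {
    "on": "lights_on",
    "off": "lights_off",
    "up": "volume_up",
    "down": "volume_down",
    "stop": "stop_all",
}


def command_to_action(label: str, score: float, threshold: float = 0.8) -> Optional[str]:
    """Map a predicted label to an action, or None if unsure or unmapped."""
    if score < threshold:
        # Too uncertain to act on; a real system might ask the user to repeat.
        return None
    return ACTIONS.get(label.lower())
```

Returning `None` for uncertain or unmapped commands keeps the device from acting on a misheard word, which matters most in accessibility settings where an unintended action is costly to undo.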