ast-finetuned-speech-commands-v2: Open-Source Audio Model for Accurate Audio Classification

AST Finetuned Speech Commands V2

Developed by MIT

An Audio Spectrogram Transformer (AST) model fine-tuned on the Speech Commands v2 dataset for audio classification, achieving 98.12% accuracy.

Audio Classification

Transformers

Open Source License: BSD-3-Clause #High-precision audio classification #Voice command recognition #Spectrogram transformer

Downloads 10.94k

Release Time: 11/14/2022

Model Overview

This model converts audio into spectrograms and applies a vision transformer architecture to them. This checkpoint is fine-tuned specifically for voice command classification.

Model Features

High accuracy

Achieves 98.12% classification accuracy on the Speech Commands v2 dataset

Spectrogram conversion

Converts audio signals into spectrograms, then applies a vision transformer to the resulting image-like input
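
To make the spectrogram-then-transformer idea concrete, here is a minimal sketch using NumPy only. It is a simplified illustration and not the model's actual preprocessing: it assumes a plain Hann-windowed short-time Fourier transform and non-overlapping 16 x 16 patches, and the names log_spectrogram and to_patches are hypothetical helpers introduced for this example.

```python
import numpy as np

def log_spectrogram(waveform, frame_len=400, hop=160):
    """Log-magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(waveform) - frame_len) // hop
    frames = np.stack(
        [waveform[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)
    return np.log(magnitude + 1e-6)  # small offset avoids log(0)

def to_patches(spec, patch=16, stride=16):
    """Cut a 2-D spectrogram into flattened patch x patch tiles (one token each)."""
    rows = range(0, spec.shape[0] - patch + 1, stride)
    cols = range(0, spec.shape[1] - patch + 1, stride)
    return np.stack(
        [spec[r : r + patch, c : c + patch].ravel() for r in rows for c in cols]
    )

# Synthetic one-second 440 Hz tone at 16 kHz stands in for a real recording.
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
waveform = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(waveform)
patches = to_patches(spec)
print(spec.shape, patches.shape)
```

Each row of patches is what a vision transformer would treat as one input token after a linear projection. The time and frequency axes of the audio take the place of the height and width of an image.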

End-to-end learning

Learns features directly from spectrogram patches, with no hand-crafted features beyond the spectrogram itself

Model Capabilities

Voice command recognition

Audio classification

Short audio processing
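
As a rough sketch of how a short clip might be passed to the model, the example below pads or trims a waveform to one second at 16 kHz (the clip format used by Speech Commands) and hands it to the Hugging Face transformers audio-classification pipeline. The model identifier MIT/ast-finetuned-speech-commands-v2 is assumed to be the hub name of this checkpoint, and fit_to_clip and classify are hypothetical helper names. Calling classify downloads the model on first use.

```python
import numpy as np

SAMPLE_RATE = 16_000        # Speech Commands clips are sampled at 16 kHz
CLIP_SAMPLES = SAMPLE_RATE  # one second per clip

def fit_to_clip(waveform, length=CLIP_SAMPLES):
    """Zero-pad or truncate a mono waveform to exactly `length` samples."""
    waveform = np.asarray(waveform, dtype=np.float32)
    if len(waveform) >= length:
        return waveform[:length]
    return np.pad(waveform, (0, length - len(waveform)))

def classify(waveform, top_k=3):
    """Return the top_k predictions for a 16 kHz mono waveform."""
    # Imported lazily because transformers is a heavy optional dependency.
    from transformers import pipeline

    classifier = pipeline(
        "audio-classification",
        model="MIT/ast-finetuned-speech-commands-v2",  # assumed hub identifier
    )
    # The pipeline accepts a raw float array already at the model's sampling rate.
    return classifier(fit_to_clip(waveform), top_k=top_k)
```

The result is a list of dictionaries with "label" and "score" keys, ordered from most to least likely.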

Use Cases

Smart home control

Voice-controlled devices

Recognizes user voice commands to control smart home devices

High-accuracy recognition of common control commands
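
One way such a controller could act on the classifier's output is sketched below. The action names in ACTIONS and the 0.8 confidence threshold are hypothetical values chosen for illustration. The input is assumed to be a list of {"label", "score"} dictionaries such as the transformers audio-classification pipeline returns.

```python
# Hypothetical mapping from recognized command words to device actions.
ACTIONS = {
    "on": "lights.turn_on",
    "off": "lights.turn_off",
    "up": "thermostat.raise",
    "down": "thermostat.lower",
    "stop": "blinds.halt",
}

def choose_action(predictions, min_score=0.8):
    """Return the action for the top prediction, or None if unsure or unmapped."""
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    if best["score"] < min_score:
        return None  # too uncertain to act on
    return ACTIONS.get(best["label"])  # None for words with no mapped action

example = [{"label": "on", "score": 0.95}, {"label": "off", "score": 0.03}]
print(choose_action(example))
```

Returning None for low-confidence or unmapped predictions lets the caller ignore background speech instead of triggering a device by mistake.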

Accessibility applications

Voice assistance tools

Provides voice control interfaces for users with mobility impairments

Property	Details
Model Type	Audio Spectrogram Transformer (fine-tuned on Speech Commands v2)
Training Data	Speech Commands v2
License	BSD 3-Clause
Results	Task: Audio Classification; Dataset: Speech Commands v2; Metric: Accuracy; Value: 98.12%
