Ast Finetuned Audioset 10 10 0.4593 Finetuning ESC 50 Slower LR
An audio classification model based on the AST architecture, pre-trained on the AudioSet dataset and fine-tuned on the ESC-50 dataset.
Downloads 22
Release Time: 12/10/2022
Model Overview
This model is an audio classification model using the AST (Audio Spectrogram Transformer) architecture. It was first pre-trained on the AudioSet dataset and then fine-tuned on the ESC-50 environmental sound classification dataset.
Model Features
Transformer-based Audio Processing
Uses the AST architecture, which applies a Transformer directly to patches of an audio spectrogram
Two-stage Training
Pre-trained on the large AudioSet dataset first, then fine-tuned on the ESC-50 dataset
High Accuracy
Achieves 89.29% accuracy on the evaluation set
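To make the "Transformer-based Audio Processing" feature concrete, the sketch below counts how many overlapping patches an AST-style model cuts from a spectrogram before passing them to the Transformer. The defaults (128 mel bins, 1024 time frames, 16x16 patches, stride 10 on both axes) are assumptions taken from the commonly published AST AudioSet configuration, and reading the "10 10" in the model name as these two strides is likewise an assumption; this page does not confirm either. The function name ast_patch_grid is made up for illustration.

```python
def ast_patch_grid(n_mels=128, n_frames=1024, patch=16, f_stride=10, t_stride=10):
    """Count the patches an AST-style model extracts from a spectrogram.

    All defaults are assumptions (common AST AudioSet settings), not values
    confirmed for this particular checkpoint.
    Returns (patches along frequency, patches along time, total patches).
    """
    # Standard sliding-window count: floor((size - window) / stride) + 1
    n_freq = (n_mels - patch) // f_stride + 1
    n_time = (n_frames - patch) // t_stride + 1
    return n_freq, n_time, n_freq * n_time


print(ast_patch_grid())  # (12, 101, 1212)
```

Each patch becomes one token in the Transformer's input sequence, so a shorter stride means more overlap between patches and a longer sequence.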
Model Capabilities
Audio Classification
Environmental Sound Recognition
Sound Event Detection
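The capabilities listed above could be exercised along the following lines. This is a minimal sketch, assuming the checkpoint is published in a format that the Hugging Face transformers "audio-classification" pipeline can load. MODEL_ID is a hypothetical placeholder (this page does not give the repository ID), and top_labels and classify_file are illustrative helper names, not part of any library.

```python
from typing import Dict, List

# Hypothetical placeholder: substitute the real repository ID of the checkpoint.
MODEL_ID = "<namespace>/ast-finetuned-audioset-10-10-0.4593-finetuning-ESC-50-slower-LR"


def top_labels(predictions: List[Dict], k: int = 3, min_score: float = 0.0) -> List[str]:
    """Return up to k labels, highest score first, dropping scores below min_score.

    `predictions` is expected in the pipeline's output shape:
    a list of {"label": str, "score": float} dictionaries.
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [p["label"] for p in ranked if p["score"] >= min_score][:k]


def classify_file(path: str, model_id: str = MODEL_ID, k: int = 3) -> List[str]:
    """Classify one audio file; requires `transformers`, `torch`, and ffmpeg."""
    from transformers import pipeline  # imported lazily so top_labels works without it

    classifier = pipeline("audio-classification", model=model_id)
    return top_labels(classifier(path, top_k=k), k=k)


# Made-up scores, only to show the post-processing step.
sample = [
    {"label": "rain", "score": 0.1},
    {"label": "vacuum_cleaner", "score": 0.7},
    {"label": "dog", "score": 0.2},
]
print(top_labels(sample, k=2))  # ['vacuum_cleaner', 'dog']
```

Calling classify_file("clip.wav") with a real repository ID would download the model on first use. Note that a classification pipeline of this kind assigns labels to a whole clip rather than locating events in time.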
Use Cases
Smart Home
Appliance Sound Recognition
Identify sounds from different household appliances
Environmental Monitoring
Natural Environment Sound Classification
Identify sounds from different environments, such as forests and cities