Ast Finetuned Audioset 10 10 0.4593 Finetuning ESC 50 Slower LR

Developed by xpariz10
An audio classification model based on the AST architecture, pre-trained on the AudioSet dataset and fine-tuned on the ESC-50 dataset
Downloads 22
Release Time: 12/10/2022

Model Overview

This model is an audio classification model using the AST (Audio Spectrogram Transformer) architecture. It was first pre-trained on the AudioSet dataset and then fine-tuned on the ESC-50 environmental sound classification dataset.

Model Features

Transformer-based Audio Processing
Uses the AST architecture, which applies a Transformer to patches of the audio spectrogram
Two-stage Training
Pre-trained on the large-scale AudioSet dataset, then fine-tuned on the ESC-50 dataset
High Accuracy
Achieves 89.29% accuracy on the evaluation set
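
To make the spectrogram-to-Transformer step more concrete, the sketch below counts how many patches (tokens) an AST-style model would see for one clip. The numbers are assumptions, not values stated on this page: 128 mel bins, 1024 time frames, 16x16 patches, and a stride of 10 on both axes (which is how the "10 10" in the model name is usually read). The fine-tuned model may use a different input length.

```python
def num_patches(n_mels=128, n_frames=1024, patch=16, fstride=10, tstride=10):
    """Count overlapping patches cut from an (n_mels x n_frames) spectrogram.

    All default values are assumptions for illustration (typical AST
    AudioSet settings), not values confirmed for this fine-tuned model.
    """
    # Number of patch positions along the frequency axis.
    f_dim = (n_mels - patch) // fstride + 1
    # Number of patch positions along the time axis.
    t_dim = (n_frames - patch) // tstride + 1
    return f_dim, t_dim, f_dim * t_dim


print(num_patches())  # -> (12, 101, 1212)
```

Under these assumptions each clip becomes a sequence of 1212 patch tokens, which the Transformer encoder then processes much like image patches in a vision Transformer.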

Model Capabilities

Audio Classification
Environmental Sound Recognition
Sound Event Detection
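
As a rough illustration of how the classification capability might be used, here is a minimal sketch built on the Hugging Face transformers library. The repository ID is an assumption inferred from the developer and model name on this page, and the helper names (softmax, top_k, classify_file) are hypothetical. The sketch also assumes the repository ships a feature-extractor configuration and that torch and librosa are installed.

```python
import math
from typing import Dict, List, Tuple

# Assumed repository ID, inferred from the developer and model name; verify before use.
MODEL_ID = "xpariz10/ast-finetuned-audioset-10-10-0.4593-finetuning-ESC-50-slower-LR"


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: List[float], id2label: Dict[int, str], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k most probable (label, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)[:k]
    return [(id2label[i], p) for i, p in ranked]


def classify_file(path: str, k: int = 3) -> List[Tuple[str, float]]:
    """Classify one audio file; downloads the model weights on first call."""
    # Heavy dependencies are imported lazily so the helpers above work without them.
    import librosa
    import torch
    from transformers import ASTFeatureExtractor, ASTForAudioClassification

    extractor = ASTFeatureExtractor.from_pretrained(MODEL_ID)
    model = ASTForAudioClassification.from_pretrained(MODEL_ID)
    model.eval()

    # Resample to the rate the feature extractor expects and mix down to mono.
    waveform, _ = librosa.load(path, sr=extractor.sampling_rate, mono=True)
    inputs = extractor(waveform, sampling_rate=extractor.sampling_rate, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_k(logits, model.config.id2label, k)
```

A call such as classify_file("clip.wav") would return the top three predicted labels with their probabilities, where "clip.wav" is a placeholder path. Note that this assigns one label set per clip; it does not localize events in time.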

Use Cases

Smart Home
Appliance Sound Recognition
Identify sounds from different household appliances
Environmental Monitoring
Natural Environment Sound Classification
Identify sounds in different environments, such as forests and cities