Ast Finetuned Audioset 10 10 0.448 V2
An audio spectrogram transformer fine-tuned on the AudioSet dataset, which converts audio into spectrograms and processes them using a vision transformer, excelling in audio classification tasks.
Downloads 2,072
Release Time: 11/14/2022
Model Overview
This audio classification model is based on the Vision Transformer (ViT) architecture. It converts an audio signal into a spectrogram, treats the spectrogram as an image, and classifies it with a vision transformer, which makes it suitable for a range of audio classification tasks.
Model Features
Spectrogram Conversion Processing
The audio signal is converted into a spectrogram, which is split into patches and fed to a vision transformer, so the model learns audio features directly from the time-frequency representation.
AudioSet Fine-tuning
Fine-tuning on AudioSet, a large-scale audio dataset, gives the model robust audio classification capabilities.
SOTA Performance
The AST architecture achieved state-of-the-art results on several audio classification benchmarks at the time of its publication, including AudioSet, ESC-50, and Speech Commands V2.
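To make the spectrogram-to-transformer step under Spectrogram Conversion Processing more concrete, the following is a minimal sketch of how a spectrogram can be cut into overlapping square patches before they reach the transformer. The specific numbers are assumptions, not values stated on this page: 128 mel bins, 1024 time frames, 16x16 patches, and a stride of 10 along both axes (which the "10 10" in the model name appears to refer to). The helper names patch_grid and extract_patches are hypothetical.

```python
from typing import List, Sequence, Tuple


def patch_grid(n_mels: int, n_frames: int, patch: int = 16,
               f_stride: int = 10, t_stride: int = 10) -> Tuple[int, int]:
    """Return how many patches fit along the (frequency, time) axes."""
    if n_mels < patch or n_frames < patch:
        raise ValueError("spectrogram is smaller than a single patch")
    n_freq = (n_mels - patch) // f_stride + 1
    n_time = (n_frames - patch) // t_stride + 1
    return n_freq, n_time


def extract_patches(spec: Sequence[Sequence[float]], patch: int = 16,
                    f_stride: int = 10, t_stride: int = 10) -> List[List[float]]:
    """Flatten each patch of spec[frequency][time] into one vector.

    Patches overlap whenever the stride is smaller than the patch size.
    """
    n_freq, n_time = patch_grid(len(spec), len(spec[0]), patch, f_stride, t_stride)
    patches = []
    for fi in range(n_freq):
        for ti in range(n_time):
            f0, t0 = fi * f_stride, ti * t_stride
            patches.append([spec[f][t]
                            for f in range(f0, f0 + patch)
                            for t in range(t0, t0 + patch)])
    return patches


# Assumed input size: 128 mel bins x 1024 frames.
print(patch_grid(128, 1024))  # (12, 101), i.e. 1212 patches in total
```

Under these assumptions each patch becomes one token of the transformer input sequence, so the stride directly controls sequence length and compute cost.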
Model Capabilities
Audio Classification
Spectrogram Analysis
Audio Feature Extraction
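The capabilities above can be exercised with a short inference script. This is a hedged sketch, not an official usage guide: it assumes the checkpoint is published on the Hugging Face Hub as MIT/ast-finetuned-audioset-10-10-0.448-v2, that the torch and transformers packages are installed (the feature extractor may also need torchaudio), and that the waveform is mono audio at 16 kHz. Because AudioSet clips can carry several labels at once, the sketch applies a sigmoid to each logit instead of a softmax. The helper names top_k and classify are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Assumed Hub identifier for this checkpoint.
MODEL_ID = "MIT/ast-finetuned-audioset-10-10-0.448-v2"


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def top_k(logits: Sequence[float], id2label: Dict[int, str],
          k: int = 5) -> List[Tuple[str, float]]:
    """Return the k labels with the highest per-label sigmoid score."""
    scored = [(id2label[i], sigmoid(v)) for i, v in enumerate(logits)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def classify(waveform: Sequence[float], sampling_rate: int = 16000,
             k: int = 5) -> List[Tuple[str, float]]:
    """Classify a mono waveform; downloads the checkpoint on first use."""
    # Imported lazily so the pure-Python helpers work without these packages.
    import torch
    from transformers import ASTFeatureExtractor, ASTForAudioClassification

    extractor = ASTFeatureExtractor.from_pretrained(MODEL_ID)
    model = ASTForAudioClassification.from_pretrained(MODEL_ID)
    model.eval()

    # The extractor turns the raw waveform into a log-mel spectrogram tensor.
    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_k(logits, model.config.id2label, k)
```

Calling classify(waveform) with a list or array of float samples returns (label, score) pairs sorted by score. Audio recorded at another sample rate would need to be resampled to 16 kHz first.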
Use Cases
Audio Content Analysis
Environmental Sound Classification
Identifies and classifies various types of environmental sounds, such as animal calls, vehicle noises, etc.
High-accuracy sound category recognition
Music Classification
Classifies music clips by genre, instruments, etc.
Multimedia Content Moderation
Inappropriate Audio Detection
Identifies potentially inappropriate or sensitive content in audio.
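One simple way to turn classifier output into a moderation signal is to compare the scores of selected labels against per-label thresholds. The sketch below assumes per-label scores in the range 0 to 1 (for example, sigmoid outputs of the classifier). The watchlist entries and thresholds are purely illustrative assumptions; the label names should be checked against the model's actual label set, and thresholds would need tuning on real data.

```python
from typing import Dict, List, Tuple

# Hypothetical watchlist: label names and thresholds are illustrative only.
WATCHLIST: Dict[str, float] = {
    "Gunshot, gunfire": 0.30,
    "Screaming": 0.40,
    "Explosion": 0.30,
}


def flag_clip(scores: Dict[str, float],
              watchlist: Dict[str, float] = WATCHLIST) -> List[Tuple[str, float]]:
    """Return watchlisted labels whose score meets or exceeds its threshold."""
    hits = []
    for label, threshold in watchlist.items():
        score = scores.get(label, 0.0)  # labels missing from scores count as 0
        if score >= threshold:
            hits.append((label, score))
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits


example = {"Speech": 0.82, "Screaming": 0.55, "Explosion": 0.10}
print(flag_clip(example))  # [('Screaming', 0.55)]
```

A clip with a non-empty result would typically be routed to human review instead of being blocked automatically, since a sound category alone says little about context.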