AST Finetuned AudioSet 10 10 0.448

Developed by MIT
An Audio Spectrogram Transformer (AST) fine-tuned on the AudioSet dataset. It applies a vision transformer architecture to audio spectrograms and performs strongly on audio classification tasks.
Downloads 326
Release Time: 11/14/2022

Model Overview

This model converts audio into spectrograms and processes them with a vision transformer. It is fine-tuned on the AudioSet dataset and is suited to audio classification tasks.
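To illustrate the audio-to-spectrogram step, here is a minimal sketch using only NumPy. It computes a plain log-power spectrogram with a short-time Fourier transform. The function name `log_spectrogram`, the 16 kHz sample rate, the 25 ms window, and the 10 ms hop are assumptions chosen for illustration, and the model's own preprocessing (for example, mel filterbank features) may differ.

```python
import numpy as np

def log_spectrogram(wave, sample_rate=16000, win_ms=25.0, hop_ms=10.0):
    """Return a (frames, freq_bins) log-power spectrogram of a mono signal.

    Illustrative only: window and hop sizes are assumed values.
    """
    win = int(sample_rate * win_ms / 1000)   # samples per frame
    hop = int(sample_rate * hop_ms / 1000)   # samples between frame starts
    n_frames = 1 + (len(wave) - win) // hop
    # Build an index matrix that selects overlapping frames from the signal.
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    # Taper each frame with a Hann window to reduce spectral leakage.
    frames = wave[idx] * np.hanning(win)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Small constant avoids log(0) for silent bins.
    return np.log(power + 1e-10)

# One second of a 440 Hz tone as stand-in input.
sr = 16000
t = np.arange(sr) / sr
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t), sr)
print(spec.shape)
```

The result is a 2-D array (time frames by frequency bins) that can be treated like a single-channel image.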

Model Features

Spectrogram Conversion
Converts audio signals into spectrogram form for processing by a vision transformer.
High-performance Classification
Achieves state-of-the-art results on multiple audio classification benchmarks.
Fine-tuned on AudioSet
Fine-tuned using the large-scale AudioSet dataset to enhance model generalization.
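A vision transformer consumes its input as a sequence of small patches. The sketch below shows how a spectrogram could be cut into overlapping square patches. The helper `extract_patches`, the 128 x 1024 input size, the 16 x 16 patch size, and the stride of 10 are assumptions for illustration. The "10 10" in the model name likely denotes the patch strides along frequency and time, but that reading is not confirmed by this page.

```python
import numpy as np

def extract_patches(spec, patch=16, stride=10):
    """Cut a 2-D spectrogram into flattened, overlapping square patches.

    Patch size and stride are assumed values, not confirmed settings.
    """
    rows = range(0, spec.shape[0] - patch + 1, stride)
    cols = range(0, spec.shape[1] - patch + 1, stride)
    # Each patch is flattened to a vector; a transformer would then
    # project these vectors linearly and add position embeddings.
    return np.stack([spec[r:r + patch, c:c + patch].ravel()
                     for r in rows for c in cols])

# Random stand-in for a spectrogram with 128 frequency bins and 1024 frames.
spec = np.random.default_rng(0).standard_normal((128, 1024))
patches = extract_patches(spec)
print(patches.shape)
```

Because the stride is smaller than the patch size, neighbouring patches overlap, which increases the sequence length relative to non-overlapping tiling.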

Model Capabilities

Audio Classification
Spectrogram Analysis

Use Cases

Audio Analysis
Environmental Sound Classification
Identifies and classifies various types of environmental sounds
High-accuracy classification results
Music Classification
Classifies music clips by genre or instrument
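AudioSet clips can carry several labels at once, so classifier outputs are usually scored per class rather than with a single softmax. The following is a minimal sketch of that scoring step, assuming the model returns one logit per class. The function `top_labels`, the label names, the logit values, and the 0.5 threshold are all hypothetical.

```python
import numpy as np

def top_labels(logits, labels, k=3, threshold=0.5):
    """Return up to k (label, probability) pairs that pass the threshold."""
    # An independent sigmoid per class lets several labels be active at once.
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    order = np.argsort(probs)[::-1][:k]   # indices of the k highest scores
    return [(labels[i], float(probs[i])) for i in order if probs[i] >= threshold]

# Hypothetical labels and logits, for illustration only.
labels = ["Speech", "Dog", "Music", "Rain"]
logits = [2.0, -1.0, 0.5, -3.0]
result = top_labels(logits, labels)
print([name for name, _ in result])
```

The threshold trades precision against recall and would normally be tuned per application.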