AST Finetuned AudioSet 10 10 0.4593 Finetuned GTZAN

Developed by wkCircle
This is an audio classification model based on the Audio Spectrogram Transformer (AST) architecture. It was pre-trained on the AudioSet dataset and then fine-tuned on the GTZAN music genre classification dataset.
Downloads 8
Release Time: 2/2/2025

Model Overview

This is a Transformer model for audio classification, particularly suited to music genre classification. After fine-tuning on the GTZAN dataset, the model achieved an accuracy of 91%.

Model Features

High accuracy
Achieved an accuracy of 91% on the GTZAN music genre classification task
Based on the Transformer architecture
Uses the Audio Spectrogram Transformer (AST) architecture, which operates directly on audio spectrograms
Transfer learning
Pre-trained on the large-scale AudioSet dataset, then fine-tuned on GTZAN
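
To make the spectrogram-based design more concrete, the sketch below counts how many patches an AST-style model would cut from one spectrogram. The defaults (128 mel bins, 1024 time frames, 16x16 patches, stride 10 on both axes) are assumptions taken from the commonly published AST AudioSet configuration, and the "10 10" in the model name likely refers to those two strides. The fine-tuned GTZAN checkpoint may use different values, and ast_patch_grid is a hypothetical helper, not part of any library.

```python
def ast_patch_grid(n_mels=128, n_frames=1024, patch=16, f_stride=10, t_stride=10):
    """Count the patches an AST-style patch embedding extracts.

    All defaults are assumptions based on the usual AST AudioSet setup,
    not values confirmed for this specific checkpoint.
    """
    # Number of patch positions along the frequency (mel) axis.
    freq_patches = (n_mels - patch) // f_stride + 1
    # Number of patch positions along the time axis.
    time_patches = (n_frames - patch) // t_stride + 1
    return freq_patches, time_patches, freq_patches * time_patches


print(ast_patch_grid())  # (12, 101, 1212)
```

Under these assumptions, each spectrogram becomes a sequence of 1212 overlapping patches, which the Transformer then processes much like a sequence of tokens.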

Model Capabilities

Audio classification
Music genre recognition
Audio feature extraction

Use Cases

Music analysis
Music genre classification
Automatically identify the genre category of a music segment
Achieved an accuracy of 91% on the GTZAN dataset
Audio content analysis
Audio content classification
Classify and label audio segments
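
The use cases above could be tried with a short script like the following sketch. It assumes the Hugging Face transformers library is installed and that the model is published under the repository ID shown in MODEL_ID. That ID is inferred from the model name and developer listed on this page and has not been verified. The helper names classify_file and top_genre are hypothetical.

```python
# Assumed repository ID, inferred from the page title and developer name.
MODEL_ID = "wkCircle/ast-finetuned-audioset-10-10-0.4593-finetuned-gtzan"


def classify_file(path, model_id=MODEL_ID):
    """Return the pipeline's list of {"label", "score"} dicts for one audio file."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    # Decoding a file path typically requires ffmpeg to be available.
    return classifier(path)


def top_genre(predictions):
    """Pick the label with the highest score from pipeline-style output."""
    return max(predictions, key=lambda p: p["score"])["label"]


# Hand-written scores for illustration only, not real model output.
sample = [{"label": "jazz", "score": 0.7}, {"label": "rock", "score": 0.2}]
print(top_genre(sample))  # jazz
```

With a local clip, the call would look like top_genre(classify_file("clip.wav")), where clip.wav is a placeholder path.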