Ast Finetuned Audioset 10 10 0.4593 Finetuned Gtzan
This audio classification model is based on the Audio Spectrogram Transformer (AST) architecture. It was pre-trained on the AudioSet dataset and then fine-tuned on the GTZAN music genre classification dataset.
Downloads 8
Release Time: 2/2/2025
Model Overview
This is a Transformer model for audio classification, particularly suited to music genre classification. After fine-tuning on the GTZAN dataset, it reached an accuracy of 91%.
Model Features
High accuracy
Achieved an accuracy of 91% on the GTZAN music genre classification task
Based on the Transformer architecture
Uses the Audio Spectrogram Transformer architecture, which operates on audio spectrograms rather than raw waveforms
Transfer learning
Pre-trained on the large-scale AudioSet dataset, then fine-tuned on GTZAN
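For readers who want to check a figure like the 91% quoted above, here is a minimal sketch of how such a number could be computed. It assumes the figure is plain top-1 accuracy over a held-out set; the page does not describe the evaluation split or protocol, and the function name `top1_accuracy` is hypothetical.

```python
from typing import Sequence


def top1_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of clips whose predicted genre equals the reference genre.

    Assumption: accuracy here means top-1 accuracy. The source page does not
    say how its 91% figure was measured.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the prediction matches the reference label.
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)
```

Under this definition, 91 correct predictions out of 100 evaluation clips would give 0.91.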
Model Capabilities
Audio classification
Music genre recognition
Audio feature extraction
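The capabilities listed above could be exercised roughly as follows. This is a minimal sketch, assuming the checkpoint is published in a format the Hugging Face `transformers` audio-classification pipeline can load; the page does not give a repository id, so `model_id` is left as a parameter the caller must supply, and the helper names `format_predictions` and `classify_file` are hypothetical.

```python
from typing import Dict, List


def format_predictions(predictions: List[Dict], min_score: float = 0.0) -> List[str]:
    """Turn pipeline-style results into 'label: score' strings, best first.

    Expects a list of dicts with "label" and "score" keys, which is the shape
    the transformers audio-classification pipeline returns for one input.
    """
    kept = [p for p in predictions if p["score"] >= min_score]
    kept.sort(key=lambda p: p["score"], reverse=True)
    return [f"{p['label']}: {p['score']:.2f}" for p in kept]


def classify_file(audio_path: str, model_id: str, top_k: int = 3) -> List[str]:
    """Classify one audio file and return its top_k genres as strings.

    Assumption: model_id names a checkpoint compatible with the pipeline.
    The source page does not state the repository id, so none is filled in.
    """
    # Imported lazily so format_predictions works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return format_predictions(classifier(audio_path, top_k=top_k))
```

Keeping the formatting step separate from the model call makes it possible to test the output handling without downloading any weights.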
Use Cases
Music analysis
Music genre classification
Automatically identify the genre category of a music segment
Achieved an accuracy of 91% on the GTZAN dataset
Audio content analysis
Audio content classification
Classify and label audio segments
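Labeling a longer recording segment by segment, as the use cases above suggest, might look like the sketch below. It assumes the recording is already loaded as a flat sequence of samples and that some classifier callable returns one genre label per segment. The 10-second default window and the majority-vote rule are illustrative choices, not something the page specifies, and all function names are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence


def split_into_segments(samples: Sequence[float], sample_rate: int,
                        segment_seconds: float) -> List[Sequence[float]]:
    """Cut samples into non-overlapping windows; a trailing partial window is dropped."""
    size = int(sample_rate * segment_seconds)
    if size <= 0:
        raise ValueError("segment length must be at least one sample")
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def majority_label(labels: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def label_recording(samples: Sequence[float], sample_rate: int,
                    classify_segment: Callable[[Sequence[float]], str],
                    segment_seconds: float = 10.0) -> str:
    """Classify each segment, then vote. The window length is an assumption."""
    segments = split_into_segments(samples, sample_rate, segment_seconds)
    if not segments:
        raise ValueError("recording is shorter than one segment")
    return majority_label([classify_segment(s) for s in segments])
```

Here `classify_segment` stands in for whatever wraps the model; passing it as a callable keeps the segmentation and voting logic independent of any particular inference library.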
© 2025 AIbase