whisper-base-finetuned-gtzan: Open-Source Audio Classification Model for Music Genre Recognition

Whisper Base Finetuned Gtzan

Developed by vineetsharma

An audio classification model based on OpenAI's whisper-base and fine-tuned on the GTZAN dataset, used primarily for music genre classification.

Audio Classification

Transformers

Open Source License: Apache-2.0

#Music Classification #High Accuracy #Audio Recognition

Downloads 15

Release Time: 7/3/2023

Model Overview

This model is a fine-tuned variant of whisper-base, adapted for music genre classification. It reached 87% accuracy on the GTZAN dataset at the end of 10 training epochs.

Model Features

High Accuracy

Achieved 87% classification accuracy on the GTZAN test set

Fine-tuned Optimization

Fine-tuned from the whisper-base model specifically for music classification tasks

Lightweight

Built on the whisper-base architecture, which is relatively lightweight (inferred)

Model Capabilities

Music Genre Classification

Audio Feature Extraction
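
As a rough illustration of the genre classification capability listed above, the model could be loaded through the Transformers audio-classification pipeline. This is a minimal sketch, not the author's documented usage: the Hub id is an assumption built from the developer and model names on this page, and the helper names (load_classifier, top_genre, classify_clip) are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Assumed Hub id, built from the developer and model names on this page.
MODEL_ID = "vineetsharma/whisper-base-finetuned-gtzan"


def load_classifier(model_id: str = MODEL_ID) -> Callable:
    """Build a Transformers audio-classification pipeline (downloads weights)."""
    # Imported lazily so the helpers below work without `transformers` installed.
    from transformers import pipeline

    return pipeline("audio-classification", model=model_id)


def top_genre(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring entry from a list of {"label", "score"} dicts."""
    return max(predictions, key=lambda p: p["score"])


def classify_clip(audio_path: str, classifier: Optional[Callable] = None) -> Dict:
    """Classify one audio file and return its most likely genre entry."""
    classifier = classifier or load_classifier()
    return top_genre(classifier(audio_path))
```

Calling classify_clip("clip.wav"), where clip.wav is a placeholder path, would return a single dict with "label" and "score" keys. Passing a classifier explicitly lets one pipeline be reused across many clips instead of reloading the weights each time.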

Use Cases

Music Analysis

Automatic Music Genre Classification

Classify music clips by genre

87% accuracy

Music Recommendation System

Serve as a preprocessing component for music recommendation systems
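
One way the recommendation use case above might look in practice is to turn each clip's genre scores into a fixed-length vector and compare clips by cosine similarity. This is only a sketch under stated assumptions: the ten label strings below are the GTZAN genre names, but the exact strings this model emits are not confirmed by this page, and genre_vector and cosine_similarity are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence

# The ten GTZAN genres; the exact label strings the model emits are an assumption.
GENRES = ("blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock")


def genre_vector(predictions: List[Dict], genres: Sequence[str] = GENRES) -> List[float]:
    """Map [{"label", "score"}, ...] onto a fixed-order score vector (missing = 0)."""
    scores = {p["label"]: float(p["score"]) for p in predictions}
    return [scores.get(g, 0.0) for g in genres]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Genres absent from a prediction list default to 0, which matters because a classifier may return only its top few labels. A recommender could then rank candidate tracks by their similarity to a listener's recent plays.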

Training Results

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.9075	1.0	57	1.0000	0.58
0.4569	2.0	114	0.6073	0.83
0.3761	3.0	171	0.6410	0.80
0.3049	4.0	228	0.4536	0.86
0.0284	5.0	285	0.5120	0.85
0.0165	6.0	342	0.4856	0.89
0.0087	7.0	399	0.6814	0.87
0.0038	8.0	456	0.7059	0.85
0.0032	9.0	513	0.6831	0.87
0.0034	10.0	570	0.6867	0.87
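
To make the per-epoch figures in the table above easier to read, the short script below picks out the epoch with the highest accuracy, the epoch with the lowest validation loss, and the final-epoch result. The values are copied from the table. The variable names are illustrative only.

```python
# (epoch, training loss, validation loss, accuracy), copied from the table above.
RESULTS = [
    (1, 0.9075, 1.0000, 0.58),
    (2, 0.4569, 0.6073, 0.83),
    (3, 0.3761, 0.6410, 0.80),
    (4, 0.3049, 0.4536, 0.86),
    (5, 0.0284, 0.5120, 0.85),
    (6, 0.0165, 0.4856, 0.89),
    (7, 0.0087, 0.6814, 0.87),
    (8, 0.0038, 0.7059, 0.85),
    (9, 0.0032, 0.6831, 0.87),
    (10, 0.0034, 0.6867, 0.87),
]

best_acc = max(RESULTS, key=lambda r: r[3])   # row with the highest accuracy
best_loss = min(RESULTS, key=lambda r: r[2])  # row with the lowest validation loss
final = RESULTS[-1]                           # last epoch, the reported result

print(f"best accuracy:   epoch {best_acc[0]} ({best_acc[3]:.2f})")
print(f"lowest val loss: epoch {best_loss[0]} ({best_loss[2]:.4f})")
print(f"final epoch:     epoch {final[0]} ({final[3]:.2f})")
```

Accuracy peaks at 0.89 in epoch 6 and validation loss is lowest in epoch 4, while the reported 87% corresponds to the final epoch. Training loss keeps falling as validation loss rises after epoch 4, which may indicate some overfitting in the later epochs.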

