Ast Finetuned Model

Developed by forwarder1121
This is a fine-tuned model based on the Audio Spectrogram Transformer (AST), designed for emotion classification in speech audio.
Downloads 174
Release Time: 11/17/2024

Model Overview

The model is fine-tuned on the CREMA-D dataset to classify speech into six emotion categories (anger, disgust, fear, happiness, neutral, sadness), which makes it suitable for speech emotion recognition tasks.
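
As a rough illustration of six-way classification, the sketch below turns a vector of six raw scores (logits) into a predicted emotion and its probability. The label order in EMOTIONS, the helper names softmax and top_emotion, and the sample logits are all assumptions made for this example; the model's actual id-to-label mapping should be read from its own configuration.

```python
import math

# Assumed label order; the model's real id-to-label mapping may differ.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "neutral", "sadness"]


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_emotion(logits, labels=EMOTIONS):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per emotion label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical logits for a single utterance.
label, prob = top_emotion([0.2, -1.0, 0.1, 2.5, 0.3, -0.4])
```

In practice the logits would come from the model's classification head for one audio clip; only the post-processing step is shown here.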

Model Features

Based on Audio Spectrogram Transformer
Uses the Audio Spectrogram Transformer architecture to capture emotional features in speech.
Six Emotion Categories
Supports recognition of six emotion categories: anger, disgust, fear, happiness, neutral, and sadness.
Data Augmentation
Applies data augmentation during training, including noise injection, time shifting, and speed perturbation, to improve model robustness.
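
The three augmentations named above can be sketched as follows. This is a minimal, dependency-free illustration on plain Python lists, not the training code used for this model: the function names, the default noise level, the circular form of the time shift, and the linear-interpolation resampler are all assumptions chosen for clarity. A real pipeline would more likely use NumPy or an audio library.

```python
import random


def add_noise(wave, noise_std=0.005, rng=random):
    """Noise injection: add zero-mean Gaussian noise to every sample."""
    return [s + rng.gauss(0.0, noise_std) for s in wave]


def time_shift(wave, shift):
    """Time shifting: circularly rotate the signal by `shift` samples.

    Positive values move content later; samples pushed off the end
    wrap around to the start.
    """
    if not wave:
        return []
    k = shift % len(wave)
    return wave[-k:] + wave[:-k] if k else list(wave)


def change_speed(wave, rate):
    """Speed perturbation: resample by linear interpolation.

    rate > 1 shortens the signal (faster speech);
    rate < 1 lengthens it (slower speech).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not wave:
        return []
    n_out = max(1, int(len(wave) / rate))
    out = []
    for i in range(n_out):
        pos = i * rate                    # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, len(wave) - 1)   # clamp at the last sample
        frac = pos - lo
        out.append(wave[lo] * (1 - frac) + wave[hi] * frac)
    return out
```

Each function returns a new list, so the transforms can be chained, for example `add_noise(time_shift(change_speed(wave, 1.1), 200))`, with parameters drawn at random per training example.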

Model Capabilities

Speech Emotion Recognition
Audio Classification
Emotion Analysis

Use Cases

Human-Computer Interaction
Smart Customer Service Emotion Analysis
Used to analyze users' emotional states during customer service calls to improve service quality.
Mental Health
Emotional State Monitoring
Analyzes users' emotional changes through speech for mental health applications.