DASS Small AudioSet 47.2

Developed by saurabhati
DASS is the first state space model to outperform Transformer-based audio classifiers on AudioSet, reaching state-of-the-art audio classification performance with a significantly smaller model.
Downloads 47
Release Time: 3/29/2025

Model Overview

An audio classification model fine-tuned on AudioSet-2M. It uses a state space architecture that outperforms traditional Transformer models on audio classification tasks and is more robust to changes in input duration.

Model Features

Efficient Performance
DASS-small with only 30M parameters outperforms the 87M-parameter AST model (mAP 47.2 vs 45.9).
Duration Robustness
Performance remains stable on longer audio: with 50-second inputs, the model retains 96% of its performance on 10-second inputs.
Ultra-long Audio Processing
Can process audio inputs up to 2.5 hours long on a single A6000 GPU while retaining 62% of its performance on 10-second inputs.
Knowledge Distillation
Trained with a KL divergence loss against the outputs of a teacher AST model, so the smaller student model learns from the teacher's predictions.
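
The distillation objective described above can be illustrated with a minimal sketch. The page only states that a KL divergence loss against the teacher AST model is used. The softmax normalization, the temperature parameter, and the function names below are assumptions for illustration, not the authors' training code, and any combination with a ground-truth label loss is not shown.

```python
import math

def softmax(logits, temperature=1.0):
    # Assumption: logits are normalized with a temperature-scaled softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    # KL(teacher || student): penalizes the student for placing low
    # probability where the teacher places high probability.
    return sum(
        t * (math.log(t + eps) - math.log(s + eps))
        for t, s in zip(teacher_probs, student_probs)
    )

def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    # Hypothetical helper: compares teacher and student output distributions.
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return kl_divergence(teacher, student)

# The loss is zero when the student matches the teacher exactly,
# and positive when their output distributions differ.
print(distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # 0.0
print(distillation_loss([2.0, 0.5, -1.0], [0.0, 0.0, 0.0]) > 0)  # True
```

In a sketch like this, the student's parameters would be updated to reduce the loss, pulling its predictions toward those of the larger teacher.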

Model Capabilities

Audio Classification
Multi-label Audio Recognition
Long Audio Processing
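
Multi-label recognition means several sound classes can be active in the same clip, so each class is scored independently rather than competing for a single label. The following is a minimal sketch of how per-class scores might be turned into a label set. The sigmoid activation, the 0.5 threshold, the example logits, and the function names are assumptions for illustration and are not taken from the model's own code.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def decode_multilabel(logits, labels, threshold=0.5, top_k=None):
    # Score every class independently, keep those at or above the
    # threshold, and return them from most to least confident.
    scored = sorted(
        ((sigmoid(z), name) for z, name in zip(logits, labels)),
        reverse=True,
    )
    kept = [(name, p) for p, name in scored if p >= threshold]
    return kept[:top_k] if top_k is not None else kept

# Illustrative logits for three classes (made-up values).
labels = ["Dog", "Speech", "Siren"]
logits = [2.0, -1.0, 0.5]
predicted = decode_multilabel(logits, labels)
print([name for name, _ in predicted])  # ['Dog', 'Siren']
```

Lowering the threshold trades precision for recall. A suitable value depends on the application and would need to be tuned on held-out data.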

Use Cases

Audio Content Analysis
Environmental Sound Classification
Identify various sound categories in natural or urban environments.
Accurately recognizes animal sounds, vehicle noises, and other sound categories.
Audio Event Detection
Detect specific events or sounds in audio streams.
Capable of detecting critical events like glass breaking or alarm sounds.
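
For event detection in a stream, one simple approach is to score consecutive fixed-length windows for a target class and merge adjacent above-threshold windows into time intervals. The sketch below assumes non-overlapping 10-second windows and per-window scores that have already been produced by a classifier. The function name, threshold, and example scores are hypothetical.

```python
def detect_events(window_scores, window_seconds=10.0, threshold=0.5):
    # Merge runs of consecutive windows whose score is at or above the
    # threshold into (start_seconds, end_seconds) intervals.
    events = []
    start = None
    for i, score in enumerate(window_scores):
        if score >= threshold and start is None:
            start = i  # a new event begins in this window
        elif score < threshold and start is not None:
            events.append((start * window_seconds, i * window_seconds))
            start = None
    if start is not None:
        # The stream ended while an event was still active.
        events.append((start * window_seconds, len(window_scores) * window_seconds))
    return events

# Made-up "alarm" scores for five consecutive 10-second windows.
scores = [0.1, 0.8, 0.9, 0.2, 0.7]
print(detect_events(scores))  # [(10.0, 30.0), (40.0, 50.0)]
```

Overlapping windows or score smoothing would give finer time resolution at the cost of more inference calls. Those refinements are omitted here.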
Media Content Management
Video Content Tagging
Assist video content classification through audio analysis.
Improves efficiency in video content retrieval and classification.