CED-Tiny
CED-Tiny is a simple audio tagging model based on the Vision Transformer (ViT) architecture, achieving state-of-the-art performance on AudioSet.
Downloads 54
Release Time: 11/24/2023
Model Overview
CED-Tiny is an efficient audio classification model designed for audio tagging tasks, featuring a small parameter count and fast inference speed.
Model Features
Simplified Fine-tuning
Batch normalization of Mel spectrograms eliminates the need to pre-compute dataset mean/variance during fine-tuning.
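As a rough illustration of why this removes the pre-computation step, the sketch below normalizes each Mel bin with statistics taken from the current batch and keeps running averages for inference. It is a toy, standard-library version: the class name `MelBatchNorm`, the per-Mel-bin grouping, and the momentum value are assumptions for illustration, and the learnable scale and shift of a real batch-norm layer are omitted.

```python
from statistics import fmean


class MelBatchNorm:
    """Toy per-Mel-bin batch normalization (hypothetical helper, not the model's code)."""

    def __init__(self, n_mels, momentum=0.1, eps=1e-5):
        self.momentum = momentum  # assumed value, for illustration only
        self.eps = eps
        self.running_mean = [0.0] * n_mels
        self.running_var = [1.0] * n_mels

    def __call__(self, batch, training=True):
        # batch[b][m] is the list of frame values for clip b, Mel bin m.
        # Clips may have different numbers of frames.
        n_mels = len(self.running_mean)
        out = [[None] * n_mels for _ in batch]
        for m in range(n_mels):
            if training:
                # Statistics come from the batch itself, so no dataset-wide
                # mean/variance has to be computed ahead of time.
                values = [v for clip in batch for v in clip[m]]
                mean = fmean(values)
                var = fmean((v - mean) ** 2 for v in values)
                self.running_mean[m] += self.momentum * (mean - self.running_mean[m])
                self.running_var[m] += self.momentum * (var - self.running_var[m])
            else:
                # At inference time the running averages are used instead.
                mean, var = self.running_mean[m], self.running_var[m]
            scale = (var + self.eps) ** -0.5
            for b, clip in enumerate(batch):
                out[b][m] = [(v - mean) * scale for v in clip[m]]
        return out
```

During fine-tuning on a new dataset the running statistics simply adapt to the new data, which is the practical benefit the feature description refers to.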
Variable-Length Input Support
Most models use a static time-frequency positional encoding, which limits how well they generalize to clips shorter than 10 s. CED-Tiny supports variable-length input for greater flexibility.
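One common way to support shorter clips, sketched here as an assumption and not confirmed by this page as CED-Tiny's actual mechanism, is to keep separate time and frequency positional tables and use only as many time positions as the clip has patches. The table sizes, the embedding width `DIM`, and the function `add_positions` are all hypothetical.

```python
import random

MAX_TIME_PATCHES = 62  # assumed: enough time patches for a 10 s clip
FREQ_PATCHES = 4       # assumed: 64 Mel bins / 16-bin patches
DIM = 8                # toy embedding width

rng = random.Random(0)
# Separate tables for time and frequency positions (randomly initialised here).
time_pos = [[rng.gauss(0, 0.02) for _ in range(DIM)] for _ in range(MAX_TIME_PATCHES)]
freq_pos = [[rng.gauss(0, 0.02) for _ in range(DIM)] for _ in range(FREQ_PATCHES)]


def add_positions(tokens):
    """Add positional vectors to tokens[f][t], a DIM-vector per (freq, time) patch."""
    n_time = len(tokens[0])
    if n_time > MAX_TIME_PATCHES:
        raise ValueError("clip exceeds the positional table; split it into chunks")
    # Only the first n_time rows of the time table are used, so a short clip
    # needs no padding or interpolation of the positional encoding.
    return [
        [
            [x + fp + tp for x, fp, tp in zip(tokens[f][t], freq_pos[f], time_pos[t])]
            for t in range(n_time)
        ]
        for f in range(FREQ_PATCHES)
    ]
```

For example, a clip that produces 18 time patches uses rows 0 to 17 of `time_pos` and every row of `freq_pos`.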
Training/Inference Acceleration
Uses 64-bin Mel filter banks and non-overlapping 16x16 patches, so a 10 s spectrogram yields only 248 patches, which significantly speeds up training and inference.
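The patch count can be checked with a little arithmetic. The 64 Mel bins and 16x16 patch size come from the description above; the 16 kHz sample rate and 160-sample hop (10 ms frames) are assumptions that this page does not state, and `num_patches` is a hypothetical helper.

```python
def num_patches(duration_s, sample_rate=16_000, hop_length=160, n_mels=64, patch=16):
    """Count non-overlapping patch x patch tiles in a Mel spectrogram."""
    # Frame count for a centred STFT (assumed framing convention).
    n_frames = int(duration_s * sample_rate) // hop_length + 1
    freq_patches = n_mels // patch    # 64 // 16 = 4
    time_patches = n_frames // patch  # leftover frames at the end are dropped
    return freq_patches * time_patches


print(num_patches(10))  # 4 frequency patches * 62 time patches = 248
```

Because self-attention cost grows with the square of the sequence length, keeping the patch count this low is where most of the speed-up comes from.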
Performance Advantage
The 10M-parameter CED model outperforms most previous ~80M-parameter solutions.
Model Capabilities
Audio Classification
Audio Tagging
Variable-Length Audio Processing
Use Cases
Audio Classification
Environmental Sound Classification
Identifies and classifies various environmental sounds such as animal calls, vehicle noises, etc.
Achieved 36.5 mAP on AudioSet.
Audio Event Detection
Detects specific audio events like applause, finger snaps, etc.
Achieved 48.1 mAP on AudioSet.
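For readers unfamiliar with the metric, mAP (mean average precision) averages a per-class ranking score over all sound classes. The sketch below follows the usual definition and is not the evaluation script behind the figures above; the function names are hypothetical, and a class with no positive examples scores 0 here, whereas real evaluations typically skip it.

```python
def average_precision(labels, scores):
    """AP for one class: mean precision at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(label_matrix, score_matrix):
    """label_matrix[i][c] is 0 or 1; score_matrix[i][c] is the model's score."""
    n_classes = len(label_matrix[0])
    aps = [
        average_precision(
            [row[c] for row in label_matrix],
            [row[c] for row in score_matrix],
        )
        for c in range(n_classes)
    ]
    return sum(aps) / n_classes
```

Reported values such as 36.5 are this quantity expressed as a percentage.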