CED Small

Developed by mispeech
CED is a simple audio tagging model based on the Vision Transformer (ViT) architecture, achieving state-of-the-art performance on AudioSet.
Downloads 18
Release Time: 11/24/2023

Model Overview

CED is a Transformer model for audio classification, optimized for audio tagging. It supports variable-length input and simplifies fine-tuning.

Model Features

Simplified Fine-Tuning
Applying batch normalization to the input Mel spectrograms removes the need to precompute the dataset mean and variance before fine-tuning.
Variable-Length Input Support
Removes the fixed 10-second segment length that earlier audio Transformers typically assume, improving generalization.
Efficient Training/Inference
An optimized spectrogram patching strategy yields far fewer patches per clip than AST models, significantly reducing computational cost.
High-Performance Compact Model
The 10M-parameter CED variant outperforms most earlier approaches of roughly 80M parameters.
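
To make the efficiency point concrete, the sketch below counts how many patch tokens a Transformer has to process for one 10-second clip. The front-end settings are assumptions taken from the published CED and AST descriptions, not from this page: CED with 64 Mel bins and non-overlapping 16x16 patches over about 1001 frames, AST with 128 Mel bins and 16x16 patches at stride 10 over 1024 frames. The helper name `count_patches` is hypothetical.

```python
def count_patches(n_mels: int, n_frames: int, patch: int, stride: int) -> int:
    """Number of patch tokens cut from an (n_mels x n_frames) spectrogram."""
    freq_patches = (n_mels - patch) // stride + 1
    time_patches = (n_frames - patch) // stride + 1
    return freq_patches * time_patches


# Assumed front-end settings (not stated on this page).
ced_tokens = count_patches(n_mels=64, n_frames=1001, patch=16, stride=16)
ast_tokens = count_patches(n_mels=128, n_frames=1024, patch=16, stride=10)

print(ced_tokens, ast_tokens)  # 248 1212

# Self-attention cost grows roughly with the square of the token count,
# so the ratio below is a rough indicator of the attention-cost gap.
print(round((ast_tokens / ced_tokens) ** 2, 1))
```

Under these assumptions the shorter token sequence, rather than any change to the attention mechanism itself, accounts for most of the saving.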

Model Capabilities

Audio Classification
Audio Tagging
Sound Event Detection
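
Audio tagging is a multi-label task: several sound events can be active in the same clip, so each class typically gets an independent sigmoid score rather than a share of a softmax. The sketch below shows that decoding step on made-up values. The label names, the logits, and the 0.5 threshold are illustrative assumptions, not outputs of CED, and `decode_tags` is a hypothetical helper.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_tags(logits, labels, threshold=0.5):
    """Return (label, score) pairs whose sigmoid score meets the threshold, highest first."""
    scored = [(label, sigmoid(z)) for label, z in zip(labels, logits)]
    active = [(label, score) for label, score in scored if score >= threshold]
    return sorted(active, key=lambda pair: pair[1], reverse=True)


# Illustrative values only; a real model would emit one logit per class.
labels = ["Speech", "Finger snapping", "Dog", "Music"]
logits = [2.0, 0.8, -3.0, -0.5]

tags = decode_tags(logits, labels)
for label, score in tags:
    print(f"{label}: {score:.2f}")
```

Because the scores are independent, lowering the threshold trades precision for recall on a per-class basis.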

Use Cases

Sound Recognition
Environmental Sound Classification
Identify various types of environmental sounds
Achieves 49.6 mAP on AudioSet
Specific Sound Detection
Detect specific sound events such as finger snaps
Recognizes the 527 sound categories of AudioSet
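
The mAP figure above is a mean average precision: average precision is computed per class from the ranked clip scores, then averaged over all classes. The toy sketch below illustrates that calculation with two classes and four clips. The data and function names are invented for illustration, and the page does not describe the exact evaluation protocol, so treat this as one common definition rather than the authors' evaluation code.

```python
def average_precision(scores, targets):
    """AP for one class: mean of the precision at each rank where a positive clip appears."""
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_rows, target_rows):
    """Rows are clips, columns are classes; returns the mean of per-class APs."""
    n_classes = len(score_rows[0])
    aps = []
    for c in range(n_classes):
        class_scores = [row[c] for row in score_rows]
        class_targets = [row[c] for row in target_rows]
        aps.append(average_precision(class_scores, class_targets))
    return sum(aps) / n_classes


# Toy data: 4 clips, 2 classes (values are made up).
scores = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
targets = [[1, 0], [0, 1], [1, 0], [0, 0]]

map_value = mean_average_precision(scores, targets)
print(f"{100 * map_value:.1f}")  # reported on a 0-100 scale, like the figure above
```

On AudioSet the same averaging would run over all 527 classes instead of two.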