TIGER Speech

Developed by JusperLee
TIGER is a lightweight speech separation model that extracts key acoustic features through frequency band partitioning, multi-scale attention, and full-band frame attention.
Downloads 1,286
Release Time: 1/22/2025

Model Overview

TIGER is a speech separation model designed to sharply reduce parameter count and computational cost. It combines frequency band partitioning with an interleaved modeling architecture, which lets it maintain high separation performance at a fraction of the size.
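To put the reduction percentages listed under Model Features (94.3% of parameters, 95.3% of MACs) in perspective, the short sketch below converts a percentage reduction into a "times smaller" factor. The helper name `shrink_factor` is hypothetical and used only for illustration; it is not part of TIGER.

```python
def shrink_factor(reduction_percent):
    """Return how many times smaller a quantity is after a given % reduction."""
    remaining = 1.0 - reduction_percent / 100.0  # fraction that is left
    return 1.0 / remaining


params_factor = shrink_factor(94.3)  # parameter count
macs_factor = shrink_factor(95.3)    # multiply-accumulate operations

print(f"parameters: about {params_factor:.1f}x fewer")  # about 17.5x fewer
print(f"MACs: about {macs_factor:.1f}x fewer")          # about 21.3x fewer
```

In other words, a 94.3% reduction leaves roughly 1/17.5 of the original parameters.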

Model Features

Lightweight Design
Reduces parameter count by 94.3% and MACs (multiply-accumulate operations) by 95.3% relative to TF-GridNet while maintaining high performance.
Frequency Band Partitioning and Compression
Utilizes prior knowledge to partition frequency bands and compress frequency information for improved efficiency.
Multi-scale Selective Attention
Employs Multi-scale Selective Attention (MSA) modules to extract contextual features.
Full-band Frame Attention
Introduces Full-band Frame Attention (F^3A) modules to capture time and frequency contextual information.
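The idea behind frequency band partitioning and compression can be sketched in a few lines. This is a minimal illustration, not TIGER's implementation: the band layout (narrow bands at low frequencies, wider bands higher up), the function names `make_bands` and `compress_frame`, and the use of a plain mean in place of a learned projection are all assumptions made for the example.

```python
def make_bands(n_bins, widths_by_region):
    """Return (start, end) bin ranges that cover [0, n_bins).

    widths_by_region is a list of (region_end_bin, band_width) pairs.
    The layout used below is hypothetical, chosen only for illustration.
    """
    bands = []
    start = 0
    for region_end, width in widths_by_region:
        region_end = min(region_end, n_bins)
        while start < region_end:
            end = min(start + width, region_end)
            bands.append((start, end))
            start = end
    if start < n_bins:
        bands.append((start, n_bins))  # leftover bins form one final band
    return bands


def compress_frame(frame, bands):
    """Collapse one spectrogram frame to one value per band.

    A mean stands in for the learned compression a real model would use.
    """
    return [sum(frame[s:e]) / (e - s) for s, e in bands]


n_bins = 257                              # e.g. bins from a 512-point FFT
layout = [(32, 4), (128, 16), (256, 64)]  # assumed: finer resolution at low frequencies
bands = make_bands(n_bins, layout)

frame = [1.0] * n_bins                    # dummy magnitude frame
compressed = compress_frame(frame, bands)
print(len(frame), "->", len(compressed))  # 257 -> 17
```

Later modules then operate on 17 band features per frame instead of 257 bins, which is where the efficiency gain in this sketch comes from.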

Model Capabilities

Speech Separation
Efficient Computation
Multi-scale Feature Extraction

Use Cases

Speech Processing
Speech Separation in Complex Acoustic Environments
Separates overlapping speech in environments with noise and realistic reverberation.
Significantly outperforms TF-GridNet in inference speed and separation quality on the EchoSet dataset.