Asteroid model Awais/Audio_Source_Separation
This is an audio source separation model imported from Zenodo. It separates a two-speaker speech mixture (n_src: 2) into its individual sources.
Quick Start
You can use this model for audio source separation. It was trained on 8 kHz two-speaker mixtures, so input audio should be at, or resampled to, that sample rate.
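A minimal loading sketch, assuming the `asteroid` package is installed and that the name in this card's title resolves as a Hugging Face Hub model id (an assumption; the card only says the model was imported from Zenodo). The helper `separate_file` and the file name in the comment are hypothetical.

```python
# Assumption: the id from this card's title can be passed to Asteroid as-is.
MODEL_ID = "Awais/Audio_Source_Separation"


def separate_file(wav_path, model_id=MODEL_ID):
    """Separate the sources in `wav_path` and return the loaded model (hypothetical helper)."""
    # Imported lazily so this snippet can be read and imported without Asteroid installed.
    from asteroid.models import BaseModel

    # Downloads (or reuses a cached copy of) the pretrained checkpoint.
    model = BaseModel.from_pretrained(model_id)
    # resample=True asks Asteroid to convert the file to the model's sample rate first.
    model.separate(wav_path, resample=True)
    return model


# Example call (placeholder file name):
# separate_file("mixture.wav")
```

When given a file path, `separate` typically writes the estimated sources next to the input file rather than returning them; check the Asteroid documentation for the version you have installed.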
Features
- Trained on a specific dataset: This model was trained by Joris Cosentino using the librimix recipe in Asteroid. It was trained on the sep_clean task of the Libri2Mix dataset.
- Configurable training: The training configuration can be adjusted to different requirements, including data settings, filterbank parameters, masknet settings, optimization strategy, and training hyperparameters.
Documentation
Description
This model was trained by Joris Cosentino using the librimix recipe in Asteroid. It was trained on the sep_clean task of the Libri2Mix dataset.
Training Config
data:
  n_src: 2
  sample_rate: 8000
  segment: 3
  task: sep_clean
  train_dir: data/wav8k/min/train-360
  valid_dir: data/wav8k/min/dev
filterbank:
  kernel_size: 16
  n_filters: 512
  stride: 8
masknet:
  bn_chan: 128
  hid_chan: 512
  mask_act: relu
  n_blocks: 8
  n_repeats: 3
  skip_chan: 128
optim:
  lr: 0.001
  optimizer: adam
  weight_decay: 0.0
training:
  batch_size: 24
  early_stop: True
  epochs: 200
  half_lr: True
  num_workers: 2
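To make the data and filterbank settings more concrete, the sketch below derives a few quantities from them: the number of samples in a training segment, the encoder window and hop in milliseconds, and the number of encoder frames per segment. The frame count assumes a strided 1-D convolution without padding, which the config does not state.

```python
# Values copied from the training config above.
sample_rate = 8000   # Hz
segment = 3          # seconds per training example
kernel_size = 16     # encoder filter length, in samples
stride = 8           # encoder hop, in samples

# Samples in one training segment.
segment_samples = segment * sample_rate

# Assumption: the encoder is a strided 1-D convolution with no padding.
n_frames = (segment_samples - kernel_size) // stride + 1

# Window and hop expressed in milliseconds.
window_ms = 1000 * kernel_size / sample_rate
hop_ms = 1000 * stride / sample_rate

print(segment_samples, n_frames, window_ms, hop_ms)
# prints: 24000 2999 2.0 1.0
```

Under these assumptions each 3-second segment becomes 2999 frames, each spanning 2 ms of audio with a 1 ms hop.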
Results
On Libri2Mix min test set:
si_sdr: 14.764543634468069
si_sdr_imp: 14.764029375607246
sdr: 15.29337970745095
sdr_imp: 15.114146605113111
sir: 24.092904661115366
sir_imp: 23.913669683141528
sar: 16.06055906916849
sar_imp: -51.980784441287454
stoi: 0.9311142440593033
stoi_imp: 0.21817376142710482
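The `_imp` entries can be read as improvements over the unprocessed mixture. Assuming each improvement is the metric on the estimate minus the same metric on the mixture (the card does not spell this out), the sketch below recovers the implied mixture baselines.

```python
# (metric, improvement) pairs copied from the results above.
results = {
    "si_sdr": (14.764543634468069, 14.764029375607246),
    "sdr": (15.29337970745095, 15.114146605113111),
    "sir": (24.092904661115366, 23.913669683141528),
    "sar": (16.06055906916849, -51.980784441287454),
    "stoi": (0.9311142440593033, 0.21817376142710482),
}

# Assumption: improvement = metric(estimate) - metric(unprocessed mixture).
baselines = {name: value - imp for name, (value, imp) in results.items()}

for name, base in baselines.items():
    print(f"{name}: implied mixture baseline = {base:.3f}")
```

Under that assumption the mixture sits near 0 dB SI-SDR, and its SAR is about 68 dB. That would explain the large negative `sar_imp`: the unprocessed mixture contains almost no processing artifacts, so any separation lowers SAR.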
Technical Details
The model uses the ConvTasNet architecture and was trained on the Libri2Mix dataset. The training configuration covers the data settings, the filterbank, the mask network (masknet), and the optimization strategy.
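As a rough illustration of what the masknet settings imply, the sketch below estimates the temporal receptive field of the separator. It relies on two assumptions that do not appear in the config: a depthwise convolution kernel size of 3, and a dilation of 2**b in block b of each repeat, as in the usual Conv-TasNet design.

```python
# Values copied from the training config.
n_blocks = 8        # convolutional blocks per repeat
n_repeats = 3       # number of repeats
kernel_size = 16    # encoder filter length, in samples
stride = 8          # encoder hop, in samples
sample_rate = 8000  # Hz

# Assumptions (not in the config): kernel size 3, dilation 2**b in block b.
conv_kernel_size = 3

# Each dilated convolution widens the receptive field by (kernel - 1) * dilation frames.
receptive_field_frames = 1 + n_repeats * sum(
    (conv_kernel_size - 1) * 2**b for b in range(n_blocks)
)

# Convert encoder frames back to input samples and seconds.
receptive_field_samples = (receptive_field_frames - 1) * stride + kernel_size
receptive_field_s = receptive_field_samples / sample_rate

print(receptive_field_frames, receptive_field_samples, receptive_field_s)
```

With these assumptions the mask network sees roughly 1.5 seconds of context, about half of the 3-second training segment.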
License
This work "ConvTasNet_Libri2Mix_sepclean_8k" is a derivative of LibriSpeech ASR corpus by Vassil Panayotov, used under CC BY 4.0. "ConvTasNet_Libri2Mix_sepclean_8k" is licensed under [Attribution-ShareAlike 3.0 Unported](https://creativecommons.org/licenses/by-sa/3.0/) by Cosentino Joris.
Additional Information
| Property | Details |
|---|---|
| Model Type | Asteroid model (Awais/Audio_Source_Separation) |
| Training Data | Libri2Mix (sep_clean task) |
| Tags | asteroid, audio, ConvTasNet, audio-to-audio |