The open-source speech processing model wavlm-basic_s-r-5c_8batch_5sec_0.0001lr_unfrozen

Wavlm Basic S R 5c 8batch 5sec 0.0001lr Unfrozen

Developed by reralle

A speech processing model fine-tuned from microsoft/wavlm-large, reaching 75% accuracy (F1 0.7515) on its evaluation set

Audio Classification

Transformers

#Voice feature extraction #Mini-batch optimization #Linear learning rate scheduling

Downloads 16

Release Time: 4/30/2023

Model Overview

This model is a fine-tuned variant of the WavLM architecture for audio classification, suited to short audio segments

Model Features

Efficient fine-tuning

Fine-tuned with a learning rate of 0.0001 to preserve the core capabilities of the pre-trained model

Short audio processing

Optimized for 5-second audio clips, suitable for real-time processing scenarios
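
To make the 5-second constraint concrete, the sketch below pads or truncates a mono waveform to a fixed length before it is passed to the model. It assumes 16 kHz audio (the rate WavLM models are pre-trained on) and zero-padding at the end. The helper name `fit_to_clip` is hypothetical, and the card does not document the preprocessing that was actually used.

```python
SAMPLE_RATE = 16_000   # assumption: 16 kHz mono input, as used for WavLM pre-training
CLIP_SECONDS = 5       # clip length taken from the model name ("5sec")
CLIP_SAMPLES = SAMPLE_RATE * CLIP_SECONDS


def fit_to_clip(samples, target_len=CLIP_SAMPLES):
    """Return a copy of `samples` that is exactly `target_len` long.

    Longer inputs are truncated; shorter inputs are zero-padded at the end.
    """
    samples = list(samples)
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [0.0] * (target_len - len(samples))
```

A 7-second recording would be cut to its first 80,000 samples, while a 2-second recording would gain 3 seconds of trailing silence.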

Stable training

Uses gradient accumulation (4 steps, for an effective batch size of 32) and a linear learning rate schedule to stabilize training
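
As an illustration of why gradient accumulation behaves like a larger batch, the toy sketch below compares the gradient of one batch of 32 samples with the averaged gradients of 4 micro-batches of 8. The one-parameter linear model and the data are hypothetical stand-ins, not the model's training code.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch=8, accum_steps=4):
    """Average the gradients of `accum_steps` micro-batches before one update."""
    total = 0.0
    for i in range(accum_steps):
        lo, hi = i * micro_batch, (i + 1) * micro_batch
        # Dividing by accum_steps keeps the scale equal to one large batch.
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / accum_steps
    return total


# Hypothetical toy data: 32 samples of y = 3x + 1.
xs = [float(i) for i in range(32)]
ys = [3.0 * x + 1.0 for x in xs]

full = mse_grad(0.5, xs, ys)           # one batch of 32
accum = accumulated_grad(0.5, xs, ys)  # 4 micro-batches of 8
```

The two values agree up to floating-point rounding, which is why a per-device batch of 8 with 4 accumulation steps is reported as a total batch size of 32.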

Model Capabilities

Voice feature extraction

Short audio classification

Speech pattern recognition

Use Cases

Speech analysis

Emotion recognition

Analyze emotional tendencies in short speech segments

75% accuracy on the model's own evaluation set (dataset not documented)

Voice command classification

Identify categories of short voice commands

F1 score 0.75 on the model's own evaluation set (dataset not documented)

🚀 wavlm-basic_s-r-5c_8batch_5sec_0.0001lr_unfrozen

This model is a fine-tuned version of microsoft/wavlm-large. The model card does not name the training dataset (it is recorded as None). It achieves the following results on the evaluation set:

Loss: 0.9859
Accuracy: 0.75
F1: 0.7515
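
The card does not say how F1 is averaged or how many classes the task has. The sketch below shows how accuracy and a macro-averaged F1 can be computed, under the assumption of macro averaging and a hypothetical balanced 10-class evaluation set. Under those assumptions, a model that always predicts the same class scores accuracy 0.1 and F1 0.0182, the same values as the first rows of the training results table, though that match does not confirm the actual setup.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: 10 balanced classes, 30 clips each,
# and a model that always predicts class 0.
y_true = [c for c in range(10) for _ in range(30)]
y_pred = [0] * len(y_true)

print(round(accuracy(y_true, y_pred), 4))  # 0.1
print(round(macro_f1(y_true, y_pred), 4))  # 0.0182
```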

🚀 Quick Start
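
A minimal inference sketch using the Transformers `audio-classification` pipeline. The Hub repository ID is an assumption built from the developer name and model name on this page, and `classify` and `top_label` are hypothetical helper names. The label set depends on the checkpoint's configuration, which the card does not document.

```python
# Assumed Hub repo ID (developer name + model name); verify before use.
MODEL_ID = "reralle/wavlm-basic_s-r-5c_8batch_5sec_0.0001lr_unfrozen"


def top_label(predictions):
    """Pick the highest-scoring entry from a list of {"label", "score"} dicts."""
    return max(predictions, key=lambda p: p["score"])


def classify(audio_path, model_id=MODEL_ID):
    """Classify one audio file and return the best {"label", "score"} dict."""
    # Imported lazily so top_label() works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return top_label(classifier(audio_path))
```

Calling `classify("clip.wav")` (a hypothetical file) downloads the checkpoint on first use. Decoding audio from a file path requires ffmpeg to be installed.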

📚 Documentation

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.003
num_epochs: 1000
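
Two of these settings are easier to read as arithmetic. The sketch below derives the total batch size from the per-device batch size and the accumulation steps (assuming a single device), and computes the learning rate at a given optimizer step for a linear schedule with warmup. The function name `linear_schedule_lr` is hypothetical, and the total number of optimizer steps is left as a parameter because the card does not state it.

```python
import math

TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4
# Assumption: one device, so total batch = per-device batch x accumulation steps.
EFFECTIVE_BATCH_SIZE = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_schedule_lr(step, total_steps, base_lr=1e-4, warmup_ratio=0.003):
    """Learning rate at optimizer step `step`: linear warmup, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With a warmup ratio of 0.003, only the first 0.3% of optimizer steps are spent ramping up; the rest of training decays the rate linearly toward zero.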

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
2.2767	0.33	78	2.3002	0.1	0.0182
2.0686	0.66	156	2.4001	0.1	0.0182
1.7043	0.99	234	2.1688	0.19	0.0875
1.6238	1.32	312	2.0125	0.2533	0.1313
1.4339	1.65	390	1.7132	0.4433	0.3567
1.2106	1.97	468	1.6403	0.5233	0.4524
1.0918	2.3	546	1.6254	0.58	0.5063
0.9621	2.63	624	1.3746	0.5967	0.5248
0.8272	2.96	702	1.1466	0.6333	0.5852
0.8004	3.29	780	1.0567	0.6633	0.5944
0.676	3.62	858	0.9788	0.6967	0.6457
0.6323	3.95	936	0.9743	0.7133	0.6946
0.609	4.28	1014	1.0422	0.6967	0.6768
0.6942	4.61	1092	1.1858	0.6833	0.6661
0.5759	4.94	1170	1.1483	0.7233	0.7183
0.4296	5.27	1248	1.0037	0.73	0.7224
0.4322	5.59	1326	0.7829	0.8067	0.8046
0.4092	5.92	1404	0.8609	0.7767	0.7743
0.352	6.25	1482	1.1247	0.72	0.7128
0.2858	6.58	1560	0.9369	0.76	0.7500
0.2945	6.91	1638	1.2018	0.7267	0.7083
0.329	7.24	1716	0.9690	0.7767	0.7786
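
The validation metrics fluctuate between evaluations, so the last logged row is not necessarily the best one. A minimal sketch of selecting the best row by validation accuracy, using an excerpt of the rows above; the column names are hypothetical labels for the table's columns. Which checkpoint was actually published is not stated in the card.

```python
# Excerpt of the training results table (same column order as the table).
LOG = """\
0.5759 4.94 1170 1.1483 0.7233 0.7183
0.4296 5.27 1248 1.0037 0.73 0.7224
0.4322 5.59 1326 0.7829 0.8067 0.8046
0.4092 5.92 1404 0.8609 0.7767 0.7743
0.352 6.25 1482 1.1247 0.72 0.7128
0.2858 6.58 1560 0.9369 0.76 0.7500
0.2945 6.91 1638 1.2018 0.7267 0.7083
0.329 7.24 1716 0.9690 0.7767 0.7786
"""

COLUMNS = ("train_loss", "epoch", "step", "val_loss", "accuracy", "f1")


def parse_rows(text):
    """Turn whitespace-separated log lines into a list of dicts keyed by COLUMNS."""
    rows = []
    for line in text.strip().splitlines():
        row = dict(zip(COLUMNS, (float(v) for v in line.split())))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


rows = parse_rows(LOG)
best = max(rows, key=lambda r: r["accuracy"])
print(best["step"], best["accuracy"], best["val_loss"])  # 1326 0.8067 0.7829
```

In this excerpt, step 1326 has both the highest accuracy and the lowest validation loss.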

Framework versions

Transformers 4.28.1
PyTorch 2.0.0+cu118
Datasets 2.12.0
Tokenizers 0.13.3
