wav2vec2-common_voice-tr-demo-dist Open-source Speech Recognition Model

Developed by gary109

This automatic speech recognition (ASR) model is a version of facebook/wav2vec2-large-xlsr-53 fine-tuned on the Turkish portion of the COMMON_VOICE dataset. It achieves a word error rate (WER) of 33.05% on the evaluation set.

Speech Recognition

Transformers

License: Apache-2.0 #Turkish speech recognition #Multi-GPU fine-tuning #Low word error rate

Release Time : 4/12/2022

Model Overview

A speech recognition model optimized for Turkish, suitable for converting Turkish audio into text.

Model Features

Turkish Optimization

Fine-tuned on Turkish speech data so that it adapts to the phonetic characteristics of Turkish.

Based on wav2vec2 Architecture

Uses facebook/wav2vec2-large-xlsr-53 as the base model, a wav2vec 2.0 model pretrained on speech in 53 languages.

Multi-GPU Training

Trained in a distributed multi-GPU setup on 2 GPUs, with a total training batch size of 8.

Model Capabilities

Turkish audio to text conversion

Continuous speech recognition

Speech content transcription
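As an illustration of the capabilities listed above, here is a minimal sketch of transcribing a Turkish audio file with the Transformers `pipeline` API. The Hub identifier `gary109/wav2vec2-common_voice-tr-demo-dist` is an assumption pieced together from the developer and model names on this page, and the helper name `transcribe` and the file name in the usage note are hypothetical.

```python
# Assumed Hub ID: combined from the developer name and model name on this page.
MODEL_ID = "gary109/wav2vec2-common_voice-tr-demo-dist"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of one audio file (hypothetical helper)."""
    # Imported lazily so that defining this function does not require
    # transformers to be installed or the model to be downloaded.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # The ASR pipeline returns a dict whose "text" key holds the transcript.
    return asr(audio_path)["text"]
```

Calling `transcribe("toplanti.wav")` on a local recording would download the model on first use. Decoding audio files through the pipeline also needs ffmpeg installed.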

Use Cases

Speech Transcription

Turkish Meeting Minutes

Automatically convert Turkish meeting recordings into text transcripts

Word error rate on the evaluation set: 33.05%

Voice Assistant

Provide speech recognition capabilities for Turkish voice assistants

🚀 wav2vec2-common_voice-tr-demo-dist

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - TR dataset. It performs automatic speech recognition for Turkish and achieves the following results on the evaluation set:

Loss: 0.3934
Wer: 0.3305
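For readers unfamiliar with the metric, the sketch below shows the standard definition of word error rate: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is an illustration only; this page does not say which tool computed the reported value, and the function name and example sentences are made up for the demonstration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words.
print(word_error_rate("bugün hava çok güzel", "bugün hava güzel"))
# Insertions count as errors, so WER can exceed 1.0.
print(word_error_rate("merhaba", "mer ha ba"))
```

Because insertions are counted against a fixed number of reference words, a WER slightly above 1.0 early in training is possible and does not indicate a logging error.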

📚 Documentation

Model Information

Property	Details
Model Type	Fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - TR dataset
Tags	automatic-speech-recognition, common_voice, generated_from_trainer
Datasets	common_voice

License

This project is licensed under the Apache-2.0 license.

Training Procedure

Training Hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 4
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 2
total_train_batch_size: 8
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 15.0
mixed_precision_training: Native AMP
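The batch-size and scheduler entries above fit together as follows. This is a sketch, not the training script: it recomputes the total batch sizes from the per-device values and models a linear schedule with warmup. The value `total_steps=6525` is an assumption inferred from the results table (roughly 435 steps per epoch over 15 epochs); the exact step count is not stated on this page.

```python
# Values copied from the hyperparameter list.
TRAIN_BATCH_SIZE = 4
EVAL_BATCH_SIZE = 8
NUM_DEVICES = 2

# Each device processes its own batch, so the totals scale with device count.
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES
total_eval_batch_size = EVAL_BATCH_SIZE * NUM_DEVICES


def linear_schedule_lr(
    step: int,
    base_lr: float = 3e-4,
    warmup_steps: int = 500,
    total_steps: int = 6525,  # assumption: inferred, not stated in the card
) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


print(total_train_batch_size, total_eval_batch_size)
for s in (0, 250, 500, 3500, 6525):
    print(s, linear_schedule_lr(s))
```

Under these assumptions the learning rate peaks at 0.0003 at step 500 and then falls steadily for the rest of training.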

Training Results

Training Loss	Epoch	Step	Validation Loss	Wer
3.5459	0.23	100	3.6773	1.0
3.2247	0.46	200	3.1491	0.9999
2.3457	0.69	300	2.4236	1.0041
0.9149	0.92	400	0.9471	0.7684
0.6622	1.15	500	0.7518	0.6863
0.7205	1.38	600	0.6387	0.6402
0.6978	1.61	700	0.5611	0.5739
0.5317	1.84	800	0.5061	0.5418
0.5222	2.07	900	0.4839	0.5344
0.4467	2.3	1000	0.5060	0.5339
0.3196	2.53	1100	0.4619	0.5213
0.276	2.76	1200	0.4595	0.5020
0.3569	2.99	1300	0.4339	0.4901
0.2236	3.22	1400	0.4602	0.4887
0.293	3.45	1500	0.4376	0.4639
0.1677	3.68	1600	0.4371	0.4605
0.1838	3.91	1700	0.4116	0.4589
0.1225	4.14	1800	0.4144	0.4495
0.2301	4.37	1900	0.4250	0.4567
0.1931	4.6	2000	0.4081	0.4470
0.1427	4.83	2100	0.4295	0.4482
0.361	5.06	2200	0.4374	0.4445
0.3272	5.29	2300	0.4088	0.4258
0.3686	5.52	2400	0.4087	0.4258
0.3087	5.75	2500	0.4100	0.4371
0.4637	5.98	2600	0.4038	0.4219
0.1485	6.21	2700	0.4361	0.4197
0.1341	6.44	2800	0.4217	0.4132
0.1185	6.67	2900	0.4244	0.4097
0.1588	6.9	3000	0.4212	0.4181
0.0697	7.13	3100	0.3981	0.4073
0.0491	7.36	3200	0.3992	0.4010
0.088	7.59	3300	0.4206	0.4022
0.0731	7.82	3400	0.3998	0.3841
0.2767	8.05	3500	0.4195	0.3829
0.1725	8.28	3600	0.4167	0.3946
0.1242	8.51	3700	0.4177	0.3821
0.1133	8.74	3800	0.3993	0.3802
0.1952	8.97	3900	0.4132	0.3904
0.1399	9.2	4000	0.4010	0.3795
0.047	9.43	4100	0.4128	0.3703
0.049	9.66	4200	0.4319	0.3670
0.0994	9.89	4300	0.4118	0.3631
0.1209	10.11	4400	0.4296	0.3722
0.0484	10.34	4500	0.4130	0.3615
0.2065	10.57	4600	0.3958	0.3668
0.133	10.8	4700	0.4102	0.3679
0.0622	11.03	4800	0.4137	0.3585
0.0999	11.26	4900	0.4042	0.3583
0.0346	11.49	5000	0.4183	0.3573
0.072	11.72	5100	0.4060	0.3530
0.0365	11.95	5200	0.3968	0.3483
0.0615	12.18	5300	0.3958	0.3485
0.1067	12.41	5400	0.3987	0.3453
0.0253	12.64	5500	0.4182	0.3405
0.0636	12.87	5600	0.4199	0.3458
0.0506	13.1	5700	0.4056	0.3412
0.0944	13.33	5800	0.4061	0.3381
0.1187	13.56	5900	0.4113	0.3381
0.0237	13.79	6000	0.3973	0.3343
0.0166	14.02	6100	0.4001	0.3357
0.1189	14.25	6200	0.3931	0.3315
0.0375	14.48	6300	0.3944	0.3329
0.0537	14.71	6400	0.3953	0.3308
0.045	14.94	6500	0.3933	0.3303
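To compare checkpoints, the rows can be scanned for the lowest WER and the lowest validation loss. The sketch below copies only the last six rows of the table (steps 6000 to 6500); the tuple layout and variable names are illustrative.

```python
# (step, validation_loss, wer) copied from the final six table rows.
rows = [
    (6000, 0.3973, 0.3343),
    (6100, 0.4001, 0.3357),
    (6200, 0.3931, 0.3315),
    (6300, 0.3944, 0.3329),
    (6400, 0.3953, 0.3308),
    (6500, 0.3933, 0.3303),
]

# Pick the checkpoint that minimizes each metric.
best_by_wer = min(rows, key=lambda row: row[2])
best_by_loss = min(rows, key=lambda row: row[1])

print("lowest WER:", best_by_wer)
print("lowest validation loss:", best_by_loss)
```

Among these rows the two metrics favor different checkpoints, which is worth keeping in mind when choosing which criterion to select a checkpoint on.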

Framework Versions

Transformers 4.18.0
Pytorch 1.9.1+cu102
Datasets 1.13.3
Tokenizers 0.11.6
