20220517-150219 - Open-Source Speech Recognition Model

Developed by lilitket

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-xls-r-300m, supporting automatic speech recognition (ASR) tasks.

Task: Speech Recognition

Library: Transformers

Open Source License: Apache-2.0

Release Time: 5/17/2022

Model Overview

A speech recognition model based on the wav2vec2-xls-r-300m architecture, achieving a word error rate of 0.2344 and a character error rate of 0.0434 on the evaluation set after fine-tuning.
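
The two metrics quoted above are edit-distance ratios: WER counts word-level substitutions, insertions, and deletions against the number of reference words, and CER does the same over characters. The model card does not say which tool produced its figures, so the following is only a minimal pure-Python sketch of the standard definitions. The function names and the example sentences are illustrative assumptions, not part of the model card.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis prefix: i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; spaces count as characters in this sketch."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example pair, not taken from the model's evaluation set.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER = {wer(reference, hypothesis):.4f}")
print(f"CER = {cer(reference, hypothesis):.4f}")
```

On this toy pair there is one substituted word and one deleted word out of six reference words, so the WER is 2/6. A reported WER of 0.2344 therefore means that about 23 word-level edits per 100 reference words were needed on the evaluation set.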

Model Features

Low Word Error Rate

Achieved a word error rate of 0.2344 (23.44%) on the evaluation set at the final logged training step

Low Character Error Rate

Achieved a character error rate of 0.0434 (4.34%) on the evaluation set at the final logged training step

Based on Large-Scale Pre-trained Model

Fine-tuned from facebook/wav2vec2-xls-r-300m, a 300M-parameter wav2vec 2.0 model pre-trained on multilingual speech, and reuses its learned speech representations

Model Capabilities

Speech-to-Text

Automatic Speech Recognition

Use Cases

Speech Transcription

Automatic Meeting Minutes Transcription

Automatically convert meeting recordings into text transcripts

Word error rate of 23.44% on the evaluation set (roughly one word in four differs from the reference)

Voice Note Conversion

Convert voice notes into editable text

Character error rate of 4.34% on the evaluation set

🚀 20220517-150219

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) for automatic speech recognition.

🚀 Quick Start

The fine-tuning dataset is not named in the model card (it is recorded as None). The model achieves the following results on the evaluation set:

Loss: 0.2426
Wer: 0.2344
Cer: 0.0434
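
The model card gives no usage snippet. As a rough sketch, a wav2vec2 CTC checkpoint like this one can usually be loaded through the Transformers `pipeline` API. The repository id `lilitket/20220517-150219` is an assumption inferred from the developer and model names on this page, `transcribe` is a hypothetical helper, and `sample.wav` is a hypothetical file. Decoding audio from a file path also requires ffmpeg to be installed.

```python
# Assumed repository id, inferred from the developer and model names;
# verify it on the Hugging Face Hub before relying on it.
MODEL_ID = "lilitket/20220517-150219"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one audio file (hypothetical helper)."""
    # Imported lazily so this module loads even where transformers is absent.
    from transformers import pipeline

    # Downloads the checkpoint on first use and builds an ASR pipeline.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    # For a file path, the pipeline decodes the audio and returns a dict
    # whose "text" entry holds the transcript.
    return asr(audio_path)["text"]


# Example call (needs network access, transformers, torch, and ffmpeg):
# print(transcribe("sample.wav"))
```

Because the training language and dataset are not stated, it is worth testing the model on a few representative recordings before using it for a particular task.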

🔧 Technical Details

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 4
eval_batch_size: 8
seed: 1339
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 2
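
With `lr_scheduler_type: linear` and 100 warmup steps, the learning rate in the Transformers Trainer ramps linearly from 0 to the peak value and then decays linearly to 0 at the final step. The sketch below reproduces that shape in plain Python. The total step count is not stated in the model card, so `TOTAL_STEPS = 21800` is an assumption extrapolated from the last logged row of the results table (step 21600 at epoch 1.98), and `linear_lr` is a hypothetical helper name.

```python
BASE_LR = 1e-4        # learning_rate from the hyperparameter list
WARMUP_STEPS = 100    # lr_scheduler_warmup_steps from the hyperparameter list
TOTAL_STEPS = 21800   # ASSUMPTION: estimated, not stated in the model card


def linear_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step for linear warmup and decay."""
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 50, 100, 10950, 21800):
    print(step, linear_lr(step))
```

Under these assumptions the peak rate of 0.0001 is reached at step 100, and the rate has fallen to half of that by the midpoint of the decay phase (step 10950).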

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
5.3867	0.02	200	3.2171	1.0	1.0
3.1288	0.04	400	2.9394	1.0	1.0
1.8298	0.06	600	0.9138	0.8416	0.2039
0.9751	0.07	800	0.6568	0.6928	0.1566
0.7934	0.09	1000	0.5314	0.6225	0.1277
0.663	0.11	1200	0.4759	0.5730	0.1174
0.617	0.13	1400	0.4515	0.5578	0.1118
0.5473	0.15	1600	0.4017	0.5157	0.1004
0.5283	0.17	1800	0.3872	0.5094	0.0982
0.4893	0.18	2000	0.3725	0.4860	0.0932
0.495	0.2	2200	0.3580	0.4542	0.0878
0.4438	0.22	2400	0.3443	0.4366	0.0858
0.4425	0.24	2600	0.3428	0.4284	0.0865
0.4293	0.26	2800	0.3329	0.4221	0.0819
0.3779	0.28	3000	0.3278	0.4146	0.0794
0.4116	0.29	3200	0.3242	0.4107	0.0757
0.3912	0.31	3400	0.3217	0.4040	0.0776
0.391	0.33	3600	0.3127	0.3955	0.0764
0.3696	0.35	3800	0.3153	0.3892	0.0748
0.3576	0.37	4000	0.3156	0.3846	0.0737
0.3553	0.39	4200	0.3024	0.3814	0.0726
0.3394	0.4	4400	0.3022	0.3637	0.0685
0.3345	0.42	4600	0.3130	0.3641	0.0698
0.3357	0.44	4800	0.2913	0.3602	0.0701
0.3411	0.46	5000	0.2941	0.3514	0.0674
0.3031	0.48	5200	0.3043	0.3613	0.0685
0.3305	0.5	5400	0.2967	0.3468	0.0657
0.3004	0.51	5600	0.2723	0.3309	0.0616
0.31	0.53	5800	0.2835	0.3404	0.0648
0.3224	0.55	6000	0.2743	0.3358	0.0622
0.3261	0.57	6200	0.2803	0.3358	0.0620
0.305	0.59	6400	0.2835	0.3397	0.0629
0.3025	0.61	6600	0.2684	0.3340	0.0639
0.2952	0.62	6800	0.2654	0.3256	0.0617
0.2903	0.64	7000	0.2588	0.3174	0.0596
0.2907	0.66	7200	0.2789	0.3256	0.0623
0.2887	0.68	7400	0.2634	0.3142	0.0605
0.291	0.7	7600	0.2644	0.3097	0.0582
0.2646	0.72	7800	0.2753	0.3089	0.0582
0.2683	0.73	8000	0.2703	0.3036	0.0574
0.2808	0.75	8200	0.2544	0.2994	0.0561
0.2724	0.77	8400	0.2584	0.3051	0.0592
0.2516	0.79	8600	0.2575	0.2959	0.0557
0.2561	0.81	8800	0.2594	0.2945	0.0552
0.264	0.83	9000	0.2607	0.2987	0.0552
0.2383	0.84	9200	0.2641	0.2983	0.0546
0.2548	0.86	9400	0.2714	0.2930	0.0538
0.2284	0.88	9600	0.2542	0.2945	0.0555
0.2354	0.9	9800	0.2564	0.2937	0.0551
0.2624	0.92	10000	0.2466	0.2891	0.0542
0.24	0.94	10200	0.2404	0.2895	0.0528
0.2372	0.95	10400	0.2590	0.2782	0.0518
0.2357	0.97	10600	0.2629	0.2867	0.0531
0.2439	0.99	10800	0.2722	0.2902	0.0556
0.2204	1.01	11000	0.2618	0.2856	0.0535
0.2043	1.03	11200	0.2662	0.2789	0.0520
0.2081	1.05	11400	0.2744	0.2831	0.0532
0.199	1.06	11600	0.2586	0.2800	0.0519
0.2063	1.08	11800	0.2711	0.2842	0.0531
0.2116	1.1	12000	0.2463	0.2782	0.0529
0.2095	1.12	12200	0.2371	0.2757	0.0510
0.1786	1.14	12400	0.2693	0.2768	0.0520
0.1999	1.16	12600	0.2625	0.2793	0.0513
0.1985	1.17	12800	0.2734	0.2796	0.0532
0.187	1.19	13000	0.2654	0.2676	0.0514
0.188	1.21	13200	0.2548	0.2648	0.0489
0.1853	1.23	13400	0.2684	0.2641	0.0509
0.197	1.25	13600	0.2589	0.2662	0.0507
0.1873	1.27	13800	0.2633	0.2686	0.0516
0.179	1.28	14000	0.2682	0.2598	0.0508
0.2008	1.3	14200	0.2505	0.2609	0.0493
0.1802	1.32	14400	0.2470	0.2598	0.0493
0.1903	1.34	14600	0.2572	0.2672	0.0500
0.1852	1.36	14800	0.2576	0.2633	0.0491
0.1933	1.38	15000	0.2649	0.2602	0.0493
0.191	1.4	15200	0.2578	0.2612	0.0484
0.1863	1.41	15400	0.2572	0.2566	0.0488
0.1785	1.43	15600	0.2661	0.2520	0.0478
0.1755	1.45	15800	0.2637	0.2605	0.0485
0.1677	1.47	16000	0.2481	0.2559	0.0478
0.1633	1.49	16200	0.2584	0.2531	0.0476
0.166	1.51	16400	0.2576	0.2595	0.0487
0.1798	1.52	16600	0.2517	0.2570	0.0488
0.1879	1.54	16800	0.2555	0.2531	0.0479
0.1636	1.56	17000	0.2419	0.2467	0.0464
0.1706	1.58	17200	0.2426	0.2457	0.0463
0.1763	1.6	17400	0.2427	0.2496	0.0467
0.1687	1.62	17600	0.2507	0.2496	0.0467
0.1662	1.63	17800	0.2553	0.2474	0.0466
0.1637	1.65	18000	0.2576	0.2450	0.0461
0.1744	1.67	18200	0.2394	0.2414	0.0454
0.1597	1.69	18400	0.2442	0.2443	0.0452
0.1606	1.71	18600	0.2488	0.2435	0.0453
0.1558	1.73	18800	0.2563	0.2464	0.0464
0.172	1.74	19000	0.2501	0.2411	0.0452
0.1594	1.76	19200	0.2481	0.2460	0.0458
0.1732	1.78	19400	0.2427	0.2414	0.0443
0.1706	1.8	19600	0.2367	0.2418	0.0446
0.1724	1.82	19800	0.2376	0.2390	0.0444
0.1621	1.84	20000	0.2430	0.2382	0.0438
0.1501	1.85	20200	0.2445	0.2404	0.0438
0.1526	1.87	20400	0.2472	0.2361	0.0436
0.1756	1.89	20600	0.2431	0.2400	0.0437
0.1598	1.91	20800	0.2472	0.2368	0.0439
0.1554	1.93	21000	0.2431	0.2347	0.0435
0.1354	1.95	21200	0.2427	0.2354	0.0438
0.1587	1.96	21400	0.2427	0.2347	0.0435
0.1541	1.98	21600	0.2426	0.2344	0.0434
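
A log like this is easier to inspect programmatically. The following minimal sketch parses a tab-separated excerpt of the table above with the standard `csv` module and picks out the rows with the lowest WER and the lowest validation loss. Only five rows are embedded to keep the example short, and the names `EXCERPT` and `load_rows` are illustrative assumptions.

```python
import csv
import io

# Five rows copied from the training results table (tab-separated excerpt).
EXCERPT = "\n".join([
    "Training Loss\tEpoch\tStep\tValidation Loss\tWer\tCer",
    "5.3867\t0.02\t200\t3.2171\t1.0\t1.0",
    "0.2624\t0.92\t10000\t0.2466\t0.2891\t0.0542",
    "0.1706\t1.8\t19600\t0.2367\t0.2418\t0.0446",
    "0.1554\t1.93\t21000\t0.2431\t0.2347\t0.0435",
    "0.1541\t1.98\t21600\t0.2426\t0.2344\t0.0434",
])


def load_rows(text):
    """Parse tab-separated log text into a list of dicts of floats."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{key: float(value) for key, value in row.items()} for row in reader]


rows = load_rows(EXCERPT)
best_wer = min(rows, key=lambda r: r["Wer"])
best_loss = min(rows, key=lambda r: r["Validation Loss"])

print("lowest WER at step", int(best_wer["Step"]), "->", best_wer["Wer"])
print("lowest validation loss at step", int(best_loss["Step"]),
      "->", best_loss["Validation Loss"])
```

In this excerpt, as in the full table, the last logged step (21600) has the lowest WER, while the lowest validation loss (0.2367) occurs earlier, at step 19600. Validation loss and WER therefore do not select the same checkpoint here.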

Framework versions

Transformers 4.18.0.dev0
Pytorch 1.10.0+cu113
Datasets 2.1.0
Tokenizers 0.11.6

📄 License

This model is licensed under the Apache-2.0 license.
