malaysian-whisper-base: open-source speech recognition model with free Malay and English recognition


Malaysian Whisper Base

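Developed by mesolitica

A Whisper Base model fine-tuned on Malaysian datasets, supporting speech recognition in Malay and English

Speech recognition

Transformers

Multilingual #Malay speech recognition #Multi-dialect support #English-Malay bilingual

Downloads: 143

Release date: 1/1/2024

Model overview

This is a speech recognition model built on the Whisper architecture and fine-tuned specifically for the Malay and English spoken in Malaysia. It is suited to speech-to-text tasks involving Malaysian accents and dialects.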

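Model features

Optimized for Malaysian languages

Tuned for the Malay and English accents found in Malaysia, covering both standard Malay and local dialects

Multi-source training data

Trained on several data sources, including the IMDA speech-to-text dataset and a pseudolabeled dataset of Malaysian YouTube videos

Bilingual support

Recognizes both Malay and English speech, including Malaysian English

Timestamp support

Can produce transcripts with timestamps

Model capabilities

Malay speech recognition

English speech recognition

Timestamped transcription

Malaysian accent recognition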

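Use cases

Speech transcription

Meeting notes

Automatically transcribe meeting recordings from Malaysia into text

Recognizes Malay and English spoken with Malaysian accents

Subtitle generation for media content

Automatically generate subtitles for Malaysian YouTube videos

Handles dialects and local accents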
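For the subtitle use case, timestamped segments still have to be written out in a subtitle format. The following is a minimal sketch, assuming the transcript is already available as `(start, end, text)` tuples in seconds and that SRT is the target format. The helper names `format_srt_time` and `segments_to_srt` and the example sentences are hypothetical and are not part of the model or of any library.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from an iterable of (start, end, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_line = f"{format_srt_time(start)} --> {format_srt_time(end)}"
        blocks.append(f"{index}\n{time_line}\n{text.strip()}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Made-up segments for illustration only (not model output).
example_segments = [
    (0.0, 4.5, "Selamat pagi semua."),
    (4.5, 9.25, "Let's start the meeting."),
]
print(segments_to_srt(example_segments))
```

The resulting string can be saved with a `.srt` extension and loaded by most video players.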

Speech analysis

Speech data analysis

Analyze speech data from Malaysia to extract insights

Handles language varieties specific to Malaysia

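🚀 Malaysian fine-tuned Whisper Base

This project fine-tunes Whisper Base on Malaysian datasets to improve its performance on speech recognition tasks in Malaysia. It targets the lower recognition accuracy that existing speech recognition models show on Malaysia's multilingual speech, and aims to provide a more accurate, better-adapted option for speech processing in the region.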

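🚀 Quick start

This project fine-tunes Whisper Base on the following datasets:

IMDA STT, dataset link
Pseudolabeled Malaysian YouTube videos, dataset link
Malay Conversational Speech Corpus, dataset link
Haqkiem TTS dataset, a private dataset; you can request access here
Pseudolabeled Nusantara audiobooks, dataset link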

Script link: https://github.com/mesolitica/malaya-speech/tree/malaysian-speech/session/whisper

Wandb link: https://wandb.ai/huseinzol05/malaysian-whisper-base?workspace=user-huseinzol05

Wandb report link: https://wandb.ai/huseinzol05/malaysian-whisper-base/reports/Finetune-Whisper--Vmlldzo2Mzg2NDgx

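✨ Main features

Languages covered by fine-tuning

ms, Malay, including standard Malay and local Malay.
en, English, including standard English and Malaysian English (Manglish).

💻 Usage examples

Basic usage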

from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
from datasets import Audio
import requests

# Whisper models expect 16 kHz audio.
sr = 16000
audio = Audio(sampling_rate=sr)

processor = AutoProcessor.from_pretrained("mesolitica/malaysian-whisper-base")
model = AutoModelForSpeechSeq2Seq.from_pretrained("mesolitica/malaysian-whisper-base")

# Download a test clip and decode it into a 16 kHz waveform.
response = requests.get('https://huggingface.co/datasets/huseinzol05/malaya-speech-stt-test-set/resolve/main/test.mp3')
response.raise_for_status()
y = audio.decode_example(audio.encode_example(response.content))['array']

# Extract input features, then transcribe with Malay as the target language.
inputs = processor([y], sampling_rate=sr, return_tensors='pt')
generated = model.generate(inputs['input_features'], language='ms', return_timestamps=True)
processor.tokenizer.decode(generated[0])

Output:

'<|startoftranscript|><|ms|><|transcribe|> Zamily On Aging di Vener Australia, Australia yang telah diadakan pada tahun 1982 dan berasaskan unjuran tersebut maka jabatan perangkaan Malaysia menganggarkan menjelang tahun 2005 sejumlah 15% penduduk kita adalah daripada kalangan warga emas. Untuk makluman Tuan Yang Pertua dan juga Alian Bohon, pembangunan sistem pendafiran warga emas ataupun kita sebutkan event adalah usaha kerajaan ke arah merealisasikan objektif yang telah digangkatkan<|endoftext|>'
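The decoded string above still carries Whisper control tokens such as `<|startoftranscript|>` and `<|endoftext|>`. One option is to pass `skip_special_tokens=True` when decoding. The sketch below shows an alternative that cleans an already decoded string with a regular expression; the helper name `strip_special_tokens` is hypothetical, and the sample is a shortened excerpt of the output above.

```python
import re

# Matches Whisper-style control tokens such as <|ms|> or <|endoftext|>.
SPECIAL_TOKEN = re.compile(r"<\|[^|]*\|>")


def strip_special_tokens(decoded: str) -> str:
    """Remove <|...|> control tokens and surrounding whitespace."""
    return SPECIAL_TOKEN.sub("", decoded).strip()


# Shortened excerpt of the decoded output, used only as sample input.
sample = (
    "<|startoftranscript|><|ms|><|transcribe|> "
    "Untuk makluman Tuan Yang Pertua<|endoftext|>"
)
print(strip_special_tokens(sample))
```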

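Advanced usage

# Reuse the same input features, but set English as the target language.
generated = model.generate(inputs['input_features'], language='en', return_timestamps=True)
processor.tokenizer.decode(generated[0])

Output: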

<|startoftranscript|><|en|><|transcribe|> Assembly on Aging, Divina Australia, Australia, which has been provided in 1982 and the operation of the transportation of Malaysia's implementation to prevent the tourism of the 25th, 15% of our population is from the market. For the information of the President and also the respected, the development of the market system or we have made an event.<|endoftext|>
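Both calls pass `return_timestamps=True`, yet the decoded strings shown contain no timestamp markers. As a rough sketch, assuming the text is decoded so that Whisper-style markers such as `<|0.00|>` are kept (for instance with the tokenizer's `decode_with_timestamps=True` option), the markers can be turned into `(start, end, text)` segments. The function name `parse_timestamped_text` and the sample string are hypothetical and are not real model output.

```python
import re

# Whisper-style timestamp marker, e.g. <|2.50|>.
TIMESTAMP = re.compile(r"<\|(\d+\.\d+)\|>")


def parse_timestamped_text(decoded: str):
    """Return (start, end, text) tuples from text with <|t|> markers."""
    # Splitting on a pattern with one capture group yields
    # [prefix, t1, text1, t2, text2, ...].
    parts = TIMESTAMP.split(decoded)
    times = [float(t) for t in parts[1::2]]
    texts = parts[2::2]
    segments = []
    for i in range(len(times) - 1):
        text = texts[i].strip()
        # Empty text sits between the end marker of one segment
        # and the start marker of the next; skip it.
        if text:
            segments.append((times[i], times[i + 1], text))
    return segments


# Hypothetical decoded string for illustration only.
sample = (
    "<|startoftranscript|><|ms|><|transcribe|>"
    "<|0.00|> Selamat pagi.<|2.50|><|2.50|> Good morning.<|5.00|><|endoftext|>"
)
for start, end, text in parse_timestamped_text(sample):
    print(f"[{start:.2f} - {end:.2f}] {text}")
```

Text before the first marker and after the last one (the control tokens) is ignored, so no separate cleanup step is needed for this sample.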