malaysian-whisper-base開源語音識別模型 - 免費支持馬來語和英語識別

首頁

Malaysian Whisper Base

由mesolitica開發

基於馬來西亞數據集微調的Whisper基礎模型，支持馬來語和英語的語音識別

語音識別

Transformers

支持多種語言#馬來語語音識別 #多方言支持 #英語-馬來語雙語

下載量 143

發布時間 : 1/1/2024

模型概述

該模型是基於Whisper架構的語音識別模型，專門針對馬來西亞地區的馬來語和英語進行了微調，適用於馬來西亞口音和方言的語音轉文字任務。

模型特點

馬來西亞語言優化

專門針對馬來西亞地區的馬來語和英語口音進行優化，包括標準馬來語和方言

多源訓練數據

使用了包括IMDA語音轉文字數據集、馬來西亞YouTube視頻偽標註數據集等多種數據源進行訓練

雙語支持

同時支持馬來語和英語的語音識別，包括馬來式英語

時間戳支持

能夠生成帶時間戳的轉錄結果

模型能力

馬來語語音識別

英語語音識別

帶時間戳的轉錄

馬來西亞口音識別

使用案例

語音轉錄

會議記錄

將馬來西亞地區的會議錄音自動轉錄為文字

準確識別馬來西亞口音的馬來語和英語

媒體內容字幕生成

為馬來西亞YouTube視頻自動生成字幕

支持方言和當地口音的識別

語音分析

語音數據分析

分析馬來西亞地區的語音數據以獲取洞察

能夠處理馬來西亞特有的語言變體

🚀 馬來西亞微調版Whisper Base

本項目在馬來西亞數據集上對Whisper Base進行微調，旨在提升其在馬來西亞相關語音識別任務中的性能。該項目解決了在馬來西亞多語言語音場景下，現有語音識別模型識別準確率不高的問題，為馬來西亞地區的語音處理提供了更精準、更適配的解決方案。

🚀 快速開始

本項目在以下數據集上對Whisper Base進行微調：

IMDA STT，數據集鏈接
偽標籤馬來西亞YouTube視頻，數據集鏈接
馬來語對話語音語料庫，數據集鏈接
Haqkiem TTS數據集，此為私有數據集，你可以從這裡請求訪問權限
偽標籤努山塔拉有聲讀物，數據集鏈接

腳本鏈接：https://github.com/mesolitica/malaya-speech/tree/malaysian-speech/session/whisper

Wandb鏈接：https://wandb.ai/huseinzol05/malaysian-whisper-base?workspace=user-huseinzol05

Wandb報告鏈接：https://wandb.ai/huseinzol05/malaysian-whisper-base/reports/Finetune-Whisper--Vmlldzo2Mzg2NDgx

✨ 主要特性

支持的微調語言

ms，馬來語，包括標準馬來語和當地馬來語。
en，英語，包括標準英語和馬來西亞式英語（Manglish）。

💻 使用示例

基礎用法

from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq, pipeline
from datasets import Audio
import requests

sr = 16000
audio = Audio(sampling_rate=sr)

processor = AutoProcessor.from_pretrained("mesolitica/malaysian-whisper-base")
model = AutoModelForSpeechSeq2Seq.from_pretrained("mesolitica/malaysian-whisper-base")

r = requests.get('https://huggingface.co/datasets/huseinzol05/malaya-speech-stt-test-set/resolve/main/test.mp3')
y = audio.decode_example(audio.encode_example(r.content))['array']
inputs = processor([y], return_tensors = 'pt')
r = model.generate(inputs['input_features'], language='ms', return_timestamps=True)
processor.tokenizer.decode(r[0])

輸出結果：

'<|startoftranscript|><|ms|><|transcribe|> Zamily On Aging di Vener Australia, Australia yang telah diadakan pada tahun 1982 dan berasaskan unjuran tersebut maka jabatan perangkaan Malaysia menganggarkan menjelang tahun 2005 sejumlah 15% penduduk kita adalah daripada kalangan warga emas. Untuk makluman Tuan Yang Pertua dan juga Alian Bohon, pembangunan sistem pendafiran warga emas ataupun kita sebutkan event adalah usaha kerajaan ke arah merealisasikan objektif yang telah digangkatkan<|endoftext|>'

高級用法

r = model.generate(inputs['input_features'], language='en', return_timestamps=True)
processor.tokenizer.decode(r[0])

輸出結果：

<|startoftranscript|><|en|><|transcribe|> Assembly on Aging, Divina Australia, Australia, which has been provided in 1982 and the operation of the transportation of Malaysia's implementation to prevent the tourism of the 25th, 15% of our population is from the market. For the information of the President and also the respected, the development of the market system or we have made an event.<|endoftext|>