asr-crdnn-german: Open-Source German Speech Recognition Model - Accurate Speech-to-Text with a Low Error Rate

Asr Crdnn German

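Developed by jfreiwa

A German ASR model trained on Mozilla Common Voice 6.1, the German Spoken Wikipedia Corpus, and the m-ailabs corpus, with a word error rate of 7.24%.

Speech recognition

PyTorch

German #GermanSpeechTranscription #NoLanguageModel #MultiCorpusTraining

Downloads: 17

Published: 3/29/2022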

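Model Overview

This is a German automatic speech recognition (ASR) model. It uses the CRDNN architecture and converts German speech to text.

Model Features

Trained on multiple data sources

Combines three high-quality German speech datasets: Mozilla Common Voice, the German Spoken Wikipedia Corpus, and m-ailabs

Low word error rate

Achieves a word error rate (WER) of 7.24% on the test set

Open-source implementation

The complete training code and the pretrained model are open source on GitHub

Model Capabilities

German speech-to-text

Long-form audio transcription

Real-time speech recognition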

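Use Cases

Speech transcription

Meeting minutes

Automatically convert recordings of German-language meetings into written records

Word accuracy of roughly 92.76% on the model's test set (100% minus the 7.24% WER); accuracy on real meeting recordings is not guaranteed

Subtitle generation

Automatically generate subtitles for German video content

Voice assistants

German voice command recognition

Serves as the speech recognition module of German voice control systems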

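🚀 German Automatic Speech Recognition Model

This is a model for German automatic speech recognition. It was trained on the datasets listed below and converts speech to text, supporting applications that process spoken German.

🚀 Quick Start

The model was trained on Mozilla Common Voice 6.1, the Spoken Wikipedia Corpus, and the m-ailabs corpus.

Spoken Wikipedia Corpus: https://nats.gitlab.io/swc/
Mozilla Common Voice: https://commonvoice.mozilla.org/de/datasets
m-ailabs: https://www.caito.de/2019/01/03/the-m-ailabs-speech-dataset/

No language model is provided. The training code is available on GitHub.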

✨ Key Features

Strong performance: the model achieves a word error rate (WER) of 7.24%. (A newer version of this model is also available.)
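
For readers unfamiliar with the metric: WER is conventionally the word-level edit distance between the reference transcript and the model output (substitutions, deletions, and insertions), divided by the number of reference words. The sketch below illustrates that standard definition only. It is not the evaluation script used for this model, and the function name `word_error_rate` and the sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # At the start of row i, prev[j] is the edit distance between the first
    # i-1 reference words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one of five reference words is missing.
    reference = "das ist ein kleiner test"
    hypothesis = "das ist ein test"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")  # WER: 20.00%
```

Under this definition, a WER of 7.24% corresponds to about 7 word errors per 100 reference words.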

📦 Installation

Install SpeechBrain

First, install SpeechBrain with the following command:

pip install speechbrain
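
To confirm that the installation succeeded before loading the model, a quick check along the following lines may help. This is a sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of SpeechBrain.

```python
import importlib.util
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of `package`, or None if it cannot be imported."""
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but without distribution metadata (e.g. a stdlib module).
        return "unknown"


if __name__ == "__main__":
    version = installed_version("speechbrain")
    if version is None:
        print("SpeechBrain is not installed; run: pip install speechbrain")
    else:
        print(f"SpeechBrain version: {version}")
```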

We recommend reading the SpeechBrain tutorials to learn more about SpeechBrain.

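💻 Usage Examples

Basic usage

# In SpeechBrain 1.0 and later, this class is also importable from speechbrain.inference.ASR
from speechbrain.pretrained import EncoderDecoderASR

# Fetch the pretrained model (cached under savedir) and load it
asr_model = EncoderDecoderASR.from_hparams(source="jfreiwa/asr-crdnn-german", savedir="pretrained_models/asr-crdnn-german")
# Transcribe the example recording from the model repository and print the text
print(asr_model.transcribe_file("jfreiwa/asr-crdnn-german/example-de.wav"))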
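
For use cases such as meeting transcription, the same call can be applied to a whole folder of recordings. The following is a minimal sketch, not part of the original model card: the helper `transcribe_folder`, the wrapper `transcribe_recordings`, and the folder name `recordings` are all hypothetical. The helper takes the transcription function as an argument, so it can be tried out without loading the model.

```python
from pathlib import Path
from typing import Callable, Dict


def transcribe_folder(folder: str, transcribe: Callable[[str], str]) -> Dict[str, str]:
    """Apply `transcribe` to every .wav file in `folder`, in file-name order."""
    results = {}
    for wav_path in sorted(Path(folder).glob("*.wav")):
        results[wav_path.name] = transcribe(str(wav_path))
    return results


def transcribe_recordings(folder: str = "recordings") -> None:
    """Transcribe a local folder of German .wav files with the pretrained model."""
    # Imported lazily so transcribe_folder stays usable without SpeechBrain.
    from speechbrain.pretrained import EncoderDecoderASR

    asr_model = EncoderDecoderASR.from_hparams(
        source="jfreiwa/asr-crdnn-german",
        savedir="pretrained_models/asr-crdnn-german",
    )
    for name, text in transcribe_folder(folder, asr_model.transcribe_file).items():
        print(f"{name}: {text}")
```

Calling `transcribe_recordings()` prints one line per file. Passing the transcriber in as a callable keeps the file handling separate from the model, which makes it easy to swap in a stub for testing.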

Advanced usage

# To run inference on a GPU, pass run_opts={"device":"cuda"} when calling from_hparams
from speechbrain.pretrained import EncoderDecoderASR

asr_model = EncoderDecoderASR.from_hparams(source="jfreiwa/asr-crdnn-german", savedir="pretrained_models/asr-crdnn-german", run_opts={"device":"cuda"})
asr_model.transcribe_file("jfreiwa/asr-crdnn-german/example-de.wav")
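
On machines that may or may not have a GPU, the device can be chosen at runtime instead of hard-coding "cuda". The sketch below assumes PyTorch's `torch.cuda.is_available()` for detection; the helpers `build_run_opts` and `load_model` are hypothetical names introduced for illustration and are not part of SpeechBrain.

```python
def build_run_opts(prefer_gpu: bool, cuda_available: bool) -> dict:
    """Return the run_opts dict for from_hparams, falling back to CPU without CUDA."""
    device = "cuda" if prefer_gpu and cuda_available else "cpu"
    return {"device": device}


def load_model(prefer_gpu: bool = True):
    """Load the German ASR model on the GPU when possible, otherwise on the CPU."""
    # Imported lazily so build_run_opts stays usable without torch or SpeechBrain.
    import torch
    from speechbrain.pretrained import EncoderDecoderASR

    return EncoderDecoderASR.from_hparams(
        source="jfreiwa/asr-crdnn-german",
        savedir="pretrained_models/asr-crdnn-german",
        run_opts=build_run_opts(prefer_gpu, torch.cuda.is_available()),
    )
```

With this, `load_model().transcribe_file(...)` runs on the GPU when one is present and on the CPU otherwise.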

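📚 Documentation

Limitations

We make no guarantees about the model's performance on other datasets.

About SpeechBrain

Website: https://speechbrain.github.io/
Code: https://github.com/speechbrain/speechbrain/
HuggingFace: https://huggingface.co/speechbrain/

Citing SpeechBrain

If you use SpeechBrain in your research or business, please cite it: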

@misc{speechbrain,
  title={{SpeechBrain}: A General-Purpose Speech Toolkit},
  author={Mirco Ravanelli and Titouan Parcollet and Peter Plantinga and Aku Rouhe and Samuele Cornell and Loren Lugosch and Cem Subakan and Nauman Dawalatabad and Abdelwahab Heba and Jianyuan Zhong and Ju-Chieh Chou and Sung-Lin Yeh and Szu-Wei Fu and Chien-Feng Liao and Elena Rastorgueva and François Grondin and William Aris and Hwidong Na and Yan Gao and Renato De Mori and Yoshua Bengio},
  year={2021},
  eprint={2106.04624},
  archivePrefix={arXiv},
  primaryClass={eess.AS},
  note={arXiv:2106.04624}
}

Citing our paper

If you use this model in your research, please cite our paper:

@inproceedings{freiwald2022,
  author={J. Freiwald and P. Pracht and S. Gergen and D. Kolossa},
  title={Open-Source End-To-End Learning for Privacy-Preserving German {ASR}},
  year=2022,
  booktitle={DAGA 2022}
}