whisper-large-v3-german: an open-source speech recognition model for accurate German transcription, free to deploy


Whisper Large V3 German

Developed by primeline

A German speech recognition model fine-tuned from Whisper Large v3 and optimized for processing and recognizing German speech

Speech recognition

Transformers

Language: German
License: Apache-2.0
Tags: #German speech recognition #high-accuracy transcription #long-form audio

Downloads: 8,745

Released: 11/8/2023

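Model overview

This German speech recognition model is built on the OpenAI Whisper Large v3 architecture and fine-tuned specifically for German transcription.

Model features

German-optimized

Fine-tuned specifically for German speech recognition

High performance

WER of 3.002% and CER of 0.81% on the Common Voice German test set

Suited to many scenarios

Supports a range of German speech recognition scenarios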

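Model capabilities

German speech transcription

Voice command recognition

Automatic subtitle generation

Voice search processing

Dictation support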

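Use cases

Media and content

Video subtitle generation

Automatically generates subtitles for German video content

Delivers high-accuracy German transcripts

Office productivity

Voice dictation

Converts German speech to text

Speeds up text entry

Intelligent interaction

Voice control

Recognizes German voice commands

Enables voice-controlled interaction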

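🚀 whisper-large-v3-german speech recognition model

This model is fine-tuned from Whisper Large v3 and designed specifically for German speech recognition. Whisper is a speech recognition model developed by OpenAI; this fine-tune is optimized to process and recognize German speech.

🚀 Quick start

Code example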

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

# Use the GPU with half precision when available, otherwise CPU with float32.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model_id = "primeline/whisper-large-v3-german"

# Load the fine-tuned checkpoint and move it to the selected device.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)

processor = AutoProcessor.from_pretrained(model_id)

# Build an ASR pipeline that splits long audio into 30-second chunks
# and returns timestamps alongside the text.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)

# Note: this sample dataset contains English speech (LibriSpeech);
# substitute German audio to try the model on its target language.
dataset = load_dataset("distil-whisper/librispeech_long", "clean", split="validation")
sample = dataset[0]["audio"]

result = pipe(sample)
print(result["text"])
```
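
Because the pipeline in the quick-start code is created with `return_timestamps=True`, its result also carries a `chunks` list whose entries hold a `timestamp` pair and a `text` string. The following is a minimal sketch of turning such chunks into SRT subtitles, one of the use cases this page mentions. The helper names `format_srt_time` and `chunks_to_srt` are hypothetical, the example chunks are hand-written stand-ins and not real model output, and treating a missing end time as a zero-length cue is an assumption made here for simplicity.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks: list) -> str:
    """Convert pipeline-style chunks into the text of an SRT subtitle file."""
    entries = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        text = chunk["text"].strip()
        if start is None or not text:
            continue  # skip chunks without a usable start time or any text
        if end is None:
            end = start  # assumption: show a chunk with no end time as zero-length
        index = len(entries) + 1
        entries.append(
            f"{index}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{text}\n"
        )
    return "\n".join(entries)


# Hand-written stand-in for result["chunks"]; not real model output.
example_chunks = [
    {"timestamp": (0.0, 2.5), "text": " Guten Morgen."},
    {"timestamp": (2.5, 5.04), "text": " Wie geht es Ihnen?"},
]
print(chunks_to_srt(example_chunks))
```

With a real transcription, `chunks_to_srt(result["chunks"])` could be written to a `.srt` file next to the video.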

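✨ Key features

Accurate recognition: optimized for German speech, the model reaches a test WER of 3.002 % and a test CER of 0.81 % on the Common Voice de dataset.
Broad applicability: suitable for German speech transcription, voice command and control, automatic subtitles for German videos, German voice search queries, and dictation in word processors.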
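
To make the WER and CER figures concrete: both are edit-distance ratios, computed over words and over characters respectively. The sketch below is a plain-Python illustration of that definition, not the evaluation script behind the reported numbers. The source does not say which text normalization (casing, punctuation) was applied before scoring, so none is applied here, and the example sentences are invented.

```python
def edit_distance(ref: list, hyp: list) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Invented example: one wrong word out of five.
reference = "das ist ein kleiner test"
hypothesis = "das ist ein kleine test"
print(f"WER: {wer(reference, hypothesis):.3f}")
print(f"CER: {cer(reference, hypothesis):.3f}")
```

A single wrong word changes WER far more than CER when only one character differs, which is why the two reported figures sit so far apart.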

📦 Model information

Model details

Property	Details
Model name	whisper-large-v3-german by Florian Zimmermeister @primeLine
Model type	German speech recognition model fine-tuned from Whisper Large v3
Task	Automatic speech recognition
Dataset	Common Voice de (common_voice_15, de)
Test WER	3.002 %
Test CER	0.81 %
Newer version	primeline/whisper-large-v3-turbo-german

Model family

Model	Parameters
Whisper large v3 german	1.54B
Whisper large v3 turbo german	809M
Distil-whisper large v3 german	756M
tiny whisper	37.8M
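
The parameter counts in this table give a rough idea of how much memory the weights alone occupy at the two precisions the quick-start code chooses between. The sketch below is back-of-the-envelope arithmetic only (parameters times bytes per parameter). It ignores activations, the decoding cache, and framework overhead, so actual usage will be higher, and the function name `weight_memory_gib` is hypothetical.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

# Parameter counts as listed in the model family table.
MODEL_PARAMS = {
    "Whisper large v3 german": 1.54e9,
    "Whisper large v3 turbo german": 809e6,
    "Distil-whisper large v3 german": 756e6,
    "tiny whisper": 37.8e6,
}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for name, params in MODEL_PARAMS.items():
    fp16 = weight_memory_gib(params, "float16")
    fp32 = weight_memory_gib(params, "float32")
    print(f"{name}: ~{fp16:.2f} GiB in float16, ~{fp32:.2f} GiB in float32")
```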

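🔧 Technical details

Training data

The training data consists of a large amount of German speech from a variety of sources, selected and processed to improve recognition performance.

Training procedure

The model was trained with the following hyperparameters:

Batch size: 1024
Epochs: 2
Learning rate: 1e-5
Data augmentation: none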
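
The source lists a batch size of 1024 without saying how it was reached. One common way to get a batch that large is gradient accumulation across several devices, and the sketch below shows that arithmetic. It assumes 1024 is the effective global batch size. The per-device batch of 16 and the 8 devices are purely hypothetical values chosen for illustration, not the hardware actually used, and the names `TrainingHyperparameters` and `gradient_accumulation_steps` are invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingHyperparameters:
    """Hyperparameters as listed in the training procedure."""
    batch_size: int = 1024
    epochs: int = 2
    learning_rate: float = 1e-5
    data_augmentation: bool = False


def gradient_accumulation_steps(
    target_batch: int, per_device_batch: int, num_devices: int
) -> int:
    """Accumulation steps needed so one optimizer step sees target_batch samples."""
    per_step = per_device_batch * num_devices
    if target_batch % per_step != 0:
        raise ValueError(
            f"{target_batch} is not a multiple of {per_step} samples per forward step"
        )
    return target_batch // per_step


hp = TrainingHyperparameters()

# Hypothetical hardware: 8 devices, 16 samples per device per forward pass.
steps = gradient_accumulation_steps(hp.batch_size, per_device_batch=16, num_devices=8)
print(f"gradient accumulation steps: {steps}")
```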

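📄 License

This model is released under the Apache-2.0 license.

👨‍💻 About us

We are your partner for AI infrastructure in Germany, providing powerful infrastructure to help you reach your goals in deep learning, machine learning, and high-performance computing. Our infrastructure is optimized for AI training and inference.

Model author: Florian Zimmermeister

Disclaimer

This model is not a product of the primeLine Group.

It is the result of research by [Florian Zimmermeister](https://huggingface.co/flozi00), with compute sponsored by primeLine.

The model is published under this account by primeLine, but it is not a commercial product of primeLine Solutions GmbH.

Please note that although we have done our best to test and develop this model, errors may still occur.

Use of this model is at your own risk. We accept no liability for any incorrect output it produces.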