whisper_tflite_models open-source model: speech transcription and translation for the Whisper app on F-Droid

Whisper Tflite Models

Developed by DocWolle

TFLite models for the Whisper app on F-Droid, supporting speech transcription and translation.

Speech recognition · License: MIT · #multilingual speech transcription #real-time speech translation #TFLite deployment

Downloads: 11.20k

Released: 12/27/2024

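Model overview

These models are built for the Whisper app. They transcribe and translate speech in multiple languages, and forced decoder IDs can pin the model to a specific language and task.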

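Model features

Forced decoder IDs

Setting forced decoder IDs makes the model transcribe or translate in a specific language instead of choosing on its own.

Multilingual support

Transcription and translation are supported for multiple languages; for the specific language codes, see the linked reference.

TFLite optimization

The models are converted to TFLite and optimized for running on mobile devices.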

Model capabilities

Speech transcription

Speech translation

Multilingual processing

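Use cases

Speech processing

Speech transcription

Transcribes speech into text; multiple languages are supported.

Output: a text transcript of the audio.

Speech translation

Translates speech into text in another language (Whisper's translate task outputs English).

Output: the translated text.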

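🚀 Whisper TFLite models for the Whisper app on F-Droid

This project provides TFLite models for the Whisper app on F-Droid. The "transcribe-translate" models expose two signatures, "serving_transcribe" and "serving_translate", each of which forces the model to perform that task.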
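
As a rough sketch of how these two signatures might be called from Python: the TensorFlow Lite interpreter exposes named signatures through `get_signature_runner`. The helper names `signature_for_task` and `run_whisper` are hypothetical, the model path is a placeholder for a downloaded `.tflite` file, and the input is assumed to be a float32 log-mel array already shaped (1, 80, 3000). The input name `input_features` and the output key `sequences` follow the function definitions in the quick-start section.

```python
def signature_for_task(task):
    """Map a task name to the signature name exported by the model."""
    signatures = {
        "transcribe": "serving_transcribe",
        "translate": "serving_translate",
    }
    if task not in signatures:
        raise ValueError(f"unknown task: {task!r}")
    return signatures[task]


def run_whisper(model_path, input_features, task="transcribe"):
    """Run one signature of the TFLite model and return the token IDs.

    model_path is a placeholder; input_features is assumed to be a
    float32 array of shape (1, 80, 3000).
    """
    # Imported lazily so signature_for_task works without TensorFlow installed.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    runner = interpreter.get_signature_runner(signature_for_task(task))
    outputs = runner(input_features=input_features)
    return outputs["sequences"]
```

The returned token IDs still have to be converted to text with a Whisper tokenizer or vocabulary file, which this sketch leaves out.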

🚀 Quick start

Transcription and translation code example

The following code defines the transcribe and translate functions of the model:

import tensorflow as tf

# Both functions are methods of a class whose self.model is the Whisper model.
@tf.function(
    input_signature=[
        tf.TensorSpec((1, 80, 3000), tf.float32, name="input_features"),
    ],
)
def transcribe(self, input_features):
    outputs = self.model.generate(
        input_features,
        max_new_tokens=450,  # adjust as needed
        return_dict_in_generate=True,
        forced_decoder_ids=[[2, 50359], [3, 50363]],  # force transcription (50359) in any language, no timestamps (50363)
    )
    return {"sequences": outputs["sequences"]}

@tf.function(
    input_signature=[
        tf.TensorSpec((1, 80, 3000), tf.float32, name="input_features"),
    ],
)
def translate(self, input_features):
    outputs = self.model.generate(
        input_features,
        max_new_tokens=450,  # adjust as needed
        return_dict_in_generate=True,
        forced_decoder_ids=[[2, 50358], [3, 50363]],  # force translation (50358) from any language, no timestamps (50363)
    )
    return {"sequences": outputs["sequences"]}
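
The input shape (1, 80, 3000) can be read as one audio clip, 80 mel bins, and 3000 time frames. The sketch below shows where 3000 comes from, assuming the standard Whisper front end (16 kHz audio, a 160-sample hop, a fixed 30-second window); this page does not state those values itself, and `pad_or_trim` is an illustrative helper name.

```python
SAMPLE_RATE = 16_000   # Hz, assumed Whisper input rate
HOP_LENGTH = 160       # samples between frames, i.e. 10 ms at 16 kHz
CHUNK_SECONDS = 30     # fixed window length

N_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS   # samples per window
N_FRAMES = N_SAMPLES // HOP_LENGTH        # frames per window


def pad_or_trim(samples, length=N_SAMPLES):
    """Return exactly `length` samples: cut extras, zero-pad a shortfall."""
    samples = list(samples[:length])
    return samples + [0.0] * (length - len(samples))
```

Under these assumptions, audio shorter than 30 seconds is zero-padded and longer audio has to be split into windows before the mel spectrogram is computed.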

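Transcribing and translating a specific language

To force transcription or translation for a specific language, set the forced decoder IDs as follows: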

def transcribe(self, input_features):
    outputs = self.model.generate(
        input_features,
        max_new_tokens=450,  # adjust as needed
        return_dict_in_generate=True,
        forced_decoder_ids=[[1, 50261], [2, 50359], [3, 50363]],  # force German (50261) transcription (50359), no timestamps (50363)
    )
    return {"sequences": outputs["sequences"]}

def translate(self, input_features):
    outputs = self.model.generate(
        input_features,
        max_new_tokens=450,  # adjust as needed
        return_dict_in_generate=True,
        forced_decoder_ids=[[1, 50261], [2, 50358], [3, 50363]],  # German (50261) source, translate task (50358), no timestamps (50363)
    )
    return {"sequences": outputs["sequences"]}
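
The four ID lists above follow one pattern: position 1 holds the language token, position 2 the task token, and position 3 the no-timestamps token. A minimal sketch of a helper that assembles them is shown below. The function and constant names are hypothetical, and the token IDs are the ones used in the examples above. Leaving out the language token presumably lets the model pick the language itself, as in the "any language" examples.

```python
# Token IDs copied from the examples above.
TASK_TOKENS = {"transcribe": 50359, "translate": 50358}
NO_TIMESTAMPS = 50363
GERMAN = 50261


def build_forced_decoder_ids(task, language_token=None):
    """Return [position, token_id] pairs for the forced_decoder_ids argument."""
    ids = []
    if language_token is not None:
        ids.append([1, language_token])   # position 1: language
    ids.append([2, TASK_TOKENS[task]])    # position 2: task
    ids.append([3, NO_TIMESTAMPS])        # position 3: no timestamps
    return ids


print(build_forced_decoder_ids("transcribe", GERMAN))
```

For another language, only the value passed as `language_token` changes; the ID has to come from the model's vocabulary.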