nllb-200-distilled-600M-wo-fr-en: Open-Source Model for Bidirectional Translation Among Wolof, French, and English

Nllb 200 Distilled 600M Wo Fr En

Developed by bilalfaye

A multilingual translation model fine-tuned from NLLB-200-distilled-600M and optimized for bidirectional translation among Wolof, French, and English.

Machine Translation

Transformers

Multilingual · License: MIT · Tags: #WolofTranslation #MultilingualTranslation #LowResourceOptimization

Downloads: 114

Released: 1/20/2025

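Model Overview

The model translates in both directions between every pair of Wolof, French, and English: Wolof↔French, Wolof↔English, and French↔English.

Model Features

Multilingual bidirectional translation

Supports all six translation directions among Wolof, French, and English.

Preprocessed training data

Fine-tuned on a heavily preprocessed Wolof-French-English parallel corpus.

Efficient inference

Built on the distilled 600M-parameter NLLB-200 model, with the aim of keeping translation quality while reducing inference cost.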

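Model Capabilities

Wolof to French translation

French to Wolof translation

English to Wolof translation

Wolof to English translation

French to English translation

English to French translation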

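Use Cases

Language services

Cross-language communication

Helps Wolof speakers communicate with French or English speakers.

Intended to provide accurate, fluent translation of everyday conversation.

Document translation

Converts official documents or educational materials between Wolof, French, and English.

Aims to preserve terminology accuracy and contextual consistency.

Education

Language learning support

Helps students of Wolof, French, or English understand how the languages correspond to one another.

Provides instant translation references to speed up language learning.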

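🚀 Translation Model

This model is optimized primarily for French-to-Wolof and Wolof-to-French translation, and it also covers English in the directions listed under Supported Languages. It is fine-tuned from nllb-200-distilled-600M on the bilalfaye/english-wolof-french-translation and bilalfaye/english-wolof-french-translation-bis datasets, which were extensively preprocessed to improve translation quality.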

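Supported Languages

The model supports the following translation directions:

Wolof to French
French to Wolof
English to Wolof
Wolof to English
French to English
English to French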
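
The six directions above are simply every ordered pair of the three languages. The sketch below enumerates them using the NLLB-style codes that appear in the usage examples on this card (wol_Latn, fra_Latn, eng_Latn). The names LANG_CODES, translation_directions, and resolve_lang are hypothetical helpers introduced for illustration; they are not part of the model or of the transformers library.

```python
from itertools import permutations

# NLLB-style language codes taken from the usage examples on this card.
LANG_CODES = {
    "wolof": "wol_Latn",
    "french": "fra_Latn",
    "english": "eng_Latn",
}

def translation_directions(codes=LANG_CODES):
    """Return every ordered (source, target) pair of distinct language codes."""
    return list(permutations(codes.values(), 2))

def resolve_lang(name, codes=LANG_CODES):
    """Map a friendly name or a known NLLB code to an NLLB code, else raise ValueError."""
    key = name.strip()
    if key in codes.values():
        return key
    try:
        return codes[key.lower()]
    except KeyError:
        raise ValueError(f"Unsupported language: {name!r}") from None

directions = translation_directions()
for src, tgt in directions:
    print(f"{src} -> {tgt}")
```

Resolving codes up front, for example with resolve_lang("French"), is one way to catch a mistyped code such as fr_Latn before it reaches the tokenizer.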

Demo app: https://huggingface.co/spaces/bilalfaye/WoFrEn-Translator

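🚀 Quick Start

✨ Key Features

Fine-tuned from nllb-200-distilled-600M for translation across multiple language pairs.
Supports bidirectional translation among Wolof, French, and English.
Trained on preprocessed datasets to improve translation quality.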

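📦 Installation

Install the required libraries (the leading ! is for notebook cells; drop it when running in a shell). The examples also import torch, and the NLLB tokenizer typically needs sentencepiece:

!pip install transformers torch sentencepiece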
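
After installing, a quick check that the packages can be found may save a confusing import error later. This is a minimal sketch using only the standard library. The package list is an assumption: transformers comes from the install command, torch is imported by the usage examples, and sentencepiece is assumed here to be required by the NLLB tokenizer. The helper name missing_packages is hypothetical.

```python
import importlib.util

# Assumed requirements; adjust to match your environment.
REQUIRED = ("transformers", "torch", "sentencepiece")

def missing_packages(names=REQUIRED):
    """Return the package names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages()
print("Missing packages:", ", ".join(missing) if missing else "none")
```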

💻 Usage Examples

Basic usage

Manual inference:

from transformers import NllbTokenizer, AutoModelForSeq2SeqLM
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model_load_name = 'bilalfaye/nllb-200-distilled-600M-wo-fr-en'

# Load model and tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(model_load_name).to(device)
tokenizer = NllbTokenizer.from_pretrained(model_load_name)

def translate(
    text, src_lang='wol_Latn', tgt_lang='fra_Latn',
    a=32, b=3, max_input_length=1024, num_beams=4, **kwargs
):
    """Turn a text or a list of texts into a list of translations.

    src_lang and tgt_lang are NLLB codes: 'wol_Latn', 'fra_Latn' or 'eng_Latn'.
    a and b bound the output length: max_new_tokens = a + b * input_length.
    """
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang
    # Tokenize (padding batches) and truncate inputs longer than max_input_length
    inputs = tokenizer(
        text, return_tensors='pt', padding=True, truncation=True,
        max_length=max_input_length
    )
    model.eval()
    result = model.generate(
        **inputs.to(model.device),
        # Force the first generated token to be the target-language code
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=int(a + b * inputs.input_ids.shape[1]),
        num_beams=num_beams, **kwargs
    )
    return tokenizer.batch_decode(result, skip_special_tokens=True)

# Example usage: one call per translation direction (French is 'fra_Latn')
print(translate("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="fra_Latn")[0])
print(translate("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="eng_Latn")[0])
print(translate("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="wol_Latn")[0])
print(translate("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="eng_Latn")[0])
print(translate("Hello, how are you?", src_lang="eng_Latn", tgt_lang="wol_Latn")[0])
print(translate("Hello, how are you?", src_lang="eng_Latn", tgt_lang="fra_Latn")[0])
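
The translate function caps the output length with max_new_tokens = int(a + b * input_length), where input_length is the number of tokens in the (padded) input batch. The arithmetic can be checked in isolation with the sketch below. The helper name new_token_budget is hypothetical, and the defaults a=32 and b=3 are copied from the function signature above.

```python
def new_token_budget(input_len, a=32, b=3):
    """Mirror max_new_tokens = int(a + b * input_len) from translate()."""
    return int(a + b * input_len)

# A 5-token input allows up to 47 new tokens, 20 tokens allow 92, 100 allow 332.
for n in (5, 20, 100):
    print(n, "input tokens ->", new_token_budget(n), "new tokens at most")
```

The constant a leaves room for very short inputs, while b lets the output grow with the input. Raising either value permits longer translations at the cost of more generation time.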

Advanced usage

Inference with a pipeline:

import torch
from transformers import pipeline

model_name = 'bilalfaye/nllb-200-distilled-600M-wo-fr-en'
device = "cuda" if torch.cuda.is_available() else "cpu"

translator = pipeline("translation", model=model_name, device=device)

print(translator("Ndax mën nga ko waxaat su la neexee?", src_lang="wol_Latn", tgt_lang="fra_Latn")[0]['translation_text'])
print(translator("Bonjour, où allez-vous?", src_lang="fra_Latn", tgt_lang="wol_Latn")[0]['translation_text'])
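
For longer lists of sentences, it can help to feed the model in fixed-size batches rather than all at once. The sketch below shows one way to do that around any function that maps a list of texts to a list of translations, such as the translate function from the basic usage example. The names batched, translate_all, and fake_translate are hypothetical, the batch size of 8 is an arbitrary assumption, and the stand-in translator exists only so the sketch runs without downloading the model.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def translate_all(texts, translate_fn, batch_size=8, **kwargs):
    """Apply `translate_fn` batch by batch and return results in input order."""
    results = []
    for batch in batched(texts, batch_size):
        results.extend(translate_fn(batch, **kwargs))
    return results

# Stand-in translator (assumption): upper-cases text instead of translating it.
def fake_translate(batch, **kwargs):
    return [text.upper() for text in batch]

print(translate_all(["a", "b", "c"], fake_translate, batch_size=2))
```

With the real model loaded, the call would look like translate_all(sentences, translate, src_lang="wol_Latn", tgt_lang="fra_Latn").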

📚 Documentation

Model information

Property	Details
Model type	Translation model fine-tuned from nllb-200-distilled-600M
Training data	bilalfaye/english-wolof-french-translation and bilalfaye/english-wolof-french-translation-bis datasets
Supported languages	Wolof (wo), French (fr), English (en)
Evaluation metrics	BLEU, CHRF
Base model	facebook/nllb-200-distilled-600M
Task type	Translation
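
The card lists BLEU and CHRF as evaluation metrics but does not say which tool computed them. To give a feel for what CHRF measures, here is a simplified, pure-Python sketch of a character n-gram F-score. It is an illustration under stated assumptions (whitespace ignored, n-gram orders 1 to 6, beta of 2, unweighted averaging), not the reference implementation; an established tool such as sacreBLEU is the usual choice for reportable scores. The names char_ngrams and simple_chrf are hypothetical.

```python
from collections import Counter

def char_ngrams(text, n):
    """Count character n-grams, ignoring whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))

def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: mean char n-gram precision and recall combined as F-beta (0 to 100)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip n-gram orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped matching n-gram count
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)

print(simple_chrf("Bonjour, où allez-vous?", "Bonjour, où allez-vous ?"))
print(simple_chrf("hello", "hallo"))
```

Because it works on characters rather than whole words, a score of this kind gives partial credit for near-miss word forms, which is one reason character-level metrics are often reported alongside BLEU for morphologically rich or low-resource languages.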