ALMA-13B-R: An Open-Source Machine Translation Model That Matches or Exceeds GPT-4 and WMT Winners


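Developed by haoranxu

ALMA-13B-R is a machine translation model built on ALMA. It is LoRA fine-tuned with Contrastive Preference Optimization (CPO), and its reported performance matches or exceeds GPT-4 and WMT-winning models.

Machine Translation

Transformers

License: MIT #Contrastive Preference Optimization #Translation beyond GPT-4 #LoRA fine-tuning

Downloads: 4,216

Released: 1/17/2024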

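Model Overview

ALMA-13B-R is a machine translation model that uses Contrastive Preference Optimization to improve translation quality and supports translation across multiple language pairs.

Model Features

Contrastive Preference Optimization (CPO)

The model is LoRA fine-tuned with the Contrastive Preference Optimization method to improve translation quality.

High-Performance Translation

Matches or exceeds the translation quality of GPT-4 and WMT-winning models on multiple test sets.

LoRA Fine-Tuning

Uses LoRA (Low-Rank Adaptation) for efficient fine-tuning, which reduces the compute required.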
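
The saving from LoRA can be illustrated with simple arithmetic. The sketch below assumes the standard LoRA formulation, in which a frozen weight matrix of shape d x k is adapted by two trainable matrices of shapes d x r and r x k. The dimensions and rank used in the example are illustrative assumptions, not values taken from the ALMA training configuration.

```python
def full_param_count(d: int, k: int) -> int:
    """Trainable parameters when the whole d x k weight matrix is updated."""
    return d * k


def lora_param_count(d: int, k: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA update: B is d x r, A is r x k."""
    return r * (d + k)


# Hypothetical example: one square 5120 x 5120 projection with rank 16.
d, k, r = 5120, 5120, 16
full = full_param_count(d, k)
lora = lora_param_count(d, k, r)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4%}")
```

Under these assumed sizes, the LoRA update trains well under one percent of the parameters of the matrix it adapts.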

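Model Capabilities

High-quality machine translation

Multilingual translation

Context understanding

Use Cases

Professional Translation

Technical document translation

Translates technical documents from one language to another while keeping specialized terminology accurate.

Stated outcome: quality on par with professional human translation

Literary translation

Translates literary works while preserving the style and mood of the original.

Stated outcome: better than traditional machine translation systems

Business Applications

Multinational corporate communication

Provides instant translation for cross-language communication inside a company.

Stated outcome: more efficient communication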

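🚀 ALMA-R: A Machine Translation Model Based on Contrastive Preference Optimization

ALMA-R builds on the ALMA models. Whereas ALMA uses supervised fine-tuning, ALMA-R is LoRA fine-tuned with our proposed Contrastive Preference Optimization (CPO). CPO fine-tuning requires our triplet preference data for preference learning. ALMA-R can now match or even exceed GPT-4 and WMT winners!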
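
To make the preference-learning idea concrete, here is a minimal sketch of a CPO-style loss for a single training example. It assumes the objective is the sum of a preference term, -log sigmoid(beta * (log p(preferred) - log p(dispreferred))), and a negative log-likelihood term on the preferred translation, which is this sketch's reading of the CPO paper cited below. The function names, the scalar log-probability inputs, and beta = 0.1 are illustrative assumptions, not the released training code.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def cpo_loss(logp_preferred: float, logp_dispreferred: float, beta: float = 0.1) -> float:
    """CPO-style loss for one (preferred, dispreferred) translation pair.

    Both inputs are sequence log-probabilities under the model being trained.
    """
    # Preference term: push the preferred translation above the dispreferred one.
    prefer_term = -log_sigmoid(beta * (logp_preferred - logp_dispreferred))
    # NLL term: keep the model's likelihood of the preferred translation high.
    nll_term = -logp_preferred
    return prefer_term + nll_term


# Hypothetical log-probabilities: the wider the margin, the smaller the loss.
print(cpo_loss(-2.0, -2.0))
print(cpo_loss(-2.0, -10.0))
```

In a real training loop the log-probabilities would come from the model's token-level outputs and the loss would be averaged over a batch.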

📄 License

This project is released under the MIT license.

📚 Citation

@misc{xu2024contrastive,
      title={Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation}, 
      author={Haoran Xu and Amr Sharaf and Yunmo Chen and Weiting Tan and Lingfeng Shen and Benjamin Van Durme and Kenton Murray and Young Jin Kim},
      year={2024},
      eprint={2401.08417},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{xu2023paradigm,
      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models}, 
      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
      year={2023},
      eprint={2309.11674},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

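📦 Download ALMA(-R) Models and Datasets

We release six translation models presented in the papers:

ALMA-7B
ALMA-7B-LoRA
ALMA-7B-R (NEW!): further LoRA fine-tuning on top of ALMA-7B-LoRA with Contrastive Preference Optimization.
ALMA-13B
ALMA-13B-LoRA
ALMA-13B-R (NEW!): further LoRA fine-tuning on top of ALMA-13B-LoRA with Contrastive Preference Optimization (the best model!)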

Model checkpoints are released on Hugging Face:

Model	Base Model Link	LoRA Link
ALMA-7B	[haoranxu/ALMA-7B](https://huggingface.co/haoranxu/ALMA-7B)	-
ALMA-7B-LoRA	[haoranxu/ALMA-7B-Pretrain](https://huggingface.co/haoranxu/ALMA-7B-Pretrain)	[haoranxu/ALMA-7B-Pretrain-LoRA](https://huggingface.co/haoranxu/ALMA-7B-Pretrain-LoRA)
ALMA-7B-R (NEW!)	[haoranxu/ALMA-7B-R (LoRA merged)](https://huggingface.co/haoranxu/ALMA-7B-R)	-
ALMA-13B	[haoranxu/ALMA-13B](https://huggingface.co/haoranxu/ALMA-13B)	-
ALMA-13B-LoRA	[haoranxu/ALMA-13B-Pretrain](https://huggingface.co/haoranxu/ALMA-13B-Pretrain)	[haoranxu/ALMA-13B-Pretrain-LoRA](https://huggingface.co/haoranxu/ALMA-13B-Pretrain-LoRA)
ALMA-13B-R (NEW!)	[haoranxu/ALMA-13B-R (LoRA merged)](https://huggingface.co/haoranxu/ALMA-13B-R)	-

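⚠️ Important Note

ALMA-7B-Pretrain and ALMA-13B-Pretrain are NOT translation models. They have only gone through stage 1 monolingual fine-tuning (20B tokens for the 7B model and 12B tokens for the 13B model) and must be used together with their LoRA models.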
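
A minimal sketch of combining a Pretrain checkpoint with its LoRA adapter, assuming the adapter is attached with the `peft` library's `PeftModel.from_pretrained`. The helper names `pretrain_repo_ids` and `load_alma_lora` are hypothetical; the repository IDs follow the naming in the checkpoint table above. The heavy imports are placed inside the loader so the helper for repository IDs can be used without a GPU environment.

```python
def pretrain_repo_ids(size: str) -> tuple[str, str]:
    """Return (base model repo, LoRA adapter repo) for a model size such as "7B" or "13B"."""
    base_id = f"haoranxu/ALMA-{size}-Pretrain"
    return base_id, f"{base_id}-LoRA"


def load_alma_lora(size: str = "13B"):
    """Load a Pretrain checkpoint and attach its LoRA adapter (downloads the weights)."""
    # Imported here so that the module itself loads without these packages installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base_id, lora_id = pretrain_repo_ids(size)
    model = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.float16, device_map="auto"
    )
    # Wrap the base model with the adapter weights from the LoRA repository.
    model = PeftModel.from_pretrained(model, lora_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id, padding_side="left")
    return model, tokenizer


print(pretrain_repo_ids("13B"))
```

Calling `load_alma_lora("13B")` would then give a model and tokenizer that can be used in the same way as the merged checkpoints.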

The datasets used by ALMA and ALMA-R are also released on Hugging Face (NEW!):

Dataset	Train / Validation	Test
Human-Written Parallel Data (ALMA)	[train and validation](https://huggingface.co/datasets/haoranxu/ALMA-Human-Parallel)	[WMT'22](https://huggingface.co/datasets/haoranxu/WMT22-Test)
Triplet Preference Data	[train](https://huggingface.co/datasets/haoranxu/ALMA-R-Preference)	[WMT'22](https://huggingface.co/datasets/haoranxu/WMT22-Test) and [WMT'23](https://huggingface.co/datasets/haoranxu/WMT23-Test)

💻 Usage Example

Basic Usage

A quick start for translating with our best system (ALMA-13B-R). The example translates "我爱机器翻译。" into English:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer (the ALMA-13B-R checkpoint already has the LoRA weights merged)
model = AutoModelForCausalLM.from_pretrained("haoranxu/ALMA-13B-R", torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("haoranxu/ALMA-13B-R", padding_side='left')

# Add the source sentence into the prompt template
prompt = "Translate this from Chinese to English:\nChinese: 我爱机器翻译。\nEnglish:"
# Tokenize and move the inputs to the device that holds the model
input_ids = tokenizer(prompt, return_tensors="pt", padding=True, max_length=40, truncation=True).input_ids.to(model.device)

# Translation: beam search with sampling, capped at 20 new tokens
with torch.no_grad():
    generated_ids = model.generate(input_ids=input_ids, num_beams=5, max_new_tokens=20, do_sample=True, temperature=0.6, top_p=0.9)
# The decoded text contains the prompt followed by the translation
outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(outputs)
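
The prompt in the example follows a fixed template, so it can be generated for other sentences and language pairs. The sketch below is a small convenience wrapper around that template; `build_prompt` and `extract_translation` are hypothetical helper names, and the assumption that the decoded output begins with the exact prompt text may not hold if tokenization changes whitespace, in which case the text is returned unchanged apart from stripping.

```python
def build_prompt(src_lang: str, tgt_lang: str, text: str) -> str:
    """Fill the translation prompt template used in the quick-start example."""
    return f"Translate this from {src_lang} to {tgt_lang}:\n{src_lang}: {text}\n{tgt_lang}:"


def extract_translation(decoded: str, prompt: str) -> str:
    """Drop the echoed prompt from a decoded generation, if it is present."""
    if decoded.startswith(prompt):
        decoded = decoded[len(prompt):]
    return decoded.strip()


prompt = build_prompt("Chinese", "English", "我爱机器翻译。")
# Hypothetical decoded string standing in for one element of `outputs` above.
decoded = prompt + " I love machine translation."
print(extract_translation(decoded, prompt))
```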