🚀 ALMA-R
ALMA-R is built upon the ALMA models. Instead of the supervised fine-tuning used in ALMA, it undergoes further LoRA fine-tuning with our proposed Contrastive Preference Optimization (CPO), which requires our triplet preference data for preference learning. ALMA-R can now match or even outperform GPT-4 and the WMT competition winners!
If you find ALMA(-R) useful, please cite our papers:

```bibtex
@misc{xu2024contrastive,
      title={Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation},
      author={Haoran Xu and Amr Sharaf and Yunmo Chen and Weiting Tan and Lingfeng Shen and Benjamin Van Durme and Kenton Murray and Young Jin Kim},
      year={2024},
      eprint={2401.08417},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{xu2023paradigm,
      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models},
      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
      year={2023},
      eprint={2309.11674},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
🚀 Quick Start
A quick start for using our best system (ALMA-13B-R) for translation. An example of translating "我爱机器翻译。" ("I love machine translation.") into English:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load ALMA-13B-R and its tokenizer.
model = AutoModelForCausalLM.from_pretrained("haoranxu/ALMA-13B-R", torch_dtype=torch.float16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("haoranxu/ALMA-13B-R", padding_side='left')

# Put the source sentence into the prompt template.
prompt = "Translate this from Chinese to English:\nChinese: 我爱机器翻译。\nEnglish:"
input_ids = tokenizer(prompt, return_tensors="pt", padding=True, max_length=40, truncation=True).input_ids.cuda()

# Translation
with torch.no_grad():
    generated_ids = model.generate(input_ids=input_ids, num_beams=5, max_new_tokens=20, do_sample=True, temperature=0.6, top_p=0.9)
outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
# Note: the decoded output includes the prompt; the translation appears after "English:".
print(outputs)
```
✨ Features
- Based on ALMA: ALMA-R builds on the foundation of the ALMA models.
- CPO fine-tuning: uses Contrastive Preference Optimization (CPO) for LoRA fine-tuning, which requires triplet preference data for preference learning.
- High performance: can match or exceed GPT-4 and the WMT winners.
📦 Installation
No dedicated installation procedure is documented here. The Quick Start above only needs PyTorch and the Hugging Face `transformers` library, plus `accelerate` for `device_map="auto"` and `peft` if you load the LoRA adapters (e.g. `pip install torch transformers accelerate peft`).
📚 Documentation
Download ALMA(-R) Models and Dataset 🚀
We release six translation models presented in the paper:
- ALMA-7B
- ALMA-7B-LoRA
- ALMA-7B-R (NEW!): further LoRA fine-tuning of ALMA-7B-LoRA with Contrastive Preference Optimization.
- ALMA-13B
- ALMA-13B-LoRA
- ALMA-13B-R (NEW!): further LoRA fine-tuning of ALMA-13B-LoRA with Contrastive Preference Optimization (BEST MODEL!).
Model checkpoints are released on Hugging Face:

| Models | Base Model Link | LoRA Link |
|---|---|---|
| ALMA-7B | [haoranxu/ALMA-7B](https://huggingface.co/haoranxu/ALMA-7B) | - |
| ALMA-7B-LoRA | [haoranxu/ALMA-7B-Pretrain](https://huggingface.co/haoranxu/ALMA-7B-Pretrain) | [haoranxu/ALMA-7B-Pretrain-LoRA](https://huggingface.co/haoranxu/ALMA-7B-Pretrain-LoRA) |
| ALMA-7B-R (NEW!) | [haoranxu/ALMA-7B-R (LoRA merged)](https://huggingface.co/haoranxu/ALMA-7B-R) | - |
| ALMA-13B | [haoranxu/ALMA-13B](https://huggingface.co/haoranxu/ALMA-13B) | - |
| ALMA-13B-LoRA | [haoranxu/ALMA-13B-Pretrain](https://huggingface.co/haoranxu/ALMA-13B-Pretrain) | [haoranxu/ALMA-13B-Pretrain-LoRA](https://huggingface.co/haoranxu/ALMA-13B-Pretrain-LoRA) |
| ALMA-13B-R (NEW!) | [haoranxu/ALMA-13B-R (LoRA merged)](https://huggingface.co/haoranxu/ALMA-13B-R) | - |
Note that ALMA-7B-Pretrain and ALMA-13B-Pretrain are NOT translation models. They have only gone through stage 1 monolingual fine-tuning (20B tokens for the 7B model and 12B tokens for the 13B model) and must be used together with their LoRA adapters.
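To use a -LoRA model, load the corresponding -Pretrain base model and then attach its LoRA adapter. Below is a minimal sketch using the `peft` library; the exact loading code in the official repository may differ slightly.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the stage-1 (monolingual fine-tuned) base model; it is NOT a translation model on its own.
model = AutoModelForCausalLM.from_pretrained(
    "haoranxu/ALMA-13B-Pretrain", torch_dtype=torch.float16, device_map="auto"
)
# Attach the translation LoRA adapter released alongside it.
model = PeftModel.from_pretrained(model, "haoranxu/ALMA-13B-Pretrain-LoRA")
tokenizer = AutoTokenizer.from_pretrained("haoranxu/ALMA-13B-Pretrain", padding_side='left')
# The model can now be prompted exactly as in the Quick Start example above.
```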
Datasets used by ALMA and ALMA-R are also released on Hugging Face now (NEW!):

| Datasets | Train / Validation | Test |
|---|---|---|
| Human-Written Parallel Data (ALMA) | [train and validation](https://huggingface.co/datasets/haoranxu/ALMA-Human-Parallel) | [WMT'22](https://huggingface.co/datasets/haoranxu/WMT22-Test) |
| Triplet Preference Data | [train](https://huggingface.co/datasets/haoranxu/ALMA-R-Preference) | [WMT'22](https://huggingface.co/datasets/haoranxu/WMT22-Test) and [WMT'23](https://huggingface.co/datasets/haoranxu/WMT23-Test) |
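These datasets can be loaded with the Hugging Face `datasets` library. A minimal sketch follows; the language-pair configuration name ("zh-en") is an illustrative assumption, so check the dataset cards for the exact configuration and split names.

```python
from datasets import load_dataset

# Triplet preference data used for CPO fine-tuning.
# NOTE: the "zh-en" configuration name is assumed here; see the dataset card for the actual names.
preference_data = load_dataset("haoranxu/ALMA-R-Preference", "zh-en")

# WMT'22 test set used for evaluation (same caveat about the configuration name).
wmt22_test = load_dataset("haoranxu/WMT22-Test", "zh-en")

print(preference_data)
```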
Please find more details in our GitHub repository.
🔧 Technical Details
In brief: the ALMA models are first fine-tuned on monolingual data (20B tokens for the 7B model, 12B tokens for the 13B model) and then on the high-quality human-written parallel data listed above; ALMA-R additionally applies CPO LoRA fine-tuning on the triplet preference data. Please refer to the two papers cited above for the full training recipe and evaluation results.
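For intuition, the sketch below shows the shape of the CPO objective described in the paper: a reference-model-free, DPO-style preference term plus a negative log-likelihood term on the preferred translations. The tensor names, default `beta`, and length-normalization details are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def cpo_loss(chosen_logps: torch.Tensor,
             rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Rough sketch of the CPO objective.

    chosen_logps / rejected_logps: sequence-level log-probabilities of the
    preferred and dis-preferred translations under the policy being trained.
    """
    # Preference term: a DPO-style sigmoid loss, but without a reference model.
    prefer_loss = -F.logsigmoid(beta * (chosen_logps - rejected_logps)).mean()
    # Behavior-cloning (NLL) term on the preferred translations.
    nll_loss = -chosen_logps.mean()
    return prefer_loss + nll_loss
```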
📄 License
The project is licensed under the MIT license.