LaMaTE
LaMaTE is an efficient, high-performance translation model built on Llama-3-8B, offering faster decoding, reduced memory usage, and competitive quality across diverse translation tasks.
Quick Start
For more detailed usage, please refer to the GitHub repository.
⚠️ Important Note
Our implementation is developed with transformers v4.39.2; we recommend installing this version (pip install transformers==4.39.2) for best compatibility.
To deploy LaMaTE, load the model with the from_pretrained() method and translate with the generate() method:
from modeling_llama_seq2seq import LlamaCrossAttentionEncDec
from transformers import AutoTokenizer, AutoConfig
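# model_name_or_path should point to the LaMaTE checkpoint (a local directory or the Hub repo ID)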
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
config = AutoConfig.from_pretrained(model_name_or_path, trust_remote_code=True)
model = LlamaCrossAttentionEncDec.from_pretrained(model_name_or_path, config=config)
prompt = "Translate the following text from English into Chinese.\nEnglish: The harder you work at it, the more progress you will make.\nChinese: "
input_ids = tokenizer(prompt, return_tensors="pt")
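# Generate the translation with beam search (deterministic decoding, no sampling)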
outputs_tokenized = model.generate(
**input_ids,
num_beams=5,
do_sample=False
)
outputs = tokenizer.batch_decode(outputs_tokenized, skip_special_tokens=True)
print(outputs)
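The same calls can be batched for multiple inputs. Below is a minimal sketch: the pad-token assignment is an assumption (Llama-based tokenizers usually ship without one), the German prompt simply reuses the template from the Chinese example, and the max_new_tokens value is illustrative.

# Batched translation sketch, reusing the tokenizer and model loaded above.
prompts = [
    "Translate the following text from English into Chinese.\nEnglish: The harder you work at it, the more progress you will make.\nChinese: ",
    "Translate the following text from English into German.\nEnglish: The harder you work at it, the more progress you will make.\nGerman: ",
]
# Assumption: reuse the EOS token for padding if no pad token is defined.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
batch = tokenizer(prompts, return_tensors="pt", padding=True)
outputs_tokenized = model.generate(
    **batch,
    num_beams=5,
    do_sample=False,
    max_new_tokens=256,  # illustrative limit; adjust to your inputs
)
print(tokenizer.batch_decode(outputs_tokenized, skip_special_tokens=True))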
⨠Features
LaMaTE is a high-performance and efficient translation model developed based on Llama-3-8B.
It utilizes large language models (LLMs) as machine translation (MT) encoders, paired with lightweight decoders.
The model integrates an adapter to bridge LLM representations with the decoder, employing a two-stage training strategy to enhance performance and efficiency.
Key Features of LaMaTE
- Enhanced Efficiency: Offers 2.4× to 6.5× faster decoding speeds.
- Reduced Memory Usage: Reduces KV cache memory consumption by 75%.
- Competitive Performance: Exhibits robust performance across diverse translation tasks.
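To make the design above concrete, here is a minimal, self-contained sketch of the encoder-adapter-decoder wiring using generic PyTorch modules. The class names (AdapterBridge, LightweightDecoder), dimensions, and layer counts are illustrative assumptions only; the actual implementation is LlamaCrossAttentionEncDec in modeling_llama_seq2seq.py.

# Conceptual sketch: "LLM encoder states -> adapter -> lightweight decoder with cross-attention".
# All sizes and module choices are illustrative and do not mirror LlamaCrossAttentionEncDec.
import torch
import torch.nn as nn

class AdapterBridge(nn.Module):
    """Projects LLM encoder hidden states into the decoder's representation space."""
    def __init__(self, llm_dim=4096, dec_dim=1024):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(llm_dim, dec_dim),
            nn.GELU(),
            nn.Linear(dec_dim, dec_dim),
        )

    def forward(self, llm_hidden_states):
        return self.proj(llm_hidden_states)

class LightweightDecoder(nn.Module):
    """A small Transformer decoder that cross-attends to the adapted encoder states."""
    def __init__(self, vocab_size=32000, dec_dim=1024, n_layers=4, n_heads=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dec_dim)
        layer = nn.TransformerDecoderLayer(d_model=dec_dim, nhead=n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(dec_dim, vocab_size)

    def forward(self, target_ids, encoder_states):
        tgt = self.embed(target_ids)
        # Causal mask so each target position only attends to earlier positions.
        seq_len = tgt.size(1)
        causal_mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
        hidden = self.decoder(tgt, encoder_states, tgt_mask=causal_mask)
        return self.lm_head(hidden)

# Wiring: source text -> LLM encoder (e.g., Llama-3-8B hidden states) -> adapter -> decoder.
adapter = AdapterBridge()
decoder = LightweightDecoder()
llm_states = torch.randn(1, 16, 4096)          # stand-in for LLM encoder outputs (batch, src_len, llm_dim)
target_ids = torch.randint(0, 32000, (1, 8))   # stand-in for target-side token ids
logits = decoder(target_ids, adapter(llm_states))
print(logits.shape)  # torch.Size([1, 8, 32000])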
Documentation
Property | Details
--- | ---
License | MIT
Datasets | NiuTrans/ComMT
Languages | en, zh, de, cs
Metrics | bleu, comet
Base Model | meta-llama/Meta-Llama-3-8B
Pipeline Tag | translation
License
The model is released under the MIT license.
Citation
@misc{luoyf2025lamate,
      title={Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation},
      author={Yingfeng Luo and Tong Zheng and Yongyu Mu and Bei Li and Qinghong Zhang and Yongqi Gao and Ziqiang Xu and Peinan Feng and Xiaoqian Liu and Tong Xiao and Jingbo Zhu},
      year={2025},
      eprint={2503.06594},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}