---
library_name: transformers
license: cc-by-nc-4.0
base_model: atlasia/Terjman-Nano-v1
metrics:
- bleu
- chrf
- ter
model-index:
- name: Terjman-Nano-v2.0
  results: []
datasets:
- BounharAbdelaziz/Terjman-v2-English-Darija-Dataset-350K
language:
- ary
- en
pipeline_tag: translation
---
# 🇲🇦 Terjman-Nano-v2.0 (77M) 🚀

Terjman-Nano-v2.0 is an improved version of atlasia/Terjman-Nano-v1, built on the Transformer architecture and fine-tuned for high-quality, accurate English to Moroccan Darija translation.

This version has been trained on a larger and more refined dataset than its predecessor, leading to improved translation performance. It achieves results on par with gpt-4o-2024-08-06 on TerjamaBench, an evaluation benchmark for English-Moroccan Darija translation models that emphasizes culturally specific content.
## 🚀 Features

✅ Fine-tuned for English → Moroccan Darija translation.
✅ State-of-the-art performance among open-source models.
✅ Compatible with 🤗 Transformers and easily deployable on various hardware setups.
## 🔥 Performance Comparison

The following table compares Terjman-Nano-v2.0 against proprietary and open-source models on TerjamaBench using BLEU, chrF, and TER scores. Higher BLEU/chrF and lower TER indicate better translation quality. A sketch of how these metrics can be computed follows the table.
| Model | Size | BLEU↑ | chrF↑ | TER↓ |
|---|---|---|---|---|
| **Proprietary Models** | | | | |
| gemini-exp-1206 | * | 30.69 | 54.16 | 67.62 |
| claude-3-5-sonnet-20241022 | * | 30.51 | 51.80 | 67.42 |
| gpt-4o-2024-08-06 | * | 28.30 | 50.13 | 71.77 |
| **Open-Source Models** | | | | |
| Terjman-Ultra-v2.0 | 1.3B | 25.00 | 44.70 | 77.20 |
| Terjman-Supreme-v2.0 | 3.3B | 23.43 | 44.57 | 78.17 |
| Terjman-Large-v2.0 | 240M | 22.67 | 42.57 | 83.00 |
| **Terjman-Nano-v2.0 (This model)** | 77M | 18.84 | 38.41 | 94.73 |
| atlasia/Terjman-Large-v1.2 | 240M | 16.33 | 37.10 | 89.13 |
| MBZUAI-Paris/Atlas-Chat-9B | 9B | 14.80 | 35.26 | 93.95 |
| facebook/nllb-200-3.3B | 3.3B | 14.76 | 34.17 | 94.33 |
| atlasia/Terjman-Nano | 77M | 09.98 | 26.55 | 106.49 |
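For reference, here is a minimal sketch of how BLEU, chrF, and TER can be computed with the sacrebleu library. The sentence pairs are placeholders and this is not the exact TerjamaBench evaluation script, just an illustration of the metrics.

```python
# Minimal sketch: computing BLEU, chrF, and TER with sacrebleu (pip install sacrebleu).
# The hypothesis/reference pairs are placeholders, not benchmark data.
import sacrebleu

hypotheses = ["model translation 1", "model translation 2"]            # system outputs
references = [["reference translation 1", "reference translation 2"]]  # one list per reference set

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
ter = sacrebleu.corpus_ter(hypotheses, references)

print(f"BLEU: {bleu.score:.2f}  chrF: {chrf.score:.2f}  TER: {ter.score:.2f}")
```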
## 🔬 Model Details

- Base Model: atlasia/Terjman-Nano-v1
- Architecture: Transformer-based sequence-to-sequence model
- Training Data: A large, refined corpus of high-quality English-Moroccan Darija parallel translations
- Precision: FP16, for efficient training and inference
## 🚀 How to Use

You can use the model with the Hugging Face Transformers library:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "BounharAbdelaziz/Terjman-Nano-v2.0"

# Load the tokenizer and the fine-tuned seq2seq model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def translate(text):
    """Translate an English sentence into Moroccan Darija."""
    inputs = tokenizer(text, return_tensors="pt")
    output = model.generate(**inputs)
    return tokenizer.decode(output[0], skip_special_tokens=True)

text = "Hello there! Today the weather is so nice in Geneva, couldn't ask for more to enjoy the holidays :)"
translation = translate(text)
print("Translation:", translation)
```
## 🖥️ Deployment

### Run in a Hugging Face Space

Try the model interactively in the Terjman-Nano Space 🤗

### Use with Text Generation Inference (TGI)

For fast inference, use Hugging Face TGI:

```bash
pip install text-generation
text-generation-launcher --model-id BounharAbdelaziz/Terjman-Nano-v2.0
```
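Once the TGI endpoint is running, it can be queried from Python with the `text-generation` client. The address below is an assumption; point it at the host and port your TGI server actually listens on.

```python
# Sketch: querying a running TGI endpoint with the text-generation client.
# The URL is a placeholder for wherever your server is exposed.
from text_generation import Client

client = Client("http://127.0.0.1:8080")
response = client.generate("Hello there!", max_new_tokens=128)
print(response.generated_text)
```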
### Run Locally with Transformers & PyTorch

```bash
pip install transformers torch
python -c "from transformers import pipeline; print(pipeline('translation', model='BounharAbdelaziz/Terjman-Nano-v2.0')('Hello there!'))"
```
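The same call, written as a short script for readability (the behavior is identical to the one-liner above):

```python
# Script form of the one-liner above, using the translation pipeline.
from transformers import pipeline

translator = pipeline("translation", model="BounharAbdelaziz/Terjman-Nano-v2.0")
result = translator("Hello there!")
print(result[0]["translation_text"])
```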
### Deploy on an API Server

Use FastAPI to serve translations as an API:

```python
from fastapi import FastAPI
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

app = FastAPI()

model_name = "BounharAbdelaziz/Terjman-Nano-v2.0"

# Load the model once at startup so every request reuses the same weights
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

@app.get("/translate/")
def translate(text: str):
    """Translate an English query string into Moroccan Darija."""
    inputs = tokenizer(text, return_tensors="pt")
    output = model.generate(**inputs)
    return {"translation": tokenizer.decode(output[0], skip_special_tokens=True)}
```
## 🛠️ Training Details & Hyperparameters

The model was fine-tuned using the following training settings:

- Learning Rate: 0.0001
- Training Batch Size: 64
- Evaluation Batch Size: 64
- Seed: 42
- Gradient Accumulation Steps: 4
- Total Effective Batch Size: 256
- Optimizer: AdamW (Torch) with betas=(0.9, 0.999), epsilon=1e-08
- Learning Rate Scheduler: Linear
- Warmup Ratio: 0.1
- Epochs: 5
- Precision: Mixed FP16 for efficient training
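For reference, here is a minimal sketch of how these settings might map onto 🤗 Transformers `Seq2SeqTrainingArguments`. This is an assumption about the setup rather than the authors' actual training script; the output directory is a placeholder and data handling is omitted.

```python
# Hypothetical mapping of the reported hyperparameters to Seq2SeqTrainingArguments.
# Not the authors' training script; "./terjman-nano-v2" is a placeholder path.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./terjman-nano-v2",
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,   # effective batch size: 64 * 4 = 256
    num_train_epochs=5,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    optim="adamw_torch",             # AdamW, betas=(0.9, 0.999), eps=1e-08
    fp16=True,                       # mixed-precision training
    seed=42,
)
```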
### Framework versions

- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0
## 📜 License

This model is released under the CC BY-NC 4.0 (Creative Commons Attribution-NonCommercial) license: it may be used for research and personal projects, but not for commercial purposes. For commercial use, please get in touch :)
## 📖 Citation

```bibtex
@misc{terjman-v2,
  title = {Terjman-v2: High-Quality English-Moroccan Darija Translation Model},
  author = {Abdelaziz Bounhar},
  year = {2025},
  howpublished = {\url{https://huggingface.co/BounharAbdelaziz/Terjman-Nano-v2.0}},
  license = {CC BY-NC}
}
```