🚀 Mizan-Rerank-v1
A revolutionary open-source model for reranking Arabic long texts with exceptional efficiency and accuracy.

🚀 Quick Start
Mizan-Rerank-v1 is an open-source model based on the Transformer architecture, designed specifically for reranking search results in Arabic text. With only 149 million parameters, it balances performance and efficiency, matching or outperforming larger models on the benchmarks below while using significantly fewer resources.
✨ Features
- Lightweight & Efficient: 149M parameters vs. competitors with 278M to 568M parameters
- Long Text Processing: Handles up to 8192 tokens with sliding window technique
- High-Speed Inference: 3x faster than comparable models
- Arabic Language Optimization: Specifically fine-tuned for Arabic language nuances
- Resource Efficient: 75% less memory consumption than competitors
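The card does not say whether "sliding window" refers to attention inside the model or to chunking of the input. Given the ModernBERT citation further down, it presumably refers to local sliding-window attention, where each token attends only to nearby tokens in some layers. The sketch below illustrates that idea with a toy mask; the function name `sliding_window_mask` and the tiny `window` value are assumptions for illustration, not the model's actual configuration.

```python
def sliding_window_mask(seq_len, window):
    """Return a seq_len x seq_len boolean matrix.

    mask[i][j] is True when token i may attend to token j, i.e. when
    the two positions are at most window // 2 apart.
    """
    half = window // 2
    return [[abs(i - j) <= half for j in range(seq_len)] for i in range(seq_len)]


# Toy sizes so the pattern is visible; real models use far larger values.
mask = sliding_window_mask(seq_len=6, window=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

# Local attention touches far fewer token pairs than full attention.
allowed_pairs = sum(sum(row) for row in mask)
print(f"{allowed_pairs} of {6 * 6} pairs are attended")
```

Because the number of attended pairs grows linearly with sequence length rather than quadratically, this kind of masking is one way long inputs can stay affordable.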
📊 Performance Benchmarks
Hardware Performance (RTX 4090 24GB)
| Model | RAM Usage |
|---|---|
| Mizan-Rerank-v1 | 1 GB |
| bge-reranker-v2-m3 | 4 GB |
| jina-reranker-v2-base-multilingual | 2.5 GB |
MIRACL Dataset Results (ndcg@10)
| Model | Score |
|---|---|
| Mizan-Rerank-v1 | 0.8865 |
| bge-reranker-v2-m3 | 0.8863 |
| jina-reranker-v2-base-multilingual | 0.8481 |
| Namaa-ARA-Reranker-V1 | 0.7941 |
| Namaa-Reranker-v1 | 0.7176 |
| ms-marco-MiniLM-L12-v2 | 0.1750 |
Reranking and Triplet Datasets (ndcg@10)
| Model | Reranking Dataset | Triplet Dataset |
|---|---|---|
| Mizan-Rerank-v1 | 1.0000 | 1.0000 |
| bge-reranker-v2-m3 | 1.0000 | 0.9998 |
| jina-reranker-v2-base-multilingual | 1.0000 | 1.0000 |
| Namaa-ARA-Reranker-V1 | 1.0000 | 0.9989 |
| Namaa-Reranker-v1 | 1.0000 | 0.9994 |
| ms-marco-MiniLM-L12-v2 | 0.8906 | 0.9087 |
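For readers unfamiliar with the metric, the following is a minimal sketch of how ndcg@10 can be computed for a single query. It uses the linear-gain form of DCG and takes the ideal ranking from the same candidate list; the card does not state which variant or evaluation tool produced the numbers above, so treat this as an illustration of the metric rather than the evaluation script. The relevance labels are hypothetical.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k items, in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical labels (1 = relevant, 0 = not), in the order a reranker returned them.
print(ndcg_at_k([1, 0, 0]))  # relevant passage ranked first -> 1.0
print(ndcg_at_k([0, 0, 1]))  # relevant passage ranked third -> 0.5
```

A score of 1.0000 therefore means every query's relevant passages were placed ahead of all irrelevant ones within the top 10.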
🔧 Technical Details
Mizan-Rerank-v1 was trained on a diverse corpus of 741,159,981 tokens from:
- Authentic Arabic open-source datasets
- Manually crafted and processed text
- Purpose-generated synthetic data
This comprehensive training approach enables deep understanding of Arabic linguistic contexts.
📚 Documentation
How It Works
- Query reception: The model receives a user query and candidate texts
- Content analysis: Analyzes semantic relationships between query and each text
- Relevance scoring: Assigns a relevance score to each text
- Reranking: Sorts results by descending relevance score
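The four steps above can be sketched as a small function that accepts any scoring callable. The names `rerank`, `score_fn`, `top_k`, and the toy `word_overlap` scorer are hypothetical and exist only so the sketch runs without downloading the model; in practice the scorer would be a call into Mizan-Rerank-v1.

```python
def rerank(query, passages, score_fn, top_k=None):
    """Score each passage against the query and sort by descending score."""
    scored = [(passage, score_fn(query, passage)) for passage in passages]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


def word_overlap(query, passage):
    """Toy stand-in scorer: fraction of query words that appear in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words) / max(len(query_words), 1)


results = rerank(
    "benefits of vitamin d",
    ["the government launched a project", "vitamin d supports bone health"],
    word_overlap,
)
for passage, score in results:
    print(f"{score:.2f}  {passage}")
```

Keeping the scorer as a parameter means the sorting logic stays the same whether scores come from a toy heuristic or from the model.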
💻 Usage Examples
Basic Usage
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the reranker and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained("ALJIACHI/Mizan-Rerank-v1")
tokenizer = AutoTokenizer.from_pretrained("ALJIACHI/Mizan-Rerank-v1")
model.eval()  # inference mode

def get_relevance_score(query, passage):
    # Encode the query and passage together as one input pair
    inputs = tokenizer(query, passage, return_tensors="pt", padding=True, truncation=True, max_length=8192)
    with torch.no_grad():  # no gradients are needed for scoring
        outputs = model(**inputs)
    return outputs.logits.item()

query = "ما هو تفسير الآية وجعلنا من الماء كل شيء حي"
passages = [
    "تعني الآية أن الماء هو عنصر أساسي في حياة جميع الكائنات الحية، وهو ضروري لاستمرار الحياة.",
    "تم اكتشاف كواكب خارج المجموعة الشمسية تحتوي على مياه متجمدة.",
    "تحدث القرآن الكريم عن البرق والرعد في عدة مواضع مختلفة."
]

# Score every passage, then sort by descending relevance
scores = [(passage, get_relevance_score(query, passage)) for passage in passages]
reranked_passages = sorted(scores, key=lambda x: x[1], reverse=True)
for passage, score in reranked_passages:
    print(f"Score: {score:.4f} | Passage: {passage}")
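The function above returns a raw logit, while the scores in the practical examples lie between 0 and 1. The card does not say how those scores were produced. One plausible reading, assumed here, is that a logistic sigmoid is applied to a single-logit output. The helper name `logit_to_probability` and the sample logits are hypothetical.

```python
import math


def logit_to_probability(logit):
    """Map a raw logit to the range 0..1 with a numerically stable sigmoid."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    # For negative inputs, use the equivalent form that avoids overflow in exp().
    z = math.exp(logit)
    return z / (1.0 + z)


# Hypothetical logits, not actual model outputs.
for logit in (-9.0, 0.0, 7.0):
    print(f"logit {logit:+.1f} -> score {logit_to_probability(logit):.4f}")
```

The sigmoid is monotonic, so applying it changes the scale of the scores but not the order of the reranked passages.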
Practical Examples
Example 1
Question: What is the new tax law in 2024?
| Text | Score |
|---|---|
| The official newspaper published a new law in 2024 stating a 5% increase in taxes on large companies. | 0.9989 |
| Taxes are an important source of national income and their rates vary from country to country. | 0.0001 |
| The government launched a new renewable energy project in 2024. | 0.0001 |
Example 2
Question: What is the interpretation of the verse "And We made from water every living thing"?
| Text | Score |
|---|---|
| The verse means that water is an essential element in the life of all living things and is necessary for the continuation of life. | 0.9996 |
| Planets outside the solar system containing frozen water have been discovered. | 0.0000 |
| The Holy Quran mentions lightning and thunder in several different places. | 0.0000 |
Example 3
Question: What are the benefits of vitamin D?
| Text | Score |
|---|---|
| Vitamin D helps strengthen bone health and the immune system and plays an important role in calcium absorption. | 0.9991 |
| Vitamin D is used as a preservative in some food industries. | 0.9941 |
| Vitamin D can be obtained through sun exposure or by taking dietary supplements. | 0.9938 |
📈 Applications
Mizan-Rerank-v1 opens new horizons for Arabic NLP applications:
- Specialized Arabic search engines
- Archiving systems and digital libraries
- Conversational AI applications
- E-learning platforms
- Information retrieval systems
📝 Citation
If you use Mizan-Rerank-v1 in your research, please cite:
@software{Mizan_Rerank_v1_2025,
  author = {Ali Aljiachi},
  title = {Mizan-Rerank-v1: A Revolutionary Arabic Text Reranking Model},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Aljiachi/Mizan-Rerank-v1}
}
@misc{modernbert,
  title={Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference},
  author={Benjamin Warner and Antoine Chaffin and Benjamin Clavié and Orion Weller and Oskar Hallström and Said Taghadouini and Alexis Gallagher and Raja Biswas and Faisal Ladhak and Tom Aarsen and Nathan Cooper and Griffin Adams and Jeremy Howard and Iacopo Poli},
  year={2024},
  eprint={2412.13663},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2412.13663},
}
📄 License
We release the Mizan-Rerank model weights under the Apache 2.0 license.