🚀 SILMA Kashif v1.0 (The Arabic RAG Model)
- SILMA Kashif 2B Instruct v1.0 is the first release in the SILMA Kashif model family, specifically designed for RAG (Retrieval-Augmented Generation) tasks.
- Kashif excels at answering questions based on context in both Arabic and English. Additionally, the model can perform Entity Extraction tasks as a secondary skill.
- Based on our evaluations using the SILMA RAGQA Benchmark, SILMA Kashif 2B v1.0 is the top-performing open model for RAG within the 3 - 9 billion parameter range.
- SILMA Kashif is built on Google's powerful Gemma foundation models, combining their strengths to offer strong performance for users.
- Kashif is an open-weight model, free to use under our open license.
- The model has a context length of 12K tokens.
⚠️ Important Note
Kashif is a specialized model and should ONLY be used in RAG setups. If you're looking for a general-purpose model, please refer to SILMA 9B Instruct v1.0.
✨ Features
The model has undergone intensive training to master a wide range of tasks and achieve excellent performance:
- Ability to answer questions in Arabic and English.
- Capability to handle short and long contexts.
- Capacity to provide short and long answers effectively.
- Skill to answer complex numerical questions.
- Proficiency in answering questions based on tabular data.
- Competence in answering multi-hop questions, i.e., answering a single question using data from multiple paragraphs.
- Ability to perform negative rejection: recognizing when the provided context does not contain the answer and responding with a statement such as "The answer cannot be found in the given context" (see the sketch after this list).
- Skill to handle multi-domains, answering questions based on texts from different fields such as finance, medical, legal, etc.
- Capacity to deal with ambiguous contexts.
- Ability to extract entities from text.
- Competence to handle diverse and complex prompts.
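As an illustration of negative rejection, the minimal sketch below asks a question whose answer is deliberately absent from the context. It reuses the same pipeline setup as the Basic Usage example and the English prompt format described later in this card; the sample context and question are illustrative and not taken from the training data.

```python
import torch
from transformers import pipeline

# Load the model exactly as in the Basic Usage example below.
pipe = pipeline(
    "text-generation",
    model="silma-ai/SILMA-Kashif-2B-Instruct-v1.0",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # assumes a CUDA GPU is available
)

# A context that intentionally does not contain the answer to the question.
prompt = """Answer the following question using the provided context below
Context:
The store opens at 9 AM and closes at 6 PM on weekdays.
Question: What is the store's phone number?
Answer:"""

outputs = pipe([{"role": "user", "content": prompt}], max_new_tokens=100)
print(outputs[0]["generated_text"][-1]["content"].strip())
# Expected behaviour: a refusal along the lines of
# "The answer cannot be found in the given context."
```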
📊 Model Evaluation

| Dataset | Exact Match | Rouge1 | BLEU | BERTScore |
|---|---|---|---|---|
| ragbench-finqa-en-test | 0.000 | 0.587 | 0.321 | 0.760 |
| ragbench-tatqa-ar-test | 0.000 | 0.484 | 0.130 | 0.774 |
| ragbench-tatqa-en-test | 0.059 | 0.646 | 0.423 | 0.808 |
| rag-instruct-benchmark-tester-en | 0.370 | 0.683 | 0.196 | 0.791 |
| ragbench-expertqa-en-test | 0.000 | 0.465 | 0.151 | 0.677 |
| ragbench-msmarco-ar-test | 0.000 | 0.144 | 0.096 | 0.781 |
| sciq-ar-test | 0.170 | 0.000 | 0.000 | 0.753 |
| ragbench-covidqa-en-test | 0.020 | 0.521 | 0.242 | 0.734 |
| ragbench-emanual-ar-test | 0.000 | 0.237 | 0.159 | 0.806 |
| ragbench-finqa-ar-test | 0.000 | 0.377 | 0.109 | 0.780 |
| xquad-r-validation-en | 0.120 | 0.326 | 0.041 | 0.603 |
| ragbench-emanual-en-test | 0.000 | 0.565 | 0.288 | 0.722 |
| xquad-r-ar-validation | 0.070 | 0.130 | 0.042 | 0.698 |
| boolq-ar-test | 0.450 | 0.000 | 0.000 | 0.700 |
| ragbench-hotpotqa-en-test | 0.060 | 0.732 | 0.503 | 0.837 |
| ragbench-covidqa-ar-test | 0.000 | 0.179 | 0.104 | 0.783 |
| ragbench-msmarco-en-test | 0.020 | 0.491 | 0.207 | 0.729 |
| Benchmark Average Scores | 0.079 | 0.386 | 0.177 | 0.749 |
SILMA RAGQA Benchmark Score: 0.3478
👩‍💻 SILMA AI
silma.ai is a leading GenAI startup that specializes in building and customizing cutting-edge Large Language Models (LLMs) and AI technologies for the Arabic language.
📦 Installation
First, install the Transformers library with:
pip install -U transformers
💻 Usage Examples
Basic Usage
Running with the pipeline API
import torch
from transformers import pipeline

# Load the model as a chat-style text-generation pipeline in bfloat16 on the GPU.
pipe = pipeline(
    "text-generation",
    model="silma-ai/SILMA-Kashif-2B-Instruct-v1.0",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",
)

# A RAG-style prompt in Arabic: instruction, retrieved context, then the question.
messages = [
    {"role": "user", "content":
"""
أجب على السؤال بناءً على السياق أدناه
السياق:
تشمل الاتفاقيات رسوم حمل سنوية ثابت قدها 30 مليون جنيه إسترليني للقنوات نظراً لأن كلاً من مزوديها قادرين على تأمين دفعات إضافية إذا ما حققت هذه القنوات أهدافاً متعلقةً بالأداء.
لا يوجد حالياً ما يشير إلى ما إذا كان الاتفاق الجديد يشمل محتوىً إضافياً كالفيديو عند الطلب والدقة العالية ، كذلك الذي سبق أن قدمته بي سكاي بي.
وقد وافقت كل من بي سكاي بي و فيرجين ميديا على إنهاء الدعاوى القضائية بالمحكمة العليا ضد بعضهما بشأن معاليم الحمل التي تخص قنواتهما الأساسية.
السؤال: ماسم الشركة التي وافقت على إنهاء دعواها القضائية ضد بي سكاي بي بالمحكمة العليا؟
الإجابة:
"""},
]

# Generate and extract the assistant's reply from the chat-formatted output.
outputs = pipe(messages, max_new_tokens=600)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
فيرجين ميديا
"وقد وافقت كل من بي سكاي بي و فيرجين ميديا على إنهاء الدعاوى القضائية بالمحكمة العليا ضد بعضهما بشأن معاليم الحمل التي تخص قنواتهما الأساسية."
💡 Usage Tip
For advanced usage examples such as multi-GPU inference, quantization, or chat templates, please refer to the SILMA v1.0 examples.
Running with Ollama
ollama run hf.co/silma-ai/SILMA-Kashif-2B-Instruct-v1.0-GGUF
Prompt Format
Here is a recommended way to prompt the model. You can adapt the prompt to your specific requirements, but if you run into issues, the format below, which was used during training, may help:
أجب على السؤال بناءً على السياق أدناه
السياق:
.....
.....
السؤال: ...
الإجابة: ...
Answer the following question using the provided context below
Context:
.....
.....
Question: ...
Answer: ...
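As a minimal sketch of how this format might be assembled programmatically, the helper below joins retrieved passages into the English template above; the function and variable names (`build_prompt`, `context_chunks`) are illustrative and not part of the model's API.

```python
def build_prompt(context_chunks: list[str], question: str) -> str:
    """Assemble the English prompt template documented above."""
    context = "\n".join(context_chunks)
    return (
        "Answer the following question using the provided context below\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Example with a single retrieved passage; in practice the chunks come from your retriever.
context_chunks = [
    "SILMA Kashif 2B Instruct v1.0 is a RAG-focused model with a 12K context length.",
]
print(build_prompt(context_chunks, "What is the context length of the model?"))
```

The resulting string can then be passed as the user message content to the pipeline shown in the Basic Usage example.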
GPU Requirements
The following are the minimum/recommended GPU requirements for running inference:
- Recommended:
- At least one GPU with a minimum of 24 GB of GPU memory.
- Examples: Nvidia RTX 4090.
- Minimum:
- At least one GPU with 8 GB of GPU memory.
- Examples: Nvidia RTX 3070, RTX 3080 or T4.
🔧 Effect of Quantization
We have observed a 2.6% drop in score (to 0.338) for the same model quantized to 4-bit.
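The exact quantization recipe behind this number is not specified here. As a rough sketch only, one common way to run the model in 4-bit is via bitsandbytes through Transformers (requires the bitsandbytes package); the specific BitsAndBytesConfig settings below are assumptions, not necessarily the configuration used for the reported score.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

model_id = "silma-ai/SILMA-Kashif-2B-Instruct-v1.0"

# 4-bit NF4 quantization with bfloat16 compute (an assumed, typical setup).
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
```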
📄 License
The model is released under the Gemma license.
📚 Citation
@misc{silma-kashif-2b-2024,
  author = {{SILMA-AI}},
  title = {SILMA Kashif 2B Instruct v1.0},
  year = {2025},
  howpublished = {\url{https://huggingface.co/silma-ai/SILMA-Kashif-2B-Instruct-v1.0}}
}
📋 Intended Usage
- The model should only be used in question-answering use-cases such as RAG.
- The model can also be used to extract entities from text (see the sketch below).
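The card does not document a dedicated prompt format for entity extraction, so the instruction wording below is a hypothetical example built on the same pipeline as in the Usage Examples; adjust it to your own needs.

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="silma-ai/SILMA-Kashif-2B-Instruct-v1.0",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # assumes a CUDA GPU is available
)

# Hypothetical entity-extraction instruction; not an official prompt format.
prompt = """Extract the named entities (people, organizations, locations) from the text below.
Text:
BSkyB and Virgin Media agreed to end their High Court lawsuits over the carriage fees for their basic channels.
Entities:"""

outputs = pipe([{"role": "user", "content": prompt}], max_new_tokens=200)
print(outputs[0]["generated_text"][-1]["content"].strip())
```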
⚠️ Limitations
- Due to its relatively small number of parameters, the model is not well suited to complex numerical and financial reasoning, such as tricky calculations.
- The model has been trained specifically for text-based question answering, which may limit its ability to perform tasks outside this scope, even simple ones.