# 🚀 Bee1reason-arabic-Qwen-14B: A Qwen3 14B Model Fine-tuned for Arabic Logical Reasoning
Bee1reason-arabic-Qwen-14B is a Large Language Model (LLM) fine-tuned from the `unsloth/Qwen3-14B` base model (Unsloth's build of `Qwen/Qwen3-14B`). It is designed to enhance logical and deductive reasoning capabilities in Arabic while retaining general conversational skills.
## ✨ Features

- **Built on `unsloth/Qwen3-14B`**: Leverages the power and performance of the Qwen3 14-billion-parameter base model.
- **Fine-tuned for Arabic logical reasoning**: Trained on a dataset of Arabic logical reasoning tasks.
- **Conversational format**: Expects `user` and `assistant` roles. Training data may include "thinking steps" (often within `<think>...</think>` tags) before the final answer, which is beneficial for tasks requiring explanation or complex inference.
- **Unsloth efficiency**: Fine-tuned with the Unsloth library, enabling faster training and reduced GPU memory consumption.
- **Merged 16-bit model**: The final weights are a full float16-precision model, ready for direct use without applying LoRA adapters to a separate base model.
## 📦 Installation

To serve the model with vLLM:

```bash
pip install vllm
```

(vLLM installation might have specific CUDA and PyTorch version requirements. Refer to the vLLM documentation for the latest installation prerequisites.)
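The Transformers-based examples below additionally need the Hugging Face stack; `accelerate` is required for `device_map="auto"`:

```bash
pip install torch transformers accelerate
```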
## 💻 Usage Examples

### Basic Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
import torch

model_id = "beetlware/Bee1reason-arabic-Qwen-14B"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # or torch.float16 if bfloat16 is not supported
    device_map="auto",           # distributes the model across available devices (GPU/CPU)
)

# Ensure the model is in evaluation mode for inference
model.eval()
```
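As a quick sanity check after loading, you can run a short generation end to end (this snippet is not from the original card; the greeting prompt is just an illustration):

```python
# Minimal end-to-end check: format a chat prompt, generate, and decode
messages = [{"role": "user", "content": "مرحباً، من أنت؟"}]  # "Hello, who are you?"
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens (everything after the prompt)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```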
### Advanced Usage

#### Inference with Thinking Steps

```python
# "Use step-by-step logical thinking: If I have 4 apples and the tree has
# 20 apples, how many apples do I have in total?"
user_prompt_with_thinking_request = "استخدم التفكير المنطقي خطوة بخطوة: إذا كان لدي 4 تفاحات والشجرة فيها 20 تفاحة، فكم تفاحة لدي إجمالاً؟"

messages_with_thinking = [
    {"role": "user", "content": user_prompt_with_thinking_request}
]

# Qwen3 uses a specific chat template; tokenizer.apply_chat_template is the
# correct way to format it.
chat_prompt_with_thinking = tokenizer.apply_chat_template(
    messages_with_thinking,
    tokenize=False,
    add_generation_prompt=True,  # important: adds the assistant's generation prompt
)
inputs_with_thinking = tokenizer(chat_prompt_with_thinking, return_tensors="pt").to(model.device)

print("\n--- Inference with Thinking Request (Example) ---")
streamer_think = TextStreamer(tokenizer, skip_prompt=True)

with torch.no_grad():  # important: disable gradients during inference
    outputs_think = model.generate(
        **inputs_with_thinking,
        max_new_tokens=512,
        temperature=0.6,  # settings recommended by the Qwen team for reasoning
        top_p=0.95,
        top_k=20,
        pad_token_id=tokenizer.eos_token_id,
        streamer=streamer_think,
    )
```
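The streamer already prints the output as it is generated, but you can also post-process the decoded text. A minimal sketch for separating the thinking steps from the final answer, assuming the model wraps its reasoning in `<think>...</think>` tags as described above:

```python
import re

# Decode without skipping special tokens so the <think> markers survive
generated = tokenizer.decode(
    outputs_think[0][inputs_with_thinking["input_ids"].shape[-1]:],
    skip_special_tokens=False,
)

# Split the reasoning block from the final answer
match = re.search(r"<think>(.*?)</think>", generated, flags=re.DOTALL)
thinking_steps = match.group(1).strip() if match else ""
final_answer = re.sub(r"<think>.*?</think>", "", generated, flags=re.DOTALL).strip()

print("Thinking steps:", thinking_steps)
print("Final answer:", final_answer)
```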
#### Normal Inference

```python
# Example of normal inference (conversation without an explicit thinking request)
user_prompt_normal = "ما هي عاصمة مصر؟"  # "What is the capital of Egypt?"

messages_normal = [
    {"role": "user", "content": user_prompt_normal}
]

chat_prompt_normal = tokenizer.apply_chat_template(
    messages_normal,
    tokenize=False,
    add_generation_prompt=True,
)
inputs_normal = tokenizer(chat_prompt_normal, return_tensors="pt").to(model.device)

print("\n\n--- Normal Inference (Example) ---")
streamer_normal = TextStreamer(tokenizer, skip_prompt=True)

with torch.no_grad():
    outputs_normal = model.generate(
        **inputs_normal,
        max_new_tokens=100,
        temperature=0.7,  # settings recommended for normal chat
        top_p=0.8,
        top_k=20,
        pad_token_id=tokenizer.eos_token_id,
        streamer=streamer_normal,
    )
```
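Qwen3's upstream chat template also accepts an `enable_thinking` flag. Assuming this fine-tune retained that template (an assumption; check the tokenizer's `chat_template` to confirm), you can suppress the thinking block explicitly:

```python
# Assumption: the upstream Qwen3 chat template, which supports enable_thinking,
# was retained by this fine-tune.
chat_prompt_no_think = tokenizer.apply_chat_template(
    messages_normal,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # suppress the <think>...</think> block
)
```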
### Usage with vLLM

Start an OpenAI-compatible API server:

```bash
python -m vllm.entrypoints.openai.api_server \
    --model beetlware/Bee1reason-arabic-Qwen-14B \
    --tokenizer beetlware/Bee1reason-arabic-Qwen-14B \
    --dtype bfloat16 \
    --max-model-len 2048
    # Optional flags:
    #   --tensor-parallel-size N      # if you have multiple GPUs
    #   --gpu-memory-utilization 0.9  # to adjust GPU memory usage
```
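Once the server is up (port 8000 by default), you can verify that the model is being served via the standard OpenAI-compatible endpoint:

```bash
curl http://localhost:8000/v1/models
```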
Then query the server with the OpenAI Python client:

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM server address
    api_key="dummy_key",  # vLLM doesn't require an actual API key by default
)

completion = client.chat.completions.create(
    model="beetlware/Bee1reason-arabic-Qwen-14B",  # model name as registered in vLLM
    messages=[
        # "Explain the theory of general relativity in simple terms."
        {"role": "user", "content": "اشرح نظرية النسبية العامة بكلمات بسيطة."}
    ],
    max_tokens=256,
    temperature=0.7,
    stream=True,  # enable streaming
)

print("Streaming response from vLLM:")
full_response = ""
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        token = chunk.choices[0].delta.content
        print(token, end="", flush=True)
        full_response += token
print("\n--- End of stream ---")
```
## 📚 Documentation

### Training Data

The model was primarily fine-tuned on a custom Arabic logical reasoning dataset, `beetlware/arabic-reasoning-dataset-logic`, available on the Hugging Face Hub. The dataset covers several types of reasoning tasks (deduction, induction, abduction); each task comprises the question text, a proposed answer, and a detailed solution including thinking steps.

The data was converted into a conversational format for training, typically with:

- **User role**: the problem/question text.
- **Assistant role**: the detailed solution, including thinking steps (often within `<think>...</think>` tags) followed by the final answer.
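For illustration only, a converted record might look like the following (a hypothetical example, not an actual row from the dataset; the `conversations` key is likewise illustrative):

```python
# Hypothetical training record in the conversational format described above
example = {
    "conversations": [
        {
            "role": "user",
            # "If I have 4 apples and the tree has 20 apples, how many apples do I have in total?"
            "content": "إذا كان لدي 4 تفاحات والشجرة فيها 20 تفاحة، فكم تفاحة لدي إجمالاً؟",
        },
        {
            "role": "assistant",
            # Thinking steps inside <think>...</think>, then the final answer
            "content": "<think>تفاح الشجرة ليس ملكي؛ ما أملكه هو 4 تفاحات فقط.</think> لديك 4 تفاحات.",
        },
    ]
}
```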
### Fine-tuning Details

- **Base model**: `unsloth/Qwen3-14B`
- **Fine-tuning technique**: LoRA (Low-Rank Adaptation)
  - `r` (rank): 32
  - `lora_alpha`: 32
  - `target_modules`: `["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]`
  - `lora_dropout`: 0
  - `bias`: "none"
- **Libraries used**: Unsloth (for efficient model loading and PEFT application) and Hugging Face TRL (`SFTTrainer`)
- **Max sequence length** (`max_seq_length`): 2048 tokens
- **Training parameters** (example from the notebook):
  - `per_device_train_batch_size`: 2
  - `gradient_accumulation_steps`: 4 (simulating a total batch size of 8)
  - `warmup_steps`: 5
  - `max_steps`: 30 (in the notebook; adjustable for a full run)
  - `learning_rate`: 2e-4 (recommended to reduce to 2e-5 for longer training runs)
  - `optim`: "adamw_8bit"
- **Final save**: LoRA weights were merged with the base model and saved in `merged_16bit` (float16) precision. An illustrative Unsloth/TRL setup reflecting these settings is sketched after this list.
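A minimal sketch of what this setup looks like with Unsloth and TRL, using the hyperparameters listed above (illustrative only; the actual training notebook may differ, and `train_dataset` is a placeholder for the formatted conversational dataset):

```python
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

# Load the base model with Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-14B",
    max_seq_length=2048,
)

# Attach LoRA adapters with the hyperparameters from this card
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0,
    bias="none",
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,  # placeholder: the formatted conversational dataset
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=30,
        learning_rate=2e-4,
        optim="adamw_8bit",
    ),
)
trainer.train()

# Merge the LoRA weights into the base model and save in float16
model.save_pretrained_merged("Bee1reason-arabic-Qwen-14B", tokenizer, save_method="merged_16bit")
```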
## 🔧 Technical Details

The model's performance depends heavily on the quality and diversity of its training data, and it may exhibit biases present in that data. Despite the fine-tuning for logical reasoning, it can still make errors on very complex or unfamiliar reasoning tasks, and it may "hallucinate" or produce incorrect information, especially on topics not well covered during training. Because fine-tuning focused primarily on Arabic, capabilities in other languages may be limited.
## 📄 License

This model is released under the Apache 2.0 license.
## Additional Information

| Property | Details |
|---|---|
| Developed by | loai abdalslam (Beetleware) |
| Upload/Release Date | 21-5-2025 |
| Contact / Issue Reporting | loai.abdalsalm@beetleware.com |
### Beetleware

We are a software house and digital transformation service provider that was founded six years ago and is based in Saudi Arabia. All rights reserved © 2025.

#### Our Offices

- **KSA Office**
  - Phone: (+966) 54 597 3282
  - Email: ahmed.taha@beetleware.com
- **Egypt Office**
  - Phone: (+2) 010 67 256 306
  - Email: ahmed.abullah@beetleware.com
- **Oman Office**
  - Phone: (+968) 9522 8632
## Uploaded Model

- **Developed by**: beetlware AI Team
- **License**: apache-2.0
- **Fine-tuned from model**: `unsloth/qwen3-14b-unsloth-bnb-4bit`

This Qwen3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

