# Tom-Qwen-7B-Instruct
A fine-tuned 7B parameter model specialized for step-by-step instruction and conversation.
## Quick Start
Tom-Qwen-7B-Instruct is a fine-tuned model designed for efficient conversation and instruction following. You can get started quickly with the code examples below.
## Features
- **Fine-tuned Efficiency**: A fine-tuned version of Qwen/Qwen2.5-7B-Instruct, trained with the Unsloth framework and LoRA for efficient training.
- **Multiple Usage Modes**: Can be used with `unsloth`, standard `transformers`, and `llama.cpp`.
- **Quantized Versions**: Quantized GGUF versions are provided for different hardware and performance requirements.
## Installation
No specific installation steps are provided in the original model card; you can load the pretrained model directly from the Hugging Face Hub.
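As a minimal environment sketch, the dependencies can be installed with pip. The package names below are inferred from the usage examples that follow, not listed in the original card:

```bash
# Dependencies for the usage examples below (versions not pinned in the original card)
pip install unsloth             # Unsloth loader used in Basic Usage
pip install transformers torch  # standard transformers path
```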
## Usage Examples
### Basic Usage
```python
from unsloth import FastLanguageModel
import torch

# Load the model in 4-bit precision for reduced memory use
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/Tom-Qwen-7B-Instruct",
    max_seq_length=4096,
    dtype=None,  # auto-detect dtype
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable fast inference mode

# Move inputs to the model's device before generating
inputs = tokenizer(["Your prompt here"], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
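Note that this snippet feeds a raw prompt string. Since the model is instruction-tuned, results are usually better when the input is formatted with `tokenizer.apply_chat_template`, as shown in the transformers example below.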
### Advanced Usage
#### Alternative Usage (Standard Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "theprint/Tom-Qwen-7B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("theprint/Tom-Qwen-7B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Your question here"},
]
# Apply the chat template and move the input ids to the model's device
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
#### Using with llama.cpp
```bash
# Download the recommended 4-bit quantization
wget https://huggingface.co/theprint/Tom-Qwen-7B-Instruct/resolve/main/gguf/Tom-Qwen-7B-Instruct-q4_k_m.gguf

# Run inference (newer llama.cpp builds name this binary llama-cli instead of main)
./llama.cpp/main -m Tom-Qwen-7B-Instruct-q4_k_m.gguf -p "Your prompt here" -n 256
```
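The GGUF file can also be driven from Python through the llama-cpp-python bindings. This is a sketch that is not part of the original card; it assumes the q4_k_m file downloaded above and uses the bindings' chat-completion API so the model's chat template is applied:

```python
from llama_cpp import Llama

# Load the GGUF file downloaded with the wget command above
llm = Llama(model_path="Tom-Qwen-7B-Instruct-q4_k_m.gguf", n_ctx=4096)

# create_chat_completion applies the chat template stored in the GGUF metadata
result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Your question here"},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(result["choices"][0]["message"]["content"])
```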
## Documentation
### Model Details
| Property | Details |
|----------|---------|
| Developed by | theprint |
| Model Type | Causal Language Model (fine-tuned with LoRA) |
| Language | en |
| License | apache-2.0 |
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Fine-tuning method | LoRA with rank 128 |
### GGUF Quantized Versions
Quantized GGUF versions of this model are available in the `gguf/` directory for use with llama.cpp:

- `Tom-Qwen-7B-Instruct-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Tom-Qwen-7B-Instruct-q3_k_m.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Tom-Qwen-7B-Instruct-q4_k_m.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Tom-Qwen-7B-Instruct-q5_k_m.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Tom-Qwen-7B-Instruct-q6_k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Tom-Qwen-7B-Instruct-q8_0.gguf` (7723.4 MB) - 8-bit quantization (very high quality)
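If you only need a single quantization, it can be fetched with the `huggingface_hub` client instead of wget. A small sketch, not from the original card, using the repository and file names listed above:

```python
from huggingface_hub import hf_hub_download

# Download only the recommended q4_k_m file from the repo's gguf/ folder
path = hf_hub_download(
    repo_id="theprint/Tom-Qwen-7B-Instruct",
    filename="gguf/Tom-Qwen-7B-Instruct-q4_k_m.gguf",
)
print(path)  # local cache path, usable as the -m argument for llama.cpp
```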
### Intended Use
This model is intended for conversation, brainstorming, and general instruction following.
### Training Details
#### Training Data
A synthesized dataset created specifically for this model, focused on practical tips and well-being.
| Property | Details |
|----------|---------|
| Dataset | theprint/Tom-4.2k-alpaca |
| Format | alpaca |
#### Training Procedure
- Training epochs: 3
- LoRA rank: 128
- Learning rate: 0.0002
- Batch size: 4
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090
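For illustration, a minimal Unsloth LoRA setup matching these hyperparameters might look like the sketch below. Only the base model, dataset name, LoRA rank, epochs, learning rate, and batch size come from this card; the target modules, LoRA alpha, and prompt formatting are assumptions, and the trl/transformers trainer APIs vary across versions:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model in 4-bit for memory-efficient training
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-7B-Instruct",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters at rank 128, as stated in the training details
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,  # assumption; the card only states the rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumption
)

dataset = load_dataset("theprint/Tom-4.2k-alpaca", split="train")

def format_alpaca(example):
    # Assumed standard alpaca fields: instruction / input / output
    prompt = example["instruction"]
    if example.get("input"):
        prompt += "\n\n" + example["input"]
    return f"### Instruction:\n{prompt}\n\n### Response:\n{example['output']}"

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    formatting_func=format_alpaca,
    args=TrainingArguments(
        num_train_epochs=3,
        learning_rate=2e-4,
        per_device_train_batch_size=4,
        output_dir="outputs",
    ),
)
trainer.train()
```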
## Technical Details
This model is fine-tuned from Qwen/Qwen2.5-7B-Instruct using the Unsloth framework with LoRA. The LoRA rank is set to 128, which enables efficient training and adaptation. The training data is a synthesized alpaca-formatted dataset, and training ran for 3 epochs with a learning rate of 0.0002 and a batch size of 4.
## License
This model is released under the apache-2.0 license.
## Important Note
This model may hallucinate or provide incorrect information. It is not suitable for critical decision-making.
## Citation
If you use this model, please cite:
```bibtex
@misc{tom_qwen_7b_instruct,
  title={Tom-Qwen-7B-Instruct: Fine-tuned Qwen/Qwen2.5-7B-Instruct},
  author={theprint},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/theprint/Tom-Qwen-7B-Instruct}
}
```
## Acknowledgments