# Tom-Qwen-7B-Instruct
A fine-tuned 7B parameter model specialized for step-by-step instruction and conversation.
## Quick Start
Tom-Qwen-7B-Instruct is a fine-tuned model designed for efficient conversation and instruction following. You can get started quickly with the code examples below.
## Features
- **Fine-tuned Efficiency**: A fine-tuned version of Qwen/Qwen2.5-7B-Instruct, trained with the Unsloth framework and LoRA for efficient training.
- **Multiple Usage Modes**: Can be used with `unsloth`, standard `transformers`, and `llama.cpp`.
- **Quantized Versions**: Quantized GGUF versions are provided for different hardware and performance requirements.
## Installation
No specific installation steps are provided in the original model card; you can load the pretrained model directly from the Hugging Face Hub.
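As a minimal environment sketch, the dependencies can be installed with pip. The package names below are inferred from the usage examples that follow, not listed in the original card:

```bash
# Dependencies for the usage examples below (versions not pinned in the original card)
pip install unsloth             # Unsloth loader used in Basic Usage
pip install transformers torch  # standard transformers path
```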
## Usage Examples
### Basic Usage
```python
from unsloth import FastLanguageModel
import torch

# Load the model in 4-bit precision for reduced memory use
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/Tom-Qwen-7B-Instruct",
    max_seq_length=4096,
    dtype=None,  # auto-detect dtype
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable fast inference mode

# Move inputs to the model's device before generating
inputs = tokenizer(["Your prompt here"], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
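Note that this snippet feeds a raw prompt string. Since the model is instruction-tuned, results are usually better when the input is formatted with `tokenizer.apply_chat_template`, as shown in the transformers example below.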
### Advanced Usage
#### Alternative Usage (Standard Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "theprint/Tom-Qwen-7B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("theprint/Tom-Qwen-7B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Your question here"},
]
# Apply the chat template and move the input ids to the model's device
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
#### Using with llama.cpp
```bash
# Download the recommended 4-bit quantization
wget https://huggingface.co/theprint/Tom-Qwen-7B-Instruct/resolve/main/gguf/Tom-Qwen-7B-Instruct-q4_k_m.gguf

# Run inference (newer llama.cpp builds name this binary llama-cli instead of main)
./llama.cpp/main -m Tom-Qwen-7B-Instruct-q4_k_m.gguf -p "Your prompt here" -n 256
```
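The GGUF file can also be driven from Python through the llama-cpp-python bindings. This is a sketch that is not part of the original card; it assumes the q4_k_m file downloaded above and uses the bindings' chat-completion API so the model's chat template is applied:

```python
from llama_cpp import Llama

# Load the GGUF file downloaded with the wget command above
llm = Llama(model_path="Tom-Qwen-7B-Instruct-q4_k_m.gguf", n_ctx=4096)

# create_chat_completion applies the chat template stored in the GGUF metadata
result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Your question here"},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(result["choices"][0]["message"]["content"])
```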
## Documentation
### Model Details
| Property | Details |
|----------|---------|
| Developed by | theprint |
| Model Type | Causal Language Model (fine-tuned with LoRA) |
| Language | en |
| License | apache-2.0 |
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Fine-tuning method | LoRA with rank 128 |
### GGUF Quantized Versions
Quantized GGUF versions of this model are available in the `gguf/` directory for use with llama.cpp:

- `Tom-Qwen-7B-Instruct-f16.gguf` (14531.9 MB) - 16-bit float (original precision, largest file)
- `Tom-Qwen-7B-Instruct-q3_k_m.gguf` (3632.0 MB) - 3-bit quantization (medium quality)
- `Tom-Qwen-7B-Instruct-q4_k_m.gguf` (4466.1 MB) - 4-bit quantization (medium, recommended for most use cases)
- `Tom-Qwen-7B-Instruct-q5_k_m.gguf` (5192.6 MB) - 5-bit quantization (medium, good quality)
- `Tom-Qwen-7B-Instruct-q6_k.gguf` (5964.5 MB) - 6-bit quantization (high quality)
- `Tom-Qwen-7B-Instruct-q8_0.gguf` (7723.4 MB) - 8-bit quantization (very high quality)
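If you only need a single quantization, it can be fetched with the `huggingface_hub` client instead of wget. A small sketch, not from the original card, using the repository and file names listed above:

```python
from huggingface_hub import hf_hub_download

# Download only the recommended q4_k_m file from the repo's gguf/ folder
path = hf_hub_download(
    repo_id="theprint/Tom-Qwen-7B-Instruct",
    filename="gguf/Tom-Qwen-7B-Instruct-q4_k_m.gguf",
)
print(path)  # local cache path, usable as the -m argument for llama.cpp
```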
### Intended Use
This model is intended for conversation, brainstorming, and general instruction following.
### Training Details
#### Training Data
A synthesized dataset created specifically for this model, focused on practical tips and well-being.
| Property | Details |
|----------|---------|
| Dataset | theprint/Tom-4.2k-alpaca |
| Format | alpaca |
#### Training Procedure
- Training epochs: 3
- LoRA rank: 128
- Learning rate: 0.0002
- Batch size: 4
- Framework: Unsloth + transformers + PEFT
- Hardware: NVIDIA RTX 5090
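For illustration, a minimal Unsloth LoRA setup matching these hyperparameters might look like the sketch below. Only the base model, dataset name, LoRA rank, epochs, learning rate, and batch size come from this card; the target modules, LoRA alpha, and prompt formatting are assumptions, and the trl/transformers trainer APIs vary across versions:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model in 4-bit for memory-efficient training
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-7B-Instruct",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters at rank 128, as stated in the training details
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,  # assumption; the card only states the rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumption
)

dataset = load_dataset("theprint/Tom-4.2k-alpaca", split="train")

def format_alpaca(example):
    # Assumed standard alpaca fields: instruction / input / output
    prompt = example["instruction"]
    if example.get("input"):
        prompt += "\n\n" + example["input"]
    return f"### Instruction:\n{prompt}\n\n### Response:\n{example['output']}"

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    formatting_func=format_alpaca,
    args=TrainingArguments(
        num_train_epochs=3,
        learning_rate=2e-4,
        per_device_train_batch_size=4,
        output_dir="outputs",
    ),
)
trainer.train()
```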
## Technical Details
This model is fine-tuned from Qwen/Qwen2.5-7B-Instruct using the Unsloth framework with LoRA. The LoRA rank is set to 128, which enables efficient training and adaptation. The training data is a synthesized alpaca-formatted dataset, and training ran for 3 epochs with a learning rate of 0.0002 and a batch size of 4.
## License
This model is released under the apache-2.0 license.
## Important Note
This model may hallucinate or provide incorrect information. It is not suitable for critical decision-making.
## Citation
If you use this model, please cite:
```bibtex
@misc{tom_qwen_7b_instruct,
  title={Tom-Qwen-7B-Instruct: Fine-tuned Qwen/Qwen2.5-7B-Instruct},
  author={theprint},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/theprint/Tom-Qwen-7B-Instruct}
}
```
## Acknowledgments