# Zurich 14B GammaCorpus v2-50k
A Qwen 2.5 model fine-tuned on the GammaCorpus dataset
This project presents Zurich 14B GammaCorpus v2-50k, a fine-tuned model based on Alibaba's Qwen 2.5. It aims to outperform other models of similar size and to showcase the GammaCorpus v2-50k dataset.

## Quick Start
Zurich 14B GammaCorpus v2-50k is a fine-tune of Alibaba's Qwen 2.5 14B Instruct model. It is designed to outperform other models of similar size while also showcasing the GammaCorpus v2-50k dataset.
## Features
- High-Performance: Outperforms other models of similar size.
- Showcases Dataset: Demonstrates the effectiveness of GammaCorpus v2-50k.
## Installation
### Requirements
⚠️ **Important Note:** We strongly recommend using the latest version of the `transformers` package.

You can install it via pip as follows:

```bash
pip install transformers
```
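To confirm which version is installed (this quick check is not part of the original card, just a convenience):

```python
import transformers

# Print the installed version; upgrade with `pip install -U transformers` if it is outdated.
print(transformers.__version__)
```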
## Usage Examples
### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rubenroy/Zurich-14B-GCv2-50k"

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How tall is the Eiffel tower?"

messages = [
    {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 14B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Apply the chat template and tokenise the prompt
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a response and strip the prompt tokens from the output
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
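For interactive use, you may prefer to stream tokens as they are generated instead of waiting for the full completion. This is not part of the original card; it is a minimal sketch using the standard `TextStreamer` utility from `transformers`, reusing the `model`, `tokenizer`, and `model_inputs` objects from the example above:

```python
from transformers import TextStreamer

# Stream the reply to stdout as it is generated; skip_prompt hides the echoed input.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **model_inputs,
    max_new_tokens=512,
    streamer=streamer
)
```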
## Documentation
### Model Details

| Property | Details |
|----------|---------|
| Base Model | Qwen/Qwen2.5-14B-Instruct |
| Model Type | Causal Language Models |
| Architecture | Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias |
| Number of Parameters | 14.7B |
| Number of Parameters (Non-Embedding) | 13.1B |
| Number of Layers | 48 |
| Number of Attention Heads (GQA) | 40 for Q and 8 for KV |
### Training Details
Zurich-14B-GCv2-50k was fine-tuned on a single A100 GPU for approximately 20 minutes using the Unsloth framework, and was trained for 60 epochs.
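The exact training script is not included in this card. As a starting point, the following is a minimal sketch of an Unsloth-based fine-tune of the base model; the hyperparameters and 4-bit loading are illustrative assumptions, not the settings actually used for Zurich:

```python
from unsloth import FastLanguageModel

# Load the base model via Unsloth (assumed settings, not the actual Zurich recipe)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-14B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters for parameter-efficient fine-tuning
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# From here, the chat-formatted GammaCorpus v2-50k examples would be passed to a
# standard supervised fine-tuning loop (e.g. trl's SFTTrainer) for 60 epochs.
```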
### About GammaCorpus
This model, along with all Zurich models, is trained with GammaCorpus. GammaCorpus is a dataset on HuggingFace that contains structured and filtered multi-turn conversations. It has 4 versions with different sizes:
#### GammaCorpus v1
- 10k UNFILTERED
- 50k UNFILTERED
- 70k UNFILTERED
Link to the GCv1 dataset collection: GCv1
#### GammaCorpus v2
- 10k
- 50k <-- This is the version of GammaCorpus v2 that the Zurich model you are using was trained on.
- 100k
- 500k
- 1m
- 5m
Link to the GCv2 dataset collection: GCv2
#### GammaCorpus CoT
Link to the GC-CoT dataset collection: GC-CoT
#### GammaCorpus QA
Link to the GC-QA dataset collection: GC-QA
The link to the full GammaCorpus dataset collection can be found here.
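If you want to inspect the training data yourself, datasets in the collection can be loaded with the `datasets` library. The repository id below is an assumption for illustration; check the GammaCorpus collection on Hugging Face for the exact name:

```python
from datasets import load_dataset

# Hypothetical repo id -- replace with the actual GammaCorpus v2 50k repository name.
dataset = load_dataset("rubenroy/GammaCorpus-v2-50k", split="train")

# Print the first multi-turn conversation to see the structure of the data.
print(dataset[0])
```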
## Technical Details
Zurich 14B GammaCorpus v2-50k is based on the Qwen 2.5 14B Instruct model. It uses a Transformer architecture with RoPE, SwiGLU, RMSNorm, and Attention QKV bias. The fine-tuning process was carried out with the Unsloth framework on a single A100 GPU for about 20 minutes over 60 epochs.
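The architecture figures above can be checked against the model's configuration without downloading the full weights. A quick sketch using the standard `transformers` config API (attribute names follow the Qwen2 configuration):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("rubenroy/Zurich-14B-GCv2-50k")

# Layer count and GQA head split should match the table above (48 layers, 40 Q heads, 8 KV heads).
print(config.num_hidden_layers)
print(config.num_attention_heads)
print(config.num_key_value_heads)
```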
## License
The model is released under the Apache 2.0 License. Please refer to the license for usage rights and restrictions.
## Known Limitations
⚠️ **Important Note**
We have tried our best to mitigate as much bias as possible, but please be aware that the model might generate some biased answers.