Zurich 14B GammaCorpus v2-10k
A Qwen 2.5 model fine-tuned on the GammaCorpus dataset. This model is designed to outperform similar-sized models and showcase the capabilities of GammaCorpus v2-10k.

Quick Start
Zurich 14B GammaCorpus v2-10k is a fine-tune of Alibaba's Qwen 2.5 14B Instruct model. It aims to outperform other models of similar size while demonstrating the effectiveness of GammaCorpus v2-10k.
Features
- Based on the powerful Qwen 2.5 14B Instruct model.
- Fine-tuned on the GammaCorpus dataset for better performance.
- Capable of handling multi-turn conversations effectively.
Installation
Requirements
We strongly recommend you use the latest version of the `transformers` package. You may install it via pip as follows:

```bash
pip install transformers
```
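If you want to confirm which release is installed before loading the model, a quick check like the following works (illustrative only; any recent `transformers` version with Qwen 2.5 support should be fine):

```python
import transformers

# Print the installed transformers version; Qwen 2.5 models need a reasonably recent release.
print(transformers.__version__)
```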
Usage Examples
Basic Usage
Here is a code snippet using `apply_chat_template` that shows how to load the tokenizer and model and how to generate content:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rubenroy/Zurich-14B-GCv2-10k"

# Load the model and tokenizer; device_map="auto" places weights on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How tall is the Eiffel tower?"
messages = [
    {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 14B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Build the prompt string with the chat template, then tokenize it.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a response, strip the prompt tokens, and decode the answer.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```
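Because the model is tuned on multi-turn conversations, you can carry a dialogue forward by appending the assistant's reply and the next user message to `messages` and repeating the same steps. A minimal sketch continuing the example above (the follow-up question is just illustrative):

```python
# Append the previous reply and a follow-up question, then generate again.
messages.append({"role": "assistant", "content": response})
messages.append({"role": "user", "content": "And when was it built?"})

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=512)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
follow_up = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```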
Documentation
Model Details
| Property | Details |
|----------|---------|
| Base Model | Qwen/Qwen2.5-14B-Instruct |
| Model Type | Causal Language Models |
| Architecture | Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias |
| Number of Parameters | 14.7B |
| Number of Parameters (Non-Embedding) | 13.1B |
| Number of Layers | 48 |
| Number of Attention Heads (GQA) | 40 for Q and 8 for KV |
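If you want to verify these figures programmatically, the model configuration exposes them. A quick check (the attribute names below follow the standard Qwen2 configuration in `transformers`):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("rubenroy/Zurich-14B-GCv2-10k")

# Layer count and GQA head counts should match the table above.
print(config.num_hidden_layers)     # 48
print(config.num_attention_heads)   # 40 query heads
print(config.num_key_value_heads)   # 8 key/value heads
```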
Training Details
Zurich-14B-GCv2-10k was fine-tuned on a single A100 GPU for roughly 10 minutes using the Unsloth framework, and was trained for 60 epochs.
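For readers curious what such a run looks like, below is a minimal sketch of an Unsloth-based supervised fine-tune. It is not the actual training configuration: the LoRA settings, batch size, learning rate, and the in-memory stand-in dataset are illustrative assumptions, and the exact `SFTTrainer` argument names vary across `trl` versions.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

# Load the base model in 4-bit and wrap it with LoRA adapters (hyperparameters are illustrative).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-14B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)

# Stand-in for GammaCorpus v2-10k: in practice each conversation would be rendered
# into a single "text" column using the chat template.
dataset = Dataset.from_dict({"text": ["<formatted conversation 1>", "<formatted conversation 2>"]})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        num_train_epochs=60,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```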
About GammaCorpus
This model, along with all Zurich models, is trained with GammaCorpus. GammaCorpus is a dataset on Hugging Face that contains structured and filtered multi-turn conversations.
GammaCorpus has 4 versions, each available in multiple sizes. The versions and sizes are as follows:
GammaCorpus v1
- 10k UNFILTERED
- 50k UNFILTERED
- 70k UNFILTERED
Link to the GCv1 dataset collection: GCv1
GammaCorpus v2
- 10k <-- This is the version of GammaCorpus v2 that the Zurich model you are using was trained on.
- 50k
- 100k
- 500k
- 1m
- 5m
Link to the GCv2 dataset collection: GCv2
GammaCorpus CoT
Link to the GC-CoT dataset collection: GC-CoT
GammaCorpus QA
Link to the GC-QA dataset collection: GC-QA
The link to the full GammaCorpus dataset collection can be found here.
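If you want to inspect the training data yourself, the GammaCorpus splits can be pulled with the `datasets` library. A minimal sketch, assuming the v2-10k split is published under the `rubenroy` namespace (the repository id below is hypothetical; check the collection linked above for the exact one):

```python
from datasets import load_dataset

# Hypothetical repository id; replace it with the exact id from the GammaCorpus v2 collection.
gammacorpus = load_dataset("rubenroy/GammaCorpus-v2-10k", split="train")
print(gammacorpus[0])
```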
Technical Details
Zurich 14B GammaCorpus v2-10k is based on the Qwen 2.5 14B Instruct architecture. It uses techniques like RoPE, SwiGLU, RMSNorm, and Attention QKV bias in its Transformer architecture. The fine-tuning process was carried out using the Unsloth framework on a single A100 GPU for approximately 10 minutes over 60 epochs.
License
The model is released under the Apache 2.0 License. Please refer to the license for usage rights and restrictions.
Important Note
We have tried our best to mitigate as much bias as possible, but please be aware that the model might generate some biased answers.