# Zurich 14B GammaCorpus v2-50k
A Qwen 2.5 model fine-tuned on the GammaCorpus dataset
This project presents Zurich 14B GammaCorpus v2-50k, a fine-tuned model based on Alibaba's Qwen 2.5. It aims to outperform other models of similar size and to showcase the GammaCorpus v2-50k dataset.

## Quick Start
Zurich 14B GammaCorpus v2-50k is a fine-tune of Alibaba's Qwen 2.5 14B Instruct model. It is designed to outperform other models of similar size while also showcasing the GammaCorpus v2-50k dataset.
## Features
- High-Performance: Outperforms other models of similar size.
- Showcases Dataset: Demonstrates the effectiveness of GammaCorpus v2-50k.
## Installation
### Requirements
⚠️ **Important Note:** We strongly recommend using the latest version of the `transformers` package.

You can install it via pip as follows:

```bash
pip install transformers
```
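To confirm which version is installed (this quick check is not part of the original card, just a convenience):

```python
import transformers

# Print the installed version; upgrade with `pip install -U transformers` if it is outdated.
print(transformers.__version__)
```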
## Usage Examples
### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rubenroy/Zurich-14B-GCv2-50k"

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How tall is the Eiffel tower?"

messages = [
    {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 14B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Apply the chat template and tokenise the prompt
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a response and strip the prompt tokens from the output
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
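For interactive use, you may prefer to stream tokens as they are generated instead of waiting for the full completion. This is not part of the original card; it is a minimal sketch using the standard `TextStreamer` utility from `transformers`, reusing the `model`, `tokenizer`, and `model_inputs` objects from the example above:

```python
from transformers import TextStreamer

# Stream the reply to stdout as it is generated; skip_prompt hides the echoed input.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **model_inputs,
    max_new_tokens=512,
    streamer=streamer
)
```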
## Documentation
### Model Details

| Property | Details |
|----------|---------|
| Base Model | Qwen/Qwen2.5-14B-Instruct |
| Model Type | Causal Language Models |
| Architecture | Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias |
| Number of Parameters | 14.7B |
| Number of Parameters (Non-Embedding) | 13.1B |
| Number of Layers | 48 |
| Number of Attention Heads (GQA) | 40 for Q and 8 for KV |
### Training Details
Zurich-14B-GCv2-50k was fine-tuned on a single A100 GPU for approximately 20 minutes using the Unsloth framework, and was trained for 60 epochs.
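The exact training script is not included in this card. As a starting point, the following is a minimal sketch of an Unsloth-based fine-tune of the base model; the hyperparameters and 4-bit loading are illustrative assumptions, not the settings actually used for Zurich:

```python
from unsloth import FastLanguageModel

# Load the base model via Unsloth (assumed settings, not the actual Zurich recipe)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-14B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters for parameter-efficient fine-tuning
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# From here, the chat-formatted GammaCorpus v2-50k examples would be passed to a
# standard supervised fine-tuning loop (e.g. trl's SFTTrainer) for 60 epochs.
```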
### About GammaCorpus
This model, along with all Zurich models, is trained with GammaCorpus. GammaCorpus is a dataset on HuggingFace that contains structured and filtered multi-turn conversations. It has 4 versions with different sizes:
#### GammaCorpus v1
- 10k UNFILTERED
- 50k UNFILTERED
- 70k UNFILTERED
Link to the GCv1 dataset collection: GCv1
#### GammaCorpus v2
- 10k
- 50k <-- This is the version of GammaCorpus v2 that the Zurich model you are using was trained on.
- 100k
- 500k
- 1m
- 5m
Link to the GCv2 dataset collection: GCv2
#### GammaCorpus CoT
Link to the GC-CoT dataset collection: GC-CoT
#### GammaCorpus QA
Link to the GC-QA dataset collection: GC-QA
The link to the full GammaCorpus dataset collection can be found here.
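If you want to inspect the training data yourself, datasets in the collection can be loaded with the `datasets` library. The repository id below is an assumption for illustration; check the GammaCorpus collection on Hugging Face for the exact name:

```python
from datasets import load_dataset

# Hypothetical repo id -- replace with the actual GammaCorpus v2 50k repository name.
dataset = load_dataset("rubenroy/GammaCorpus-v2-50k", split="train")

# Print the first multi-turn conversation to see the structure of the data.
print(dataset[0])
```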
## Technical Details
Zurich 14B GammaCorpus v2-50k is based on the Qwen 2.5 14B Instruct model. It uses a Transformer architecture with RoPE, SwiGLU, RMSNorm, and Attention QKV bias. The fine-tuning process was carried out with the Unsloth framework on a single A100 GPU for about 20 minutes over 60 epochs.
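The architecture figures above can be checked against the model's configuration without downloading the full weights. A quick sketch using the standard `transformers` config API (attribute names follow the Qwen2 configuration):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("rubenroy/Zurich-14B-GCv2-50k")

# Layer count and GQA head split should match the table above (48 layers, 40 Q heads, 8 KV heads).
print(config.num_hidden_layers)
print(config.num_attention_heads)
print(config.num_key_value_heads)
```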
## License
The model is released under the Apache 2.0 License. Please refer to the license for usage rights and restrictions.
## Known Limitations
⚠️ **Important Note**
We have tried our best to mitigate as much bias as possible, but please be aware that the model might generate some biased answers.