🚀 StableLM Zephyr 3B GGUF
This repository offers GGUF format model files for Stability AI's StableLM Zephyr 3B, a 3-billion parameter instruction-tuned model.
🚀 Quick Start
StableLM Zephyr 3B is a 3-billion-parameter instruction-tuned model inspired by HuggingFaceH4's Zephyr 7B training pipeline. It was trained on a mix of publicly available and synthetic datasets using Direct Preference Optimization (DPO), and evaluated with MT-Bench and AlpacaEval.
✨ Features
What is GGUF?
GGUF is a file format for storing models for inference with llama.cpp. It is the third version of the format, introduced by the llama.cpp team on August 21st, 2023, as a replacement for GGML, which is no longer supported by llama.cpp.
Converted using llama.cpp b1960 (26d6076)
Prompt template: Zephyr

```
<|system|>
{{system_message}}<|endoftext|>
<|user|>
{{prompt}}<|endoftext|>
<|assistant|>
```
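The template above can be filled in programmatically before passing the string to your inference runtime. A minimal sketch in Python (the `format_zephyr_prompt` helper is illustrative, not part of any library):

```python
def format_zephyr_prompt(prompt: str, system_message: str = "") -> str:
    """Assemble a Zephyr-style prompt string for StableLM Zephyr 3B.

    Each turn is terminated with the <|endoftext|> token, and the string
    ends with the assistant tag so generation continues from that point.
    """
    return (
        f"<|system|>\n{system_message}<|endoftext|>\n"
        f"<|user|>\n{prompt}<|endoftext|>\n"
        f"<|assistant|>\n"
    )

# Example: build a prompt with a custom system message.
text = format_zephyr_prompt(
    "List three uses of the GGUF format.",
    system_message="You are a helpful assistant.",
)
print(text)
```

The resulting string can be passed as the raw prompt wherever the runtime does not apply a chat template for you.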
📚 Documentation
Download & run with cnvrs on iPhone, iPad, and Mac!

cnvrs is the best app for private, local AI on your device:
- create & save Characters with custom system prompts & temperature settings
- download and experiment with any GGUF model you can find on HuggingFace!
- make it your own with custom Theme colors
- powered by Metal ⚡️ & llama.cpp, with haptics during response streaming!
- try it out yourself today, on TestFlight!
- follow cnvrs on Twitter to stay up to date
Original Model Evaluations

| Model | Size | Alignment | MT-Bench (score) | AlpacaEval (win rate %) |
|---|---|---|---|---|
| StableLM Zephyr 3B 🪁 | 3B | DPO | 6.64 | 76.00 |
| StableLM Zephyr (SFT only) | 3B | SFT | 6.04 | 71.15 |
| Capybara v1.9 | 3B | dSFT | 5.94 | - |
| MPT-Chat | 7B | dSFT | 5.42 | - |
| Xwin-LM v0.1 | 7B | dPPO | 6.19 | 87.83 |
| Mistral-Instruct v0.1 | 7B | - | 6.84 | - |
| Zephyr-7b-α | 7B | dDPO | 6.88 | - |
| Zephyr-7b-β | 7B | dDPO | 7.34 | 90.60 |
| Falcon-Instruct | 40B | dSFT | 5.17 | 45.71 |
| Guanaco | 65B | SFT | 6.41 | 71.80 |
| Llama2-Chat | 70B | RLHF | 6.86 | 92.66 |
| Vicuna v1.3 | 33B | dSFT | 7.12 | 88.99 |
| WizardLM v1.0 | 70B | dSFT | 7.71 | - |
| Xwin-LM v0.1 | 70B | dPPO | - | 95.57 |
| GPT-3.5-turbo | - | RLHF | 7.94 | 89.37 |
| Claude 2 | - | RLHF | 8.06 | 91.36 |
| GPT-4 | - | RLHF | 8.99 | 95.28 |
| Task | Value |
|---|---|
| ARC (25-shot) | 47.0 |
| HellaSwag (10-shot) | 74.2 |
| MMLU (5-shot) | 46.3 |
| TruthfulQA (0-shot) | 46.5 |
| Winogrande (5-shot) | 65.5 |
| GSM8K (5-shot) | 42.3 |
| BigBench (Avg) | 35.26 |
| AGI Benchmark (Avg) | 33.23 |
📄 License
This model is released under a custom Stability AI license (listed as "other" on Hugging Face); see the license link in Model Information below.
📦 Model Information
| Property | Details |
|---|---|
| Base Model | stabilityai/stablelm-zephyr-3b |
| Model Creator | Stability AI |
| Model Name | stablelm-zephyr-3b |
| Model Type | stablelm_epoch |
| Datasets | HuggingFaceH4/ultrachat_200k, HuggingFaceH4/ultrafeedback_binarized, meta-math/MetaMathQA, WizardLM/WizardLM_evol_instruct_V2_196k, Intel/orca_dpo_pairs |
| License | other |
| License Link | https://huggingface.co/stabilityai/stablelm-zephyr-3b/blob/main/LICENSE |
| Language | en |
| Inference | false |
| Tags | causal-lm, stablelm_epoch |
| Pipeline Tag | text-generation |
| Quantized By | brittlewis12 |