🚀 360Zhinao3 (360 Zhinao)
360Zhinao3 is a family of language models open-sourced by Qihoo 360. It is available in several versions with enhanced capabilities and is free for commercial use. More information and a hosted demo are available on the official website.
🚀 Quick Start
The examples below show how to run 360Zhinao3-7B, 360Zhinao3-7B-Instruct, and 360Zhinao3-7B-O1.5 with 🤗 Transformers.
💻 Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers.generation import GenerationConfig
MODEL_NAME_OR_PATH = "qihoo360/360Zhinao3-7B"
# Load the tokenizer and model; .cuda() assumes a CUDA-capable GPU is available.
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_NAME_OR_PATH,
    trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME_OR_PATH,
    trust_remote_code=True).cuda()
# Start from the generation settings shipped with the model, then cap the output length.
generation_config = GenerationConfig.from_pretrained(
    MODEL_NAME_OR_PATH,
    trust_remote_code=True)
generation_config.max_new_tokens = 1024
# The base model continues raw text. The prompt begins a numbered list of the
# 24 Chinese solar terms, which the model is expected to continue.
inputs = tokenizer('中国二十四节气\n1. 立春\n2. 雨水\n3. 惊蛰\n4. 春分\n5. 清明\n', return_tensors='pt')
inputs = inputs.to(model.device)
pred = model.generate(input_ids=inputs["input_ids"], generation_config=generation_config)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
Advanced Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers.generation import GenerationConfig
MODEL_NAME_OR_PATH = "qihoo360/360Zhinao3-7B-Instruct"
# Load the tokenizer and model; .cuda() assumes a CUDA-capable GPU is available.
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_NAME_OR_PATH,
    trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME_OR_PATH,
    trust_remote_code=True).cuda()
generation_config = GenerationConfig.from_pretrained(
    MODEL_NAME_OR_PATH,
    trust_remote_code=True)
generation_config.max_new_tokens = 2048
# Conversation history, shared across rounds.
messages = []
# round-1 (prompt: "Briefly introduce Andy Lau")
print("user: 简单介绍一下刘德华")
messages.append({"role": "user", "content": "简单介绍一下刘德华"})
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
pred = model.generate(input_ids=input_ids, generation_config=generation_config)
# Slice off the prompt tokens so that only the newly generated reply is decoded.
response = tokenizer.decode(pred.cpu()[0][len(input_ids[0]):], skip_special_tokens=True)
messages.append({"role": "assistant", "content": response})
print(f"gpt: {response}")
# round-2 (prompt: "What are his best-known works?")
print("user: 他有什么代表作?")
messages.append({"role": "user", "content": "他有什么代表作?"})
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
pred = model.generate(input_ids=input_ids, generation_config=generation_config)
response = tokenizer.decode(pred.cpu()[0][len(input_ids[0]):], skip_special_tokens=True)
messages.append({"role": "assistant", "content": response})
print(f"gpt: {response}")
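The two rounds above repeat the same steps: append the user turn, generate, strip the prompt tokens, and append the assistant turn. A minimal sketch of that pattern as a reusable helper follows. The names `chat_round`, `make_generate_fn`, and `generate_fn` are hypothetical and not part of the 360Zhinao3 release or of Transformers; the wrapped calls are the same ones used in the example above.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_round(
    messages: List[Message],
    user_text: str,
    generate_fn: Callable[[List[Message]], str],
) -> str:
    """Run one dialogue round and record both turns in `messages` (hypothetical helper)."""
    messages.append({"role": "user", "content": user_text})
    response = generate_fn(messages)
    messages.append({"role": "assistant", "content": response})
    return response


def make_generate_fn(model, tokenizer, generation_config) -> Callable[[List[Message]], str]:
    """Wrap an already loaded model and tokenizer in a callable (hypothetical helper)."""

    def generate_fn(messages: List[Message]) -> str:
        input_ids = tokenizer.apply_chat_template(
            messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        pred = model.generate(input_ids=input_ids, generation_config=generation_config)
        # Drop the prompt tokens so that only the newly generated reply is decoded.
        return tokenizer.decode(pred.cpu()[0][len(input_ids[0]):], skip_special_tokens=True)

    return generate_fn
```

With `model`, `tokenizer`, and `generation_config` loaded as above, each round would then reduce to `chat_round(messages, "简单介绍一下刘德华", make_generate_fn(model, tokenizer, generation_config))`. Passing the generator in as a callable also lets the history handling be exercised without loading a model.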
# Long chain-of-thought model: 360Zhinao3-7B-O1.5
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers.generation import GenerationConfig
MODEL_NAME_OR_PATH = "qihoo360/360Zhinao3-7B-O1.5"
# Load the tokenizer and model; .cuda() assumes a CUDA-capable GPU is available.
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_NAME_OR_PATH,
    trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME_OR_PATH,
    trust_remote_code=True).cuda()
generation_config = GenerationConfig.from_pretrained(
    MODEL_NAME_OR_PATH,
    trust_remote_code=True)
generation_config.max_new_tokens = 2048
messages = []
# round-1 (prompt: "Please solve this math problem in detail: ...")
# Replace the placeholder [具体数学题内容] (the math problem text) with an actual problem.
print("user: 请详细解答这道数学题:[具体数学题内容]")
messages.append({"role": "user", "content": "请详细解答这道数学题:[具体数学题内容]"})
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
pred = model.generate(input_ids=input_ids, generation_config=generation_config)
# Slice off the prompt tokens so that only the newly generated reply is decoded.
response = tokenizer.decode(pred.cpu()[0][len(input_ids[0]):], skip_special_tokens=True)
messages.append({"role": "assistant", "content": response})
print(f"gpt: {response}")
✨ Features
Model Versions
- 360Zhinao3-7B: Continually pre-trained from 360Zhinao2-7B on 700B high-quality tokens.
- 360Zhinao3-7B-Instruct: Performs well across multiple evaluations, ranking first on some of them among open-source models of comparable size.
- 360Zhinao3-7B-O1.5: Fine-tuned from 360Zhinao3-7B-Instruct, with good performance on long chain-of-thought reasoning tasks.
Model Performance
The models achieve strong results on multiple benchmarks. For example, in the base model evaluation, the average benchmark score of 360Zhinao3-7B ranks first among models with fewer than 10B parameters.
📦 Download URL
Size | Model | BF16 |
---|---|---|
7B | 360Zhinao3-7B | 🤗 |
7B | 360Zhinao3-7B-Instruct | 🤗 |
7B | 360Zhinao3-7B-O1.5 | 🤗 |
📚 Documentation
Model Evaluation
Base Model
We used the open-source tool OpenCompass to evaluate the model across multiple dimensions. Its average benchmark score ranks first among models with fewer than 10B parameters, making it competitive at this size.
Type | Datasets | Language | glm4-9b | Qwen2.5-7B | internlm2.5-7b | Yi1.5-9B | gemma2-9b | Llama3.1-8B | 360Zhinao2-7B | 360Zhinao3-7B |
---|---|---|---|---|---|---|---|---|---|---|
Exam | ceval | zh | 75.83 | 81.41 | 77.71 | 73.51 | 56.36 | 51.67 | 83.04 | 84.7 |
Exam | mmlu | en | 75.5 | 75.5 | 71.55 | 71.43 | 72.22 | 66.75 | 67.84 | 75.42 |
Exam | cmmlu | zh | 74.24 | 81.79 | 78.77 | 74.2 | 58.89 | 52.49 | 73.8 | 82.17 |
Exam | ARC-c | en | 94.92 | 80 | 85.08 | 87.46 | 77.63 | 80.68 | 87.12 | 88.14 |
Exam | ARC-e | en | 98.41 | 84.83 | 95.24 | 94.53 | 78.84 | 89.77 | 92.77 | 94 |
Language | WiC | en | 51.57 | 52.82 | 50.78 | 50.63 | 50.47 | 50 | 49.84 | 50.31 |
Language | WSC | en | 68.27 | 68.27 | 69.23 | 66.35 | 68.27 | 67.31 | 65.38 | 71.15 |
Knowledge | BoolQ | en | 81.8 | 83.88 | 89.51 | 84.46 | 85.6 | 82.2 | 88.29 | 88.38 |
Knowledge | commonsense_qa | en | 71.17 | 73.22 | 68.55 | 71.58 | 68.47 | 71.25 | 69.78 | 71.33 |
Understanding | C3 | zh | 91.51 | 92 | 93.04 | 85.86 | 81.64 | 83.51 | 93.26 | 92.77 |
Understanding | race-middle | en | 91.99 | 91.02 | 92.06 | 91.16 | 88.09 | 81.69 | 90.46 | 90.04 |
Understanding | race-high | en | 90.71 | 87.91 | 90.08 | 88.34 | 82.08 | 78.73 | 86.74 | 85.96 |
Understanding | lcsts | zh | 18.29 | 15.82 | 15.96 | 16.49 | 10.62 | 17.29 | 18.61 | 18.85 |
Understanding | eprstmt-dev | zh | 91.88 | 86.88 | 91.25 | 91.88 | 48.12 | 83.12 | 90 | 92.50 |
Understanding | lambada | en | 71.67 | 71.14 | 69.98 | 70.64 | 75.43 | 74.23 | 72.56 | 68.17 |
Reasoning | hellaswag | en | 70.25 | 72.76 | 70.38 | 71.55 | 66.83 | 74.65 | 71.49 | 73.61 |
Reasoning | siqa | en | 81.73 | 72.52 | 78.97 | 76.2 | 58.96 | 64.18 | 77.12 | 79.02 |
Reasoning | bbh | en | 73.68 | 54.63 | 59.43 | 67.86 | 68.45 | 59.9 | 46.54 | 73.74 |
Code | humaneval | en | 69.51 | 75 | 60.37 | 26.22 | 5.49 | 27.44 | 60.98 | 64.63 |
Code | mbpp | en | 60 | 60 | 43.6 | 56.8 | 51.2 | 42.6 | 54 | 67.80 |
Math | math | en | 26.86 | 38 | 27.14 | 27.06 | 28.52 | 15.32 | 38.34 | 37.60 |
Math | gsm8k | en | 78.54 | 79.76 | 52.54 | 71.11 | 73.09 | 56.25 | 75.51 | 78.77 |
Overall | avg_zh | | 70.35 | 71.58 | 71.35 | 68.39 | 51.13 | 57.62 | 71.74 | 74.20 |
Overall | avg_all | | 73.11 | 71.78 | 69.60 | 68.88 | 61.60 | 62.32 | 70.61 | 74.83 |
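The avg_zh row appears to be the unweighted mean of the five rows whose language is zh. The sketch below checks this for the 360Zhinao3-7B column, using scores copied from the table above. The helper name `unweighted_mean` is hypothetical, and the averaging rule is an inference from the numbers rather than something the evaluation description states.

```python
# Scores copied from the 360Zhinao3-7B column of the table above (zh rows only).
zh_scores = {
    "ceval": 84.7,
    "cmmlu": 82.17,
    "C3": 92.77,
    "lcsts": 18.85,
    "eprstmt-dev": 92.50,
}


def unweighted_mean(scores):
    """Plain arithmetic mean over all datasets (assumed averaging rule)."""
    return sum(scores.values()) / len(scores)


avg_zh = unweighted_mean(zh_scores)
print(f"avg_zh = {avg_zh:.2f}")  # prints "avg_zh = 74.20"
```

The result matches the 74.20 reported in the avg_zh row for 360Zhinao3-7B.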
Instruct Model
We evaluated 360Zhinao3-7B-Instruct on three popular benchmarks: IFEval, MT-bench, and CFBench. On MT-bench and CFBench, it ranks first among open-source models of comparable size. On IFEval (strict prompt), it is second only to GLM4-9B-Chat and has the highest score among 7B models.
Model | MT-bench | IFEval (strict prompt) | CFBench CSR | CFBench ISR | CFBench PSR |
---|---|---|---|---|---|
Qwen2.5-7B-Instruct | 8.07 | 0.556 | 0.81 | 0.46 | 0.57 |
Yi-9B-16k-Chat | 7.44 | 0.455 | 0.75 | 0.4 | 0.52 |
GLM4-9B-Chat | 8.08 | 0.634 | 0.82 | 0.48 | 0.61 |
InternLM2.5-7B-Chat | 7.39 | 0.540 | 0.78 | 0.4 | 0.54 |
360Zhinao2-7B-Chat-4k | 7.86 | 0.577 | 0.8 | 0.44 | 0.57 |
360Zhinao3-7B-Instruct | 8.17 | 0.626 | 0.83 | 0.52 | 0.64 |
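The ranking statements for MT-bench and IFEval can be checked directly against the table. The following is a small sketch using the scores copied from the rows above; the `ranking` helper is hypothetical and only sorts the listed models by one metric.

```python
# (MT-bench, IFEval strict prompt) copied from the table above.
scores = {
    "Qwen2.5-7B-Instruct": (8.07, 0.556),
    "Yi-9B-16k-Chat": (7.44, 0.455),
    "GLM4-9B-Chat": (8.08, 0.634),
    "InternLM2.5-7B-Chat": (7.39, 0.540),
    "360Zhinao2-7B-Chat-4k": (7.86, 0.577),
    "360Zhinao3-7B-Instruct": (8.17, 0.626),
}


def ranking(metric_index):
    """Return model names sorted from best to worst on the chosen metric."""
    return sorted(scores, key=lambda name: scores[name][metric_index], reverse=True)


mt_bench_rank = ranking(0)
ifeval_rank = ranking(1)
print(mt_bench_rank[0])  # 360Zhinao3-7B-Instruct
print(ifeval_rank[:2])   # ['GLM4-9B-Chat', '360Zhinao3-7B-Instruct']
```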
Long COT Model
We applied Zhinao's previously open-sourced [Light-R1](https://github.com/Qihoo360/Light-R1) method to 360Zhinao3-7B-Instruct, continuing with long-CoT fine-tuning followed by RFT and GRPO. The resulting model still shows a gap to the latest OpenThinker2-7B, but it surpasses all earlier models based on the general-purpose Qwen2.5-7B-Instruct.
Model | Date | Base Model | AIME24 | AIME25 | GPQA Diamond |
---|---|---|---|---|---|
OpenThinker2-7B | 25.4.3 | Qwen2.5-7B-Instruct | 50 | 33.3 | 49.3 |
OpenThinker-7B | 25.1.28 | Qwen2.5-7B-Instruct | 31.3 | 23.3 | 42.4 |
360Zhinao3-7B-O1.5 | 25.4.14 | 360Zhinao3-7B-Instruct | 54.2 | 36.3 | 40.0 |
OpenR1-Qwen-7B | 25.2.11 | Qwen2.5-Math-7B-Instruct | 48.7 | 34.7 | 21.2 |
DeepSeek-R1-Distill-Qwen-7B | 25.1.20 | Qwen2.5-Math-7B-Instruct | 57.3 | 33.3 | 47.3 |
Light-R1-7B-DS | 25.3.12 | DeepSeek-R1-Distill-Qwen-7B | 59.1 | 44.3 | 49.4 |
Areal-boba-RL-7B | 25.3.31 | DeepSeek-R1-Distill-Qwen-7B | 61.9 | 48.3 | 47.6 |
📄 License
This project is licensed under the Apache-2.0 license.
Visit 360Zhinao's official website at https://ai.360.com to try the models.
News and Updates
- [2025.04.14] 🔥🔥🔥 We released the 360Zhinao3 series and open-sourced 360Zhinao3-7B, 360Zhinao3-7B-Instruct, and the long chain-of-thought model 360Zhinao3-7B-O1.5.
- [2024.11.18] We released 360Zhinao2-7B, providing access to both the Base model and Chat models with context lengths of 4K, 32K, and 360K.
- [2024.05.23] We released two models, 360Zhinao-search and 360Zhinao-1.8B-Reranking, which ranked first in the Retrieval and Reranking tasks of the C-MTEB Leaderboard, respectively.
- [2024.05.20] We extended llama3 and released llama3-8B-360Zhinao-360k-Instruct 🤗
- [2024.04.12] We released 360Zhinao-7B v1.0, including the base model and three chat models with context lengths of 4K, 32K, and 360K. The technical report is on arXiv.

