đ Hunyuan-7B Model
The 7B models released by Hunyuan use better data allocation and training, achieving a good balance between computing and performance, and standing out among large language models.
⨠Features
- The 7B models released by Hunyuan this time, Hunyuan-7B-Pretrain-0124 and Hunyuan-7B-Instruct-0124, have strong performance and are currently one of the strongest Chinese 7B Dense models.
- Extended long text capability to 256K and utilizes Grouped Query Attention (GQA).
- This open - source release offers two inference backend options tailored for the Hunyuan - 7B model: the popular vLLM - backend and the TensorRT - LLM Backend. Initially, the vLLM solution is open - sourced, and the TRT - LLM solution will be released soon.
- The Hunyuan - 7B open - source model is fully compatible with the Hugging Face format, enabling fine - tuning using the hf - deepspeed framework.
đ Documentation
Model Introduction
The 7B models released by Hunyuan this time: Hunyuan-7B-Pretrain-0124 and Hunyuan-7B-Instruct-0124, use better data allocation and training, have strong performance, and have achieved a good balance between computing and performance. It stands out from many large - scale language models and is currently one of the strongest Chinese 7B Dense models.
Introduction to Technical Advantages
Model
- Extended long text capability to 256K and utilizes Grouped Query Attention (GQA).
Inference Framework
This open - source release offers two inference backend options tailored for the Hunyuan - 7B model: the popular vLLM - backend and the TensorRT - LLM Backend. In this release, we are initially open - sourcing the vLLM solution, with plans to release the TRT - LLM solution in the near future.
Training Framework
The Hunyuan - 7B open - source model is fully compatible with the Hugging Face format, enabling researchers and developers to perform model fine - tuning using the hf - deepspeed framework. Learn more: Tencent - Hunyuan - Large.
Related News
- 2025.1.24 We have open - sourced Hunyuan - 7B - Pretrain - 0124 and Hunyuan - 7B - Instruct - 0124 on Hugging Face.
Benchmark
Note: The following benchmarks are evaluated by TRT - LLM - backend
Hunyuan - 7B - Pretrain
Property |
Qwen2.5 - 7B |
Llama3 - 8B |
OLMO2 - 7B |
HunYuan - 7B - V2 |
MMLU |
74.26 |
66.95 |
63.7 |
75.37 |
MMLU - Pro |
46.17 |
34.04 |
31 |
47.54 |
MMLU - CF |
61.01 |
55.21 |
52.94 |
59.62 |
MMLU - Redux |
73.47 |
66.44 |
63.74 |
74.54 |
BBH |
70.4 |
62.16 |
38.01 |
70.77 |
HellaSwag |
75.82 |
78.24 |
61.97 |
80.77 |
WinoGrande |
69.69 |
73.64 |
74.43 |
71.51 |
PIQA |
79.33 |
80.52 |
80.63 |
81.45 |
SIQA |
77.48 |
61.05 |
65.2 |
79.73 |
NaturalQuestions |
31.77 |
35.43 |
36.9 |
33.52 |
DROP |
68.2 |
60.13 |
60.8 |
68.63 |
ARC - C |
91.64 |
77.59 |
74.92 |
91.97 |
TriviaQA |
69.31 |
78.61 |
78 |
74.31 |
Chinese - SimpleQA |
30.37 |
19.4 |
7.35 |
30.51 |
SimpleQA |
4.98 |
7.68 |
4.51 |
3.73 |
CMMLU |
81.39 |
50.25 |
38.79 |
82.19 |
C - Eval |
81.11 |
50.4 |
38.53 |
82.12 |
C3 |
71.77 |
61.5 |
54 |
79.07 |
GSM8K |
82.71 |
57.54 |
67.5 |
93.33 |
MATH |
49.6 |
18.45 |
19 |
62.15 |
CMATH |
84.33 |
52.83 |
44 |
88.5 |
HumanEval |
57.93 |
35.98 |
15.24 |
59.15 |
Hunyuan - 7B - Instruct
Property |
Qwen2.5 - 7B - Instruct |
Llama - 3 - 8B - Instruct |
OLMo - 2 - 1124 - 7B - DPO |
Hunyuan - 7B - Instruct |
ARC - C |
89.83 |
82.4 |
- |
88.81 |
BBH |
66.24 |
- |
46.6 |
76.47 |
CEval |
76.82 |
- |
- |
81.8 |
CMMLU |
78.55 |
- |
- |
82.29 |
DROP_F1 |
80.63 |
- |
60.5 |
82.96 |
GPQA |
36.87 |
34.6 |
- |
47.98 |
Gsm8k |
80.14 |
80.6 |
85.1 |
90.14 |
HellaSwag |
83.34 |
- |
- |
86.57 |
HumanEval |
84.8 |
60.4 |
- |
84.0 |
MMLU |
72.36 |
68.5 |
61.3 |
79.18 |
đ Quick Start
You can refer to the content in Tencent - Hunyuan - Large to get started quickly. The training and inference code can use the version provided in this github repository.
Inference Framework
This open - source release offers two inference backend options tailored for the Hunyuan - 7B model: the popular vLLM - backend and the TensorRT - LLM Backend. In this release, we are initially open - sourcing the vLLM solution, with plans to release the TRT - LLM solution in the near future.
Inference Performance
This section presents the efficiency test results of deploying various models using vLLM, including inference speed (tokens/s) under different batch sizes.
Property |
Model |
Number of GPUs (GPU productA) |
input_length |
batch = 1 |
batch = 4 |
vLLM |
hunyuan - 7B |
1 |
2048 |
78.9 |
279.5 |
đ License
The model is under the tencent - license.
Contact Us
If you would like to leave a message for our R & D and product teams, welcome to contact our open - source team. You can also contact us via email (hunyuan_opensource@tencent.com).