🚀 glm-4-9b-hf
GLM-4-9B is the open-source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI. It performs strongly on dataset evaluations covering semantics, mathematics, reasoning, code, and knowledge, and offers advanced features such as multilingual support.
🚀 Quick Start
For more inference code and requirements, please visit our GitHub page.
Please install strictly according to the dependencies; otherwise the model will not run correctly.
Inference with the Transformers library (version 4.46.0 and later):
```python
import os

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # restrict inference to a single GPU

MODEL_PATH = "THUDM/glm-4-9b-hf"

# Load the weights in bfloat16 and let accelerate place them across available devices.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map="auto"
).eval()

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)

# Tokenize the prompt; <|endoftext|> marks the end of the input for this base model.
encoding = tokenizer("what is your name?<|endoftext|>")
inputs = {key: torch.tensor([value]).to(device) for key, value in encoding.items()}

# With top_k=1, sampling always picks the most likely token, i.e. effectively greedy decoding.
gen_kwargs = {"max_length": 2500, "do_sample": True, "top_k": 1}
with torch.no_grad():
    outputs = model.generate(**inputs, **gen_kwargs)
    # Strip the prompt tokens so only the newly generated text is decoded.
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
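For interactive use it can be convenient to print tokens as they are produced instead of waiting for the full sequence. Below is a minimal sketch using transformers' built-in TextStreamer; it reuses the model, tokenizer, inputs, and gen_kwargs defined in the quick-start block above.

```python
# Sketch: stream the same generation token by token to stdout.
from transformers import TextStreamer

# skip_prompt=True suppresses echoing the input; extra kwargs are passed to decode().
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
with torch.no_grad():
    model.generate(**inputs, streamer=streamer, **gen_kwargs)
```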
✨ Key Features
GLM-4-9B is the open-source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI. On dataset evaluations covering semantics, mathematics, reasoning, code, and knowledge, GLM-4-9B and its human-preference-aligned version GLM-4-9B-Chat deliver performance that surpasses Llama-3-8B. Beyond multi-turn dialogue, GLM-4-9B-Chat also offers advanced features such as web browsing, code execution, custom tool invocation (Function Call), and long-text reasoning (supporting up to a 128K context). This generation adds multilingual support, covering 26 languages including Japanese, Korean, and German.
In addition, the GLM-4-9B-Chat-1M model, which supports a 1M context length (about 2 million Chinese characters), and GLM-4V-9B, a multimodal model based on GLM-4-9B, have also been released. GLM-4V-9B supports bilingual (Chinese and English) dialogue at a high resolution of 1120*1120. On a range of multimodal evaluations, including comprehensive Chinese and English abilities, perception and reasoning, text recognition, and chart understanding, GLM-4V-9B outperforms GPT-4-turbo-2024-04-09, Gemini 1.0 Pro, Qwen-VL-Max, and Claude 3 Opus.
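For readers who want the aligned chat variant rather than this base model, here is a minimal multi-turn sketch. It assumes the chat weights are published as THUDM/glm-4-9b-chat-hf and ship a standard transformers chat template (both are assumptions; verify against that repository's model card).

```python
# Hedged sketch: single-turn chat with the aligned GLM-4-9B-Chat variant.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CHAT_PATH = "THUDM/glm-4-9b-chat-hf"  # assumed repo id for the chat variant

tokenizer = AutoTokenizer.from_pretrained(CHAT_PATH)
model = AutoModelForCausalLM.from_pretrained(
    CHAT_PATH, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

# Render the conversation with the model's chat template and append the
# generation prompt so the model answers as the assistant.
messages = [{"role": "user", "content": "Hello, who are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the tokens generated after the prompt.
print(tokenizer.decode(outputs[0, input_ids.shape[1]:], skip_special_tokens=True))
```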
Evaluation results of the GLM-4-9B base model on some typical tasks are as follows:
| Model | MMLU | C-Eval | GPQA | GSM8K | MATH | HumanEval |
|:--------------------|:----:|:------:|:----:|:-----:|:----:|:---------:|
| Llama-3-8B          | 66.6 | 51.2   | -    | 45.8  | -    | -         |
| Llama-3-8B-Instruct | 68.4 | 51.3   | 34.2 | 79.6  | 30.0 | 62.2      |
| ChatGLM3-6B-Base    | 61.4 | 69.0   | -    | 72.3  | 25.7 | -         |
| GLM-4-9B            | 74.7 | 77.1   | 34.3 | 84.0  | 30.4 | 70.1      |
This repository hosts the base version of GLM-4-9B, which supports an 8K context length.
📄 License
The weights of the GLM-4 model are available for use under the terms of the license.
📚 Documentation
If you find our work helpful, please consider citing the following paper.
```
@misc{glm2024chatglm,
      title={ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools},
      author={Team GLM and Aohan Zeng and Bin Xu and Bowen Wang and Chenhui Zhang and Da Yin and Diego Rojas and Guanyu Feng and Hanlin Zhao and Hanyu Lai and Hao Yu and Hongning Wang and Jiadai Sun and Jiajie Zhang and Jiale Cheng and Jiayi Gui and Jie Tang and Jing Zhang and Juanzi Li and Lei Zhao and Lindong Wu and Lucen Zhong and Mingdao Liu and Minlie Huang and Peng Zhang and Qinkai Zheng and Rui Lu and Shuaiqi Duan and Shudan Zhang and Shulin Cao and Shuxun Yang and Weng Lam Tam and Wenyi Zhao and Xiao Liu and Xiao Xia and Xiaohan Zhang and Xiaotao Gu and Xin Lv and Xinghan Liu and Xinyi Liu and Xinyue Yang and Xixuan Song and Xunkai Zhang and Yifan An and Yifan Xu and Yilin Niu and Yuantao Yang and Yueyan Li and Yushi Bai and Yuxiao Dong and Zehan Qi and Zhaoyu Wang and Zhen Yang and Zhengxiao Du and Zhenyu Hou and Zihan Wang},
      year={2024},
      eprint={2406.12793},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
⚠️ Important Notes
A Chinese version of this document is available. If you use the weights from this repository, please upgrade to transformers>=4.46.0. These weights are not compatible with older versions of the transformers library.
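Since the weights hard-require transformers>=4.46.0, a small guard at the top of a script can fail fast with a clear message instead of a cryptic loading error. A minimal sketch using the packaging library:

```python
# Verify the installed transformers version before loading the weights;
# versions below 4.46.0 are incompatible with this repository.
from packaging import version
import transformers

assert version.parse(transformers.__version__) >= version.parse("4.46.0"), (
    f"transformers {transformers.__version__} is too old; "
    "run: pip install 'transformers>=4.46.0'"
)
```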