🚀 glm-4-9b-hf
GLM-4-9B is the open-source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI. It performs strongly on dataset evaluations covering semantics, mathematics, reasoning, code, and knowledge, and offers advanced features such as multilingual support.
🚀 Quick Start
For more inference code and requirements, please visit our GitHub page.
Please install strictly according to the dependencies; otherwise, the model will not run correctly.
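For reference, a minimal environment can be set up roughly as follows. Only the transformers>=4.46.0 requirement is stated in this card; the exact pinned versions live in the dependency file on the GitHub page, and accelerate is an assumption here (it is generally required when using device_map="auto"):

pip install "transformers>=4.46.0" torch accelerate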
Use the Transformers library (version 4.46.0 or later) for inference:
import os

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Restrict inference to a single GPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

MODEL_PATH = "THUDM/glm-4-9b-hf"

# Load the model in bfloat16 and let it be placed on the available devices automatically.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map="auto"
).eval()

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)

# Tokenize the prompt and move the tensors to the target device.
encoding = tokenizer("what is your name?<|endoftext|>")
inputs = {key: torch.tensor([value]).to(device) for key, value in encoding.items()}

gen_kwargs = {"max_length": 2500, "do_sample": True, "top_k": 1}
with torch.no_grad():
    outputs = model.generate(**inputs, **gen_kwargs)
    # Drop the prompt tokens and decode only the newly generated text.
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
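Note that with do_sample=True and top_k=1, sampling is restricted to the single most probable token at each step, so the call above behaves essentially like greedy decoding; increasing top_k or adding top_p / temperature to gen_kwargs yields more varied outputs.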
✨ Main Features
GLM-4-9B is the open-source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI. On dataset evaluations covering semantics, mathematics, reasoning, code, and knowledge, GLM-4-9B and its human-preference-aligned version GLM-4-9B-Chat outperform Llama-3-8B. Beyond multi-turn dialogue, GLM-4-9B-Chat also offers advanced features such as web browsing, code execution, custom tool calling (Function Call), and long-text reasoning (supporting up to 128K context). This generation adds multilingual support, covering 26 languages including Japanese, Korean, and German. In addition, the GLM-4-9B-Chat-1M model, which supports a 1M context length (about 2 million Chinese characters), and GLM-4V-9B, a multimodal model based on GLM-4-9B, have also been released. GLM-4V-9B provides Chinese-English bilingual dialogue at a high resolution of 1120*1120. On various multimodal evaluations, including comprehensive Chinese and English abilities, perception and reasoning, text recognition, and chart understanding, GLM-4V-9B outperforms GPT-4-turbo-2024-04-09, Gemini 1.0 Pro, Qwen-VL-Max, and Claude 3 Opus.
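As a rough illustration of the multi-turn dialogue usage described above, the sketch below loads the chat-aligned variant through the standard apply_chat_template API of recent transformers releases. The repository id THUDM/glm-4-9b-chat-hf and all generation settings here are assumptions for illustration, not part of this model card; refer to the GitHub page for the official inference code.

# Hedged sketch: multi-turn chat with the chat-aligned variant of GLM-4-9B.
# The repo id "THUDM/glm-4-9b-chat-hf" and the settings below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CHAT_MODEL_PATH = "THUDM/glm-4-9b-chat-hf"  # assumed chat counterpart of this base model

tokenizer = AutoTokenizer.from_pretrained(CHAT_MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    CHAT_MODEL_PATH,
    torch_dtype=torch.bfloat16,
    device_map="auto"
).eval()

# Build a conversation and let the tokenizer apply the model's chat template.
messages = [{"role": "user", "content": "What can you do?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
    return_dict=True
).to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(reply)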
The evaluation results of the GLM-4-9B base model on some typical tasks are as follows:
| Model               | MMLU | C-Eval | GPQA | GSM8K | MATH | HumanEval |
|---------------------|------|--------|------|-------|------|-----------|
| Llama-3-8B          | 66.6 | 51.2   | -    | 45.8  | -    | -         |
| Llama-3-8B-Instruct | 68.4 | 51.3   | 34.2 | 79.6  | 30.0 | 62.2      |
| ChatGLM3-6B-Base    | 61.4 | 69.0   | -    | 72.3  | 25.7 | -         |
| GLM-4-9B            | 74.7 | 77.1   | 34.3 | 84.0  | 30.4 | 70.1      |
This repository hosts the base version of GLM-4-9B, which supports an 8K context length.
📄 License
The weights of the GLM-4 model are available for use under the terms of the LICENSE.
📚 Documentation
If you find our work helpful, please consider citing the following paper.
@misc{glm2024chatglm,
title={ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools},
author={Team GLM and Aohan Zeng and Bin Xu and Bowen Wang and Chenhui Zhang and Da Yin and Diego Rojas and Guanyu Feng and Hanlin Zhao and Hanyu Lai and Hao Yu and Hongning Wang and Jiadai Sun and Jiajie Zhang and Jiale Cheng and Jiayi Gui and Jie Tang and Jing Zhang and Juanzi Li and Lei Zhao and Lindong Wu and Lucen Zhong and Mingdao Liu and Minlie Huang and Peng Zhang and Qinkai Zheng and Rui Lu and Shuaiqi Duan and Shudan Zhang and Shulin Cao and Shuxun Yang and Weng Lam Tam and Wenyi Zhao and Xiao Liu and Xiao Xia and Xiaohan Zhang and Xiaotao Gu and Xin Lv and Xinghan Liu and Xinyi Liu and Xinyue Yang and Xixuan Song and Xunkai Zhang and Yifan An and Yifan Xu and Yilin Niu and Yuantao Yang and Yueyan Li and Yushi Bai and Yuxiao Dong and Zehan Qi and Zhaoyu Wang and Zhen Yang and Zhengxiao Du and Zhenyu Hou and Zihan Wang},
year={2024},
eprint={2406.12793},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
⚠️ Important Notes
For the Chinese documentation, please read the 中文版本. If you use the weights from this repository, please update to transformers>=4.46.0; these weights are not compatible with older versions of the transformers library.