# 🚀 InternLM-XComposer2-4KHD
InternLM-XComposer2-4KHD is a general-purpose vision-language large model (VLLM) based on InternLM2, capable of understanding images at up to 4K resolution.
[💻Github Repo](https://github.com/InternLM/InternLM-XComposer)
[Paper](https://arxiv.org/abs/2401.16420)
## 🚀 Quick Start
We provide a simple example showing how to use InternLM-XComposer with 🤗 Transformers.
### Basic Usage
```python
import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# Load the model in bfloat16 and move it to the GPU for inference.
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2-4khd-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2-4khd-7b', trust_remote_code=True)

# First turn: <ImageHere> marks where the image is inserted into the prompt.
query = '<ImageHere>Illustrate the fine details present in the image'
image = './example.webp'
with torch.cuda.amp.autocast():
    response, history = model.chat(tokenizer, query=query, image=image, hd_num=55, history=[], do_sample=False, num_beams=3)
print(response)

# Second turn: a follow-up question that reuses the conversation history.
query = 'What is the detailed explanation of the third part?'
with torch.cuda.amp.autocast():
    response, _ = model.chat(tokenizer, query=query, image=image, hd_num=55, history=history, do_sample=False, num_beams=3)
print(response)
```
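The `hd_num` argument controls how finely the input image is divided into high-definition patches; the value 55 above targets 4K-resolution inputs. For lower-resolution images or tighter GPU memory, a smaller value can be passed. A minimal sketch reusing the `model` and `tokenizer` loaded above, assuming `hd_num=16` as an illustrative value (not an official recommendation):

```python
# Hedged sketch (not from the official card): same chat API as above,
# but with a smaller, assumed hd_num value to reduce GPU memory usage
# on lower-resolution inputs.
query = '<ImageHere>Describe this image in detail.'
with torch.cuda.amp.autocast():
    response, _ = model.chat(tokenizer, query=query, image='./example.webp',
                             hd_num=16, history=[], do_sample=False, num_beams=3)
print(response)
```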
## 📦 Installation Guide
### Import from Transformers
To load the InternLM-XComposer2-4KHD model with Transformers, use the following code:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

ckpt_path = "internlm/internlm-xcomposer2-4khd-7b"
# The tokenizer runs on the CPU; it has no .cuda() method.
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
# Load the model in bfloat16 and move it to the GPU.
model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.bfloat16, trust_remote_code=True).cuda()
model = model.eval()
```
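On multi-GPU machines, Transformers' standard `device_map` option can shard the weights across devices automatically instead of the explicit `.cuda()` call. A minimal sketch, assuming the `accelerate` package is installed (this setup is an assumption, not from the official card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt_path = "internlm/internlm-xcomposer2-4khd-7b"
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
# device_map='auto' lets Accelerate place layers across available devices,
# replacing the explicit .cuda() call above (assumed setup, not from the card).
model = AutoModelForCausalLM.from_pretrained(
    ckpt_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map='auto',
).eval()
```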
## 📄 License
The code is licensed under Apache-2.0, while the model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form ([English](application form)/Chinese). For other questions or collaborations, please contact internlm@pjlab.org.cn.