🚀 InternLM-XComposer2
InternLM-XComposer2 は、InternLM2 をベースとした、高度なテキストと画像の理解と合成を行うためのビジョン言語大規模モデル(VLLM)です。このモデルは、テキストと画像の相互作用を深く理解し、自然な文章を生成することができます。
InternLM-XComposer2
[💻Github Repo](https://github.com/InternLM/InternLM-XComposer)
[Paper](https://arxiv.org/abs/2401.16420)
当社では、InternLM-XComposer2 シリーズを2つのバージョンでリリースしています。
- InternLM-XComposer2-VL:InternLM2をLLMの初期化とした事前学習済みのVLLMモデルで、様々なマルチモーダルベンチマークで強力な性能を発揮します。
- InternLM-XComposer2:Free-from Interleaved Text-Image Composition 用に微調整されたVLLMです。
これはInternLM-XComposer2の4ビットバージョンです。使用する前に、auto_gptq の最新バージョンをインストールしてください。
🚀 クイックスタート
必要なライブラリのインポート
import torch, auto_gptq
from PIL import Image
from transformers import AutoModel, AutoTokenizer
from auto_gptq.modeling import BaseGPTQForCausalLM
auto_gptq.modeling._base.SUPPORTED_MODELS = ["internlm"]
torch.set_grad_enabled(False)
モデルクラスの定義
class InternLMXComposer2QForCausalLM(BaseGPTQForCausalLM):
layers_block_name = "model.layers"
outside_layer_modules = [
'vit', 'vision_proj', 'model.tok_embeddings', 'model.norm', 'output',
]
inside_layer_modules = [
["attention.wqkv.linear"],
["attention.wo.linear"],
["feed_forward.w1.linear", "feed_forward.w3.linear"],
["feed_forward.w2.linear"],
]
モデルとトークナイザーの初期化
model = InternLMXComposer2QForCausalLM.from_quantized(
'internlm/internlm-xcomposer2-7b-4bit', trust_remote_code=True, device="cuda:0").eval()
tokenizer = AutoTokenizer.from_pretrained(
'internlm/internlm-xcomposer2-7b-4bit', trust_remote_code=True)
画像の読み込みと前処理
img_path_list = [
'panda.jpg',
'bamboo.jpeg',
]
images = []
for img_path in img_path_list:
image = Image.open(img_path).convert("RGB")
image = model.vis_processor(image)
images.append(image)
image = torch.stack(images)
クエリの設定と応答の取得
query = '<ImageHere> <ImageHere>please write an article based on the images. Title: my favorite animal.'
with torch.cuda.amp.autocast():
response, history = model.chat(tokenizer, query=query, image=image, history=[], do_sample=False)
print(response)
出力結果の例
#My Favorite Animal: The Panda
#The panda, also known as the giant panda, is one of the most beloved animals in the world. These adorable creatures are native to China and can be found in the wild in a few select locations, but they are more commonly seen in captivity at zoos or wildlife reserves.
#Pandas have a distinct black-and-white coloration that makes them instantly recognizable. They are known for their love of bamboo, which they eat almost exclusively. In fact, pandas spend up to 14 hours a day eating, with the majority of their diet consisting of bamboo. Despite this seemingly unbalanced diet, pandas are actually quite healthy and have a low body fat percentage, thanks to their ability to digest bamboo efficiently.
#In addition to their unique eating habits, pandas are also known for their playful personalities. They are intelligent and curious creatures, often engaging in activities like playing with toys or climbing trees. However, they do not typically exhibit these behaviors in the wild, where they are solitary creatures who prefer to spend their time alone.
#One of the biggest threats to the panda's survival is habitat loss due to deforestation. As a result, many pandas now live in captivity, where they are cared for by dedicated staff and provided with enrichment opportunities to keep them engaged and stimulated. While it is important to protect these animals from extinction, it is also crucial to remember that they are still wild creatures and should be treated with respect and care.
#Overall, the panda is an amazing animal that has captured the hearts of people around the world. Whether you see them in the wild or in captivity, there is no denying the charm and allure of these gentle giants.
📄 ライセンス
コードはApache 2.0ライセンスの下で提供されています。一方、モデルの重みは学術研究に完全にオープンであり、無料の商用利用も許可されています。商用ライセンスを申請するには、申請書(英語)/申請表(中国語)に記入してください。その他の質問やコラボレーションについては、internlm@pjlab.org.cn までご連絡ください。