书生・浦语2オープンソースビジョン言語大規模モデル - 無料でデプロイして画像と文章の理解と創作を実現する

ホーム

Internlm Xcomposer2 7b 4bit

internlmによって開発

書生・浦語2はInternLM2をベースにした視覚言語大モデル(VLLM)で、先進的なテキストと画像の理解と創作能力を備えています。

画像生成テキスト

Transformers

オープンソースライセンス:その他 #テキストと画像の交互創作 #マルチモーダル理解 #4ビット量子化

ダウンロード数 74

リリース時間 : 2/6/2024

モデル概要

書生・浦語2は視覚言語大モデルで、テキストと画像の理解と創作に特化し、自由な形式のテキストと画像の交互創作タスクをサポートします。

モデル特徴

先進的なテキストと画像の理解能力

複数のマルチモーダルベンチマークテストで優れた成績を収め、強力なテキストと画像の理解能力を備えています。

自由形式のテキストと画像の交互創作

自由形式のテキストと画像の交互創作タスクのために微調整されており、複雑なテキストと画像の交互創作をサポートします。

4ビット量子化バージョン

4ビット量子化バージョンを提供し、ハードウェア要件を低減しながら高い性能を維持します。

モデル能力

テキストと画像の理解

テキストと画像の創作

マルチモーダルインタラクション

自由形式のテキストと画像の交互創作

使用事例

コンテンツ創作

テキストと画像の記事創作

提供された画像に基づいて一貫性のある記事内容を生成します。

画像の内容に合った記事を生成します。例えば、「私の好きな動物：ジャイアントパンダ」。

教育

教育補助

教育用画像に基づいて説明文や問題解答を生成します。

🚀 InternLM-XComposer2

InternLM-XComposer2 は、InternLM2 をベースとした、高度なテキストと画像の理解と合成を行うためのビジョン言語大規模モデル（VLLM）です。このモデルは、テキストと画像の相互作用を深く理解し、自然な文章を生成することができます。

InternLM-XComposer2

[💻Github Repo](https://github.com/InternLM/InternLM-XComposer) [Paper](https://arxiv.org/abs/2401.16420)

当社では、InternLM-XComposer2 シリーズを2つのバージョンでリリースしています。

InternLM-XComposer2-VL：InternLM2をLLMの初期化とした事前学習済みのVLLMモデルで、様々なマルチモーダルベンチマークで強力な性能を発揮します。
InternLM-XComposer2：Free-from Interleaved Text-Image Composition 用に微調整されたVLLMです。

これはInternLM-XComposer2の4ビットバージョンです。使用する前に、auto_gptq の最新バージョンをインストールしてください。

🚀 クイックスタート

必要なライブラリのインポート

import torch, auto_gptq
from PIL import Image
from transformers import AutoModel, AutoTokenizer 
from auto_gptq.modeling import BaseGPTQForCausalLM

auto_gptq.modeling._base.SUPPORTED_MODELS = ["internlm"]
torch.set_grad_enabled(False)

モデルクラスの定義

class InternLMXComposer2QForCausalLM(BaseGPTQForCausalLM):
    layers_block_name = "model.layers"
    outside_layer_modules = [
        'vit', 'vision_proj', 'model.tok_embeddings', 'model.norm', 'output', 
    ]
    inside_layer_modules = [
        ["attention.wqkv.linear"],
        ["attention.wo.linear"],
        ["feed_forward.w1.linear", "feed_forward.w3.linear"],
        ["feed_forward.w2.linear"],
    ]

モデルとトークナイザーの初期化

# init model and tokenizer
model = InternLMXComposer2QForCausalLM.from_quantized(
  'internlm/internlm-xcomposer2-7b-4bit', trust_remote_code=True, device="cuda:0").eval()
tokenizer = AutoTokenizer.from_pretrained(
  'internlm/internlm-xcomposer2-7b-4bit', trust_remote_code=True)

画像の読み込みと前処理

img_path_list = [
    'panda.jpg',
    'bamboo.jpeg',
]
images = []
for img_path in img_path_list:
    image = Image.open(img_path).convert("RGB")
    image = model.vis_processor(image)
    images.append(image)
image = torch.stack(images)

クエリの設定と応答の取得

query = '<ImageHere> <ImageHere>please write an article based on the images. Title: my favorite animal.'
with torch.cuda.amp.autocast():
    response, history = model.chat(tokenizer, query=query, image=image, history=[], do_sample=False)
print(response)

出力結果の例

#My Favorite Animal: The Panda
#The panda, also known as the giant panda, is one of the most beloved animals in the world. These adorable creatures are native to China and can be found in the wild in a few select locations, but they are more commonly seen in captivity at zoos or wildlife reserves.
#Pandas have a distinct black-and-white coloration that makes them instantly recognizable. They are known for their love of bamboo, which they eat almost exclusively. In fact, pandas spend up to 14 hours a day eating, with the majority of their diet consisting of bamboo. Despite this seemingly unbalanced diet, pandas are actually quite healthy and have a low body fat percentage, thanks to their ability to digest bamboo efficiently.
#In addition to their unique eating habits, pandas are also known for their playful personalities. They are intelligent and curious creatures, often engaging in activities like playing with toys or climbing trees. However, they do not typically exhibit these behaviors in the wild, where they are solitary creatures who prefer to spend their time alone.
#One of the biggest threats to the panda's survival is habitat loss due to deforestation. As a result, many pandas now live in captivity, where they are cared for by dedicated staff and provided with enrichment opportunities to keep them engaged and stimulated. While it is important to protect these animals from extinction, it is also crucial to remember that they are still wild creatures and should be treated with respect and care.
#Overall, the panda is an amazing animal that has captured the hearts of people around the world. Whether you see them in the wild or in captivity, there is no denying the charm and allure of these gentle giants.

📄 ライセンス

コードはApache 2.0ライセンスの下で提供されています。一方、モデルの重みは学術研究に完全にオープンであり、無料の商用利用も許可されています。商用ライセンスを申請するには、申請書（英語）/申請表（中国語）に記入してください。その他の質問やコラボレーションについては、internlm@pjlab.org.cn までご連絡ください。