Heron GIT Japanese StableLM Base 7B: Open-Source Vision-Language Model

Heron Chat Git Ja Stablelm Base 7b V0

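Developed by turing-motors

Heron GIT Japanese StableLM Base 7B is a vision-language model that can converse about input images.

Image-to-Text

Transformers

Japanese #Japanese visual dialogue #Image captioning #Multimodal question answering

Downloads: 57

Released: 9/6/2023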

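Model Overview

This is a vision-language model that can hold a conversation about an input image. It is mainly intended for image understanding and question answering in Japanese.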

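Model Features

Japanese vision-language understanding

A vision-language model optimized for Japanese: it interprets image content and describes it or answers questions about it in Japanese.

Two-stage training

The model is first pretrained on STAIR Captions, then fine-tuned on LLaVA-Instruct-150K-JA and Japanese Visual Genome.

Built on StableLM

Japanese StableLM Base Alpha serves as the language model backbone, providing Japanese understanding and generation.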

Model Capabilities

Image captioning

Visual question answering

Japanese dialogue

Image content understanding

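Use Cases

Chat applications

Image dialogue bot

After a user uploads an image, the model can converse and answer questions about its content.

It can generate Japanese answers related to the image content.

Research

Vision-language model research

It can be used for research and experiments on vision-language understanding in Japanese.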
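
For the chat use case, the sketch below shows one way conversation history might be turned into a prompt. It assumes that the `##human: ` / `##gpt: ` markers from the usage example in this card extend to several turns by simple concatenation. That multi-turn layout, the `ChatSession` class, and the `max_turns` cap are all assumptions made for illustration; none of them come from the heron library.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

HUMAN = "##human: "
GPT = "##gpt: "


@dataclass
class ChatSession:
    """Hypothetical helper that keeps (question, answer) turns about one image."""

    max_turns: int = 4  # assumed cap so the prompt stays short
    turns: List[Tuple[str, str]] = field(default_factory=list)

    def add_turn(self, question: str, answer: str) -> None:
        """Record a finished turn and keep only the most recent ones."""
        self.turns.append((question, answer))
        self.turns = self.turns[-self.max_turns:]

    def build_prompt(self, question: str) -> str:
        """Concatenate past turns, then leave the final gpt slot open."""
        parts = [f"{HUMAN}{q}\n{GPT}{a}\n" for q, a in self.turns]
        parts.append(f"{HUMAN}{question}\n{GPT}")
        return "".join(parts)
```

With no history, `build_prompt` returns a single-turn prompt of the same form as the one in the usage example, so the string can be passed to the processor unchanged.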

🚀 Heron GIT Japanese StableLM Base 7B

Heron GIT Japanese StableLM Base 7B is a vision-language model that can converse about input images, which makes it useful for image-related interaction and research.

🚀 Quick Start

Follow the installation guide to get started.

✨ Key Features

Heron GIT Japanese StableLM Base 7B is a vision-language model that can converse about input images.
The model was trained using the heron library.

📦 Installation

Refer to the installation guide to complete the installation.

💻 Usage Example

Basic Usage

import requests
from PIL import Image

import torch
from transformers import AutoProcessor
from heron.models.git_llm.git_japanese_stablelm_alpha import GitJapaneseStableLMAlphaForCausalLM

device_id = 0

# prepare a pretrained model
model = GitJapaneseStableLMAlphaForCausalLM.from_pretrained(
    'turing-motors/heron-chat-git-ja-stablelm-base-7b-v0', torch_dtype=torch.float16
)
model.eval()
model.to(f"cuda:{device_id}")

# prepare a processor
processor = AutoProcessor.from_pretrained('turing-motors/heron-chat-git-ja-stablelm-base-7b-v0')

# prepare inputs
url = "https://www.barnorama.com/wp-content/uploads/2016/12/03-Confusing-Pictures.jpg"
image = Image.open(requests.get(url, stream=True).raw)

text = "##human: これは何の写真ですか？\n##gpt: "

# do preprocessing
inputs = processor(
    text,
    image,
    return_tensors="pt",
    truncation=True,
)
inputs = {k: v.to(f"cuda:{device_id}") for k, v in inputs.items()}

# set eos token
eos_token_id_list = [
    processor.tokenizer.pad_token_id,
    processor.tokenizer.eos_token_id,
]

# do inference
with torch.no_grad():
    out = model.generate(**inputs, max_length=256, do_sample=False, temperature=0., eos_token_id=eos_token_id_list)

# print result
print(processor.tokenizer.batch_decode(out)[0])
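
The decoded string printed above contains the prompt followed by the model's continuation. The sketch below is one possible way to isolate the reply. It assumes the answer is whatever follows the prompt's last `##gpt:` marker, and that the model might keep going and open a new `##human:` turn on its own. The function name `extract_reply` is hypothetical. Special tokens are not handled here; passing `skip_special_tokens=True` to `batch_decode` is the usual way to drop them.

```python
def extract_reply(decoded: str, prompt: str) -> str:
    """Return the model's answer to the final question in `prompt`."""
    marker = "##gpt:"
    # The prompt ends with an open gpt slot; count how many markers precede the answer.
    n_markers = prompt.count(marker)
    # Everything after the n-th marker is the model's continuation.
    reply = decoded.split(marker, n_markers)[-1]
    # Discard any extra human turn the model may have generated by itself.
    reply = reply.split("##human:", 1)[0]
    return reply.strip()
```

Counting the markers in the prompt, rather than splitting on the last marker in the output, keeps the helper from picking up a later `##gpt:` section that the model may have generated by itself.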

📚 Detailed Documentation

Model Details

Attribute	Details
Developer	Turing Inc.
Adaptor type	GIT
Language model	Japanese StableLM Base Alpha
Language	Japanese

Training

The model was first trained with the adaptor on STAIR Captions. In the second stage, it was fine-tuned with LoRA on LLaVA-Instruct-150K-JA and Japanese Visual Genome.
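
The two stages described above can be summarized as plain data. This is only a sketch: the `Stage` class, its field names, and the `method` labels are illustrative assumptions, and they do not reflect the heron library's actual configuration format or any hyperparameters.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stage:
    """Hypothetical record for one training stage named in this card."""

    name: str
    method: str                # how the model is trained in this stage
    datasets: Tuple[str, ...]  # dataset names as given in this card


TRAINING_PLAN = (
    Stage("stage 1", "adaptor (GIT) training", ("STAIR Captions",)),
    Stage("stage 2", "LoRA fine-tuning",
          ("LLaVA-Instruct-150K-JA", "Japanese Visual Genome")),
)

for stage in TRAINING_PLAN:
    print(f"{stage.name}: {stage.method} on {', '.join(stage.datasets)}")
```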

Training Datasets

STAIR Captions
LLaVA-Instruct-150K-JA
Japanese Visual Genome

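Use and Limitations

Intended Use

The model is intended for chat-like applications and for research purposes.

Limitations

The model may produce inaccurate or false information, and its accuracy is not guaranteed. It is still at the research and development stage.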

How to Cite

@misc{GitJapaneseStableLM, 
    url    = {https://huggingface.co/turing-motors/heron-chat-git-ja-stablelm-base-7b-v0}, 
    title  = {Heron GIT Japanese StableLM Base 7B}, 
    author = {Yuichi Inoue and Kotaro Tanahashi and Yu Yamaguchi}
}

Citations

@misc{JapaneseInstructBLIPAlpha, 
    url    = {https://huggingface.co/stabilityai/japanese-instructblip-alpha}, 
    title  = {Japanese InstructBLIP Alpha}, 
    author = {Shing, Makoto and Akiba, Takuya}
}