Einstein-v7-Qwen2-7B: an open-source 7B text generation model, free to use, fine-tuned for a wide range of scientific fields

Einstein V7 Qwen2 7B

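Developed by Weyaxi

Einstein-v7-Qwen2-7B is a text generation model obtained by fully fine-tuning Qwen/Qwen2-7B on datasets from a range of scientific fields. It is intended to perform well in science, physics, chemistry, biology, mathematics, and related domains.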

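Large language model

Transformers

Language: English | Open-source license: Other | #Science-domain expert #Multidisciplinary knowledge base #ChatML dialogue optimization

Downloads: 1,927

Release date: 6/24/2024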

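Model overview

This model is a full fine-tune of the Qwen2-7B architecture. It specializes in text generation for scientific domains and supports multidisciplinary question answering and content generation.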

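Model features

Multidisciplinary scientific knowledge

Trained on data spanning science, physics, chemistry, biology, mathematics, and related fields, giving it the ability to generate domain-specific text.

Fine-tuned on high-performance hardware

Fine-tuned on 8x MI300X accelerators.

ChatML template support

Supports the ChatML conversation template, which simplifies generating text in a dialogue format.

Long-context processing

Fine-tuned with a sequence length of 8,192 tokens, so it can process long text inputs.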
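As a rough illustration of working within the 8,192-token sequence length, the sketch below computes how many prompt tokens remain once room is reserved for the reply, and drops the oldest tokens when a prompt is too long. The function names and the left-truncation policy are assumptions made for this example, not something the model card prescribes. In practice the token IDs would come from the model's tokenizer.

```python
from typing import List

CONTEXT_LEN = 8192  # sequence length reported for fine-tuning


def max_prompt_tokens(max_new_tokens: int, context_len: int = CONTEXT_LEN) -> int:
    """Return how many prompt tokens fit when max_new_tokens are reserved for output."""
    if not 0 <= max_new_tokens <= context_len:
        raise ValueError("max_new_tokens must be between 0 and context_len")
    return context_len - max_new_tokens


def truncate_left(token_ids: List[int], max_new_tokens: int,
                  context_len: int = CONTEXT_LEN) -> List[int]:
    """Keep only the most recent tokens that fit in the prompt budget.

    Hypothetical policy: the oldest tokens are dropped first.
    """
    budget = max_prompt_tokens(max_new_tokens, context_len)
    if budget == 0:
        return []
    return token_ids[-budget:]


if __name__ == "__main__":
    fake_ids = list(range(10_000))  # stand-in for real token IDs
    kept = truncate_left(fake_ids, max_new_tokens=128)
    print(len(kept))  # 8064
```

Dropping tokens from the left keeps the most recent part of a conversation. A real application might instead summarize or drop whole turns so that no message is cut in half.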

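Model capabilities

Scientific text generation

Multidisciplinary question answering

Specialized content creation

Educational support

Research support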

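Use cases

Education

Explaining scientific knowledge

Explains complex scientific concepts and principles to students.

Provides accurate, easy-to-follow explanations of scientific topics.

Homework support

Helps students solve homework problems in science, mathematics, and other subjects.

Provides step-by-step solutions with detailed explanations.

Research

Literature summarization

Generates summaries and key points of scientific literature for researchers.

Helps readers quickly grasp the core content of a paper.

Research idea generation

Helps researchers generate new research ideas and experimental designs.

Suggests novel research directions.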

🚀 🔬 Einstein-v7-Qwen2-7B

This model is a full fine-tune of Qwen/Qwen2-7B on diverse datasets. It was fine-tuned on 8x MI300X using axolotl, with compute resources provided by TensorWave.

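🚀 Quick start

To use this model, it helps to first understand its base model, Qwen/Qwen2-7B, and then how it was fine-tuned on diverse datasets. Prompts follow the ChatML format described in the prompt template section of this page.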

✨ Main features

A full fine-tune of Qwen/Qwen2-7B on diverse datasets.
Fine-tuned on 8x MI300X using axolotl.
Trained with compute resources from TensorWave.

📚 Documentation

Base model

Qwen/Qwen2-7B

Datasets

allenai/ai2_arc
camel-ai/physics
camel-ai/chemistry
camel-ai/biology
camel-ai/math
metaeval/reclor
openbookqa
mandyyyyii/scibench
derek-thomas/ScienceQA
TIGER-Lab/ScienceEval
jondurbin/airoboros-3.2
LDJnr/Capybara
Cot-Alpaca-GPT4-From-OpenHermes-2.5
STEM-AI-mtl/Electrical-engineering
knowrohit07/saraswati-stem
sablo/oasst2_curated
lmsys/lmsys-chat-1m
TIGER-Lab/MathInstruct
bigbio/med_qa
meta-math/MetaMathQA-40K
openbookqa
piqa
metaeval/reclor
derek-thomas/ScienceQA
scibench
sciq
Open-Orca/SlimOrca
migtissera/Synthia-v1.3
TIGER-Lab/ScienceEval
allenai/WildChat
microsoft/orca-math-word-problems-200k
openchat/openchat_sharegpt4_dataset
teknium/GPTeacher-General-Instruct
m-a-p/CodeFeedback-Filtered-Instruction
totally-not-an-llm/EverythingLM-data-V3
HuggingFaceH4/no_robots
OpenAssistant/oasst_top1_2023-08-25
WizardLM/WizardLM_evol_instruct_70k
abacusai/SystemChat-1.1
H-D-T/Buzz-V1.2

Axolotl config

axolotl version: 0.4.0

base_model: Qwen/Qwen2-7B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

chat_template: chatml
datasets:
  - path: data/airoboros_3.2_without_contextual_slimorca_orca_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/allenai_wild_chat_gpt4_english_toxic_random_half_4k_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/buzz_unstacked_chosen_math_removed_filtered.json
    ds_type: json
    type: alpaca
    conversation: chatml

  - path: data/capybara_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/cot_alpaca_gpt4_extracted_openhermes_2.5_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/everythinglm-data-v3_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/gpt4_data_lmys_1m_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/gpteacher-instruct-special-alpaca.json
    ds_type: json
    type: gpteacher
    conversation: chatml

  - path: data/merged_all.json
    ds_type: json
    type: alpaca
    conversation: chatml

  - path: data/no_robots_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/oasst_top1_from_fusechatmixture_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/pippa_bagel_repo_3k_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/rpguild_quarter_alignment_lab_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/sharegpt_gpt4_english.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/slimorca_dedup_filtered_95k_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/soda_diaolog_longest_tenth_buzz_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/synthia-v1.3_sharegpt_12500.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/system_conversations_dolphin_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml
  
dataset_prepared_path: last_run_prepared
val_set_size: 0.002

output_dir: ./Einstein-v7-Qwen2-7B-model

sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: false

wandb_project: Einstein
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
hub_model_id: Weyaxi/Einstein-v7-Qwen2-7B

gradient_accumulation_steps: 4
micro_batch_size: 6
num_epochs: 2
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.00001 # look

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: unsloth
gradient_checkpointing_kwargs:
  use_reentrant: true # look
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:

deepspeed: deepspeed_configs/zero3_bf16.json
weight_decay: 0.05
fsdp:
fsdp_config:
special_tokens:
  eos_token: "<|im_end|>"
  pad_token: "<|end_of_text|>"
tokens:
  - "<|im_start|>"
  - "<|im_end|>"
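To make the batch and schedule settings above more concrete, here is a small sketch of the arithmetic they imply. It assumes all 8 accelerators form a single data-parallel group (the card only says "8xMI300X"), and it uses a hypothetical total step count, since the real number of optimizer steps is not stated. The schedule function follows the common linear-warmup plus cosine-decay form. Whether axolotl applies exactly this curve for `lr_scheduler: cosine` is also an assumption.

```python
import math

# Values copied from the config above
MICRO_BATCH_SIZE = 6
GRAD_ACCUM_STEPS = 4
SEQUENCE_LEN = 8192
LEARNING_RATE = 1e-5
WARMUP_STEPS = 10

# Assumptions, not stated in the config
NUM_GPUS = 8        # assumes one data-parallel group across 8x MI300X
TOTAL_STEPS = 1000  # hypothetical; the real step count is not given


def effective_batch_size(micro: int, accum: int, gpus: int) -> int:
    """Sequences consumed per optimizer step under plain data parallelism."""
    return micro * accum * gpus


def max_tokens_per_step(batch: int, seq_len: int) -> int:
    """Upper bound on tokens per optimizer step when every sequence is full."""
    return batch * seq_len


def lr_at(step: int, total_steps: int = TOTAL_STEPS,
          warmup: int = WARMUP_STEPS, peak: float = LEARNING_RATE) -> float:
    """Linear warmup to `peak`, then cosine decay to zero at `total_steps`."""
    if step < warmup:
        return peak * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    batch = effective_batch_size(MICRO_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_GPUS)
    print(batch)                                     # 192
    print(max_tokens_per_step(batch, SEQUENCE_LEN))  # 1572864
    for step in (0, 5, 10, 500, 1000):
        print(step, lr_at(step))
```

With `sample_packing: true` and `pad_to_sequence_len: true`, each packed sequence is filled toward 8,192 tokens, so the token count per step should sit close to this upper bound under the stated assumptions.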

Prompt template

You can use the ChatML prompt template with this model.

ChatML

<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
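For readers who want to see the format assembled by hand, the following sketch builds a ChatML prompt string from a list of messages. `render_chatml` is a hypothetical helper written for this example. The newline placement after each `<|im_end|>` is an assumption based on the layout shown above, and the tokenizer's own chat template remains the authoritative source.

```python
from typing import Dict, List


def render_chatml(messages: List[Dict[str, str]],
                  add_generation_prompt: bool = True) -> str:
    """Render messages in the ChatML layout shown above.

    Each message becomes: <|im_start|>{role}\n{content}<|im_end|>\n
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Hello!"},
    ]
    print(render_chatml(demo))
```

Comparing this string with the output of the tokenizer's template on the same messages is a quick way to check that a hand-built prompt matches what the model saw during fine-tuning.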

This prompt template is available as a chat template, which means you can format messages with the tokenizer.apply_chat_template() method.

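from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
model = AutoModelForCausalLM.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"}
]
# return_dict=True yields input_ids and attention_mask, which generate() accepts as keyword arguments
gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt")
# max_new_tokens=128 is an arbitrary cap chosen for this example
output = model.generate(**gen_input, max_new_tokens=128)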
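The generated IDs include the prompt as well as the reply. If they are decoded with the special tokens kept (for example with `skip_special_tokens=False`), the assistant's answer can be pulled out by looking for the ChatML markers. The helper below is a hypothetical sketch of that post-processing step and works on plain strings, so it can be tried without loading the model.

```python
ASSISTANT_HEADER = "<|im_start|>assistant\n"
END_OF_TURN = "<|im_end|>"


def extract_assistant_reply(decoded: str) -> str:
    """Return the text of the last assistant turn in a decoded ChatML string.

    Falls back to the whole stripped string when no assistant header is found.
    """
    start = decoded.rfind(ASSISTANT_HEADER)
    if start == -1:
        return decoded.strip()
    reply = decoded[start + len(ASSISTANT_HEADER):]
    end = reply.find(END_OF_TURN)
    if end != -1:
        # Drop the end-of-turn marker and anything generated after it
        reply = reply[:end]
    return reply.strip()


if __name__ == "__main__":
    sample = (
        "<|im_start|>user\nHello!<|im_end|>\n"
        "<|im_start|>assistant\nHi, how can I help?<|im_end|>"
    )
    print(extract_assistant_reply(sample))  # Hi, how can I help?
```

An alternative is to slice the output tensor at the prompt length before decoding, which avoids string matching altogether.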