Einstein-v7-Qwen2-7B open-source text generation model: free to use, strong across multiple scientific domains


Einstein V7 Qwen2 7B

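Developed by Weyaxi

Einstein-v7-Qwen2-7B is a text generation model obtained by fully fine-tuning Qwen/Qwen2-7B on a range of scientific datasets. It performs well in science, physics, chemistry, biology, mathematics, and related fields.

Large language model

Transformers

Language: English | License: other | #ScienceDomainExpert #MultidisciplinaryKnowledgeBase #ChatMLDialogueOptimization

Downloads: 1,927

Release date: 6/24/2024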

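Model overview

This model is a fully fine-tuned version of Qwen2-7B focused on scientific text generation. It supports question answering and content generation across multiple domains.

Model features

Multi-domain scientific knowledge

Trained specifically on science, physics, chemistry, biology, mathematics, and other fields, giving it domain-specific text generation ability

Fine-tuned on high-performance hardware

Fine-tuned on 8xMI300X hardware

ChatML template support

Supports the ChatML conversation template for dialogue-style text generation

Long-context handling

Trained with a sequence length of 8192, so it can handle long text inputs

Model capabilities

Scientific text generation

Multi-domain question answering

Specialized content creation

Educational assistance

Research support

Use cases

Education

Explaining scientific knowledge

Explain complex scientific concepts and principles to students

Provides accurate, easy-to-follow explanations of scientific knowledge

Homework help

Help students work through problems in science, mathematics, and other subjects

Provides step-by-step solutions with detailed explanations

Research

Literature summarization

Generate summaries and key points of scientific literature for researchers

Helps readers grasp the core content of a paper quickly

Research idea generation

Help researchers come up with new research ideas and experimental designs

Provides suggestions for new research directions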

🚀 🔬 Einstein-v7-Qwen2-7B

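Einstein-v7-Qwen2-7B is a model obtained by fully fine-tuning Qwen/Qwen2-7B on a variety of datasets. It performs well in science, physics, chemistry, biology, mathematics, and related fields, and is intended for text generation tasks.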


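🚀 Quick start

Basic model information

Property	Details
Base model	Qwen/Qwen2-7B
Model type	Text generation model, fully fine-tuned from Qwen2-7B
Training datasets	allenai/ai2_arc, camel-ai/physics, camel-ai/chemistry, and many others

Prompt template

You can use the ChatML prompt template with this model: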

ChatML

<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
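To make the template concrete, here is a minimal sketch of a helper that renders a message list into a ChatML string by hand. The function name `format_chatml` is hypothetical, and the exact newline placement is an assumption based on the common ChatML layout; the tokenizer's own chat template remains the authoritative source.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration; whitespace follows the common
    ChatML layout and is an assumption, not taken from the model card.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml(
    [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Hello!"},
    ]
)
print(prompt)
```

At inference time the assistant turn is left open, which is why the sketch ends with the assistant header rather than a filled-in reply.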

This prompt template is available as a chat template, which means you can format messages with the tokenizer.apply_chat_template() method:

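# Assumes `tokenizer` and `model` are already loaded for Weyaxi/Einstein-v7-Qwen2-7B.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"}
]
# add_generation_prompt=True appends the assistant header so the model replies;
# return_dict=True returns input_ids and attention_mask, which ** unpacking needs.
gen_input = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
)
output_ids = model.generate(**gen_input)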

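✨ Key features

Multi-domain training data: trained on datasets covering science, physics, chemistry, biology, mathematics, and other fields, so the model performs well on text generation tasks in these areas.
Fine-tuned on specific hardware: fine-tuning was run on 8xMI300X hardware.
ChatML template support: convenient for dialogue-style text generation.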


💻 Usage examples

Basic usage

Generate text using the ChatML template:

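from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
model = AutoModelForCausalLM.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"}
]
# return_dict=True yields input_ids and attention_mask for ** unpacking.
gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt")
output_ids = model.generate(**gen_input)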
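The snippet above stops at `model.generate`, which returns token IDs that still include the prompt. The following is a minimal sketch of one way to get back only the assistant's reply as text. The helper name `generate_reply` and the `max_new_tokens=256` default are assumptions for illustration, not part of the model card; when the model sits on a GPU, the inputs would also need to be moved to `model.device`.

```python
def generate_reply(model, tokenizer, messages, max_new_tokens=256):
    """Format messages with the chat template, generate, and decode only new tokens.

    Hypothetical helper: `model` and `tokenizer` are expected to be a
    Transformers causal LM and its tokenizer, already loaded by the caller.
    """
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,  # open the assistant turn
        return_dict=True,            # gives input_ids and attention_mask
        return_tensors="pt",
    )
    prompt_len = len(inputs["input_ids"][0])
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # generate() returns prompt + continuation; slice off the prompt tokens.
    new_tokens = output_ids[0][prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With `model` and `tokenizer` loaded via `from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")`, calling `generate_reply(model, tokenizer, messages)` would return the reply text for the last user message.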

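📚 Detailed documentation

Dataset usage

The datasets used to train this model are listed in the metadata section of the model card. Note that some of the datasets mentioned in the metadata may have been filtered according to various criteria. The results of this filtering process and related information are in a separate repository: Weyaxi/sci-datasets/main

Quantized versions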

GGUF @bartowski

https://huggingface.co/bartowski/Einstein-v7-Qwen2-7B-GGUF

ExLlamaV2 @bartowski

https://huggingface.co/bartowski/Einstein-v7-Qwen2-7B-exl2

Evaluation results

Open LLM Leaderboard v2 evaluation results. Detailed results are linked from the original model card.

Metric	Value
Average	24.01
IFEval (0-Shot)	41.00
BBH (3-Shot)	32.84
MATH Lvl 5 (4-Shot)	15.18
GPQA (0-shot)	6.60
MuSR (0-shot)	14.06
MMLU-PRO (5-shot)	34.40
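As a quick check on the table, the reported average is the unweighted mean of the six benchmark scores. A small sketch that recomputes it from the values listed above:

```python
# Scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 41.00,
    "BBH (3-Shot)": 32.84,
    "MATH Lvl 5 (4-Shot)": 15.18,
    "GPQA (0-shot)": 6.60,
    "MuSR (0-shot)": 14.06,
    "MMLU-PRO (5-shot)": 34.40,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 24.01
```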

Training information

This model was fully fine-tuned for 2 epochs, with 500 total steps.
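To give the step count some scale, the sketch below estimates the effective batch size and an upper bound on tokens processed. It combines the 500 steps with values from the axolotl config further down (`micro_batch_size: 6`, `gradient_accumulation_steps: 4`, `sequence_len: 8192`). It assumes that all 8 MI300X devices ran data-parallel and that each optimizer step counts as one of the 500 steps; neither is stated in the model card, so the numbers are estimates rather than reported figures.

```python
# Values taken from the axolotl config and training summary.
micro_batch_size = 6
gradient_accumulation_steps = 4
sequence_len = 8192
total_steps = 500

# ASSUMPTION: all 8 MI300X devices were used as data-parallel workers.
num_devices = 8

# Packed sequences consumed per optimizer step.
sequences_per_step = micro_batch_size * gradient_accumulation_steps * num_devices

# With sample_packing and pad_to_sequence_len, each sequence spans sequence_len
# token positions, so this is an upper bound (padding included).
tokens_per_step = sequences_per_step * sequence_len
total_tokens = tokens_per_step * total_steps

print(sequences_per_step)  # 192
print(tokens_per_step)     # 1572864
print(total_tokens)        # 786432000
```

Under these assumptions the run covers at most about 786 million token positions across both epochs, or roughly 393 million per epoch.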

[Figure: training loss graph]

🔧 Technical details

axolotl config

axolotl version: 0.4.0

base_model: Qwen/Qwen2-7B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

chat_template: chatml
datasets:
  - path: data/airoboros_3.2_without_contextual_slimorca_orca_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/allenai_wild_chat_gpt4_english_toxic_random_half_4k_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/buzz_unstacked_chosen_math_removed_filtered.json
    ds_type: json
    type: alpaca
    conversation: chatml

  - path: data/capybara_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/cot_alpaca_gpt4_extracted_openhermes_2.5_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/everythinglm-data-v3_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/gpt4_data_lmys_1m_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/gpteacher-instruct-special-alpaca.json
    ds_type: json
    type: gpteacher
    conversation: chatml

  - path: data/merged_all.json
    ds_type: json
    type: alpaca
    conversation: chatml

  - path: data/no_robots_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/oasst_top1_from_fusechatmixture_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/pippa_bagel_repo_3k_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/rpguild_quarter_alignment_lab_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/sharegpt_gpt4_english.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/slimorca_dedup_filtered_95k_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/soda_diaolog_longest_tenth_buzz_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/synthia-v1.3_sharegpt_12500.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/system_conversations_dolphin_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml
  
dataset_prepared_path: last_run_prepared
val_set_size: 0.002

output_dir: ./Einstein-v7-Qwen2-7B-model

sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: false

wandb_project: Einstein
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
hub_model_id: Weyaxi/Einstein-v7-Qwen2-7B

gradient_accumulation_steps: 4
micro_batch_size: 6
num_epochs: 2
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.00001 # look

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: unsloth
gradient_checkpointing_kwargs:
   use_reentrant: true # look
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:

deepspeed: deepspeed_configs/zero3_bf16.json
weight_decay: 0.05
fsdp:
fsdp_config:
special_tokens:
  eos_token: "<|im_end|>"
  pad_token: "<|end_of_text|>"
tokens:
  - "<|im_start|>"
  - "<|im_end|>"
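The config sets `lr_scheduler: cosine`, `learning_rate: 0.00001`, and `warmup_steps: 10`. The sketch below shows the shape such a schedule typically takes: linear warmup followed by half-cosine decay. The function name `lr_at` is hypothetical, and the decay-to-zero endpoint and the use of 500 as the total step count are assumptions; the formula mirrors the usual linear-warmup cosine schedule and is not extracted from the actual training run.

```python
import math


def lr_at(step, base_lr=1e-5, warmup_steps=10, total_steps=500):
    """Learning rate at a given optimizer step (hypothetical helper).

    ASSUMPTIONS: linear warmup from 0, then cosine decay to 0 at total_steps.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 5, 10, 255, 500):
    print(step, lr_at(step))
```

Under these assumptions the rate peaks at 1e-5 once warmup ends at step 10, falls to half of that midway through the decay phase (step 255), and reaches zero at step 500.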