Einstein-v7-Qwen2-7B Open-source Text Generation Model - Free to Use, Excelling in Multiple Scientific Fields

Einstein V7 Qwen2 7B

Developed by Weyaxi

Einstein-v7-Qwen2-7B is a text generation model obtained by fully fine-tuning Qwen/Qwen2-7B on datasets from various scientific fields. It targets physics, chemistry, biology, mathematics, and general science.

Large Language Model

Transformers

Language: English

License: Other

Tags: #Science field experts #Multidisciplinary knowledge base #ChatML dialogue optimization

Downloads 1,927

Release Time : 6/24/2024

Model Overview

This model is a fully fine-tuned version of Qwen2-7B focused on text generation in scientific domains. It supports question answering and content generation across multiple fields.

Model Features

Multi-field scientific knowledge

Trained on data from physics, chemistry, biology, mathematics, and general science, so it can generate text in specialized fields

High-performance hardware optimization

Fine-tuned on 8x AMD MI300X accelerators with axolotl, using compute resources from TensorWave

ChatML template support

Supports the ChatML dialogue template, facilitating dialogue-style text generation

Long context processing

Trained with a sequence length of 8,192 tokens, so it can handle long text content

Model Capabilities

Scientific field text generation

Multi-field knowledge Q&A

Professional content creation

Educational assistance

Research support

Use Cases

Education

Scientific knowledge explanation

Explain complex scientific concepts and principles to students

Provide accurate and easy-to-understand scientific knowledge explanations

Homework tutoring

Help students solve homework problems in subjects such as science and mathematics

Provide step-by-step solutions and detailed explanations

Research

Literature summarization

Generate summaries and key points of scientific literature for researchers

Help readers quickly grasp the core content of a paper

Research idea generation

Help researchers generate new research ideas and experimental designs

Provide innovative research direction suggestions

🚀 🔬 Einstein-v7-Qwen2-7B

This model is a fully fine-tuned version of Qwen/Qwen2-7B on diverse datasets. It's finetuned using 8xMI300X with axolotl and trained with compute resources from TensorWave.

🚀 Quick Start

Model Usage

You can use the ChatML prompt template when using the model. The template is available as a chat template, allowing you to format messages with the tokenizer.apply_chat_template() method.

ChatML Prompt Template

<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
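
As a rough illustration of what the template above produces, the following sketch renders a message list into a ChatML string by hand. The helper name format_chatml is hypothetical, and the exact whitespace is an assumption; the chat template bundled with the tokenizer is the authoritative version and may differ in small details.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration; not part of the model's tooling.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

In practice tokenizer.apply_chat_template() should be preferred, since it applies whatever template ships with the model.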

Python Code Example

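from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
model = AutoModelForCausalLM.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"}
]
# return_dict=True yields input_ids and attention_mask, which can be unpacked into generate()
gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors="pt")
output = model.generate(**gen_input, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))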

✨ Features

Full Fine-Tuning: Based on the Qwen/Qwen2-7B base model, it has been fully fine-tuned on a wide range of datasets.
Diverse Datasets: Trained on various datasets covering science, math, and general instruction following.
ChatML Support: Supports the ChatML prompt template, making it easy to interact with the model.

📚 Documentation

Datasets Used

The datasets used to train this model are listed in the metadata section of the model card. Note that some datasets in the metadata may have been filtered according to different criteria. The filtering results are in a different repository: Weyaxi/sci-datasets/main

Quantized Versions

GGUF @bartowski

https://huggingface.co/bartowski/Einstein-v7-Qwen2-7B-GGUF

ExLlamaV2 @bartowski

https://huggingface.co/bartowski/Einstein-v7-Qwen2-7B-exl2

Evaluation Results

The model's evaluation results on the Open LLM Leaderboard v2 are as follows:

Metric	Value
Avg.	24.01
IFEval (0-Shot)	41.00
BBH (3-Shot)	32.84
MATH Lvl 5 (4-Shot)	15.18
GPQA (0-shot)	6.60
MuSR (0-shot)	14.06
MMLU-PRO (5-shot)	34.40

Detailed results can be found here
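
The reported average appears to be the unweighted mean of the six benchmark scores. The short check below reproduces it from the table values; treating the average as a plain arithmetic mean is an assumption about how the leaderboard aggregates, although the numbers agree.

```python
# Benchmark scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 41.00,
    "BBH (3-Shot)": 32.84,
    "MATH Lvl 5 (4-Shot)": 15.18,
    "GPQA (0-shot)": 6.60,
    "MuSR (0-shot)": 14.06,
    "MMLU-PRO (5-shot)": 34.40,
}

# Assumption: the leaderboard average is an unweighted arithmetic mean.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")
```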

Additional Resources

Announcement tweet: https://twitter.com/Weyaxi/status/1809644014515154961
Reddit post in r/LocalLLaMA: https://www.reddit.com/r/LocalLLaMA/comments/1dy6o4l/introducing_einstein_v7_based_on_the_qwen2_7b/

Training Information

This model is fully fine-tuned for 2 epochs, with a total of 500 steps.
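
To give a sense of scale, the sketch below estimates the effective batch size and an upper bound on tokens processed, using micro_batch_size, gradient_accumulation_steps, and sequence_len from the axolotl config together with the 500 steps stated above. It assumes each of the 8 MI300X devices acts as one data-parallel rank, which the model card does not state explicitly. Because the config enables sample packing and padding to the full sequence length, the token figure counts padded slots and is therefore an upper bound, not a measured value.

```python
num_gpus = 8                     # assumption: one data-parallel rank per MI300X
micro_batch_size = 6             # from the axolotl config
gradient_accumulation_steps = 4  # from the axolotl config
sequence_len = 8192              # from the axolotl config
total_steps = 500                # from the training information above

# Sequences consumed per optimizer step across all devices.
sequences_per_step = num_gpus * micro_batch_size * gradient_accumulation_steps

# Token slots per optimizer step (packed and padded to sequence_len).
tokens_per_step = sequences_per_step * sequence_len

# Upper bound on token slots processed over the whole run (both epochs).
max_tokens_seen = tokens_per_step * total_steps

print(f"sequences per step: {sequences_per_step}")
print(f"token slots per step: {tokens_per_step:,}")
print(f"token slots over {total_steps} steps: {max_tokens_seen:,}")
```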

[Figure: training loss graph]

🔧 Technical Details

Axolotl Config

axolotl version: 0.4.0

base_model: Qwen/Qwen2-7B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

chat_template: chatml
datasets:
  - path: data/airoboros_3.2_without_contextual_slimorca_orca_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/allenai_wild_chat_gpt4_english_toxic_random_half_4k_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/buzz_unstacked_chosen_math_removed_filtered.json
    ds_type: json
    type: alpaca
    conversation: chatml

  - path: data/capybara_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/cot_alpaca_gpt4_extracted_openhermes_2.5_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/everythinglm-data-v3_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/gpt4_data_lmys_1m_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/gpteacher-instruct-special-alpaca.json
    ds_type: json
    type: gpteacher
    conversation: chatml

  - path: data/merged_all.json
    ds_type: json
    type: alpaca
    conversation: chatml

  - path: data/no_robots_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/oasst_top1_from_fusechatmixture_sharegpt.json
    ds_type: json
    type: sharegpt
    strict: false
    conversation: chatml

  - path: data/pippa_bagel_repo_3k_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/rpguild_quarter_alignment_lab_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/sharegpt_gpt4_english.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/slimorca_dedup_filtered_95k_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/soda_diaolog_longest_tenth_buzz_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/synthia-v1.3_sharegpt_12500.json
    ds_type: json
    type: sharegpt
    conversation: chatml

  - path: data/system_conversations_dolphin_sharegpt.json
    ds_type: json
    type: sharegpt
    conversation: chatml
  
dataset_prepared_path: last_run_prepared
val_set_size: 0.002

output_dir: ./Einstein-v7-Qwen2-7B-model

sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: false

wandb_project: Einstein
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
hub_model_id: Weyaxi/Einstein-v7-Qwen2-7B

gradient_accumulation_steps: 4
micro_batch_size: 6
num_epochs: 2
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.00001 # look

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: unsloth
gradient_checkpointing_kwargs:
   use_reentrant: true # look
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:

deepspeed: deepspeed_configs/zero3_bf16.json
weight_decay: 0.05
fsdp:
fsdp_config:
special_tokens:
  eos_token: "<|im_end|>"
  pad_token: "<|end_of_text|>"
tokens:
  - "<|im_start|>"
  - "<|im_end|>"
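
The scheduler settings in the config above (lr_scheduler: cosine, warmup_steps: 10, learning_rate: 0.00001) can be sketched as a function of the training step. This is an illustration only: it assumes linear warmup followed by a half-cosine decay to zero over the 500 total steps mentioned in the training information, which is the common behavior of cosine-with-warmup schedulers but is not confirmed by the model card. The function name lr_at_step is hypothetical.

```python
import math


def lr_at_step(step, base_lr=1e-5, warmup_steps=10, total_steps=500):
    """Learning rate at a given optimizer step (hypothetical helper).

    Assumes linear warmup from 0 to base_lr, then cosine decay to 0.
    """
    if step < warmup_steps:
        # Linear warmup phase.
        return base_lr * step / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 5, 10, 255, 500):
    print(step, f"{lr_at_step(step):.2e}")
```

Under these assumptions the rate peaks at step 10 and has fallen to half its peak at step 255, the midpoint of the decay phase.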

📄 License

This model is released under a license listed as "other".

🤝 Acknowledgments

Thanks to all the dataset authors mentioned in the datasets section. Thanks to axolotl for providing the repository used to create this model. Thanks to all open-source AI communities.
