🚀 ModelScope Llama3 8B Chinese Agent Model
This model is fine-tuned from the Llama3-8b-instruct base model, adapted to general Chinese scenarios, and supports agent tool calls in the ReACT format.
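For illustration, a ReACT-format exchange alternates Thought / Action / Action Input / Observation steps before a Final Answer. The tool name and arguments below are hypothetical and not taken from the model's training data:
Thought: The user asks about the weather, so I should call a weather tool.
Action: get_weather
Action Input: {"location": "Beijing"}
Observation: {"temperature": "25°C", "condition": "sunny"}
Thought: I now know the final answer.
Final Answer: It is sunny in Beijing today, around 25°C.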
🚀 Quick Start
💻 Usage Examples
Basic Usage
# Install dependencies
pip install ms-swift -U
swift infer --model_type llama3-8b-instruct --model_id_or_path swift/Llama3-Chinese-8B-Instruct-Agent-v1
# Deployment
swift deploy --model_type llama3-8b-instruct --model_id_or_path swift/Llama3-Chinese-8B-Instruct-Agent-v1
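swift deploy exposes an OpenAI-compatible HTTP service; the URL, port, and model name below are assumptions based on common defaults, so check the deployment logs for the actual values. A minimal Python client sketch using the requests library:
import requests

# Minimal sketch of a chat-completions request against the service started by
# `swift deploy`. URL, port, and model name are assumptions; adjust as needed.
resp = requests.post(
    "http://127.0.0.1:8000/v1/chat/completions",
    json={
        "model": "llama3-8b-instruct",
        "messages": [{"role": "user", "content": "请用中文介绍一下你自己"}],
        "max_tokens": 256,
        "temperature": 0.3,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])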
This model can be used in conjunction with the ModelScopeAgent framework. Please refer to:
https://github.com/modelscope/swift/blob/main/docs/source/LLM/Agent%E5%BE%AE%E8%B0%83%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md#%E6%90%AD%E9%85%8Dmodelscope-agent%E4%BD%BF%E7%94%A8
✨ Features
To adapt to Chinese and agent scenarios, we mixed the following corpora in specific proportions when training Llama3 (a minimal loading sketch follows the list):
- COIG-CQIA: https://modelscope.cn/datasets/AI-ModelScope/COIG-CQIA/summary. This dataset contains Chinese internet content such as traditional Chinese knowledge and data from Douban, RuoZhiBa, and Zhihu.
- ModelScope general agent training dataset (ms-agent-for-agentfabric): https://modelscope.cn/datasets/AI-ModelScope/ms-agent-for-agentfabric/summary
- alpaca-en: https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-en/summary
- ms-bench, the ModelScope general Chinese Q&A dataset: https://modelscope.cn/datasets/iic/ms_bench/summary
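A minimal sketch for loading one of these corpora from ModelScope, assuming the modelscope SDK is installed; the subset name is hypothetical and may differ from the actual dataset configuration:
from modelscope.msdatasets import MsDataset

# Load a subset of COIG-CQIA from ModelScope (subset name is an assumption).
ds = MsDataset.load('AI-ModelScope/COIG-CQIA', subset_name='ruozhiba', split='train')
print(next(iter(ds)))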
| Property | Details |
|----------|---------|
| Model Type | Llama3 8B Chinese Agent Model |
| Frameworks | PyTorch |
| License | llama3 |
| Tasks | Text Generation |
| Training Data | COIG-CQIA, ModelScope general agent training dataset (ms-agent-for-agentfabric), alpaca-en, ms-bench |
| Hyperparameter | Value |
|----------------|-------|
| lr | 5e-5 |
| epoch | 2 |
| lora_rank | 8 |
| lora_alpha | 32 |
| lora_target_modules | ALL |
| batch_size | 2 |
| gradient_accumulation_steps | 16 |
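For readers more familiar with the Hugging Face ecosystem, the LoRA settings above correspond roughly to the following peft configuration. This is only a sketch: swift builds its own adapter config internally, and mapping `lora_target_modules ALL` to `target_modules="all-linear"` is an assumption.
from peft import LoraConfig

# Rough peft equivalent of the LoRA hyperparameters above (illustrative only).
lora_config = LoraConfig(
    r=8,                           # lora_rank
    lora_alpha=32,                 # lora_alpha
    target_modules="all-linear",   # assumed mapping of `lora_target_modules ALL`
    task_type="CAUSAL_LM",
)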
📦 Training
The model was fine-tuned with ms-swift using the following command:
NPROC_PER_NODE=8 \
swift sft \
--model_type llama3-8b-instruct \
--dataset ms-agent-for-agentfabric-default alpaca-en ms-bench ms-agent-for-agentfabric-addition coig-cqia-ruozhiba coig-cqia-zhihu coig-cqia-exam coig-cqia-chinese-traditional coig-cqia-logi-qa coig-cqia-segmentfault coig-cqia-wiki \
--batch_size 2 \
--max_length 2048 \
--use_loss_scale true \
--gradient_accumulation_steps 16 \
--learning_rate 5e-5 \
--use_flash_attn true \
--eval_steps 500 \
--save_steps 500 \
--train_dataset_sample -1 \
--dataset_test_ratio 0.1 \
--val_dataset_sample 10000 \
--num_train_epochs 2 \
--check_dataset_strategy none \
--gradient_checkpointing true \
--weight_decay 0.01 \
--warmup_ratio 0.03 \
--save_total_limit 2 \
--logging_steps 10 \
--sft_type lora \
--lora_target_modules ALL \
--lora_rank 8 \
--lora_alpha 32
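With NPROC_PER_NODE=8, a per-device batch_size of 2, and gradient_accumulation_steps of 16, the effective global batch size is 2 × 16 × 8 = 256 samples per optimizer step (assuming standard data parallelism across the 8 processes).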
🔧 Technical Details
| Evaluation Model | ARC | CEVAL | GSM8K |
|------------------|-----|-------|-------|
| Llama3-8b-instruct | 0.7645 | 0.5089 | 0.7475 |
| Llama3-Chinese-8B-Instruct-Agent-v1 | 0.7577 | 0.4903 | 0.652 |
English mathematical ability, as measured by GSM8K, drops by roughly 9.5 points (0.7475 → 0.652). In ablation experiments we found that removing the alpaca-en corpus causes a drop of at least 10 points on GSM8K, which is why it is kept in the training mix.
📄 License
This model is released under the llama3 license.