Guanaco-leh-V2: A Multilingual Instruction-Following Language Model Based on LLaMA 7B
Guanaco-leh-V2 is a multilingual instruction-following language model based on LLaMA 7B, trained with specific techniques to enhance performance in multiple languages and chatbot scenarios.
Quick Start
This model is trained with guanaco-lora, where the LoRA weights together with embed_tokens and lm_head are trained. The dataset is sourced from alpaca-cleaned and guanaco.
With trained embeddings and heads, the model performs better in Chinese and Japanese than the original LLaMA, especially when using instruction-based prompts. This makes the model easier to use.
Since this model is trained on the guanaco dataset, it can also be used as a chatbot. Use the following format:
### Instruction:
User: <Message history>
Assistant: <Message history>
### Input:
System: <System response for next message, optional>
User: <Next message>
### Response:
Important Note
The first line of the original prompt template was removed during training to reduce token consumption, so consider removing it as well when using this model.
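As a rough illustration of the format above, a prompt could be assembled like this in Python. The helper below is hypothetical and not part of the released code; only the section markers come from the format itself:

```python
def build_prompt(history, user_message, system=None):
    """Assemble a prompt in the chat format shown above.

    history: list of (user, assistant) message pairs already exchanged.
    system: optional system text guiding the next reply.
    Note: no preamble line is added, per the note above.
    """
    instruction = ""
    for user_turn, assistant_turn in history:
        instruction += f"User: {user_turn}\nAssistant: {assistant_turn}\n"

    input_block = f"System: {system}\n" if system else ""
    input_block += f"User: {user_message}\n"

    return (
        "### Instruction:\n" + instruction +
        "### Input:\n" + input_block +
        "### Response:\n"
    )


print(build_prompt([("Hello!", "Hi! How can I help you today?")], "Please introduce yourself."))
```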
Features
Differences from the previous model
The main differences are:
- The model is trained in bf16 instead of 8-bit.
- The ctx cut-off length is increased to 1024.
- A larger dataset is used (latest guanaco + alpaca cleaned = 540k entries).
- A larger batch size is used (64 -> 128).
Since the training data contains more chat-based data, this model is more suitable for chatbot usage.
Try this model
You can try this model with this Colab, or use generate.py from guanaco-lora. All the examples were generated with guanaco-lora.
If you want to use the lora model from guanaco-7b-leh-v2-adapter/, remember to turn off load_in_8bit, or manually merge it into the 7B model!
Usage Tip
Recommended generation parameters:
- Temperature: 0.5 - 0.7
- Top p: 0.65 - 1.0
- Top k: 30 - 50
- Repeat penalty: 1.03 - 1.17
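For example, with Hugging Face transformers these ranges map onto generation settings like the following. This is only a sketch; the specific values are one possible choice from the middle of the recommended ranges:

```python
from transformers import GenerationConfig

# One possible choice within the recommended ranges above.
generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    top_k=40,
    repetition_penalty=1.1,
    max_new_tokens=256,
)
```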
Technical Details
Training Setup
- 2 x RTX 3090 with model parallelism
- Batch size = 8 (per step) * 16 (gradient accumulation) = 128
- ctx cut-off length = 1024
- Only train on output (with loss mask)
- Group-by-length batching enabled
- 538k entries, 2 epochs (about 8400 steps)
- lr 2e-4
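For orientation, the settings above map roughly onto Hugging Face TrainingArguments as sketched below. This is not the exact guanaco-lora configuration, just an illustration of how the listed hyperparameters fit together:

```python
from transformers import TrainingArguments

# Rough mapping of the settings above; the actual training script may differ in details.
training_args = TrainingArguments(
    output_dir="guanaco-7b-leh-v2",     # placeholder output path
    per_device_train_batch_size=8,      # bsz 8
    gradient_accumulation_steps=16,     # 8 * 16 = effective batch size 128
    learning_rate=2e-4,                 # lr 2e-4
    num_train_epochs=2,                 # ~8400 steps over 538k entries
    bf16=True,                          # trained in bf16 instead of 8-bit
    group_by_length=True,               # group-by-length batching
)
# The ctx cut-off length of 1024 is applied at tokenization time, not here.
```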
Why use lora+embed+head
First, it is obvious that when a large language model (LLM) is not proficient in a certain language and you want to fine-tune it, you should train the embedding and head parts.
But the question is: "Why not just perform native fine-tuning?"
If you have looked at other alpaca models or training materials, you may have noticed a common problem: "memorization". The loss drops sharply at the beginning of each epoch, which is a form of overfitting.
In my opinion, this is because the number of parameters in LLaMA is too large, so it simply memorizes all the training data.
However, when LoRA is applied only to the attention layers (leaving the MLP layers untouched), the number of trainable parameters is small enough that the model cannot simply memorize the training data, so it is much less prone to this behaviour.
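A sketch of what such a setup could look like with the PEFT library follows. The rank, alpha, dropout, and target module names are assumptions for illustration, since the exact guanaco-lora settings are not listed here:

```python
from peft import LoraConfig

# Illustrative only: LoRA on the attention projections, with the embedding and
# LM head trained in full via modules_to_save. Rank/alpha/dropout are assumed values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention only, MLP left untouched
    modules_to_save=["embed_tokens", "lm_head"],              # fully trained alongside the LoRA weights
    task_type="CAUSAL_LM",
)
```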
Usage Examples
Basic Usage
You can try this model with the provided colab.
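If you prefer to run it locally with Hugging Face transformers, a minimal sketch might look like this. The model path is a placeholder; point it at your merged Guanaco-leh-V2 checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path: adjust to wherever your merged checkpoint lives.
model_path = "path/to/guanaco-7b-leh-v2"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

# Prompt in the format described in the Quick Start section.
prompt = (
    "### Instruction:\n"
    "User: Hello!\n"
    "Assistant: Hi! How can I help you today?\n"
    "### Input:\n"
    "User: Please introduce yourself.\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.6,       # values chosen from the recommended ranges
    top_p=0.9,
    top_k=40,
    repetition_penalty=1.1,
    max_new_tokens=256,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```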
Advanced Usage
If you want to use the lora model from guanaco-7b-leh-v2-adapter/, remember to turn off load_in_8bit, or manually merge it into the 7B model.
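A rough sketch of merging the adapter into the base model with the PEFT library is shown below. The paths are placeholders, and the base model is loaded without load_in_8bit, as noted above:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder paths: adjust to your local LLaMA 7B weights and the adapter directory.
base_model_path = "path/to/llama-7b-hf"
adapter_path = "guanaco-7b-leh-v2-adapter"

# Note: no load_in_8bit here; merging requires the full-precision base model.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_path)

# Fold the LoRA weights into the base model for standalone use.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("guanaco-7b-leh-v2-merged")

tokenizer = AutoTokenizer.from_pretrained(base_model_path)
tokenizer.save_pretrained("guanaco-7b-leh-v2-merged")
```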
Documentation
Some Examples
As shown in the following images, although Guanaco can reply fluently, the content can be quite confusing, so you may want to add some context in the System part of the prompt.

I used Guanaco with instructions to translate a Chinese article into Japanese, German, and English. Then I used GPT-4 to score them and obtained the following results:

License
This project is licensed under the GPL-3.0 license.
Information Table
| Property | Details |
|----------|---------|
| Model Type | Multilingual instruction-following language model based on LLaMA 7B |
| Training Data | alpaca-cleaned, guanaco |
| Supported Languages | English, Chinese, Japanese |
| Tags | llama, guanaco, alpaca, lora, finetune |