🚀 Model Card for BELLE
BELLE is a large language model based on Bloomz-7b1-mt. It is finetuned on a combination of 2 million Chinese instruction samples and 50,000 English samples from Stanford-Alpaca, giving it strong Chinese instruction understanding and response generation capabilities.
🚀 Quick Start
If you find this model helpful, please like this model and star us on GitHub!
✨ Features
- Multilingual Capability: Supports both Chinese and English, suitable for a wide range of language tasks.
- Instruction Understanding: Can accurately understand various instructions and generate appropriate responses.
- Multiple Model Versions: Trained on datasets of different sizes, providing users with more choices.
📦 Installation
No dedicated installation step is required; the usage examples below only need the `transformers` library and a PyTorch backend.
💻 Usage Examples
Basic Usage
Please note that the input should be formatted as follows in both training and inference:
```
Human: {input} \n\nAssistant:
```
BELLE can be easily loaded with `AutoModelForCausalLM`:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "./"  # modify to the directory where the local model is stored
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

print("Human:")
line = input()
while line:
    # Wrap the user input in the Human/Assistant format used during training.
    inputs = 'Human: ' + line.strip() + '\n\nAssistant:'
    input_ids = tokenizer(inputs, return_tensors="pt").input_ids
    outputs = model.generate(
        input_ids,
        max_new_tokens=200,
        do_sample=True,
        top_k=30,
        top_p=0.85,
        temperature=0.35,
        repetition_penalty=1.2,
    )
    rets = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    # Strip the echoed prompt so only the model's reply is printed.
    print("Assistant:\n" + rets[0].strip().replace(inputs, ""))
    print("\n------------------------------------------------\nHuman:")
    line = input()
```
Advanced Usage
The example above does not use context from previous turns of the conversation. For multi-turn dialogue understanding, concatenate the previous turns in the format `Human:{input}\n\nAssistant:{output}` and pass the result to the `generate` function.
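As a sketch, this concatenation can be wrapped in a small helper. The `build_prompt` name and the blank-line separator between turns are assumptions, not part of the original code; check the project repository for the exact separator used in training.

```python
def build_prompt(history, user_input):
    """Build a multi-turn prompt in BELLE's Human/Assistant format.

    `history` is a list of (human, assistant) string pairs from earlier
    turns; the final line ends with "Assistant:" so the model continues
    from there.
    """
    prompt = ""
    for human, assistant in history:
        prompt += f"Human: {human}\n\nAssistant:{assistant}\n\n"
    prompt += f"Human: {user_input}\n\nAssistant:"
    return prompt
```

The returned string is then tokenized and passed to `model.generate` exactly as in the basic example.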
📚 Documentation
Model description
BELLE is based on Bloomz-7b1-mt and finetuned on 2 million Chinese instruction samples combined with 50,000 English samples from the open-source Stanford-Alpaca dataset, resulting in good Chinese instruction understanding and response generation capabilities.
The code for Chinese data generation and other details can be found in our GitHub project repository.
We trained models using datasets of different sizes (200,000, 600,000, 1,000,000, and 2,000,000 samples) for instruction learning, and we obtained different model versions as shown below:
Datasize | 200,000 | 600,000 | 1,000,000 | 2,000,000 |
---|---|---|---|---|
Finetuned Model | BELLE-7B-0.2M | BELLE-7B-0.6M | BELLE-7B-1M | BELLE-7B-2M |
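If the checkpoints are fetched from the Hugging Face Hub, the table above can be expressed as a small lookup. The `BelleGroup/...` repository ids are an assumption here; verify them on the Hub before use.

```python
# Mapping from training-data size to released checkpoint.
# The "BelleGroup/..." repository ids are an assumption; verify on the Hub.
BELLE_CHECKPOINTS = {
    "0.2M": "BelleGroup/BELLE-7B-0.2M",
    "0.6M": "BelleGroup/BELLE-7B-0.6M",
    "1M": "BelleGroup/BELLE-7B-1M",
    "2M": "BelleGroup/BELLE-7B-2M",
}

def checkpoint_for(datasize: str) -> str:
    """Return the repository id for a given training-data size."""
    return BELLE_CHECKPOINTS[datasize]
```

Loading then mirrors the basic example: `AutoModelForCausalLM.from_pretrained(checkpoint_for("2M"))`.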
Training hyper-parameters
Parameter | Value |
---|---|
Batch size | 64 |
Learning rate | 3e-6 |
Epochs | 3 |
Weight_decay | 0.001 |
Warmup_rate | 0.1 |
LR_scheduler | linear |
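For reference, these hyper-parameters map onto a standard Hugging Face `TrainingArguments` configuration. This is a sketch only; the output directory is hypothetical, and the original training setup (e.g. gradient accumulation or device count) is not specified in this card.

```python
from transformers import TrainingArguments

# Hyper-parameters from the table above; "./belle-7b-finetune" is a
# hypothetical output directory.
training_args = TrainingArguments(
    output_dir="./belle-7b-finetune",
    per_device_train_batch_size=64,  # effective batch size from the table
    learning_rate=3e-6,
    num_train_epochs=3,
    weight_decay=0.001,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
)
```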
Some use cases
Human | Assistant |
---|---|
Help write a prose poem with the theme of "taking root. One day, we will become towering trees." | In the journey of life, each of us will encounter various challenges and difficulties. However, these difficulties are not reasons for us to stay stagnant but opportunities for us to grow and progress. Only by constantly striving to take root can we eventually become towering trees. Firstly, taking root requires us to have firm beliefs and goals. People without goals and beliefs hardly have the motivation to move forward. Therefore, we need to clarify our ideals and pursuits and make efforts to achieve them. Only in the process of continuous effort can we gradually take root and form a strong inner strength. Secondly, taking root requires us to have a diligent spirit. Success does not depend on talent but on hard work and perseverance. Only by constantly learning, practicing, and accumulating experience can we gradually improve our abilities and levels. This spirit is also what we need when facing various difficulties because only by persevering can we overcome all difficulties and move towards the path of success. Finally, taking root requires us to have the courage to accept failure. Setbacks and failures are inevitable in life, but we should not be discouraged or give up. Instead, we should learn from failures, summarize experiences and lessons, and constantly adjust our directions and methods. Only by being brave in trying and innovating can we truly take deep roots and realize our life values. In conclusion, taking root is a long and challenging process that requires our continuous effort and exploration. Only by constantly strengthening our roots can we walk more steadily on the future path and get closer to our dreams. |
Generate a dialogue between a real estate agent and a customer. | Agent: Hello! I'm an agent from XX Real Estate Agency. What information would you like to know? Customer: I'm looking for an apartment where I can live comfortably and safely. Agent: There are many such listings in our company. I can introduce some suitable options to you. Customer: Okay, please tell me some details. Agent: This apartment is located in the city center, within walking distance of many shops and restaurants, and the transportation is very convenient. In addition, it is equipped with modern facilities and a security system to ensure a safer and more comfortable living experience for you. Customer: Sounds great. I'd like to see the photos and videos. Agent: Of course, here are several listings in this area. You can choose the most suitable apartment from them. Customer: Okay, I'll consider it. Agent: Thank you very much for your consultation. If you have any other questions or need help, please feel free to contact me. |
Classify the sentiment of the following sentence into Positive, Neutral, or Negative: Super excited about teaching Stanford’s first course on Large Language Models! Check the syllabus out here | Positive |
List 3 different machine learning algorithms and explain their application scopes. | 1. Support Vector Machine (SVM) - Suitable for classification and regression problems, especially for handling non-linear datasets. 2. Decision Tree - Suitable for classification, regression, and clustering problems, and can be well understood and visualized. 3. Random Forest - Suitable for classification and regression problems, capable of effectively handling high-dimensional data and problems with difficult feature selection. |
When was Jay Chou born? | Jay Chou was born on January 18, 1979. |
When should women get married? | It depends on personal wishes and circumstances. Some women may want to get married earlier, while others may prefer to marry later. In addition, factors such as women's education levels and career development will also affect their decisions. |
Help me write a weekly report with the themes of "ChatGPT" and "customer service". | Dear leaders and colleagues: This week, our team's main work was to use ChatGPT (a chatbot platform) to improve the quality of customer service. Here is a brief report on this topic: 1. Using ChatGPT for online chat In the past week, our team used ChatGPT for online chat. ChatGPT can provide more intelligent answers, enabling customers to get the help they need more quickly. In addition, it can also automatically recommend relevant products and services according to customers' needs. This has greatly improved the efficiency and quality of our service. 2. Optimizing the customer service process Last week, our team made some improvements to the customer service process. We found that if customers encounter complex problems or need more support, their satisfaction will decrease. Therefore, we added some new functions to the customer service platform, such as self-service Q&A and knowledge bases, which can help customers better solve problems and obtain more information. 3. Strengthening training and support To further improve our customer service level, our team is strengthening employee training and support. We will regularly organize internal training and external learning opportunities to enable employees to continuously update their skills and knowledge. At the same time, we have also strengthened communication with customers to understand their needs and feedback to better meet their requirements. In general, this week's key work was to use ChatGPT to improve the customer service experience and optimize the customer service process. We will also continue to strengthen employee training and support to improve the service ability and level of the entire team. Thank you for your attention and support! |
Translate the sentence "Playing a good game of Go requires strong computing ability" into English. | To play well in a game of Go requires strong computing ability. |
📄 License
The model is licensed under the Apache-2.0 license.
Citation
Please cite us when using our code, data, or models:
```bibtex
@misc{BELLE,
  author = {Yunjie Ji, Yong Deng, Yan Gong, Yiping Peng, Qiang Niu, Baochang Ma, Xiangang Li},
  title = {BELLE: Bloom-Enhanced Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}
```
Also, please cite the original BLOOM, Stanford Alpaca, and Self-Instruct papers.
⚠️ Important Note
A few issues remain in the model trained on the current base model and data:
- The model may produce factual errors when asked to follow instructions involving facts.
- It occasionally generates harmful responses, as it still struggles to identify potentially harmful instructions.
- Reasoning and coding abilities still need improvement.
Since the model still has these limitations, we require that developers use the open-sourced code, data, models, and any other artifacts produced by this project for research purposes only. Commercial use and other potentially harmful use cases are not allowed.

