🚀 Model Card for BELLE
BELLE is a large language model based on Bloomz-7b1-mt. It is finetuned on a combination of 2 million Chinese instruction samples and 50,000 English samples from Stanford-Alpaca, giving it strong Chinese instruction understanding and response generation capabilities.
🚀 Quick Start
If you find this model helpful, please like this model and star us on GitHub!
✨ Features
- Multilingual Capability: Supports both Chinese and English, suitable for a wide range of language tasks.
- Instruction Understanding: Can accurately understand various instructions and generate appropriate responses.
- Multiple Model Versions: Trained on datasets of different sizes, providing users with more choices.
📦 Installation
No dedicated installation step is required; the usage examples below only need the `transformers` library and a PyTorch backend.
💻 Usage Examples
Basic Usage
Please note that the input should be formatted as follows in both training and inference:
```
Human: {input} \n\nAssistant:
```
BELLE can be easily loaded with `AutoModelForCausalLM`:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "./"  # modify to the directory where the local model is stored
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

print("Human:")
line = input()
while line:
    # Wrap the user input in the Human/Assistant format used during training.
    inputs = 'Human: ' + line.strip() + '\n\nAssistant:'
    input_ids = tokenizer(inputs, return_tensors="pt").input_ids
    outputs = model.generate(
        input_ids,
        max_new_tokens=200,
        do_sample=True,
        top_k=30,
        top_p=0.85,
        temperature=0.35,
        repetition_penalty=1.2,
    )
    rets = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    # Strip the echoed prompt so only the model's reply is printed.
    print("Assistant:\n" + rets[0].strip().replace(inputs, ""))
    print("\n------------------------------------------------\nHuman:")
    line = input()
```
Advanced Usage
The example above does not use context from previous turns of the conversation. For multi-turn dialogue understanding, concatenate the previous turns in the format `Human:{input}\n\nAssistant:{output}` and pass the result to the `generate` function.
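As a sketch, this concatenation can be wrapped in a small helper. The `build_prompt` name and the blank-line separator between turns are assumptions, not part of the original code; check the project repository for the exact separator used in training.

```python
def build_prompt(history, user_input):
    """Build a multi-turn prompt in BELLE's Human/Assistant format.

    `history` is a list of (human, assistant) string pairs from earlier
    turns; the final line ends with "Assistant:" so the model continues
    from there.
    """
    prompt = ""
    for human, assistant in history:
        prompt += f"Human: {human}\n\nAssistant:{assistant}\n\n"
    prompt += f"Human: {user_input}\n\nAssistant:"
    return prompt
```

The returned string is then tokenized and passed to `model.generate` exactly as in the basic example.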
📚 Documentation
Model description
BELLE is based on Bloomz-7b1-mt and finetuned on 2 million Chinese instruction samples combined with 50,000 English samples from the open-source Stanford-Alpaca dataset, resulting in good Chinese instruction understanding and response generation capabilities.
The code for Chinese data generation and other details can be found in our GitHub project repository.
We trained models using datasets of different sizes (200,000, 600,000, 1,000,000, and 2,000,000 samples) for instruction learning, and we obtained different model versions as shown below:
Datasize | 200,000 | 600,000 | 1,000,000 | 2,000,000 |
---|---|---|---|---|
Finetuned Model | BELLE-7B-0.2M | BELLE-7B-0.6M | BELLE-7B-1M | BELLE-7B-2M |
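If the checkpoints are fetched from the Hugging Face Hub, the table above can be expressed as a small lookup. The `BelleGroup/...` repository ids are an assumption here; verify them on the Hub before use.

```python
# Mapping from training-data size to released checkpoint.
# The "BelleGroup/..." repository ids are an assumption; verify on the Hub.
BELLE_CHECKPOINTS = {
    "0.2M": "BelleGroup/BELLE-7B-0.2M",
    "0.6M": "BelleGroup/BELLE-7B-0.6M",
    "1M": "BelleGroup/BELLE-7B-1M",
    "2M": "BelleGroup/BELLE-7B-2M",
}

def checkpoint_for(datasize: str) -> str:
    """Return the repository id for a given training-data size."""
    return BELLE_CHECKPOINTS[datasize]
```

Loading then mirrors the basic example: `AutoModelForCausalLM.from_pretrained(checkpoint_for("2M"))`.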
Training hyper-parameters
Parameter | Value |
---|---|
Batch size | 64 |
Learning rate | 3e-6 |
Epochs | 3 |
Weight_decay | 0.001 |
Warmup_rate | 0.1 |
LR_scheduler | linear |
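For reference, these hyper-parameters map onto a standard Hugging Face `TrainingArguments` configuration. This is a sketch only; the output directory is hypothetical, and the original training setup (e.g. gradient accumulation or device count) is not specified in this card.

```python
from transformers import TrainingArguments

# Hyper-parameters from the table above; "./belle-7b-finetune" is a
# hypothetical output directory.
training_args = TrainingArguments(
    output_dir="./belle-7b-finetune",
    per_device_train_batch_size=64,  # effective batch size from the table
    learning_rate=3e-6,
    num_train_epochs=3,
    weight_decay=0.001,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
)
```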
Some use cases
Human | Assistant |
---|---|
Help write a prose poem with the theme of "taking root. One day, we will become towering trees." | In the journey of life, each of us will encounter various challenges and difficulties. However, these difficulties are not reasons for us to stay stagnant but opportunities for us to grow and progress. Only by constantly striving to take root can we eventually become towering trees. Firstly, taking root requires us to have firm beliefs and goals. People without goals and beliefs hardly have the motivation to move forward. Therefore, we need to clarify our ideals and pursuits and make efforts to achieve them. Only in the process of continuous effort can we gradually take root and form a strong inner strength. Secondly, taking root requires us to have a diligent spirit. Success does not depend on talent but on hard work and perseverance. Only by constantly learning, practicing, and accumulating experience can we gradually improve our abilities and levels. This spirit is also what we need when facing various difficulties because only by persevering can we overcome all difficulties and move towards the path of success. Finally, taking root requires us to have the courage to accept failure. Setbacks and failures are inevitable in life, but we should not be discouraged or give up. Instead, we should learn from failures, summarize experiences and lessons, and constantly adjust our directions and methods. Only by being brave in trying and innovating can we truly take deep roots and realize our life values. In conclusion, taking root is a long and challenging process that requires our continuous effort and exploration. Only by constantly strengthening our roots can we walk more steadily on the future path and get closer to our dreams. |
Generate a dialogue between a real estate agent and a customer. | Agent: Hello! I'm an agent from XX Real Estate Agency. What information would you like to know? Customer: I'm looking for an apartment where I can live comfortably and safely. Agent: There are many such listings in our company. I can introduce some suitable options to you. Customer: Okay, please tell me some details. Agent: This apartment is located in the city center, within walking distance of many shops and restaurants, and the transportation is very convenient. In addition, it is equipped with modern facilities and a security system to ensure a safer and more comfortable living experience for you. Customer: Sounds great. I'd like to see the photos and videos. Agent: Of course, here are several listings in this area. You can choose the most suitable apartment from them. Customer: Okay, I'll consider it. Agent: Thank you very much for your consultation. If you have any other questions or need help, please feel free to contact me. |
Classify the sentiment of the following sentence into Positive, Neutral, or Negative: Super excited about teaching Stanford’s first course on Large Language Models! Check the syllabus out here | Positive |
List 3 different machine learning algorithms and explain their application scopes. | 1. Support Vector Machine (SVM) - Suitable for classification and regression problems, especially for handling non-linear datasets. 2. Decision Tree - Suitable for classification, regression, and clustering problems, and can be well understood and visualized. 3. Random Forest - Suitable for classification and regression problems, capable of effectively handling high-dimensional data and problems with difficult feature selection. |
When was Jay Chou born? | Jay Chou was born on January 18, 1979. |
When should women get married? | It depends on personal wishes and circumstances. Some women may want to get married earlier, while others may prefer to marry later. In addition, factors such as women's education levels and career development will also affect their decisions. |
Help me write a weekly report with the themes of "ChatGPT" and "customer service". | Dear leaders and colleagues: This week, our team's main work was to use ChatGPT (a chatbot platform) to improve the quality of customer service. Here is a brief report on this topic: 1. Using ChatGPT for online chat In the past week, our team used ChatGPT for online chat. ChatGPT can provide more intelligent answers, enabling customers to get the help they need more quickly. In addition, it can also automatically recommend relevant products and services according to customers' needs. This has greatly improved the efficiency and quality of our service. 2. Optimizing the customer service process Last week, our team made some improvements to the customer service process. We found that if customers encounter complex problems or need more support, their satisfaction will decrease. Therefore, we added some new functions to the customer service platform, such as self-service Q&A and knowledge bases, which can help customers better solve problems and obtain more information. 3. Strengthening training and support To further improve our customer service level, our team is strengthening employee training and support. We will regularly organize internal training and external learning opportunities to enable employees to continuously update their skills and knowledge. At the same time, we have also strengthened communication with customers to understand their needs and feedback to better meet their requirements. In general, this week's key work was to use ChatGPT to improve the customer service experience and optimize the customer service process. We will also continue to strengthen employee training and support to improve the service ability and level of the entire team. Thank you for your attention and support! |
Translate the sentence "Playing a good game of Go requires strong computing ability" into English. | To play well in a game of Go requires strong computing ability. |
📄 License
The model is licensed under the Apache-2.0 license.
Citation
Please cite us when using our code, data, or models:
```bibtex
@misc{BELLE,
  author = {Yunjie Ji, Yong Deng, Yan Gong, Yiping Peng, Qiang Niu, Baochang Ma, Xiangang Li},
  title = {BELLE: Bloom-Enhanced Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}
```
Also, please cite the original BLOOM, Stanford Alpaca, and Self-Instruct papers.
⚠️ Important Note
A few issues remain in the model trained on the current base model and data:
- The model may produce factual errors when asked to follow instructions involving facts.
- It occasionally generates harmful responses, as it still struggles to identify potentially harmful instructions.
- Reasoning and coding abilities still need improvement.
Since the model still has these limitations, we require that developers use the open-sourced code, data, models, and any other artifacts produced by this project for research purposes only. Commercial use and other potentially harmful use cases are not allowed.

