🚀 Mistral-7B-Banking
A fine-tuned model for the banking domain, optimized to answer banking-related questions and assist with transactions.
🚀 Quick Start
The "Mistral-7B-Banking" model is a fine - tuned version of mistralai/Mistral-7B-Instruct-v0.2, specifically designed for the banking domain. It can answer questions and assist users with various banking transactions.
✨ Features
- Domain-specific: Tailored to the banking domain, providing accurate answers to banking-related queries.
- Customizable: A generic verticalized model that makes customization for a final use case much easier; for example, banks can further fine-tune it with their own data.
- Comprehensive training: Trained on a diverse dataset of banking-related intents.
📦 Installation
The model runs on the standard Hugging Face stack, so no model-specific installation is required. Install the dependencies used in the usage example below with `pip install transformers torch`.
💻 Usage Examples
Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained("bitext/Mistral-7B-Banking-v2")
tokenizer = AutoTokenizer.from_pretrained("bitext/Mistral-7B-Banking-v2")

messages = [
    {"role": "system", "content": "You are an expert in customer support for Banking."},
    {"role": "user", "content": "I want to open a bank account"},
]

# Apply the model's chat template, then move inputs and model to the target device
encoded = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encoded.to(device)
model.to(device)

# Generate a response and decode it back to text
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
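Note that `batch_decode` returns the full transcript, including the echoed prompt. If you only need the assistant's reply, a minimal variant (reusing the variables from the example above) is to slice off the prompt tokens before decoding:

```python
# Decode only the newly generated tokens, skipping the echoed prompt
# and any special tokens in the output.
response_ids = generated_ids[0][model_inputs.shape[-1]:]
print(tokenizer.decode(response_ids, skip_special_tokens=True))
```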
📚 Documentation
Model Architecture
This model uses the `MistralForCausalLM` architecture with a `LlamaTokenizer`, retaining the foundational capabilities of the base model while being specifically enhanced for banking-related interactions.
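As a quick sanity check (assuming the hosted checkpoint follows the standard Hugging Face layout), you can confirm the architecture and tokenizer classes from the model's config without downloading the full weights:

```python
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("bitext/Mistral-7B-Banking-v2")
print(config.architectures)  # expected: ['MistralForCausalLM']
print(config.model_type)     # expected: 'mistral'

tokenizer = AutoTokenizer.from_pretrained("bitext/Mistral-7B-Banking-v2")
print(type(tokenizer).__name__)  # expected: a LlamaTokenizer(Fast) class
```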
Training Data
The model was fine-tuned on a dataset comprising various banking-related intents, including transactions such as balance checks, money transfers, loan applications, and more, totaling 89 intents, each represented by approximately 1,000 examples. This comprehensive training helps the model address a broad spectrum of banking-related questions effectively. The dataset follows the same structured approach as our dataset published on Hugging Face as [bitext/Bitext-customer-support-llm-chatbot-training-dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset), but with a focus on banking.
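Because the banking dataset shares its schema with the public customer-support dataset linked above, you can inspect that structure directly. The field names shown below are those of the public dataset and may differ slightly:

```python
from datasets import load_dataset

# Load the public sibling dataset to inspect the shared intent-based schema.
ds = load_dataset("bitext/Bitext-customer-support-llm-chatbot-training-dataset", split="train")
print(ds.column_names)       # e.g. ['flags', 'instruction', 'category', 'intent', 'response']
print(ds[0]["instruction"])  # a user utterance labeled with its intent
```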
Training Procedure
Hyperparameters
- Optimizer: AdamW
- Learning Rate: 0.0002 with a cosine learning rate scheduler
- Epochs: 3
- Batch Size: 4
- Gradient Accumulation Steps: 4
- Maximum Sequence Length: 8192 tokens
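Bitext has not published its training script, but as a rough sketch, these hyperparameters map onto the Hugging Face `TrainingArguments` API as follows (argument names are from `transformers`; the `output_dir` and the exact optimizer variant are assumptions):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-banking",   # hypothetical output path
    optim="adamw_torch",               # AdamW optimizer
    learning_rate=2e-4,                # 0.0002
    lr_scheduler_type="cosine",        # cosine learning rate scheduler
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,     # effective batch size of 16
)
# The 8192-token maximum sequence length is applied at tokenization time
# (e.g. via the tokenizer's max_length), not through TrainingArguments.
```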
Environment
- Transformers Version: 4.43.4
- Framework: PyTorch 2.3.1+cu121
- Tokenizers: 0.19.1
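To reproduce results closely, a quick way to compare your local environment against these versions (a convenience check, not a hard requirement) is:

```python
import torch
import transformers
import tokenizers

# Versions used during training, per the list above.
print(transformers.__version__)  # expected: 4.43.4
print(torch.__version__)         # expected: 2.3.1+cu121
print(tokenizers.__version__)    # expected: 0.19.1
```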
Intended Use
- Recommended applications: This model is designed to be used as the first step in Bitext's two-step approach to LLM fine-tuning for the creation of chatbots, virtual assistants, and copilots for the banking domain, providing customers with fast and accurate answers about their banking needs (see the sketch after this list).
- Out-of-scope: This model is not suited for non-banking questions and should not be used to provide health, legal, or critical safety advice.
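As a hedged illustration of that second customization step, a parameter-efficient fine-tuning setup using LoRA via the `peft` library might look like the sketch below; the adapter hyperparameters and target modules are illustrative assumptions, not Bitext's published recipe:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Start from the banking model and attach LoRA adapters for bank-specific data.
base = AutoModelForCausalLM.from_pretrained("bitext/Mistral-7B-Banking-v2")
lora_config = LoraConfig(
    r=16,                                 # adapter rank (illustrative)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # common choice for Mistral-style models
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```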
Limitations and Bias
- The model is trained for banking-specific contexts but may underperform in unrelated areas.
- Potential biases in the training data could affect the neutrality of the responses; users are encouraged to evaluate responses critically.
Ethical Considerations
It is important to use this technology thoughtfully, ensuring it does not substitute for human judgment where necessary, especially in sensitive financial situations.
Acknowledgments
This model was developed and trained by Bitext using proprietary data and technology.
📄 License
This model, "Mistral-7B-Banking", is licensed under the Apache License 2.0 by Bitext Innovations International, Inc. This open - source license allows for free use, modification, and distribution of the model but requires that proper credit be given to Bitext.
Key Points of the Apache 2.0 License
- Permissive use: Users may freely use, modify, and distribute this model.
- Attribution: You must provide proper credit to Bitext Innovations International, Inc. when using this model, in accordance with the original copyright notices and the license.
- Patent Grant: The license includes a grant of patent rights from the contributors of the model.
- No Warranty: The model is provided "as is" without warranties of any kind.
You may view the full license text at [Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0).
| Property | Details |
|----------|---------|
| Model Type | mistral |
| Training Data | The model was fine-tuned on a dataset comprising various banking-related intents, including transactions such as balance checks, money transfers, loan applications, and more, totaling 89 intents, each represented by approximately 1,000 examples. The dataset follows the same structured approach as [bitext/Bitext-customer-support-llm-chatbot-training-dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset), but with a focus on banking. |
⚠️ Important Note
The model is trained for banking-specific contexts but may underperform in unrelated areas. Potential biases in the training data could affect the neutrality of the responses; users are encouraged to evaluate responses critically.
💡 Usage Tip
Use this model as the first step in Bitext's two-step approach to LLM fine-tuning for the creation of chatbots, virtual assistants, and copilots for the banking domain. Do not use it for non-banking questions or for providing health, legal, or critical safety advice.