# Jamba-v0.1 Chat Multilingual Model
This is a small-scale experimental model based on ai21labs/Jamba-v0.1, fine-tuned to work as a chatbot capable of chatting in multiple languages.
## Quick Start
This model was trained as a small-scale experiment to figure out how easy it is to fine-tune [ai21labs/Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1) to function as a chatbot. The goal was to see how intelligently and reliably Jamba can chat in English and other languages after only a few hours of QLoRA fine-tuning. Initial subjective testing indicates that this model can chat reasonably well in both English and other languages, so feel free to give it a try!
## Features
- Multilingual Chat: Can chat in both English and other languages.
- System Message Control: Can be steered using system messages, with varying success in rhyming and modulating English difficulty.
- Multi-turn Interaction: Shows decent multi-turn accuracy.
## Installation
Jamba requires `transformers` version 4.39.0 or higher:

```bash
pip install "transformers>=4.39.0"
```
To run optimized Mamba implementations, you first need to install `mamba-ssm` and `causal-conv1d`:

```bash
pip install mamba-ssm "causal-conv1d>=1.2.0"
```
You also have to have the model on a CUDA device.
You can run the model without the optimized Mamba kernels, but this is not recommended, as it results in significantly higher latency. To do so, specify `use_mamba_kernels=False` when loading the model.
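For example, here is a minimal sketch of loading the model with the optimized kernels disabled (reusing the model name from the usage examples below; illustrative only):

```python
from transformers import AutoModelForCausalLM

# Illustrative sketch: load with the optimized Mamba kernels disabled.
# Expect noticeably slower generation than with mamba-ssm / causal-conv1d installed.
model = AutoModelForCausalLM.from_pretrained(
    "lightblue/Jamba-v0.1-chat-multilingual",
    device_map="auto",
    use_mamba_kernels=False,
    trust_remote_code=True,
)
```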
## Usage Examples
### Basic Usage
Note: this code automatically prepends the `<|startoftext|>` special token to any input (the tokenizer adds it by default). Prepending this token to all inputs is required for inference, as initial testing shows that leaving it out leads to output errors. An optional check is shown after the snippet below.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# 4-bit double quantization so the model fits in less GPU memory
double_quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16
)

model = AutoModelForCausalLM.from_pretrained(
    "lightblue/Jamba-v0.1-chat-multilingual",
    device_map="auto",
    quantization_config=double_quant_config,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("lightblue/Jamba-v0.1-chat-multilingual")

# Prompt in the <|im_start|>/<|im_end|> chat format the model was trained on
input_text = """<|im_start|>system
You are a helpful AI assistant.
<|im_end|>
<|im_start|>user
What is the most interesting fact about kangaroos that you know?
<|im_end|>
<|im_start|>assistant
"""

input_ids = tokenizer(input_text, return_tensors='pt').to(model.device)["input_ids"]
# Greedy decoding (temperature is ignored since do_sample defaults to False) with a mild repetition penalty
outputs = model.generate(input_ids, max_new_tokens=256, temperature=0.0, repetition_penalty=1.1)
print(tokenizer.batch_decode(outputs)[0])
```
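If you want to confirm the `<|startoftext|>` behaviour described in the note above, a small optional check (assuming the `tokenizer` and `input_ids` from the snippet) is:

```python
# Optional sanity check: the tokenizer should have prepended the <|startoftext|> BOS token.
bos_id = tokenizer.convert_tokens_to_ids("<|startoftext|>")
assert input_ids[0, 0].item() == bos_id, "expected <|startoftext|> at the start of the input"
```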
### Advanced Usage
The model can be controlled by system messages. For example, you can make the model write in simple or complex English, or have it answer in rhyme. Here is an example:
```
<|startoftext|><|startoftext|><|im_start|>system
You are a helpful AI assistant. You write all answers in very simple English.
<|im_end|>
<|im_start|>user
Write a 50 word analysis of why sausages are better than bacon.
<|im_end|>
<|im_start|>assistant
Sausages are better than bacon because they have more variety and can be eaten for breakfast, lunch, or dinner. They come in different flavors and sizes, making them versatile. Sausages also contain less fat and calories compared to bacon, which is healthier for people who want to watch their diet.<|im_end|>
<|endoftext|>
```
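As a sketch, the transcript above can be reproduced programmatically by assembling the `<|im_start|>`/`<|im_end|>` prompt from a list of messages (this reuses the `model` and `tokenizer` from Basic Usage; the helper function is illustrative, not part of the model):

```python
# Illustrative helper: build a prompt in the <|im_start|>/<|im_end|> format shown above.
def build_prompt(messages):
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}\n<|im_end|>\n"
    prompt += "<|im_start|>assistant\n"  # leave the assistant turn open for generation
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful AI assistant. You write all answers in very simple English."},
    {"role": "user", "content": "Write a 50 word analysis of why sausages are better than bacon."},
]

input_ids = tokenizer(build_prompt(messages), return_tensors="pt").to(model.device)["input_ids"]
outputs = model.generate(input_ids, max_new_tokens=256, repetition_penalty=1.1)
print(tokenizer.batch_decode(outputs)[0])
```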
## Documentation
### Initial testing results
Overall, the model seems to have decent knowledge in some areas while hallucinating incorrect information in others. It also seems fairly easy to steer with the system message: it can rhyme and modulate English difficulty with varying levels of success. Multi-turn accuracy seems decent from very limited testing. The multilingual tests show that this model fairly reliably replies in whatever language it has been prompted in.
### Training details
The model was trained on two open-source datasets (one multilingual) for one epoch on an A100 (80GB) x 4 environment for 3 hours.
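The exact training recipe is not reproduced in this card. As a very rough sketch of what a comparable QLoRA setup might look like (the LoRA hyperparameters and `target_modules` choice here are assumptions for illustration, not the actual configuration):

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Illustrative QLoRA setup only; hyperparameters and target modules are assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "ai21labs/Jamba-v0.1",
    device_map="auto",
    quantization_config=bnb_config,
    trust_remote_code=True,
)
base_model = prepare_model_for_kbit_training(base_model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",  # assumption; adjust to Jamba's linear layer names
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
# A standard supervised fine-tuning loop (e.g. an SFT trainer over the chat datasets) would follow here.
```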
#### Training data
- [jondurbin/airoboros-3.2](https://huggingface.co/datasets/jondurbin/airoboros-3.2)
  A ~59K-example dataset of curated LLM tasks in English, primarily generated with GPT-4. This dataset has been used by some of the best-performing open-source LLMs in the world (e.g. [jondurbin/bagel-7b-v0.4](https://huggingface.co/jondurbin/bagel-7b-v0.4), [NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO](https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO)).
## Technical Details

| Property | Details |
|----------|---------|
| Model Type | Joint Attention and Mamba (Jamba) |
| License | Apache 2.0 |
| Context length | 256K |
| Knowledge cutoff date | March 5, 2024 |
## License
This model is licensed under the Apache 2.0 license.