# Jamba-v0.1 Chat Multilingual Model
This is a small-scale experimental model based on ai21labs/Jamba-v0.1, fine-tuned to work as a chatbot capable of chatting in multiple languages.
## Quick Start
This model was trained as a small-scale experiment to figure out how easy it is to fine-tune [ai21labs/Jamba-v0.1](https://huggingface.co/ai21labs/Jamba-v0.1) to function as a chatbot. The goal was to see how intelligently and reliably Jamba can chat in English and other languages after only a few hours of QLoRA fine-tuning. Initial subjective testing indicates that this model can chat reasonably well in both English and other languages, so feel free to give it a try!
## Features
- Multilingual Chat: Can chat in both English and other languages.
- System Message Control: Can be steered using system messages, with varying success in rhyming and modulating English difficulty.
- Multi-turn Interaction: Shows decent multi-turn accuracy.
## Installation
Jamba requires `transformers` version 4.39.0 or higher:

```bash
pip install "transformers>=4.39.0"
```
To run optimized Mamba implementations, you first need to install `mamba-ssm` and `causal-conv1d`:

```bash
pip install mamba-ssm "causal-conv1d>=1.2.0"
```
You also have to have the model on a CUDA device.
You can run the model without the optimized Mamba kernels, but this is not recommended, as it results in significantly higher latency. To do so, specify `use_mamba_kernels=False` when loading the model.
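For example, here is a minimal sketch of loading the model with the optimized kernels disabled (reusing the model name from the usage examples below; illustrative only):

```python
from transformers import AutoModelForCausalLM

# Illustrative sketch: load with the optimized Mamba kernels disabled.
# Expect noticeably slower generation than with mamba-ssm / causal-conv1d installed.
model = AutoModelForCausalLM.from_pretrained(
    "lightblue/Jamba-v0.1-chat-multilingual",
    device_map="auto",
    use_mamba_kernels=False,
    trust_remote_code=True,
)
```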
## Usage Examples
### Basic Usage
Note: this code automatically prepends the `<|startoftext|>` special token to any input (the tokenizer adds it by default). Prepending this token to all inputs is required for inference, as initial testing shows that leaving it out leads to output errors. An optional check is shown after the snippet below.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# 4-bit double quantization so the model fits in less GPU memory
double_quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16
)

model = AutoModelForCausalLM.from_pretrained(
    "lightblue/Jamba-v0.1-chat-multilingual",
    device_map="auto",
    quantization_config=double_quant_config,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("lightblue/Jamba-v0.1-chat-multilingual")

# Prompt in the <|im_start|>/<|im_end|> chat format the model was trained on
input_text = """<|im_start|>system
You are a helpful AI assistant.
<|im_end|>
<|im_start|>user
What is the most interesting fact about kangaroos that you know?
<|im_end|>
<|im_start|>assistant
"""

input_ids = tokenizer(input_text, return_tensors='pt').to(model.device)["input_ids"]
# Greedy decoding (temperature is ignored since do_sample defaults to False) with a mild repetition penalty
outputs = model.generate(input_ids, max_new_tokens=256, temperature=0.0, repetition_penalty=1.1)
print(tokenizer.batch_decode(outputs)[0])
```
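If you want to confirm the `<|startoftext|>` behaviour described in the note above, a small optional check (assuming the `tokenizer` and `input_ids` from the snippet) is:

```python
# Optional sanity check: the tokenizer should have prepended the <|startoftext|> BOS token.
bos_id = tokenizer.convert_tokens_to_ids("<|startoftext|>")
assert input_ids[0, 0].item() == bos_id, "expected <|startoftext|> at the start of the input"
```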
### Advanced Usage
The model can be controlled by system messages. For example, you can make the model write in simple or complex English, or have it answer in rhyme. Here is an example:
```
<|startoftext|><|startoftext|><|im_start|>system
You are a helpful AI assistant. You write all answers in very simple English.
<|im_end|>
<|im_start|>user
Write a 50 word analysis of why sausages are better than bacon.
<|im_end|>
<|im_start|>assistant
Sausages are better than bacon because they have more variety and can be eaten for breakfast, lunch, or dinner. They come in different flavors and sizes, making them versatile. Sausages also contain less fat and calories compared to bacon, which is healthier for people who want to watch their diet.<|im_end|>
<|endoftext|>
```
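As a sketch, the transcript above can be reproduced programmatically by assembling the `<|im_start|>`/`<|im_end|>` prompt from a list of messages (this reuses the `model` and `tokenizer` from Basic Usage; the helper function is illustrative, not part of the model):

```python
# Illustrative helper: build a prompt in the <|im_start|>/<|im_end|> format shown above.
def build_prompt(messages):
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}\n<|im_end|>\n"
    prompt += "<|im_start|>assistant\n"  # leave the assistant turn open for generation
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful AI assistant. You write all answers in very simple English."},
    {"role": "user", "content": "Write a 50 word analysis of why sausages are better than bacon."},
]

input_ids = tokenizer(build_prompt(messages), return_tensors="pt").to(model.device)["input_ids"]
outputs = model.generate(input_ids, max_new_tokens=256, repetition_penalty=1.1)
print(tokenizer.batch_decode(outputs)[0])
```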
## Documentation
### Initial testing results
Overall, the model seems to have decent knowledge in some areas while hallucinating incorrect information in others. It also seems fairly easy to steer with the system message: it can rhyme and modulate English difficulty with varying levels of success. Multi-turn accuracy seems decent from very limited testing. The multilingual tests show that this model fairly reliably replies in whatever language it has been prompted in.
### Training details
The model was trained on two open-source datasets (one multilingual) for one epoch on an A100 (80GB) x 4 environment for 3 hours.
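The exact training recipe is not reproduced in this card. As a very rough sketch of what a comparable QLoRA setup might look like (the LoRA hyperparameters and `target_modules` choice here are assumptions for illustration, not the actual configuration):

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Illustrative QLoRA setup only; hyperparameters and target modules are assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "ai21labs/Jamba-v0.1",
    device_map="auto",
    quantization_config=bnb_config,
    trust_remote_code=True,
)
base_model = prepare_model_for_kbit_training(base_model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",  # assumption; adjust to Jamba's linear layer names
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
# A standard supervised fine-tuning loop (e.g. an SFT trainer over the chat datasets) would follow here.
```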
#### Training data
- [jondurbin/airoboros-3.2](https://huggingface.co/datasets/jondurbin/airoboros-3.2)
  A ~59K-example dataset of curated LLM tasks in English, primarily generated with GPT-4. This dataset has been used by some of the best-performing open-source LLMs in the world (e.g. [jondurbin/bagel-7b-v0.4](https://huggingface.co/jondurbin/bagel-7b-v0.4), [NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO](https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO)).
## Technical Details

| Property | Details |
|----------|---------|
| Model Type | Joint Attention and Mamba (Jamba) |
| License | Apache 2.0 |
| Context length | 256K |
| Knowledge cutoff date | March 5, 2024 |
## License
This model is licensed under the Apache 2.0 license.