🚀 KARAKURI LM
KARAKURI LM is a pre-trained language model built on Llama 2. It enhances Llama 2's capabilities by adding more Japanese vocabulary and further pre-training on a mix of Japanese and multilingual corpora. KARAKURI LM Chat, a fine-tuned version, is trained on various datasets using the SteerLM technique.
🚀 Quick Start
You can run the model using the `pipeline()` function from 🤗 Transformers:
```python
from transformers import pipeline, Conversation

# Load the chat model as a conversational pipeline
chatbot = pipeline("conversational", model="karakuri-ai/karakuri-lm-70b-chat-v0.1", device_map="auto", torch_dtype="auto")

conversation = Conversation("週末に日帰りで東京に遊びに行こうと思っています。日帰りなので、短時間で回れるおすすめの観光プランを教えてください。")
conversation = chatbot(conversation, max_new_tokens=512)
conversation.messages[-1]["content"]
```
✨ Features
- Enhanced Language Capabilities: Incorporates additional Japanese vocabulary and is pre-trained on a mixture of Japanese and multilingual corpora.
- Fine-Tuned for Conversations: KARAKURI LM Chat is fine-tuned using the SteerLM technique on a combination of public and private datasets.
- Continual Learning Approach: During fine-tuning, it uses a continual learning approach and incorporates unstructured corpora alongside conversational data.
- High Performance: Achieves the highest performance among Japanese open models on [MT-Bench-jp](https://api.wandb.ai/links/wandb-japan/6ff86bp3) and performance comparable to Llama 2 70B Chat on the original English [MT-Bench](https://huggingface.co/spaces/lmsys/mt-bench).
📦 Installation
The README does not provide dedicated installation steps; the model is used directly through the 🤗 Transformers library, as shown in the usage examples.
💻 Usage Examples
Basic Usage
```python
from transformers import pipeline, Conversation

chatbot = pipeline("conversational", model="karakuri-ai/karakuri-lm-70b-chat-v0.1", device_map="auto", torch_dtype="auto")

conversation = Conversation("週末に日帰りで東京に遊びに行こうと思っています。日帰りなので、短時間で回れるおすすめの観光プランを教えてください。")
conversation = chatbot(conversation, max_new_tokens=512)
conversation.messages[-1]["content"]
```
Advanced Usage
We use the following prompt template for multi-turn conversations in the Llama format, which includes an encoded string of multiple attribute values.
```python
messages = [
    {"role": "system", "content": "System prompt"},
    {"role": "user", "content": "User prompt"},
    {"role": "assistant", "content": "Model response"},
    {"role": "user", "content": "User prompt"},
]
chatbot.tokenizer.apply_chat_template(messages, tokenize=False)
```
If you want to change attribute values from the defaults specified in the template, you can override them by adding attribute values to the user messages:
```python
messages = [
    {"role": "user", "content": "User prompt", "helpfulness": 0, "complexity": 0},
]
chatbot.tokenizer.apply_chat_template(messages, tokenize=False)
```
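Beyond rendering the template as a string, a minimal sketch (not from the original README) for generating a response with the overridden attributes is to tokenize the rendered conversation and call the pipeline's underlying model:

```python
# Sketch only: reuse the chatbot pipeline's tokenizer and model to generate
# a reply steered by the attribute values set in `messages` above.
input_ids = chatbot.tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(chatbot.model.device)
output = chatbot.model.generate(input_ids, max_new_tokens=512)
print(chatbot.tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```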
📚 Documentation
Model Details
| Property | Details |
|----------|---------|
| Developed by | KARAKURI Inc. |
| Model Type | Causal decoder-only transformer language model |
| Languages | English and Japanese |
| Finetuned from | [karakuri-ai/karakuri-lm-70b-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-v0.1) |
| Contact | For questions and comments about the model, please email `karakuri-rd@karakuri.ai` |
Performance
At the time of release, KARAKURI LM 70B Chat v0.1 achieves the highest performance among Japanese open models on [MT-Bench-jp](https://api.wandb.ai/links/wandb-japan/6ff86bp3):
| Model | Size | Alignment | MT-Bench-jp |
|-------|------|-----------|-------------|
| GPT-4 | - | RLHF | 8.78 |
| GPT-3.5-Turbo | - | RLHF | 8.24 |
| Claude 2.1 | - | RLHF | 8.18 |
| Gemini Pro | - | RLHF | 7.17 |
| KARAKURI LM 70B Chat v0.1 | 70B | SteerLM | 6.43 |
| Qarasu-14B-Chat-Plus-Unleashed | 14B | SFT | 6.26 |
| Llama 2 70B Chat | 70B | RLHF | 5.23 |
| ELYZA-Japanese-Llama-2-13B | 13B | SFT | 5.05 |
| Japanese-StableLM-Instruct-Beta-70B | 70B | SFT | 5.03 |
| Swallow-70B-Instruct | 70B | SFT | 4.39 |
It also achieves performance comparable to Llama 2 70B Chat on the original English [MT-Bench](https://huggingface.co/spaces/lmsys/mt-bench):
| Model | Average | MT-Bench | MT-Bench-jp |
|-------|---------|----------|-------------|
| KARAKURI LM 70B Chat v0.1 | 6.52 | 6.61 | 6.43 |
| Llama 2 70B Chat | 6.04 | 6.86 | 5.23 |
Training
Training Datasets
- OASST2
- Our internal conversational datasets
Training Infrastructure
- Hardware: KARAKURI LM 70B was trained on 32 nodes of Amazon EC2 trn1.32xlarge instances.
- Software: We use code based on [neuronx-nemo-megatron](https://github.com/aws-neuron/neuronx-nemo-megatron).
🔧 Technical Details
The model uses a continual learning approach during fine-tuning. Unlike the common practice of relying solely on structured conversational datasets, it also incorporates unstructured corpora, similar to those used during its pre-training phase. The prompt template for multi-turn conversations in the Llama format includes an encoded string of nine attribute values in total: the first five are derived from HelpSteer, and the remaining four from OASST2. Each value is an integer ranging from 0 to 4.
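For illustration, the sketch below builds a user message carrying all nine attributes and renders it with the chat template, reusing the `chatbot` pipeline from the usage examples. The HelpSteer-derived names follow the public HelpSteer dataset (helpfulness, correctness, coherence, complexity, verbosity); the OASST2-derived names (quality, toxicity, humor, creativity) are an assumption based on the OASST2 label set, so check the tokenizer's chat template for the exact keys it expects.

```python
# Illustrative sketch; attribute keys other than those shown in the README
# ("helpfulness", "complexity") are assumptions drawn from the HelpSteer and
# OASST2 label sets. Each attribute takes an integer value in the range 0-4.
messages = [
    {
        "role": "user",
        "content": "User prompt",
        # HelpSteer-derived attributes
        "helpfulness": 4,
        "correctness": 4,
        "coherence": 4,
        "complexity": 2,
        "verbosity": 2,
        # OASST2-derived attributes (names assumed)
        "quality": 4,
        "toxicity": 0,
        "humor": 0,
        "creativity": 0,
    },
]
prompt = chatbot.tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
```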
📄 License
Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
Subject to the license above, and except for commercial purposes, you are free to share and adapt KARAKURI LM, provided that you must, in a recognizable and appropriate manner, (i) state that you are using KARAKURI LM developed by KARAKURI Inc., when you publish or make available to third parties KARAKURI LM, its derivative works or modification, or any output or results of KARAKURI LM or its derivative works or modification, and (ii) indicate your contributions, if you modified any material of KARAKURI LM.
If you plan to use KARAKURI LM for commercial purposes, please contact us beforehand. You are not authorized to use KARAKURI LM for commercial purposes unless we expressly grant you such rights.
If you have any questions regarding the interpretation of the above terms, please also feel free to contact us.
Acknowledgements
We gratefully acknowledge the support from AWS Japan through the [AWS LLM Development Support Program](https://aws.amazon.com/jp/local/llm-development-support-program/).
Citation
```bibtex
@misc{karakuri_lm_70b_chat_v01,
    author    = {{KARAKURI} {I}nc.},
    title     = {{KARAKURI} {LM} 70{B} {C}hat v0.1},
    year      = {2024},
    url       = {https://huggingface.co/karakuri-ai/karakuri-lm-70b-chat-v0.1},
    publisher = {Hugging Face},
    journal   = {Hugging Face repository}
}
```