🚀 Synatra-7B-v0.3-RP 🐧
Synatra-7B-v0.3-RP is a text-generation model based on the Mistral architecture, tuned for role-play (RP).
🚀 Quick Start
Since the chat_template already contains the instruction format, you can get started with the following code:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # use "cpu" if no GPU is available

model = AutoModelForCausalLM.from_pretrained("maywell/Synatra-7B-v0.3-RP")
tokenizer = AutoTokenizer.from_pretrained("maywell/Synatra-7B-v0.3-RP")

messages = [
    {"role": "user", "content": "바나나는 원래 하얀색이야?"},  # "Are bananas originally white?"
]

# The chat template applies the ChatML instruction format for us.
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model.to(device)
model_inputs = encodeds.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
✨ Features
- Based on the mistralai/Mistral-7B-Instruct-v0.1 base model.
- Follows the ChatML instruction format.
- Completed tasks:
  - Produced an RP-tuned model.
  - Cleaned the dataset.
  - Supplemented common-sense knowledge.
- Planned tasks include improving language understanding and changing the tokenizer.
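The ChatML layout that the chat template produces can be sketched as follows. This is a minimal illustration only; `to_chatml` is a hypothetical helper, and in practice `tokenizer.apply_chat_template` does this formatting for you:

```python
# Minimal sketch of the ChatML instruction format (illustration only;
# the real formatting is handled by tokenizer.apply_chat_template).
def to_chatml(messages):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")  # generation prompt for the model's reply
    return "\n".join(parts)

prompt = to_chatml([{"role": "user", "content": "Hello!"}])
print(prompt)
```

Each turn is wrapped in `<|im_start|>{role} … <|im_end|>` markers, and a trailing `<|im_start|>assistant` cues the model to respond.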
📦 Installation
The original card does not list explicit installation steps; the model loads through the standard transformers API.
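Assuming a standard PyTorch + transformers environment (an assumption, not something the original card specifies), installation would typically look like:

```shell
pip install transformers torch
```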
💻 Usage Examples
Basic Usage
The basic usage snippet is the same as the Quick Start example above.
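Note that `batch_decode` returns the full sequence, prompt included. To show only the model's reply, you can slice off the prompt tokens first; the indexing logic is illustrated below with plain lists and made-up token ids:

```python
# generate() returns the prompt tokens followed by the new tokens, so
# slicing at the prompt length isolates the reply (ids here are made up).
prompt_ids = [1, 42, 7, 99]           # hypothetical prompt token ids
generated = prompt_ids + [5, 6, 2]    # hypothetical generate() output
new_tokens = generated[len(prompt_ids):]
print(new_tokens)  # → [5, 6, 2]
```

With the tensors from the Quick Start example, the equivalent slice is `generated_ids[:, model_inputs.shape[-1]:]`, which can then be passed to `tokenizer.batch_decode(..., skip_special_tokens=True)`.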
📚 Documentation
Model Details
Model Benchmark
Ko-LLM-Leaderboard
On Benchmarking...
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
| --- | --- |
| Avg. | 57.38 |
| ARC (25-shot) | 62.2 |
| HellaSwag (10-shot) | 82.29 |
| MMLU (5-shot) | 60.8 |
| TruthfulQA (0-shot) | 52.64 |
| Winogrande (5-shot) | 76.48 |
| GSM8K (5-shot) | 21.15 |
| DROP (3-shot) | 46.06 |
Why is the benchmark score lower than the preview version's?
Apparently, the preview model used an Alpaca-style prompt, which has no prefix, while ChatML does.
🔧 Technical Details
The README does not provide specific technical details, so this section is skipped.
📄 License
This model is strictly for [non-commercial](https://creativecommons.org/licenses/by-nc/4.0/) (cc-by-nc-4.0) use only.
The model is completely free to use for non-commercial purposes (i.e., the base model, derivatives, and merges/mixes), as long as the cc-by-nc-4.0 license included in any parent repository and the non-commercial-use clause remain in place, regardless of other models' licences.
The licence may change after a new model is released. If you want to use this model for commercial purposes, contact the author.
⚠️ Important Note
Synatra is a personal project developed with the resources of a single person. If you like the model, you can support the research by buying the author a coffee. To become a sponsor, contact the author on Telegram: AlzarTakkarsen.