Open-source model vntl-llama3-8b-v2-gguf - Free deployment to facilitate English translation of Japanese visual novels

Vntl Llama3 8b V2 Gguf

Developed by lmg-anon

QLoRA fine-tuned version based on LLaMA 3 Youko, specialized in Japanese visual novel English translation tasks

Machine Translation | Supports Multiple Languages
#Japanese visual novel translation #Literal style optimization #Multi-line text support

Downloads 123

Release Time: 1/2/2025

Model Overview

This language model is optimized for translating Japanese visual novels into English. It was fine-tuned on the new VNTL dataset and shows significant improvements in accuracy and stability over the previous version.

Model Features

High-accuracy literal style

New dataset brings higher translation accuracy with output leaning toward literal style

Multi-line translation support

Compared to previous single-line limitations, it can now handle multi-line continuous text translation

Metadata enhancement

Supports guiding the translation process through character information and background knowledge

Stable output

Maintains low error rates even when running with high temperature parameters

Model Capabilities

Japanese to English translation

Visual novel text processing

Character dialogue translation

Culture-specific term translation

Use Cases

Game localization

Visual novel translation

Translating Japanese visual novel content into English

Translation output with improved accuracy and consistent style

Multimedia content production

Subtitle generation

Generating English subtitles for Japanese anime/games

Translation maintaining character tone consistency

🚀 LLaMA 3 Youko QLoRA Fine-tune for Japanese-English Translation

This project is a QLoRA fine-tune of LLaMA 3 Youko, leveraging a new version of the VNTL dataset. Its core value lies in enhancing the performance of large language models (LLMs) in translating Japanese visual novels to English, offering more accurate and stable translation results.

📄 License

License: llama3

📦 Dataset

Datasets: lmg-anon/VNTL-v5-1k

🗣️ Language

Supported Languages: ja, en

🧠 Base Model

Base Model: rinna/llama-3-youko-8b

🛠️ Pipeline Tag

Pipeline Tag: translation

🚀 Quick Start

This is a LLaMA 3 Youko QLoRA fine-tune, created using a new version of the VNTL dataset. The aim is to improve the performance of LLMs in translating Japanese visual novels to English. Unlike the previous version, this one doesn't include the "chat mode".

✨ Features

Enhanced Performance: The new version of VNTL 8B has been rebuilt and expanded from the ground up. It outperforms the previous version in terms of accuracy and stability, making far fewer mistakes even at high temperatures.
Prompt Format Change: Switched to the default LLaMA3 prompt format to address users' difficulties with the custom one.
Multi-line Translation Support: Added proper support for multi-line translations, while the old version only handled single lines.
Higher Translation Accuracy: Overall better translation accuracy, although the translations tend to be more literal compared to the previous version.

🔧 Technical Details

Training Parameters

This fine-tune uses hyperparameters similar to those of the previous version; the only difference is the brand-new dataset.

Parameter	Value
Rank	128
Alpha	32
Effective Batch Size	45
Warmup Ratio	0.02
Learning Rate	6e-5
Embedding Learning Rate	1e-5
Optimizer	grokadamw
LR Schedule	cosine
Weight Decay	0.01

Train Loss: 0.42
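
As a rough illustration of what these values imply, the sketch below computes the standard LoRA scaling factor (alpha / rank) and a learning-rate curve with linear warmup followed by cosine decay. Several things here are assumptions, not statements from the model card: the total step count (`TOTAL_STEPS`), the linear shape of the warmup, decay all the way to zero, and the use of plain rather than rank-stabilized LoRA scaling.

```python
import math

# Values taken from the table above.
RANK = 128
ALPHA = 32
LEARNING_RATE = 6e-5
WARMUP_RATIO = 0.02

# Hypothetical: the card does not state the number of optimizer steps.
TOTAL_STEPS = 1000


def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA multiplies the low-rank update by alpha / rank."""
    return alpha / rank


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup to the peak rate, then cosine decay to zero (assumed shape)."""
    warmup_steps = max(1, round(total_steps * WARMUP_RATIO))
    if step < warmup_steps:
        return LEARNING_RATE * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With rank 128 and alpha 32, `lora_scale(ALPHA, RANK)` gives 0.25, so under this assumption the adapter update is scaled down to a quarter before being added to the base weights.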

📚 Documentation

Notes

For this new version of VNTL 8B, the dataset has been rebuilt and expanded from scratch. It performs really well, outperforming the previous version in accuracy and stability. It makes far fewer mistakes even at high temperatures (though temperature 0 is still recommended for the best accuracy).

Some major changes in this version:

Switched to the default LLaMA3 prompt format since people had trouble with the custom one
Added proper support for multi-line translations (the old version only handled single lines)
Overall better translation accuracy

One thing to note: while the translations are more accurate, they tend to be more literal compared to the previous version.

Sampling Recommendations

💡 Usage Tip

For optimal results, it's highly recommended to use neutral sampling parameters (temperature 0 with no repetition penalty) when using this model.
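
These neutral settings can be written down as a small set of keyword arguments. The sketch below uses parameter names accepted by llama-cpp-python's `Llama.create_completion` (`temperature`, `repeat_penalty`, `stop`, `max_tokens`). That runtime is an assumption, since the model card does not prescribe one, and the `max_tokens` default is an arbitrary placeholder.

```python
def neutral_sampling(max_tokens: int = 256) -> dict:
    """Return greedy, penalty-free sampling settings for one translation call."""
    return {
        "temperature": 0.0,        # deterministic decoding, as recommended above
        "repeat_penalty": 1.0,     # 1.0 means no repetition penalty in llama.cpp
        "stop": ["<|eot_id|>"],    # stop at the end of one LLaMA 3 turn
        "max_tokens": max_tokens,  # placeholder limit, not from the model card
    }
```

The result could be passed as `llm.create_completion(prompt, **neutral_sampling())`, where `llm` is a hypothetical `llama_cpp.Llama` instance loaded from the GGUF file.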

Translation Prompt

This fine-tune uses the LLaMA 3 prompt format. Here is an example prompt for translation:

<|begin_of_text|><|start_header_id|>Metadata<|end_header_id|>

[character] Name: Uryuu Shingo (瓜生 新吾) | Gender: Male | Aliases: Onii-chan (お兄ちゃん)
[character] Name: Uryuu Sakuno (瓜生 桜乃) | Gender: Female<|eot_id|><|start_header_id|>Japanese<|end_header_id|>

[桜乃]: 『……ごめん』<|eot_id|><|start_header_id|>English<|end_header_id|>

[Sakuno]: 『... Sorry.』<|eot_id|><|start_header_id|>Japanese<|end_header_id|>

[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は可愛いから、いろいろ心配しちゃってたんだぞ俺」<|eot_id|><|start_header_id|>English<|end_header_id|>


The generated translation for that prompt, with temperature 0, is:

[Shingo]: "Nah, I know it’s weird to say this, but I’m glad you got lost. You’re so cute, Sakuno, so I was really worried about you."
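
A prompt in this layout can be assembled programmatically. The following is a minimal sketch, not the author's tooling: the helper names `block` and `build_prompt` are hypothetical, and the code assumes that earlier lines are supplied together with their translations and that the final English turn is left open for the model to complete.

```python
from typing import Iterable, Tuple

BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def block(role: str, text: str) -> str:
    """One completed turn: header, blank line, body, end-of-turn token."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{text}{EOT}"


def build_prompt(
    metadata: Iterable[str],
    history: Iterable[Tuple[str, str]],
    japanese: str,
) -> str:
    """Assemble a translation prompt that ends at an open English header."""
    parts = [BOS, block("Metadata", "\n".join(metadata))]
    for ja, en in history:  # earlier lines with their known translations
        parts.append(block("Japanese", ja))
        parts.append(block("English", en))
    parts.append(block("Japanese", japanese))
    # Leave the final English turn open so the model writes the translation.
    parts.append("<|start_header_id|>English<|end_header_id|>\n\n")
    return "".join(parts)
```

Passing the two `[character]` lines as `metadata`, the Sakuno exchange as the single `history` pair, and Shingo's line as `japanese` yields a prompt with the same structure as the example above.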

Trivia

The Metadata section isn't limited to character information: you can also add trivia and teach the model the correct reading (romanization) of words it struggles with.

Here's an example:

<|begin_of_text|><|start_header_id|>Metadata<|end_header_id|>

[character] Name: Uryuu Shingo (瓜生 新吾) | Gender: Male | Aliases: Onii-chan (お兄ちゃん)
[character] Name: Uryuu Sakuno (瓜生 桜乃) | Gender: Female
[element] Name: Murasamemaru (叢雨丸) | Type: Quality<|eot_id|><|start_header_id|>Japanese<|end_header_id|>

[桜乃]: 『……ごめん』<|eot_id|><|start_header_id|>English<|end_header_id|>

[Sakuno]: 『... Sorry.』<|eot_id|><|start_header_id|>Japanese<|end_header_id|>

[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は叢雨丸いから、いろいろ心配しちゃってたんだぞ俺」<|eot_id|><|start_header_id|>English<|end_header_id|>

The generated translation for that prompt, with temperature 0, is:

[Shingo]: "Nah, I know it’s not the best thing to say, but I’m glad you got lost. Sakuno’s Murasamemaru, so I was really worried about you, you know?"
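
Metadata lines like the ones above follow a simple "key: value" pattern separated by " | ", so they can be generated from structured data. The helper below is a hypothetical sketch. It reproduces only the `[character]` and `[element]` shapes shown in the examples; whether the model recognizes other entry kinds or field names is not stated in the card.

```python
def metadata_line(kind: str, name: str, native: str, **fields: str) -> str:
    """Format one Metadata entry, e.g. '[element] Name: X (Y) | Type: Z'."""
    parts = [f"Name: {name} ({native})"]
    # Keyword arguments keep their call order, so fields appear as written.
    parts += [f"{key}: {value}" for key, value in fields.items()]
    return f"[{kind}] " + " | ".join(parts)


lines = [
    metadata_line("character", "Uryuu Shingo", "瓜生 新吾",
                  Gender="Male", Aliases="Onii-chan (お兄ちゃん)"),
    metadata_line("character", "Uryuu Sakuno", "瓜生 桜乃", Gender="Female"),
    metadata_line("element", "Murasamemaru", "叢雨丸", Type="Quality"),
]
metadata_text = "\n".join(lines)  # body of the Metadata turn
```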
