🚀 Qwen2.5 32B for Japanese to English Light Novel translation
This model is fine-tuned to translate Japanese light novels and web novels into English. It can handle entire chapters, with up to 32K tokens combined across input and output.
🚀 Quick Start
This model was fine-tuned on light novels and web novels for Japanese-to-English translation. It can translate entire chapters (up to 32K tokens total for input and output).
✨ Features
- Large-scale translation: Capable of translating entire chapters with a combined input and output token limit of up to 32K.
- Glossary support: Allows users to provide custom translations for nouns and character names at runtime.
📦 Installation
Load the model in llama.cpp.
💻 Usage Examples
Basic Usage
Prompt format
<|im_start|>system
Translate this text from Japanese to English.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
Example:
<|im_start|>system
Translate this text from Japanese to English.<|im_end|>
<|im_start|>user
<GLOSSARY>
マイン : Myne
</GLOSSARY>
マイン、ルッツが迎えに来たよ<|im_end|>
<|im_start|>assistant
Myne, Lutz is here to take you home.
The glossary is optional. Remove it if not needed.
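The prompt format above can be assembled programmatically. The following is a minimal sketch, not the author's code: the helper name `build_prompt` is hypothetical, and the exact newline placement between the tags (including the newline after the final `assistant` tag) is an assumption based on the usual ChatML layout.

```python
SYSTEM_PROMPT = "Translate this text from Japanese to English."


def build_prompt(user_text: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Wrap prepared user text (optional glossary block + chapter) in the template.

    Hypothetical helper; the newline placement is an assumption.
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_text}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Same content as the example above: a one-entry glossary followed by the text.
user_text = "<GLOSSARY>\nマイン : Myne\n</GLOSSARY>\nマイン、ルッツが迎えに来たよ"
prompt = build_prompt(user_text)
print(prompt)
```

The string ends with the opening assistant tag so that the model's completion is the translation itself.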
Advanced Usage
Text preprocessing
The Japanese text must be preprocessed with the following clean_string function, which replaces some Unicode characters with ASCII equivalents. Failure to do this may cause issues.
import ftfy

# Extra replacements applied after ftfy: typographic symbols mapped to ASCII.
FTFY_ADDITIONAL_MAP = {
    "—": "--",
    "–": "-",
    "⸻": "----",
    "«": "\"",
    "»": "\"",
    "ใ": "\"",
    "ใ": "\"",
    "✧": "*",
    "✽": "*",
    "⬤": "*",
    "โญ": "*",
    "โด": "*",
    "โต": "*",
    "โฉ": "*",
    "ใ": "[",
    "ใ": "]",
    "ใ": "[",
    "ใ": "]",
    "ใ": "[",
    "ใ": "]",
    "〈": "<",
    "〉": ">",
    "《": "<<",
    "》": ">>",
}


def clean_string(text: str, strip: bool = True) -> str:
    """Fix encoding issues with ftfy, trim each line, then apply FTFY_ADDITIONAL_MAP."""
    config = ftfy.TextFixerConfig(normalization="NFC")
    s = ftfy.fix_text(text, config=config)
    # Strip both ends of every line, or only trailing whitespace when strip=False.
    s = "\n".join((x.strip() if strip else x.rstrip()) for x in s.splitlines())
    for b, g in FTFY_ADDITIONAL_MAP.items():
        s = s.replace(b, g)
    return s
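To show what the line-trimming and symbol-replacement steps do on a small input, here is a standard-library-only sketch. It is not a substitute for clean_string: `unicodedata.normalize` stands in for `ftfy.fix_text` (an assumption, since ftfy also repairs mis-encoded text), and `SAMPLE_MAP` and `replace_symbols` are hypothetical names covering only a few entries of the table.

```python
import unicodedata

# Illustrative subset of the replacement table (hypothetical name).
SAMPLE_MAP = {
    "—": "--",
    "–": "-",
    "«": "\"",
    "»": "\"",
    "《": "<<",
    "》": ">>",
}


def replace_symbols(text: str, mapping: dict[str, str] = SAMPLE_MAP) -> str:
    # Assumption: plain NFC normalization in place of ftfy.fix_text.
    s = unicodedata.normalize("NFC", text)
    # Trim leading and trailing whitespace on every line.
    s = "\n".join(line.strip() for line in s.splitlines())
    # Apply each replacement in turn.
    for src, dst in mapping.items():
        s = s.replace(src, dst)
    return s


cleaned = replace_symbols("  《魔法》—  \n«test»")
print(cleaned)
```

For real input, use clean_string with ftfy installed; this sketch only illustrates the mapping stage.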
Glossary
You can provide up to 30 custom translations for nouns and character names at runtime. Prefix your chapter with the glossary terms, one per line in the form Japanese term : English term, wrapped in <GLOSSARY></GLOSSARY> tags.
glossary = [
    {"ja": "マイン", "en": "Myne"},
]
chapter_text = "マイン、ルッツが迎えに来たよ"


def make_glossary_str(glossary: list[dict[str, str]]) -> str:
    """Render glossary entries as a <GLOSSARY> block, or return "" if there are none."""
    if glossary is None or len(glossary) == 0:
        return ""
    # Deduplicate identical (Japanese, English) pairs; set order is not guaranteed.
    unique_glossary = {(term['ja'], term['en']) for term in glossary}
    terms = "\n".join([f"{ja} : {en}" for ja, en in unique_glossary])
    return f"<GLOSSARY>\n{terms}\n</GLOSSARY>\n"


# Prepend the glossary block to the cleaned chapter text.
user_prompt = f"{make_glossary_str(glossary)}{clean_string(chapter_text)}"
The resulting user prompt:
<GLOSSARY>
マイン : Myne
</GLOSSARY>
マイン、ルッツが迎えに来たよ
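Because the glossary is limited to 30 entries, it can help to validate the list before building the prompt. The sketch below is one possible approach, not part of the model's tooling: `prepare_glossary` and `MAX_GLOSSARY_TERMS` are hypothetical names, and raising an error when the limit is exceeded is a design assumption (the source does not say how the model behaves beyond 30 terms).

```python
MAX_GLOSSARY_TERMS = 30  # limit stated in the Glossary section


def prepare_glossary(
    glossary: list[dict[str, str]],
    max_terms: int = MAX_GLOSSARY_TERMS,
) -> list[dict[str, str]]:
    """Drop duplicate (ja, en) pairs in first-seen order and enforce the limit."""
    seen: set[tuple[str, str]] = set()
    unique: list[dict[str, str]] = []
    for term in glossary:
        ja, en = term["ja"].strip(), term["en"].strip()
        if not ja or not en:
            raise ValueError(f"Empty glossary entry: {term!r}")
        if (ja, en) not in seen:
            seen.add((ja, en))
            unique.append({"ja": ja, "en": en})
    if len(unique) > max_terms:
        raise ValueError(f"Glossary has {len(unique)} terms; the limit is {max_terms}")
    return unique


entries = prepare_glossary([
    {"ja": "マイン", "en": "Myne"},
    {"ja": "マイン", "en": "Myne"},  # duplicate, dropped
    {"ja": "ルッツ", "en": "Lutz"},
])
print(entries)
```

Unlike a set-based deduplication, this keeps the entries in the order they were supplied, so the rendered glossary is the same on every run.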
📄 License
This model is licensed under the Apache-2.0 license.
📚 Documentation
| Property | Details |
|----------|---------|
| Base Model | thefrigidliquidation/lightnovel-translate-Qwen2.5-32B |
| Language | en, ja |
| License | apache-2.0 |
| Pipeline Tag | text-generation |