
Gemma-2-Llama-Swallow-9b-it-v0.1

Developed by tokyotech-llm
The Gemma-2-Llama-Swallow series consists of multilingual large language models built by continual pre-training on top of Gemma 2, with a particular focus on strengthening Japanese ability.
Downloads: 2,491
Release date: April 23, 2025

Model Overview

This model retains the English capabilities of the base Gemma 2 model while substantially improving Japanese processing through continual pre-training on approximately 200 billion tokens, making it well suited to multilingual tasks and Japanese instruction following.
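The page gives no usage instructions; as a minimal inference sketch, assuming the model is published under the Hugging Face repository id tokyotech-llm/Gemma-2-Llama-Swallow-9b-it-v0.1 and follows the standard Gemma 2 chat template, loading and prompting it with transformers would look roughly like this:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id assumed from the model name on this page; verify on the Hub.
model_id = "tokyotech-llm/Gemma-2-Llama-Swallow-9b-it-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 9B parameters; bf16 keeps memory manageable
    device_map="auto",
)

# Gemma 2 instruction models expect a chat template (user/assistant roles only).
messages = [{"role": "user", "content": "日本の四季について簡単に説明してください。"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))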

Model Features

Enhanced multilingual ability
Japanese processing is significantly improved while the original English ability is retained.
Large-scale continual pre-training
Continual pre-training on approximately 200 billion tokens, drawn from sources including Japanese web corpora and Wikipedia.
Instruction fine-tuning
Supervised fine-tuning on specially constructed Japanese synthetic data improves performance on instruction-following tasks.

Model Capabilities

Japanese text generation
English text generation
Multi-turn dialogue (see the sketch after this list)
Machine translation
Mathematical reasoning
Code generation
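For multi-turn dialogue, the usual transformers pattern is to keep the running message history and re-apply the chat template on every turn. The sketch below reuses the model and tokenizer from the previous example; the chat_turn helper is hypothetical, not part of any library:

def chat_turn(history, user_text, max_new_tokens=256):
    # Append the user message, render the full history, and generate a reply.
    history.append({"role": "user", "content": user_text})
    input_ids = tokenizer.apply_chat_template(
        history, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    reply = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
    history.append({"role": "assistant", "content": reply})
    return reply

history = []
print(chat_turn(history, "おすすめの東京観光スポットを教えてください。"))
print(chat_turn(history, "その中で、雨の日でも楽しめる場所はどこですか？"))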

Use Cases

Language processing
Japanese dialogue systems
Build a Japanese-language intelligent assistant
Scored 0.759 on Japanese MT-Bench
Multilingual content generation
Generate content in both Japanese and English
Education
Japanese learning assistance
Help learners practice Japanese