Turkish-Gemma-9b-v0.1: An Open-Source Turkish Text Generation Model for Precise Turkish Content Generation

Turkish Gemma 9b V0.1

Developed by ytu-ce-cosmos

Turkish-Gemma-9b-v0.1 is a Turkish text generation model developed based on Gemma-2-9b, optimized through continued pretraining, supervised fine-tuning (SFT), direct preference optimization (DPO), and model merging techniques.

Large Language Model

Safetensors

#Turkish text generation #Mathematical reasoning #Instruction fine-tuning

Downloads 167

Release Time : 4/18/2025

Model Overview

This model is designed for Turkish text generation and produces coherent, contextually relevant continuations and responses. It is suitable for conversational interactions and instruction-following tasks.

Model Features

Turkish language optimization

Specifically optimized for Turkish through continued pretraining and fine-tuning, enhancing language understanding and generation capabilities.

Multi-stage training

Combines various training methods including continued pretraining, supervised fine-tuning (SFT), and direct preference optimization (DPO).

Outstanding performance

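In human pairwise evaluation on Turkish questions, the model reaches a 57.30% win rate, ahead of google/gemma-2-9b-it (54.13%) and ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1 (36.89%). On the Turkish evaluation benchmarks it averages 63.31, compared with 59.14 for google/gemma-2-9b-it.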

Model Capabilities

Turkish text generation

Conversational interaction

Instruction following

Mathematical problem-solving

Use Cases

Education

Mathematical problem-solving

Solves Turkish-language mathematical problems, such as those related to functions and equations.

Capable of correctly solving and explaining mathematical problems, such as the solution for RD(X)=X.

Customer service

Turkish customer service dialogue

Generates natural and fluent Turkish customer service responses.

🚀 Turkish-Gemma-9b-v0.1

This is a text generation model based on Google's Gemma-2-9b, specifically optimized for Turkish language tasks.

🚀 Quick Start

Turkish-Gemma-9b-v0.1 is built on Gemma-2-9b through a combination of continual pre-training, supervised fine-tuning (SFT), direct preference optimization (DPO), and model merging. It is designed for Turkish text generation tasks and produces coherent, context-relevant continuations and answers.

However, because the training data is diverse, spanning large-scale pre-training corpora, instruction-tuning data, and human preference data, the model may exhibit biases. Users should be aware of these and use the model responsibly.

You can easily demo the model here (Coming soon!): https://cosmos.yildiz.edu.tr/cosmosllm

✨ Features

Turkish-Specific Optimization: Tailored for Turkish text generation tasks.
Multiple Training Techniques: Developed using continual pre-training, SFT, DPO, and model merging.
Reliable Evaluation: Evaluated on a carefully designed dataset with human annotations for reliable comparison.

📦 Installation

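The original model card gives no installation steps. The usage examples import the `transformers` and `torch` Python packages, so both must be installed before running them.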
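As a quick pre-flight check, the following minimal sketch reports which packages are not importable in the current environment. The `transformers` and `torch` entries come from the usage examples; `accelerate` is an assumption added here, because `device_map="auto"` in `transformers` normally relies on it. The helper name `missing_packages` is hypothetical.

```python
import importlib.util

# transformers and torch are imported by the usage examples.
# accelerate is an assumption: device_map="auto" normally depends on it.
REQUIRED = ["transformers", "torch", "accelerate"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```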

💻 Usage Examples

Basic Usage

import transformers
import torch

# Hugging Face model ID
model_id = "ytu-ce-cosmos/Turkish-Gemma-9b-v0.1"

# Build a text-generation pipeline in bfloat16 with automatic device placement
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "user", "content": "İsmi RD olan bir fonksiyon ona verilen sayının çarpmaya göre tersini döndürmektedir. Örneğin RD(3)=1/3. Buna göre RD(X)=X ifadesini doğru yapan kaç X değeri vardır?"}
]

# Stop at the tokenizer's EOS token or at Gemma's end-of-turn marker
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<end_of_turn>")
]

# Sample with temperature and nucleus (top-p) filtering
outputs = pipeline(
    messages,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# The last entry of generated_text is the assistant's reply message
print(outputs[0]["generated_text"][-1])
# RD(X) = X ifadesi, bir sayının çarpmaya göre tersinin kendisiyle eşit olması anlamına gelir. Yani, X ile 1/X aynı olmalıdır. Bu durum yalnızca X'in karesi 1 olduğunda gerçekleşir:

# X² = 1

# Bu denklemin çözümleri:

# X = 1 ve X = -1

# Dolayısıyla, RD(X) = X eşitliğini sağlayan *iki* X değeri vardır: *1* ve *-1*.
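When the pipeline is called with a list of chat messages, `generated_text` holds the whole conversation and its last element is a message dictionary rather than a bare string. The sketch below pulls out only the reply text. The helper `last_assistant_text` and the `mock_outputs` data are hypothetical stand-ins so that the snippet runs without downloading the model.

```python
def last_assistant_text(outputs):
    """Return the text of the final assistant message in a chat-style pipeline result."""
    conversation = outputs[0]["generated_text"]
    reply = conversation[-1]
    if reply.get("role") != "assistant":
        raise ValueError("last message is not an assistant reply")
    return reply["content"]


# Hypothetical stand-in for a real pipeline result (no model is loaded here).
mock_outputs = [{
    "generated_text": [
        {"role": "user", "content": "Merhaba"},
        {"role": "assistant", "content": "Merhaba! Size nasıl yardımcı olabilirim?"},
    ]
}]

print(last_assistant_text(mock_outputs))
```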

Advanced Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Hugging Face model ID
model_id = "ytu-ce-cosmos/Turkish-Gemma-9b-v0.1"

# Load the tokenizer and the model weights in bfloat16
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "İsmi RD olan bir fonksiyon ona verilen sayının çarpmaya göre tersini döndürmektedir. Örneğin RD(3)=1/3. Buna göre RD(X)=X ifadesini doğru yapan kaç X değeri vardır?"}
]

# Render the chat template and append the generation prompt for the model turn
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop at the tokenizer's EOS token or at Gemma's end-of-turn marker
# (uses the tokenizer loaded above; no pipeline object exists in this example)
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<end_of_turn>")
]

# Greedy decoding
outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=False,
)
# Keep only the newly generated tokens, dropping the prompt
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
# RD(X) = X ifadesi, bir sayının çarpmaya göre tersinin kendisiyle eşit olması anlamına gelir. Yani, X ile 1/X aynı olmalıdır. Bu durum yalnızca X'in karesi 1 olduğunda gerçekleşir:

# X² = 1

# Bu denklemin çözümleri:

# X = 1 ve X = -1

# Dolayısıyla, RD(X) = X eşitliğini sağlayan *iki* X değeri vardır: *1* ve *-1*.
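To make the role of `apply_chat_template` and the `<end_of_turn>` terminator more concrete, here is a minimal sketch of the turn format used by Gemma-2's chat template. It assumes this model keeps the base Gemma-2 template unchanged, which the source does not state; the tokenizer's own `apply_chat_template` remains the authoritative renderer. The function `render_gemma_prompt` is hypothetical.

```python
def render_gemma_prompt(messages, add_generation_prompt=True):
    """Render chat messages in the Gemma-2 turn format (assumed, not verified for this model)."""
    parts = ["<bos>"]
    for message in messages:
        # Gemma names the assistant role "model".
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(
            f"<start_of_turn>{role}\n{message['content'].strip()}<end_of_turn>\n"
        )
    if add_generation_prompt:
        # Open the model's turn; generation should stop at the next <end_of_turn>.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = render_gemma_prompt([{"role": "user", "content": "Merhaba"}])
print(prompt)
```

Each turn closes with `<end_of_turn>`, which is why the examples add that token to the list of terminators alongside the tokenizer's EOS token.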

📚 Documentation

🏆 Model Comparison: Win Rates

Model Name	Win Rate
Qwen/Qwen3-30B-A3B	62.39%
gpt-4o-mini	62.12%
google/gemma-3-12b-it	61.61%
google/gemma-2-27b-it	57.91%
ytu-ce-cosmos/Turkish-Gemma-9b-v0.1	57.30%
google/gemma-2-9b-it	54.13%
ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1	36.89%

Voting Methodology

A question and two answers from different models were presented to human judges, who selected the better answer according to their own preference. The original model card illustrates this with an example image (not reproduced here) in which the judge selected the answer on the right.
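The voting procedure above can be turned into per-model win rates in several ways, and the source does not give the exact formula or say how ties were handled. The sketch below assumes the simplest reading: a model's win rate is the share of its pairwise comparisons that it won. The function `win_rates`, the model names, and the judgments are all hypothetical.

```python
from collections import defaultdict


def win_rates(judgments):
    """Compute win rate (%) per model from (model_a, model_b, winner) tuples."""
    wins = defaultdict(int)
    comparisons = defaultdict(int)
    for model_a, model_b, winner in judgments:
        if winner not in (model_a, model_b):
            raise ValueError(f"winner {winner!r} is not one of the compared models")
        comparisons[model_a] += 1
        comparisons[model_b] += 1
        wins[winner] += 1
    return {m: 100.0 * wins[m] / comparisons[m] for m in comparisons}


# Hypothetical judgments: each tuple is one judge's choice between two answers.
sample_judgments = [
    ("model-a", "model-b", "model-a"),
    ("model-a", "model-c", "model-a"),
    ("model-b", "model-c", "model-c"),
    ("model-a", "model-b", "model-b"),
]

for model, rate in sorted(win_rates(sample_judgments).items()):
    print(f"{model}: {rate:.2f}%")
```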

📊 Turkish Evaluation Benchmark Results (via `malhajar17/lm-evaluation-harness_turkish`)

Model Name	Average	MMLU	Truthful_QA	ARC	Hellaswag	Gsm8K	Winogrande
Qwen/Qwen2.5-72B-Instruct	67.69	77.28	59.86	61.52	61.98	83.6	61.92
google/gemma-3-27b-it	67.36	70.2	57.06	66.98	66.58	77.52	65.8
google/gemma-2-27b-it	65.57	66.49	57.45	63.65	63.86	76.54	65.4
meta-llama/Llama-3-1-70B-Instruct	63.92	74.00	51.41	59.64	64.31	66.13	66.90
Qwen/Qwen2.5-32B-Instruct	63.74	70.93	57.87	57.00	57.04	77.83	61.77
ytu-ce-cosmos/Turkish-Gemma-9b-v0.1	63.31	63.85	54.21	59.64	64.19	73.42	64.53
google/gemma-3-12b-it	62.94	63.92	57.16	60.67	62.00	72.06	61.77
Qwen/Qwen2.5-14B-it	60.34	65.28	59.00	50.00	52.22	76.77	58.77
google/gemma-2-9b-it	59.14	61.07	55.77	56.31	56.48	63.10	62.09
ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1	55.03	51.97	57.56	51.02	52.96	59.87	57.77
Qwen/Qwen2.5-7B-Instruct	53.42	56.31	55.99	42.06	44.71	64.16	59.66
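The source does not say how the Average column is computed. Assuming it is the unweighted mean of the six task scores, the sketch below recomputes it for two rows of the table; under that assumption the recomputed values match the reported ones to two decimals.

```python
# (reported average, [MMLU, Truthful_QA, ARC, Hellaswag, Gsm8K, Winogrande])
# Values are copied from the benchmark table above.
rows = {
    "ytu-ce-cosmos/Turkish-Gemma-9b-v0.1": (63.31, [63.85, 54.21, 59.64, 64.19, 73.42, 64.53]),
    "google/gemma-2-9b-it": (59.14, [61.07, 55.77, 56.31, 56.48, 63.10, 62.09]),
}


def mean(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the source)."""
    return sum(values) / len(values)


for name, (reported, task_scores) in rows.items():
    print(f"{name}: reported {reported:.2f}, recomputed {mean(task_scores):.2f}")
```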

🔧 Technical Details

Turkish-Gemma-9b-v0.1 is based on Google's Gemma-2-9b. It was trained using a combination of continual pre-training, supervised fine-tuning (SFT), direct preference optimization (DPO), and model merging. To evaluate model performance, a dataset of 1,450 carefully designed questions across diverse categories was compiled. Each question was reviewed and rated by 18 human annotators, enabling a reliable comparison across multiple models.
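For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities. It illustrates the general objective only; it is not the authors' training code, and the source gives no hyperparameters, so `beta=0.1` and the example log-probabilities are assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair of sequence log-probs."""
    # Log-ratio of policy to frozen reference model for each answer.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# With the policy identical to the reference, the margin is zero.
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)
# A policy that raises the chosen answer and lowers the rejected one gets a lower loss.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(f"baseline={baseline:.4f} improved={improved:.4f}")
```

The loss falls as the policy assigns relatively more probability than the reference model to the preferred answer, which is how human preference data of the kind mentioned in this document is used in DPO.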

📄 License

The model is under the gemma2 license.

Acknowledgments

Thanks to the generous support from the Hugging Face team, it is possible to download models from their S3 storage 🤗
Computing resources used in this work were provided by the National Center for High Performance Computing of Turkey (UHeM) under grant numbers 1016912023 and 1018512024

Contact

COSMOS AI Research Group, Yildiz Technical University Computer Engineering Department
https://cosmos.yildiz.edu.tr/
cosmos@yildiz.edu.tr
