🚀 Turkish-Gemma-9b-v0.1
This is a text generation model based on Google's Gemma-2-9b, specifically optimized for Turkish language tasks.
🚀 Quick Start
Turkish-Gemma-9b-v0.1 is built on Gemma-2-9b through a combination of continual pre-training, supervised fine-tuning (SFT), direct preference optimization (DPO), and model merging. It is designed for Turkish text generation and aims to produce coherent, contextually relevant continuations and answers.
Because the training data is diverse, spanning large-scale pre-training corpora, instruction-tuning data, and human preference data, the model may exhibit biases. Users should be aware of this and use the model responsibly.
You can easily demo the model here (Coming soon!): https://cosmos.yildiz.edu.tr/cosmosllm
✨ Features
- Turkish-Specific Optimization: Tailored for Turkish text generation tasks.
- Multiple Training Techniques: Developed using continual pre-training, SFT, DPO, and model merging.
- Reliable Evaluation: Evaluated on a carefully designed dataset with human annotations for reliable comparison.
📦 Installation
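No separate installation steps are documented. The usage examples below import the transformers and torch Python packages, and loading with device_map="auto" additionally relies on the accelerate package.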
💻 Usage Examples
Basic Usage
import transformers
import torch

model_id = "ytu-ce-cosmos/Turkish-Gemma-9b-v0.1"

# Build a text-generation pipeline that loads the weights in bfloat16
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Prompt (Turkish): a function RD returns the multiplicative inverse of its
# argument, e.g. RD(3)=1/3. How many values of X satisfy RD(X)=X?
messages = [
    {"role": "user", "content": "İsmi RD olan bir fonksiyon ona verilen sayının çarpmaya göre tersini döndürmektedir. Örneğin RD(3)=1/3. Buna göre RD(X)=X ifadesini doğru yapan kaç X değeri vardır?"}
]

# Stop at the end-of-sequence token or at Gemma's end-of-turn marker
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<end_of_turn>"),
]

outputs = pipeline(
    messages,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# The last entry of the returned conversation is the model's reply
print(outputs[0]["generated_text"][-1])
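With chat-style input, the pipeline returns the whole conversation, so the final print above shows a dictionary with role and content keys rather than plain text. The following is a minimal sketch for pulling out just the reply text, assuming that output shape. The helper name last_reply and the sample data are hypothetical and stand in for a real pipeline result.

```python
def last_reply(outputs):
    """Return the text of the final message in a chat-style pipeline result."""
    conversation = outputs[0]["generated_text"]  # list of {"role", "content"} dicts
    final_message = conversation[-1]             # the assistant turn comes last
    return final_message["content"]


# Hypothetical stand-in for what the pipeline call returns
sample_outputs = [{
    "generated_text": [
        {"role": "user", "content": "Merhaba!"},
        {"role": "assistant", "content": "Merhaba, size nasıl yardımcı olabilirim?"},
    ]
}]

print(last_reply(sample_outputs))
```

With a real run, passing the pipeline's outputs to last_reply would give the generated answer as a plain string.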
Advanced Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "ytu-ce-cosmos/Turkish-Gemma-9b-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prompt (Turkish): a function RD returns the multiplicative inverse of its
# argument, e.g. RD(3)=1/3. How many values of X satisfy RD(X)=X?
messages = [
    {"role": "user", "content": "İsmi RD olan bir fonksiyon ona verilen sayının çarpmaya göre tersini döndürmektedir. Örneğin RD(3)=1/3. Buna göre RD(X)=X ifadesini doğru yapan kaç X değeri vardır?"}
]

# Render the chat template and tokenize it, ending with the model's turn header
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Use the tokenizer directly here; no pipeline object exists in this example
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<end_of_turn>"),
]

# Greedy decoding (do_sample=False) gives deterministic output
outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=False,
)

# Drop the prompt tokens and decode only the newly generated ones
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
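The <end_of_turn> terminator used above is easier to understand with the prompt layout in view. The sketch below imitates the Gemma-2 turn format in plain Python, on the assumption that this model keeps the base model's chat template. format_gemma_prompt is a hypothetical helper for illustration only; apply_chat_template remains the way to build real inputs.

```python
def format_gemma_prompt(messages, add_generation_prompt=True):
    """Render messages in the Gemma-2 turn layout (illustrative only)."""
    parts = ["<bos>"]
    for message in messages:
        # Gemma labels the assistant side of the conversation "model"
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # An open model turn tells the model it is its turn to answer
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = format_gemma_prompt([{"role": "user", "content": "Merhaba"}])
print(prompt)
```

Each turn closes with <end_of_turn>, so the model emits that token when its answer is complete. Listing it among the terminators stops generation there instead of running on to max_new_tokens.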
📚 Documentation
🏆 Model Comparison: Win Rates
| Model Name | Win Rate |
|---|---|
| Qwen/Qwen3-30B-A3B | 62.39% |
| gpt-4o-mini | 62.12% |
| google/gemma-3-12b-it | 61.61% |
| google/gemma-2-27b-it | 57.91% |
| ytu-ce-cosmos/Turkish-Gemma-9b-v0.1 | 57.30% |
| google/gemma-2-9b-it | 54.13% |
| ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1 | 36.89% |
Voting Methodology
Human judges were shown a question together with two answers from different models and selected the answer they preferred. For example, in the question below, the judge selected the answer on the right:
[Figure: example question with two model answers shown side by side]
📊 Turkish Evaluation Benchmark Results (via malhajar17/lm-evaluation-harness_turkish)
| Model Name | Average | MMLU | Truthful_QA | ARC | Hellaswag | Gsm8K | Winogrande |
|---|---|---|---|---|---|---|---|
| Qwen/Qwen2.5-72B-Instruct | 67.69 | 77.28 | 59.86 | 61.52 | 61.98 | 83.6 | 61.92 |
| google/gemma-3-27b-it | 67.36 | 70.2 | 57.06 | 66.98 | 66.58 | 77.52 | 65.8 |
| google/gemma-2-27b-it | 65.57 | 66.49 | 57.45 | 63.65 | 63.86 | 76.54 | 65.4 |
| meta-llama/Llama-3-1-70B-Instruct | 63.92 | 74.00 | 51.41 | 59.64 | 64.31 | 66.13 | 66.90 |
| Qwen/Qwen2.5-32B-Instruct | 63.74 | 70.93 | 57.87 | 57.00 | 57.04 | 77.83 | 61.77 |
| ytu-ce-cosmos/Turkish-Gemma-9b-v0.1 | 63.31 | 63.85 | 54.21 | 59.64 | 64.19 | 73.42 | 64.53 |
| google/gemma-3-12b-it | 62.94 | 63.92 | 57.16 | 60.67 | 62.00 | 72.06 | 61.77 |
| Qwen/Qwen2.5-14B-it | 60.34 | 65.28 | 59.00 | 50.00 | 52.22 | 76.77 | 58.77 |
| google/gemma-2-9b-it | 59.14 | 61.07 | 55.77 | 56.31 | 56.48 | 63.10 | 62.09 |
| ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1 | 55.03 | 51.97 | 57.56 | 51.02 | 52.96 | 59.87 | 57.77 |
| Qwen/Qwen2.5-7B-Instruct | 53.42 | 56.31 | 55.99 | 42.06 | 44.71 | 64.16 | 59.66 |
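The Average column is not defined in the text, but the figures are consistent with an unweighted mean of the six benchmark scores. The short check below recomputes it for two rows copied from the table. The arithmetic-mean reading is an assumption inferred from the numbers, not a documented formula.

```python
from statistics import mean

# (reported average, [MMLU, Truthful_QA, ARC, Hellaswag, Gsm8K, Winogrande])
rows = {
    "ytu-ce-cosmos/Turkish-Gemma-9b-v0.1": (63.31, [63.85, 54.21, 59.64, 64.19, 73.42, 64.53]),
    "google/gemma-2-9b-it": (59.14, [61.07, 55.77, 56.31, 56.48, 63.10, 62.09]),
}


def recomputed_average(scores):
    """Unweighted mean of the benchmark scores, rounded to two decimals."""
    return round(mean(scores), 2)


for name, (reported, scores) in rows.items():
    print(f"{name}: reported={reported:.2f} recomputed={recomputed_average(scores):.2f}")
```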
🔧 Technical Details
Turkish-Gemma-9b-v0.1 is based on Google's Gemma-2-9b and was trained using a combination of continual pre-training, supervised fine-tuning (SFT), direct preference optimization (DPO), and model merging. To evaluate model performance, a dataset of 1,450 carefully designed questions across diverse categories was compiled. Each question was reviewed and rated by 18 human annotators, enabling a reliable comparison across multiple models.
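The win rates in the model comparison come from pairwise human votes of this kind. The sketch below shows one way such votes could be tallied, assuming a win rate is the share of a model's comparisons that it won and that every vote names exactly one winner. The aggregation actually used is not described here, and the vote data and model names are hypothetical.

```python
from collections import Counter


def win_rates(votes):
    """Compute per-model win rates (in percent) from pairwise votes.

    votes: iterable of (model_a, model_b, winner) tuples.
    """
    wins, appearances = Counter(), Counter()
    for model_a, model_b, winner in votes:
        if winner not in (model_a, model_b):
            raise ValueError(f"winner {winner!r} is not one of the compared models")
        appearances[model_a] += 1
        appearances[model_b] += 1
        wins[winner] += 1
    return {model: 100.0 * wins[model] / appearances[model] for model in appearances}


# Hypothetical votes, one tuple per judged question
votes = [
    ("model-x", "model-y", "model-x"),
    ("model-x", "model-z", "model-z"),
    ("model-y", "model-z", "model-y"),
    ("model-x", "model-y", "model-x"),
]

rates = win_rates(votes)
for model, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{model}: {rate:.2f}%")
```

Ties, skipped questions, and weighting across annotators would need explicit handling in a real tally.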
📄 License
The model is released under the gemma2 license.
Acknowledgments
- Thanks to the generous support from the Hugging Face team, it is possible to download models from their S3 storage 🤗
- Computing resources used in this work were provided by the National Center for High Performance Computing of Turkey (UHeM) under grant numbers 1016912023 and 1018512024
Contact
COSMOS AI Research Group, Yildiz Technical University Computer Engineering Department
https://cosmos.yildiz.edu.tr/
cosmos@yildiz.edu.tr